Search | arXiv e-print repository

Meta 3D Gen

Authors: Raphael Bensadoun, Tom Monnier, Yanir Kleiman, Filippos Kokkinos, Yawar Siddiqui, Mahendra Kariya, Omri Harosh, Roman Shapovalov, Benjamin Graham, Emilien Garreau, Animesh Karnewar, Ang Cao, Idan Azuri, Iurii Makarov, Eric-Tuan Le, Antoine Toisoul, David Novotny, Oran Gafni, Natalia Neverova, Andrea Vedaldi

Abstract: We introduce Meta 3D Gen (3DGen), a new state-of-the-art, fast pipeline for text-to-3D asset generation. 3DGen offers 3D asset creation with high prompt fidelity and high-quality 3D shapes and textures in under a minute. It supports physically-based rendering (PBR), necessary for 3D asset relighting in real-world applications. Additionally, 3DGen supports generative retexturing of previously generated (or artist-created) 3D shapes using additional textual inputs provided by the user. 3DGen integrates key technical components, Meta 3D AssetGen and Meta 3D TextureGen, that we developed for text-to-3D and text-to-texture generation, respectively. By combining their strengths, 3DGen represents 3D objects simultaneously in three ways: in view space, in volumetric space, and in UV (or texture) space. The integration of these two techniques achieves a win rate of 68% with respect to the single-stage model. We compare 3DGen to numerous industry baselines, and show that it outperforms them in terms of prompt fidelity and visual quality for complex textual prompts, while being significantly faster.

Submitted 2 July, 2024; originally announced July 2024.

arXiv:2406.10180

MeshPose: Unifying DensePose and 3D Body Mesh reconstruction

Authors: Eric-Tuan Lê, Antonis Kakolyris, Petros Koutras, Himmy Tam, Efstratios Skordos, George Papandreou, Rıza Alp Güler, Iasonas Kokkinos

Abstract: DensePose provides a pixel-accurate association of images with 3D mesh coordinates, but does not provide a 3D mesh, while Human Mesh Reconstruction (HMR) systems have high 2D reprojection error, as measured by DensePose localization metrics. In this work we introduce MeshPose to jointly tackle DensePose and HMR. For this we first introduce new losses that allow us to use weak DensePose supervision to accurately localize in 2D a subset of the mesh vertices ('VertexPose'). We then lift these vertices to 3D, yielding a low-poly body mesh ('MeshPose'). Our system is trained in an end-to-end manner and is the first HMR method to attain competitive DensePose accuracy, while also being lightweight and amenable to efficient inference, making it suitable for real-time AR applications.

Submitted 14 June, 2024; originally announced June 2024.

Comments: IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

MSC Class: 68 ACM Class: I.2.10

Journal ref: CVPR 2024

arXiv:2308.12256

doi 10.1145/3604915.3610244

Learning from Negative User Feedback and Measuring Responsiveness for Sequential Recommenders

Authors: Yueqi Wang, Yoni Halpern, Shuo Chang, Jingchen Feng, Elaine Ya Le, Longfei Li, Xujian Liang, Min-Cheng Huang, Shane Li, Alex Beutel, Yaping Zhang, Shuchao Bi

Abstract: Sequential recommenders have been widely used in industry due to their strength in modeling user preferences. While these models excel at learning a user's positive interests, less attention has been paid to learning from negative user feedback. Negative user feedback is an important lever of user control, and comes with an expectation that recommenders should respond quickly and reduce similar recommendations to the user. However, negative feedback signals are often ignored in the training objective of sequential retrieval models, which primarily aim at predicting positive user interactions. In this work, we incorporate explicit and implicit negative user feedback into the training objective of sequential recommenders in the retrieval stage using a "not-to-recommend" loss function that optimizes for the log-likelihood of not recommending items with negative feedback. We demonstrate the effectiveness of this approach using live experiments on a large-scale industrial recommender system. Furthermore, we address a challenge in measuring recommender responsiveness to negative feedback by developing a counterfactual simulation framework to compare recommender responses between different user actions, showing improved responsiveness from the modeling change.

Submitted 23 August, 2023; originally announced August 2023.

Comments: RecSys 2023 Industry Track

arXiv:2305.07764

Long-Term Value of Exploration: Measurements, Findings and Algorithms

Authors: Yi Su, Xiangyu Wang, Elaine Ya Le, Liang Liu, Yuening Li, Haokai Lu, Benjamin Lipshitz, Sriraj Badam, Lukasz Heldt, Shuchao Bi, Ed Chi, Cristos Goodrow, Su-Lin Wu, Lexi Baugher, Minmin Chen

Abstract: Effective exploration is believed to positively influence the long-term user experience on recommendation platforms. Determining its exact benefits, however, has been challenging. Regular A/B tests on exploration often measure neutral or even negative engagement metrics while failing to capture its long-term benefits. We here introduce new experiment designs to formally quantify the long-term value of exploration by examining its effects on content corpus, and connecting content corpus growth to the long-term user experience from real-world experiments. Once established the values of exploration, we investigate the Neural Linear Bandit algorithm as a general framework to introduce exploration into any deep learning based ranking systems. We conduct live experiments on one of the largest short-form video recommendation platforms that serves billions of users to validate the new experiment designs, quantify the long-term values of exploration, and to verify the effectiveness of the adopted neural linear bandit algorithm for exploration.

Submitted 25 February, 2024; v1 submitted 12 May, 2023; originally announced May 2023.

Comments: 11 pages, WSDM 2024

arXiv:2109.00113

CPFN: Cascaded Primitive Fitting Networks for High-Resolution Point Clouds

Authors: Eric-Tuan Lê, Minhyuk Sung, Duygu Ceylan, Radomir Mech, Tamy Boubekeur, Niloy J. Mitra

Abstract: Representing human-made objects as a collection of base primitives has a long history in computer vision and reverse engineering. In the case of high-resolution point cloud scans, the challenge is to be able to detect both large primitives as well as those explaining the detailed parts. While the classical RANSAC approach requires case-specific parameter tuning, state-of-the-art networks are limited by memory consumption of their backbone modules such as PointNet++, and hence fail to detect the fine-scale primitives. We present Cascaded Primitive Fitting Networks (CPFN) that relies on an adaptive patch sampling network to assemble detection results of global and local primitive detection networks. As a key enabler, we present a merging formulation that dynamically aggregates the primitives across global and local scales. Our evaluation demonstrates that CPFN improves the state-of-the-art SPFN performance by 13-14% on high-resolution point cloud datasets and specifically improves the detection of fine-scale primitives by 20-22%.

Submitted 6 September, 2021; v1 submitted 31 August, 2021; originally announced September 2021.

Comments: ICCV 2021: 15 pages, 8 figures

Journal ref: ICCV 2021

arXiv:1907.00960

Going Deeper with Lean Point Networks

Authors: Eric-Tuan Le, Iasonas Kokkinos, Niloy J. Mitra

Abstract: In this work we introduce Lean Point Networks (LPNs) to train deeper and more accurate point processing networks by relying on three novel point processing blocks that improve memory consumption, inference time, and accuracy: a convolution-type block for point sets that blends neighborhood information in a memory-efficient manner; a crosslink block that efficiently shares information across low- and high-resolution processing branches; and a multiresolution point cloud processing block for faster diffusion of information. By combining these blocks, we design wider and deeper point-based architectures. We report systematic accuracy and memory consumption improvements on multiple publicly available segmentation tasks by using our generic modules as drop-in replacements for the blocks of multiple architectures (PointNet++, DGCNN, SpiderNet, PointCNN).

Submitted 16 June, 2020; v1 submitted 1 July, 2019; originally announced July 2019.

Comments: 16 pages, 11 figures, 9 tables

MSC Class: 68T45 ACM Class: I.2.10; I.3.0; I.4.8

arXiv:1603.01562

doi 10.1088/1361-6420/aa6cbd

A Data-Scalable Randomized Misfit Approach for Solving Large-Scale PDE-Constrained Inverse Problems

Authors: Ellen B. Le, Aaron Myers, Tan Bui-Thanh, Quoc P. Nguyen

Abstract: A randomized misfit approach is presented for the efficient solution of large-scale PDE-constrained inverse problems with high-dimensional data. The purpose of this paper is to offer a theory-based framework for random projections in this inverse problem setting. The stochastic approximation to the misfit is analyzed using random projection theory. By expanding beyond mean estimator convergence, a practical characterization of randomized misfit convergence can be achieved. The theoretical results developed hold with any valid random projection in the literature. The class of feasible distributions is broad yet simple to characterize compared to previous stochastic misfit methods. This class includes very sparse random projections which provide additional computational benefit. A different proof for a variant of the Johnson-Lindenstrauss lemma is also provided. This leads to a different intuition for the $O(ε^{-2})$ factor in bounds for Johnson-Lindenstrauss results. The main contribution of this paper is a theoretical result showing the method guarantees a valid solution for small reduced misfit dimensions. The interplay between Johnson-Lindenstrauss theory and Morozov's discrepancy principle is shown to be essential to the result. The computational cost savings for large-scale PDE-constrained problems with high-dimensional data is discussed. Numerical verification of the developed theory is presented for model problems of estimating a distributed parameter in an elliptic partial differential equation. Results with different random projections are presented to demonstrate the viability and accuracy of the proposed approach.

Submitted 17 April, 2017; v1 submitted 4 March, 2016; originally announced March 2016.

Comments: 29 pages

Showing 1–7 of 7 results for author: Le, E