
Showing 51–100 of 287 results for author: Anandkumar, A

  1. arXiv:2304.14554  [pdf, other]

    physics.med-ph cond-mat.soft physics.bio-ph physics.flu-dyn

    AI-aided Geometric Design of Anti-infection Catheters

    Authors: Tingtao Zhou, Xuan Wan, Daniel Zhengyu Huang, Zongyi Li, Zhiwei Peng, Anima Anandkumar, John F. Brady, Paul W. Sternberg, Chiara Daraio

    Abstract: Bacteria can swim upstream due to hydrodynamic interactions with the fluid flow in a narrow tube, and pose a clinical threat of urinary tract infection to patients implanted with catheters. Coatings and structured surfaces have been proposed as a way to suppress bacterial contamination in catheters. However, there is no surface structuring or coating approach to date that thoroughly addresses the…

    Submitted 27 April, 2023; originally announced April 2023.

    Comments: main text 4 figures, SI 5 figures

  2. arXiv:2304.06762  [pdf, other]

    cs.CL cs.AI cs.IR cs.LG

    Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study

    Authors: Boxin Wang, Wei Ping, Peng Xu, Lawrence McAfee, Zihan Liu, Mohammad Shoeybi, Yi Dong, Oleksii Kuchaiev, Bo Li, Chaowei Xiao, Anima Anandkumar, Bryan Catanzaro

    Abstract: Large decoder-only language models (LMs) can be largely improved in terms of perplexity by retrieval (e.g., RETRO), but its impact on text generation quality and downstream task accuracy is unclear. Thus, it is still an open question: shall we pretrain large autoregressive LMs with retrieval? To answer it, we perform a comprehensive study on a scalable pre-trained retrieval-augmented LM (i.e., RET…

    Submitted 20 December, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

    Comments: EMNLP 2023

  3. arXiv:2303.02506  [pdf, other]

    cs.LG cs.AI cs.CV

    Prismer: A Vision-Language Model with Multi-Task Experts

    Authors: Shikun Liu, Linxi Fan, Edward Johns, Zhiding Yu, Chaowei Xiao, Anima Anandkumar

    Abstract: Recent vision-language models have shown impressive multi-modal generation capabilities. However, typically they require training huge models on massive datasets. As a more scalable alternative, we introduce Prismer, a data- and parameter-efficient vision-language model that leverages an ensemble of task-specific experts. Prismer only requires training of a small number of components, with the maj…

    Submitted 18 January, 2024; v1 submitted 4 March, 2023; originally announced March 2023.

    Comments: Published at TMLR 2024. Project Page: https://shikun.io/projects/prismer Code: https://github.com/NVlabs/prismer

  4. arXiv:2302.12422  [pdf, other]

    cs.RO

    MimicPlay: Long-Horizon Imitation Learning by Watching Human Play

    Authors: Chen Wang, Linxi Fan, Jiankai Sun, Ruohan Zhang, Li Fei-Fei, Danfei Xu, Yuke Zhu, Anima Anandkumar

    Abstract: Imitation learning from human demonstrations is a promising paradigm for teaching robots manipulation skills in the real world. However, learning complex long-horizon tasks often requires an unattainable amount of demonstrations. To reduce the high data requirement, we resort to human play data - video sequences of people freely interacting with the environment using their hands. Even with differe…

    Submitted 13 October, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: 7th Conference on Robot Learning (CoRL 2023 oral presentation)

  5. arXiv:2302.12251  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO

    VoxFormer: Sparse Voxel Transformer for Camera-based 3D Semantic Scene Completion

    Authors: Yiming Li, Zhiding Yu, Christopher Choy, Chaowei Xiao, Jose M. Alvarez, Sanja Fidler, Chen Feng, Anima Anandkumar

    Abstract: Humans can easily imagine the complete 3D geometry of occluded objects and scenes. This appealing ability is vital for recognition and understanding. To enable such capability in AI systems, we propose VoxFormer, a Transformer-based semantic scene completion framework that can output complete 3D volumetric semantics from only 2D images. Our framework adopts a two-stage design where we start from a…

    Submitted 25 March, 2023; v1 submitted 23 February, 2023; originally announced February 2023.

    Comments: CVPR 2023 Highlight (10% of accepted papers, 2.5% of submissions)

  6. arXiv:2302.07400  [pdf, other]

    cs.LG math.FA stat.ML

    Score-based Diffusion Models in Function Space

    Authors: Jae Hyun Lim, Nikola B. Kovachki, Ricardo Baptista, Christopher Beckham, Kamyar Azizzadenesheli, Jean Kossaifi, Vikram Voleti, Jiaming Song, Karsten Kreis, Jan Kautz, Christopher Pal, Arash Vahdat, Anima Anandkumar

    Abstract: Diffusion models have recently emerged as a powerful framework for generative modeling. They consist of a forward process that perturbs input data with Gaussian white noise and a reverse process that learns a score function to generate samples by denoising. Despite their tremendous success, they are mostly formulated on finite-dimensional spaces, e.g. Euclidean, limiting their applications to many…

    Submitted 22 November, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    Comments: 52 pages

    MSC Class: 46B09 (Primary); 60J22 (Secondary) ACM Class: I.2.6; J.2

  7. arXiv:2302.07371  [pdf, other]

    cs.CL cs.CY

    BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models

    Authors: Rafal Kocielnik, Shrimai Prabhumoye, Vivian Zhang, Roy Jiang, R. Michael Alvarez, Anima Anandkumar

    Abstract: Pretrained Language Models (PLMs) harbor inherent social biases that can result in harmful real-world implications. Such social biases are measured through the probability values that PLMs output for different social groups and attributes appearing in a set of test sentences. However, bias testing is currently cumbersome since the test sentences are generated either from a limited set of manual te…

    Submitted 6 December, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

    MSC Class: 68T50 ACM Class: I.2.7; J.5; K.4.1

  8. arXiv:2302.06637  [pdf, other]

    cs.LG cs.AI

    PerAda: Parameter-Efficient Federated Learning Personalization with Generalization Guarantees

    Authors: Chulin Xie, De-An Huang, Wenda Chu, Daguang Xu, Chaowei Xiao, Bo Li, Anima Anandkumar

    Abstract: Personalized Federated Learning (pFL) has emerged as a promising solution to tackle data heterogeneity across clients in FL. However, existing pFL methods either (1) introduce high communication and computation costs or (2) overfit to local data, which can be limited in scope, and are vulnerable to evolved test samples with natural shifts. In this paper, we propose PerAda, a parameter-efficient pF…

    Submitted 23 July, 2024; v1 submitted 13 February, 2023; originally announced February 2023.

    Comments: CVPR 2024

  9. arXiv:2302.06542  [pdf, other]

    physics.plasm-ph physics.comp-ph

    Fourier Neural Operator for Plasma Modelling

    Authors: Vignesh Gopakumar, Stanislas Pamela, Lorenzo Zanisi, Zongyi Li, Anima Anandkumar, MAST Team

    Abstract: Predicting plasma evolution within a Tokamak is crucial to building a sustainable fusion reactor. Whether in the simulation space or within the experimental domain, the capability to forecast the spatio-temporal evolution of plasma field variables rapidly and accurately could improve active control methods on current tokamak devices and future fusion reactors. In this work, we demonstrate the util…

    Submitted 13 February, 2023; originally announced February 2023.

  10. arXiv:2302.05872  [pdf, other]

    cs.CV cs.LG stat.ML

    I$^2$SB: Image-to-Image Schrödinger Bridge

    Authors: Guan-Horng Liu, Arash Vahdat, De-An Huang, Evangelos A. Theodorou, Weili Nie, Anima Anandkumar

    Abstract: We propose Image-to-Image Schrödinger Bridge (I$^2$SB), a new class of conditional diffusion models that directly learn the nonlinear diffusion processes between two given distributions. These diffusion bridges are particularly useful for image restoration, as the degraded images are structurally informative priors for reconstructing the clean images. I$^2$SB belongs to a tractable class of Schröd…

    Submitted 25 May, 2023; v1 submitted 12 February, 2023; originally announced February 2023.

    Comments: ICML camera ready (high-resolution figures)

  11. arXiv:2302.04858  [pdf, other]

    cs.CV cs.AI cs.CL cs.IR cs.LG

    Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning

    Authors: Zhuolin Yang, Wei Ping, Zihan Liu, Vijay Korthikanti, Weili Nie, De-An Huang, Linxi Fan, Zhiding Yu, Shiyi Lan, Bo Li, Ming-Yu Liu, Yuke Zhu, Mohammad Shoeybi, Bryan Catanzaro, Chaowei Xiao, Anima Anandkumar

    Abstract: Augmenting pretrained language models (LMs) with a vision encoder (e.g., Flamingo) has obtained the state-of-the-art results in image-to-text generation. However, these models store all the knowledge within their parameters, thus often requiring enormous model parameters to model the abundant visual concepts and very rich textual descriptions. Additionally, they are inefficient in incorporating ne…

    Submitted 22 October, 2023; v1 submitted 9 February, 2023; originally announced February 2023.

    Comments: Findings of EMNLP 2023

  12. arXiv:2302.04611  [pdf, other]

    cs.LG cs.AI q-bio.QM stat.ML

    A Text-guided Protein Design Framework

    Authors: Shengchao Liu, Yanjing Li, Zhuoxinran Li, Anthony Gitter, Yutao Zhu, Jiarui Lu, Zhao Xu, Weili Nie, Arvind Ramanathan, Chaowei Xiao, Jian Tang, Hongyu Guo, Anima Anandkumar

    Abstract: Current AI-assisted protein design mainly utilizes protein sequential and structural information. Meanwhile, there exists tremendous knowledge curated by humans in the text format describing proteins' high-level functionalities. Yet, whether the incorporation of such text data can help protein design tasks has not been explored. To bridge this gap, we propose ProteinDT, a multi-modal framework tha…

    Submitted 12 August, 2024; v1 submitted 9 February, 2023; originally announced February 2023.

  13. arXiv:2301.08290  [pdf, ps, other]

    physics.flu-dyn cs.LG

    Forecasting subcritical cylinder wakes with Fourier Neural Operators

    Authors: Peter I Renn, Cong Wang, Sahin Lale, Zongyi Li, Anima Anandkumar, Morteza Gharib

    Abstract: We apply Fourier neural operators (FNOs), a state-of-the-art operator learning technique, to forecast the temporal evolution of experimentally measured velocity fields. FNOs are a recently developed machine learning method capable of approximating solution operators to systems of partial differential equations through data alone. The learned FNO solution operator can be evaluated in milliseconds,…

    Submitted 19 January, 2023; originally announced January 2023.

    Comments: 12 pages, 6 figures

  14. arXiv:2301.03992  [pdf, other]

    cs.CV cs.LG cs.MM

    Vision Transformers Are Good Mask Auto-Labelers

    Authors: Shiyi Lan, Xitong Yang, Zhiding Yu, Zuxuan Wu, Jose M. Alvarez, Anima Anandkumar

    Abstract: We propose Mask Auto-Labeler (MAL), a high-quality Transformer-based mask auto-labeling framework for instance segmentation using only box annotations. MAL takes box-cropped images as inputs and conditionally generates their mask pseudo-labels. We show that Vision Transformers are good mask auto-labelers. Our method significantly reduces the gap between auto-labeling and human annotation regarding…

    Submitted 10 January, 2023; originally announced January 2023.

  15. arXiv:2212.11296  [pdf, other]

    quant-ph cs.LG cs.NE

    Towards Neural Variational Monte Carlo That Scales Linearly with System Size

    Authors: Or Sharir, Garnet Kin-Lic Chan, Anima Anandkumar

    Abstract: Quantum many-body problems are some of the most challenging problems in science and are central to demystifying some exotic quantum phenomena, e.g., high-temperature superconductors. The combination of neural networks (NN) for representing quantum states, coupled with the Variational Monte Carlo (VMC) algorithm, has been shown to be a promising method for solving such problems. However, the run-ti…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: Appeared on NeurIPS 2022 AI for Science Workshop (a non-archival poster presentation)

  16. arXiv:2212.10789  [pdf, other]

    cs.LG cs.CL q-bio.QM stat.ML

    Multi-modal Molecule Structure-text Model for Text-based Retrieval and Editing

    Authors: Shengchao Liu, Weili Nie, Chengpeng Wang, Jiarui Lu, Zhuoran Qiao, Ling Liu, Jian Tang, Chaowei Xiao, Anima Anandkumar

    Abstract: There is increasing adoption of artificial intelligence in drug discovery. However, existing studies use machine learning to mainly utilize the chemical structures of molecules but ignore the vast textual knowledge available in chemistry. Incorporating textual knowledge enables us to realize new drug design objectives, adapt to text-based instructions and predict complex biological activities. Her…

    Submitted 29 January, 2024; v1 submitted 21 December, 2022; originally announced December 2022.

  17. arXiv:2211.16749  [pdf, other]

    cs.LG cs.AI cs.AR

    HEAT: Hardware-Efficient Automatic Tensor Decomposition for Transformer Compression

    Authors: Jiaqi Gu, Ben Keller, Jean Kossaifi, Anima Anandkumar, Brucek Khailany, David Z. Pan

    Abstract: Transformers have attained superior performance in natural language processing and computer vision. Their self-attention and feedforward layers are overparameterized, limiting inference speed and energy efficiency. Tensor decomposition is a promising technique to reduce parameter redundancy by leveraging tensor algebraic properties to express the parameters in a factorized form. Prior efforts used…

    Submitted 30 November, 2022; originally announced November 2022.

    Comments: 9 pages. Accepted to NeurIPS ML for System Workshop 2022 (Spotlight)

  18. arXiv:2211.15960  [pdf, other]

    cs.LG

    Fourier Continuation for Exact Derivative Computation in Physics-Informed Neural Operators

    Authors: Haydn Maust, Zongyi Li, Yixuan Wang, Daniel Leibovici, Oscar Bruno, Thomas Hou, Anima Anandkumar

    Abstract: The physics-informed neural operator (PINO) is a machine learning architecture that has shown promising empirical results for learning partial differential equations. PINO uses the Fourier neural operator (FNO) architecture to overcome the optimization challenges often faced by physics-informed neural networks. Since the convolution operator in PINO uses the Fourier series representation, its grad…

    Submitted 29 November, 2022; originally announced November 2022.

  19. arXiv:2211.15188  [pdf, other]

    cs.LG

    Incremental Spatial and Spectral Learning of Neural Operators for Solving Large-Scale PDEs

    Authors: Robert Joseph George, Jiawei Zhao, Jean Kossaifi, Zongyi Li, Anima Anandkumar

    Abstract: Fourier Neural Operators (FNO) offer a principled approach to solving challenging partial differential equations (PDE) such as turbulent flows. At the core of FNO is a spectral layer that leverages a discretization-convergent representation in the Fourier domain, and learns weights over a fixed set of frequencies. However, training FNO presents two significant challenges, particularly in large-sca…

    Submitted 4 March, 2024; v1 submitted 28 November, 2022; originally announced November 2022.

  20. arXiv:2211.15044  [pdf, other]

    eess.SY cs.LG

    Machine Learning Accelerated PDE Backstepping Observers

    Authors: Yuanyuan Shi, Zongyi Li, Huan Yu, Drew Steeves, Anima Anandkumar, Miroslav Krstic

    Abstract: State estimation is important for a variety of tasks, from forecasting to substituting for unmeasured states in feedback controllers. Performing real-time state estimation for PDEs using provably and rapidly converging observers, such as those based on PDE backstepping, is computationally expensive and in many cases prohibitive. We propose a framework for accelerating PDE observer computations usi…

    Submitted 27 November, 2022; originally announced November 2022.

    Comments: Accepted to the 61st IEEE Conference on Decision and Control (CDC), 2022

  21. arXiv:2211.13449  [pdf, other]

    cs.LG cs.CV

    Fast Sampling of Diffusion Models via Operator Learning

    Authors: Hongkai Zheng, Weili Nie, Arash Vahdat, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: Diffusion models have found widespread adoption in various areas. However, their sampling process is slow because it requires hundreds to thousands of network evaluations to emulate a continuous process defined by differential equations. In this work, we use neural operators, an efficient method to solve the probability flow differential equations, to accelerate the sampling process of diffusion m…

    Submitted 22 July, 2023; v1 submitted 24 November, 2022; originally announced November 2022.

  22. arXiv:2211.11798  [pdf, other]

    cs.CL cs.AI

    Can You Label Less by Using Out-of-Domain Data? Active & Transfer Learning with Few-shot Instructions

    Authors: Rafal Kocielnik, Sara Kangaslahti, Shrimai Prabhumoye, Meena Hari, R. Michael Alvarez, Anima Anandkumar

    Abstract: Labeling social-media data for custom dimensions of toxicity and social bias is challenging and labor-intensive. Existing transfer and active learning approaches meant to reduce annotation effort require fine-tuning, which suffers from over-fitting to noise and can cause domain shift with small sample sizes. In this work, we propose a novel Active Transfer Few-shot Instructions (ATF) approach whic…

    Submitted 21 November, 2022; originally announced November 2022.

    Comments: Accepted to NeurIPS Workshop on Transfer Learning for Natural Language Processing, 2022, New Orleans

  23. arXiv:2211.00322  [pdf, other]

    cs.LG cs.AI cs.CR

    DensePure: Understanding Diffusion Models towards Adversarial Robustness

    Authors: Chaowei Xiao, Zhongzhu Chen, Kun Jin, Jiongxiao Wang, Weili Nie, Mingyan Liu, Anima Anandkumar, Bo Li, Dawn Song

    Abstract: Diffusion models have been recently employed to improve certified robustness through the process of denoising. However, the theoretical understanding of why diffusion models are able to improve the certified robustness is still lacking, preventing from further improvement. In this study, we close this gap by analyzing the fundamental properties of diffusion models and establishing the conditions u…

    Submitted 1 November, 2022; originally announced November 2022.

  24. arXiv:2210.17051  [pdf, other]

    cs.LG physics.flu-dyn

    Real-time high-resolution CO$_2$ geological storage prediction using nested Fourier neural operators

    Authors: Gege Wen, Zongyi Li, Qirui Long, Kamyar Azizzadenesheli, Anima Anandkumar, Sally M. Benson

    Abstract: Carbon capture and storage (CCS) plays an essential role in global decarbonization. Scaling up CCS deployment requires accurate and high-resolution modeling of the storage reservoir pressure buildup and the gaseous plume migration. However, such modeling is very challenging at scale due to the high computational costs of existing numerical methods. This challenge leads to significant uncertainties…

    Submitted 1 June, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

    Journal ref: Energy & Environmental Science, 16(4), 1732-1741 (2023)

  25. arXiv:2210.15765  [pdf, other]

    cs.LG

    An Adversarial Active Sampling-based Data Augmentation Framework for Manufacturable Chip Design

    Authors: Mingjie Liu, Haoyu Yang, Zongyi Li, Kumara Sastry, Saumyadip Mukhopadhyay, Selim Dogru, Anima Anandkumar, David Z. Pan, Brucek Khailany, Haoxing Ren

    Abstract: Lithography modeling is a crucial problem in chip design to ensure a chip design mask is manufacturable. It requires rigorous simulations of optical and chemical models that are computationally expensive. Recent developments in machine learning have provided alternative solutions in replacing the time-consuming lithography simulations with deep neural networks. However, the considerable accuracy d…

    Submitted 27 October, 2022; originally announced October 2022.

  26. arXiv:2210.12852  [pdf, other]

    cs.CV

    1st Place Solution of The Robust Vision Challenge 2022 Semantic Segmentation Track

    Authors: Junfei Xiao, Zhichao Xu, Shiyi Lan, Zhiding Yu, Alan Yuille, Anima Anandkumar

    Abstract: This report describes the winning solution to the Robust Vision Challenge (RVC) semantic segmentation track at ECCV 2022. Our method adopts the FAN-B-Hybrid model as the encoder and uses SegFormer as the segmentation framework. The model is trained on a composite dataset consisting of images from 9 datasets (ADE20K, Cityscapes, Mapillary Vistas, ScanNet, VIPER, WildDash 2, IDD, BDD, and COCO) with…

    Submitted 7 November, 2022; v1 submitted 23 October, 2022; originally announced October 2022.

    Comments: The Winning Solution to The Robust Vision Challenge 2022 Semantic Segmentation Track

  27. arXiv:2210.06349  [pdf, other]

    cs.CL cs.AI

    Context Generation Improves Open Domain Question Answering

    Authors: Dan Su, Mostofa Patwary, Shrimai Prabhumoye, Peng Xu, Ryan Prenger, Mohammad Shoeybi, Pascale Fung, Anima Anandkumar, Bryan Catanzaro

    Abstract: Closed-book question answering (QA) requires a model to directly answer an open-domain question without access to any external knowledge. Prior work on closed-book QA either directly finetunes or prompts a pretrained language model (LM) to leverage the stored knowledge. However, they do not fully exploit the parameterized knowledge. To address this issue, we propose a two-stage, closed-book QA fra…

    Submitted 27 April, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: 8 pages; Accepted at EACL2023

  28. arXiv:2210.03094  [pdf, other]

    cs.RO cs.AI cs.LG

    VIMA: General Robot Manipulation with Multimodal Prompts

    Authors: Yunfan Jiang, Agrim Gupta, Zichen Zhang, Guanzhi Wang, Yongqiang Dou, Yanjun Chen, Li Fei-Fei, Anima Anandkumar, Yuke Zhu, Linxi Fan

    Abstract: Prompt-based learning has emerged as a successful paradigm in natural language processing, where a single general-purpose language model can be instructed to perform any task specified by input prompts. Yet task specification in robotics comes in various forms, such as imitating one-shot demonstrations, following language instructions, and reaching visual goals. They are often considered different…

    Submitted 28 May, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: ICML 2023 Camera-ready version. Project website: https://vimalabs.github.io/

  29. arXiv:2209.15171  [pdf, other]

    q-bio.QM cs.LG q-bio.BM

    State-specific protein-ligand complex structure prediction with a multi-scale deep generative model

    Authors: Zhuoran Qiao, Weili Nie, Arash Vahdat, Thomas F. Miller III, Anima Anandkumar

    Abstract: The binding complexes formed by proteins and small molecule ligands are ubiquitous and critical to life. Despite recent advancements in protein structure prediction, existing algorithms are so far unable to systematically predict the binding ligand structures along with their regulatory effects on protein folding. To address this discrepancy, we present NeuralPLexer, a computational approach that…

    Submitted 19 April, 2023; v1 submitted 29 September, 2022; originally announced September 2022.

    Comments: 19 pages, 5 figures, 1 table & Supplementary Information (18 pages, 2 figures, 7 tables, 12 algorithms); supersedes an earlier version arXiv:2209.15171v1 presented at the NeurIPS 2022 MLSB workshop as a contributed talk

  30. arXiv:2209.08744  [pdf, other]

    cs.LG cs.AI cs.CR

    AdvDO: Realistic Adversarial Attacks for Trajectory Prediction

    Authors: Yulong Cao, Chaowei Xiao, Anima Anandkumar, Danfei Xu, Marco Pavone

    Abstract: Trajectory prediction is essential for autonomous vehicles (AVs) to plan correct and safe driving behaviors. While many prior works aim to achieve higher prediction accuracy, few study the adversarial robustness of their methods. To bridge this gap, we propose to study the adversarial robustness of data-driven trajectory prediction systems. We devise an optimization-based adversarial attack framew…

    Submitted 18 September, 2022; originally announced September 2022.

    Comments: To appear in ECCV 2022

  31. arXiv:2209.07669  [pdf, other]

    eess.SY

    Stability Constrained Reinforcement Learning for Decentralized Real-Time Voltage Control

    Authors: Jie Feng, Yuanyuan Shi, Guannan Qu, Steven H. Low, Anima Anandkumar, Adam Wierman

    Abstract: Deep reinforcement learning has been recognized as a promising tool to address the challenges in real-time control of power systems. However, its deployment in real-world power systems has been hindered by a lack of explicit stability and safety guarantees. In this paper, we propose a stability-constrained reinforcement learning (RL) method for real-time voltage control, that guarantees system sta…

    Submitted 3 October, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: This paper is accepted by TCNS. arXiv admin note: text overlap with arXiv:2109.14854

  32. arXiv:2209.07511  [pdf, other]

    cs.CV

    Test-Time Prompt Tuning for Zero-Shot Generalization in Vision-Language Models

    Authors: Manli Shu, Weili Nie, De-An Huang, Zhiding Yu, Tom Goldstein, Anima Anandkumar, Chaowei Xiao

    Abstract: Pre-trained vision-language models (e.g., CLIP) have shown promising zero-shot generalization in many downstream tasks with properly designed text prompts. Instead of relying on hand-engineered prompts, recent works learn prompts using the training data from downstream tasks. While effective, training on domain-specific data reduces a model's generalization capability to unseen new domains. In thi…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: NeurIPS 2022

  33. arXiv:2208.11537  [pdf, other]

    cs.CV

    PeRFception: Perception using Radiance Fields

    Authors: Yoonwoo Jeong, Seungjoo Shin, Junha Lee, Christopher Choy, Animashree Anandkumar, Minsu Cho, Jaesik Park

    Abstract: The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner. This new representation can effectively convey the information of hundreds of high-resolution images in one compact format and allows photorealistic synthesis of novel views. In this work, using the variant of NeRF call…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: Project Page: https://postech-cvlab.github.io/PeRFception/

  34. arXiv:2208.11126  [pdf, other]

    q-bio.QM cs.LG

    Retrieval-based Controllable Molecule Generation

    Authors: Zichao Wang, Weili Nie, Zhuoran Qiao, Chaowei Xiao, Richard Baraniuk, Anima Anandkumar

    Abstract: Generating new molecules with specified chemical and biological properties via generative models has emerged as a promising direction for drug discovery. However, existing methods require extensive training/fine-tuning with a large dataset, often unavailable in real-world generation tasks. In this work, we propose a new retrieval-based framework for controllable molecule generation. We use a small…

    Submitted 24 April, 2023; v1 submitted 23 August, 2022; originally announced August 2022.

    Comments: ICLR 2023

  35. arXiv:2208.05419  [pdf, ps, other]

    physics.ao-ph cs.AI cs.CV cs.LG cs.PF

    FourCastNet: Accelerating Global High-Resolution Weather Forecasting using Adaptive Fourier Neural Operators

    Authors: Thorsten Kurth, Shashank Subramanian, Peter Harrington, Jaideep Pathak, Morteza Mardani, David Hall, Andrea Miele, Karthik Kashinath, Animashree Anandkumar

    Abstract: Extreme weather amplified by climate change is causing increasingly devastating impacts across the globe. The current use of physics-based numerical weather prediction (NWP) limits accuracy due to high computational cost and strict time-to-solution limits. We report that a data-driven deep learning Earth system emulator, FourCastNet, can predict global weather and generate medium-range forecasts f…

    Submitted 8 August, 2022; originally announced August 2022.

  36. arXiv:2208.02245  [pdf, other]

    cs.CV cs.AI

    MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training

    Authors: De-An Huang, Zhiding Yu, Anima Anandkumar

    Abstract: We propose MinVIS, a minimal video instance segmentation (VIS) framework that achieves state-of-the-art VIS performance with neither video-based architectures nor training procedures. By only training a query-based image instance segmentation model, MinVIS outperforms the previous best result on the challenging Occluded VIS dataset by over 10% AP. Since MinVIS treats frames in training videos as i…

    Submitted 3 August, 2022; originally announced August 2022.

  37. arXiv:2208.00094  [pdf, other]

    cs.LG cs.AI cs.CR cs.CV

    Robust Trajectory Prediction against Adversarial Attacks

    Authors: Yulong Cao, Danfei Xu, Xinshuo Weng, Zhuoqing Mao, Anima Anandkumar, Chaowei Xiao, Marco Pavone

    Abstract: Trajectory prediction using deep neural networks (DNNs) is an essential component of autonomous driving (AD) systems. However, these methods are vulnerable to adversarial attacks, leading to serious consequences such as collisions. In this work, we identify two key ingredients to defend trajectory prediction models against adversarial attacks including (1) designing effective adversarial training…

    Submitted 29 July, 2022; originally announced August 2022.

  38. Fourier Neural Operator with Learned Deformations for PDEs on General Geometries

    Authors: Zongyi Li, Daniel Zhengyu Huang, Burigede Liu, Anima Anandkumar

    Abstract: Deep learning surrogate models have shown promise in solving partial differential equations (PDEs). Among them, the Fourier neural operator (FNO) achieves good accuracy, and is significantly faster compared to numerical solvers, on a variety of PDEs, such as fluid flows. However, the FNO uses the Fast Fourier transform (FFT), which is limited to rectangular domains with uniform grids. In this work…

    Submitted 2 May, 2024; v1 submitted 11 July, 2022; originally announced July 2022.

    Journal ref: Journal of Machine Learning Research (2023) Volume 24, Issue 1, Article No. 388, pp 18593-18618

  39. arXiv:2207.04056  [pdf, other]

    cs.LG cs.AI

    Large Scale Mask Optimization Via Convolutional Fourier Neural Operator and Litho-Guided Self Training

    Authors: Haoyu Yang, Zongyi Li, Kumara Sastry, Saumyadip Mukhopadhyay, Anima Anandkumar, Brucek Khailany, Vivek Singh, Haoxing Ren

    Abstract: Machine learning techniques have been extensively studied for mask optimization problems, aiming at better mask printability, shorter turnaround time, better mask manufacturability, and so on. However, most of these researches are focusing on the initial solution generation of small design regions. To further realize the potential of machine learning techniques on mask optimization tasks, we prese…

    Submitted 8 July, 2022; originally announced July 2022.

    Comments: 9 pages, 10 figures, in preparation for journal submission

    ACM Class: J.6; B.7.2

  40. Quantum Goemans-Williamson Algorithm with the Hadamard Test and Approximate Amplitude Constraints

    Authors: Taylor L. Patti, Jean Kossaifi, Anima Anandkumar, Susanne F. Yelin

    Abstract: Semidefinite programs are optimization methods with a wide array of applications, such as approximating difficult combinatorial problems. One such semidefinite program is the Goemans-Williamson algorithm, a popular integer relaxation technique. We introduce a variational quantum algorithm for the Goemans-Williamson algorithm that uses only $n{+}1$ qubits, a constant number of circuit preparations,…

    Submitted 8 July, 2023; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: 21 pages, 6 figures. Updated files to the version of manuscript accepted by Quantum

    Journal ref: Quantum 7, 1057 (2023)

  41. arXiv:2206.11254  [pdf, other]

    cs.LG stat.ML

    Langevin Monte Carlo for Contextual Bandits

    Authors: Pan Xu, Hongkai Zheng, Eric Mazumdar, Kamyar Azizzadenesheli, Anima Anandkumar

    Abstract: We study the efficiency of Thompson sampling for contextual bandits. Existing Thompson sampling-based algorithms need to construct a Laplace approximation (i.e., a Gaussian distribution) of the posterior distribution, which is inefficient to sample in high dimensional applications for general covariance matrices. Moreover, the Gaussian approximation may not be a good surrogate for the posterior di…

    Submitted 22 June, 2022; originally announced June 2022.

    Comments: 21 pages, 3 figures, 2 tables. To appear in the proceedings of the 39th International Conference on Machine Learning (ICML2022)

  42. arXiv:2206.08853  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge

    Authors: Linxi Fan, Guanzhi Wang, Yunfan Jiang, Ajay Mandlekar, Yuncong Yang, Haoyi Zhu, Andrew Tang, De-An Huang, Yuke Zhu, Anima Anandkumar

    Abstract: Autonomous agents have made great strides in specialist domains like Atari games and Go. However, they typically learn tabula rasa in isolated environments with limited and manually conceived objectives, thus failing to generalize across a wide spectrum of tasks and capabilities. Inspired by how humans continually learn and adapt in the open world, we advocate a trinity of ingredients for building…

    Submitted 22 November, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: Outstanding Paper Award at NeurIPS 2022. Project website: https://minedojo.org

  43. arXiv:2206.08520  [pdf, ps, other]

    cs.LG eess.SY math.OC stat.ML

    Thompson Sampling Achieves $\tilde O(\sqrt{T})$ Regret in Linear Quadratic Control

    Authors: Taylan Kargin, Sahin Lale, Kamyar Azizzadenesheli, Anima Anandkumar, Babak Hassibi

    Abstract: Thompson Sampling (TS) is an efficient method for decision-making under uncertainty, where an action is sampled from a carefully prescribed distribution which is updated based on the observed data. In this work, we study the problem of adaptive control of stabilizable linear-quadratic regulators (LQRs) using TS, where the system dynamics are unknown. Previous works have established that…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: Accepted for presentation at the Conference on Learning Theory (COLT) 2022

  44. arXiv:2206.08077  [pdf, other]

    cs.RO cs.AI cs.CV cs.LG

    Neural Scene Representation for Locomotion on Structured Terrain

    Authors: David Hoeller, Nikita Rudin, Christopher Choy, Animashree Anandkumar, Marco Hutter

    Abstract: We propose a learning-based method to reconstruct the local terrain for locomotion with a mobile robot traversing urban environments. Using a stream of depth measurements from the onboard cameras and the robot's trajectory, the algorithm estimates the topography in the robot's vicinity. The raw measurements from these cameras are noisy and only provide partial and occluded observations that in man…

    Submitted 16 June, 2022; originally announced June 2022.

  45. arXiv:2206.03520  [pdf, ps, other]

    stat.ML cs.LG

    Finite-Time Regret of Thompson Sampling Algorithms for Exponential Family Multi-Armed Bandits

    Authors: Tianyuan Jin, Pan Xu, Xiaokui Xiao, Anima Anandkumar

    Abstract: We study the regret of Thompson sampling (TS) algorithms for exponential family bandits, where the reward distribution is from a one-dimensional exponential family, which covers many common reward distributions including Bernoulli, Gaussian, Gamma, Exponential, etc. We propose a Thompson sampling algorithm, termed ExpTS, which uses a novel sampling distribution to avoid the under-estimation of the…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: 49 pages

  46. arXiv:2206.01704  [pdf, ps, other]

    cs.LG eess.SY math.OC stat.ML

    KCRL: Krasovskii-Constrained Reinforcement Learning with Guaranteed Stability in Nonlinear Dynamical Systems

    Authors: Sahin Lale, Yuanyuan Shi, Guannan Qu, Kamyar Azizzadenesheli, Adam Wierman, Anima Anandkumar

    Abstract: Learning a dynamical system requires stabilizing the unknown dynamics to avoid state blow-ups. However, current reinforcement learning (RL) methods lack stabilization guarantees, which limits their applicability for the control of safety-critical systems. We propose a model-based RL framework with formal stability guarantees, Krasovskii Constrained RL (KCRL), that adopts Krasovskii's family of Lya…

    Submitted 3 June, 2022; originally announced June 2022.

  47. arXiv:2205.13803  [pdf, other]

    cs.CV cs.AI cs.LG

    Bongard-HOI: Benchmarking Few-Shot Visual Reasoning for Human-Object Interactions

    Authors: Huaizu Jiang, Xiaojian Ma, Weili Nie, Zhiding Yu, Yuke Zhu, Song-Chun Zhu, Anima Anandkumar

    Abstract: A significant gap remains between today's visual pattern recognition models and human-level visual cognition especially when it comes to few-shot learning and compositional reasoning of novel concepts. We introduce Bongard-HOI, a new visual reasoning benchmark that focuses on compositional learning of human-object interactions (HOIs) from natural images. It is inspired by two desirable characteris…

    Submitted 13 April, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: CVPR 2022 (oral); First two authors contributed equally; Code: https://github.com/NVlabs/Bongard-HOI

  48. arXiv:2205.07460  [pdf, other]

    cs.LG cs.CR cs.CV

    Diffusion Models for Adversarial Purification

    Authors: Weili Nie, Brandon Guo, Yujia Huang, Chaowei Xiao, Arash Vahdat, Anima Anandkumar

    Abstract: Adversarial purification refers to a class of defense methods that remove adversarial perturbations using a generative model. These methods do not make assumptions on the form of attack and the classification model, and thus can defend pre-existing classifiers against unseen threats. However, their performance currently falls behind adversarial training methods. In this work, we propose DiffPure t…

    Submitted 16 May, 2022; originally announced May 2022.

    Comments: ICML 2022

  49. arXiv:2205.06908  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Neural-Fly Enables Rapid Learning for Agile Flight in Strong Winds

    Authors: Michael O'Connell, Guanya Shi, Xichen Shi, Kamyar Azizzadenesheli, Anima Anandkumar, Yisong Yue, Soon-Jo Chung

    Abstract: Executing safe and precise flight maneuvers in dynamic high-speed winds is important for the ongoing commoditization of uninhabited aerial vehicles (UAVs). However, because the relationship between various wind conditions and its effect on aircraft maneuverability is not well understood, it is challenging to design effective robot controllers using traditional control design methods. We present Ne…

    Submitted 11 April, 2024; v1 submitted 13 May, 2022; originally announced May 2022.

    Comments: This is the accepted version of Science Robotics Vol. 7, Issue 66, eabm6597 (2022). Video: https://youtu.be/TuF9teCZX0U

  50. arXiv:2205.03028  [pdf, other]

    cs.RO cs.CV cs.LG

    Quantification of Robotic Surgeries with Vision-Based Deep Learning

    Authors: Dani Kiyasseh, Runzhuo Ma, Taseen F. Haque, Jessica Nguyen, Christian Wagner, Animashree Anandkumar, Andrew J. Hung

    Abstract: Surgery is a high-stakes domain where surgeons must navigate critical anatomical structures and actively avoid potential complications while achieving the main task at hand. Such surgical activity has been shown to affect long-term patient outcomes. To better understand this relationship, whose mechanics remain unknown for the majority of surgical procedures, we hypothesize that the core elements…

    Submitted 6 May, 2022; originally announced May 2022.