Showing 1–50 of 86 results for author: Norouzi, M

  1. arXiv:2405.02422  [pdf, other]

    eess.SP

    Precision Enhancement in Sustained Visual Attention Training Platforms: Offline EEG Signal Analysis for Classifier Fine-Tuning

    Authors: Maryam Norouzi, Mohammad Zaeri Amirani, Yalda Shahriari, Reza Abiri

    Abstract: In this study, a novel open-source brain-computer interface (BCI) platform was developed to decode scalp electroencephalography (EEG) signals associated with sustained attention. The EEG signal collection was conducted using a wireless headset during a sustained visual attention task, where participants were instructed to discriminate between composite images superimposed with scenes and faces, re…

    Submitted 7 May, 2024; v1 submitted 3 May, 2024; originally announced May 2024.

    Comments: 5 pages, 3 figures, 18 references, EMBC conference

  2. arXiv:2310.17214  [pdf, other]

    physics.flu-dyn

    Towards a phase-field based model for combustion in particle beds: Reactive fluid flow

    Authors: Reza Namdar, Mohammad Norouzi, Fathollah Varnik

    Abstract: The present study provides a systematic derivation of a phase-field version of the momentum, mass and heat transport equations, while accounting for chemical reactions in the fluid phase. To achieve this goal, the volume averaging technique is used to reformulate the conservation equations in the presence of multiple phases and their respective diffuse interfaces. A careful analysis, and neglecting…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: 39 pages, 19 figures

    MSC Class: 0000; 1111

  3. arXiv:2307.02960  [pdf, other]

    physics.flu-dyn math.DS

    Parametric 3D Convolutional Autoencoder for the Prediction of Flow Fields in a Bed Configuration of Hot Particles

    Authors: Ali Mjalled, Reza Namdar, Lucas Reineking, Mohammad Norouzi, Fathollah Varnik, Martin Mönnigmann

    Abstract: The use of deep learning methods for modeling fluid flow has drawn a lot of attention in the past few years. In situations where conventional numerical approaches can be computationally expensive, these techniques have shown promise in offering accurate, rapid, and practical solutions for modeling complex fluid flow problems. The success of deep learning is often due to its ability to extract hidd…

    Submitted 12 February, 2024; v1 submitted 6 July, 2023; originally announced July 2023.

  4. arXiv:2306.08276  [pdf, other]

    cs.CV cs.GR

    TryOnDiffusion: A Tale of Two UNets

    Authors: Luyang Zhu, Dawei Yang, Tyler Zhu, Fitsum Reda, William Chan, Chitwan Saharia, Mohammad Norouzi, Ira Kemelmacher-Shlizerman

    Abstract: Given two images depicting a person and a garment worn by another person, our goal is to generate a visualization of how the garment might look on the input person. A key challenge is to synthesize a photorealistic detail-preserving visualization of the garment, while warping the garment to accommodate a significant body pose and shape change across the subjects. Previous methods either focus on g…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: CVPR 2023. Project page: https://tryondiffusion.github.io/

  5. arXiv:2306.01923  [pdf, other]

    cs.CV

    The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation

    Authors: Saurabh Saxena, Charles Herrmann, Junhwa Hur, Abhishek Kar, Mohammad Norouzi, Deqing Sun, David J. Fleet

    Abstract: Denoising diffusion probabilistic models have transformed image generation with their impressive fidelity and diversity. We show that they also excel in estimating optical flow and monocular depth, surprisingly, without task-specific architectures and loss functions that are predominant for these tasks. Compared to the point estimates of conventional regression-based methods, diffusion models also…

    Submitted 5 December, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 (Oral)

  6. arXiv:2304.08466  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Synthetic Data from Diffusion Models Improves ImageNet Classification

    Authors: Shekoofeh Azizi, Simon Kornblith, Chitwan Saharia, Mohammad Norouzi, David J. Fleet

    Abstract: Deep generative models are becoming increasingly powerful, now generating diverse high fidelity photo-realistic samples given text prompts. Have they reached the point where models of natural images can be used for generative data augmentation, helping to improve challenging discriminative tasks? We show that large-scale text-to-image diffusion models can be fine-tuned to produce class conditional…

    Submitted 17 April, 2023; originally announced April 2023.

  7. arXiv:2302.14816  [pdf, other]

    cs.CV

    Monocular Depth Estimation using Diffusion Models

    Authors: Saurabh Saxena, Abhishek Kar, Mohammad Norouzi, David J. Fleet

    Abstract: We formulate monocular depth estimation using denoising diffusion models, inspired by their recent successes in high fidelity image generation. To that end, we introduce innovations to address problems arising due to noisy, incomplete depth maps in training data, including step-unrolled denoising diffusion, an $L_1$ loss, and depth infilling during training. To cope with the limited availability o…

    Submitted 28 February, 2023; originally announced February 2023.

  8. An RFID-Based Assistive Glove to Help the Visually Impaired

    Authors: Paniz Sedighi, Mohammad Hesam Norouzi, Mehdi Delrobaei

    Abstract: Recent studies have focused on facilitating perception and outdoor navigation for people with blindness or some form of vision loss. However, a significant portion of these studies is centered around treatment and vision rehabilitation, leaving some immediate needs, such as interaction with the surrounding objects or recognizing colors and fine patterns without tactile feedback. This study targets…

    Submitted 21 December, 2022; originally announced December 2022.

    ACM Class: J.2

    Journal ref: IEEE Transactions on Instrumentation and Measurement 70 (2021): 1-9

  9. arXiv:2212.10562  [pdf, other]

    cs.CL cs.CV

    Character-Aware Models Improve Visual Text Rendering

    Authors: Rosanne Liu, Dan Garrette, Chitwan Saharia, William Chan, Adam Roberts, Sharan Narang, Irina Blok, RJ Mical, Mohammad Norouzi, Noah Constant

    Abstract: Current image generation models struggle to reliably produce well-formed visual text. In this paper, we investigate a key contributing factor: popular text-to-image models lack character-level input features, making it much harder to predict a word's visual makeup as a series of glyphs. To quantify this effect, we conduct a series of experiments comparing character-aware vs. character-blind text e…

    Submitted 3 May, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

  10. arXiv:2212.06909  [pdf, other]

    cs.CV cs.AI

    Imagen Editor and EditBench: Advancing and Evaluating Text-Guided Image Inpainting

    Authors: Su Wang, Chitwan Saharia, Ceslee Montgomery, Jordi Pont-Tuset, Shai Noy, Stefano Pellegrini, Yasumasa Onoe, Sarah Laszlo, David J. Fleet, Radu Soricut, Jason Baldridge, Mohammad Norouzi, Peter Anderson, William Chan

    Abstract: Text-guided image editing can have a transformative impact in supporting creative applications. A key challenge is to generate edits that are faithful to input text prompts, while consistent with input images. We present Imagen Editor, a cascaded diffusion model built by fine-tuning Imagen on text-guided image inpainting. Imagen Editor's edits are faithful to the text prompts, which is accomplish…

    Submitted 12 April, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

    Comments: CVPR 2023 Camera Ready

  11. arXiv:2212.02475  [pdf, other]

    cs.CL

    Meta-Learning Fast Weight Language Models

    Authors: Kevin Clark, Kelvin Guu, Ming-Wei Chang, Panupong Pasupat, Geoffrey Hinton, Mohammad Norouzi

    Abstract: Dynamic evaluation of language models (LMs) adapts model parameters at test time using gradient information from previous tokens and substantially improves LM performance. However, it requires over 3x more compute than standard inference. We present Fast Weight Layers (FWLs), a neural component that provides the benefits of dynamic evaluation much more efficiently by expressing gradient updates as…

    Submitted 5 December, 2022; originally announced December 2022.

    Comments: EMNLP 2022 short paper

  12. arXiv:2210.04628  [pdf, other]

    cs.CV cs.GR cs.LG

    Novel View Synthesis with Diffusion Models

    Authors: Daniel Watson, William Chan, Ricardo Martin-Brualla, Jonathan Ho, Andrea Tagliasacchi, Mohammad Norouzi

    Abstract: We present 3DiM, a diffusion model for 3D novel view synthesis, which is able to translate a single input view into consistent and sharp completions across many views. The core component of 3DiM is a pose-conditional image-to-image diffusion model, which takes a source view and its pose as inputs, and generates a novel view for a target pose as output. 3DiM can generate multiple views that are 3D…

    Submitted 6 October, 2022; originally announced October 2022.

  13. arXiv:2210.02303  [pdf, other]

    cs.CV cs.LG

    Imagen Video: High Definition Video Generation with Diffusion Models

    Authors: Jonathan Ho, William Chan, Chitwan Saharia, Jay Whang, Ruiqi Gao, Alexey Gritsenko, Diederik P. Kingma, Ben Poole, Mohammad Norouzi, David J. Fleet, Tim Salimans

    Abstract: We present Imagen Video, a text-conditional video generation system based on a cascade of video diffusion models. Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models. We describe how we scale up the system as a high definition text-to-video model including design deci…

    Submitted 5 October, 2022; originally announced October 2022.

    Comments: See accompanying website: https://imagen.research.google/video/

  14. arXiv:2205.11487  [pdf, other]

    cs.CV cs.LG

    Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding

    Authors: Chitwan Saharia, William Chan, Saurabh Saxena, Lala Li, Jay Whang, Emily Denton, Seyed Kamyar Seyed Ghasemipour, Burcu Karagol Ayan, S. Sara Mahdavi, Rapha Gontijo Lopes, Tim Salimans, Jonathan Ho, David J Fleet, Mohammad Norouzi

    Abstract: We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large transformer language models in understanding text and hinges on the strength of diffusion models in high-fidelity image generation. Our key discovery is that generic large language models (e.g. T5), pretrained on text-only c…

    Submitted 23 May, 2022; originally announced May 2022.

  15. arXiv:2205.11423  [pdf, other]

    cs.CV

    Decoder Denoising Pretraining for Semantic Segmentation

    Authors: Emmanuel Brempong Asiedu, Simon Kornblith, Ting Chen, Niki Parmar, Matthias Minderer, Mohammad Norouzi

    Abstract: Semantic segmentation labels are expensive and time consuming to acquire. Hence, pretraining is commonly used to improve the label-efficiency of segmentation models. Typically, the encoder of a segmentation model is pretrained as a classifier and the decoder is randomly initialized. Here, we argue that random initialization of the decoder can be suboptimal, especially when few labeled examples are…

    Submitted 23 May, 2022; originally announced May 2022.

    ACM Class: I.4.6; I.5.4; I.2.10

  16. arXiv:2205.09723  [pdf, other]

    cs.CV cs.AI cs.LG

    Robust and Efficient Medical Imaging with Self-Supervision

    Authors: Shekoofeh Azizi, Laura Culp, Jan Freyberg, Basil Mustafa, Sebastien Baur, Simon Kornblith, Ting Chen, Patricia MacWilliams, S. Sara Mahdavi, Ellery Wulczyn, Boris Babenko, Megan Wilson, Aaron Loh, Po-Hsuan Cameron Chen, Yuan Liu, Pinal Bavishi, Scott Mayer McKinney, Jim Winkens, Abhijit Guha Roy, Zach Beaver, Fiona Ryan, Justin Krogue, Mozziyar Etemadi, Umesh Telang, Yun Liu, et al. (9 additional authors not shown)

    Abstract: Recent progress in Medical Artificial Intelligence (AI) has delivered systems that can reach clinical expert level performance. However, such systems tend to demonstrate sub-optimal "out-of-distribution" performance when evaluated in clinical settings different from the training environment. A common mitigation strategy is to develop separate systems for each clinical setting using site-specific d…

    Submitted 3 July, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

  17. arXiv:2205.06344  [pdf]

    quant-ph

    Engineered Josephson Parametric Amplifier in quantum two-modes squeezed radar

    Authors: Seyed Mohammad Hosseiny, Milad Norouzi, Jamileh Seyed-Yazdi, Mohammad Hossein Ghamat

    Abstract: Josephson parametric amplifier (JPA) engineering is a significant component in the quantum two-mode squeezed radar (QTMS), to enhance, for instance, radar performance and the detection range or bandwidth. In this study, we apply quantum theory to a research domain focusing on the design of QTMS radar. We apply engineered JPA (EJPA) to enhance the performance of a quantum radar (QR). We investigate th…

    Submitted 12 May, 2022; originally announced May 2022.

    Comments: 27 pages, 10 figures, journal article

  18. arXiv:2204.03458  [pdf, other]

    cs.CV cs.AI cs.LG

    Video Diffusion Models

    Authors: Jonathan Ho, Tim Salimans, Alexey Gritsenko, William Chan, Mohammad Norouzi, David J. Fleet

    Abstract: Generating temporally coherent high fidelity video is an important milestone in generative modeling research. We make progress towards this milestone by proposing a diffusion model for video generation that shows very promising initial results. Our model is a natural extension of the standard image diffusion architecture, and it enables jointly training from image and video data, which we find to…

    Submitted 22 June, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

  19. arXiv:2202.05830  [pdf, other]

    cs.LG

    Learning Fast Samplers for Diffusion Models by Differentiating Through Sample Quality

    Authors: Daniel Watson, William Chan, Jonathan Ho, Mohammad Norouzi

    Abstract: Diffusion models have emerged as an expressive family of generative models rivaling GANs in sample quality and autoregressive models in likelihood scores. Standard diffusion models typically require hundreds of forward passes through the model to generate a single high-fidelity sample. We introduce Differentiable Diffusion Sampler Search (DDSS): a method that optimizes fast samplers for any pre-tr…

    Submitted 11 February, 2022; originally announced February 2022.

    Comments: Published as a conference paper at ICLR 2022

  20. arXiv:2111.05826  [pdf, other]

    cs.CV cs.LG

    Palette: Image-to-Image Diffusion Models

    Authors: Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, Mohammad Norouzi

    Abstract: This paper develops a unified framework for image-to-image translation based on conditional diffusion models and evaluates this framework on four challenging image-to-image translation tasks, namely colorization, inpainting, uncropping, and JPEG restoration. Our simple implementation of image-to-image diffusion models outperforms strong GAN and regression baselines on all tasks, without task-speci…

    Submitted 3 May, 2022; v1 submitted 10 November, 2021; originally announced November 2021.

  21. arXiv:2109.12444  [pdf, ps, other]

    math.RA

    Solvable Lie algebras derived from Lie hyperalgebras

    Authors: Hesam Safa, Morteza Norouzi

    Abstract: Recently in \cite{s-n}, we have investigated Lie algebras and abelian Lie algebras derived from Lie hyperalgebras using the fundamental relations $\mathcal{L}$ and $\mathcal{A}$, respectively. In the present paper, continuing this method we obtain solvable Lie algebras from Lie hyperalgebras by $\mathcal{S}_n$-relations. We show that $\bigcap_{n\geq 1}\mathcal{S}^*_n$ is the smallest equivalence r…

    Submitted 25 September, 2021; originally announced September 2021.

    MSC Class: 17B60; 17B99; 20N20

  22. arXiv:2106.15282  [pdf, other]

    cs.CV cs.AI cs.LG

    Cascaded Diffusion Models for High Fidelity Image Generation

    Authors: Jonathan Ho, Chitwan Saharia, William Chan, David J. Fleet, Mohammad Norouzi, Tim Salimans

    Abstract: We show that cascaded diffusion models are capable of generating high fidelity images on the class-conditional ImageNet generation benchmark, without any assistance from auxiliary image classifiers to boost sample quality. A cascaded diffusion model comprises a pipeline of multiple diffusion models that generate images of increasing resolution, beginning with a standard diffusion model at the lowe…

    Submitted 17 December, 2021; v1 submitted 30 May, 2021; originally announced June 2021.

  23. arXiv:2106.09660  [pdf, ps, other]

    eess.AS cs.LG cs.SD

    WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

    Authors: Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, Najim Dehak, William Chan

    Abstract: This paper introduces WaveGrad 2, a non-autoregressive generative model for text-to-speech synthesis. WaveGrad 2 is trained to estimate the gradient of the log conditional density of the waveform given a phoneme sequence. The model takes an input phoneme sequence, and through an iterative refinement process, generates an audio waveform. This contrasts to the original WaveGrad vocoder which conditi…

    Submitted 18 June, 2021; v1 submitted 17 June, 2021; originally announced June 2021.

    Comments: Proceedings of INTERSPEECH

  24. arXiv:2106.06168  [pdf, other]

    cs.LG

    Generate, Annotate, and Learn: NLP with Synthetic Text

    Authors: Xuanli He, Islam Nassar, Jamie Kiros, Gholamreza Haffari, Mohammad Norouzi

    Abstract: This paper studies the use of language models as a source of synthetic unlabeled text for NLP. We formulate a general framework called "generate, annotate, and learn (GAL)" to take advantage of synthetic text within knowledge distillation, self-training, and few-shot learning applications. To generate high-quality task-specific text, we either fine-tune LMs on inputs from the task of interest, o…

    Submitted 31 May, 2022; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: accepted to TACL2022

  25. arXiv:2106.03802  [pdf, other]

    cs.LG

    Learning to Efficiently Sample from Diffusion Probabilistic Models

    Authors: Daniel Watson, Jonathan Ho, Mohammad Norouzi, William Chan

    Abstract: Denoising Diffusion Probabilistic Models (DDPMs) have emerged as a powerful family of generative models that can yield high-fidelity samples and competitive log-likelihoods across a range of domains, including image and speech synthesis. Key advantages of DDPMs include ease of training, in contrast to generative adversarial networks, and speed of generation, in contrast to autoregressive models. H…

    Submitted 7 June, 2021; originally announced June 2021.

  26. arXiv:2106.02819  [pdf]

    physics.flu-dyn

    Droplets with circular stagnation lines: combined effects of viscoelastic and inertial forces on drop shapes

    Authors: A. Emamian, M. Norouzi, M. Davoodi

    Abstract: Hydrodynamic problems with stagnation points are of particular importance in fluid mechanics as they allow study and investigation of elongational flows. In this article, the uniaxial elongational flow appearing at the surface of a viscoelastic drop and its role on the deformation of the droplet at low inertial regimes is studied. In studies related to viscoelastic droplets falling/rising in an i…

    Submitted 5 June, 2021; originally announced June 2021.

  27. arXiv:2104.13877  [pdf, other]

    cs.LG cs.AI stat.ML

    Autoregressive Dynamics Models for Offline Policy Evaluation and Optimization

    Authors: Michael R. Zhang, Tom Le Paine, Ofir Nachum, Cosmin Paduraru, George Tucker, Ziyu Wang, Mohammad Norouzi

    Abstract: Standard dynamics models for continuous control make use of feedforward computation to predict the conditional distribution of next state and reward given current state and action using a multivariate Gaussian with a diagonal covariance structure. This modeling choice assumes that different dimensions of the next state and reward are conditionally independent given the current state and action and…

    Submitted 28 April, 2021; originally announced April 2021.

    Comments: ICLR 2021. 17 pages

  28. arXiv:2104.07636  [pdf, other]

    eess.IV cs.CV cs.LG

    Image Super-Resolution via Iterative Refinement

    Authors: Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J. Fleet, Mohammad Norouzi

    Abstract: We present SR3, an approach to image Super-Resolution via Repeated Refinement. SR3 adapts denoising diffusion probabilistic models to conditional image generation and performs super-resolution through a stochastic denoising process. Inference starts with pure Gaussian noise and iteratively refines the noisy output using a U-Net model trained on denoising at various noise levels. SR3 exhibits stron…

    Submitted 30 June, 2021; v1 submitted 15 April, 2021; originally announced April 2021.

  29. arXiv:2104.02133  [pdf, ps, other]

    cs.CL cs.LG

    SpeechStew: Simply Mix All Available Speech Recognition Data to Train One Large Neural Network

    Authors: William Chan, Daniel Park, Chris Lee, Yu Zhang, Quoc Le, Mohammad Norouzi

    Abstract: We present SpeechStew, a speech recognition model that is trained on a combination of various publicly available speech recognition datasets: AMI, Broadcast News, Common Voice, LibriSpeech, Switchboard/Fisher, Tedlium, and Wall Street Journal. SpeechStew simply mixes all of these datasets together, without any special re-weighting or re-balancing of the datasets. SpeechStew achieves SoTA or near S…

    Submitted 27 April, 2021; v1 submitted 5 April, 2021; originally announced April 2021.

    Comments: submitted to INTERSPEECH

  30. arXiv:2103.16596  [pdf, other]

    cs.LG stat.ML

    Benchmarks for Deep Off-Policy Evaluation

    Authors: Justin Fu, Mohammad Norouzi, Ofir Nachum, George Tucker, Ziyu Wang, Alexander Novikov, Mengjiao Yang, Michael R. Zhang, Yutian Chen, Aviral Kumar, Cosmin Paduraru, Sergey Levine, Tom Le Paine

    Abstract: Off-policy evaluation (OPE) holds the promise of being able to leverage large, offline datasets for both evaluating and selecting complex policies for decision making. The ability to learn offline is particularly important in many real-world domains, such as in healthcare, recommender systems, or robotics, where online data collection is an expensive and potentially dangerous process. Being able t…

    Submitted 30 March, 2021; originally announced March 2021.

    Comments: ICLR 2021 paper. Policies and evaluation code are available at https://github.com/google-research/deep_ope

  31. arXiv:2101.05224  [pdf, other]

    eess.IV cs.CV cs.LG

    Big Self-Supervised Models Advance Medical Image Classification

    Authors: Shekoofeh Azizi, Basil Mustafa, Fiona Ryan, Zachary Beaver, Jan Freyberg, Jonathan Deaton, Aaron Loh, Alan Karthikesalingam, Simon Kornblith, Ting Chen, Vivek Natarajan, Mohammad Norouzi

    Abstract: Self-supervised pretraining followed by supervised fine-tuning has seen success in image recognition, especially when labeled examples are scarce, but has received limited attention in medical image analysis. This paper studies the effectiveness of self-supervised learning as a pretraining strategy for medical image classification. We conduct experiments on two distinct tasks: dermatology skin con…

    Submitted 1 April, 2021; v1 submitted 13 January, 2021; originally announced January 2021.

  32. arXiv:2010.16402  [pdf, other]

    cs.CV cs.LG

    Why Do Better Loss Functions Lead to Less Transferable Features?

    Authors: Simon Kornblith, Ting Chen, Honglak Lee, Mohammad Norouzi

    Abstract: Previous work has proposed many new loss functions and regularizers that improve test accuracy on image classification tasks. However, it is not clear whether these loss functions learn better representations for downstream tasks. This paper studies how the choice of training objective affects the transferability of the hidden representations of convolutional neural networks trained on ImageNet. W…

    Submitted 3 November, 2021; v1 submitted 30 October, 2020; originally announced October 2020.

    Comments: NeurIPS 2021

  33. arXiv:2010.04230  [pdf, other]

    cs.LG cs.AI

    No MCMC for me: Amortized sampling for fast and stable training of energy-based models

    Authors: Will Grathwohl, Jacob Kelly, Milad Hashemi, Mohammad Norouzi, Kevin Swersky, David Duvenaud

    Abstract: Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty. Despite recent advances, training EBMs on high-dimensional data remains a challenging problem as the state-of-the-art approaches are costly, unstable, and require considerable tuning and domain expertise to apply successfully. In this work, we present a simple method for training EBMs at scale which uses an e…

    Submitted 6 June, 2021; v1 submitted 8 October, 2020; originally announced October 2020.

  34. arXiv:2010.02193  [pdf, other]

    cs.LG cs.AI stat.ML

    Mastering Atari with Discrete World Models

    Authors: Danijar Hafner, Timothy Lillicrap, Mohammad Norouzi, Jimmy Ba

    Abstract: Intelligent agents need to generalize from past experience to achieve goals in complex environments. World models facilitate such generalization and allow learning behaviors from imagined outcomes to increase sample-efficiency. While learning world models from image inputs has recently become feasible for some tasks, modeling Atari games accurately enough to derive successful behaviors has remaine…

    Submitted 12 February, 2022; v1 submitted 5 October, 2020; originally announced October 2020.

    Comments: Published at ICLR 2021. Website: https://danijar.com/dreamerv2

  35. arXiv:2009.00713  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    WaveGrad: Estimating Gradients for Waveform Generation

    Authors: Nanxin Chen, Yu Zhang, Heiga Zen, Ron J. Weiss, Mohammad Norouzi, William Chan

    Abstract: This paper introduces WaveGrad, a conditional model for waveform generation which estimates gradients of the data density. The model is built on prior work on score matching and diffusion probabilistic models. It starts from a Gaussian white noise signal and iteratively refines the signal via a gradient-based sampler conditioned on the mel-spectrogram. WaveGrad offers a natural way to trade infere…

    Submitted 9 October, 2020; v1 submitted 2 September, 2020; originally announced September 2020.

  36. arXiv:2006.13888  [pdf, other]

    cs.LG stat.ML

    RL Unplugged: A Suite of Benchmarks for Offline Reinforcement Learning

    Authors: Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas

    Abstract: Offline methods for reinforcement learning have a potential to help bridge the gap between reinforcement learning research and real-world applications. They make it possible to learn policies from offline datasets, thus overcoming concerns associated with online data collection in the real-world, including cost, safety, or ethical concerns. In this paper, we propose a benchmark called RL Unplugged…

    Submitted 12 February, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: NeurIPS paper. 21 pages including supplementary material, the github link for the datasets: https://github.com/deepmind/deepmind-research/rl_unplugged

  37. arXiv:2006.10029  [pdf, other]

    cs.LG cs.CV stat.ML

    Big Self-Supervised Models are Strong Semi-Supervised Learners

    Authors: Ting Chen, Simon Kornblith, Kevin Swersky, Mohammad Norouzi, Geoffrey Hinton

    Abstract: One paradigm for learning from few labeled examples while making best use of a large amount of unlabeled data is unsupervised pretraining followed by supervised fine-tuning. Although this paradigm uses unlabeled data in a task-agnostic way, in contrast to common approaches to semi-supervised learning for computer vision, we show that it is surprisingly effective for semi-supervised learning on Ima…

    Submitted 25 October, 2020; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: NeurIPS'2020. Code and pretrained models at https://github.com/google-research/simclr

  38. arXiv:2005.06606  [pdf, other]

    cs.CL cs.LG

    Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation

    Authors: Xuanli He, Gholamreza Haffari, Mohammad Norouzi

    Abstract: This paper introduces Dynamic Programming Encoding (DPE), a new segmentation algorithm for tokenizing sentences into subword units. We view the subword segmentation of output sentences as a latent variable that should be marginalized out for learning and inference. A mixed character-subword transformer is proposed, which enables exact log marginal likelihood estimation and exact MAP inference to f…

    Submitted 1 August, 2020; v1 submitted 3 May, 2020; originally announced May 2020.

    Comments: update related work

  39. arXiv:2004.07437  [pdf, ps, other]

    cs.CL cs.LG

    Non-Autoregressive Machine Translation with Latent Alignments

    Authors: Chitwan Saharia, William Chan, Saurabh Saxena, Mohammad Norouzi

    Abstract: This paper presents two strong methods, CTC and Imputer, for non-autoregressive machine translation that model latent alignments with dynamic programming. We revisit CTC for machine translation and demonstrate that a simple CTC model can achieve state-of-the-art for single-step non-autoregressive machine translation, contrary to what prior work indicates. In addition, we adapt the Imputer model fo…

    Submitted 16 November, 2020; v1 submitted 15 April, 2020; originally announced April 2020.

  40. arXiv:2004.05980  [pdf, other]

    cs.GR cs.LG

    NiLBS: Neural Inverse Linear Blend Skinning

    Authors: Timothy Jeruzalski, David I. W. Levin, Alec Jacobson, Paul Lalonde, Mohammad Norouzi, Andrea Tagliasacchi

    Abstract: In this technical report, we investigate efficient representations of articulated objects (e.g. human bodies), which is an important problem in computer vision and graphics. To deform articulated geometry, existing approaches represent objects as meshes and deform them using "skinning" techniques. The skinning operation allows a wide range of deformations to be achieved with a small number of cont…

    Submitted 6 April, 2020; originally announced April 2020.

  41. arXiv:2004.04795  [pdf, other]

    cs.LG cs.CV stat.ML

    Exemplar VAE: Linking Generative Models, Nearest Neighbor Retrieval, and Data Augmentation

    Authors: Sajad Norouzi, David J. Fleet, Mohammad Norouzi

    Abstract: We introduce Exemplar VAEs, a family of generative models that bridge the gap between parametric and non-parametric, exemplar based generative models. Exemplar VAE is a variant of VAE with a non-parametric prior in the latent space based on a Parzen window estimator. To sample from it, one first draws a random exemplar from a training set, then stochastically transforms that exemplar into a latent…

    Submitted 24 November, 2020; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: NeurIPS 2020

  42. arXiv:2004.00353  [pdf, other]

    cs.LG stat.ML

    SUMO: Unbiased Estimation of Log Marginal Probability for Latent Variable Models

    Authors: Yucen Luo, Alex Beatson, Mohammad Norouzi, Jun Zhu, David Duvenaud, Ryan P. Adams, Ricky T. Q. Chen

    Abstract: Standard variational lower bounds used to train latent variable models produce biased estimates of most quantities of interest. We introduce an unbiased estimator of the log marginal likelihood and its gradients for latent variable models based on randomized truncation of infinite series. If parameterized by an encoder-decoder architecture, the parameters of the encoder can be optimized to minimiz…

    Submitted 10 July, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

    Comments: ICLR 2020

  43. arXiv:2002.08926  [pdf, ps, other]

    eess.AS cs.CL cs.LG cs.SD

    Imputer: Sequence Modelling via Imputation and Dynamic Programming

    Authors: William Chan, Chitwan Saharia, Geoffrey Hinton, Mohammad Norouzi, Navdeep Jaitly

    Abstract: This paper presents the Imputer, a neural sequence model that generates output sequences iteratively via imputations. The Imputer is an iterative generative model, requiring only a constant number of generation steps independent of the number of input or output tokens. The Imputer can be trained to approximately marginalize over all possible alignments between the input and output sequences, and a…

    Submitted 22 April, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

  44. arXiv:2002.05709  [pdf, other]

    cs.LG cs.CV stat.ML

    A Simple Framework for Contrastive Learning of Visual Representations

    Authors: Ting Chen, Simon Kornblith, Mohammad Norouzi, Geoffrey Hinton

    Abstract: This paper presents SimCLR: a simple framework for contrastive learning of visual representations. We simplify recently proposed contrastive self-supervised learning algorithms without requiring specialized architectures or a memory bank. In order to understand what enables the contrastive prediction tasks to learn useful representations, we systematically study the major components of our framewo…

    Submitted 30 June, 2020; v1 submitted 13 February, 2020; originally announced February 2020.

    Comments: ICML'2020. Code and pretrained models at https://github.com/google-research/simclr

  45. arXiv:1912.03263  [pdf, other]

    cs.LG cs.CV stat.ML

    Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

    Authors: Will Grathwohl, Kuan-Chieh Wang, Jörn-Henrik Jacobsen, David Duvenaud, Mohammad Norouzi, Kevin Swersky

    Abstract: We propose to reinterpret a standard discriminative classifier of p(y|x) as an energy based model for the joint distribution p(x,y). In this setting, the standard class probabilities can be easily computed as well as unnormalized values of p(x) and p(x|y). Within this framework, standard discriminative architectures may be used and the model can also be trained on unlabeled data. We demonstrate tha…

    Submitted 15 September, 2020; v1 submitted 6 December, 2019; originally announced December 2019.

  46. arXiv:1912.03207  [pdf, other]

    cs.CV cs.GR cs.LG

    NASA: Neural Articulated Shape Approximation

    Authors: Boyang Deng, JP Lewis, Timothy Jeruzalski, Gerard Pons-Moll, Geoffrey Hinton, Mohammad Norouzi, Andrea Tagliasacchi

    Abstract: Efficient representation of articulated objects such as human bodies is an important problem in computer vision and graphics. To efficiently simulate deformation, existing approaches represent 3D objects using polygonal meshes and deform them using skinning techniques. This paper introduces neural articulated shape approximation (NASA), an alternative framework that enables efficient representatio…

    Submitted 21 July, 2022; v1 submitted 6 December, 2019; originally announced December 2019.

    Comments: ECCV 2020; Project Page: https://nasa-eccv20.github.io/

  47. arXiv:1912.01603  [pdf, other]

    cs.LG cs.AI cs.RO

    Dream to Control: Learning Behaviors by Latent Imagination

    Authors: Danijar Hafner, Timothy Lillicrap, Jimmy Ba, Mohammad Norouzi

    Abstract: Learned world models summarize an agent's experience to facilitate learning complex behaviors. While learning world models from high-dimensional sensory inputs is becoming feasible through deep learning, there are many potential ways for deriving behaviors from them. We present Dreamer, a reinforcement learning agent that solves long-horizon tasks from images purely by latent imagination. We effic…

    Submitted 17 March, 2020; v1 submitted 3 December, 2019; originally announced December 2019.

    Comments: 9 pages, 12 figures

  48. arXiv:1911.02469  [pdf, other]

    cs.LG stat.ML

    Don't Blame the ELBO! A Linear VAE Perspective on Posterior Collapse

    Authors: James Lucas, George Tucker, Roger Grosse, Mohammad Norouzi

    Abstract: Posterior collapse in Variational Autoencoders (VAEs) arises when the variational posterior distribution closely matches the prior for a subset of latent variables. This paper presents a simple and intuitive explanation for posterior collapse through the analysis of linear VAEs and their direct correspondence with Probabilistic PCA (pPCA). We explain how posterior collapse may occur in pPCA due to…

    Submitted 6 November, 2019; originally announced November 2019.

    Comments: 11 main pages, 10 appendix pages. 13 figures total. Accepted at 33rd Conference on Neural Information Processing Systems (NeurIPS 2019)

  49. arXiv:1907.10247  [pdf, other]

    cs.LG cs.AI stat.ML

    Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards

    Authors: Yijie Guo, Jongwook Choi, Marcin Moczulski, Shengyu Feng, Samy Bengio, Mohammad Norouzi, Honglak Lee

    Abstract: Reinforcement learning with sparse rewards is challenging because an agent can rarely obtain non-zero rewards, and hence gradient-based optimization of parameterized policies can be incremental and slow. Recent work demonstrated that using a memory buffer of previous successful trajectories can result in more effective policies. However, existing methods may overly exploit past successful experien…

    Submitted 14 February, 2021; v1 submitted 24 July, 2019; originally announced July 2019.

  50. arXiv:1907.04543  [pdf, other]

    cs.LG cs.AI stat.ML

    An Optimistic Perspective on Offline Reinforcement Learning

    Authors: Rishabh Agarwal, Dale Schuurmans, Mohammad Norouzi

    Abstract: Off-policy reinforcement learning (RL) using a fixed offline dataset of logged interactions is an important consideration in real world applications. This paper studies offline RL using the DQN replay dataset comprising the entire replay experience of a DQN agent on 60 Atari 2600 games. We demonstrate that recent off-policy deep RL algorithms, even when trained solely on this fixed dataset, outper…

    Submitted 22 June, 2020; v1 submitted 10 July, 2019; originally announced July 2019.

    Comments: ICML 2020. An earlier version was titled "Striving for Simplicity in Off-Policy Deep Reinforcement Learning". Project Website: https://offline-rl.github.io

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, PMLR 119:104-114, 2020