Showing 1–19 of 19 results for author: Razavi, A

Searching in archive cs.

  1. arXiv:2305.02402  [pdf, other]

    hep-lat cond-mat.stat-mech cs.LG

    Normalizing flows for lattice gauge theory in arbitrary space-time dimension

    Authors: Ryan Abbott, Michael S. Albergo, Aleksandar Botev, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Gurtej Kanwar, Alexander G. D. G. Matthews, Sébastien Racanière, Ali Razavi, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban

    Abstract: Applications of normalizing flows to the sampling of field configurations in lattice gauge theory have so far been explored almost exclusively in two space-time dimensions. We report new algorithmic developments of gauge-equivariant flow architectures facilitating the generalization to higher-dimensional lattice geometries. Specifically, we discuss masked autoregressive transformations with tracta…

    Submitted 3 May, 2023; originally announced May 2023.

  2. arXiv:2211.07541  [pdf, other]

    hep-lat cond-mat.stat-mech cs.LG

    Aspects of scaling and scalability for flow-based sampling of lattice QCD

    Authors: Ryan Abbott, Michael S. Albergo, Aleksandar Botev, Denis Boyda, Kyle Cranmer, Daniel C. Hackett, Alexander G. D. G. Matthews, Sébastien Racanière, Ali Razavi, Danilo J. Rezende, Fernando Romero-López, Phiala E. Shanahan, Julian M. Urban

    Abstract: Recent applications of machine-learned normalizing flows to sampling in lattice field theory suggest that such methods may be able to mitigate critical slowing down and topological freezing. However, these demonstrations have been at the scale of toy models, and it remains to be determined whether they can be applied to state-of-the-art lattice quantum chromodynamics calculations. Assessing the vi…

    Submitted 14 November, 2022; originally announced November 2022.

    Comments: 22 pages, 8 figures

    Report number: MIT-CTP/5496

  3. arXiv:2205.06175  [pdf, other]

    cs.AI cs.CL cs.LG cs.RO

    A Generalist Agent

    Authors: Scott Reed, Konrad Zolna, Emilio Parisotto, Sergio Gomez Colmenarejo, Alexander Novikov, Gabriel Barth-Maron, Mai Gimenez, Yury Sulsky, Jackie Kay, Jost Tobias Springenberg, Tom Eccles, Jake Bruce, Ali Razavi, Ashley Edwards, Nicolas Heess, Yutian Chen, Raia Hadsell, Oriol Vinyals, Mahyar Bordbar, Nando de Freitas

    Abstract: Inspired by progress in large-scale language modeling, we apply a similar approach towards building a single generalist agent beyond the realm of text outputs. The agent, which we refer to as Gato, works as a multi-modal, multi-task, multi-embodiment generalist policy. The same network with the same weights can play Atari, caption images, chat, stack blocks with a real robot arm and much more, dec…

    Submitted 11 November, 2022; v1 submitted 12 May, 2022; originally announced May 2022.

    Comments: Published at TMLR, 42 pages

    Journal ref: Transactions on Machine Learning Research, 11/2022, https://openreview.net/forum?id=1ikK0kHjvj

  4. arXiv:2203.01187  [pdf, other]

    cs.CV

    Visual Feature Encoding for GNNs on Road Networks

    Authors: Oliver Stromann, Alireza Razavi, Michael Felsberg

    Abstract: In this work, we present a novel approach to learning an encoding of visual features into graph neural networks with the application on road network data. We propose an architecture that combines state-of-the-art vision backbone networks with graph neural networks. More specifically, we perform a road type classification task on an Open Street Map road network through encoding of satellite imagery…

    Submitted 2 March, 2022; originally announced March 2022.

  5. arXiv:2112.10624  [pdf, other]

    cs.CV

    Learning to integrate vision data into road network data

    Authors: Oliver Stromann, Alireza Razavi, Michael Felsberg

    Abstract: Road networks are the core infrastructure for connected and autonomous vehicles, but creating meaningful representations for machine learning applications is a challenging task. In this work, we propose to integrate remote sensing vision data into road network data for improved embeddings with graph neural networks. We present a segmentation of road edges based on spatio-temporal road and traffic…

    Submitted 2 March, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

  6. arXiv:2106.04615  [pdf, other]

    cs.LG cs.AI stat.ML

    Vector Quantized Models for Planning

    Authors: Sherjil Ozair, Yazhe Li, Ali Razavi, Ioannis Antonoglou, Aäron van den Oord, Oriol Vinyals

    Abstract: Recent developments in the field of model-based RL have proven successful in a range of environments, especially ones where planning is essential. However, such successes have been limited to deterministic fully-observed environments. We present a new approach that handles stochastic and partially-observable environments. Our key insight is to use discrete autoencoders to capture the multiple poss…

    Submitted 10 June, 2021; v1 submitted 8 June, 2021; originally announced June 2021.

    Comments: ICML 2021

  7. arXiv:2103.01950  [pdf, other]

    cs.CV cs.LG

    Predicting Video with VQVAE

    Authors: Jacob Walker, Ali Razavi, Aäron van den Oord

    Abstract: In recent years, the task of video prediction -- forecasting future video given past video frames -- has attracted attention in the research community. In this paper we propose a novel approach to this problem with Vector Quantized Variational AutoEncoders (VQ-VAE). With VQ-VAE we compress high-resolution videos into a hierarchical set of multi-scale discrete latent variables. Compared to pixels, this c…

    Submitted 2 March, 2021; originally announced March 2021.

    Comments: 13 Pages

    ACM Class: I.2.6; I.2.10

  8. arXiv:2007.03356  [pdf, other]

    cs.LG cs.CL stat.ML

    Do Transformers Need Deep Long-Range Memory

    Authors: Jack W. Rae, Ali Razavi

    Abstract: Deep attention models have advanced the modelling of sequential data across many domains. For language modelling in particular, the Transformer-XL -- a Transformer augmented with a long-range memory of past activations -- has been shown to be state-of-the-art across a variety of well-studied benchmarks. The Transformer-XL incorporates a long-range memory at every layer of the network, which render…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: published at 58th Annual Meeting of the Association for Computational Linguistics. 6 pages, 4 figures, 1 table

  9. arXiv:2001.10568  [pdf, other]

    cs.LG eess.SP stat.ML

    Landmark2Vec: An Unsupervised Neural Network-Based Landmark Positioning Method

    Authors: Alireza Razavi

    Abstract: A Neural Network-based method for unsupervised landmarks map estimation from measurements taken from landmarks is introduced. The measurements needed for training the network are the signals observed/received from landmarks by an agent. The definition of landmarks, agent, and the measurements taken by agent from landmarks is rather broad here: landmarks can be visual objects, e.g., poles along a r…

    Submitted 28 January, 2020; originally announced January 2020.

  10. arXiv:1906.00446  [pdf, other]

    cs.LG cs.CV stat.ML

    Generating Diverse High-Fidelity Images with VQ-VAE-2

    Authors: Ali Razavi, Aaron van den Oord, Oriol Vinyals

    Abstract: We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large scale image generation. To this end, we scale and enhance the autoregressive priors used in VQ-VAE to generate synthetic samples of much higher coherence and fidelity than possible before. We use simple feed-forward encoder and decoder networks, making our model an attractive candidate for applications where t…

    Submitted 2 June, 2019; originally announced June 2019.

  11. arXiv:1905.09272  [pdf, other]

    cs.CV cs.LG

    Data-Efficient Image Recognition with Contrastive Predictive Coding

    Authors: Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord

    Abstract: Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with artificial ones remains an open challenge. We hypothesize that data-efficient recognition is enabled by representations which make the variability in natural signals more predictable. We therefore revisit and improve Contrastive Predictive Coding, an unsupervised objective for learning suc…

    Submitted 1 July, 2020; v1 submitted 22 May, 2019; originally announced May 2019.

  12. arXiv:1901.03416  [pdf, other]

    cs.LG stat.ML

    Preventing Posterior Collapse with delta-VAEs

    Authors: Ali Razavi, Aäron van den Oord, Ben Poole, Oriol Vinyals

    Abstract: Due to the phenomenon of "posterior collapse," current latent variable generative models pose a challenging design choice that either weakens the capacity of the decoder or requires augmenting the objective so it does not only maximize the likelihood of the data. In this paper, we propose an alternative that utilizes the most powerful generative models as decoders, whilst optimising the variationa…

    Submitted 10 January, 2019; originally announced January 2019.

  13. arXiv:1805.09786  [pdf, other]

    cs.NE

    Hyperbolic Attention Networks

    Authors: Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, Nando de Freitas

    Abstract: We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure. A few recent approaches have successfully demonstrated the benefits of imposing hyperbolic geometry on the parameters of shallow networks. We extend this line of work by imposing hyperbolic geometry on the activations of neural networks…

    Submitted 24 May, 2018; originally announced May 2018.

  14. arXiv:1711.09846  [pdf, other]

    cs.LG cs.NE

    Population Based Training of Neural Networks

    Authors: Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu

    Abstract: Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm. In this work we present Population Based Training (PBT), a simple asynchronous optimisation algorithm which effectively utilises a fixed computational budget…

    Submitted 28 November, 2017; v1 submitted 27 November, 2017; originally announced November 2017.

  15. K-Means Fingerprint Clustering for Low-Complexity Floor Estimation in Indoor Mobile Localization

    Authors: Alireza Razavi, Mikko Valkama, Elena-Simona Lohan

    Abstract: Indoor localization in multi-floor buildings is an important research problem. Finding the correct floor, in a fast and efficient manner, in a shopping mall or an unknown university building can save the users' search time and can enable a myriad of Location Based Services in the future. One of the most widely spread techniques for floor estimation in multi-floor buildings is the fingerprinting-ba…

    Submitted 24 September, 2015; v1 submitted 4 September, 2015; originally announced September 2015.

    Comments: Accepted to IEEE Globecom 2015, Workshop on Localization and Tracking: Indoors, Outdoors and Emerging Networks

  16. Compressive Detection of Random Subspace Signals

    Authors: Alireza Razavi, Mikko Valkama, Danijela Cabric

    Abstract: The problem of compressive detection of random subspace signals is studied. We consider signals modeled as $\mathbf{s} = \mathbf{H} \mathbf{x}$ where $\mathbf{H}$ is an $N \times K$ matrix with $K \le N$ and $\mathbf{x} \sim \mathcal{N}(\mathbf{0}_{K,1},\sigma_x^2 \mathbf{I}_K)$. We say that signal $\mathbf{s}$ lies in or leans toward a subspace if the largest eigenvalue of $\mathbf{H} \mathbf{H}^T$ is…

    Submitted 30 December, 2015; v1 submitted 10 July, 2015; originally announced July 2015.

    Comments: 33 pages, 11 figures, Revised version

  17. Compressive Identification of Active OFDM Subcarriers in Presence of Timing Offset

    Authors: Alireza Razavi, Mikko Valkama, Danijela Cabric

    Abstract: In this paper we study the problem of identifying active subcarriers in an OFDM signal from compressive measurements sampled at sub-Nyquist rate. The problem is of importance in Cognitive Radio systems when secondary users (SUs) are looking for available spectrum opportunities to communicate over them while sensing at Nyquist rate sampling can be costly or even impractical in case of very wide ban…

    Submitted 9 July, 2015; originally announced July 2015.

    Comments: To appear in the proceedings of the IEEE Global Communications Conference (GLOBECOM) 2015

  18. Covariance-Based OFDM Spectrum Sensing with Sub-Nyquist Samples

    Authors: Alireza Razavi, Mikko Valkama, Danijela Cabric

    Abstract: In this paper, we propose a feature-based method for spectrum sensing of OFDM signals from sub-Nyquist samples over a single band. We exploit the structure of the covariance matrix of OFDM signals to convert an underdetermined set of covariance-based equations to an overdetermined one. The statistical properties of sample covariance matrix are analyzed and then based on that an approximate General…

    Submitted 10 January, 2015; originally announced January 2015.

    Comments: 30 pages, 5 figures

    Journal ref: Signal Processing, Volume 109, April 2015, Pages 261-268

  19. arXiv:1203.6177  [pdf, ps, other]

    cs.DM

    On Distance Function among Finite Set of Points

    Authors: Hajar Ghahremani Gol, Asadollah Razavi, Farzad Didehva

    Abstract: In practical purposes for some geometrical problems in computer science we have as information the coordinates of some finite points in surface instead of the whole body of a surface. The problem arised here is: "How to define a distance function in a finite space?" as we will show the appropriate function for this purpose is not a metric function. Here we try to define this distance function in o…

    Submitted 28 March, 2012; originally announced March 2012.

    MSC Class: 97PXX