Showing 1–21 of 21 results for author: Parmar, N

Searching in archive cs.
  1. arXiv:2210.15923  [pdf, other]

    cs.LG

    DELFI: Deep Mixture Models for Long-term Air Quality Forecasting in the Delhi National Capital Region

    Authors: Naishadh Parmar, Raunak Shah, Tushar Goswamy, Vatsalya Tandon, Ravi Sahu, Ronak Sutaria, Purushottam Kar, Sachchida Nand Tripathi

    Abstract: The identification and control of human factors in climate change is a rapidly growing concern and robust, real-time air-quality monitoring and forecasting plays a critical role in allowing effective policy formulation and implementation. This paper presents DELFI, a novel deep learning-based mixture model to make effective long-term predictions of Particulate Matter (PM) 2.5 concentrations. A key…

    Submitted 28 October, 2022; originally announced October 2022.

    Comments: 6 pages

  2. arXiv:2205.11423  [pdf, other]

    cs.CV

    Decoder Denoising Pretraining for Semantic Segmentation

    Authors: Emmanuel Brempong Asiedu, Simon Kornblith, Ting Chen, Niki Parmar, Matthias Minderer, Mohammad Norouzi

    Abstract: Semantic segmentation labels are expensive and time consuming to acquire. Hence, pretraining is commonly used to improve the label-efficiency of segmentation models. Typically, the encoder of a segmentation model is pretrained as a classifier and the decoder is randomly initialized. Here, we argue that random initialization of the decoder can be suboptimal, especially when few labeled examples are…

    Submitted 23 May, 2022; originally announced May 2022.

    ACM Class: I.4.6; I.5.4; I.2.10

  3. arXiv:2104.08710  [pdf, other]

    cs.CL

    Simple and Efficient ways to Improve REALM

    Authors: Vidhisha Balachandran, Ashish Vaswani, Yulia Tsvetkov, Niki Parmar

    Abstract: Dense retrieval has been shown to be effective for retrieving relevant documents for Open Domain QA, surpassing popular sparse retrieval methods like BM25. REALM (Guu et al., 2020) is an end-to-end dense retrieval system that relies on MLM based pretraining for improved downstream QA efficiency across multiple datasets. We study the finetuning of REALM on various QA tasks and explore the limits of…

    Submitted 18 April, 2021; originally announced April 2021.

  4. arXiv:2103.12731  [pdf, other]

    cs.CV

    Scaling Local Self-Attention for Parameter Efficient Visual Backbones

    Authors: Ashish Vaswani, Prajit Ramachandran, Aravind Srinivas, Niki Parmar, Blake Hechtman, Jonathon Shlens

    Abstract: Self-attention has the promise of improving computer vision systems due to parameter-independent scaling of receptive fields and content-dependent interactions, in contrast to parameter-dependent scaling and content-independent interactions of convolutions. Self-attention models have recently been shown to have encouraging improvements on accuracy-parameter trade-offs compared to baseline convolut…

    Submitted 7 June, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

    Comments: CVPR 2021 Oral

  5. arXiv:2101.11605  [pdf, other]

    cs.CV cs.AI cs.LG

    Bottleneck Transformers for Visual Recognition

    Authors: Aravind Srinivas, Tsung-Yi Lin, Niki Parmar, Jonathon Shlens, Pieter Abbeel, Ashish Vaswani

    Abstract: We present BoTNet, a conceptually simple yet powerful backbone architecture that incorporates self-attention for multiple computer vision tasks including image classification, object detection and instance segmentation. By just replacing the spatial convolutions with global self-attention in the final three bottleneck blocks of a ResNet and no other changes, our approach improves upon the baseline…

    Submitted 2 August, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

    Comments: Technical Report, 20 pages, 13 figures, 19 tables

  6. arXiv:2007.15628  [pdf, ps, other]

    cs.CY

    IIT Kanpur Consulting Group: Using Machine Learning and Management Consulting for Social Good

    Authors: Tushar Goswamy, Vatsalya Tandon, Naishadh Parmar, Raunak Shah, Ayush Gupta

    Abstract: The IIT Kanpur Consulting Group is one of the pioneering research groups in India which focuses on the applications of Machine Learning and Strategy Consulting for social good. The group has been working since 2018 to help social organizations, nonprofits, and government entities in India leverage better insights from their data, with a special emphasis on the healthcare, environmental, and agricu…

    Submitted 30 July, 2020; originally announced July 2020.

    Comments: 4 pages. Accepted to the ICML 2020 Workshop on Healthcare Systems, Population Health, and the Role of Health-Tech

  7. arXiv:2007.15619  [pdf, other]

    cs.CY cs.CL cs.LG

    AI-based Monitoring and Response System for Hospital Preparedness towards COVID-19 in Southeast Asia

    Authors: Tushar Goswamy, Naishadh Parmar, Ayush Gupta, Raunak Shah, Vatsalya Tandon, Varun Goyal, Sanyog Gupta, Karishma Laud, Shivam Gupta, Sudhanshu Mishra, Ashutosh Modi

    Abstract: This research paper proposes a COVID-19 monitoring and response system to identify the surge in the volume of patients at hospitals and shortage of critical equipment like ventilators in South-east Asian countries, to understand the burden on health facilities. This can help authorities in these regions with resource planning measures to redirect resources to the regions identified by the model. D…

    Submitted 5 September, 2022; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: 5 pages, 5 figures. Accepted to the ICML 2020 Workshop on Healthcare Systems, Population Health, and the Role of Health-Tech

  8. arXiv:2005.08100  [pdf, other]

    eess.AS cs.LG cs.SD

    Conformer: Convolution-augmented Transformer for Speech Recognition

    Authors: Anmol Gulati, James Qin, Chung-Cheng Chiu, Niki Parmar, Yu Zhang, Jiahui Yu, Wei Han, Shibo Wang, Zhengdong Zhang, Yonghui Wu, Ruoming Pang

    Abstract: Recently Transformer and Convolution neural network (CNN) based models have shown promising results in Automatic Speech Recognition (ASR), outperforming Recurrent neural networks (RNNs). Transformer models are good at capturing content-based global interactions, while CNNs exploit local features effectively. In this work, we achieve the best of both worlds by studying how to combine convolution ne…

    Submitted 16 May, 2020; originally announced May 2020.

    Comments: Submitted to Interspeech 2020

  9. arXiv:1909.03108  [pdf, other]

    eess.IV cs.CV cs.LG

    High Resolution Medical Image Analysis with Spatial Partitioning

    Authors: Le Hou, Youlong Cheng, Noam Shazeer, Niki Parmar, Yeqing Li, Panagiotis Korfiatis, Travis M. Drucker, Daniel J. Blezek, Xiaodan Song

    Abstract: Medical images such as 3D computerized tomography (CT) scans and pathology images, have hundreds of millions or billions of voxels/pixels. It is infeasible to train CNN models directly on such high resolution images, because neural activations of a single image do not fit in the memory of a single GPU/TPU, and naive data and model parallelism approaches do not work. Existing image analysis approac…

    Submitted 12 September, 2019; v1 submitted 6 September, 2019; originally announced September 2019.

  10. arXiv:1906.05909  [pdf, other]

    cs.CV

    Stand-Alone Self-Attention in Vision Models

    Authors: Prajit Ramachandran, Niki Parmar, Ashish Vaswani, Irwan Bello, Anselm Levskaya, Jonathon Shlens

    Abstract: Convolutions are a fundamental building block of modern computer vision systems. Recent approaches have argued for going beyond convolutions in order to capture long-range dependencies. These efforts focus on augmenting convolutional models with content-based interactions, such as self-attention and non-local means, to achieve gains on a number of vision tasks. The natural question that arises is…

    Submitted 13 June, 2019; originally announced June 2019.

  11. arXiv:1904.05780  [pdf, other]

    cs.CL stat.ML

    Corpora Generation for Grammatical Error Correction

    Authors: Jared Lichtarge, Chris Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar, Simon Tong

    Abstract: Grammatical Error Correction (GEC) has been recently modeled using the sequence-to-sequence framework. However, unlike sequence transduction problems such as machine translation, GEC suffers from the lack of plentiful parallel data. We describe two approaches for generating large parallel datasets for GEC using publicly available Wikipedia data. The first method extracts source-target pairs from W…

    Submitted 10 April, 2019; originally announced April 2019.

    Comments: Accepted at NAACL 2019. arXiv admin note: text overlap with arXiv:1811.01710

  12. arXiv:1811.02084  [pdf, other]

    cs.LG cs.DC stat.ML

    Mesh-TensorFlow: Deep Learning for Supercomputers

    Authors: Noam Shazeer, Youlong Cheng, Niki Parmar, Dustin Tran, Ashish Vaswani, Penporn Koanantakool, Peter Hawkins, HyoukJoong Lee, Mingsheng Hong, Cliff Young, Ryan Sepassi, Blake Hechtman

    Abstract: Batch-splitting (data-parallelism) is the dominant distributed Deep Neural Network (DNN) training strategy, due to its universal applicability and its amenability to Single-Program-Multiple-Data (SPMD) programming. However, batch-splitting suffers from problems including the inability to train very large models (due to memory constraints), high latency, and inefficiency at small batch sizes. All o…

    Submitted 5 November, 2018; originally announced November 2018.

  13. arXiv:1811.01710  [pdf, other]

    cs.CL cs.LG stat.ML

    Weakly Supervised Grammatical Error Correction using Iterative Decoding

    Authors: Jared Lichtarge, Christopher Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar

    Abstract: We describe an approach to Grammatical Error Correction (GEC) that is effective at making use of models trained on large amounts of weakly supervised bitext. We train the Transformer sequence-to-sequence model on 4B tokens of Wikipedia revisions and employ an iterative decoding strategy that is tailored to the loosely-supervised nature of the Wikipedia training corpus. Finetuning on the Lang-8 cor…

    Submitted 30 October, 2018; originally announced November 2018.

  14. arXiv:1805.11063  [pdf, other]

    cs.LG stat.ML

    Theory and Experiments on Vector Quantized Autoencoders

    Authors: Aurko Roy, Ashish Vaswani, Arvind Neelakantan, Niki Parmar

    Abstract: Deep neural networks with discrete latent variables offer the promise of better symbolic reasoning, and learning abstractions that are more useful to new tasks. There has been a surge in interest in discrete latent variable models, however, despite several recent improvements, the training of discrete latent variable models has remained challenging and their performance has mostly failed to match…

    Submitted 20 July, 2018; v1 submitted 28 May, 2018; originally announced May 2018.

  15. arXiv:1804.09849  [pdf, other]

    cs.CL cs.AI

    The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation

    Authors: Mia Xu Chen, Orhan Firat, Ankur Bapna, Melvin Johnson, Wolfgang Macherey, George Foster, Llion Jones, Niki Parmar, Mike Schuster, Zhifeng Chen, Yonghui Wu, Macduff Hughes

    Abstract: The past year has witnessed rapid advances in sequence-to-sequence (seq2seq) modeling for Machine Translation (MT). The classic RNN-based approaches to MT were first out-performed by the convolutional seq2seq model, which was then out-performed by the more recent Transformer model. Each of these new approaches consists of a fundamental architecture accompanied by a set of modeling and training tec…

    Submitted 26 April, 2018; v1 submitted 25 April, 2018; originally announced April 2018.

  16. arXiv:1803.07416  [pdf, other]

    cs.LG cs.CL stat.ML

    Tensor2Tensor for Neural Machine Translation

    Authors: Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit

    Abstract: Tensor2Tensor is a library for deep learning models that is well-suited for neural machine translation and includes the reference implementation of the state-of-the-art Transformer model.

    Submitted 16 March, 2018; originally announced March 2018.

    Comments: arXiv admin note: text overlap with arXiv:1706.03762

  17. arXiv:1803.03382  [pdf, other]

    cs.LG

    Fast Decoding in Sequence Models using Discrete Latent Variables

    Authors: Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer

    Abstract: Autoregressive sequence models based on deep neural networks, such as RNNs, Wavenet and the Transformer attain state-of-the-art results on many tasks. However, they are difficult to parallelize and are thus slow at processing long sequences. RNNs lack parallelism both during training and decoding, while architectures like WaveNet and Transformer are much more parallelizable during training, yet st…

    Submitted 7 June, 2018; v1 submitted 8 March, 2018; originally announced March 2018.

    Comments: ICML 2018

  18. arXiv:1802.05751  [pdf, other]

    cs.CV

    Image Transformer

    Authors: Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Łukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran

    Abstract: Image generation has been successfully cast as an autoregressive sequence generation or transformation problem. Recent work has shown that self-attention is an effective way of modeling textual sequences. In this work, we generalize a recently proposed model architecture based on self-attention, the Transformer, to a sequence modeling formulation of image generation with a tractable likelihood. By…

    Submitted 15 June, 2018; v1 submitted 15 February, 2018; originally announced February 2018.

    Comments: Appears in International Conference on Machine Learning, 2018. Code available at https://github.com/tensorflow/tensor2tensor

  19. arXiv:1706.05137  [pdf, other]

    cs.LG stat.ML

    One Model To Learn Them All

    Authors: Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, Jakob Uszkoreit

    Abstract: Deep learning yields great results across many fields, from speech recognition, image classification, to translation. But for each problem, getting a deep model to work well involves research into the architecture and a long period of tuning. We present a single model that yields good results on a number of problems spanning multiple domains. In particular, this single model is trained concurrentl…

    Submitted 15 June, 2017; originally announced June 2017.

  20. arXiv:1706.03762  [pdf, other]

    cs.CL cs.LG

    Attention Is All You Need

    Authors: Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

    Abstract: The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experi…

    Submitted 1 August, 2023; v1 submitted 12 June, 2017; originally announced June 2017.

    Comments: 15 pages, 5 figures

  21. arXiv:1403.0485  [pdf]

    cs.CV

    Face Recognition Methods & Applications

    Authors: Divyarajsinh N. Parmar, Brijesh B. Mehta

    Abstract: Face recognition presents a challenging problem in the field of image analysis and computer vision. The security of information is becoming very significant and difficult. Security cameras are presently common in airports, Offices, University, ATM, Bank and in any locations with a security system. Face recognition is a biometric system used to identify or verify a person from a digital image. Face…

    Submitted 3 March, 2014; originally announced March 2014.

    Comments: 3 pages, 1 figure

    Journal ref: International Journal of Computer Technology & Applications, Vol 4 (1), pp. 84-86, Jan-Feb 2013