Showing 1–32 of 32 results for author: Vyas, N

Searching in archive cs.
  1. arXiv:2407.12762  [pdf, ps, other]

    cs.CC

    Quasi-Linear Size PCPs with Small Soundness from HDX

    Authors: Mitali Bafna, Dor Minzer, Nikhil Vyas

    Abstract: We construct 2-query, quasi-linear sized probabilistically checkable proofs (PCPs) with arbitrarily small constant soundness, improving upon Dinur's 2-query quasi-linear size PCPs with soundness $1-Ω(1)$. As an immediate corollary, we get that under the exponential time hypothesis, for all $ε>0$ no approximation algorithm for $3$-SAT can obtain an approximation ratio of $7/8+ε$ in time…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Appendix D by Zhiwei Yun showing that variants of the Chapman-Lubotzky complexes can be constructed with $q=\text{polylog}(n)$

  2. arXiv:2407.07972  [pdf, other]

    cs.LG cs.AI

    Deconstructing What Makes a Good Optimizer for Language Models

    Authors: Rosie Zhao, Depen Morwani, David Brandfonbrener, Nikhil Vyas, Sham Kakade

    Abstract: Training language models becomes increasingly expensive with scale, prompting numerous attempts to improve optimization efficiency. Despite these efforts, the Adam optimizer remains the most widely used, due to a prevailing view that it is the most effective approach. We aim to compare several optimization algorithms, including SGD, Adafactor, Adam, and Lion, in the context of autoregressive langu…

    Submitted 10 July, 2024; originally announced July 2024.

  3. arXiv:2406.17748  [pdf, other]

    cs.LG math.OC stat.ML

    A New Perspective on Shampoo's Preconditioner

    Authors: Depen Morwani, Itai Shapira, Nikhil Vyas, Eran Malach, Sham Kakade, Lucas Janson

    Abstract: Shampoo, a second-order optimization algorithm which uses a Kronecker product preconditioner, has recently garnered increasing attention from the machine learning community. The preconditioner used by Shampoo can be viewed either as an approximation of the Gauss--Newton component of the Hessian or the covariance matrix of the gradients maintained by Adagrad. We provide an explicit and novel connec…

    Submitted 25 June, 2024; originally announced June 2024.

  4. arXiv:2402.13136  [pdf, other]

    quant-ph cs.NI

    Relaxing Trust Assumptions on Quantum Key Distribution Networks

    Authors: Nilesh Vyas, Paulo Mendes

    Abstract: Quantum security over long distances with untrusted relays is largely unfounded and is still an open question for active research. Nevertheless, quantum networks based on trusted relays are being built across the globe. However, standard QKD network architecture imposes a complete trust requirement on QKD relays, which is too demanding and limits the use cases for QKD networks. In this work, we e…

    Submitted 20 February, 2024; originally announced February 2024.

  5. arXiv:2402.03563  [pdf, other]

    cs.LG cs.AI cs.CL

    Distinguishing the Knowable from the Unknowable with Language Models

    Authors: Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman

    Abstract: We study the feasibility of identifying epistemic uncertainty (reflecting a lack of knowledge), as opposed to aleatoric uncertainty (reflecting entropy in the underlying distribution), in the outputs of large language models (LLMs) over free-form text. In the absence of ground-truth probabilities, we explore a setting where, in order to (approximately) disentangle a given LLM's uncertainty, a sign…

    Submitted 27 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  6. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  7. arXiv:2307.12941  [pdf, other]

    cs.LG cs.AI cs.CV

    On Privileged and Convergent Bases in Neural Network Representations

    Authors: Davis Brown, Nikhil Vyas, Yamini Bansal

    Abstract: In this study, we investigate whether the representations learned by neural networks possess a privileged and convergent basis. Specifically, we examine the significance of feature directions represented by individual neurons. First, we establish that arbitrary rotations of neural representations cannot be inverted (unlike linear networks), indicating that they do not exhibit complete rotational i…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: In the Workshop on High-dimensional Learning Dynamics at ICML 2023

  8. arXiv:2306.08590  [pdf, other]

    cs.LG stat.ML

    Beyond Implicit Bias: The Insignificance of SGD Noise in Online Learning

    Authors: Nikhil Vyas, Depen Morwani, Rosie Zhao, Gal Kaplun, Sham Kakade, Boaz Barak

    Abstract: The success of SGD in deep learning has been ascribed by prior works to the implicit bias induced by finite batch sizes ("SGD noise"). While prior works focused on offline learning (i.e., multiple-epoch training), we study the impact of SGD noise on online (i.e., single epoch) learning. Through an extensive empirical analysis of image and language data, we demonstrate that small batch sizes do not…

    Submitted 7 June, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

  9. arXiv:2305.18411  [pdf, other]

    cs.LG

    Feature-Learning Networks Are Consistent Across Widths At Realistic Scales

    Authors: Nikhil Vyas, Alexander Atanasov, Blake Bordelon, Depen Morwani, Sabarish Sainathan, Cengiz Pehlevan

    Abstract: We study the effect of width on the dynamics of feature-learning neural networks across a variety of architectures and datasets. Early in training, wide neural networks trained on online data have not only identical loss curves but also agree in their point-wise test predictions throughout training. For simple tasks such as CIFAR-5m this holds throughout training for networks of realistic widths.…

    Submitted 5 December, 2023; v1 submitted 28 May, 2023; originally announced May 2023.

    Comments: 24 pages, 19 figures. NeurIPS 2023. Revised based on reviewer feedback

  10. arXiv:2302.10870  [pdf, other]

    cs.LG stat.ML

    On Provable Copyright Protection for Generative Models

    Authors: Nikhil Vyas, Sham Kakade, Boaz Barak

    Abstract: There is a growing concern that learned conditional generative models may output samples that are substantially similar to some copyrighted data $C$ that was in their training set. We give a formal definition of near access-freeness (NAF) and prove bounds on the probability that a model satisfying this definition outputs a sample similar to $C$, even if $C$ is included in its training s…

    Submitted 21 July, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

    Comments: Accepted at ICML 2023

  11. arXiv:2207.00104  [pdf, other]

    cs.CC

    On the Number of Quantifiers as a Complexity Measure

    Authors: Ronald Fagin, Jonathan Lenchner, Nikhil Vyas, Ryan Williams

    Abstract: In 1981, Neil Immerman described a two-player game, which he called the "separability game" [Immerman81], that captures the number of quantifiers needed to describe a property in first-order logic. Immerman's paper laid the groundwork for studying the number of quantifiers needed to express properties in first-order logic, but the game seemed to be too complicated to study, and the arguments…

    Submitted 4 July, 2022; v1 submitted 30 June, 2022; originally announced July 2022.

    ACM Class: F.4.1

  12. arXiv:2206.10012  [pdf, other]

    cs.LG cs.AI

    Limitations of the NTK for Understanding Generalization in Deep Learning

    Authors: Nikhil Vyas, Yamini Bansal, Preetum Nakkiran

    Abstract: The "Neural Tangent Kernel" (NTK) (Jacot et al. 2018), and its empirical variants have been proposed as a proxy to capture certain behaviors of real neural networks. In this work, we study NTKs through the lens of scaling laws, and demonstrate that they fall short of explaining important aspects of neural network generalization. In particular, we demonstrate realistic settings where finite-width…

    Submitted 20 June, 2022; originally announced June 2022.

  13. arXiv:2112.07844  [pdf, other]

    cs.LG

    Fix your Models by Fixing your Datasets

    Authors: Atindriyo Sanyal, Vikram Chatterji, Nidhi Vyas, Ben Epstein, Nikita Demir, Anthony Corletti

    Abstract: The quality of underlying training data is crucial for building performant machine learning models with wider generalizability. However, current machine learning (ML) tools lack streamlined processes for improving the data quality. So, getting data quality insights and iteratively pruning the errors to obtain a dataset which is most representative of downstream use cases is still an ad-hoc man…

    Submitted 14 December, 2021; originally announced December 2021.

  14. arXiv:2111.11372  [pdf, other]

    cs.CL

    Namesakes: Ambiguously Named Entities from Wikipedia and News

    Authors: Oleg Vasilyev, Aysu Altun, Nidhi Vyas, Vedant Dharnidharka, Erika Lam, John Bohannon

    Abstract: We present Namesakes, a dataset of ambiguously named entities obtained from English-language Wikipedia and news articles. It consists of 58862 mentions of 4148 unique entities and their namesakes: 1000 mentions from news, 28843 from Wikipedia articles about the entity, and 29019 Wikipedia backlink mentions. Namesakes should be helpful in establishing challenging benchmarks for the task of named en…

    Submitted 22 November, 2021; originally announced November 2021.

    Comments: 11 pages, 6 figures

  15. arXiv:2106.13210  [pdf, ps, other]

    cs.DS cs.CC

    Optimal Fine-grained Hardness of Approximation of Linear Equations

    Authors: Mitali Bafna, Nikhil Vyas

    Abstract: The problem of solving linear systems is one of the most fundamental problems in computer science, where given a satisfiable linear system $(A,b)$, for $A \in \mathbb{R}^{n \times n}$ and $b \in \mathbb{R}^n$, we wish to find a vector $x \in \mathbb{R}^n$ such that $Ax = b$. The current best algorithms for solving dense linear systems reduce the problem to matrix multiplication, and run in time…

    Submitted 24 June, 2021; originally announced June 2021.

    Comments: To appear in ICALP 2021

  16. arXiv:2105.13464  [pdf, other]

    cs.LG cs.AI cs.CV

    Training With Data Dependent Dynamic Learning Rates

    Authors: Shreyas Saxena, Nidhi Vyas, Dennis DeCoste

    Abstract: Recently many first and second order variants of SGD have been proposed to facilitate training of Deep Neural Networks (DNNs). A common limitation of these works stems from the fact that they use the same learning rate across all instances present in the dataset. This setting is widely adopted under the assumption that loss functions for each instance are similar in nature, and hence, a common lear…

    Submitted 27 May, 2021; originally announced May 2021.

  17. arXiv:2104.14709  [pdf, other]

    cs.LO

    Multi-Structural Games and Number of Quantifiers

    Authors: Ronald Fagin, Jonathan Lenchner, Kenneth W. Regan, Nikhil Vyas

    Abstract: We study multi-structural games, played on two sets $\mathcal{A}$ and $\mathcal{B}$ of structures. These games generalize Ehrenfeucht-Fraïssé games. Whereas Ehrenfeucht-Fraïssé games capture the quantifier rank of a first-order sentence, multi-structural games capture the number of quantifiers, in the sense that Spoiler wins the $r$-round game if and only if there is a first-order sentence $φ$ wit…

    Submitted 3 March, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: Appeared in LICS 2021

  18. arXiv:2011.03910  [pdf, other]

    cs.CV

    Faster object tracking pipeline for real time tracking

    Authors: Parthesh Soni, Falak Shah, Nisarg Vyas

    Abstract: Multi-object tracking (MOT) is a challenging practical problem for vision based applications. Most recent approaches for MOT use precomputed detections from models such as Faster RCNN, performing fine-tuning of bounding boxes and association in subsequent phases. However, this is not suitable for actual industrial applications due to unavailability of detections upfront. In their recent work, Wang…

    Submitted 8 November, 2020; originally announced November 2020.

    Comments: 12 pages, 6 figures

  19. arXiv:2011.03819  [pdf, ps, other]

    cs.DS

    Fast Low-Space Algorithms for Subset Sum

    Authors: Ce Jin, Nikhil Vyas, Ryan Williams

    Abstract: We consider the canonical Subset Sum problem: given a list of positive integers $a_1,\ldots,a_n$ and a target integer $t$ with $t > a_i$ for all $i$, determine if there is an $S \subseteq [n]$ such that $\sum_{i \in S} a_i = t$. The well-known pseudopolynomial-time dynamic programming algorithm [Bellman, 1957] solves Subset Sum in $O(nt)$ time, while requiring $Ω(t)$ space. In this paper we pres…

    Submitted 7 November, 2020; originally announced November 2020.

    Comments: To appear in SODA 2021

  20. arXiv:2009.09496  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning Soft Labels via Meta Learning

    Authors: Nidhi Vyas, Shreyas Saxena, Thomas Voice

    Abstract: One-hot labels do not represent soft decision boundaries among concepts, and hence, models trained on them are prone to overfitting. Using soft labels as targets provides regularization, but different soft labels might be optimal at different stages of optimization. Also, training with fixed labels in the presence of noisy annotations leads to worse generalization. To address these limitations, we…

    Submitted 20 September, 2020; originally announced September 2020.

  21. arXiv:2004.06378  [pdf]

    cs.NI

    Various Secure Routing Schemes for MANETs: A Survey

    Authors: Priya R. Soni, Charmi A. Joshi, Dhwani R. Bhadra, Nikita P. Vyas, Rutvij H. Jhaveri

    Abstract: A MANET is an infrastructure-less, self-configuring network consisting of mobile nodes communicating with each other over a radio medium. Its distinctive properties, such as dynamic topology, decentralization, and a wireless medium, set a MANET apart from traditional networks and make security a major challenge. In this paper, we have carried out…

    Submitted 14 April, 2020; originally announced April 2020.

  22. arXiv:2001.07788  [pdf, ps, other]

    cs.CC

    Lower Bounds Against Sparse Symmetric Functions of ACC Circuits: Expanding the Reach of $\#$SAT Algorithms

    Authors: Nikhil Vyas, Ryan Williams

    Abstract: We continue the program of proving circuit lower bounds via circuit satisfiability algorithms. So far, this program has yielded several concrete results, proving that functions in $\text{Quasi-NP} = \text{NTIME}[n^{(\log n)^{O(1)}}]$ and $\text{NEXP}$ do not have small circuits from various circuit classes ${\cal C}$, by showing that ${\cal C}$ admits non-trivial satisfiability and/or $\#$SAT algo…

    Submitted 21 January, 2020; originally announced January 2020.

    Comments: To appear in STACS 2020

  23. arXiv:1907.08185  [pdf, ps, other]

    cs.CC

    Imperfect Gaps in Gap-ETH and PCPs

    Authors: Mitali Bafna, Nikhil Vyas

    Abstract: We study the role of perfect completeness in probabilistically checkable proof systems (PCPs) and give a new way to transform a PCP with imperfect completeness to a PCP with perfect completeness when the initial gap is a constant. In particular, we show that $\text{PCP}_{c,s}[r,q] \subseteq \text{PCP}_{1,1-Ω(1)}[r+O(1),q+O(r)]$, for $c-s=Ω(1)$. This implies that one can convert imperfect completen…

    Submitted 18 July, 2019; originally announced July 2019.

    Comments: To appear in CCC 2019

  24. arXiv:1906.07337  [pdf, ps, other]

    cs.CL

    Measuring Bias in Contextualized Word Representations

    Authors: Keita Kurita, Nidhi Vyas, Ayush Pareek, Alan W Black, Yulia Tsvetkov

    Abstract: Contextual word embeddings such as BERT have achieved state of the art performance in numerous NLP tasks. Since they are optimized to capture the statistical properties of training data, they tend to pick up on and amplify social stereotypes present in the data as well. In this study, we (1) propose a template-based method to quantify bias in BERT; (2) show that this method obtains more consistent…

    Submitted 17 June, 2019; originally announced June 2019.

    Comments: 1st ACL Workshop on Gender Bias for Natural Language Processing 2019

  25. arXiv:1904.11606  [pdf, other]

    cs.DS

    Approximation Algorithms for Min-Distance Problems

    Authors: Mina Dalirrooyfard, Virginia Vassilevska Williams, Nikhil Vyas, Nicole Wein, Yinzhan Xu, Yuancheng Yu

    Abstract: We study fundamental graph parameters such as the Diameter and Radius in directed graphs, when distances are measured using a somewhat unorthodox but natural measure: the distance between $u$ and $v$ is the minimum of the shortest path distances from $u$ to $v$ and from $v$ to $u$. The center node in a graph under this measure can for instance represent the optimal location for a hospital to ensur…

    Submitted 17 June, 2019; v1 submitted 25 April, 2019; originally announced April 2019.

    Comments: To appear in ICALP 2019

  26. arXiv:1904.11601  [pdf, ps, other]

    cs.DS

    Tight Approximation Algorithms for Bichromatic Graph Diameter and Related Problems

    Authors: Mina Dalirrooyfard, Virginia Vassilevska Williams, Nikhil Vyas, Nicole Wein

    Abstract: Some of the most fundamental and well-studied graph parameters are the Diameter (the largest shortest paths distance) and Radius (the smallest distance for which a "center" node can reach all other nodes). The natural and important $ST$-variant considers two subsets $S$ and $T$ of the vertex set and lets the $ST$-diameter be the maximum distance between a node in $S$ and a node in $T$, and the…

    Submitted 25 April, 2019; originally announced April 2019.

    Comments: To appear in ICALP 2019

  27. arXiv:1902.08899  [pdf, other]

    cs.CL

    The ARIEL-CMU Systems for LoReHLT18

    Authors: Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard Hovy, Alan W Black, Jaime Carbonell, Graham V. Horwood, et al. (5 additional authors not shown)

    Abstract: This paper describes the ARIEL-CMU submissions to the Low Resource Human Language Technologies (LoReHLT) 2018 evaluations for the tasks Machine Translation (MT), Entity Discovery and Linking (EDL), and detection of Situation Frames in Text and Speech (SF Text and Speech).

    Submitted 24 February, 2019; originally announced February 2019.

  28. arXiv:1812.05013  [pdf, other]

    cs.LG cs.CR cs.DS stat.ML

    Thwarting Adversarial Examples: An $L_0$-Robust Sparse Fourier Transform

    Authors: Mitali Bafna, Jack Murtagh, Nikhil Vyas

    Abstract: We give a new algorithm for approximating the Discrete Fourier transform of an approximately sparse signal that has been corrupted by worst-case $L_0$ noise, namely a bounded number of coordinates of the signal have been corrupted arbitrarily. Our techniques generalize to a wide range of linear transformations that are used in data analysis such as the Discrete Cosine and Sine transforms, the Hada…

    Submitted 12 December, 2018; originally announced December 2018.

    Comments: Accepted at 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada

  29. arXiv:1810.06081  [pdf, ps, other]

    cs.DS cs.CC

    Super Strong ETH is False for Random $k$-SAT

    Authors: Nikhil Vyas

    Abstract: It has been hypothesized that $k$-SAT is hard to solve for randomly chosen instances near the "critical threshold", where the clause-to-variable ratio is $2^k \ln 2-θ(1)$. Feige's hypothesis for $k$-SAT says that for all sufficiently large clause-to-variable ratios, random $k$-SAT cannot be refuted in polynomial time. It has also been hypothesized that the worst-case $k$-SAT problem cannot be solv…

    Submitted 14 October, 2018; originally announced October 2018.

    Comments: 15 pages

  30. arXiv:1804.09341  [pdf, ps, other]

    cs.LO cs.CC cs.DM

    Distribution-based objectives for Markov Decision Processes

    Authors: S. Akshay, Blaise Genest, Nikhil Vyas

    Abstract: We consider distribution-based objectives for Markov Decision Processes (MDP). This class of objectives gives rise to an interesting trade-off between full and partial information. As in full observation, the strategy in the MDP can depend on the state of the system, but similar to partial information, the strategy needs to account for all the states at the same time. In this paper, we focus on tw…

    Submitted 25 April, 2018; originally announced April 2018.

    Comments: An extended abstract of this paper has been accepted in the conference LICS'2018

  31. arXiv:1701.08573  [pdf, ps, other]

    quant-ph cond-mat.other cs.GT math-ph

    Is the essence of a quantum game captured completely in the original classical game?

    Authors: Muhammed Jabir T, Nilesh Vyas, Colin Benjamin

    Abstract: S. J. van Enk and R. Pike in PRA 66, 024306 (2002) argue that the equilibrium solution to a quantum game isn't unique but is already present in the classical game itself. In this work, we contest this assertion by showing that a random strategy in a particular quantum (Hawk-Dove) game is unique to the quantum game. In other words, one cannot obtain the equilibrium solution of the quantum Hawk-Dove…

    Submitted 13 August, 2021; v1 submitted 30 January, 2017; originally announced January 2017.

    Comments: 11 Pages, accepted for publication in Physica A: Statistical Mechanics and its Applications (2021)

    Journal ref: Physica A: Statistical Mechanics and its Applications 584, 126360 (2021)

  32. arXiv:1612.02788  [pdf, ps, other]

    cs.DS cs.CC cs.CR

    Faster Space-Efficient Algorithms for Subset Sum, k-Sum and Related Problems

    Authors: Nikhil Bansal, Shashwat Garg, Jesper Nederlof, Nikhil Vyas

    Abstract: We present space efficient Monte Carlo algorithms that solve Subset Sum and Knapsack instances with $n$ items using $O^*(2^{0.86n})$ time and polynomial space, where the $O^*(\cdot)$ notation suppresses factors polynomial in the input size. Both algorithms assume random read-only access to random bits. Modulo this mild assumption, this resolves a long-standing open problem in exact algorithms for…

    Submitted 24 June, 2017; v1 submitted 8 December, 2016; originally announced December 2016.

    Comments: 23 pages, 3 figures