Showing 1–8 of 8 results for author: Holtmann-Rice, D

  1. arXiv:2403.05530

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  2. arXiv:2312.11805

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  3. arXiv:1810.07076

    cs.LG stat.ML

    Stochastic Negative Mining for Learning with Large Output Spaces

    Authors: Sashank J. Reddi, Satyen Kale, Felix Yu, Dan Holtmann-Rice, Jiecao Chen, Sanjiv Kumar

    Abstract: We consider the problem of retrieving the most relevant labels for a given input when the size of the output space is very large. Retrieval methods are modeled as set-valued classifiers which output a small set of classes for each input, and a mistake is made if the label is not in the output set. Despite its practical importance, a statistically principled, yet practical solution to this problem…

    Submitted 16 October, 2018; originally announced October 2018.

  4. arXiv:1806.10175

    stat.ML cs.IT cs.LG

    Learning a Compressed Sensing Measurement Matrix via Gradient Unrolling

    Authors: Shanshan Wu, Alexandros G. Dimakis, Sujay Sanghavi, Felix X. Yu, Daniel Holtmann-Rice, Dmitry Storcheus, Afshin Rostamizadeh, Sanjiv Kumar

    Abstract: Linear encoding of sparse vectors is widely popular, but is commonly data-independent -- missing any possible extra (but a priori unknown) structure beyond sparsity. In this paper we present a new method to learn linear encoders that adapt to data, while still performing well with the widely used $\ell_1$ decoder. The convex $\ell_1$ decoder prevents gradient propagation as needed in standard grad…

    Submitted 2 July, 2019; v1 submitted 26 June, 2018; originally announced June 2018.

    Comments: 17 pages, 7 tables, 8 figures, published in ICML 2019; part of this work was done while Shanshan was an intern at Google Research, New York

  5. arXiv:1711.05448

    stat.ML cs.CL cs.LG

    Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition

    Authors: Shankar Kumar, Michael Nirschl, Daniel Holtmann-Rice, Hank Liao, Ananda Theertha Suresh, Felix Yu

    Abstract: Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks. However, these models are computationally more expensive than N-gram LMs for decoding, and thus, challenging to integrate into speech recognizers. Recent research has proposed the use of lattice-rescoring…

    Submitted 15 November, 2017; originally announced November 2017.

    Comments: Accepted at ASRU 2017

    Journal ref: Proceedings of ASRU 2017

  6. arXiv:1705.05902

    cs.CV

    Tensors, Differential Geometry and Statistical Shading Analysis

    Authors: Daniel Niels Holtmann-Rice, Benjamin S. Kunsberg, Steven W. Zucker

    Abstract: We develop a linear algebraic framework for the shape-from-shading problem, because tensors arise when scalar (e.g. image) and vector (e.g. surface normal) fields are differentiated multiple times. Using this framework, we first investigate when image derivatives exhibit invariance to changing illumination by calculating the statistics of image derivatives under general distributions on the light…

    Submitted 27 July, 2018; v1 submitted 16 May, 2017; originally announced May 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1705.05885

  7. arXiv:1705.05885

    cs.CV

    What's In A Patch, I: Tensors, Differential Geometry and Statistical Shading Analysis

    Authors: Daniel Niels Holtmann-Rice, Benjamin S. Kunsberg, Steven W. Zucker

    Abstract: We develop a linear algebraic framework for the shape-from-shading problem, because tensors arise when scalar (e.g. image) and vector (e.g. surface normal) fields are differentiated multiple times. The work is in two parts. In this first part we investigate when image derivatives exhibit invariance to changing illumination by calculating the statistics of image derivatives under general distributi…

    Submitted 16 May, 2017; originally announced May 2017.

  8. arXiv:1610.09072

    cs.LG stat.ML

    Orthogonal Random Features

    Authors: Felix X. Yu, Ananda Theertha Suresh, Krzysztof Choromanski, Daniel Holtmann-Rice, Sanjiv Kumar

    Abstract: We present an intriguing discovery related to Random Fourier Features: in Gaussian kernel approximation, replacing the random Gaussian matrix by a properly scaled random orthogonal matrix significantly decreases kernel approximation error. We call this technique Orthogonal Random Features (ORF), and provide theoretical and empirical justification for this behavior. Motivated by this discovery, we…

    Submitted 27 October, 2016; originally announced October 2016.

    Comments: NIPS 2016