
Showing 1–23 of 23 results for author: Levine, N

Searching in archive cs.
  1. arXiv:2407.12687  [pdf, other]

    cs.CY cs.AI cs.LG

    Towards Responsible Development of Generative AI for Education: An Evaluation-Driven Approach

    Authors: Irina Jurenka, Markus Kunesch, Kevin R. McKee, Daniel Gillick, Shaojian Zhu, Sara Wiltberger, Shubham Milind Phal, Katherine Hermann, Daniel Kasenberg, Avishkar Bhoopchand, Ankit Anand, Miruna Pîslar, Stephanie Chan, Lisa Wang, Jennifer She, Parsa Mahmoudieh, Aliya Rysbek, Wei-Jen Ko, Andrea Huber, Brett Wiltshire, Gal Elidan, Roni Rabin, Jasmin Rubinovitz, Amit Pitaru, Mac McAllister, et al. (49 additional authors not shown)

    Abstract: A major challenge facing the world is the provision of equitable and universal access to quality education. Recent advances in generative AI (gen AI) have created excitement about the potential of new technologies to offer a personal tutor for every learner and a teaching assistant for every teacher. The full extent of this dream, however, has not yet materialised. We argue that this is primarily…

    Submitted 19 July, 2024; v1 submitted 21 May, 2024; originally announced July 2024.

  2. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1110 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  3. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  4. arXiv:2106.12772  [pdf, other]

    cs.LG stat.ML

    Task-agnostic Continual Learning with Hybrid Probabilistic Models

    Authors: Polina Kirichenko, Mehrdad Farajtabar, Dushyant Rao, Balaji Lakshminarayanan, Nir Levine, Ang Li, Huiyi Hu, Andrew Gordon Wilson, Razvan Pascanu

    Abstract: Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning. In this work we propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification. We model the distribution of each task and each class with a normalizing flow. The flow is used to lea…

    Submitted 24 June, 2021; originally announced June 2021.

  5. arXiv:2012.05339  [pdf, other]

    cs.LG cs.CV

    Neural Rate Control for Video Encoding using Imitation Learning

    Authors: Hongzi Mao, Chenjie Gu, Miaosen Wang, Angie Chen, Nevena Lazic, Nir Levine, Derek Pang, Rene Claus, Marisabel Hechtman, Ching-Han Chiang, Cheng Chen, Jingning Han

    Abstract: In modern video encoders, rate control is a critical component and has been heavily engineered. It decides how many bits to spend to encode each frame, in order to optimize the rate-distortion trade-off over all video frames. This is a challenging constrained planning problem because of the complex dependency among decisions for different video frames and the bitrate constraint defined at the end…

    Submitted 9 December, 2020; originally announced December 2020.

  6. arXiv:2012.03706  [pdf, other]

    cs.CR

    Pricing Security in Proof-of-Work Systems

    Authors: George Bissias, Rainer Böhme, David Thibodeau, Brian N. Levine

    Abstract: A key component of security in decentralized blockchains is proof of opportunity cost among block producers. In the case of proof-of-work (PoW), currently used by the most prominent systems, the cost is due to spent computation. In this paper, we characterize the security investment of miners in terms of its cost in fiat money. This enables comparison of security allocations across PoW blockchains…

    Submitted 7 December, 2020; originally announced December 2020.

  7. arXiv:2010.06324  [pdf, other]

    cs.LG cs.AI stat.ML

    Balancing Constraints and Rewards with Meta-Gradient D4PG

    Authors: Dan A. Calian, Daniel J. Mankowitz, Tom Zahavy, Zhongwen Xu, Junhyuk Oh, Nir Levine, Timothy Mann

    Abstract: Deploying Reinforcement Learning (RL) agents to solve real-world applications often requires satisfying complex system constraints. Often the constraint thresholds are incorrectly set due to the complex nature of a system or the inability to verify the thresholds offline (e.g., no simulator or reasonable offline evaluation procedure exists). This results in solutions where a task cannot be solved w…

    Submitted 27 November, 2020; v1 submitted 13 October, 2020; originally announced October 2020.

  8. arXiv:2006.12620  [pdf, other]

    cs.LG cs.AI

    A maximum-entropy approach to off-policy evaluation in average-reward MDPs

    Authors: Nevena Lazic, Dong Yin, Mehrdad Farajtabar, Nir Levine, Dilan Gorur, Chris Harris, Dale Schuurmans

    Abstract: This work focuses on off-policy evaluation (OPE) with function approximation in infinite-horizon undiscounted Markov decision processes (MDPs). For MDPs that are ergodic and linear (i.e. where rewards and dynamics are linear in some known features), we provide the first finite-sample OPE error bound, extending existing results beyond the episodic and discounted cases. In a more general setting, wh…

    Submitted 17 June, 2020; originally announced June 2020.

  9. arXiv:2006.10974  [pdf, ps, other]

    cs.LG stat.ML

    Optimization and Generalization of Regularization-Based Continual Learning: a Loss Approximation Viewpoint

    Authors: Dong Yin, Mehrdad Farajtabar, Ang Li, Nir Levine, Alex Mott

    Abstract: Neural networks have achieved remarkable success in many cognitive tasks. However, when they are trained sequentially on multiple tasks without access to old data, their performance on early tasks tends to drop significantly. This problem is often referred to as catastrophic forgetting, a key challenge in continual learning of neural networks. The regularization-based approach is one of the primary…

    Submitted 8 February, 2021; v1 submitted 19 June, 2020; originally announced June 2020.

    Comments: Preliminary version with a different title presented at ICML Workshop on Continual Learning, 2020 (spotlight)

  10. arXiv:2003.11881  [pdf, other]

    cs.LG cs.AI

    An empirical investigation of the challenges of real-world reinforcement learning

    Authors: Gabriel Dulac-Arnold, Nir Levine, Daniel J. Mankowitz, Jerry Li, Cosmin Paduraru, Sven Gowal, Todd Hester

    Abstract: Reinforcement learning (RL) has proven its worth in a series of artificial domains, and is beginning to show some successes in real-world scenarios. However, many of the research advances in RL are hard to leverage in real-world systems due to a series of assumptions that are rarely satisfied in practice. In this work, we identify and formalize a series of independent challenges that embody the di…

    Submitted 4 March, 2021; v1 submitted 24 March, 2020; originally announced March 2020.

    Comments: arXiv admin note: text overlap with arXiv:1904.12901

  11. arXiv:1909.01506  [pdf, other]

    cs.LG stat.ML

    Prediction, Consistency, Curvature: Representation Learning for Locally-Linear Control

    Authors: Nir Levine, Yinlam Chow, Rui Shu, Ang Li, Mohammad Ghavamzadeh, Hung Bui

    Abstract: Many real-world sequential decision-making problems can be formulated as optimal control with high-dimensional observations and unknown dynamics. A promising approach is to embed the high-dimensional observations into a lower-dimensional latent representation space, estimate the latent dynamics model, then utilize this model for control in the latent space. An important open question is how to lea…

    Submitted 10 February, 2020; v1 submitted 3 September, 2019; originally announced September 2019.

  12. arXiv:1907.09883  [pdf, other]

    cs.GT cs.CR

    Greedy but Cautious: Conditions for Miner Convergence to Resource Allocation Equilibrium

    Authors: George Bissias, Brian N. Levine, David Thibodeau

    Abstract: All public blockchains are secured by a proof of opportunity cost among block producers. For example, the security offered by proof-of-work (PoW) systems, like Bitcoin, is due to spent computation; it is work precisely because it cannot be performed for free. In general, more resources provably lost in producing blocks yield more security for the blockchain. When two blockchains share the same me…

    Submitted 26 August, 2019; v1 submitted 19 July, 2019; originally announced July 2019.

  13. arXiv:1907.00302  [pdf, other]

    cs.CR

    Bonded Mining: Difficulty Adjustment by Miner Commitment

    Authors: George Bissias, David Thibodeau, Brian N. Levine

    Abstract: Proof-of-work blockchains must implement a difficulty adjustment algorithm (DAA) in order to maintain a consistent inter-arrival time between blocks. Conventional DAAs are essentially feedback controllers, and as such, they are inherently reactive. This approach leaves them susceptible to manipulation and often causes them to either under- or over-correct. We present Bonded Mining, a proactive DAA…

    Submitted 5 August, 2019; v1 submitted 29 June, 2019; originally announced July 2019.

  14. arXiv:1906.07516  [pdf, other]

    cs.LG cs.AI stat.ML

    Robust Reinforcement Learning for Continuous Control with Model Misspecification

    Authors: Daniel J. Mankowitz, Nir Levine, Rae Jeong, Yuanyuan Shi, Jackie Kay, Abbas Abdolmaleki, Jost Tobias Springenberg, Timothy Mann, Todd Hester, Martin Riedmiller

    Abstract: We provide a framework for incorporating robustness -- to perturbations in the transition dynamics which we refer to as model misspecification -- into continuous control Reinforcement Learning (RL) algorithms. We specifically focus on incorporating robustness into a state-of-the-art continuous control RL algorithm called Maximum a-posteriori Policy Optimization (MPO). We achieve this by learning a…

    Submitted 11 February, 2020; v1 submitted 18 June, 2019; originally announced June 2019.

  15. arXiv:1902.03393  [pdf, other]

    cs.LG cs.AI stat.ML

    Improved Knowledge Distillation via Teacher Assistant

    Authors: Seyed-Iman Mirzadeh, Mehrdad Farajtabar, Ang Li, Nir Levine, Akihiro Matsukawa, Hassan Ghasemzadeh

    Abstract: Despite the fact that deep neural networks are powerful models and achieve appealing results on many tasks, they are too large to be deployed on edge devices like smartphones or embedded sensor nodes. There have been efforts to compress these networks, and a popular method is knowledge distillation, where a large (teacher) pre-trained network is used to train a smaller (student) network. However,…

    Submitted 16 December, 2019; v1 submitted 9 February, 2019; originally announced February 2019.

    Comments: AAAI 2020

  16. arXiv:1806.07189  [pdf, other]

    cs.CR

    Using Economic Risk to Model Miner Hash Rate Allocation in Cryptocurrencies

    Authors: George Bissias, Brian N. Levine, David Thibodeau

    Abstract: Abrupt changes in the miner hash rate applied to a proof-of-work (PoW) blockchain can adversely affect user experience and security. Because different PoW blockchains often share hashing algorithms, miners face a complex choice in deciding how to allocate their hash power among chains. We present an economic model that leverages Modern Portfolio Theory to predict a miner's allocation over time usi…

    Submitted 19 June, 2018; originally announced June 2018.

  17. arXiv:1709.08750  [pdf, other]

    cs.CR

    Bobtail: A Proof-of-Work Target that Minimizes Blockchain Mining Variance (Draft)

    Authors: George Bissias, Brian Neil Levine

    Abstract: Blockchain systems are designed to produce blocks at a constant average rate. The most popular systems currently employ a Proof of Work (PoW) algorithm as a means of creating these blocks. Bitcoin produces, on average, one block every 10 minutes. An unfortunate limitation of all deployed PoW blockchain systems is that the time between blocks has high variance. For example, 5% of the time, Bitcoin'…

    Submitted 13 August, 2019; v1 submitted 25 September, 2017; originally announced September 2017.

  18. arXiv:1706.02061  [pdf, ps, other]

    cs.IR

    An Extended Relevance Model for Session Search

    Authors: Nir Levine, Haggai Roitman, Doron Cohen

    Abstract: The session search task aims at best serving the user's information need given her previous search behavior during the session. We propose an extended relevance model that captures the user's dynamic information need in the session. Our relevance modelling approach is directly driven by the user's query reformulation (change) decisions and the estimate of how much the user's search behavior affect…

    Submitted 7 June, 2017; originally announced June 2017.

  19. arXiv:1705.07461  [pdf, other]

    cs.AI cs.LG stat.ML

    Shallow Updates for Deep Reinforcement Learning

    Authors: Nir Levine, Tom Zahavy, Daniel J. Mankowitz, Aviv Tamar, Shie Mannor

    Abstract: Deep reinforcement learning (DRL) methods such as the Deep Q-Network (DQN) have achieved state-of-the-art results in a variety of challenging, high-dimensional domains. This success is mainly attributed to the power of deep neural networks to learn rich domain representations for approximating the value function or policy. Batch reinforcement learning methods with linear representations, on the ot…

    Submitted 2 November, 2017; v1 submitted 21 May, 2017; originally announced May 2017.

  20. arXiv:1702.07274  [pdf, other]

    stat.ML cs.LG

    Rotting Bandits

    Authors: Nir Levine, Koby Crammer, Shie Mannor

    Abstract: The Multi-Armed Bandits (MAB) framework highlights the tension between acquiring new knowledge (Exploration) and leveraging available knowledge (Exploitation). In the classical MAB problem, a decision maker must choose an arm at each time step, upon which she receives a reward. The decision maker's objective is to maximize her cumulative expected reward over the time horizon. The MAB problem has b…

    Submitted 2 November, 2017; v1 submitted 23 February, 2017; originally announced February 2017.

  21. arXiv:1701.03977  [pdf, other]

    cs.CR

    An Explanation of Nakamoto's Analysis of Double-spend Attacks

    Authors: A. Pinar Ozisik, Brian Neil Levine

    Abstract: The fundamental attack against blockchain systems is the double-spend attack. In this tutorial, we provide a very detailed explanation of just one section of Satoshi Nakamoto's original paper where the attack's probability of success is stated. We show the derivation of the mathematics relied upon by Nakamoto to create a model of the attack. We also validate the model with a Monte Carlo simulation…

    Submitted 14 January, 2017; originally announced January 2017.

  22. arXiv:1610.07985  [pdf, other]

    cs.CR

    An Analysis of Attacks on Blockchain Consensus

    Authors: George Bissias, Brian Neil Levine, A. Pinar Ozisik, Gavin Andresen

    Abstract: We present and validate a novel mathematical model of the blockchain mining process and use it to conduct an economic evaluation of the double-spend attack, which is fundamental to all blockchain systems. Our analysis focuses on the value of transactions that can be secured under a conventional double-spend attack, both with and without a concurrent eclipse attack. Our model quantifies the importa…

    Submitted 20 November, 2016; v1 submitted 25 October, 2016; originally announced October 2016.

  23. arXiv:1504.04114  [pdf, other]

    stat.ML cs.LG cs.SI

    Actively Learning to Attract Followers on Twitter

    Authors: Nir Levine, Timothy A. Mann, Shie Mannor

    Abstract: Twitter, a popular social network, presents great opportunities for on-line machine learning research. However, previous research has focused almost entirely on learning from passively collected data. We study the problem of learning to acquire followers through normative user behavior, as opposed to the mass following policies applied by many bots. We formalize the problem as a contextual bandit…

    Submitted 16 April, 2015; originally announced April 2015.