Zum Hauptinhalt springen

Showing 1–33 of 33 results for author: Horesh, L

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.16661  [pdf, ps, other

    math.NA cs.CC cs.DM

    Regenerative Ulam-von Neumann Algorithm: An Innovative Markov chain Monte Carlo Method for Matrix Inversion

    Authors: Soumyadip Ghosh, Lior Horesh, Vassilis Kalantzis, Yingdong Lu, Tomasz Nowicki

    Abstract: This paper presents an extension of the classical Ulan-von Neumann Markov chain Monte-Carlo algorithm for the computation of the matrix inverse. The algorithm presented in this paper, termed as \emph{regenerative Ulam-von Neumann algorithm}, utilizes the regenerative structure of classical, non-truncated Neumann series defined by a non-singular matrix and produces an unbiased estimator of the matr… ▽ More

    Submitted 16 August, 2024; v1 submitted 23 July, 2024; originally announced July 2024.

    MSC Class: 68Q25; 68R10; 65C05

  2. arXiv:2405.01098  [pdf, ps, other

    quant-ph cs.LG math.NA

    Multivariate trace estimation using quantum state space linear algebra

    Authors: Liron Mor Yosef, Shashanka Ubaru, Lior Horesh, Haim Avron

    Abstract: In this paper, we present a quantum algorithm for approximating multivariate traces, i.e. the traces of matrix products. Our research is motivated by the extensive utility of multivariate traces in elucidating spectral characteristics of matrices, as well as by recent advancements in leveraging quantum computing for faster numerical linear algebra. Central to our approach is a direct translation o… ▽ More

    Submitted 2 May, 2024; originally announced May 2024.

  3. arXiv:2401.13754  [pdf, other

    math.NA cs.ET

    Multi-Function Multi-Way Analog Technology for Sustainable Machine Intelligence Computation

    Authors: Vassilis Kalantzis, Mark S. Squillante, Shashanka Ubaru, Tayfun Gokmen, Chai Wah Wu, Anshul Gupta, Haim Avron, Tomasz Nowicki, Malte Rasch, Murat Onen, Vanessa Lopez Marrero, Effendi Leobandung, Yasuteru Kohda, Wilfried Haensch, Lior Horesh

    Abstract: Numerical computation is essential to many areas of artificial intelligence (AI), whose computing demands continue to grow dramatically, yet their continued scaling is jeopardized by the slowdown in Moore's law. Multi-function multi-way analog (MFMWA) technology, a computing architecture comprising arrays of memristors supporting in-memory computation of matrix operations, can offer tremendous imp… ▽ More

    Submitted 24 January, 2024; originally announced January 2024.

    MSC Class: 65F10; C3; G1 ACM Class: G.1.3

  4. On the Prospects of Incorporating Large Language Models (LLMs) in Automated Planning and Scheduling (APS)

    Authors: Vishal Pallagani, Kaushik Roy, Bharath Muppasani, Francesco Fabiano, Andrea Loreggia, Keerthiram Murugesan, Biplav Srivastava, Francesca Rossi, Lior Horesh, Amit Sheth

    Abstract: Automated Planning and Scheduling is among the growing areas in Artificial Intelligence (AI) where mention of LLMs has gained popularity. Based on a comprehensive review of 126 papers, this paper investigates eight categories based on the unique applications of LLMs in addressing various aspects of planning problems: language translation, plan generation, model construction, multi-agent planning,… ▽ More

    Submitted 20 January, 2024; v1 submitted 4 January, 2024; originally announced January 2024.

    Journal ref: Proceedings of the International Conference on Automated Planning and Scheduling, 34(1), 432-444 (2024)

  5. arXiv:2308.09474  [pdf, other

    cs.AI cs.SC math.OC

    Evolving Scientific Discovery by Unifying Data and Background Knowledge with AI Hilbert

    Authors: Ryan Cory-Wright, Cristina Cornelio, Sanjeeb Dash, Bachir El Khadir, Lior Horesh

    Abstract: The discovery of scientific formulae that parsimoniously explain natural phenomena and align with existing background theory is a key goal in science. Historically, scientists have derived natural laws by manipulating equations based on existing knowledge, forming new equations, and verifying them experimentally. In recent years, data-driven scientific discovery has emerged as a viable competitor… ▽ More

    Submitted 29 April, 2024; v1 submitted 18 August, 2023; originally announced August 2023.

    Comments: Revised version, including a significant number of new experiments+supplementary material in appendix, and a title change

  6. arXiv:2307.07628  [pdf, other

    cs.AI cs.CY cs.HC

    Value-based Fast and Slow AI Nudging

    Authors: Marianna B. Ganapini, Francesco Fabiano, Lior Horesh, Andrea Loreggia, Nicholas Mattei, Keerthiram Murugesan, Vishal Pallagani, Francesca Rossi, Biplav Srivastava, Brent Venable

    Abstract: Nudging is a behavioral strategy aimed at influencing people's thoughts and actions. Nudging techniques can be found in many situations in our daily lives, and these nudging techniques can targeted at human fast and unconscious thinking, e.g., by using images to generate fear or the more careful and effortful slow thinking, e.g., by releasing information that makes us reflect on our choices. In th… ▽ More

    Submitted 14 July, 2023; originally announced July 2023.

  7. arXiv:2305.16151  [pdf, other

    cs.AI

    Understanding the Capabilities of Large Language Models for Automated Planning

    Authors: Vishal Pallagani, Bharath Muppasani, Keerthiram Murugesan, Francesca Rossi, Biplav Srivastava, Lior Horesh, Francesco Fabiano, Andrea Loreggia

    Abstract: Automated planning is concerned with developing efficient algorithms to generate plans or sequences of actions to achieve a specific goal in a given environment. Emerging Large Language Models (LLMs) can answer questions, write high-quality programming code, and predict protein folding, showcasing their versatility in solving various tasks beyond language-based problems. In this paper, we aim to e… ▽ More

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: 12 pages

  8. arXiv:2303.04283  [pdf, other

    cs.AI

    Fast and Slow Planning

    Authors: Francesco Fabiano, Vishal Pallagani, Marianna Bergamaschi Ganapini, Lior Horesh, Andrea Loreggia, Keerthiram Murugesan, Francesca Rossi, Biplav Srivastava

    Abstract: The concept of Artificial Intelligence has gained a lot of attention over the last decade. In particular, AI-based tools have been employed in several scenarios and are, by now, pervading our everyday life. Nonetheless, most of these systems lack many capabilities that we would naturally consider to be included in a notion of "intelligence". In this work, we present an architecture that, inspired… ▽ More

    Submitted 7 March, 2023; originally announced March 2023.

  9. arXiv:2212.08681  [pdf

    cs.AI

    Plansformer: Generating Symbolic Plans using Transformers

    Authors: Vishal Pallagani, Bharath Muppasani, Keerthiram Murugesan, Francesca Rossi, Lior Horesh, Biplav Srivastava, Francesco Fabiano, Andrea Loreggia

    Abstract: Large Language Models (LLMs) have been the subject of active research, significantly advancing the field of Natural Language Processing (NLP). From BERT to BLOOM, LLMs have surpassed state-of-the-art results in various natural language tasks such as question answering, summarization, and text generation. Many ongoing efforts focus on understanding LLMs' capabilities, including their knowledge of t… ▽ More

    Submitted 16 December, 2022; originally announced December 2022.

    Comments: 44 pages including supplementary material

  10. arXiv:2211.15860  [pdf, other

    cs.LG stat.CO

    Bayesian Experimental Design for Symbolic Discovery

    Authors: Kenneth L. Clarkson, Cristina Cornelio, Sanjeeb Dash, Joao Goncalves, Lior Horesh, Nimrod Megiddo

    Abstract: This study concerns the formulation and application of Bayesian optimal experimental design to symbolic discovery, which is the inference from observational data of predictive models taking general functional forms. We apply constrained first-order methods to optimize an appropriate selection criterion, using Hamiltonian Monte Carlo to sample from the prior. A step for computing the predictive dis… ▽ More

    Submitted 28 November, 2022; originally announced November 2022.

  11. arXiv:2209.11452  [pdf, other

    quant-ph cs.DS cs.ET

    From String Detection to Orthogonal Vector Problem

    Authors: Yunhao Wang, Tianyuan Zheng, Lior Horesh

    Abstract: Considering Grover's Search Algorithm (GSA) with the standard diffuser stage applied, we revisit the $3$-qubit unique String Detection Problem (SDP) and extend the algorithm to $4$-qubit SDP with multiple winners. We then investigate unstructured search problems with non-uniform distributions and define the Orthogonal Vector Problem (OVP) under quantum settings. Although no numerically stable resu… ▽ More

    Submitted 23 September, 2022; originally announced September 2022.

  12. arXiv:2209.10146  [pdf, other

    quant-ph cs.CR

    Creating quantum-resistant classical-classical OWFs from quantum-classical OWFs

    Authors: Wei Zheng Teo, Marco Carmosino, Lior Horesh

    Abstract: One-way functions (OWF) are one of the most essential cryptographic primitives, the existence of which results in wide-ranging ramifications such as private-key encryption and proving $P \neq NP$. These OWFs are often thought of as having classical input and output (i.e. binary strings), however, recent work proposes OWF constructions where the input and/or the output can be quantum. In this paper… ▽ More

    Submitted 21 September, 2022; originally announced September 2022.

  13. arXiv:2209.09371  [pdf, other

    quant-ph cs.LG math.NA

    Topological data analysis on noisy quantum computers

    Authors: Ismail Yunus Akhalwaya, Shashanka Ubaru, Kenneth L. Clarkson, Mark S. Squillante, Vishnu Jejjala, Yang-Hui He, Kugendran Naidoo, Vasileios Kalantzis, Lior Horesh

    Abstract: Topological data analysis (TDA) is a powerful technique for extracting complex and valuable shape-related summaries of high-dimensional data. However, the computational demands of classical algorithms for computing TDA are exorbitant, and quickly become impractical for high-order characteristics. Quantum computers offer the potential of achieving significant speedup for certain computational probl… ▽ More

    Submitted 19 March, 2024; v1 submitted 19 September, 2022; originally announced September 2022.

    Comments: This paper is a follow up to arXiv:2108.02811 with improved theoretical results and other additional results. This new version presents an improved runtime for the algorithm, and fixes an issue present in the previous version

    Journal ref: In the Proceedings of The Twelfth International Conference on Learning Representations (ICLR 2024)

  14. arXiv:2206.06257  [pdf, other

    cs.LG

    Distributed Adversarial Training to Robustify Deep Neural Networks at Scale

    Authors: Gaoyuan Zhang, Songtao Lu, Yihua Zhang, Xiangyi Chen, Pin-Yu Chen, Quanfu Fan, Lee Martie, Lior Horesh, Mingyi Hong, Sijia Liu

    Abstract: Current deep neural networks (DNNs) are vulnerable to adversarial attacks, where adversarial perturbations to the inputs can change or manipulate classification. To defend against such attacks, an effective and popular approach, known as adversarial training (AT), has been shown to mitigate the negative impact of adversarial attacks by virtue of a min-max robust training method. While effective, i… ▽ More

    Submitted 7 September, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

  15. arXiv:2202.05063  [pdf, ps, other

    cs.LG stat.ML

    PCENet: High Dimensional Surrogate Modeling for Learning Uncertainty

    Authors: Paz Fink Shustin, Shashanka Ubaru, Vasileios Kalantzis, Lior Horesh, Haim Avron

    Abstract: Learning data representations under uncertainty is an important task that emerges in numerous machine learning applications. However, uncertainty quantification (UQ) techniques are computationally intensive and become prohibitively expensive for high-dimensional data. In this paper, we present a novel surrogate model for representation learning and uncertainty quantification, which aims to deal wi… ▽ More

    Submitted 11 February, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

  16. arXiv:2201.07050  [pdf, other

    cs.AI cs.LG

    Combining Fast and Slow Thinking for Human-like and Efficient Navigation in Constrained Environments

    Authors: Marianna B. Ganapini, Murray Campbell, Francesco Fabiano, Lior Horesh, Jon Lenchner, Andrea Loreggia, Nicholas Mattei, Taher Rahgooy, Francesca Rossi, Biplav Srivastava, Brent Venable

    Abstract: Current AI systems lack several important human capabilities, such as adaptability, generalizability, self-control, consistency, common sense, and causal reasoning. We believe that existing cognitive theories of human decision making, such as the thinking fast and slow theory, can provide insights on how to advance AI systems towards some of these capabilities. In this paper, we propose a general… ▽ More

    Submitted 12 February, 2022; v1 submitted 18 January, 2022; originally announced January 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2110.01834

  17. arXiv:2110.01834  [pdf, other

    cs.AI

    Thinking Fast and Slow in AI: the Role of Metacognition

    Authors: Marianna Bergamaschi Ganapini, Murray Campbell, Francesco Fabiano, Lior Horesh, Jon Lenchner, Andrea Loreggia, Nicholas Mattei, Francesca Rossi, Biplav Srivastava, Kristen Brent Venable

    Abstract: AI systems have seen dramatic advancement in recent years, bringing many applications that pervade our everyday life. However, we are still mostly seeing instances of narrow AI: many of these recent developments are typically focused on a very limited set of competencies and goals, e.g., image interpretation, natural language processing, classification, prediction, and many others. Moreover, while… ▽ More

    Submitted 5 October, 2021; originally announced October 2021.

  18. arXiv:2109.01634  [pdf, other

    cs.AI

    AI Descartes: Combining Data and Theory for Derivable Scientific Discovery

    Authors: Cristina Cornelio, Sanjeeb Dash, Vernon Austel, Tyler Josephson, Joao Goncalves, Kenneth Clarkson, Nimrod Megiddo, Bachir El Khadir, Lior Horesh

    Abstract: Scientists have long aimed to discover meaningful formulae which accurately describe experimental data. A common approach is to manually create mathematical models of natural phenomena using domain knowledge, and then fit these models to data. In contrast, machine-learning algorithms automate the construction of accurate data-driven models while consuming large amounts of data. The problem of inco… ▽ More

    Submitted 9 January, 2023; v1 submitted 3 September, 2021; originally announced September 2021.

  19. arXiv:2108.02811  [pdf, ps, other

    quant-ph cs.LG math.NA

    Quantum Topological Data Analysis with Linear Depth and Exponential Speedup

    Authors: Shashanka Ubaru, Ismail Yunus Akhalwaya, Mark S. Squillante, Kenneth L. Clarkson, Lior Horesh

    Abstract: Quantum computing offers the potential of exponential speedups for certain classical computations. Over the last decade, many quantum machine learning (QML) algorithms have been proposed as candidates for such exponential improvements. However, two issues unravel the hope of exponential speedup for some of these QML algorithms: the data-loading problem and, more recently, the stunning dequantizati… ▽ More

    Submitted 5 August, 2021; originally announced August 2021.

    Comments: 27 pages

  20. arXiv:2107.08739  [pdf, ps, other

    cs.AI cs.CL

    E-PDDL: A Standardized Way of Defining Epistemic Planning Problems

    Authors: Francesco Fabiano, Biplav Srivastava, Jonathan Lenchner, Lior Horesh, Francesca Rossi, Marianna Bergamaschi Ganapini

    Abstract: Epistemic Planning (EP) refers to an automated planning setting where the agent reasons in the space of knowledge states and tries to find a plan to reach a desirable state from the current state. Its general form, the Multi-agent Epistemic Planning (MEP) problem involves multiple agents who need to reason about both the state of the world and the information flow between agents. In a MEP problem,… ▽ More

    Submitted 19 July, 2021; originally announced July 2021.

    Comments: 9 pages, Knowledge Engineering for Planning and Scheduling - ICAPS 2021

  21. arXiv:2107.06973  [pdf, other

    cs.ET

    Solving sparse linear systems with approximate inverse preconditioners on analog devices

    Authors: Vasileios Kalantzis, Anshul Gupta, Lior Horesh, Tomasz Nowicki, Mark S. Squillante, Chai Wah Wu

    Abstract: Sparse linear system solvers are computationally expensive kernels that lie at the heart of numerous applications. This paper proposes a flexible preconditioning framework to substantially reduce the time and energy requirements of this task by utilizing a hybrid architecture that combines conventional digital microprocessors with analog crossbar array accelerators. Our analysis and experiments wi… ▽ More

    Submitted 14 July, 2021; originally announced July 2021.

    MSC Class: C3; G1

  22. arXiv:2012.14714  [pdf, other

    quant-ph cs.LG

    Denoising quantum states with Quantum Autoencoders -- Theory and Applications

    Authors: Tom Achache, Lior Horesh, John Smolin

    Abstract: We implement a Quantum Autoencoder (QAE) as a quantum circuit capable of correcting Greenberger-Horne-Zeilinger (GHZ) states subject to various noisy quantum channels : the bit-flip channel and the more general quantum depolarizing channel. The QAE shows particularly interesting results, as it enables to perform an almost perfect reconstruction of noisy states, but can also, more surprisingly, act… ▽ More

    Submitted 29 December, 2020; originally announced December 2020.

    Comments: 13 pages

  23. arXiv:2011.12581  [pdf, other

    cs.LG stat.ML

    Overcoming Catastrophic Forgetting via Direction-Constrained Optimization

    Authors: Yunfei Teng, Anna Choromanska, Murray Campbell, Songtao Lu, Parikshit Ram, Lior Horesh

    Abstract: This paper studies a new design of the optimization algorithm for training deep learning models with a fixed architecture of the classification network in a continual learning framework. The training data is non-stationary and the non-stationarity is imposed by a sequence of distinct tasks. We first analyze a deep model trained on only one learning task in isolation and identify a region in networ… ▽ More

    Submitted 1 July, 2022; v1 submitted 25 November, 2020; originally announced November 2020.

  24. arXiv:2011.04125  [pdf, ps, other

    cs.DS cs.LG quant-ph

    Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra

    Authors: Nadiia Chepurko, Kenneth L. Clarkson, Lior Horesh, Honghao Lin, David P. Woodruff

    Abstract: We create classical (non-quantum) dynamic data structures supporting queries for recommender systems and least-squares regression that are comparable to their quantum analogues. De-quantizing such algorithms has received a flurry of attention in recent years; we obtain sharper bounds for these problems. More significantly, we achieve these improvements by arguing that the previous quantum-inspired… ▽ More

    Submitted 28 June, 2022; v1 submitted 8 November, 2020; originally announced November 2020.

    Comments: Minor edits to exposition

  25. arXiv:2010.06392  [pdf, ps, other

    math.NA cs.IR stat.ML

    Projection techniques to update the truncated SVD of evolving matrices

    Authors: Vassilis Kalantzis, Georgios Kollias, Shashanka Ubaru, Athanasios N. Nikolakopoulos, Lior Horesh, Kenneth L. Clarkson

    Abstract: This paper considers the problem of updating the rank-k truncated Singular Value Decomposition (SVD) of matrices subject to the addition of new rows and/or columns over time. Such matrix problems represent an important computational kernel in applications such as Latent Semantic Indexing and Recommender Systems. Nonetheless, the proposed framework is purely algebraic and targets general updating p… ▽ More

    Submitted 13 October, 2020; originally announced October 2020.

    Comments: 13 pages

  26. arXiv:2010.06002  [pdf, ps, other

    cs.AI

    Thinking Fast and Slow in AI

    Authors: Grady Booch, Francesco Fabiano, Lior Horesh, Kiran Kate, Jon Lenchner, Nick Linck, Andrea Loreggia, Keerthiram Murugesan, Nicholas Mattei, Francesca Rossi, Biplav Srivastava

    Abstract: This paper proposes a research direction to advance AI which draws inspiration from cognitive theories of human decision making. The premise is that if we gain insights about the causes of some human capabilities that are still lacking in AI (for instance, adaptability, generalizability, common sense, and causal reasoning), we may obtain similar capabilities in an AI system by embedding these caus… ▽ More

    Submitted 15 December, 2020; v1 submitted 12 October, 2020; originally announced October 2020.

    Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence 2021, 35(17), 15042-15046

  27. arXiv:2006.06813  [pdf, other

    cs.LG stat.ML

    Symbolic Regression using Mixed-Integer Nonlinear Optimization

    Authors: Vernon Austel, Cristina Cornelio, Sanjeeb Dash, Joao Goncalves, Lior Horesh, Tyler Josephson, Nimrod Megiddo

    Abstract: The Symbolic Regression (SR) problem, where the goal is to find a regression function that does not have a pre-specified form but is any function that can be composed of a list of operators, is a hard problem in machine learning, both theoretically and computationally. Genetic programming based methods, that heuristically search over a very large space of functions, are the most commonly used meth… ▽ More

    Submitted 11 June, 2020; originally announced June 2020.

  28. arXiv:1910.08135  [pdf, other

    quant-ph cs.CR

    Communication over Continuous Quantum Secure Dialogue using Einstein-Podolsky-Rosen States

    Authors: Shaokai Lin, Zichuan Wang, Lior Horesh

    Abstract: With the emergence of quantum computing and quantum networks, many communication protocols that take advantage of the unique properties of quantum mechanics to achieve a secure bidirectional exchange of information, have been proposed. In this study, we propose a new quantum communication protocol, called Continuous Quantum Secure Dialogue (CQSD), that allows two parties to continuously exchange m… ▽ More

    Submitted 27 October, 2019; v1 submitted 17 October, 2019; originally announced October 2019.

    Comments: Accepted for presentation in a poster session at QIP 2020

  29. Dynamic Graph Convolutional Networks Using the Tensor M-Product

    Authors: Osman Asif Malik, Shashanka Ubaru, Lior Horesh, Misha E. Kilmer, Haim Avron

    Abstract: Many irregular domains such as social networks, financial transactions, neuron connections, and natural language constructs are represented using graph structures. In recent years, a variety of graph neural networks (GNNs) have been successfully applied for representation learning and prediction on such graphs. In many of the real-world applications, the underlying graph changes over time, however… ▽ More

    Submitted 22 January, 2021; v1 submitted 16 October, 2019; originally announced October 2019.

    Comments: Accepted to SIAM International Conference on Data Mining (SDM) 2021

  30. arXiv:1904.12933  [pdf, other

    cs.LG quant-ph stat.ML

    Recurrent Neural Networks in the Eye of Differential Equations

    Authors: Murphy Yuezhen Niu, Lior Horesh, Isaac Chuang

    Abstract: To understand the fundamental trade-offs between training stability, temporal dynamics and architectural complexity of recurrent neural networks~(RNNs), we directly analyze RNN architectures using numerical methods of ordinary differential equations~(ODEs). We define a general family of RNNs--the ODERNNs--by relating the composition rules of RNNs to integration methods of ODEs at discrete time ste… ▽ More

    Submitted 29 April, 2019; originally announced April 2019.

    Comments: 25pages, 3 figures

  31. arXiv:1811.06569  [pdf, other

    cs.LG math.NA stat.ML

    Stable Tensor Neural Networks for Rapid Deep Learning

    Authors: Elizabeth Newman, Lior Horesh, Haim Avron, Misha Kilmer

    Abstract: We propose a tensor neural network ($t$-NN) framework that offers an exciting new paradigm for designing neural networks with multidimensional (tensor) data. Our network architecture is based on the $t$-product (Kilmer and Martin, 2011), an algebraic formulation to multiply tensors via circulant convolution. In this $t$-product algebra, we interpret tensors as $t$-linear operators analogous to mat… ▽ More

    Submitted 15 November, 2018; originally announced November 2018.

    Comments: 20 pages, 6 figures, submitted to SIMODS

  32. arXiv:1706.09693  [pdf, other

    stat.ML cs.LG stat.CO

    Image classification using local tensor singular value decompositions

    Authors: Elizabeth Newman, Misha Kilmer, Lior Horesh

    Abstract: From linear classifiers to neural networks, image classification has been a widely explored topic in mathematics, and many algorithms have proven to be effective classifiers. However, the most accurate classifiers typically have significantly high storage costs, or require complicated procedures that may be computationally expensive. We present a novel (nonlinear) classification approach using tru… ▽ More

    Submitted 29 June, 2017; originally announced June 2017.

    Comments: Submitted to IEEE CAMSAP 2017 Conference, 5 pages, 9 figures and tables

  33. arXiv:1309.1508  [pdf, other

    cs.LG cs.CL cs.NE math.OC stat.ML

    Accelerating Hessian-free optimization for deep neural networks by implicit preconditioning and sampling

    Authors: Tara N. Sainath, Lior Horesh, Brian Kingsbury, Aleksandr Y. Aravkin, Bhuvana Ramabhadran

    Abstract: Hessian-free training has become a popular parallel second or- der optimization technique for Deep Neural Network training. This study aims at speeding up Hessian-free training, both by means of decreasing the amount of data used for training, as well as through reduction of the number of Krylov subspace solver iterations used for implicit estimation of the Hessian. In this paper, we develop an L-… ▽ More

    Submitted 10 December, 2013; v1 submitted 5 September, 2013; originally announced September 2013.

    Comments: this paper is not supposed to be posted publically before the conference in December due to company policy. another co-author was not informed of this and posted without the permission of the first author. pls remove

    MSC Class: 65K05; 90C15; 90C90