Showing 1–17 of 17 results for author: Hinton, G E

Searching in archive cs.
  1. arXiv:2211.16564

    cs.CV cs.LG

    Testing GLOM's ability to infer wholes from ambiguous parts

    Authors: Laura Culp, Sara Sabour, Geoffrey E. Hinton

    Abstract: The GLOM architecture proposed by Hinton [2021] is a recurrent neural network for parsing an image into a hierarchy of wholes and parts. When a part is ambiguous, GLOM assumes that the ambiguity can be resolved by allowing the part to make multi-modal predictions for the pose and identity of the whole to which it belongs and then using attention to similar predictions coming from other possibly am…

    Submitted 29 November, 2022; originally announced November 2022.

  2. arXiv:2011.13920

    cs.CV cs.LG

    Unsupervised part representation by Flow Capsules

    Authors: Sara Sabour, Andrea Tagliasacchi, Soroosh Yazdani, Geoffrey E. Hinton, David J. Fleet

    Abstract: Capsule networks aim to parse images into a hierarchy of objects, parts and relations. While promising, they remain limited by an inability to learn effective low level part descriptions. To address this issue we propose a way to learn primary capsule encoders that detect atomic parts from a single image. During training we exploit motion as a powerful perceptual cue for part definition, with an e…

    Submitted 19 February, 2021; v1 submitted 27 November, 2020; originally announced November 2020.

  3. arXiv:1906.06818

    stat.ML cs.CV cs.LG cs.NE

    Stacked Capsule Autoencoders

    Authors: Adam R. Kosiorek, Sara Sabour, Yee Whye Teh, Geoffrey E. Hinton

    Abstract: Objects are composed of a set of geometrically organized parts. We introduce an unsupervised capsule autoencoder (SCAE), which explicitly uses geometric relationships between parts to reason about objects. Since these relationships do not depend on the viewpoint, our model is robust to viewpoint changes. SCAE consists of two stages. In the first stage, the model predicts presences and poses of par…

    Submitted 2 December, 2019; v1 submitted 16 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2019; 14 pages, 7 figures, 4 tables, code is available at https://github.com/google-research/google-research/tree/master/stacked_capsule_autoencoders

  4. arXiv:1905.13678

    cs.LG stat.ML

    Learning Sparse Networks Using Targeted Dropout

    Authors: Aidan N. Gomez, Ivan Zhang, Siddhartha Rao Kamalakara, Divyam Madaan, Kevin Swersky, Yarin Gal, Geoffrey E. Hinton

    Abstract: Neural networks are easier to optimise when they have many more weights than are required for modelling the mapping from inputs to outputs. This suggests a two-stage learning procedure that first learns a large net and then prunes away connections or hidden units. But standard training does not necessarily encourage nets to be amenable to pruning. We introduce targeted dropout, a method for traini…

    Submitted 9 September, 2019; v1 submitted 31 May, 2019; originally announced May 2019.

  5. arXiv:1807.04587

    cs.LG cs.AI cs.NE stat.ML

    Assessing the Scalability of Biologically-Motivated Deep Learning Algorithms and Architectures

    Authors: Sergey Bartunov, Adam Santoro, Blake A. Richards, Luke Marris, Geoffrey E. Hinton, Timothy Lillicrap

    Abstract: The backpropagation of error algorithm (BP) is impossible to implement in a real brain. The recent success of deep networks in machine learning and AI, however, has inspired proposals for understanding how the brain might learn across multiple layers, and hence how it might approximate BP. As of yet, none of these proposals have been rigorously evaluated on tasks where BP-guided deep learning has…

    Submitted 20 November, 2018; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: NIPS 2018. Version 2 contains more experimental data including best hyperparameters found

  6. arXiv:1804.03235

    cs.LG cs.AI stat.ML

    Large scale distributed neural network training through online distillation

    Authors: Rohan Anil, Gabriel Pereyra, Alexandre Passos, Robert Ormandi, George E. Dahl, Geoffrey E. Hinton

    Abstract: Techniques such as ensembling and distillation promise model quality improvements when paired with almost any base model. However, due to increased test-time cost (for ensembles) and increased complexity of the training pipeline (for distillation), these techniques are challenging to use in industrial settings. In this paper we explore a variant of distillation which is relatively straightforward…

    Submitted 20 August, 2020; v1 submitted 9 April, 2018; originally announced April 2018.

    Comments: Clarify that implementations should use available parallelism in pseudo-code

  7. arXiv:1710.09829

    cs.CV

    Dynamic Routing Between Capsules

    Authors: Sara Sabour, Nicholas Frosst, Geoffrey E Hinton

    Abstract: A capsule is a group of neurons whose activity vector represents the instantiation parameters of a specific type of entity such as an object or an object part. We use the length of the activity vector to represent the probability that the entity exists and its orientation to represent the instantiation parameters. Active capsules at one level make predictions, via transformation matrices, for the…

    Submitted 7 November, 2017; v1 submitted 26 October, 2017; originally announced October 2017.

  8. arXiv:1703.08774

    cs.LG cs.CV

    Who Said What: Modeling Individual Labelers Improves Classification

    Authors: Melody Y. Guan, Varun Gulshan, Andrew M. Dai, Geoffrey E. Hinton

    Abstract: Data are often labeled by many different experts with each expert only labeling a small fraction of the data and each data point being labeled by several experts. This reduces the workload on individual experts and also gives a better estimate of the unobserved ground truth. When experts disagree, the standard approaches are to treat the majority opinion as the correct label or to model the correc…

    Submitted 4 January, 2018; v1 submitted 26 March, 2017; originally announced March 2017.

    Comments: AAAI 2018

  9. arXiv:1607.06450

    stat.ML cs.LG

    Layer Normalization

    Authors: Jimmy Lei Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

    Abstract: Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that n…

    Submitted 21 July, 2016; originally announced July 2016.

  10. arXiv:1603.08575

    cs.CV cs.LG

    Attend, Infer, Repeat: Fast Scene Understanding with Generative Models

    Authors: S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, Geoffrey E. Hinton

    Abstract: We present a framework for efficient inference in structured image models that explicitly reason about objects. We achieve this by performing probabilistic inference using a recurrent neural network that attends to scene elements and processes them one at a time. Crucially, the model itself learns to choose the appropriate number of inference steps. We use this scheme to learn to perform inference…

    Submitted 12 August, 2016; v1 submitted 28 March, 2016; originally announced March 2016.

  11. arXiv:1504.00941

    cs.NE cs.LG

    A Simple Way to Initialize Recurrent Networks of Rectified Linear Units

    Authors: Quoc V. Le, Navdeep Jaitly, Geoffrey E. Hinton

    Abstract: Learning long term dependencies in recurrent networks is difficult due to vanishing and exploding gradients. To overcome this difficulty, researchers have developed sophisticated optimization techniques and network architectures. In this paper, we propose a simpler solution that uses recurrent neural networks composed of rectified linear units. Key to our solution is the use of the identity matrix…

    Submitted 7 April, 2015; v1 submitted 3 April, 2015; originally announced April 2015.

  12. arXiv:1309.6865

    cs.LG cs.IR stat.ML

    Modeling Documents with Deep Boltzmann Machines

    Authors: Nitish Srivastava, Ruslan R Salakhutdinov, Geoffrey E. Hinton

    Abstract: We introduce a Deep Boltzmann Machine model suitable for modeling and extracting latent semantic representations from a large unstructured collection of documents. We overcome the apparent difficulty of training a DBM with judicious parameter tying. This parameter tying enables an efficient pretraining algorithm and a state initialization scheme that aids inference. The model can be trained just a…

    Submitted 26 September, 2013; originally announced September 2013.

    Comments: Appears in Proceedings of the Twenty-Ninth Conference on Uncertainty in Artificial Intelligence (UAI2013)

    Report number: UAI-P-2013-PG-616-624

  13. arXiv:1301.2278

    cs.LG stat.ML

    Discovering Multiple Constraints that are Frequently Approximately Satisfied

    Authors: Geoffrey E. Hinton, Yee Whye Teh

    Abstract: Some high-dimensional data sets can be modelled by assuming that there are many different linear constraints, each of which is Frequently Approximately Satisfied (FAS) by the data. The probability of a data vector under the model is then proportional to the product of the probabilities of its constraint violations. We describe three methods of learning products of constraints using a heavy-tailed…

    Submitted 10 January, 2013; originally announced January 2013.

    Comments: Appears in Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence (UAI2001)

    Report number: UAI-P-2001-PG-227-234

  14. arXiv:1212.2513

    cs.LG stat.ML

    Efficient Parametric Projection Pursuit Density Estimation

    Authors: Max Welling, Richard S. Zemel, Geoffrey E. Hinton

    Abstract: Product models of low dimensional experts are a powerful way to avoid the curse of dimensionality. We present the "under-complete product of experts" (UPoE), where each expert models a one dimensional projection of the data. The UPoE is fully tractable and may be interpreted as a parametric probabilistic model for projection pursuit. Its ML learning rules are identical to the…

    Submitted 19 October, 2012; originally announced December 2012.

    Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

    Report number: UAI-P-2003-PG-575-582

  15. arXiv:1207.0580

    cs.NE cs.CV cs.LG

    Improving neural networks by preventing co-adaptation of feature detectors

    Authors: Geoffrey E. Hinton, Nitish Srivastava, Alex Krizhevsky, Ilya Sutskever, Ruslan R. Salakhutdinov

    Abstract: When a large feedforward neural network is trained on a small training set, it typically performs poorly on held-out test data. This "overfitting" is greatly reduced by randomly omitting half of the feature detectors on each training case. This prevents complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors. Instead, each neuro…

    Submitted 3 July, 2012; originally announced July 2012.

  16. arXiv:1205.2614

    cs.LG stat.ML

    Products of Hidden Markov Models: It Takes N>1 to Tango

    Authors: Graham W Taylor, Geoffrey E. Hinton

    Abstract: Products of Hidden Markov Models (PoHMMs) are an interesting class of generative models which have received little attention since their introduction. This may be in part due to their more computationally expensive gradient-based learning algorithm, and the intractability of computing the log likelihood of sequences under the model. In this paper, we demonstrate how the partition function can be esti…

    Submitted 9 May, 2012; originally announced May 2012.

    Comments: Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

    Report number: UAI-P-2009-PG-522-529

  17. arXiv:1202.3748

    cs.LG stat.ML

    Conditional Restricted Boltzmann Machines for Structured Output Prediction

    Authors: Volodymyr Mnih, Hugo Larochelle, Geoffrey E. Hinton

    Abstract: Conditional Restricted Boltzmann Machines (CRBMs) are rich probabilistic models that have recently been applied to a wide range of problems, including collaborative filtering, classification, and modeling motion capture data. While much progress has been made in training non-conditional RBMs, these algorithms are not applicable to conditional models and there has been almost no work on training an…

    Submitted 14 February, 2012; originally announced February 2012.

    Report number: UAI-P-2011-PG-514-522