
Showing 101–150 of 177 results for author: Vedaldi, A

  1. arXiv:1908.09884  [pdf, other]

    cs.CV

    Learning to Discover Novel Visual Categories via Deep Transfer Clustering

    Authors: Kai Han, Andrea Vedaldi, Andrew Zisserman

    Abstract: We consider the problem of discovering novel object categories in an image collection. While these images are unlabelled, we also assume prior knowledge of related but different image classes. We use such prior knowledge to reduce the ambiguity of clustering, and improve the quality of the newly discovered classes. Our contributions are twofold. The first contribution is to extend Deep Embedded Cl…

    Submitted 26 August, 2019; originally announced August 2019.

    Comments: ICCV 2019

  2. arXiv:1908.06427  [pdf, other]

    cs.CV

    Unsupervised Learning of Landmarks by Descriptor Vector Exchange

    Authors: James Thewlis, Samuel Albanie, Hakan Bilen, Andrea Vedaldi

    Abstract: Equivariance to random image transformations is an effective method to learn landmarks of object categories, such as the eyes and the nose in faces, without manual supervision. However, this method does not explicitly guarantee that the learned landmarks are consistent with changes between different instances of the same object, such as different facial identities. In this paper, we develop a new…

    Submitted 18 August, 2019; originally announced August 2019.

    Comments: ICCV 2019

  3. arXiv:1908.05263  [pdf, other]

    cs.CV

    AutoCorrect: Deep Inductive Alignment of Noisy Geometric Annotations

    Authors: Honglie Chen, Weidi Xie, Andrea Vedaldi, Andrew Zisserman

    Abstract: We propose AutoCorrect, a method to automatically learn object-annotation alignments from a dataset with annotations affected by geometric noise. The method is based on a consistency loss that enables deep neural networks to be trained, given only noisy annotations as input, to correct the annotations. When some noise-free annotations are available, we show that the consistency loss reduces to a s…

    Submitted 14 August, 2019; originally announced August 2019.

    Comments: BMVC 2019 (Spotlight)

  4. Self-supervised Learning of Interpretable Keypoints from Unlabelled Videos

    Authors: Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi

    Abstract: We propose KeypointGAN, a new method for recognizing the pose of objects from a single image that for learning uses only unlabelled videos and a weak empirical prior on the object poses. Video frames differ primarily in the pose of the objects they contain, so our method distils the pose information by analyzing the differences between frames. The distillation uses a new dual representation of the…

    Submitted 23 December, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

    Comments: CVPR 2020 (oral). Project page: http://www.robots.ox.ac.uk/~vgg/research/unsupervised_pose/

    Journal ref: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 8787-8797

  5. arXiv:1906.06423  [pdf, other]

    cs.CV cs.LG

    Fixing the train-test resolution discrepancy

    Authors: Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

    Abstract: Data-augmentation is key to the training of neural networks for image classification. This paper first shows that existing augmentations induce a significant discrepancy between the typical size of the objects seen by the classifier at train and test time. We experimentally validate that, for a target test resolution, using a lower train resolution offers better classification at test time. We t…

    Submitted 20 January, 2022; v1 submitted 14 June, 2019; originally announced June 2019.

  6. arXiv:1906.05706  [pdf, other]

    cs.CV

    Slim DensePose: Thrifty Learning from Sparse Annotations and Motion Cues

    Authors: Natalia Neverova, James Thewlis, Rıza Alp Güler, Iasonas Kokkinos, Andrea Vedaldi

    Abstract: DensePose supersedes traditional landmark detectors by densely mapping image pixels to body surface coordinates. This power, however, comes at a greatly increased annotation time, as supervising the model requires to manually label hundreds of points per pose instance. In this work, we thus seek methods to significantly slim down the DensePose annotations, proposing more efficient data collection…

    Submitted 13 June, 2019; originally announced June 2019.

    Comments: CVPR 2019

  7. arXiv:1906.01568  [pdf, other]

    cs.CV

    Photo-Geometric Autoencoding to Learn 3D Objects from Unlabelled Images

    Authors: Shangzhe Wu, Christian Rupprecht, Andrea Vedaldi

    Abstract: We show that generative models can be used to capture visual geometry constraints statistically. We use this fact to infer the 3D shape of object categories from raw single-view images. Differently from prior work, we use no external supervision, nor do we use multiple views or videos of the objects. We achieve this by a simple reconstruction task, exploiting the symmetry of the objects' shape and…

    Submitted 4 June, 2019; originally announced June 2019.

    Comments: Appendix included, 17 pages. Project page: https://elliottwu.com/projects/unsup3d

  8. arXiv:1905.10793  [pdf, other]

    cs.CV cs.AI cs.GR

    Unsupervised Intuitive Physics from Past Experiences

    Authors: Sébastien Ehrhardt, Aron Monszpart, Niloy J. Mitra, Andrea Vedaldi

    Abstract: We are interested in learning models of intuitive physics similar to the ones that animals use for navigation, manipulation and planning. In addition to learning general physical principles, however, we are also interested in learning "on the fly", from a few experiences, physical properties specific to new environments. We do all this in an unsupervised manner, using a meta-learning formulation…

    Submitted 26 May, 2019; originally announced May 2019.

    Comments: Under review

  9. arXiv:1905.08845  [pdf, other]

    cs.CV stat.ML

    Semi-Supervised Learning with Scarce Annotations

    Authors: Sylvestre-Alvise Rebuffi, Sebastien Ehrhardt, Kai Han, Andrea Vedaldi, Andrew Zisserman

    Abstract: While semi-supervised learning (SSL) algorithms provide an efficient way to make use of both labelled and unlabelled data, they generally struggle when the number of annotated samples is very small. In this work, we consider the problem of SSL multi-class classification with very few labelled instances. We introduce two key ideas. The first is a simple but effective one: we leverage the power of t…

    Submitted 21 April, 2020; v1 submitted 21 May, 2019; originally announced May 2019.

    Comments: Workshop on Deep Vision, CVPR 2020

  10. arXiv:1904.13132  [pdf, other]

    cs.CV

    A critical analysis of self-supervision, or what we can learn from a single image

    Authors: Yuki M. Asano, Christian Rupprecht, Andrea Vedaldi

    Abstract: We look critically at popular self-supervision techniques for learning deep convolutional neural networks without manual labels. We show that three different and representative methods, BiGAN, RotNet and DeepCluster, can learn the first few layers of a convolutional network from a single image as well as using millions of images and manual labels, provided that strong data augmentation is used. Ho…

    Submitted 19 February, 2020; v1 submitted 30 April, 2019; originally announced April 2019.

    Comments: Accepted paper at the International Conference on Learning Representations (ICLR) 2020

  11. arXiv:1902.05509  [pdf, other]

    cs.CV

    MultiGrain: a unified image embedding for classes and instances

    Authors: Maxim Berman, Hervé Jégou, Andrea Vedaldi, Iasonas Kokkinos, Matthijs Douze

    Abstract: MultiGrain is a network architecture producing compact vector representations that are suited both for image classification and particular object retrieval. It builds on a standard classification trunk. The top of the network produces an embedding containing coarse and fine-grained information, so that images can be recognized based on the object class, particular object, or if they are distorted…

    Submitted 3 April, 2019; v1 submitted 14 February, 2019; originally announced February 2019.

  12. arXiv:1810.12348  [pdf, other]

    cs.CV

    Gather-Excite: Exploiting Feature Context in Convolutional Neural Networks

    Authors: Jie Hu, Li Shen, Samuel Albanie, Gang Sun, Andrea Vedaldi

    Abstract: While the use of bottom-up local operators in convolutional neural networks (CNNs) matches well some of the statistics of natural images, it may also prevent such models from capturing contextual long-range feature interactions. In this work, we propose a simple, lightweight approach for better context exploitation in CNNs. We do so by introducing a pair of operators: gather, which efficiently agg…

    Submitted 12 January, 2019; v1 submitted 29 October, 2018; originally announced October 2018.

    Comments: NeurIPS 2018

  13. arXiv:1809.08675  [pdf, other]

    cs.CV

    Learning to Read by Spelling: Towards Unsupervised Text Recognition

    Authors: Ankush Gupta, Andrea Vedaldi, Andrew Zisserman

    Abstract: This work presents a method for visual text recognition without using any paired supervisory data. We formulate the text recognition task as one of aligning the conditional distribution of strings predicted from given text images, with lexically valid strings sampled from target corpora. This enables fully automated, and unsupervised learning from just line-level text-images, and unpaired text-str…

    Submitted 9 December, 2018; v1 submitted 23 September, 2018; originally announced September 2018.

  14. arXiv:1808.05561  [pdf, other]

    cs.CV

    Emotion Recognition in Speech using Cross-Modal Transfer in the Wild

    Authors: Samuel Albanie, Arsha Nagrani, Andrea Vedaldi, Andrew Zisserman

    Abstract: Obtaining large, human labelled speech datasets to train models for emotion recognition is a notoriously challenging task, hindered by annotation cost and label ambiguity. In this work, we consider the task of learning embeddings for speech classification without access to any form of labelled audio. We base our approach on a simple hypothesis: that the emotional content of speech correlates with…

    Submitted 16 August, 2018; originally announced August 2018.

    Comments: Conference paper at ACM Multimedia 2018

  15. arXiv:1807.10712  [pdf, other]

    cs.CV

    Semi-convolutional Operators for Instance Segmentation

    Authors: David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi

    Abstract: Object detection and instance segmentation are dominated by region-based methods such as Mask RCNN. However, there is a growing interest in reducing these problems to pixel labeling tasks, as the latter could be more efficient, could be integrated seamlessly in image-to-image network architectures as used in many other tasks, and could be more accurate for objects that are not well approximated by…

    Submitted 27 July, 2018; originally announced July 2018.

    Comments: Accepted as a conference paper at ECCV 2018

  16. arXiv:1807.08179  [pdf, other]

    cs.CV

    Inductive Visual Localisation: Factorised Training for Superior Generalisation

    Authors: Ankush Gupta, Andrea Vedaldi, Andrew Zisserman

    Abstract: End-to-end trained Recurrent Neural Networks (RNNs) have been successfully applied to numerous problems that require processing sequences, such as image captioning, machine translation, and text recognition. However, RNNs often struggle to generalise to sequences longer than the ones encountered during training. In this work, we propose to optimise neural networks explicitly for induction. The ide…

    Submitted 21 July, 2018; originally announced July 2018.

    Comments: In BMVC 2018 (spotlight)

  17. arXiv:1807.07939  [pdf, other]

    cs.CV

    Large scale evaluation of local image feature detectors on homography datasets

    Authors: Karel Lenc, Andrea Vedaldi

    Abstract: We present a large scale benchmark for the evaluation of local feature detectors. Our key innovation is the introduction of a new evaluation protocol which extends and improves the standard detection repeatability measure. The new protocol is better for assessment on a large number of images and reduces the dependency of the results on unwanted distractors such as the number of detected features a…

    Submitted 20 July, 2018; originally announced July 2018.

    Comments: Accepted to BMVC 2018

  18. arXiv:1807.06653  [pdf, other]

    cs.CV cs.LG

    Invariant Information Clustering for Unsupervised Image Classification and Segmentation

    Authors: Xu Ji, João F. Henriques, Andrea Vedaldi

    Abstract: We present a novel clustering objective that learns a neural network classifier from scratch, given only unlabelled data samples. The model discovers clusters that accurately match semantic classes, achieving state-of-the-art results in eight unsupervised clustering benchmarks spanning image classification and segmentation. These include STL10, an unsupervised variant of ImageNet, and CIFAR10, whe…

    Submitted 22 August, 2019; v1 submitted 17 July, 2018; originally announced July 2018.

    Comments: International Conference on Computer Vision 2019

  19. arXiv:1807.05636  [pdf, other]

    cs.CV cs.LG cs.NE

    Cross Pixel Optical Flow Similarity for Self-Supervised Learning

    Authors: Aravindh Mahendran, James Thewlis, Andrea Vedaldi

    Abstract: We propose a novel method for learning convolutional neural image representations without manual supervision. We use motion cues in the form of optical flow, to supervise representations of static images. The obvious approach of training a network to predict flow from a single image can be needlessly difficult due to intrinsic ambiguities in this prediction task. We instead propose a much simpler…

    Submitted 15 July, 2018; originally announced July 2018.

    MSC Class: 68T45

  20. arXiv:1806.07823  [pdf, other]

    cs.CV

    Unsupervised Learning of Object Landmarks through Conditional Image Generation

    Authors: Tomas Jakab, Ankush Gupta, Hakan Bilen, Andrea Vedaldi

    Abstract: We propose a method for learning landmark detectors for visual objects (such as the eyes and the nose in a face) without any manual supervision. We cast this as the problem of generating images that combine the appearance of the object as seen in a first example image with the geometry of the object as seen in a second example image, where the two examples differ by a viewpoint change and/or an ob…

    Submitted 13 December, 2018; v1 submitted 20 June, 2018; originally announced June 2018.

    Comments: In NeurIPS 2018. Project page: http://www.robots.ox.ac.uk/~vgg/research/unsupervised_landmarks/

  21. arXiv:1806.05502  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    Scrutinizing and De-Biasing Intuitive Physics with Neural Stethoscopes

    Authors: Fabian B. Fuchs, Oliver Groth, Adam R. Kosiorek, Alex Bewley, Markus Wulfmeier, Andrea Vedaldi, Ingmar Posner

    Abstract: Visually predicting the stability of block towers is a popular task in the domain of intuitive physics. While previous work focusses on prediction accuracy, a one-dimensional performance measure, we provide a broader analysis of the learned physical understanding of the final model and how the learning process can be guided. To this end, we introduce neural stethoscopes as a general purpose framew…

    Submitted 6 September, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

  22. arXiv:1805.08136  [pdf, other]

    cs.CV cs.LG stat.ML

    Meta-learning with differentiable closed-form solvers

    Authors: Luca Bertinetto, João F. Henriques, Philip H. S. Torr, Andrea Vedaldi

    Abstract: Adapting deep networks to new concepts from a few examples is challenging, due to the high computational requirements of standard fine-tuning procedures. Most work on few-shot learning has thus focused on simple learning techniques for adaptation, such as nearest neighbours or gradient descent. Nonetheless, the machine learning literature contains a wealth of methods that learn non-deep models ver…

    Submitted 24 July, 2019; v1 submitted 21 May, 2018; originally announced May 2018.

    Comments: Published at ICLR'19. Code and data available at http://www.robots.ox.ac.uk/~luca/r2d2.html

  23. arXiv:1805.08095  [pdf, other]

    cs.LG cs.CV math.NA stat.ML

    Small steps and giant leaps: Minimal Newton solvers for Deep Learning

    Authors: João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi

    Abstract: We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it only requires two additional forward-mode automatic differentiation operations per iteration, which has a computational cost comparable to two standard forward passes and is easy to implement. Our method addresses long-standing issues…

    Submitted 21 May, 2018; originally announced May 2018.

  24. arXiv:1805.05086  [pdf, other]

    cs.CV cs.AI

    Unsupervised Intuitive Physics from Visual Observations

    Authors: Sebastien Ehrhardt, Aron Monszpart, Niloy Mitra, Andrea Vedaldi

    Abstract: While learning models of intuitive physics is an increasingly active area of research, current approaches still fall short of natural intelligences in one important regard: they require external supervision, such as explicit access to physical states, at training and sometimes even at test times. Some authors have relaxed such requirements by supplementing the model with an handcrafted physical si…

    Submitted 29 March, 2019; v1 submitted 14 May, 2018; originally announced May 2018.

  25. arXiv:1804.08018  [pdf, other]

    cs.CV

    ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking

    Authors: Oliver Groth, Fabian B. Fuchs, Ingmar Posner, Andrea Vedaldi

    Abstract: Physical intuition is pivotal for intelligent agents to perform complex tasks. In this paper we investigate the passive acquisition of an intuitive understanding of physical principles as well as the active utilisation of this intuition in the context of generalised object stacking. To this end, we provide: a simulation-based dataset featuring 20,000 stack configurations composed of a variety of e…

    Submitted 6 July, 2018; v1 submitted 21 April, 2018; originally announced April 2018.

    Comments: revised version to appear at ECCV 2018

  26. arXiv:1804.01552  [pdf, other]

    cs.CV

    Self-supervised Learning of Geometrically Stable Features Through Probabilistic Introspection

    Authors: David Novotny, Samuel Albanie, Diane Larlus, Andrea Vedaldi

    Abstract: Self-supervision can dramatically cut back the amount of manually-labelled data required to train deep neural networks. While self-supervision has usually been considered for tasks such as image classification, in this paper we aim at extending it to geometry-oriented tasks such as semantic matching and part detection. We do so by building on several recent ideas in unsupervised landmark detection…

    Submitted 4 April, 2018; originally announced April 2018.

    Comments: In 2018 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018)

  27. arXiv:1803.10082  [pdf, other]

    cs.CV stat.ML

    Efficient parametrization of multi-domain deep neural networks

    Authors: Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi

    Abstract: A practical limitation of deep neural networks is their high degree of specialization to a single task and visual domain. Recently, inspired by the successes of transfer learning, several authors have proposed to learn instead universal, fixed feature extractors that, used as the first stage of any deep network, work well for several tasks and domains simultaneously. Nevertheless, such universal f…

    Submitted 27 March, 2018; originally announced March 2018.

    Comments: CVPR 2018

  28. arXiv:1803.09502  [pdf, other]

    cs.CV

    Long-term Tracking in the Wild: A Benchmark

    Authors: Jack Valmadre, Luca Bertinetto, João F. Henriques, Ran Tao, Andrea Vedaldi, Arnold Smeulders, Philip Torr, Efstratios Gavves

    Abstract: We introduce the OxUvA dataset and benchmark for evaluating single-object tracking algorithms. Benchmarks have enabled great strides in the field of object tracking by defining standardized evaluations on large sets of diverse videos. However, these works have focused exclusively on sequences that are just tens of seconds in length and in which the target is always visible. Consequently, most rese…

    Submitted 10 August, 2018; v1 submitted 26 March, 2018; originally announced March 2018.

    Comments: To appear at ECCV 2018

  29. arXiv:1801.03454  [pdf, other]

    cs.CV cs.AI stat.ML

    Net2Vec: Quantifying and Explaining how Concepts are Encoded by Filters in Deep Neural Networks

    Authors: Ruth Fong, Andrea Vedaldi

    Abstract: In an effort to understand the meaning of the intermediate representations captured by deep networks, recent papers have tried to associate specific semantic concepts to individual neural network filter responses, where interesting correlations are often found, largely by focusing on extremal filter responses. In this paper, we show that this approach can favor easy-to-interpret cases that are not…

    Submitted 29 March, 2018; v1 submitted 10 January, 2018; originally announced January 2018.

    Comments: Camera-Ready for CVPR18; supplementary materials: http://ruthcfong.github.io/files/net2vec_supps.pdf

  30. arXiv:1712.09448  [pdf, other]

    cs.CV

    Taking Visual Motion Prediction To New Heightfields

    Authors: Sebastien Ehrhardt, Aron Monszpart, Niloy Mitra, Andrea Vedaldi

    Abstract: While the basic laws of Newtonian mechanics are well understood, explaining a physical scenario still requires manually modeling the problem with suitable equations and estimating the associated parameters. In order to be able to leverage the approximation capabilities of artificial intelligence techniques in such physics related contexts, researchers have handcrafted the relevant states, and then…

    Submitted 10 December, 2021; v1 submitted 22 December, 2017; originally announced December 2017.

    Comments: arXiv admin note: text overlap with arXiv:1706.02179

  31. Deep Image Prior

    Authors: Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky

    Abstract: Deep convolutional networks have become a popular tool for image generation and restoration. Generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learn…

    Submitted 17 May, 2020; v1 submitted 29 November, 2017; originally announced November 2017.

  32. arXiv:1711.09313  [pdf, other]

    cs.CV

    DeepRadiologyNet: Radiologist Level Pathology Detection in CT Head Images

    Authors: Jameson Merkow, Robert Lufkin, Kim Nguyen, Stefano Soatto, Zhuowen Tu, Andrea Vedaldi

    Abstract: We describe a system to automatically filter clinically significant findings from computerized tomography (CT) head scans, operating at performance levels exceeding that of practicing radiologists. Our system, named DeepRadiologyNet, builds on top of deep convolutional neural networks (CNNs) trained using approximately 3.5 million CT head images gathered from over 24,000 studies taken from January…

    Submitted 2 December, 2017; v1 submitted 25 November, 2017; originally announced November 2017.

    Comments: 22 pages with references, 6 figures, 2 tables

  33. arXiv:1706.02932  [pdf, other]

    cs.CV stat.ML

    Unsupervised learning of object frames by dense equivariant image labelling

    Authors: James Thewlis, Hakan Bilen, Andrea Vedaldi

    Abstract: One of the key challenges of visual perception is to extract abstract models of 3D objects and object categories from visual measurements, which are affected by complex nuisance factors such as viewpoint, occlusion, motion, and deformations. Starting from the recent idea of viewpoint factorization, we propose a new approach that, given a large number of images of an object and no other supervision…

    Submitted 17 November, 2017; v1 submitted 9 June, 2017; originally announced June 2017.

    Comments: NIPS 2017

  34. arXiv:1706.02179  [pdf, other]

    cs.CV cs.AI

    Learning to Represent Mechanics via Long-term Extrapolation and Interpolation

    Authors: Sébastien Ehrhardt, Aron Monszpart, Andrea Vedaldi, Niloy Mitra

    Abstract: While the basic laws of Newtonian mechanics are well understood, explaining a physical scenario still requires manually modeling the problem with suitable equations and associated parameters. In order to adopt such models for artificial intelligence, researchers have handcrafted the relevant states, and then used neural networks to learn the state transitions using simulation runs as training data…

    Submitted 8 June, 2017; v1 submitted 6 June, 2017; originally announced June 2017.

    Comments: arXiv admin note: text overlap with arXiv:1703.00247

  35. arXiv:1705.08045  [pdf, other]

    cs.CV stat.ML

    Learning multiple visual domains with residual adapters

    Authors: Sylvestre-Alvise Rebuffi, Hakan Bilen, Andrea Vedaldi

    Abstract: There is a growing interest in learning data representations that work well for many different types of problems and data. In this paper, we look in particular at the task of learning a single visual representation that can be successfully utilized in the analysis of very different types of images, from dog breeds to stop signs and digits. Inspired by recent work on learning networks that predict…

    Submitted 27 November, 2017; v1 submitted 22 May, 2017; originally announced May 2017.

  36. arXiv:1705.03951  [pdf, other]

    cs.CV

    Learning 3D Object Categories by Looking Around Them

    Authors: David Novotny, Diane Larlus, Andrea Vedaldi

    Abstract: Traditional approaches for learning 3D object categories use either synthetic data or manual supervision. In this paper, we propose a method which does not require manual annotations and is instead cued by observing objects from a moving vantage point. Our system builds on two innovations: a Siamese viewpoint factorization network that robustly aligns different videos together without explicitly c…

    Submitted 2 December, 2021; v1 submitted 10 May, 2017; originally announced May 2017.

    Comments: Proceedings of the International Conference on Computer Vision, 2017

  37. arXiv:1705.02193  [pdf, other]

    cs.CV stat.ML

    Unsupervised learning of object landmarks by factorized spatial embeddings

    Authors: James Thewlis, Hakan Bilen, Andrea Vedaldi

    Abstract: Learning automatically the structure of object categories remains an important open problem in computer vision. In this paper, we propose a novel unsupervised approach that can discover and learn landmarks in object categories, thus characterizing their structure. Our approach is based on factorizing image deformations, as induced by a viewpoint change or an object deformation, by learning a deep…

    Submitted 6 August, 2017; v1 submitted 5 May, 2017; originally announced May 2017.

    Comments: To be published in ICCV 2017

  38. arXiv:1704.06036  [pdf, other]

    cs.CV cs.LG

    End-to-end representation learning for Correlation Filter based tracking

    Authors: Jack Valmadre, Luca Bertinetto, João F. Henriques, Andrea Vedaldi, Philip H. S. Torr

    Abstract: The Correlation Filter is an algorithm that trains a linear template to discriminate between images and their translations. It is well suited to object tracking because its formulation in the Fourier domain provides a fast solution, enabling the detector to be re-trained once per frame. Previous works that use the Correlation Filter, however, have adopted features that were either manually designe…

    Submitted 20 April, 2017; originally announced April 2017.

    Comments: To appear at CVPR 2017

  39. arXiv:1704.05939  [pdf, other]

    cs.CV

    HPatches: A benchmark and evaluation of handcrafted and learned local descriptors

    Authors: Vassileios Balntas, Karel Lenc, Andrea Vedaldi, Krystian Mikolajczyk

    Abstract: In this paper, we propose a novel benchmark for evaluating local image descriptors. We demonstrate that the existing datasets and evaluation protocols do not specify unambiguously all aspects of evaluation, leading to ambiguities and inconsistencies in results reported in the literature. Furthermore, these datasets are nearly saturated due to the recent improvements in local descriptors obtained b…

    Submitted 19 April, 2017; originally announced April 2017.

  40. arXiv:1704.04749  [pdf, other]

    cs.CV

    AnchorNet: A Weakly Supervised Network to Learn Geometry-sensitive Features For Semantic Matching

    Authors: David Novotny, Diane Larlus, Andrea Vedaldi

    Abstract: Despite significant progress of deep learning in recent years, state-of-the-art semantic matching methods still rely on legacy features such as SIFT or HoG. We argue that the strong invariance properties that are key to the success of recent deep architectures on the classification task make them unfit for dense correspondence tasks, unless a large amount of supervision is used. In this work, we p…

    Submitted 16 April, 2017; originally announced April 2017.

    Comments: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017

  41. arXiv:1704.03296  [pdf, other]

    cs.CV cs.AI cs.LG stat.ML

    Interpretable Explanations of Black Boxes by Meaningful Perturbation

    Authors: Ruth Fong, Andrea Vedaldi

    Abstract: As machine learning algorithms are increasingly applied to high impact yet high risk tasks, such as medical diagnosis or autonomous driving, it is critical that researchers can explain how such algorithms arrived at their predictions. In recent years, a number of image saliency methods have been developed to summarize where highly complex neural networks "look" in an image for evidence for their p…

    Submitted 3 December, 2021; v1 submitted 11 April, 2017; originally announced April 2017.

    Comments: Final camera-ready paper published at ICCV 2017 (Supplementary materials: http://openaccess.thecvf.com/content_ICCV_2017/supplemental/Fong_Interpretable_Explanations_of_ICCV_2017_supplemental.pdf)

    Journal ref: Proceedings of the 2017 IEEE International Conference on Computer Vision (ICCV)

  42. arXiv:1704.02304  [pdf, other]

    cs.CV cs.LG stat.ML

    It Takes (Only) Two: Adversarial Generator-Encoder Networks

    Authors: Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky

    Abstract: We present a new autoencoder-type architecture that is trainable in an unsupervised mode, sustains both generation and inference, and has the quality of conditional and unconditional samples boosted by adversarial learning. Unlike previous hybrids of autoencoders and adversarial networks, the adversarial game in our approach is set up directly between the encoder and the generator, and no external…

    Submitted 6 November, 2017; v1 submitted 7 April, 2017; originally announced April 2017.

  43. arXiv:1703.00247  [pdf, other]

    cs.AI cs.NE

    Learning A Physical Long-term Predictor

    Authors: Sebastien Ehrhardt, Aron Monszpart, Niloy J. Mitra, Andrea Vedaldi

    Abstract: Evolution has resulted in highly developed abilities in many natural intelligences to quickly and accurately predict mechanical phenomena. Humans have successfully developed laws of physics to abstract and model such mechanical phenomena. In the context of artificial intelligence, a recent line of work has focused on estimating physical parameters based on sensory data and use them in physical sim…

    Submitted 1 March, 2017; originally announced March 2017.

  44. arXiv:1701.07275  [pdf, other]

    cs.CV stat.ML

    Universal representations: The missing link between faces, text, planktons, and cat breeds

    Authors: Hakan Bilen, Andrea Vedaldi

    Abstract: With the advent of large labelled datasets and high-capacity models, the performance of machine vision systems has been improving rapidly. However, the technology has still major limitations, starting from the fact that different vision problems are still solved by different models, trained from scratch or fine-tuned on the target data. The human visual system, in stark contrast, learns a universa…

    Submitted 25 January, 2017; originally announced January 2017.

    Comments: 10 pages, 4 figures, 5 tables

  45. arXiv:1701.02096  [pdf, other]

    cs.CV

    Improved Texture Networks: Maximizing Quality and Diversity in Feed-forward Stylization and Texture Synthesis

    Authors: Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky

    Abstract: The recent work of Gatys et al., who characterized the style of an image by the statistics of convolutional neural network filters, ignited a renewed interest in the texture generation and image stylization problems. While their image generation technique uses a slow optimization process, recently several authors have proposed to learn generator neural networks that can produce similar outputs in…

    Submitted 6 November, 2017; v1 submitted 9 January, 2017; originally announced January 2017.

  46. arXiv:1612.00738  [pdf, other]

    cs.CV

    Action Recognition with Dynamic Image Networks

    Authors: Hakan Bilen, Basura Fernando, Efstratios Gavves, Andrea Vedaldi

    Abstract: We introduce the concept of "dynamic image", a novel compact representation of videos useful for video analysis, particularly in combination with convolutional neural networks (CNNs). A dynamic image encodes temporal data such as RGB or optical flow videos by using the concept of `rank pooling'. The idea is to learn a ranking machine that captures the temporal evolution of the data and to use the…

    Submitted 19 August, 2017; v1 submitted 2 December, 2016; originally announced December 2016.

    Comments: 14 pages, 9 figures, 9 tables

  47. arXiv:1610.02431  [pdf, other]

    cs.CV

    ResearchDoom and CocoDoom: Learning Computer Vision with Games

    Authors: A. Mahendran, H. Bilen, J. F. Henriques, A. Vedaldi

    Abstract: In this short note we introduce ResearchDoom, an implementation of the Doom first-person shooter that can extract detailed metadata from the game. We also introduce the CocoDoom dataset, a collection of pre-recorded data extracted from Doom gaming sessions along with annotations in the MS Coco format. ResearchDoom and CocoDoom can be used to train and evaluate a variety of computer vision methods…

    Submitted 7 October, 2016; originally announced October 2016.

  48. arXiv:1610.02255  [pdf, other]

    cs.CV

    Learning Grimaces by Watching TV

    Authors: Samuel Albanie, Andrea Vedaldi

    Abstract: Differently from computer vision systems which require explicit supervision, humans can learn facial expressions by observing people in their environment. In this paper, we look at how similar capabilities could be developed in machine vision. As a starting point, we consider the problem of relating facial expressions to objectively measurable events occurring in videos. In particular, we consider…

    Submitted 7 October, 2016; originally announced October 2016.

    Comments: British Machine Vision Conference (BMVC) 2016

  49. arXiv:1609.04382  [pdf, other]

    cs.CV

    Warped Convolutions: Efficient Invariance to Spatial Transformations

    Authors: João F. Henriques, Andrea Vedaldi

    Abstract: Convolutional Neural Networks (CNNs) are extremely efficient, since they exploit the inherent translation-invariance of natural images. However, translation is just one of a myriad of useful spatial transformations. Can the same efficiency be attained when considering other spatial invariances? Such generalized convolutions have been considered in the past, but at a high computational cost. We pre…

    Submitted 30 November, 2021; v1 submitted 14 September, 2016; originally announced September 2016.

    Comments: Proceedings of the 34th International Conference on Machine Learning, Sydney, Australia, PMLR 70, 2017

  50. arXiv:1609.03532  [pdf, other]

    cs.CV

    Fully-Trainable Deep Matching

    Authors: James Thewlis, Shuai Zheng, Philip H. S. Torr, Andrea Vedaldi

    Abstract: Deep Matching (DM) is a popular high-quality method for quasi-dense image matching. Despite its name, however, the original DM formulation does not yield a deep neural network that can be trained end-to-end via backpropagation. In this paper, we remove this limitation by rewriting the complete DM algorithm as a convolutional neural network. This results in a novel deep architecture for image match…

    Submitted 12 September, 2016; originally announced September 2016.

    Comments: British Machine Vision Conference (BMVC) 2016