Showing 151–200 of 278 results for author: Schmid, C

  1. arXiv:1901.04437  [pdf, ps, other]

    physics.atom-ph physics.chem-ph

    Quantum-State-Specific Reaction Rate Measurements for the Photo-induced Reaction Ca$^+$ + O$_2$ $\rightarrow$ CaO$^+$ + O

    Authors: Philipp C. Schmid, Mikhail I. Miller, James Greenberg, Thanh L. Nguyen, John F. Stanton, H. J. Lewandowski

    Abstract: Atoms and molecules often react at different rates depending on their internal quantum states. Thus, controlling which internal states are populated can be used to manipulate the reactivity and can lead to a more detailed understanding of reaction mechanisms. We demonstrate this control of reactions by studying the excited state reaction Ca$^+$ + O$_2$ $\rightarrow$ CaO$^+$ + O. This reac…

    Submitted 14 January, 2019; originally announced January 2019.

    Comments: 12 pages, 5 figures

  2. arXiv:1901.01342  [pdf, other]

    cs.CV cs.MM cs.SD eess.AS

    AVA-ActiveSpeaker: An Audio-Visual Dataset for Active Speaker Detection

    Authors: Joseph Roth, Sourish Chaudhuri, Ondrej Klejch, Radhika Marvin, Andrew Gallagher, Liat Kaver, Sharadh Ramaswamy, Arkadiusz Stopczynski, Cordelia Schmid, Zhonghua Xi, Caroline Pantofaru

    Abstract: Active speaker detection is an important component in video analysis algorithms for applications such as speaker diarization, video re-targeting for meetings, speech enhancement, and human-robot interaction. The absence of a large, carefully labeled audio-visual dataset for this task has constrained algorithm evaluations with respect to data diversity, environments, and accuracy. This has made com…

    Submitted 24 May, 2019; v1 submitted 4 January, 2019; originally announced January 2019.

  3. arXiv:1901.01091  [pdf, other]

    cs.CV cs.LG

    Adaptive Density Estimation for Generative Models

    Authors: Thomas Lucas, Konstantin Shmelkov, Karteek Alahari, Cordelia Schmid, Jakob Verbeek

    Abstract: Unsupervised learning of generative models has seen tremendous progress over recent years, in particular due to generative adversarial networks (GANs), variational autoencoders, and flow-based models. GANs have dramatically improved sample quality, but suffer from two drawbacks: (i) they mode-drop, i.e., do not cover the full support of the train data, and (ii) they do not allow for likelihood eva…

    Submitted 3 January, 2020; v1 submitted 4 January, 2019; originally announced January 2019.

  4. arXiv:1812.07673  [pdf, other]

    stat.ME stat.CO

    Active learning for efficiently training emulators of computationally expensive mathematical models

    Authors: Alexandra G. Ellis, Rowan Iskandar, Christopher H. Schmid, John B. Wong, Thomas A. Trikalinos

    Abstract: An emulator is a fast-to-evaluate statistical approximation of a detailed mathematical model (simulator). When used in lieu of simulators, emulators can expedite tasks that require many repeated evaluations, such as sensitivity analyses, policy optimization, model calibration, and value-of-information analyses. Emulators are developed using the output of simulators at specific input values (design…

    Submitted 3 January, 2020; v1 submitted 18 December, 2018; originally announced December 2018.

    Comments: Counting appendix materials: 31 pages, 7 Figures, 3 Tables

  5. arXiv:1812.05736  [pdf, other]

    cs.CV

    Detecting unseen visual relations using analogies

    Authors: Julia Peyre, Ivan Laptev, Cordelia Schmid, Josef Sivic

    Abstract: We seek to detect visual relations in images of the form of triplets t = (subject, predicate, object), such as "person riding dog", where training examples of the individual entities are available but their combinations are unseen at training. This is an important set-up due to the combinatorial nature of visual relations: collecting sufficient training data for all possible triplets would be ver…

    Submitted 22 September, 2019; v1 submitted 13 December, 2018; originally announced December 2018.

  6. arXiv:1812.03544  [pdf, other]

    cs.CV

    A Structured Model For Action Detection

    Authors: Yubo Zhang, Pavel Tokmakov, Martial Hebert, Cordelia Schmid

    Abstract: A dominant paradigm for learning-based approaches in computer vision is training generic models, such as ResNet for image recognition, or I3D for video understanding, on large datasets and allowing them to discover the optimal representation for the problem at hand. While this is an obviously attractive approach, it is not applicable in all scenarios. We claim that action detection is one such cha…

    Submitted 5 June, 2019; v1 submitted 9 December, 2018; originally announced December 2018.

  7. arXiv:1812.00025  [pdf, other]

    cs.LG cs.AI

    Modulated Policy Hierarchies

    Authors: Alexander Pashevich, Danijar Hafner, James Davidson, Rahul Sukthankar, Cordelia Schmid

    Abstract: Solving tasks with sparse rewards is a main challenge in reinforcement learning. While hierarchical controllers are an intuitive approach to this problem, current methods often require manual reward shaping, alternating training phases, or manually defined subtasks. We introduce modulated policy hierarchies (MPH), which can learn end-to-end to solve tasks from sparse rewards. To achieve this, we s…

    Submitted 30 November, 2018; originally announced December 2018.

    Comments: 8 pages, 5 figures

  8. arXiv:1809.06396  [pdf, other]

    cs.CV

    Déjà Vu: an empirical evaluation of the memorization properties of ConvNets

    Authors: Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

    Abstract: Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and drop-out are employed to mitigate overfitting. This paper considers the related question of "membership inference", where the goal is to determine if an image was used during training. We consider it under three complementary angles. We show how to detect which dataset was used…

    Submitted 17 September, 2018; originally announced September 2018.

  9. arXiv:1809.02492  [pdf, other]

    cs.CV

    On the Importance of Visual Context for Data Augmentation in Scene Understanding

    Authors: Nikita Dvornik, Julien Mairal, Cordelia Schmid

    Abstract: Performing data augmentation for learning deep neural networks is known to be important for training visual recognition systems. By artificially increasing the number of training examples, it helps reduce overfitting and improves generalization. While simple image transformations can already improve predictive performance in most vision tasks, larger gains can be obtained by leveraging task-spec…

    Submitted 19 September, 2019; v1 submitted 6 September, 2018; originally announced September 2018.

    Comments: Updated the experimental section. arXiv admin note: substantial text overlap with arXiv:1807.07428

  10. arXiv:1807.10982  [pdf, other]

    cs.CV

    Actor-Centric Relation Network

    Authors: Chen Sun, Abhinav Shrivastava, Carl Vondrick, Kevin Murphy, Rahul Sukthankar, Cordelia Schmid

    Abstract: Current state-of-the-art approaches for spatio-temporal action localization rely on detections at the frame level and model temporal context with 3D ConvNets. Here, we go one step further and model spatio-temporal relations to capture the interactions between human actors, relevant objects and scene elements essential to differentiate similar human actions. Our approach is weakly supervised and mi…

    Submitted 28 July, 2018; originally announced July 2018.

    Comments: ECCV 2018 camera ready

  11. arXiv:1807.09536  [pdf, other]

    cs.CV

    End-to-End Incremental Learning

    Authors: Francisco M. Castro, Manuel J. Marín-Jiménez, Nicolás Guil, Cordelia Schmid, Karteek Alahari

    Abstract: Although deep learning approaches have stood out in recent years due to their state-of-the-art results, they continue to suffer from catastrophic forgetting, a dramatic decrease in overall performance when training with new classes added incrementally. This is due to current neural network architectures requiring the entire dataset, consisting of all the samples from the old as well as the new cla…

    Submitted 3 September, 2018; v1 submitted 25 July, 2018; originally announced July 2018.

    Comments: To appear in ECCV 2018

  12. arXiv:1807.09499  [pdf, other]

    cs.CV cs.LG

    How good is my GAN?

    Authors: Konstantin Shmelkov, Cordelia Schmid, Karteek Alahari

    Abstract: Generative adversarial networks (GANs) are one of the most popular methods for generating images today. While impressive results have been validated by visual inspection, a number of quantitative criteria have emerged only recently. We argue here that the existing ones are insufficient and need to be suited to the task at hand. In this paper we introduce two measures based on image classi…

    Submitted 25 July, 2018; originally announced July 2018.

    Comments: Accepted to ECCV2018

  13. arXiv:1807.07428  [pdf, other]

    cs.CV

    Modeling Visual Context is Key to Augmenting Object Detection Datasets

    Authors: Nikita Dvornik, Julien Mairal, Cordelia Schmid

    Abstract: Performing data augmentation for learning deep neural networks is well known to be important for training visual recognition systems. By artificially increasing the number of training examples, it helps reduce overfitting and improves generalization. For object detection, classical approaches for data augmentation consist of generating images obtained by basic geometrical transformations and col…

    Submitted 19 July, 2018; originally announced July 2018.

    Journal ref: ECCV2018, Sep 2018, Munich, Germany. 2018

  14. arXiv:1807.02616  [pdf, other]

    stat.AP

    Effects of Predictive Real-Time Traffic Signal Information

    Authors: Vadim Sokolov, David W. Etherington, Christian Schmid, Dominik Karbowski, Aymeric Rousseau, Muhammad Imran

    Abstract: This paper analyzes the impact of providing car drivers with predictive information on traffic signal timing in real-time, including time-to-green and green-wave speed recommendations. Over a period of six months, the behavior of 121 drivers in everyday urban driving was analyzed with and without access to live traffic signal information. In a first period, drivers had the information provid…

    Submitted 9 November, 2018; v1 submitted 7 July, 2018; originally announced July 2018.

  15. arXiv:1806.11328  [pdf, other]

    cs.CV

    A flexible model for training action localization with varying levels of supervision

    Authors: Guilhem Chéron, Jean-Baptiste Alayrac, Ivan Laptev, Cordelia Schmid

    Abstract: Spatio-temporal action detection in videos is typically addressed in a fully-supervised setup with manual annotation of training videos required at every frame. Since such annotation is extremely tedious and prohibits scalability, there is a clear need to minimize the amount of manual supervision. In this work we propose a unifying framework that can handle and combine varying types of less-demand…

    Submitted 27 November, 2018; v1 submitted 29 June, 2018; originally announced June 2018.

  16. arXiv:1806.11008  [pdf, other]

    cs.CV

    Modeling Spatio-Temporal Human Track Structure for Action Localization

    Authors: Guilhem Chéron, Anton Osokin, Ivan Laptev, Cordelia Schmid

    Abstract: This paper addresses spatio-temporal localization of human actions in video. In order to localize actions in time, we propose a recurrent localization network (RecLNet) designed to model the temporal structure of actions on the level of person tracks. Our model is trained to simultaneously recognize and localize action classes in time and is based on two-layer gated recurrent units (GRU) applied s…

    Submitted 28 June, 2018; originally announced June 2018.

  17. Synthetic simulations of the extragalactic sky seen by eROSITA. I. Pre-launch selection functions from Monte-Carlo simulations

    Authors: N. Clerc, M. E. Ramos-Ceja, J. Ridl, G. Lamer, H. Brunner, F. Hofmann, J. Comparat, F. Pacaud, F. Käfer, T. H. Reiprich, A. Merloni, C. Schmid, T. Brand, J. Wilms, P. Friedrich, A. Finoguenov, T. Dauser, I. Kreykenbohm

    Abstract: Studies of galaxy clusters provide stringent constraints on models of structure formation. Provided that selection effects are under control, large X-ray surveys are well suited to derive cosmological parameters, in particular those governing the dark energy equation of state. We forecast the capabilities of the all-sky eROSITA (the extended ROentgen Survey with an Imaging Telescope Array) survey…

    Submitted 22 June, 2018; originally announced June 2018.

    Comments: Accepted in A&A. Image quality degraded for arXiv submission

    Journal ref: A&A 617, A92 (2018)

  18. arXiv:1806.03198  [pdf, other]

    stat.ML cs.LG

    Spreading vectors for similarity search

    Authors: Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

    Abstract: Discretizing multi-dimensional data distributions is a fundamental step of modern indexing methods. State-of-the-art techniques learn parameters of quantizers on training data for optimal performance, thus adapting quantizers to the data. In this work, we propose to reverse this paradigm and adapt the data to the quantizer: we train a neural net whose last layer forms a fixed parameter-free quanti…

    Submitted 30 August, 2019; v1 submitted 8 June, 2018; originally announced June 2018.

    Comments: Published at ICLR 2019

  19. arXiv:1805.11155  [pdf, other]

    stat.ML cs.CV cs.LG

    Unsupervised Learning of Artistic Styles with Archetypal Style Analysis

    Authors: Daan Wynen, Cordelia Schmid, Julien Mairal

    Abstract: In this paper, we introduce an unsupervised learning approach to automatically discover, summarize, and manipulate artistic styles from large collections of paintings. Our method is based on archetypal analysis, which is an unsupervised learning technique akin to sparse coding with a geometric interpretation. When applied to deep image representations from a collection of artworks, it learns a dic…

    Submitted 2 October, 2018; v1 submitted 28 May, 2018; originally announced May 2018.

    Comments: Accepted at NIPS 2018, Montréal, Canada

  20. arXiv:1804.09627  [pdf, other]

    cs.CV

    Actor and Observer: Joint Modeling of First and Third-Person Videos

    Authors: Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari

    Abstract: Several theories in cognitive neuroscience suggest that when people interact with the world, or simulate interactions, they do so from a first-person egocentric perspective, and seamlessly transfer knowledge between third-person (observer) and first-person (actor). Despite this, learning such models for human action recognition has not been achievable due to the lack of data. This paper takes a st…

    Submitted 25 April, 2018; originally announced April 2018.

    Comments: CVPR 2018 spotlight presentation

    Journal ref: CVPR 2018

  21. arXiv:1804.09626  [pdf, other]

    cs.CV

    Charades-Ego: A Large-Scale Dataset of Paired Third and First Person Videos

    Authors: Gunnar A. Sigurdsson, Abhinav Gupta, Cordelia Schmid, Ali Farhadi, Karteek Alahari

    Abstract: In Actor and Observer we introduced a dataset linking the first and third-person video understanding domains, the Charades-Ego Dataset. In this paper we describe the egocentric aspect of the dataset and present annotations for Charades-Ego with 68,536 activity instances in 68.8 hours of first and third-person video, making it one of the largest and most diverse egocentric datasets available. Chara…

    Submitted 30 April, 2018; v1 submitted 25 April, 2018; originally announced April 2018.

  22. arXiv:1804.04875  [pdf, other]

    cs.CV

    BodyNet: Volumetric Inference of 3D Human Body Shapes

    Authors: Gül Varol, Duygu Ceylan, Bryan Russell, Jimei Yang, Ersin Yumer, Ivan Laptev, Cordelia Schmid

    Abstract: Human shape estimation is an important task for video editing, animation and the fashion industry. Predicting 3D human body shape from natural images, however, is highly challenging due to factors such as variation in human bodies, clothing and viewpoint. Prior methods addressing this problem typically attempt to fit parametric body models with certain priors on pose and shape. In this work we argue f…

    Submitted 18 August, 2018; v1 submitted 13 April, 2018; originally announced April 2018.

    Comments: Appears in: European Conference on Computer Vision 2018 (ECCV 2018). 27 pages

  23. LCR-Net++: Multi-person 2D and 3D Pose Detection in Natural Images

    Authors: Gregory Rogez, Philippe Weinzaepfel, Cordelia Schmid

    Abstract: We propose an end-to-end architecture for joint 2D and 3D human pose estimation in natural images. Key to our approach is the generation and scoring of a number of pose proposals per image, which allows us to predict 2D and 3D poses of multiple people simultaneously. Hence, our approach does not require an approximate localization of the humans for initialization. Our Localization-Classification-R…

    Submitted 13 January, 2019; v1 submitted 1 March, 2018; originally announced March 2018.

    Comments: journal version of the CVPR 2017 paper, accepted to appear in IEEE Trans. PAMI

  24. arXiv:1802.04216  [pdf, other]

    cs.CV

    Image-based Synthesis for Deep 3D Human Pose Estimation

    Authors: Grégory Rogez, Cordelia Schmid

    Abstract: This paper addresses the problem of 3D human pose estimation in the wild. A significant challenge is the lack of training data, i.e., 2D images of humans annotated with 3D poses. Such data is necessary to train state-of-the-art CNN architectures. Here, we propose a solution to generate a large set of photorealistic synthetic images of humans with 3D pose annotations. We introduce an image-based sy…

    Submitted 12 February, 2018; originally announced February 2018.

    Comments: accepted to appear in IJCV (with minor revisions). Follow-up to NIPS 2016 arXiv:1607.02046

  25. arXiv:1712.01127  [pdf, other]

    cs.CV

    Learning to Segment Moving Objects

    Authors: Pavel Tokmakov, Cordelia Schmid, Karteek Alahari

    Abstract: We study the problem of segmenting moving objects in unconstrained videos. Given a video, the task is to segment all the objects that exhibit independent motion in at least one frame. We formulate this as a learning problem and design our framework with three cues: (i) independent object motion between a pair of frames, which complements object recognition, (ii) object appearance, which helps to c…

    Submitted 1 December, 2017; originally announced December 2017.

    Comments: arXiv admin note: text overlap with arXiv:1704.05737, arXiv:1612.07217

  26. arXiv:1708.06977  [pdf, other]

    cs.CV

    Incremental Learning of Object Detectors without Catastrophic Forgetting

    Authors: Konstantin Shmelkov, Cordelia Schmid, Karteek Alahari

    Abstract: Despite their success for object detection, convolutional neural networks are ill-equipped for incremental learning, i.e., adapting the original model trained on a set of classes to additionally detect objects of new classes, in the absence of the initial training data. They suffer from "catastrophic forgetting" - an abrupt degradation of performance on the original set of classes, when the traini…

    Submitted 23 August, 2017; originally announced August 2017.

    Comments: To appear in ICCV 2017

  27. arXiv:1708.02813  [pdf, other]

    cs.CV

    BlitzNet: A Real-Time Deep Network for Scene Understanding

    Authors: Nikita Dvornik, Konstantin Shmelkov, Julien Mairal, Cordelia Schmid

    Abstract: Real-time scene understanding has become crucial in many applications such as autonomous driving. In this paper, we propose a deep architecture, called BlitzNet, that jointly performs object detection and semantic segmentation in one forward pass, allowing real-time computations. Besides the computational gain of having a single network to perform several tasks, we show that object detection and s…

    Submitted 9 August, 2017; originally announced August 2017.

  28. arXiv:1708.02598  [pdf, other]

    stat.CO

    Exponential Random Graph Models with Big Networks: Maximum Pseudolikelihood Estimation and the Parametric Bootstrap

    Authors: Christian S. Schmid, Bruce A. Desmarais

    Abstract: With the growth of interest in network data across fields, the Exponential Random Graph Model (ERGM) has emerged as the leading approach to the statistical analysis of network data. ERGM parameter estimation requires the approximation of an intractable normalizing constant. Simulation methods represent the state-of-the-art approach to approximating the normalizing constant, leading to estimation b…

    Submitted 8 August, 2017; originally announced August 2017.

  29. arXiv:1707.09472  [pdf, other]

    cs.CV

    Weakly-supervised learning of visual relations

    Authors: Julia Peyre, Ivan Laptev, Cordelia Schmid, Josef Sivic

    Abstract: This paper introduces a novel approach for modeling visual relations between pairs of objects. We call a relation a triplet of the form (subject, predicate, object) where the predicate is typically a preposition (e.g., 'under', 'in front of') or a verb ('hold', 'ride') that links a pair of objects (subject, object). Learning such relations is challenging as the objects have different spatial configura…

    Submitted 29 July, 2017; originally announced July 2017.

  30. arXiv:1707.07036  [pdf, ps, other]

    physics.ins-det physics.atom-ph physics.chem-ph

    High resolution ion trap time-of-flight mass spectrometer for cold trapped ion experiments

    Authors: Philipp C. Schmid, James Greenberg, Mikhail I. Miller, Kevin Loeffler, Heather J. Lewandowski

    Abstract: Trapping molecular ions that have been sympathetically cooled with laser-cooled atomic ions is a useful platform for exploring cold ion chemistry. We designed and characterized a new experimental apparatus for probing chemical reaction dynamics between molecular cations and neutral radicals at temperatures below 1 K. The ions are trapped in a linear quadrupole radio-frequency trap and sympathetica…

    Submitted 21 July, 2017; originally announced July 2017.

    Comments: 9 pages, 9 figures

    Journal ref: Rev. Sci. Instrum. 88 (2017) 123107

  31. arXiv:1707.06005  [pdf, other]

    cs.CV

    Detecting Parts for Action Localization

    Authors: Nicolas Chesneau, Grégory Rogez, Karteek Alahari, Cordelia Schmid

    Abstract: In this paper, we propose a new framework for action localization that tracks people in videos and extracts full-body human tubes, i.e., spatio-temporal regions localizing actions, even in the case of occlusions or truncations. This is achieved by training a novel human part detector that scores visible parts while regressing full-body bounding boxes. The core of our method is a convolutional neur…

    Submitted 21 July, 2017; v1 submitted 19 July, 2017; originally announced July 2017.

    Comments: BMVC 2017

  32. arXiv:1707.03993  [pdf]

    cs.CV

    Developing the Path Signature Methodology and its Application to Landmark-based Human Action Recognition

    Authors: Weixin Yang, Terry Lyons, Hao Ni, Cordelia Schmid, Lianwen Jin

    Abstract: Landmark-based human action recognition in videos is a challenging task in computer vision. One key step is to design a generic approach that generates discriminative features for the spatial structure and temporal dynamics. To this end, we regard the evolving landmark data as a high-dimensional path and apply non-linear path signature techniques to provide an expressive, robust, non-linear, and i…

    Submitted 12 December, 2019; v1 submitted 13 July, 2017; originally announced July 2017.

    Comments: 14 pages, 11 figures

  33. arXiv:1705.08421  [pdf, other]

    cs.CV

    AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

    Authors: Chunhui Gu, Chen Sun, David A. Ross, Carl Vondrick, Caroline Pantofaru, Yeqing Li, Sudheendra Vijayanarasimhan, George Toderici, Susanna Ricco, Rahul Sukthankar, Cordelia Schmid, Jitendra Malik

    Abstract: This paper introduces a video dataset of spatio-temporally localized Atomic Visual Actions (AVA). The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently. The key characteristics of our dataset are: (1) the definition of atomic visual…

    Submitted 30 April, 2018; v1 submitted 23 May, 2017; originally announced May 2017.

    Comments: To appear in CVPR 2018. Check dataset page https://research.google.com/ava/ for details

  34. arXiv:1705.04043  [pdf, other]

    cs.CV

    SCNet: Learning Semantic Correspondence

    Authors: Kai Han, Rafael S. Rezende, Bumsub Ham, Kwan-Yee K. Wong, Minsu Cho, Cordelia Schmid, Jean Ponce

    Abstract: This paper addresses the problem of establishing semantic correspondences between images depicting different instances of the same object or scene category. Previous approaches focus on either combining a spatial regularizer with hand-crafted features, or learning a correspondence model for appearance only. We propose instead a convolutional neural network architecture, called SCNet, for learning…

    Submitted 17 August, 2017; v1 submitted 11 May, 2017; originally announced May 2017.

    Comments: ICCV 2017

  35. arXiv:1705.01861  [pdf, other]

    cs.CV

    Action Tubelet Detector for Spatio-Temporal Action Localization

    Authors: Vicky Kalogeiton, Philippe Weinzaepfel, Vittorio Ferrari, Cordelia Schmid

    Abstract: Current state-of-the-art approaches for spatio-temporal action localization rely on detections at the frame level that are then linked or tracked across time. In this paper, we leverage the temporal continuity of videos instead of operating at the frame level. We propose the ACtion Tubelet detector (ACT-detector) that takes as input a sequence of frames and outputs tubelets, i.e., sequences of bou…

    Submitted 21 August, 2017; v1 submitted 4 May, 2017; originally announced May 2017.

    Comments: 9 pages

  36. arXiv:1704.07804  [pdf, other]

    cs.CV

    SfM-Net: Learning of Structure and Motion from Video

    Authors: Sudheendra Vijayanarasimhan, Susanna Ricco, Cordelia Schmid, Rahul Sukthankar, Katerina Fragkiadaki

    Abstract: We propose SfM-Net, a geometry-aware neural network for motion estimation in videos that decomposes frame-to-frame pixel motion in terms of scene and object depth, camera motion and 3D object rotations and translations. Given a sequence of frames, SfM-Net predicts depth, segmentation, camera and rigid object motions, converts those into a dense frame-to-frame motion field (optical flow), different…

    Submitted 25 April, 2017; originally announced April 2017.

  37. arXiv:1704.05737  [pdf, other]

    cs.CV

    Learning Video Object Segmentation with Visual Memory

    Authors: Pavel Tokmakov, Karteek Alahari, Cordelia Schmid

    Abstract: This paper addresses the task of segmenting moving objects in unconstrained videos. We introduce a novel two-stream neural network with an explicit memory module to achieve this. The two streams of the network encode spatial and temporal features in a video sequence respectively, while the memory module captures the evolution of objects over time. The module to build a "visual memory" in video, i.…

    Submitted 12 July, 2017; v1 submitted 19 April, 2017; originally announced April 2017.

  38. arXiv:1703.07144  [pdf, other]

    cs.CV

    Proposal Flow: Semantic Correspondences from Object Proposals

    Authors: Bumsub Ham, Minsu Cho, Cordelia Schmid, Jean Ponce

    Abstract: Finding image correspondences remains a challenging problem in the presence of intra-class variations and large changes in scene layout. Semantic flow methods are designed to handle images depicting different instances of the same object or scene category. We introduce a novel approach to semantic flow, dubbed proposal flow, that establishes reliable correspondences using object proposals. Unlike…

    Submitted 21 March, 2017; originally announced March 2017.

    Comments: arXiv admin note: text overlap with arXiv:1511.05065

  39. Learning from Synthetic Humans

    Authors: Gül Varol, Javier Romero, Xavier Martin, Naureen Mahmood, Michael J. Black, Ivan Laptev, Cordelia Schmid

    Abstract: Estimating human pose, shape, and motion from images and videos are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually-labeled training data for learning convolutional neural networks (CNNs). Such data is time consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In…

    Submitted 19 January, 2018; v1 submitted 5 January, 2017; originally announced January 2017.

    Comments: Appears in: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017). 9 pages

  40. arXiv:1612.07217  [pdf, other]

    cs.CV

    Learning Motion Patterns in Videos

    Authors: Pavel Tokmakov, Karteek Alahari, Cordelia Schmid

    Abstract: The problem of determining whether an object is in motion, irrespective of camera motion, is far from being solved. We address this challenging task by learning motion patterns in videos. The core of our approach is a fully convolutional network, which is learned entirely from synthetic video sequences, and their ground-truth optical flow and motion segmentation. This encoder-decoder style archite…

    Submitted 10 April, 2017; v1 submitted 21 December, 2016; originally announced December 2016.

  41. arXiv:1612.02008  [pdf, other]

    hep-th math-ph

    Little String Defects and Bala-Carter Theory

    Authors: Nathan Haouzi, Christian Schmid

    Abstract: We give a physical realization of the Bala-Carter labels that classify nilpotent orbits of semi-simple Lie algebras, for the case $\mathfrak{g}=A,D,E$. We start from type IIB string theory compactified on an $ADE$ singularity and study the six-dimensional (2,0) $\mathfrak{g}$-type little string on a Riemann surface with punctures. The defects are introduced as D-branes wrapping the 2-cycles of the…

    Submitted 16 December, 2016; v1 submitted 6 December, 2016; originally announced December 2016.

    Comments: 51 pages, 9 figures, 3 longtables. v2: Added proof of dimension formula, minor changes

  42. arXiv:1612.01033  [pdf, other]

    cs.CV

    Areas of Attention for Image Captioning

    Authors: Marco Pedersoli, Thomas Lucas, Cordelia Schmid, Jakob Verbeek

    Abstract: We propose "Areas of Attention", a novel attention-based model for automatic image captioning. Our approach models the dependencies between image regions, caption words, and the state of an RNN language model, using three pairwise interactions. In contrast to previous attention-based approaches that associate image regions only to the RNN state, our method allows a direct association between capti…

    Submitted 25 August, 2017; v1 submitted 3 December, 2016; originally announced December 2016.

    Comments: Accepted in ICCV 2017

  43. arXiv:1609.02195  [pdf, ps, other]

    gr-qc

    Einstein's $R^{\hat{0} \hat{0}}$ equation for non-relativistic sources derived from Einstein's inertial motion and the Newtonian law for relative acceleration

    Authors: Christoph Schmid

    Abstract: With Einstein's inertial motion (free-falling and non-rotating relative to gyroscopes), geodesics for non-relativistic particles can intersect repeatedly, allowing one to compute the space-time curvature $R^{\hat{0} \hat{0}}$ exactly. Einstein's $R^{\hat{0} \hat{0}}$ for strong gravitational fields and for relativistic source-matter is identical with the Newtonian expression for the relative radia…

    Submitted 7 September, 2016; originally announced September 2016.

    Comments: 4 pages. arXiv admin note: text overlap with arXiv:1607.08661

  44. Little String Origin of Surface Defects

    Authors: Nathan Haouzi, Christian Schmid

    Abstract: We derive the codimension-two defects of 4d $\mathcal{N} = 4$ Super Yang-Mills (SYM) theory from the (2, 0) little string. The origin of the little string is type IIB theory compactified on an ADE singularity. The defects are D-branes wrapping the 2-cycles of the singularity. We use this construction to make contact with the description of SYM defects due to Gukov and Witten [arXiv:hep-th/0612073]…

    Submitted 18 December, 2016; v1 submitted 25 August, 2016; originally announced August 2016.

    Comments: 64 pages, 18 figures. v2: Minor fixes and clarifications

  45. arXiv:1607.08661  [pdf, ps, other]

    gr-qc

    Einstein's equations from Einstein's inertial motion and Newton's law for relative acceleration

    Authors: Christoph Schmid

    Abstract: We show that Einstein's $R^{\hat{0} \hat{0}}$ equation for nonrelativistic matter and strong gravitational fields is identical with Newton's equation for relative radial acceleration of neighbouring freefalling particles, spherically averaged. These laws are explicitly identical with primary observer's (1) space-time slicing by radial 4-geodesics, (2) radially parallel Local Ortho-Normal Bases, L…

    Submitted 28 July, 2016; originally announced July 2016.

    Comments: 17 pages

  46. arXiv:1607.02046  [pdf, other]

    cs.CV

    MoCap-guided Data Augmentation for 3D Pose Estimation in the Wild

    Authors: Grégory Rogez, Cordelia Schmid

    Abstract: This paper addresses the problem of 3D human pose estimation in the wild. A significant challenge is the lack of training data, i.e., 2D images of humans annotated with 3D poses. Such data is necessary to train state-of-the-art CNN architectures. Here, we propose a solution to generate a large set of photorealistic synthetic images of humans with 3D pose annotations. We introduce an image-based sy…

    Submitted 28 October, 2016; v1 submitted 7 July, 2016; originally announced July 2016.

    Comments: 9 pages, accepted to appear in NIPS 2016

  47. arXiv:1606.00043  [pdf]

    astro-ph.IM astro-ph.GA

    The Detailed Science Case for the Maunakea Spectroscopic Explorer: the Composition and Dynamics of the Faint Universe

    Authors: Alan McConnachie, Carine Babusiaux, Michael Balogh, Simon Driver, Pat Côté, Helene Courtois, Luke Davies, Laura Ferrarese, Sarah Gallagher, Rodrigo Ibata, Nicolas Martin, Aaron Robotham, Kim Venn, Eva Villaver, Jo Bovy, Alessandro Boselli, Matthew Colless, Johan Comparat, Kelly Denny, Pierre-Alain Duc, Sara Ellison, Richard de Grijs, Mirian Fernandez-Lorenzo, Ken Freeman, Raja Guhathakurta, et al. (152 additional authors not shown)

    Abstract: MSE is an 11.25m aperture observatory with a 1.5 square degree field of view that will be fully dedicated to multi-object spectroscopy. More than 3200 fibres will feed spectrographs operating at low (R ~ 2000 - 3500) and moderate (R ~ 6000) spectral resolution, and approximately 1000 fibers will feed spectrographs operating at high (R ~ 40000) resolution. MSE is designed to enable transformational…

    Submitted 31 May, 2016; originally announced June 2016.

    Comments: 210 pages, 91 figures. Exposure draft. Appendices to the Detailed Science Case can be found at http://mse.cfht.hawaii.edu/docs/

  48. arXiv:1605.05197  [pdf, other]

    cs.CV

    Human Action Localization with Sparse Spatial Supervision

    Authors: Philippe Weinzaepfel, Xavier Martin, Cordelia Schmid

    Abstract: We introduce an approach for spatio-temporal human action localization using sparse spatial supervision. Our method leverages the large amount of annotated humans available today and extracts human tubes by combining a state-of-the-art human detector with a tracking-by-detection approach. Given these high-quality human tubes and temporal supervision, we select positive and negative tubes with very…

    Submitted 23 May, 2017; v1 submitted 17 May, 2016; originally announced May 2016.

  49. arXiv:1604.04494  [pdf, other]

    cs.CV

    Long-term Temporal Convolutions for Action Recognition

    Authors: Gül Varol, Ivan Laptev, Cordelia Schmid

    Abstract: Typical human actions last several seconds and exhibit characteristic spatio-temporal structure. Recent methods attempt to capture this structure and learn action representations with convolutional neural networks. Such representations, however, are typically learned at the level of a few video frames failing to model actions at their full temporal extent. In this work we learn video representatio…

    Submitted 2 June, 2017; v1 submitted 15 April, 2016; originally announced April 2016.

  50. arXiv:1603.08841  [pdf, other]

    physics.ins-det hep-ex

    Optimisation of the Read-out Electronics of Muon Drift-Tube Chambers for Very High Background Rates at HL-LHC and Future Colliders

    Authors: Sebastian Nowak, Sergey Abovyan, Philipp Gadow, Katharina Ecker, David Fink, Markus Fras, Oliver Kortner, Hubert Kroha, Felix Mueller, Robert Richter, Clemens Schmid, Korbinian Schmidt-Sommerfeld, Yazhou Zhao

    Abstract: In the ATLAS Muon Spectrometer, Monitored Drift Tube (MDT) chambers and sMDT chambers with half of the tube diameter of the MDTs are used for precision muon track reconstruction. The sMDT chambers are designed for operation at high counting rates due to neutron and gamma background irradiation expected for the HL-LHC and future hadron colliders. The existing MDT read-out electronics uses bipolar s…

    Submitted 29 March, 2016; originally announced March 2016.

    Report number: MPP-2015-284