
Showing 1–19 of 19 results for author: Osokin, A

  1. arXiv:2210.07201  [pdf, other]

    cs.CL

    Searching for Better Database Queries in the Outputs of Semantic Parsers

    Authors: Anton Osokin, Irina Saparina, Ramil Yarullin

    Abstract: The task of generating a database query from a question in natural language suffers from ambiguity and insufficiently precise description of the goal. The problem is amplified when the system needs to generalize to databases unseen at training. In this paper, we consider the case when, at the test time, the system has access to an external criterion that evaluates the generated queries. The criter…

    Submitted 13 October, 2022; originally announced October 2022.

  2. arXiv:2109.06162  [pdf, other]

    cs.CL

    SPARQLing Database Queries from Intermediate Question Decompositions

    Authors: Irina Saparina, Anton Osokin

    Abstract: To translate natural language questions into executable database queries, most approaches rely on a fully annotated training set. Annotating a large dataset with queries is difficult as it requires query-language expertise. We reduce this burden using grounded in databases intermediate question representations. These representations are simpler to collect and were originally crowdsourced within th…

    Submitted 31 May, 2022; v1 submitted 13 September, 2021; originally announced September 2021.

  3. arXiv:2003.06800  [pdf, other]

    cs.CV

    OS2D: One-Stage One-Shot Object Detection by Matching Anchor Features

    Authors: Anton Osokin, Denis Sumin, Vasily Lomakin

    Abstract: In this paper, we consider the task of one-shot object detection, which consists in detecting objects defined by a single demonstration. Differently from the standard object detection, the classes of objects used for training and testing do not overlap. We build the one-stage system that performs localization and recognition jointly. We use dense correlation matching of learned local features to f…

    Submitted 19 August, 2020; v1 submitted 15 March, 2020; originally announced March 2020.

    Comments: Published at ECCV 2020

  4. arXiv:1912.03771  [pdf, other]

    cs.LG stat.ML

    Cost-Sensitive Training for Autoregressive Models

    Authors: Irina Saparina, Anton Osokin

    Abstract: Training autoregressive models to better predict under the test metric, instead of maximizing the likelihood, has been reported to be beneficial in several use cases but brings additional complications, which prevent wider adoption. In this paper, we follow the learning-to-search approach (Daumé III et al., 2009; Leblond et al., 2018) and investigate its several components. First, we propose a way…

    Submitted 8 December, 2019; originally announced December 2019.

  5. arXiv:1902.11088  [pdf, other]

    cs.LG stat.ML

    Scaling Matters in Deep Structured-Prediction Models

    Authors: Aleksandr Shevchenko, Anton Osokin

    Abstract: Deep structured-prediction energy-based models combine the expressive power of learned representations and the ability of embedding knowledge about the task at hand into the system. A common way to learn parameters of such models consists in a multistage procedure where different combinations of components are trained at different stages. The joint end-to-end training of the whole system is then d…

    Submitted 28 February, 2019; originally announced February 2019.

    Comments: 13 pages

  6. arXiv:1812.02619  [pdf, other]

    cs.CV

    Tube-CNN: Modeling temporal evolution of appearance for object detection in video

    Authors: Tuan-Hung Vu, Anton Osokin, Ivan Laptev

    Abstract: Object detection in video is crucial for many applications. Compared to images, video provides additional cues which can help to disambiguate the detection problem. Our goal in this paper is to learn discriminative models for the temporal evolution of object appearance and to use such models for object detection. To model temporal evolution, we introduce space-time tubes corresponding to temporal…

    Submitted 6 December, 2018; originally announced December 2018.

    Comments: 13 pages, 8 figures, technical report

  7. arXiv:1811.08725  [pdf, other]

    stat.ML cs.LG

    Marginal Weighted Maximum Log-likelihood for Efficient Learning of Perturb-and-Map models

    Authors: Tatiana Shpakova, Francis Bach, Anton Osokin

    Abstract: We consider the structured-output prediction problem through probabilistic approaches and generalize the "perturb-and-MAP" framework to more challenging weighted Hamming losses, which are crucial in applications. While in principle our approach is a straightforward marginalization, it requires solving many related MAP inference problems. We show that for log-supermodular pairwise models these oper…

    Submitted 21 November, 2018; originally announced November 2018.

    Comments: Published in Proceedings of the Conference of Uncertainty in Artificial Intelligence (UAI), 2018

  8. arXiv:1810.11544  [pdf, other]

    cs.LG cs.AI stat.ML

    Quantifying Learning Guarantees for Convex but Inconsistent Surrogates

    Authors: Kirill Struminsky, Simon Lacoste-Julien, Anton Osokin

    Abstract: We study consistency properties of machine learning methods based on minimizing convex surrogates. We extend the recent framework of Osokin et al. (2017) for the quantitative analysis of consistency properties to the case of inconsistent surrogates. Our key technical contribution consists in a new lower bound on the calibration function for the quadratic surrogate, which is non-trivial (not always…

    Submitted 9 January, 2019; v1 submitted 26 October, 2018; originally announced October 2018.

    Comments: Appears in: Advances in Neural Information Processing Systems 31 (NeurIPS 2018). 18 pages

  9. arXiv:1806.11008  [pdf, other]

    cs.CV

    Modeling Spatio-Temporal Human Track Structure for Action Localization

    Authors: Guilhem Chéron, Anton Osokin, Ivan Laptev, Cordelia Schmid

    Abstract: This paper addresses spatio-temporal localization of human actions in video. In order to localize actions in time, we propose a recurrent localization network (RecLNet) designed to model the temporal structure of actions on the level of person tracks. Our model is trained to simultaneously recognize and localize action classes in time and is based on two layer gated recurrent units (GRU) applied s…

    Submitted 28 June, 2018; originally announced June 2018.

  10. arXiv:1708.04692  [pdf, other]

    cs.CV cs.LG stat.ML

    GANs for Biological Image Synthesis

    Authors: Anton Osokin, Anatole Chessel, Rafael E. Carazo Salas, Federico Vaggi

    Abstract: In this paper, we propose a novel application of Generative Adversarial Networks (GAN) to the synthesis of cells imaged by fluorescence microscopy. Compared to natural images, cells tend to have a simpler and more geometric global structure that facilitates image generation. However, the correlation between the spatial pattern of different fluorescent proteins reflects important biological functio…

    Submitted 12 September, 2017; v1 submitted 15 August, 2017; originally announced August 2017.

    Comments: Paper appearing at the International Conference on Computer Vision (ICCV) 2017, with supplementary materials

  11. arXiv:1706.04499  [pdf, other]

    cs.LG stat.ML

    SEARNN: Training RNNs with Global-Local Losses

    Authors: Rémi Leblond, Jean-Baptiste Alayrac, Anton Osokin, Simon Lacoste-Julien

    Abstract: We propose SEARNN, a novel training algorithm for recurrent neural networks (RNNs) inspired by the "learning to search" (L2S) approach to structured prediction. RNNs have been widely successful in structured prediction applications such as machine translation or parsing, and are commonly trained using maximum likelihood estimation (MLE). Unfortunately, this training loss is not always an appropria…

    Submitted 4 March, 2018; v1 submitted 14 June, 2017; originally announced June 2017.

    Comments: Published as a conference paper at ICLR 2018, 16 pages

  12. arXiv:1703.02403  [pdf, other]

    cs.LG stat.ML

    On Structured Prediction Theory with Calibrated Convex Surrogate Losses

    Authors: Anton Osokin, Francis Bach, Simon Lacoste-Julien

    Abstract: We provide novel theoretical insights on structured prediction in the context of efficient convex surrogate loss minimization with consistency guarantees. For any task loss, we construct a convex surrogate that can be optimized via stochastic gradient descent and we prove tight bounds on the so-called "calibration function" relating the excess surrogate risk to the actual risk. In contrast to prio…

    Submitted 29 January, 2018; v1 submitted 7 March, 2017; originally announced March 2017.

    Comments: Appears in: Advances in Neural Information Processing Systems 30 (NIPS 2017). 30 pages

  13. arXiv:1605.09346  [pdf, other]

    cs.LG math.OC stat.ML

    Minding the Gaps for Block Frank-Wolfe Optimization of Structured SVMs

    Authors: Anton Osokin, Jean-Baptiste Alayrac, Isabella Lukasewitz, Puneet K. Dokania, Simon Lacoste-Julien

    Abstract: In this paper, we propose several improvements on the block-coordinate Frank-Wolfe (BCFW) algorithm from Lacoste-Julien et al. (2013) recently used to optimize the structured support vector machine (SSVM) objective in the context of structured prediction, though it has wider applications. The key intuition behind our improvements is that the estimates of block gaps maintained by BCFW reveal the bl…

    Submitted 30 May, 2016; originally announced May 2016.

    Comments: Appears in Proceedings of the 33rd International Conference on Machine Learning (ICML 2016). 31 pages

    MSC Class: 90C52; 90C90; 90C06; 68T05
    ACM Class: G.1.6; I.2.6

  14. arXiv:1511.07917  [pdf, other]

    cs.CV cs.LG

    Context-aware CNNs for person head detection

    Authors: Tuan-Hung Vu, Anton Osokin, Ivan Laptev

    Abstract: Person detection is a key problem for many computer vision tasks. While face detection has reached maturity, detecting people under a full variation of camera view-points, human poses, lighting conditions and occlusions is still a difficult challenge. In this work we focus on detecting human heads in natural scenes. Starting from the recent local R-CNN object detector, we extend it with two types…

    Submitted 24 November, 2015; originally announced November 2015.

    Comments: To appear in International Conference on Computer Vision (ICCV), 2015

  15. arXiv:1509.06569  [pdf, ps, other]

    cs.LG cs.NE

    Tensorizing Neural Networks

    Authors: Alexander Novikov, Dmitry Podoprikhin, Anton Osokin, Dmitry Vetrov

    Abstract: Deep neural networks currently demonstrate state-of-the-art performance in several domains. At the same time, models of this class are very demanding in terms of computational resources. In particular, a large amount of memory is required by commonly used fully-connected layers, making it hard to use the models on low-end devices and stopping the further increase of the model size. In this paper w…

    Submitted 20 December, 2015; v1 submitted 22 September, 2015; originally announced September 2015.

  16. arXiv:1502.07257  [pdf, other]

    cs.CL

    Breaking Sticks and Ambiguities with Adaptive Skip-gram

    Authors: Sergey Bartunov, Dmitry Kondrashkin, Anton Osokin, Dmitry Vetrov

    Abstract: Recently proposed Skip-gram model is a powerful method for learning high-dimensional word representations that capture rich semantic relationships between words. However, Skip-gram as well as most prior work on learning word representations does not take into account word ambiguity and maintain only single representation per word. Although a number of Skip-gram modifications were proposed to overc…

    Submitted 15 November, 2015; v1 submitted 25 February, 2015; originally announced February 2015.

  17. arXiv:1501.03771  [pdf, ps, other]

    cs.CV math.OC stat.ML

    Submodular relaxation for inference in Markov random fields

    Authors: Anton Osokin, Dmitry Vetrov

    Abstract: In this paper we address the problem of finding the most probable state of a discrete Markov random field (MRF), also known as the MRF energy minimization problem. The task is known to be NP-hard in general and its practical importance motivates numerous approximate algorithms. We propose a submodular relaxation approach (SMR) based on a Lagrangian relaxation of the initial problem. Unlike the dua…

    Submitted 15 January, 2015; originally announced January 2015.

    Comments: This paper is accepted for publication in IEEE Transactions on Pattern Analysis and Machine Intelligence

  18. arXiv:1406.5910  [pdf, other]

    cs.CV cs.LG

    Multi-utility Learning: Structured-output Learning with Multiple Annotation-specific Loss Functions

    Authors: Roman Shapovalov, Dmitry Vetrov, Anton Osokin, Pushmeet Kohli

    Abstract: Structured-output learning is a challenging problem; particularly so because of the difficulty in obtaining large datasets of fully labelled instances for training. In this paper we try to overcome this difficulty by presenting a multi-utility learning framework for structured prediction that can learn from training instances with different forms of supervision. We propose a unified technique for…

    Submitted 23 June, 2014; originally announced June 2014.

  19. arXiv:1103.1077  [pdf, ps, other]

    cs.CV cs.DM math.OC

    Submodular Decomposition Framework for Inference in Associative Markov Networks with Global Constraints

    Authors: Anton Osokin, Dmitry Vetrov, Vladimir Kolmogorov

    Abstract: In the paper we address the problem of finding the most probable state of discrete Markov random field (MRF) with associative pairwise terms. Although of practical importance, this problem is known to be NP-hard in general. We propose a new type of MRF decomposition, submodular decomposition (SMD). Unlike existing decomposition approaches SMD decomposes the initial problem into subproblems corresp…

    Submitted 5 March, 2011; originally announced March 2011.

    Comments: 17 pages. Shorter version to appear in CVPR 2011