Showing 1–15 of 15 results for author: Behl, H

  1. arXiv:2404.14219  [pdf, other]

    cs.CL cs.AI

    Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone

    Authors: Marah Abdin, Sam Ade Jacobs, Ammar Ahmad Awan, Jyoti Aneja, Ahmed Awadallah, Hany Awadalla, Nguyen Bach, Amit Bahree, Arash Bakhtiari, Jianmin Bao, Harkirat Behl, Alon Benhaim, Misha Bilenko, Johan Bjorck, Sébastien Bubeck, Qin Cai, Martin Cai, Caio César Teodoro Mendes, Weizhu Chen, Vishrav Chaudhary, Dong Chen, Dongdong Chen, Yen-Chun Chen, Yi-Ling Chen, Parul Chopra, et al. (90 additional authors not shown)

    Abstract: We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whose overall performance, as measured by both academic benchmarks and internal testing, rivals that of models such as Mixtral 8x7B and GPT-3.5 (e.g., phi-3-mini achieves 69% on MMLU and 8.38 on MT-bench), despite being small enough to be deployed on a phone. The innovation lies entirely in our dataset…

    Submitted 23 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

    Comments: 19 pages

  2. arXiv:2312.07509  [pdf, other]

    cs.CV cs.LG

    PEEKABOO: Interactive Video Generation via Masked-Diffusion

    Authors: Yash Jain, Anshul Nasery, Vibhav Vineet, Harkirat Behl

    Abstract: Modern video generation models like Sora have achieved remarkable success in producing high-quality videos. However, a significant limitation is their inability to offer interactive control to users, a feature that promises to open up unprecedented applications and creativity. In this work, we introduce the first solution to equip diffusion-based video generation models with spatio-temporal contro…

    Submitted 19 April, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Project webpage - https://jinga-lala.github.io/projects/Peekaboo/

  3. arXiv:2311.17937  [pdf, other]

    cs.CV

    Unlocking Spatial Comprehension in Text-to-Image Diffusion Models

    Authors: Mohammad Mahdi Derakhshani, Menglin Xia, Harkirat Behl, Cees G. M. Snoek, Victor Rühle

    Abstract: We propose CompFuser, an image generation pipeline that enhances spatial comprehension and attribute assignment in text-to-image generative models. Our pipeline enables the interpretation of instructions defining spatial relationships between objects in a scene, such as 'An image of a gray cat on the left of an orange dog', and generate corresponding images. This is especially important in order t…

    Submitted 28 November, 2023; originally announced November 2023.

  4. arXiv:2311.04894  [pdf, other]

    cs.CV cs.AI cs.LG

    DAMEX: Dataset-aware Mixture-of-Experts for visual understanding of mixture-of-datasets

    Authors: Yash Jain, Harkirat Behl, Zsolt Kira, Vibhav Vineet

    Abstract: Construction of a universal detector poses a crucial question: How can we most effectively train a model on a large mixture of datasets? The answer lies in learning dataset-specific features and ensembling their knowledge but do all this in a single model. Previous methods achieve this by having separate detection heads on a common backbone but that results in a significant increase in parameters.…

    Submitted 8 November, 2023; originally announced November 2023.

    Comments: https://github.com/jinga-lala/DAMEX

  5. arXiv:2309.07499  [pdf, other]

    cs.CV

    Efficiently Robustify Pre-trained Models

    Authors: Nishant Jain, Harkirat Behl, Yogesh Singh Rawat, Vibhav Vineet

    Abstract: A recent trend in deep learning algorithms has been towards training large scale models, having high parameter count and trained on big dataset. However, robustness of such large scale models towards real-world settings is still a less-explored topic. In this work, we first benchmark the performance of these models under different perturbations and datasets thereby representing real-world shifts,…

    Submitted 14 September, 2023; originally announced September 2023.

  6. arXiv:2306.11644  [pdf, other]

    cs.CL cs.AI cs.LG

    Textbooks Are All You Need

    Authors: Suriya Gunasekar, Yi Zhang, Jyoti Aneja, Caio César Teodoro Mendes, Allie Del Giorno, Sivakanth Gopi, Mojan Javaheripi, Piero Kauffmann, Gustavo de Rosa, Olli Saarikivi, Adil Salim, Shital Shah, Harkirat Singh Behl, Xin Wang, Sébastien Bubeck, Ronen Eldan, Adam Tauman Kalai, Yin Tat Lee, Yuanzhi Li

    Abstract: We introduce phi-1, a new large language model for code, with significantly smaller size than competing models: phi-1 is a Transformer-based model with 1.3B parameters, trained for 4 days on 8 A100s, using a selection of "textbook quality" data from the web (6B tokens) and synthetically generated textbooks and exercises with GPT-3.5 (1B tokens). Despite this small scale, phi-1 attains pass@1 accu…

    Submitted 2 October, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: 26 pages; changed color scheme of plot. fixed minor typos and added couple clarifications

  7. arXiv:2212.11270  [pdf, other]

    cs.CV cs.CL

    Generalized Decoding for Pixel, Image, and Language

    Authors: Xueyan Zou, Zi-Yi Dou, Jianwei Yang, Zhe Gan, Linjie Li, Chunyuan Li, Xiyang Dai, Harkirat Behl, Jianfeng Wang, Lu Yuan, Nanyun Peng, Lijuan Wang, Yong Jae Lee, Jianfeng Gao

    Abstract: We present X-Decoder, a generalized decoding model that can predict pixel-level segmentation and language tokens seamlessly. X-Decoder takes as input two types of queries: (i) generic non-semantic queries and (ii) semantic queries induced from text inputs, to decode different pixel-level and token-level outputs in the same semantic space. With such a novel design, X-Decoder is the first work that…

    Submitted 21 December, 2022; originally announced December 2022.

    Comments: https://x-decoder-vl.github.io

  8. arXiv:2207.11368  [pdf, other]

    cs.CV

    Neural-Sim: Learning to Generate Training Data with NeRF

    Authors: Yunhao Ge, Harkirat Behl, Jiashu Xu, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, Vibhav Vineet

    Abstract: Training computer vision models usually requires collecting and labeling vast amounts of imagery under a diverse set of scene configurations and properties. This process is incredibly time-consuming, and it is challenging to ensure that the captured data distribution maps well to the target domain of an application scenario. Recently, synthetic data has emerged as a way to address both of these is…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: ECCV 2022

  9. arXiv:2101.05844  [pdf, other]

    cs.LG

    Scaling the Convex Barrier with Sparse Dual Algorithms

    Authors: Alessandro De Palma, Harkirat Singh Behl, Rudy Bunel, Philip H. S. Torr, M. Pawan Kumar

    Abstract: Tight and efficient neural network bounding is crucial to the scaling of neural network verification systems. Many efficient bounding algorithms have been presented recently, but they are often too loose to verify more challenging properties. This is due to the weakness of the employed relaxation, which is usually a linear program of size linear in the number of neurons. While a tighter linear rel…

    Submitted 26 February, 2024; v1 submitted 14 January, 2021; originally announced January 2021.

    Comments: Journal of Machine Learning Research, 2024 (extension of ICLR 2021 paper in [v1])

  10. arXiv:2008.08424  [pdf, other]

    cs.CV cs.GR cs.LG stat.ML

    AutoSimulate: (Quickly) Learning Synthetic Data Generation

    Authors: Harkirat Singh Behl, Atılım Güneş Baydin, Ran Gal, Philip H. S. Torr, Vibhav Vineet

    Abstract: Simulation is increasingly being used for generating large labelled datasets in many machine learning problems. Recent methods have focused on adjusting simulator parameters with the goal of maximising accuracy on a validation task, usually relying on REINFORCE-like gradient estimators. However these approaches are very expensive as they treat the entire data generation, model training, and valida…

    Submitted 16 August, 2020; originally announced August 2020.

    Comments: ECCV 2020

    Journal ref: European Conference on Computer Vision (ECCV) 2020

  11. arXiv:2006.10711  [pdf, other]

    cs.LG stat.ML

    STEER: Simple Temporal Regularization For Neural ODEs

    Authors: Arnab Ghosh, Harkirat Singh Behl, Emilien Dupont, Philip H. S. Torr, Vinay Namboodiri

    Abstract: Training Neural Ordinary Differential Equations (ODEs) is often computationally expensive. Indeed, computing the forward pass of such models involves solving an ODE which can become arbitrarily complex during training. Recent works have shown that regularizing the dynamics of the ODE can partially alleviate this. In this paper we propose a new regularization technique: randomly sampling the end ti…

    Submitted 2 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

    Comments: NeurIPS 2020

  12. arXiv:2006.09081  [pdf, other]

    cs.CV cs.LG

    Progressive Skeletonization: Trimming more fat from a network at initialization

    Authors: Pau de Jorge, Amartya Sanyal, Harkirat S. Behl, Philip H. S. Torr, Gregory Rogez, Puneet K. Dokania

    Abstract: Recent studies have shown that skeletonization (pruning parameters) of networks at initialization provides all the practical benefits of sparsity both at inference and training time, while only marginally degrading their performance. However, we observe that beyond a certain level of sparsity (approx 95%), these approaches fail to preserve the network performance, and to our surprise,…

    Submitted 19 March, 2021; v1 submitted 16 June, 2020; originally announced June 2020.

  13. arXiv:1905.07435  [pdf, other]

    cs.LG cs.AI stat.ML

    Alpha MAML: Adaptive Model-Agnostic Meta-Learning

    Authors: Harkirat Singh Behl, Atılım Güneş Baydin, Philip H. S. Torr

    Abstract: Model-agnostic meta-learning (MAML) is a meta-learning technique to train a model on a multitude of learning tasks in a way that primes the model for few-shot learning of new tasks. The MAML algorithm performs well on few-shot learning problems in classification, regression, and fine-tuning of policy gradients in reinforcement learning, but comes with the need for costly hyperparameter tuning for…

    Submitted 17 May, 2019; originally announced May 2019.

    Comments: 6th ICML Workshop on Automated Machine Learning (2019)

    Journal ref: ICML Workshop on Automated Machine Learning (2019)

  14. arXiv:1812.01397  [pdf, other]

    cs.CV

    Meta Learning Deep Visual Words for Fast Video Object Segmentation

    Authors: Harkirat Singh Behl, Mohammad Najafi, Anurag Arnab, Philip H. S. Torr

    Abstract: Personal robots and driverless cars need to be able to operate in novel environments and thus quickly and efficiently learn to recognise new object classes. We address this problem by considering the task of video object segmentation. Previous accurate methods for this task finetune a model using the first annotated frame, and/or use additional inputs such as optical flow and complex post-processi…

    Submitted 16 August, 2020; v1 submitted 4 December, 2018; originally announced December 2018.

    Journal ref: In Proceedings of International Conference on Intelligent Robots and Systems (IROS) 2020

  15. arXiv:1704.01358  [pdf, other]

    cs.CV

    Incremental Tube Construction for Human Action Detection

    Authors: Harkirat Singh Behl, Michael Sapienza, Gurkirt Singh, Suman Saha, Fabio Cuzzolin, Philip H. S. Torr

    Abstract: Current state-of-the-art action detection systems are tailored for offline batch-processing applications. However, for online applications like human-robot interaction, current systems fall short, either because they only detect one action per video, or because they assume that the entire video is available ahead of time. In this work, we introduce a real-time and online joint-labelling and associ…

    Submitted 23 July, 2018; v1 submitted 5 April, 2017; originally announced April 2017.

    Comments: British Machine Vision Conference (BMVC) 2018