-
Are we making progress in unlearning? Findings from the first NeurIPS unlearning competition
Authors:
Eleni Triantafillou,
Peter Kairouz,
Fabian Pedregosa,
Jamie Hayes,
Meghdad Kurmanji,
Kairan Zhao,
Vincent Dumoulin,
Julio Jacques Junior,
Ioannis Mitliagkas,
Jun Wan,
Lisheng Sun Hosoya,
Sergio Escalera,
Gintare Karolina Dziugaite,
Peter Triantafillou,
Isabelle Guyon
Abstract:
We present the findings of the first NeurIPS competition on unlearning, which sought to stimulate the development of novel algorithms and initiate discussions on formal and robust evaluation methodologies. The competition was highly successful: nearly 1,200 teams from across the world participated, and a wealth of novel, imaginative solutions with different characteristics were contributed. In this paper, we analyze top solutions and delve into discussions on benchmarking unlearning, which itself is a research problem. The evaluation methodology we developed for the competition measures forgetting quality according to a formal notion of unlearning, while incorporating model utility for a holistic evaluation. We analyze the effectiveness of different instantiations of this evaluation framework vis-a-vis the associated compute cost, and discuss implications for standardizing evaluation. We find that the ranking of leading methods remains stable under several variations of this framework, pointing to avenues for reducing the cost of evaluation. Overall, our findings indicate progress in unlearning, with top-performing competition entries surpassing existing algorithms under our evaluation framework. We analyze trade-offs made by different algorithms and strengths or weaknesses in terms of generalizability to new datasets, paving the way for advancing both benchmarking and algorithm development in this important area.
Submitted 13 June, 2024;
originally announced June 2024.
-
AI Competitions and Benchmarks: Dataset Development
Authors:
Romain Egele,
Julio C. S. Jacques Junior,
Jan N. van Rijn,
Isabelle Guyon,
Xavier Baró,
Albert Clapés,
Prasanna Balaprakash,
Sergio Escalera,
Thomas Moeslund,
Jun Wan
Abstract:
Machine learning is now used in many applications thanks to its ability to predict, generate, or discover patterns from large quantities of data. However, the process of collecting and transforming data for practical use is intricate. Even in today's digital era, where substantial data is generated daily, it is uncommon for it to be readily usable; most often, it necessitates meticulous manual data preparation. The haste in developing new models can frequently result in various shortcomings, potentially posing risks when deployed in real-world scenarios (e.g., social discrimination, critical failures), leading to the failure or substantial escalation of costs in AI-based projects. This chapter provides a comprehensive overview of established methodological tools, enriched by our practical experience, in the development of datasets for machine learning. First, we describe the tasks involved in dataset development and offer insights into their effective management (including requirements, design, implementation, evaluation, distribution, and maintenance). Then, we provide more details about the implementation process, which includes data collection, transformation, and quality evaluation. Finally, we address practical considerations regarding dataset distribution and maintenance.
Submitted 15 April, 2024;
originally announced April 2024.
-
Unified Physical-Digital Attack Detection Challenge
Authors:
Haocheng Yuan,
Ajian Liu,
Junze Zheng,
Jun Wan,
Jiankang Deng,
Sergio Escalera,
Hugo Jair Escalante,
Isabelle Guyon,
Zhen Lei
Abstract:
Face Anti-Spoofing (FAS) is crucial to safeguard Face Recognition (FR) Systems. In real-world scenarios, FRs are confronted with both physical and digital attacks. However, existing algorithms often address only one type of attack at a time, which poses significant limitations in real-world scenarios where FR systems face hybrid physical-digital threats. To facilitate the research of Unified Attack Detection (UAD) algorithms, a large-scale UniAttackData dataset has been collected. UniAttackData is the largest public dataset for Unified Attack Detection, with a total of 28,706 videos, where each unique identity encompasses all advanced attack types. Based on this dataset, we organized a Unified Physical-Digital Face Attack Detection Challenge to boost the research in Unified Attack Detections. It attracted 136 teams for the development phase, with 13 qualifying for the final round. The results re-verified by the organizing team were used for the final ranking. This paper comprehensively reviews the challenge, detailing the dataset introduction, protocol definition, evaluation criteria, and a summary of published results. Finally, we focus on the detailed analysis of the highest-performing algorithms and offer potential directions for unified physical-digital attack detection inspired by this competition. Challenge Website: https://sites.google.com/view/face-anti-spoofing-challenge/welcome/challengecvpr2024.
Submitted 18 April, 2024; v1 submitted 9 April, 2024;
originally announced April 2024.
-
Challenge design roadmap
Authors:
Hugo Jair Escalante Balderas,
Isabelle Guyon,
Addison Howard,
Walter Reade,
Sebastien Treguer
Abstract:
Challenges can be seen as a type of game that motivates participants to solve serious tasks. As a result, competition organizers must develop effective game rules. However, these rules have multiple objectives beyond making the game enjoyable for participants. These objectives may include solving real-world problems, advancing scientific or technical areas, making scientific discoveries, and educating the public. In many ways, creating a challenge is similar to launching a product. It requires the same level of excitement and rigorous testing, and the goal is to attract "customers" in the form of participants. The process begins with a solid plan, such as a competition proposal that will eventually be submitted to an international conference and subjected to peer review. Although peer review does not guarantee quality, it does force organizers to consider the impact of their challenge, identify potential oversights, and generally improve its quality. This chapter provides guidelines for creating a strong plan for a challenge. The material draws on the preparation guidelines from organizations such as Kaggle, ChaLearn, and Tailor, as well as the NeurIPS proposal template, which some of the authors contributed to.
Submitted 15 January, 2024;
originally announced January 2024.
-
DMLR: Data-centric Machine Learning Research -- Past, Present and Future
Authors:
Luis Oala,
Manil Maskey,
Lilith Bat-Leah,
Alicia Parrish,
Nezihe Merve Gürel,
Tzu-Sheng Kuo,
Yang Liu,
Rotem Dror,
Danilo Brajovic,
Xiaozhe Yao,
Max Bartolo,
William A Gaviria Rojas,
Ryan Hileman,
Rainier Aliment,
Michael W. Mahoney,
Meg Risdal,
Matthew Lease,
Wojciech Samek,
Debojyoti Dutta,
Curtis G Northcutt,
Cody Coleman,
Braden Hancock,
Bernard Koch,
Girmaw Abebe Tadesse,
Bojan Karlaš,
et al. (13 additional authors not shown)
Abstract:
Drawing from discussions at the inaugural DMLR workshop at ICML 2023 and meetings prior, in this report we outline the relevance of community engagement and infrastructure development for the creation of next-generation public datasets that will advance machine learning science. We chart a path forward as a collective effort to sustain the creation and maintenance of these datasets and methods towards positive scientific, societal and business impact.
Submitted 1 June, 2024; v1 submitted 21 November, 2023;
originally announced November 2023.
-
RRR-Net: Reusing, Reducing, and Recycling a Deep Backbone Network
Authors:
Haozhe Sun,
Isabelle Guyon,
Felix Mohr,
Hedi Tabia
Abstract:
It has become mainstream in computer vision and other machine learning domains to reuse backbone networks pre-trained on large datasets as preprocessors. Typically, the last layer is replaced by a shallow learning machine of sorts; the newly-added classification head and (optionally) deeper layers are fine-tuned on a new task. Due to its strong performance and simplicity, a common pre-trained backbone network is ResNet152. However, ResNet152 is relatively large and induces inference latency. In many cases, a compact and efficient backbone with similar performance would be preferable over a larger, slower one. This paper investigates techniques to reuse a pre-trained backbone with the objective of creating a smaller and faster model. Starting from a large ResNet152 backbone pre-trained on ImageNet, we first reduce it from 51 blocks to 5 blocks, reducing its number of parameters and FLOPs by more than 6 times, without significant performance degradation. Then, we split the model after 3 blocks into several branches, while preserving the same number of parameters and FLOPs, to create an ensemble of sub-networks to improve performance. Our experiments on a large benchmark of 40 image classification datasets from various domains suggest that our techniques match, if not exceed, the performance of "classical backbone fine-tuning" while achieving a smaller model size and faster inference speed.
Submitted 2 October, 2023;
originally announced October 2023.
-
Modularity in Deep Learning: A Survey
Authors:
Haozhe Sun,
Isabelle Guyon
Abstract:
Modularity is a general principle present in many fields. It offers attractive advantages, including, among others, ease of conceptualization, interpretability, scalability, module combinability, and module reusability. The deep learning community has long sought to take inspiration from the modularity principle, either implicitly or explicitly. This interest has been increasing over recent years. We review the notion of modularity in deep learning around three axes: data, task, and model, which characterize the life cycle of deep learning. Data modularity refers to the observation or creation of data groups for various purposes. Task modularity refers to the decomposition of tasks into sub-tasks. Model modularity means that the architecture of a neural network system can be decomposed into identifiable modules. We describe different instantiations of the modularity principle, and we contextualize their advantages in different deep learning sub-fields. Finally, we conclude the paper with a discussion of the definition of modularity and directions for future research.
Submitted 2 October, 2023;
originally announced October 2023.
-
Is One Epoch All You Need For Multi-Fidelity Hyperparameter Optimization?
Authors:
Romain Egele,
Isabelle Guyon,
Yixuan Sun,
Prasanna Balaprakash
Abstract:
Hyperparameter optimization (HPO) is crucial for fine-tuning machine learning models but can be computationally expensive. To reduce costs, Multi-fidelity HPO (MF-HPO) leverages intermediate accuracy levels in the learning process and discards low-performing models early on. We compared various representative MF-HPO methods against a simple baseline on classical benchmark data. The baseline involved discarding all models except the Top-K after training for only one epoch, followed by further training to select the best model. Surprisingly, this baseline achieved similar results to its counterparts, while requiring an order of magnitude less computation. Upon analyzing the learning curves of the benchmark data, we observed a few dominant learning curves, which explained the success of our baseline. This suggests that researchers should (1) always use the suggested baseline in benchmarks and (2) broaden the diversity of MF-HPO benchmarks to include more complex cases.
Submitted 26 September, 2023; v1 submitted 28 July, 2023;
originally announced July 2023.
-
Meta-Album: Multi-domain Meta-Dataset for Few-Shot Image Classification
Authors:
Ihsan Ullah,
Dustin Carrión-Ojeda,
Sergio Escalera,
Isabelle Guyon,
Mike Huisman,
Felix Mohr,
Jan N van Rijn,
Haozhe Sun,
Joaquin Vanschoren,
Phan Anh Vu
Abstract:
We introduce Meta-Album, an image classification meta-dataset designed to facilitate few-shot learning, transfer learning, meta-learning, among other tasks. It includes 40 open datasets, each having at least 20 classes with 40 examples per class, with verified licences. They stem from diverse domains, such as ecology (fauna and flora), manufacturing (textures, vehicles), human actions, and optical character recognition, featuring various image scales (microscopic, human scales, remote sensing). All datasets are preprocessed, annotated, and formatted uniformly, and come in 3 versions (Micro $\subset$ Mini $\subset$ Extended) to match users' computational resources. We showcase the utility of the first 30 datasets on few-shot learning problems. The other 10 will be released shortly after. Meta-Album is already more diverse and larger (in number of datasets) than similar efforts, and we are committed to keep enlarging it via a series of competitions. As competitions terminate, their test data are released, thus creating a rolling benchmark, available through OpenML.org. Our website https://meta-album.github.io/ contains the source code of challenge winning methods, baseline methods, data loaders, and instructions for contributing either new datasets or algorithms to our expandable meta-dataset.
Submitted 16 February, 2023;
originally announced February 2023.
-
NeurIPS'22 Cross-Domain MetaDL competition: Design and baseline results
Authors:
Dustin Carrión-Ojeda,
Hong Chen,
Adrian El Baz,
Sergio Escalera,
Chaoyu Guan,
Isabelle Guyon,
Ihsan Ullah,
Xin Wang,
Wenwu Zhu
Abstract:
We present the design and baseline results for a new challenge in the ChaLearn meta-learning series, accepted at NeurIPS'22, focusing on "cross-domain" meta-learning. Meta-learning aims to leverage experience gained from previous tasks to solve new tasks efficiently (i.e., with better performance, little training data, and/or modest computational resources). While previous challenges in the series focused on within-domain few-shot learning problems, with the aim of learning efficiently N-way k-shot tasks (i.e., N class classification problems with k training examples), this competition challenges the participants to solve "any-way" and "any-shot" problems drawn from various domains (healthcare, ecology, biology, manufacturing, and others), chosen for their humanitarian and societal impact. To that end, we created Meta-Album, a meta-dataset of 40 image classification datasets from 10 domains, from which we carve out tasks with any number of "ways" (within the range 2-20) and any number of "shots" (within the range 1-20). The competition is with code submission, fully blind-tested on the CodaLab challenge platform. The code of the winners will be open-sourced, enabling the deployment of automated machine learning solutions for few-shot image classification across several domains.
Submitted 31 August, 2022;
originally announced August 2022.
-
Meta-learning from Learning Curves Challenge: Lessons learned from the First Round and Design of the Second Round
Authors:
Manh Hung Nguyen,
Lisheng Sun,
Nathan Grinsztajn,
Isabelle Guyon
Abstract:
Meta-learning from learning curves is an important yet often neglected research area in the Machine Learning community. We introduce a series of Reinforcement Learning-based meta-learning challenges, in which an agent searches for the best suited algorithm for a given dataset, based on feedback of learning curves from the environment. The first round attracted participants both from academia and industry. This paper analyzes the results of the first round (accepted to the competition program of WCCI 2022), to draw insights into what makes a meta-learner successful at learning from learning curves. With the lessons learned from the first round and the feedback from the participants, we have designed the second round of our challenge with a new protocol and a new meta-dataset. The second round of our challenge has been accepted at AutoML-Conf 2022 and is currently ongoing.
Submitted 4 August, 2022;
originally announced August 2022.
-
Reinforcement learning for Energies of the future and carbon neutrality: a Challenge Design
Authors:
Gaëtan Serré,
Eva Boguslawski,
Benjamin Donnot,
Adrien Pavão,
Isabelle Guyon,
Antoine Marot
Abstract:
Current rapid changes in climate increase the urgency to change energy production and consumption management, to reduce carbon and other greenhouse gas production. In this context, the French electricity network management company RTE (Réseau de Transport d'Électricité) has recently published the results of an extensive study outlining various scenarios for tomorrow's French power management. We propose a challenge that will test the viability of such a scenario. The goal is to control electricity transportation in power networks, while pursuing multiple objectives: balancing production and consumption, minimizing energetic losses, keeping people and equipment safe, and in particular avoiding catastrophic failures. While the importance of the application provides a goal in itself, this challenge also aims to push the state-of-the-art in a branch of Artificial Intelligence (AI) called Reinforcement Learning (RL), which offers new possibilities to tackle control problems. In particular, various aspects of the combination of Deep Learning and RL called Deep Reinforcement Learning remain to be harnessed in this application domain. This challenge belongs to a series started in 2019 under the name "Learning to run a power network" (L2RPN). In this new edition, we introduce new, more realistic scenarios proposed by RTE to reach carbon neutrality by 2050, retiring fossil fuel electricity production, increasing proportions of renewable and nuclear energy, and introducing batteries. Furthermore, we provide a baseline using a state-of-the-art reinforcement learning algorithm to stimulate the future participants.
Submitted 21 July, 2022;
originally announced July 2022.
-
Asynchronous Decentralized Bayesian Optimization for Large Scale Hyperparameter Optimization
Authors:
Romain Egele,
Isabelle Guyon,
Venkatram Vishwanath,
Prasanna Balaprakash
Abstract:
Bayesian optimization (BO) is a promising approach for hyperparameter optimization of deep neural networks (DNNs), where each model training can take minutes to hours. In BO, a computationally cheap surrogate model is employed to learn the relationship between parameter configurations and their performance such as accuracy. Parallel BO methods often adopt single manager/multiple workers strategies to evaluate multiple hyperparameter configurations simultaneously. Despite significant hyperparameter evaluation time, the overhead in such centralized schemes prevents these methods from scaling to a large number of workers. We present an asynchronous-decentralized BO, wherein each worker runs a sequential BO and asynchronously communicates its results through shared storage. We scale our method to 1,920 parallel workers (the full production queue of the Polaris supercomputer) without loss of computational efficiency, with above 95% worker utilization, and demonstrate improvement in model accuracy as well as faster convergence on the CANDLE benchmark from the Exascale computing project.
Submitted 26 September, 2023; v1 submitted 1 July, 2022;
originally announced July 2022.
-
Lessons learned from the NeurIPS 2021 MetaDL challenge: Backbone fine-tuning without episodic meta-learning dominates for few-shot learning image classification
Authors:
Adrian El Baz,
Ihsan Ullah,
Edesio Alcobaça,
André C. P. L. F. Carvalho,
Hong Chen,
Fabio Ferreira,
Henry Gouk,
Chaoyu Guan,
Isabelle Guyon,
Timothy Hospedales,
Shell Hu,
Mike Huisman,
Frank Hutter,
Zhengying Liu,
Felix Mohr,
Ekrem Öztürk,
Jan N. van Rijn,
Haozhe Sun,
Xin Wang,
Wenwu Zhu
Abstract:
Although deep neural networks are capable of achieving performance superior to humans on various tasks, they are notorious for requiring large amounts of data and computing resources, restricting their success to domains where such resources are available. Metalearning methods can address this problem by transferring knowledge from related tasks, thus reducing the amount of data and computing resources needed to learn new tasks. We organize the MetaDL competition series, which provides opportunities for research groups all over the world to create and experimentally assess new meta-(deep)learning solutions for real problems. In this paper, authored collaboratively between the competition organizers and the top-ranked participants, we describe the design of the competition, the datasets, the best experimental results, as well as the top-ranked methods in the NeurIPS 2021 challenge, which attracted 15 active teams who made it to the final phase (by outperforming the baseline), making over 100 code submissions during the feedback phase. The solutions of the top participants have been open-sourced. The lessons learned include that learning good representations is essential for effective transfer learning.
Submitted 11 July, 2022; v1 submitted 15 June, 2022;
originally announced June 2022.
-
Bridging the Gap of AutoGraph between Academia and Industry: Analysing AutoGraph Challenge at KDD Cup 2020
Authors:
Zhen Xu,
Lanning Wei,
Huan Zhao,
Rex Ying,
Quanming Yao,
Wei-Wei Tu,
Isabelle Guyon
Abstract:
Graph structured data is ubiquitous in daily life and scientific areas and has attracted increasing attention. Graph Neural Networks (GNNs) have been proved to be effective in modeling graph structured data and many variants of GNN architectures have been proposed. However, much human effort is often needed to tune the architecture depending on different datasets. Researchers naturally adopt Automated Machine Learning on Graph Learning, aiming to reduce the human effort and achieve generally top-performing GNNs, but their methods focus more on the architecture search. To understand GNN practitioners' automated solutions, we organized the AutoGraph Challenge at KDD Cup 2020, emphasizing automated graph neural networks for node classification. We received top solutions, especially from industrial tech companies like Meituan, Alibaba and Twitter, which are already open-sourced on GitHub. After detailed comparisons with solutions from academia, we quantify the gaps between academia and industry on modeling scope, effectiveness and efficiency, and show that (1) academia AutoML for Graph solutions focus on GNN architecture search while industrial solutions, especially the winning ones in the KDD Cup, tend to obtain an overall solution; (2) by neural architecture search only, academia solutions achieve on average 97.3% accuracy of industrial solutions; (3) academia solutions are cheap to obtain with several GPU hours while industrial solutions take a few months of labor. Academic solutions also contain far fewer parameters.
Submitted 6 April, 2022;
originally announced April 2022.
-
Comparison of Spatio-Temporal Models for Human Motion and Pose Forecasting in Face-to-Face Interaction Scenarios
Authors:
German Barquero,
Johnny Núñez,
Zhen Xu,
Sergio Escalera,
Wei-Wei Tu,
Isabelle Guyon,
Cristina Palmero
Abstract:
Human behavior forecasting during human-human interactions is of utmost importance to provide robotic or virtual agents with social intelligence. This problem is especially challenging for scenarios that are highly driven by interpersonal dynamics. In this work, we present the first systematic comparison of state-of-the-art approaches for behavior forecasting. To do so, we leverage whole-body annotations (face, body, and hands) from the very recently released UDIVA v0.5, which features face-to-face dyadic interactions. Our best attention-based approaches achieve state-of-the-art performance in UDIVA v0.5. We show that by autoregressively predicting the future with methods trained for the short-term future (<400ms), we outperform the baselines even for a considerably longer-term future (up to 2s). We also show that this finding holds when highly noisy annotations are used, which opens new horizons towards the use of weakly-supervised learning. Combined with large-scale datasets, this may help boost the advances in this field.
Submitted 7 March, 2022;
originally announced March 2022.
-
Didn't see that coming: a survey on non-verbal social human behavior forecasting
Authors:
German Barquero,
Johnny Núñez,
Sergio Escalera,
Zhen Xu,
Wei-Wei Tu,
Isabelle Guyon,
Cristina Palmero
Abstract:
Non-verbal social human behavior forecasting has increasingly attracted the interest of the research community in recent years. Its direct applications to human-robot interaction and socially-aware human motion generation make it a very attractive field. In this survey, we define the behavior forecasting problem for multiple interactive agents in a generic way that aims at unifying the fields of social signals prediction and human motion forecasting, traditionally separated. We hold that both problem formulations refer to the same conceptual problem, and identify many shared fundamental challenges: future stochasticity, context awareness, history exploitation, etc. We also propose a taxonomy that comprises methods published in the last 5 years in a very informative way and describes the current main concerns of the community with regard to this problem. In order to promote further research in this field, we also provide a summarised and friendly overview of audiovisual datasets featuring non-acted social interactions. Finally, we describe the most common metrics used in this task and their particular issues.
Submitted 4 March, 2022;
originally announced March 2022.
-
Learning by Doing: Controlling a Dynamical System using Causality, Control, and Reinforcement Learning
Authors:
Sebastian Weichwald,
Søren Wengel Mogensen,
Tabitha Edith Lee,
Dominik Baumann,
Oliver Kroemer,
Isabelle Guyon,
Sebastian Trimpe,
Jonas Peters,
Niklas Pfister
Abstract:
Questions in causality, control, and reinforcement learning go beyond the classical machine learning task of prediction under i.i.d. observations. Instead, these fields consider the problem of learning how to actively perturb a system to achieve a certain effect on a response variable. Arguably, they have complementary views on the problem: In control, one usually aims to first identify the system by excitation strategies to then apply model-based design techniques to control the system. In (non-model-based) reinforcement learning, one directly optimizes a reward. In causality, one focus is on identifiability of causal structure. We believe that combining the different views might create synergies and this competition is meant as a first step toward such synergies. The participants had access to observational and (offline) interventional data generated by dynamical systems. Track CHEM considers an open-loop problem in which a single impulse at the beginning of the dynamics can be set, while Track ROBO considers a closed-loop problem in which control variables can be set at each time step. The goal in both tracks is to infer controls that drive the system to a desired state. Code is open-sourced ( https://github.com/LearningByDoingCompetition/learningbydoing-comp ) to reproduce the winning solutions of the competition and to facilitate trying out new methods on the competition tasks.
Submitted 12 February, 2022;
originally announced February 2022.
-
LTU Attacker for Membership Inference
Authors:
Joseph Pedersen,
Rafael Muñoz-Gómez,
Jiangnan Huang,
Haozhe Sun,
Wei-Wei Tu,
Isabelle Guyon
Abstract:
We address the problem of defending predictive models, such as machine learning classifiers (Defender models), against membership inference attacks, in both the black-box and white-box setting, when the trainer and the trained model are publicly released. The Defender aims at optimizing a dual objective: utility and privacy. Both utility and privacy are evaluated with an external apparatus including an Attacker and an Evaluator. On one hand, Reserved data, distributed similarly to the Defender training data, is used to evaluate Utility; on the other hand, Reserved data, mixed with Defender training data, is used to evaluate membership inference attack robustness. In both cases classification accuracy or error rate are used as the metric: Utility is evaluated with the classification accuracy of the Defender model; Privacy is evaluated with the membership prediction error of a so-called "Leave-Two-Unlabeled" LTU Attacker, having access to all of the Defender and Reserved data, except for the membership label of one sample from each. We prove that, under certain conditions, even a "naïve" LTU Attacker can achieve lower bounds on privacy loss with simple attack strategies, leading to concrete necessary conditions to protect privacy, including: preventing over-fitting and adding some amount of randomness. However, we also show that such a naïve LTU Attacker can fail to attack the privacy of models known to be vulnerable in the literature, demonstrating that knowledge must be complemented with strong attack strategies to turn the LTU Attacker into a powerful means of evaluating privacy. Our experiments on the QMNIST and CIFAR-10 datasets validate our theoretical results and confirm the roles of over-fitting prevention and randomness in the algorithms to protect against privacy attacks.
Submitted 4 February, 2022;
originally announced February 2022.
-
Advances in MetaDL: AAAI 2021 challenge and workshop
Authors:
Adrian El Baz,
Isabelle Guyon,
Zhengying Liu,
Jan van Rijn,
Sebastien Treguer,
Joaquin Vanschoren
Abstract:
To stimulate advances in metalearning using deep learning techniques (MetaDL), we organized in 2021 a challenge and an associated workshop. This paper presents the design of the challenge and its results, and summarizes presentations made at the workshop. The challenge focused on few-shot learning classification tasks of small images. Participants' code submissions were run in a uniform manner, under tight computational constraints. This put pressure on solution designs to use existing architecture backbones and/or pre-trained networks. Winning methods featured various classifiers trained on top of the second-to-last layer of popular CNN backbones, fine-tuned on the meta-training data (not necessarily in an episodic manner), then trained on the labeled support and tested on the unlabeled query sets of the meta-test data.
Submitted 1 February, 2022;
originally announced February 2022.
-
OmniPrint: A Configurable Printed Character Synthesizer
Authors:
Haozhe Sun,
Wei-Wei Tu,
Isabelle Guyon
Abstract:
We introduce OmniPrint, a synthetic data generator of isolated printed characters, geared toward machine learning research. It draws inspiration from famous datasets such as MNIST, SVHN and Omniglot, but offers the capability of generating a wide variety of printed characters from various languages, fonts and styles, with customized distortions. We include 935 fonts from 27 scripts and many types of distortions. As a proof of concept, we show various use cases, including an example of a meta-learning dataset designed for the upcoming MetaDL NeurIPS 2021 competition. OmniPrint is available at https://github.com/SunHaozhe/OmniPrint.
Submitted 17 January, 2022;
originally announced January 2022.
-
Winning solutions and post-challenge analyses of the ChaLearn AutoDL challenge 2019
Authors:
Zhengying Liu,
Adrien Pavao,
Zhen Xu,
Sergio Escalera,
Fabio Ferreira,
Isabelle Guyon,
Sirui Hong,
Frank Hutter,
Rongrong Ji,
Julio C. S. Jacques Junior,
Ge Li,
Marius Lindauer,
Zhipeng Luo,
Meysam Madadi,
Thomas Nierhoff,
Kangning Niu,
Chunguang Pan,
Danny Stoll,
Sebastien Treguer,
Jin Wang,
Peng Wang,
Chenglin Wu,
Youcheng Xiong,
Arber Zela,
Yang Zhang
Abstract:
This paper reports the results and post-challenge analyses of ChaLearn's AutoDL challenge series, which helped sort out a profusion of AutoML solutions for Deep Learning (DL) that had been introduced in a variety of settings, but lacked fair comparisons. All input data modalities (time series, images, videos, text, tabular) were formatted as tensors and all tasks were multi-label classification problems. Code submissions were executed on hidden tasks, with limited time and computational resources, pushing solutions that get results quickly. In this setting, DL methods dominated, though popular Neural Architecture Search (NAS) was impractical. Solutions relied on fine-tuned pre-trained networks, with architectures matching data modality. Post-challenge tests did not reveal improvements beyond the imposed time limit. While no component is particularly original or novel, a high-level modular organization emerged featuring a "meta-learner", "data ingestor", "model selector", "model/learner", and "evaluator". This modularity enabled ablation studies, which revealed the importance of (off-platform) meta-learning, ensembling, and efficient data management. Experiments on heterogeneous module combinations further confirm the (local) optimality of the winning solutions. Our challenge legacy includes an ever-lasting benchmark (http://autodl.chalearn.org), the open-sourced code of the winners, and a free "AutoDL self-service".
Submitted 11 January, 2022;
originally announced January 2022.
-
AutoDEUQ: Automated Deep Ensemble with Uncertainty Quantification
Authors:
Romain Egele,
Romit Maulik,
Krishnan Raghavan,
Bethany Lusch,
Isabelle Guyon,
Prasanna Balaprakash
Abstract:
Deep neural networks are powerful predictors for a variety of tasks. However, they do not capture uncertainty directly. Using neural network ensembles to quantify uncertainty is competitive with approaches based on Bayesian neural networks while benefiting from better computational scalability. However, building ensembles of neural networks is a challenging task because, in addition to choosing the right neural architecture or hyperparameters for each member of the ensemble, there is an added cost of training each model. We propose AutoDEUQ, an automated approach for generating an ensemble of deep neural networks. Our approach leverages joint neural architecture and hyperparameter search to generate ensembles. We use the law of total variance to decompose the predictive variance of deep ensembles into aleatoric (data) and epistemic (model) uncertainties. We show that AutoDEUQ outperforms probabilistic backpropagation, Monte Carlo dropout, deep ensemble, distribution-free ensembles, and hyper ensemble methods on a number of regression benchmarks.
Submitted 4 July, 2022; v1 submitted 26 October, 2021;
originally announced October 2021.
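The AutoDEUQ abstract above mentions using the law of total variance to split an ensemble's predictive variance into aleatoric and epistemic parts. A minimal sketch of that decomposition for a single input follows. It assumes each ensemble member outputs a Gaussian mean and variance and that members are weighted equally. The function name `decompose_ensemble_variance` and the example numbers are hypothetical and are not taken from the AutoDEUQ code.

```python
from statistics import fmean, pvariance


def decompose_ensemble_variance(means, variances):
    """Split ensemble predictive variance with the law of total variance.

    means[k] and variances[k] are the predicted mean and variance of
    ensemble member k for one input (an assumed interface).

    Returns (aleatoric, epistemic, total) where
      aleatoric = average of the member variances (data noise),
      epistemic = variance of the member means (model disagreement),
      total     = aleatoric + epistemic.
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal length")
    aleatoric = fmean(variances)
    # Population variance: members are treated as an equally weighted mixture.
    epistemic = pvariance(means)
    return aleatoric, epistemic, aleatoric + epistemic


# Hypothetical predictions from a three-member ensemble for one input.
member_means = [1.0, 2.0, 3.0]
member_variances = [0.5, 0.5, 0.8]
aleatoric, epistemic, total = decompose_ensemble_variance(
    member_means, member_variances
)
print(f"aleatoric={aleatoric:.4f} epistemic={epistemic:.4f} total={total:.4f}")
```

Using the population variance (dividing by the number of members) is a design choice of this sketch; an implementation could instead use the unbiased sample variance.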
-
Codabench: Flexible, Easy-to-Use and Reproducible Benchmarking Platform
Authors:
Zhen Xu,
Sergio Escalera,
Isabelle Guyon,
Adrien Pavão,
Magali Richard,
Wei-Wei Tu,
Quanming Yao,
Huan Zhao
Abstract:
Obtaining standardized crowdsourced benchmarks of computational methods is a major issue in data science communities. Dedicated frameworks enabling fair benchmarking in a unified environment are yet to be developed. Here we introduce Codabench, an open-source, community-driven platform for benchmarking algorithms or software agents versus datasets or tasks. A public instance of Codabench (https://www.codabench.org/) is open to everyone, free of charge, and allows benchmark organizers to fairly compare submissions under the same setting (software, hardware, data, algorithms), with custom protocols and data formats. Codabench has unique features facilitating the organization of benchmarks flexibly, easily and reproducibly, such as the possibility of re-using templates of benchmarks, and supplying compute resources on-demand. Codabench has been used internally and externally on various applications, receiving more than 130 users and 2500 submissions. As illustrative use cases, we introduce 4 diverse benchmarks covering Graph Machine Learning, Cancer Heterogeneity, Clinical Diagnosis and Reinforcement Learning.
Submitted 25 February, 2022; v1 submitted 12 October, 2021;
originally announced October 2021.
-
AutoML Meets Time Series Regression Design and Analysis of the AutoSeries Challenge
Authors:
Zhen Xu,
Wei-Wei Tu,
Isabelle Guyon
Abstract:
Analyzing time series better with limited human effort is of interest to academia and industry. Driven by business scenarios, we organized the first Automated Time Series Regression challenge (AutoSeries) for the WSDM Cup 2020. We present its design, analysis, and post-hoc experiments. The code submission requirement precluded participants from any manual intervention, testing automated machine learning capabilities of solutions, across many datasets, under hardware and time limitations. We prepared 10 datasets from diverse application domains (sales, power consumption, air quality, traffic, and parking), featuring missing data, mixed continuous and categorical variables, and various sampling rates. Each dataset was split into a training and a test sequence (which was streamed, allowing models to continuously adapt). The setting of time series regression differs from classical forecasting in that covariates at the present time are known. Great strides were made by participants to tackle this AutoSeries problem, as demonstrated by the jump in performance from the sample submission, and post-hoc comparisons with AutoGluon. Simple yet effective methods were used, based on feature engineering, LightGBM, and random search hyper-parameter tuning, addressing all aspects of the challenge. Our post-hoc analyses revealed that providing additional time did not yield significant improvements. The winners' code was open-sourced at https://github.com/NehzUx/AutoSeries.
Submitted 27 December, 2021; v1 submitted 28 July, 2021;
originally announced July 2021.
-
ChaLearn Looking at People: Inpainting and Denoising challenges
Authors:
Sergio Escalera,
Marti Soler,
Stephane Ayache,
Umut Guclu,
Jun Wan,
Meysam Madadi,
Xavier Baro,
Hugo Jair Escalante,
Isabelle Guyon
Abstract:
Dealing with incomplete information is a well studied problem in the context of machine learning and computational intelligence. However, in the context of computer vision, the problem has only been studied in specific scenarios (e.g., certain types of occlusions in specific types of images), although it is common to have incomplete information in visual data. This chapter describes the design of an academic competition focusing on inpainting of images and video sequences that was part of the competition program of WCCI2018 and had a satellite event collocated with ECCV2018. The ChaLearn Looking at People Inpainting Challenge aimed at advancing the state of the art on visual inpainting by promoting the development of methods for recovering missing and occluded information from images and video. Three tracks were proposed in which visual inpainting might be helpful but still challenging: human body pose estimation, text overlays removal and fingerprint denoising. This chapter describes the design of the challenge, which includes the release of three novel datasets, and the description of evaluation metrics, baselines and evaluation protocol. The results of the challenge are analyzed and discussed in detail and conclusions derived from this event are outlined.
Submitted 24 June, 2021;
originally announced June 2021.
-
The Tracking Machine Learning challenge : Throughput phase
Authors:
Sabrina Amrouche,
Laurent Basara,
Paolo Calafiura,
Dmitry Emeliyanov,
Victor Estrade,
Steven Farrell,
Cécile Germain,
Vladimir Vava Gligorov,
Tobias Golling,
Sergey Gorbunov,
Heather Gray,
Isabelle Guyon,
Mikhail Hushchyn,
Vincenzo Innocente,
Moritz Kiehn,
Marcel Kunze,
Edward Moyse,
David Rousseau,
Andreas Salzburger,
Andrey Ustyuzhanin,
Jean-Roch Vlimant
Abstract:
This paper reports on the second "Throughput" phase of the Tracking Machine Learning (TrackML) challenge on the Codalab platform. As in the first "Accuracy" phase, the participants had to solve a difficult experimental problem linked to accurately tracking the trajectories of particles such as those created at the Large Hadron Collider (LHC): given O($10^5$) points, the participants had to connect them into O($10^4$) individual groups that represent the particle trajectories, which are approximately helical. While in the first phase only the accuracy mattered, the goal of this second phase was a compromise between the accuracy and the speed of inference. Both were measured on the Codalab platform where the participants had to upload their software. The best three participants had solutions with good accuracy and speed an order of magnitude faster than the state of the art when the challenge was designed. Although the core algorithms were less diverse than in the first phase, a diversity of techniques has been used and is described in this paper. The performance of the algorithms is analysed in depth and lessons are derived.
Submitted 14 May, 2021; v1 submitted 3 May, 2021;
originally announced May 2021.
-
Bayesian Optimization is Superior to Random Search for Machine Learning Hyperparameter Tuning: Analysis of the Black-Box Optimization Challenge 2020
Authors:
Ryan Turner,
David Eriksson,
Michael McCourt,
Juha Kiili,
Eero Laaksonen,
Zhen Xu,
Isabelle Guyon
Abstract:
This paper presents the results and insights from the black-box optimization (BBO) challenge at NeurIPS 2020 which ran from July-October, 2020. The challenge emphasized the importance of evaluating derivative-free optimizers for tuning the hyperparameters of machine learning models. This was the first black-box optimization challenge with a machine learning emphasis. It was based on tuning (validation set) performance of standard machine learning models on real datasets. This competition has widespread impact as black-box optimization (e.g., Bayesian optimization) is relevant for hyperparameter tuning in almost every machine learning project as well as many applications outside of machine learning. The final leaderboard was determined using the optimization performance on held-out (hidden) objective functions, where the optimizers ran without human intervention. Baselines were set using the default settings of several open-source black-box optimization packages as well as random search.
Submitted 31 August, 2021; v1 submitted 20 April, 2021;
originally announced April 2021.
-
Learning to run a Power Network Challenge: a Retrospective Analysis
Authors:
Antoine Marot,
Benjamin Donnot,
Gabriel Dulac-Arnold,
Adrian Kelly,
Aïdan O'Sullivan,
Jan Viebahn,
Mariette Awad,
Isabelle Guyon,
Patrick Panciatici,
Camilo Romero
Abstract:
Power networks, responsible for transporting electricity across large geographical regions, are complex infrastructures on which modern life critically depends. Variations in demand and production profiles, with increasing renewable energy integration, as well as the high voltage network technology, constitute a real challenge for human operators when optimizing electricity transportation while avoiding blackouts. Motivated to investigate the potential of AI methods in enabling adaptability in power network operation, we have designed an L2RPN challenge to encourage the development of reinforcement learning solutions to key problems present in the next-generation power networks. The NeurIPS 2020 competition was well received by the international community, attracting over 300 participants worldwide.
The main contribution of this challenge is our proposed comprehensive 'Grid2Op' framework, and associated benchmark, which plays realistic sequential network operation scenarios. The Grid2Op framework, which is open-source and easily re-usable, allows users to define new environments with its companion GridAlive ecosystem. Grid2Op relies on existing non-linear physical power network simulators and lets users create a series of perturbations and challenges that are representative of two important problems: a) the uncertainty resulting from the increased use of unpredictable renewable energy sources, and b) the robustness required with contingent line disconnections. In this paper, we give the competition highlights. We present the benchmark suite and analyse the winning solutions, including one super-human performance demonstration. We propose our organizational insights for a successful competition and conclude on open research avenues. Given the challenge success, we expect our work will foster research to create more sustainable solutions for power network operations.
Submitted 21 October, 2021; v1 submitted 2 March, 2021;
originally announced March 2021.
-
NeurIPS 2020 Competition: Predicting Generalization in Deep Learning
Authors:
Yiding Jiang,
Pierre Foret,
Scott Yak,
Daniel M. Roy,
Hossein Mobahi,
Gintare Karolina Dziugaite,
Samy Bengio,
Suriya Gunasekar,
Isabelle Guyon,
Behnam Neyshabur
Abstract:
Understanding generalization is arguably one of the most important questions in deep learning. Deep learning has been successfully applied to a large number of problems ranging from pattern recognition to complex decision making, but many researchers have recently raised concerns about deep learning, among which the most important is generalization. Despite numerous attempts, conventional statistical learning approaches have yet to provide a satisfactory explanation of why deep learning works. A recent line of work aims to address the problem by trying to predict the generalization performance through complexity measures. In this competition, we invite the community to propose complexity measures that can accurately predict generalization of models. A robust and general complexity measure would potentially lead to a better understanding of deep learning's underlying mechanism and behavior of deep models on unseen data, or shed light on better generalization bounds. All these outcomes will be important for making deep learning more robust and reliable.
Submitted 14 December, 2020;
originally announced December 2020.
-
AgEBO-Tabular: Joint Neural Architecture and Hyperparameter Search with Autotuned Data-Parallel Training for Tabular Data
Authors:
Romain Egele,
Prasanna Balaprakash,
Venkatram Vishwanath,
Isabelle Guyon,
Zhengying Liu
Abstract:
Developing high-performing predictive models for large tabular data sets is a challenging task. The state-of-the-art methods are based on expert-developed model ensembles from different supervised learning methods. Recently, automated machine learning (AutoML) is emerging as a promising approach to automate predictive model development. Neural architecture search (NAS) is an AutoML approach that generates and evaluates multiple neural network architectures concurrently and improves the accuracy of the generated models iteratively. A key issue in NAS, particularly for large data sets, is the large computation time required to evaluate each generated architecture. While data-parallel training is a promising approach that can address this issue, its use within NAS is difficult. For different data sets, the data-parallel training settings such as the number of parallel processes, learning rate, and batch size need to be adapted to achieve high accuracy and reduction in training time. To that end, we have developed AgEBO-Tabular, an approach to combine aging evolution (AgE), a parallel NAS method that searches over neural architecture space, and an asynchronous Bayesian optimization method for tuning the hyperparameters of the data-parallel training simultaneously. We demonstrate the efficacy of the proposed method to generate high-performing neural network models for large tabular benchmark data sets. Furthermore, we demonstrate that the automatically discovered neural network models using our method outperform the state-of-the-art AutoML ensemble models in inference speed by two orders of magnitude while reaching similar accuracy values.
Submitted 26 October, 2021; v1 submitted 30 October, 2020;
originally announced October 2020.
-
Learning to run a power network challenge for training topology controllers
Authors:
Antoine Marot,
Benjamin Donnot,
Camilo Romero,
Luca Veyrin-Forrer,
Marvin Lerousseau,
Balthazar Donon,
Isabelle Guyon
Abstract:
For power grid operations, a large body of research focuses on using generation redispatching, load shedding or demand side management flexibilities. However, a less costly and potentially more flexible option would be grid topology reconfiguration, as already partially exploited by Coreso (European RSC) and RTE (French TSO) operations. Beyond previous work on branch switching, bus reconfigurations are a broader class of action and could provide some substantial benefits to route electricity and optimize the grid capacity to keep it within safety margins. Because of its non-linear and combinatorial nature, no existing optimal power flow solver can yet tackle this problem. We here propose a new framework to learn topology controllers through imitation and reinforcement learning. We present the design and the results of the first "Learning to Run a Power Network" challenge released with this framework. We finally develop a method providing performance upper-bounds (oracle), which highlights remaining unsolved challenges and suggests future directions of improvement.
Submitted 5 December, 2019;
originally announced December 2019.
-
Synthetic Event Time Series Health Data Generation
Authors:
Saloni Dash,
Ritik Dutta,
Isabelle Guyon,
Adrien Pavao,
Andrew Yale,
Kristin P. Bennett
Abstract:
Synthetic medical data which preserves privacy while maintaining utility can be used as an alternative to real medical data, which has privacy costs and resource constraints associated with it. At present, most models focus on generating cross-sectional health data which is not necessarily representative of real data. In reality, medical data is longitudinal in nature, with a single patient having multiple health events, non-uniformly distributed throughout their lifetime. These events are influenced by patient covariates such as comorbidities, age group, gender etc. as well as external temporal effects (e.g. flu season). While there exist seminal methods to model time series data, it becomes increasingly challenging to extend these methods to medical event time series data. Due to the complexity of the real data, in which each patient visit is an event, we transform the data by using summary statistics to characterize the events for a fixed set of time intervals, to facilitate analysis and interpretability. We then train a generative adversarial network to generate synthetic data. We demonstrate this approach by generating human sleep patterns, from a publicly available dataset. We empirically evaluate the generated data and show close univariate resemblance between synthetic and real data. However, we also demonstrate how stratification by covariates is required to gain a deeper understanding of synthetic data quality.
Submitted 27 November, 2019; v1 submitted 14 November, 2019;
originally announced November 2019.
-
LEAP nets for power grid perturbations
Authors:
Benjamin Donnot,
Balthazar Donon,
Isabelle Guyon,
Zhengying Liu,
Antoine Marot,
Patrick Panciatici,
Marc Schoenauer
Abstract:
We propose a novel neural network embedding approach to model power transmission grids, in which high voltage lines are disconnected and reconnected with one another from time to time, either accidentally or willfully. We call our architecture LEAP net, for Latent Encoding of Atypical Perturbation. Our method implements a form of transfer learning that allows training on a few source domains and then generalizing to new target domains, without learning on any example of those domains. We evaluate the viability of this technique to rapidly assess curative actions that human operators take in emergency situations, using real historical data from the French high voltage power grid.
Submitted 22 August, 2019;
originally announced August 2019.
-
ChaLearn Looking at People: IsoGD and ConGD Large-scale RGB-D Gesture Recognition
Authors:
Jun Wan,
Chi Lin,
Longyin Wen,
Yunan Li,
Qiguang Miao,
Sergio Escalera,
Gholamreza Anbarjafari,
Isabelle Guyon,
Guodong Guo,
Stan Z. Li
Abstract:
The ChaLearn large-scale gesture recognition challenge has been run twice, in two workshops held in conjunction with the International Conference on Pattern Recognition (ICPR) 2016 and the International Conference on Computer Vision (ICCV) 2017, attracting more than $200$ teams around the world. This challenge has two tracks, focusing on isolated and continuous gesture recognition, respectively. This paper describes the creation of both benchmark datasets and analyzes the advances in large-scale gesture recognition based on these two datasets. We discuss the challenges of collecting large-scale ground-truth annotations for gesture recognition, and provide a detailed analysis of the current state-of-the-art methods for large-scale isolated and continuous gesture recognition based on RGB-D video sequences. In addition to recognition rate and mean Jaccard index (MJI), the evaluation metrics used in our previous challenges, we also introduce the corrected segmentation rate (CSR) metric to evaluate the performance of temporal segmentation for continuous gesture recognition. Furthermore, we propose a bidirectional long short-term memory (Bi-LSTM) baseline method, which determines the video division points based on the skeleton points extracted by a convolutional pose machine (CPM). Experiments demonstrate that the proposed Bi-LSTM outperforms the state-of-the-art methods with an absolute improvement of $8.1\%$ (from $0.8917$ to $0.9639$) of CSR.
Submitted 28 July, 2019;
originally announced July 2019.
-
Towards AutoML in the presence of Drift: first results
Authors:
Jorge G. Madrid,
Hugo Jair Escalante,
Eduardo F. Morales,
Wei-Wei Tu,
Yang Yu,
Lisheng Sun-Hosoya,
Isabelle Guyon,
Michele Sebag
Abstract:
Research progress in AutoML has led to state-of-the-art solutions that can cope quite well with supervised learning tasks, e.g., classification with Auto-Sklearn. However, so far these systems do not take into account the changing nature of evolving data over time (i.e., they still assume i.i.d. data), even though such domains are increasingly common in real applications (e.g., spam filtering, user preferences, etc.). We describe a first attempt to develop an AutoML solution for scenarios in which the data distribution changes relatively slowly over time and in which the problem is approached in a lifelong learning setting. We extend Auto-Sklearn with sound and intuitive mechanisms that allow it to cope with this sort of problem. The extended Auto-Sklearn is combined with concept drift detection techniques that allow it to automatically determine when the initial models have to be adapted. We report experimental results on benchmark data from AutoML competitions that adhere to this scenario. Results demonstrate the effectiveness of the proposed methodology.
Submitted 24 July, 2019;
originally announced July 2019.
-
AutoML @ NeurIPS 2018 challenge: Design and Results
Authors:
Hugo Jair Escalante,
Wei-Wei Tu,
Isabelle Guyon,
Daniel L. Silver,
Evelyne Viegas,
Yuqiang Chen,
Wenyuan Dai,
Qiang Yang
Abstract:
We organized a competition on Autonomous Lifelong Machine Learning with Drift that was part of the competition program of NeurIPS 2018. This data-driven competition asked participants to develop computer programs capable of solving supervised learning problems where the i.i.d. assumption did not hold. Large data sets were arranged in a lifelong learning and evaluation scenario, and CodaLab was used as the challenge platform. The challenge attracted more than 300 participants over its two-month duration. This chapter describes the design of the challenge and summarizes its main results.
Submitted 13 March, 2019; v1 submitted 12 March, 2019;
originally announced March 2019.
-
Optimization of computational budget for power system risk assessment
Authors:
Benjamin Donnot,
Isabelle Guyon,
Antoine Marot,
Marc Schoenauer,
Patrick Panciatici
Abstract:
We address the problem of keeping high voltage power transmission networks secure at all times, namely anticipating violations of thermal limits for any possible single line disconnection (whatever its cause may be) by running slow, but accurate, physical grid simulators. New conceptual frameworks are calling for a probabilistic risk-based security criterion. However, these approaches suffer from high requirements in terms of tractability. Here, we propose a new method to assess the risk. This method uses both machine learning techniques (artificial neural networks) and more standard simulators based on physical laws. More specifically, we train neural networks to estimate the overall dangerousness of a grid state. A classical benchmark problem (Matpower 118 buses test case) is used to show the strengths of the proposed method.
Submitted 3 May, 2018;
originally announced May 2018.
-
First Impressions: A Survey on Vision-Based Apparent Personality Trait Analysis
Authors:
Julio C. S. Jacques Junior,
Yağmur Güçlütürk,
Marc Pérez,
Umut Güçlü,
Carlos Andujar,
Xavier Baró,
Hugo Jair Escalante,
Isabelle Guyon,
Marcel A. J. van Gerven,
Rob van Lier,
Sergio Escalera
Abstract:
Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have been by far the most considered cues for analyzing personality. However, there has recently been increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling/evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed.
Submitted 17 July, 2019; v1 submitted 21 April, 2018;
originally announced April 2018.
-
Explaining First Impressions: Modeling, Recognizing, and Explaining Apparent Personality from Videos
Authors:
Hugo Jair Escalante,
Heysem Kaya,
Albert Ali Salah,
Sergio Escalera,
Yagmur Gucluturk,
Umut Guclu,
Xavier Baro,
Isabelle Guyon,
Julio Jacques Junior,
Meysam Madadi,
Stephane Ayache,
Evelyne Viegas,
Furkan Gurpinar,
Achmadnoer Sukma Wicaksana,
Cynthia C. S. Liem,
Marcel A. J. van Gerven,
Rob van Lier
Abstract:
Explainability and interpretability are two critical aspects of decision support systems. Within computer vision, they are critical in certain tasks related to human behavior analysis such as in health care applications. Despite their importance, it is only recently that researchers are starting to explore these aspects. This paper provides an introduction to explainability and interpretability in the context of computer vision with an emphasis on looking at people tasks. Specifically, we review and study those mechanisms in the context of first impressions analysis. To the best of our knowledge, this is the first effort in this direction. Additionally, we describe a challenge we organized on explainability in first impressions analysis from video. We analyze in detail the newly introduced data set, the evaluation protocol, and summarize the results of the challenge. Finally, derived from our study, we outline research opportunities that we foresee will be decisive in the near future for the development of the explainable computer vision field.
Submitted 28 September, 2019; v1 submitted 2 February, 2018;
originally announced February 2018.
-
Fast Power system security analysis with Guided Dropout
Authors:
Benjamin Donnot,
Isabelle Guyon,
Marc Schoenauer,
Antoine Marot,
Patrick Panciatici
Abstract:
We propose a new method to efficiently compute load-flows (the steady-state of the power-grid for given productions, consumptions and grid topology), substituting conventional simulators based on differential equation solvers. We use a deep feed-forward neural network trained with load-flows precomputed by simulation. Our architecture allows a network to be trained on so-called "n-1" problems, in which load flows are evaluated for every possible line disconnection, and then to generalize to "n-2" problems without retraining (a clear advantage because of the combinatorial nature of the problem). To that end, we developed a technique bearing similarity with "dropout", which we named "guided dropout".
Submitted 30 January, 2018;
originally announced January 2018.
-
Introducing machine learning for power system operation support
Authors:
Benjamin Donnot,
Isabelle Guyon,
Marc Schoenauer,
Patrick Panciatici,
Antoine Marot
Abstract:
We address the problem of assisting human dispatchers in operating power grids in today's changing context using machine learning, with the aim of increasing security and reducing costs. Power networks are highly regulated systems, which at all times must meet varying demands of electricity with a complex production system, including conventional power plants, less predictable renewable energies (such as wind or solar power), and the possibility of buying/selling electricity on the international market with more and more actors involved at a European scale. This problem is becoming ever more challenging in an aging network infrastructure. One of the primary goals of dispatchers is to protect equipment (e.g. prevent transmission lines from overheating) with few degrees of freedom: in this paper we consider solely modifications in network topology, i.e. re-configuring the way in which lines, transformers, productions and loads are connected in sub-stations. Using years of historical data collected by the French Transmission Service Operator (TSO) "Réseau de Transport d'Electricité" (RTE), we develop novel machine learning techniques (drawing on "deep learning") to mimic human decisions to devise "remedial actions" that prevent any line from violating power flow limits (so-called "thermal limits"). The proposed technique is hybrid. It does not rely purely on machine learning: every action will be tested with actual simulators before being proposed to the dispatchers or implemented on the grid.
Submitted 27 September, 2017;
originally announced September 2017.
-
Design and Analysis of the NIPS 2016 Review Process
Authors:
Nihar B. Shah,
Behzad Tabibian,
Krikamol Muandet,
Isabelle Guyon,
Ulrike von Luxburg
Abstract:
Neural Information Processing Systems (NIPS) is a top-tier annual conference in machine learning. The 2016 edition of the conference comprised more than 2,400 paper submissions, 3,000 reviewers, and 8,000 attendees. This represents a growth of nearly 40% in terms of submissions, 96% in terms of reviewers, and over 100% in terms of attendees as compared to the previous year. The massive scale as well as rapid growth of the conference calls for a thorough quality assessment of the peer-review process and novel means of improvement. In this paper, we analyze several aspects of the data collected during the review process, including an experiment investigating the efficacy of collecting ordinal rankings from reviewers. Our goal is to check the soundness of the review process, and provide insights that may be useful in the design of the review process of subsequent conferences.
Submitted 23 April, 2018; v1 submitted 31 August, 2017;
originally announced August 2017.
-
ChaLearn Looking at People: A Review of Events and Resources
Authors:
Sergio Escalera,
Xavier Baró,
Hugo Jair Escalante,
Isabelle Guyon
Abstract:
This paper reviews the history of ChaLearn Looking at People (LAP) events. We started in 2011 (with the release of the first Kinect device) to run challenges related to human action/activity and gesture recognition. Since then we have regularly organized events in a series of competitions covering all aspects of visual analysis of humans. So far we have organized more than 10 international challenges and events in this field. This paper reviews associated events, and introduces the ChaLearn LAP platform where public resources (including code, data and preprints of papers) related to the organized events are available. We also provide a discussion on perspectives of ChaLearn LAP activities.
Submitted 15 February, 2017; v1 submitted 10 January, 2017;
originally announced January 2017.
-
Principal motion components for gesture recognition using a single-example
Authors:
Hugo Jair Escalante,
Isabelle Guyon,
Vassilis Athitsos,
Pat Jangyodsuk,
Jun Wan
Abstract:
This paper introduces principal motion components (PMC), a new method for one-shot gesture recognition. In the considered scenario, a single training video is available for each gesture to be recognized, which limits the application of traditional techniques (e.g., HMMs). In PMC, a 2D map of motion energy is obtained for each pair of consecutive frames in a video. Motion maps associated with a video are processed to obtain a PCA model, which is used for recognition under a reconstruction-error approach. The main benefits of the proposed approach are its simplicity, ease of implementation, competitive performance and efficiency. We report experimental results in one-shot gesture recognition using the ChaLearn Gesture Dataset, a benchmark comprising more than 50,000 gestures, recorded as both RGB and depth video with a Kinect camera. Results obtained with PMC are competitive with alternative methods proposed for the same data set.
Submitted 31 January, 2014; v1 submitted 17 October, 2013;
originally announced October 2013.