-
Risk-based Calibration for Probabilistic Classifiers
Authors:
Aritz Pérez,
Carlos Echegoyen,
Guzmán Santafé
Abstract:
We introduce a general iterative procedure called risk-based calibration (RC) designed to minimize the empirical risk under the 0-1 loss (empirical error) for probabilistic classifiers. These classifiers are based on modeling probability distributions, including those constructed from the joint distribution (generative) and those based on the class conditional distribution (conditional). RC can be particularized to any probabilistic classifier provided a specific learning algorithm that computes the classifier's parameters in closed form using data statistics. RC reinforces the statistics aligned with the true class while penalizing those associated with other classes, guided by the 0-1 loss. The proposed method has been empirically tested on 30 datasets using naïve Bayes, quadratic discriminant analysis, and logistic regression classifiers. RC improves the empirical error of the original closed-form learning algorithms and, more notably, consistently outperforms the gradient descent approach with the three classifiers.
Submitted 5 September, 2024;
originally announced September 2024.
-
PitVis-2023 Challenge: Workflow Recognition in videos of Endoscopic Pituitary Surgery
Authors:
Adrito Das,
Danyal Z. Khan,
Dimitrios Psychogyios,
Yitong Zhang,
John G. Hanrahan,
Francisco Vasconcelos,
You Pang,
Zhen Chen,
Jinlin Wu,
Xiaoyang Zou,
Guoyan Zheng,
Abdul Qayyum,
Moona Mazher,
Imran Razzak,
Tianbin Li,
Jin Ye,
Junjun He,
Szymon Płotka,
Joanna Kaleta,
Amine Yamlahi,
Antoine Jund,
Patrick Godau,
Satoshi Kondo,
Satoshi Kasai,
Kousuke Hirasawa
, et al. (7 additional authors not shown)
Abstract:
The field of computer vision applied to videos of minimally invasive surgery is ever-growing. Workflow recognition pertains to the automated recognition of various aspects of a surgery, including which surgical steps are performed and which surgical instruments are used. This information can later be used to assist clinicians when learning the surgery, during live surgery, and when writing operation notes. The Pituitary Vision (PitVis) 2023 Challenge tasks the community with step and instrument recognition in videos of endoscopic pituitary surgery. This is a unique task when compared to other minimally invasive surgeries due to the smaller working space, which limits and distorts vision, and the higher frequency of instrument and step switching, which requires more precise model predictions. Participants were provided with 25 videos, with results presented at the MICCAI-2023 conference as part of the Endoscopic Vision 2023 Challenge in Vancouver, Canada, on 08-Oct-2023. There were 18 submissions from 9 teams across 6 countries, using a variety of deep learning models. A commonality between the top-performing models was incorporating spatio-temporal and multi-task methods, with greater than 50% and 10% macro-F1-score improvement over purely spatial single-task models in step and instrument recognition respectively. The PitVis-2023 Challenge therefore demonstrates that state-of-the-art computer vision models in minimally invasive surgery are transferable to a new dataset, with surgery-specific techniques used to enhance performance, progressing the field further. Benchmark results are provided in the paper, and the dataset is publicly available at: https://doi.org/10.5522/04/26531686.
Submitted 2 September, 2024;
originally announced September 2024.
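The abstract above reports results as macro-F1 scores. As a reminder of what that metric measures, here is a minimal sketch in plain Python: the per-class F1 is computed from true positives, false positives, and false negatives, and the macro score is the unweighted mean over classes. The function name and the toy labels are illustrative assumptions, not taken from the challenge's evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (illustrative helper)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); guard against an empty class.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy frame-level step labels (made up for illustration).
truth = [0, 0, 1, 1, 2]
preds = [0, 1, 1, 1, 2]
score = macro_f1(truth, preds)
```

Because every class contributes equally, rare steps or instruments weigh as much as frequent ones, which is presumably why the metric suits workflow recognition with imbalanced classes.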
-
Integrating Quantum Computing Resources into Scientific HPC Ecosystems
Authors:
Thomas Beck,
Alessandro Baroni,
Ryan Bennink,
Gilles Buchs,
Eduardo Antonio Coello Perez,
Markus Eisenbach,
Rafael Ferreira da Silva,
Muralikrishnan Gopalakrishnan Meena,
Kalyan Gottiparthi,
Peter Groszkowski,
Travis S. Humble,
Ryan Landfield,
Ketan Maheshwari,
Sarp Oral,
Michael A. Sandoval,
Amir Shehata,
In-Saeng Suh,
Christopher Zimmer
Abstract:
Quantum Computing (QC) offers significant potential to enhance scientific discovery in fields such as quantum chemistry, optimization, and artificial intelligence. Yet QC faces challenges due to the noisy intermediate-scale quantum era's inherent external noise issues. This paper discusses the integration of QC as a computational accelerator within classical scientific high-performance computing (HPC) systems. By leveraging a broad spectrum of simulators and hardware technologies, we propose a hardware-agnostic framework for augmenting classical HPC with QC capabilities. Drawing on the HPC expertise of the Oak Ridge National Laboratory (ORNL) and the HPC lifecycle management of the Department of Energy (DOE), our approach focuses on the strategic incorporation of QC capabilities and acceleration into existing scientific HPC workflows. This includes detailed analyses, benchmarks, and code optimization driven by the needs of the DOE and ORNL missions. Our comprehensive framework integrates hardware, software, workflows, and user interfaces to foster a synergistic environment for quantum and classical computing research. This paper outlines plans to unlock new computational possibilities, driving forward scientific inquiry and innovation in a wide array of research domains.
Submitted 28 August, 2024;
originally announced August 2024.
-
Algorithms for Markov Binomial Chains
Authors:
Alejandro Alarcón Gonzalez,
Niel Hens,
Tim Leys,
Guillermo A. Pérez
Abstract:
We study algorithms to analyze a particular class of Markov population processes that is often used in epidemiology. More specifically, Markov binomial chains are the model that arises from stochastic time-discretizations of classical compartmental models. In this work we formalize this class of Markov population processes and focus on the problem of computing the expected time to termination in a given such model. Our theoretical contributions include proving that Markov binomial chains whose flow of individuals through compartments is acyclic almost surely terminate. We give a PSPACE algorithm for the problem of approximating the time to termination and a direct algorithm for the exact problem in the Blum-Shub-Smale model of computation. Finally, we provide a natural encoding of Markov binomial chains into a common input language for probabilistic model checkers. We implemented the latter encoding and present some initial empirical results showcasing what formal methods can do for practicing epidemiologists.
Submitted 9 August, 2024;
originally announced August 2024.
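To make the notion of a binomial chain and its time to termination concrete, the following is a minimal sketch of a discrete-time stochastic SIR model in which the number of individuals moving between compartments at each step is binomially distributed. It estimates the expected time to termination by plain Monte Carlo simulation, which is not the PSPACE or exact algorithm described in the abstract. The infection probability formula, the parameter values, and all function names are assumptions chosen for illustration.

```python
import random


def binomial(n, p, rng):
    """Draw from Binomial(n, p) by summing n Bernoulli trials."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sir_step(s, i, r, beta, gamma, rng):
    """One step of a chain-binomial SIR model (assumed dynamics)."""
    n = s + i + r
    # Assumed form: each susceptible escapes infection from each of the
    # i infected individuals independently with probability 1 - beta / n.
    p_inf = 1.0 - (1.0 - beta / n) ** i
    new_inf = binomial(s, p_inf, rng)
    new_rec = binomial(i, gamma, rng)
    return s - new_inf, i + new_inf - new_rec, r + new_rec


def time_to_termination(s, i, r, beta, gamma, rng):
    """Number of steps until no infected individuals remain."""
    steps = 0
    while i > 0:
        s, i, r = sir_step(s, i, r, beta, gamma, rng)
        steps += 1
    return steps


def estimate_expected_time(runs=1000, seed=0):
    """Monte Carlo estimate for a small, made-up initial population."""
    rng = random.Random(seed)
    total = sum(time_to_termination(20, 1, 0, 0.5, 0.3, rng) for _ in range(runs))
    return total / runs
```

The flow S to I to R in this sketch is acyclic and the recovery probability is positive, so each run stops with probability one, in line with the termination result stated in the abstract.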
-
In-Situ Techniques on GPU-Accelerated Data-Intensive Applications
Authors:
Yi Ju,
Mingshuai Li,
Adalberto Perez,
Laura Bellentani,
Niclas Jansson,
Stefano Markidis,
Philipp Schlatter,
Erwin Laure
Abstract:
The computational power of High-Performance Computing (HPC) systems is constantly increasing; however, their input/output (IO) performance grows relatively slowly, and their storage capacity is also limited. This imbalance presents significant challenges for applications such as Molecular Dynamics (MD) and Computational Fluid Dynamics (CFD), which generate massive amounts of data for further visualization or analysis. At the same time, checkpointing is crucial for long runs on HPC clusters, due to limited walltimes and/or failures of system components, and typically requires the storage of large amounts of data. Thus, restricted IO performance and storage capacity can lead to bottlenecks for the performance of full application workflows (as compared to computational kernels without IO). In-situ techniques, where data is further processed while still in memory rather than written out through the IO subsystem, can help to tackle these problems. In contrast to traditional post-processing methods, in-situ techniques can reduce or avoid the need to write or read data via the IO subsystem. They offer a promising approach for applications aiming to leverage the full power of large-scale HPC systems. In-situ techniques can also be applied to hybrid computational nodes on HPC systems consisting of graphics processing units (GPUs) and central processing units (CPUs). On one node, the GPUs would have significant performance advantages over the CPUs. Therefore, current approaches for GPU-accelerated applications often focus on maximizing GPU usage, leaving CPUs underutilized. In-situ tasks that use CPUs to perform data analysis or preprocess data concurrently with the running simulation offer a possibility to improve this underutilization.
Submitted 30 July, 2024;
originally announced July 2024.
-
Understanding the Impact of Synchronous, Asynchronous, and Hybrid In-Situ Techniques in Computational Fluid Dynamics Applications
Authors:
Yi Ju,
Adalberto Perez,
Stefano Markidis,
Philipp Schlatter,
Erwin Laure
Abstract:
High-Performance Computing (HPC) systems provide input/output (IO) performance growing relatively slowly compared to peak computational performance and have limited storage capacity. Computational Fluid Dynamics (CFD) applications aiming to leverage the full power of Exascale HPC systems, such as the solver Nek5000, will generate massive data for further processing. These data need to be efficiently stored via the IO subsystem. However, limited IO performance and storage capacity may result in performance, and thus scientific discovery, bottlenecks. In comparison to traditional post-processing methods, in-situ techniques can reduce or avoid writing and reading the data through the IO subsystem, promising to be a solution to these problems. In this paper, we study the performance and resource usage of three in-situ use cases: data compression, image generation, and uncertainty quantification. We furthermore analyze three approaches in which these in-situ tasks and the simulation are executed synchronously, asynchronously, or in a hybrid manner. In-situ compression can be used to reduce the IO time and storage requirements while maintaining data accuracy. Furthermore, in-situ visualization and analysis can save terabytes of data from being routed through the IO subsystem to storage. However, the overall efficiency is crucially dependent on the characteristics of both the in-situ task and the simulation. In some cases, the overhead introduced by the in-situ tasks can be substantial. Therefore, it is essential to choose the proper in-situ approach (synchronous, asynchronous, or hybrid) to minimize overhead and maximize the benefits of concurrent execution.
Submitted 30 July, 2024;
originally announced July 2024.
-
MuST: Multi-Scale Transformers for Surgical Phase Recognition
Authors:
Alejandra Pérez,
Santiago Rodríguez,
Nicolás Ayobi,
Nicolás Aparicio,
Eugénie Dessevres,
Pablo Arbeláez
Abstract:
Phase recognition in surgical videos is crucial for enhancing computer-aided surgical systems as it enables automated understanding of sequential procedural stages. Existing methods often rely on fixed temporal windows for video analysis to identify dynamic surgical phases. Thus, they struggle to simultaneously capture the short-, mid-, and long-term information necessary to fully understand complex surgical procedures. To address these issues, we propose Multi-Scale Transformers for Surgical Phase Recognition (MuST), a novel Transformer-based approach that combines a Multi-Term Frame Encoder with a Temporal Consistency Module to capture information across multiple temporal scales of a surgical video. Our Multi-Term Frame Encoder computes interdependencies across a hierarchy of temporal scales by sampling sequences at increasing strides around the frame of interest. Furthermore, we employ a long-term Transformer encoder over the frame embeddings to further enhance long-term reasoning. MuST achieves higher performance than previous state-of-the-art methods on three different public benchmarks.
Submitted 24 July, 2024;
originally announced July 2024.
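The idea of "sampling sequences at increasing strides around the frame of interest" can be illustrated with a small index-selection sketch. This is only one plausible reading of the abstract, not the MuST implementation: the function name, the symmetric window, the default strides, and the clamping at video boundaries are all assumptions.

```python
def multi_scale_indices(center, num_frames, window=5, strides=(1, 2, 4)):
    """Return, for each stride, `window` frame indices centred on `center`.

    Larger strides cover a longer temporal span with the same number of
    frames. Indices falling outside the video are clamped to its ends.
    """
    half = window // 2
    sequences = {}
    for stride in strides:
        raw = [center + (k - half) * stride for k in range(window)]
        sequences[stride] = [min(max(idx, 0), num_frames - 1) for idx in raw]
    return sequences


# Frame 100 of a hypothetical 1000-frame video, sampled at two scales.
scales = multi_scale_indices(100, 1000, window=5, strides=(1, 4))
# scales[1] == [98, 99, 100, 101, 102]
# scales[4] == [92, 96, 100, 104, 108]
```

Each per-stride sequence would then be fed to an encoder, so that short strides capture fine motion while long strides capture phase-level context at the same token budget.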
-
DIDUP: Dynamic Iterative Development for UI Prototyping
Authors:
Jenny Ma,
Karthik Sreedhar,
Vivian Liu,
Sitong Wang,
Pedro Alejandro Perez,
Lydia B. Chilton
Abstract:
Large language models (LLMs) are remarkably good at writing code. A particularly valuable case of human-LLM collaboration is code-based UI prototyping, a method for creating interactive prototypes that allows users to view and fully engage with a user interface. We conduct a formative study of GPT Pilot, a leading LLM-generated code-prototyping system, and find that its inflexibility towards change once development has started leads to weaknesses in failure prevention and dynamic planning; it closely resembles the linear workflow of the waterfall model. We introduce DIDUP, a system for code-based UI prototyping that follows an iterative spiral model, which takes changes and iterations that come up during the development process into account. We propose three novel mechanisms for LLM-generated code-prototyping systems: (1) adaptive planning, where plans should be dynamic and reflect changes during implementation, (2) code injection, where the system should write a minimal amount of code and inject it instead of rewriting code so users have a better mental model of the code evolution, and (3) lightweight state management, a simplified version of source control so users can quickly revert to different working states. Together, these mechanisms enable users to rapidly develop and iterate on prototypes.
Submitted 11 July, 2024;
originally announced July 2024.
-
Around Classical and Intuitionistic Linear Processes
Authors:
Juan C. Jaramillo,
Dan Frumin,
Jorge A. Pérez
Abstract:
Curry-Howard correspondences between Linear Logic (LL) and session types provide a firm foundation for concurrent processes. As the correspondences hold for intuitionistic and classical versions of LL (ILL and CLL), we obtain two different families of type systems for concurrency. An open question remains: how exactly do these two families relate to each other? Based upon a translation from CLL to ILL due to Laurent (2018), we provide two complementary answers, in the form of full abstraction results based on a typed observational equivalence due to Atkey (2017). Our results elucidate hitherto missing formal links between seemingly related yet different type systems for concurrency.
Submitted 22 July, 2024; v1 submitted 8 July, 2024;
originally announced July 2024.
-
On the Challenges of Creating Datasets for Analyzing Commercial Sex Advertisements to Assess Human Trafficking Risk and Organized Activity
Authors:
Pablo Rivas,
Tomas Cerny,
Alejandro Rodriguez Perez,
Javier Turek,
Laurie Giddens,
Gisela Bichler,
Stacie Petter
Abstract:
Our study addresses the challenges of building datasets to understand the risks associated with organized activities and human trafficking through commercial sex advertisements. These challenges include data scarcity, rapid obsolescence, and privacy concerns. Traditional approaches, which are not automated and are difficult to reproduce, fall short in addressing these issues. We have developed a reproducible and automated methodology to analyze five million advertisements. In the process, we identified further challenges in dataset creation within this sensitive domain. This paper presents a streamlined methodology to assist researchers in constructing effective datasets for combating organized crime, allowing them to focus on advancing detection technologies.
Submitted 22 May, 2024;
originally announced May 2024.
-
Scrolly2Reel: Retargeting Graphics for Social Media Using Narrative Beats
Authors:
Duy K. Nguyen,
Jenny Ma,
Pedro Alejandro Perez,
Lydia B. Chilton
Abstract:
Content retargeting is crucial for social media creators. Once great content is created, it is important to reach as broad an audience as possible. This is particularly important in journalism where younger audiences are shifting away from print and towards short-video platforms. Many newspapers already create rich graphics for the web that they want to be able to reuse for social media. One example is scrollytelling sequences or "scrollies" -- immersive articles with graphics like animation, charts, and 3D visualizations that appear as a user scrolls. We present a system that helps transform scrollies into social media videos. By using the scriptwriting concept of narrative beats to extract fundamental storytelling units, we can create videos that are more aligned with narration, and allow for better pacing and stylistic changes. Narrative beats are thus an important primitive to retargeting content that matches the style of a new medium while maintaining the cohesiveness of the original content.
Submitted 19 June, 2024; v1 submitted 26 March, 2024;
originally announced March 2024.
-
Active Learning of Mealy Machines with Timers
Authors:
Véronique Bruyère,
Bharat Garhewal,
Guillermo A. Pérez,
Gaëtan Staquet,
Frits W. Vaandrager
Abstract:
We present the first algorithm for query learning of a general class of Mealy machines with timers (MMTs) in a black-box context. Our algorithm is an extension of the L# algorithm of Vaandrager et al. to a timed setting. Like the algorithm for learning timed automata proposed by Waga, our algorithm is inspired by ideas of Maler & Pnueli. Based on the elementary languages of Maler & Pnueli, both Waga's and our algorithm use symbolic queries, which are then implemented using finitely many concrete queries. However, whereas Waga needs exponentially many concrete queries to implement a single symbolic query, we only need a polynomial number. This is because in order to learn a timed automaton, a learner needs to determine the exact guard and reset for each transition (out of exponentially many possibilities), whereas for learning an MMT a learner only needs to figure out which of the preceding transitions caused a timeout. As shown in our previous work, this can be done efficiently for a subclass of MMTs that are race-avoiding: if a timeout is caused by a preceding input then a slight change in the timing of this input will induce a corresponding change in the timing of the timeout ("wiggling"). Experiments with a prototype implementation, written in Rust, show that our algorithm is able to efficiently learn realistic benchmarks.
Submitted 4 March, 2024;
originally announced March 2024.
-
Synthesis of Hierarchical Controllers Based on Deep Reinforcement Learning Policies
Authors:
Florent Delgrange,
Guy Avni,
Anna Lukina,
Christian Schilling,
Ann Nowé,
Guillermo A. Pérez
Abstract:
We propose a novel approach to the problem of controller design for environments modeled as Markov decision processes (MDPs). Specifically, we consider a hierarchical MDP: a graph with each vertex populated by an MDP called a "room". We first apply deep reinforcement learning (DRL) to obtain low-level policies for each room, scaling to large rooms of unknown structure. We then apply reactive synthesis to obtain a high-level planner that chooses which low-level policy to execute in each room. The central challenge in synthesizing the planner is the need for modeling rooms. We address this challenge by developing a DRL procedure to train concise "latent" policies together with PAC guarantees on their performance. Unlike previous approaches, ours circumvents a model distillation step. Our approach combats sparse rewards in DRL and enables reusability of low-level policies. We demonstrate feasibility in a case study involving agent navigation amid moving obstacles.
Submitted 21 February, 2024;
originally announced February 2024.
-
Continuous Pushdown VASS in One Dimension are Easy
Authors:
Guillermo A. Perez,
Shrisha Rao
Abstract:
A pushdown vector addition system with states (PVASS) extends the model of vector addition systems with a pushdown stack. The algorithmic analysis of PVASS has applications such as static analysis of recursive programs manipulating integer variables. Unfortunately, reachability analysis, even for one-dimensional PVASS, is not known to be decidable. We relax the model of one-dimensional PVASS to make the counter updates continuous and show that in this case reachability, coverability, and boundedness are decidable in polynomial time. In addition, for the extension of the model with lower-bound guards on the states, we show that coverability and reachability are in NP, and boundedness is in coNP.
Submitted 20 February, 2024;
originally announced February 2024.
-
Analyzing Operator States and the Impact of AI-Enhanced Decision Support in Control Rooms: A Human-in-the-Loop Specialized Reinforcement Learning Framework for Intervention Strategies
Authors:
Ammar N. Abbas,
Chidera W. Amazu,
Joseph Mietkiewicz,
Houda Briwa,
Andres Alonzo Perez,
Gabriele Baldissone,
Micaela Demichela,
Georgios G. Chasparis,
John D. Kelleher,
Maria Chiara Leva
Abstract:
In complex industrial and chemical process control rooms, effective decision-making is crucial for safety and efficiency. The experiments in this paper evaluate the impact and applications of an AI-based decision support system integrated into an improved human-machine interface, using dynamic influence diagrams, a hidden Markov model, and deep reinforcement learning. The enhanced support system aims to reduce operator workload, improve situational awareness, and provide different intervention strategies to the operator adapted to the current state of both the system and human performance. Such a system can be particularly useful in cases of information overload when many alarms and inputs are presented all within the same time window, or for junior operators during training. A comprehensive cross-data analysis was conducted, involving 47 participants and a diverse range of data sources such as smartwatch metrics, eye-tracking data, process logs, and responses from questionnaires. The results indicate interesting insights regarding the effectiveness of the approach in aiding decision-making, decreasing perceived workload, and increasing situational awareness for the scenarios considered. Additionally, the results provide valuable insights to compare differences between styles of information gathering when using the system by individual participants. These findings are particularly relevant when predicting the overall performance of the individual participant and their capacity to successfully handle a plant upset and the alarms connected to it using process and human-machine interaction logs in real-time. These predictions enable the development of more effective intervention strategies.
Submitted 20 February, 2024;
originally announced February 2024.
-
Real-World Planning with PDDL+ and Beyond
Authors:
Wiktor Piotrowski,
Alexandre Perez
Abstract:
Real-world applications of AI Planning often require a highly expressive modeling language to accurately capture important intricacies of target systems. Hybrid systems are ubiquitous in the real world, and PDDL+ is the standardized modeling language for capturing such systems as planning domains. PDDL+ enables accurate encoding of mixed discrete-continuous system dynamics, exogenous activity, and many other interesting features exhibited in realistic scenarios. However, the uptake of PDDL+ has been slow and apprehensive, largely due to a general shortage of PDDL+ planning software and rigid limitations of the few existing planners. To overcome this chasm, we present Nyx, a novel PDDL+ planner built to emphasize lightness, simplicity, and, most importantly, adaptability. The planner is designed to be effortlessly customizable to expand its capabilities well beyond the scope of PDDL+. As a result, Nyx can be tailored to virtually any potential real-world application requiring some form of AI Planning, paving the way for wider adoption of planning methods for solving real-world problems.
Submitted 19 February, 2024;
originally announced February 2024.
-
Inform: From Compartmental Models to Stochastic Bounded Counter Machines
Authors:
Tim Leys,
Guillermo A. Perez
Abstract:
Compartmental models are used in epidemiology to capture the evolution of infectious diseases such as COVID-19 in a population by assigning members of it to compartments with labels such as susceptible, infected, and recovered. In a stochastic compartmental model the flow of individuals between compartments is determined probabilistically. We establish that certain stochastic compartmental models can be encoded as probabilistic counter machines where the configurations are bounded. Based on the latter, we obtain simple descriptions of the models in the PRISM language. This enables the analysis of such compartmental models via probabilistic model checkers. Finally, we report on experimental results where we analyze results from a Belgian COVID-19 model using a probabilistic model checker.
Submitted 14 February, 2024;
originally announced February 2024.
-
Detecting $K_{2,3}$ as an induced minor
Authors:
Clément Dallard,
Maël Dumas,
Claire Hilaire,
Martin Milanič,
Anthony Perez,
Nicolas Trotignon
Abstract:
We consider a natural generalization of chordal graphs, in which every minimal separator induces a subgraph with independence number at most $2$. Such graphs can be equivalently defined as graphs that do not contain the complete bipartite graph $K_{2,3}$ as an induced minor, that is, graphs from which $K_{2,3}$ cannot be obtained by a sequence of edge contractions and vertex deletions.
We develop a polynomial-time algorithm for recognizing these graphs. Our algorithm relies on a characterization of $K_{2,3}$-induced minor-free graphs in terms of excluding particular induced subgraphs, called Truemper configurations.
Submitted 13 February, 2024;
originally announced February 2024.
-
LtU-ILI: An All-in-One Framework for Implicit Inference in Astrophysics and Cosmology
Authors:
Matthew Ho,
Deaglan J. Bartlett,
Nicolas Chartier,
Carolina Cuesta-Lazaro,
Simon Ding,
Axel Lapel,
Pablo Lemos,
Christopher C. Lovell,
T. Lucas Makinen,
Chirag Modi,
Viraj Pandya,
Shivam Pandey,
Lucia A. Perez,
Benjamin Wandelt,
Greg L. Bryan
Abstract:
This paper presents the Learning the Universe Implicit Likelihood Inference (LtU-ILI) pipeline, a codebase for rapid, user-friendly, and cutting-edge machine learning (ML) inference in astrophysics and cosmology. The pipeline includes software for implementing various neural architectures, training schemata, priors, and density estimators in a manner easily adaptable to any research workflow. It includes comprehensive validation metrics to assess posterior estimate coverage, enhancing the reliability of inferred results. Additionally, the pipeline is easily parallelizable and is designed for efficient exploration of modeling hyperparameters. To demonstrate its capabilities, we present real applications across a range of astrophysics and cosmology problems, such as: estimating galaxy cluster masses from X-ray photometry; inferring cosmology from matter power spectra and halo point clouds; characterizing progenitors in gravitational wave signals; capturing physical dust parameters from galaxy colors and luminosities; and establishing properties of semi-analytic models of galaxy formation. We also include exhaustive benchmarking and comparisons of all implemented methods as well as discussions about the challenges and pitfalls of ML inference in astronomical sciences. All code and examples are made publicly available at https://github.com/maho3/ltu-ili.
Submitted 2 July, 2024; v1 submitted 6 February, 2024;
originally announced February 2024.
-
Comparing Session Type Systems derived from Linear Logic
Authors:
Bas van den Heuvel,
Jorge A. Pérez
Abstract:
Session types are a typed approach to message-passing concurrency, where types describe sequences of intended exchanges over channels. Session type systems have been given strong logical foundations via Curry-Howard correspondences with linear logic, a resource-aware logic that naturally captures structured interactions. These logical foundations provide an elegant framework to specify and (statically) verify message-passing processes.
In this paper, we rigorously compare different type systems for concurrency derived from the Curry-Howard correspondence between linear logic and session types. We address the main divide between these type systems: the classical and intuitionistic presentations of linear logic. Over the years, these presentations have given rise to separate research strands on logical foundations for concurrency; the differences between their derived type systems have only been addressed informally.
To formally assess these differences, we develop $π\mathsf{ULL}$, a session type system that encompasses type systems derived from classical and intuitionistic interpretations of linear logic. Based on a fragment of Girard's Logic of Unity, $π\mathsf{ULL}$ provides a basic reference framework: we compare existing session type systems by characterizing fragments of $π\mathsf{ULL}$ that coincide with classical and intuitionistic formulations. We analyze the significance of our characterizations by considering the locality principle (enforced by intuitionistic interpretations but not by classical ones) and forms of process composition induced by the interpretations.
Submitted 22 August, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Adiabatic Quantum Support Vector Machines
Authors:
Prasanna Date,
Dong Jun Woun,
Kathleen Hamilton,
Eduardo A. Coello Perez,
Mayanka Chandra Shekhar,
Francisco Rios,
John Gounley,
In-Saeng Suh,
Travis Humble,
Georgia Tourassi
Abstract:
Adiabatic quantum computers can solve difficult optimization problems (e.g., the quadratic unconstrained binary optimization problem), and they seem well suited to train machine learning models. In this paper, we describe an adiabatic quantum approach for training support vector machines. We show that the time complexity of our quantum approach is an order of magnitude better than the classical approach. Next, we compare the test accuracy of our quantum approach against a classical approach that uses the Scikit-learn library in Python across five benchmark datasets (Iris, Wisconsin Breast Cancer (WBC), Wine, Digits, and Lambeq). We show that our quantum approach obtains accuracies on par with the classical approach. Finally, we perform a scalability study in which we compute the total training times of the quantum approach and the classical approach with increasing number of features and number of data points in the training dataset. Our scalability results show that the quantum approach obtains a 3.5--4.5 times speedup over the classical approach on datasets with many (millions of) features.
Submitted 22 January, 2024;
originally announced January 2024.
-
Pixel-Wise Recognition for Holistic Surgical Scene Understanding
Authors:
Nicolás Ayobi,
Santiago Rodríguez,
Alejandra Pérez,
Isabela Hernández,
Nicolás Aparicio,
Eugénie Dessevres,
Sebastián Peña,
Jessica Santander,
Juan Ignacio Caicedo,
Nicolás Fernández,
Pablo Arbeláez
Abstract:
This paper presents the Holistic and Multi-Granular Surgical Scene Understanding of Prostatectomies (GraSP) dataset, a curated benchmark that models surgical scene understanding as a hierarchy of complementary tasks with varying levels of granularity. Our approach enables a multi-level comprehension of surgical activities, encompassing long-term tasks such as surgical phases and steps recognition and short-term tasks including surgical instrument segmentation and atomic visual actions detection. To exploit our proposed benchmark, we introduce the Transformers for Actions, Phases, Steps, and Instrument Segmentation (TAPIS) model, a general architecture that combines a global video feature extractor with localized region proposals from an instrument segmentation model to tackle the multi-granularity of our benchmark. Through extensive experimentation, we demonstrate the impact of including segmentation annotations in short-term recognition tasks, highlight the varying granularity requirements of each task, and establish TAPIS's superiority over previously proposed baselines and conventional CNN-based models. Additionally, we validate the robustness of our method across multiple public benchmarks, confirming the reliability and applicability of our dataset. This work represents a significant step forward in Endoscopic Vision, offering a novel and comprehensive framework for future research towards a holistic understanding of surgical procedures.
Submitted 25 January, 2024; v1 submitted 20 January, 2024;
originally announced January 2024.
-
PAC-Bayes-Chernoff bounds for unbounded losses
Authors:
Ioar Casado,
Luis A. Ortega,
Andrés R. Masegosa,
Aritz Pérez
Abstract:
We introduce a new PAC-Bayes oracle bound for unbounded losses. This result can be understood as a PAC-Bayesian version of the Cramér-Chernoff bound. The proof technique relies on controlling the tails of certain random variables involving the Cramér transform of the loss. We highlight several applications of the main theorem. First, we show that our result naturally allows exact optimization of the free parameter on many PAC-Bayes bounds. Second, we recover and generalize previous results. Finally, we show that our approach allows working with richer assumptions that result in more informative and potentially tighter bounds. In this direction, we provide a general bound under a new "model-dependent bounded CGF" assumption from which we obtain bounds based on parameter norms and log-Sobolev inequalities. All these bounds can be minimized to obtain novel posteriors.
Submitted 6 February, 2024; v1 submitted 2 January, 2024;
originally announced January 2024.
-
SAR-RARP50: Segmentation of surgical instrumentation and Action Recognition on Robot-Assisted Radical Prostatectomy Challenge
Authors:
Dimitrios Psychogyios,
Emanuele Colleoni,
Beatrice Van Amsterdam,
Chih-Yang Li,
Shu-Yu Huang,
Yuchong Li,
Fucang Jia,
Baosheng Zou,
Guotai Wang,
Yang Liu,
Maxence Boels,
Jiayu Huo,
Rachel Sparks,
Prokar Dasgupta,
Alejandro Granados,
Sebastien Ourselin,
Mengya Xu,
An Wang,
Yanan Wu,
Long Bai,
Hongliang Ren,
Atsushi Yamada,
Yuriko Harai,
Yuto Ishikawa,
Kazuyuki Hayashi, et al. (25 additional authors not shown)
Abstract:
Surgical tool segmentation and action recognition are fundamental building blocks in many computer-assisted intervention applications, ranging from surgical skills assessment to decision support systems. Nowadays, learning-based action recognition and segmentation approaches outperform classical methods, relying, however, on large, annotated datasets. Furthermore, action recognition and tool segmentation algorithms are often trained and make predictions in isolation from each other, without exploiting potential cross-task relationships. With the EndoVis 2022 SAR-RARP50 challenge, we release the first multimodal, publicly available, in-vivo dataset for surgical action recognition and semantic instrumentation segmentation, containing 50 suturing video segments of Robotic Assisted Radical Prostatectomy (RARP). The aim of the challenge is twofold. First, to enable researchers to leverage the scale of the provided dataset and develop robust and highly accurate single-task action recognition and tool segmentation approaches in the surgical domain. Second, to further explore the potential of multitask-based learning approaches and determine their comparative advantage against their single-task counterparts. A total of 12 teams participated in the challenge, contributing 7 action recognition methods, 9 instrument segmentation techniques, and 4 multitask approaches that integrated both action recognition and instrument segmentation. The complete SAR-RARP50 dataset is available at: https://rdr.ucl.ac.uk/projects/SARRARP50_Segmentation_of_surgical_instrumentation_and_Action_Recognition_on_Robot-Assisted_Radical_Prostatectomy_Challenge/191091
Submitted 23 January, 2024; v1 submitted 31 December, 2023;
originally announced January 2024.
-
Uncertainty-aware Language Modeling for Selective Question Answering
Authors:
Qi Yang,
Shreya Ravikumar,
Fynn Schmitt-Ulms,
Satvik Lolla,
Ege Demir,
Iaroslav Elistratov,
Alex Lavaee,
Sadhana Lolla,
Elaheh Ahmadi,
Daniela Rus,
Alexander Amini,
Alejandro Perez
Abstract:
We present an automatic large language model (LLM) conversion approach that produces uncertainty-aware LLMs capable of estimating uncertainty with every prediction. Our approach is model- and data-agnostic, is computationally efficient, and does not rely on external models or systems. We evaluate converted models on the selective question answering setting -- to answer as many questions as possible while maintaining a given accuracy, forgoing providing predictions when necessary. As part of our results, we test BERT and Llama 2 model variants on the SQuAD extractive QA task and the TruthfulQA generative QA task. We show that using the uncertainty estimates provided by our approach to selectively answer questions leads to significantly higher accuracy over directly using model probabilities.
Submitted 26 November, 2023;
originally announced November 2023.
-
Combatting Human Trafficking in the Cyberspace: A Natural Language Processing-Based Methodology to Analyze the Language in Online Advertisements
Authors:
Alejandro Rodriguez Perez,
Pablo Rivas
Abstract:
This project tackles the pressing issue of human trafficking in online C2C marketplaces through advanced Natural Language Processing (NLP) techniques. We introduce a novel methodology for generating pseudo-labeled datasets with minimal supervision, serving as a rich resource for training state-of-the-art NLP models. Focusing on tasks like Human Trafficking Risk Prediction (HTRP) and Organized Activity Detection (OAD), we employ cutting-edge Transformer models for analysis. A key contribution is the implementation of an interpretability framework using Integrated Gradients, providing explainable insights crucial for law enforcement. This work not only fills a critical gap in the literature but also offers a scalable, machine learning-driven approach to combat human exploitation online. It serves as a foundation for future research and practical applications, emphasizing the role of machine learning in addressing complex social issues.
Submitted 21 November, 2023;
originally announced November 2023.
-
Time-dependent Probabilistic Generative Models for Disease Progression
Authors:
Onintze Zaballa,
Aritz Pérez,
Elisa Gómez-Inhiesto,
Teresa Acaiturri-Ayesta,
Jose A. Lozano
Abstract:
Electronic health records contain valuable information for monitoring patients' health trajectories over time. Disease progression models have been developed to understand the underlying patterns and dynamics of diseases using these data as sequences. However, analyzing temporal data from EHRs is challenging due to the variability and irregularities present in medical records. We propose a Markovian generative model of treatments developed to (i) model the irregular time intervals between medical events; (ii) classify treatments into subtypes based on the patient sequence of medical events and the time intervals between them; and (iii) segment treatments into subsequences of disease progression patterns. We assume that sequences have an associated structure of latent variables: a latent class representing the different subtypes of treatments; and a set of latent stages indicating the phase of progression of the treatments. We use the Expectation-Maximization algorithm to learn the model, which is efficiently solved with a dynamic programming-based method. Various parametric models have been employed to model the time intervals between medical events during the learning process, including the geometric, exponential, and Weibull distributions. The results demonstrate the effectiveness of our model in recovering the underlying model from data and accurately modeling the irregular time intervals between medical actions.
Submitted 15 November, 2023;
originally announced November 2023.
-
Liquid phase fast electron tomography unravels the true 3D structure of colloidal assemblies
Authors:
Daniel Arenas Esteban,
Da Wang,
Ajinkya Kadu,
Noa Olluyn,
Ana Sánchez Iglesias,
Alejandro Gomez Perez,
Jesus Gonzalez Casablanca,
Stavros Nicolopoulos,
Luis M. Liz-Marzán,
Sara Bals
Abstract:
Electron tomography has become a commonly used tool to investigate the three-dimensional (3D) structure of nanomaterials, including colloidal nanoparticle assemblies. However, electron microscopy is typically carried out under high vacuum conditions. Therefore, pre-treatment sample preparation is needed for assemblies obtained by (wet) colloid chemistry methods, including solvent evaporation and deposition on a solid TEM support. As a result of this procedure, changes are consistently imposed on the actual nanoparticle organization. Therefore, we propose herein the application of electron tomography of nanoparticle assemblies while in their original colloidal liquid environment. To address the challenges related to electron tomography in liquid, we devised a method that combines fast data acquisition in a commercial liquid-TEM cell, with a dedicated alignment and reconstruction workflow. We present the application of this method to two different systems, which exemplify the difference between conventional and liquid tomography, depending on the nature of the protecting ligands. 3D reconstructions of assemblies comprising polystyrene-capped Au nanoparticles encapsulated in polymeric shells revealed less compact and more distorted configurations for experiments performed in a liquid medium compared to their dried counterparts. On the other hand, quantitative analysis of the surface-to-surface distance of self-assembled Au nanorods in water agrees with previously reported dimensions of the ligand layers surrounding the nanorods, which are in much closer contact when in similar but dried assemblies. This study, therefore, emphasizes the importance of developing high-resolution characterization tools that preserve the native environment of colloidal nanostructures.
Submitted 23 November, 2023; v1 submitted 9 November, 2023;
originally announced November 2023.
-
Conversations in Galician: a Large Language Model for an Underrepresented Language
Authors:
Eliseo Bao,
Anxo Pérez,
Javier Parapar
Abstract:
The recent proliferation of Large Conversation Language Models has highlighted the economic significance of widespread access to this type of AI technology in the current information age. Nevertheless, prevailing models have primarily been trained on corpora consisting of documents written in popular languages. The dearth of such cutting-edge tools for low-resource languages further exacerbates their underrepresentation in the current economic landscape, thereby impacting their native speakers. This paper introduces two novel resources designed to enhance Natural Language Processing (NLP) for the Galician language. We present a Galician adaptation of the Alpaca dataset, comprising 52,000 instructions and demonstrations. This dataset proves invaluable for enhancing language models by fine-tuning them to more accurately adhere to provided instructions. Additionally, as a demonstration of the dataset's utility, we fine-tuned LLaMA-7B to comprehend and respond in Galician, a language not originally supported by the model, by following the Alpaca format. This work contributes to the research on multilingual models tailored for low-resource settings, a crucial endeavor in ensuring the inclusion of all linguistic communities in the development of Large Language Models. Another noteworthy aspect of this research is the exploration of how knowledge of a closely related language, in this case, Portuguese, can assist in generating coherent text when training resources are scarce. Both the Galician Alpaca dataset and Cabuxa-7B are publicly accessible on our Huggingface Hub, and we have made the source code available to facilitate replication of this experiment and encourage further advancements for underrepresented languages.
Submitted 7 November, 2023;
originally announced November 2023.
-
Synthesizing Efficiently Monitorable Formulas in Metric Temporal Logic
Authors:
Ritam Raha,
Rajarshi Roy,
Nathanael Fijalkow,
Daniel Neider,
Guillermo A. Perez
Abstract:
In runtime verification, manually formalizing a specification for monitoring system executions is a tedious and error-prone process. To address this issue, we consider the problem of automatically synthesizing formal specifications from system executions. To demonstrate our approach, we consider the popular specification language Metric Temporal Logic (MTL), which is particularly tailored towards specifying temporal properties for cyber-physical systems (CPS). Most of the classical approaches for synthesizing temporal logic formulas aim at minimizing the size of the formula. However, for efficiency in monitoring, along with the size, the amount of "lookahead" required for the specification becomes relevant, especially for safety-critical applications. We formalize this notion and devise a learning algorithm that synthesizes concise formulas having bounded lookahead. To do so, our algorithm reduces the synthesis task to a series of satisfiability problems in Linear Real Arithmetic (LRA) and generates MTL formulas from their satisfying assignments. The reduction uses a novel encoding of a popular MTL monitoring procedure using LRA. Finally, we implement our algorithm in a tool called TEAL and demonstrate its ability to synthesize efficiently monitorable MTL formulas in a CPS application.
Submitted 26 October, 2023;
originally announced October 2023.
-
Field-level simulation-based inference with galaxy catalogs: the impact of systematic effects
Authors:
Natalí S. M. de Santi,
Francisco Villaescusa-Navarro,
L. Raul Abramo,
Helen Shao,
Lucia A. Perez,
Tiago Castro,
Yueying Ni,
Christopher C. Lovell,
Elena Hernandez-Martinez,
Federico Marinacci,
David N. Spergel,
Klaus Dolag,
Lars Hernquist,
Mark Vogelsberger
Abstract:
It has been recently shown that a powerful way to constrain cosmological parameters from galaxy redshift surveys is to train graph neural networks to perform field-level likelihood-free inference without imposing cuts on scale. In particular, de Santi et al. (2023) developed models that could accurately infer the value of $Ω_{\rm m}$ from catalogs that only contain the positions and radial velocities of galaxies that are robust to uncertainties in astrophysics and subgrid models. However, observations are affected by many effects, including 1) masking, 2) uncertainties in peculiar velocities and radial distances, and 3) different galaxy selections. Moreover, observations only allow us to measure redshift, intertwining galaxies' radial positions and velocities. In this paper we train and test our models on galaxy catalogs, created from thousands of state-of-the-art hydrodynamic simulations run with different codes from the CAMELS project, that incorporate these observational effects. We find that, although the presence of these effects degrades the precision and accuracy of the models, and increases the fraction of catalogs where the model breaks down, the fraction of galaxy catalogs where the model performs well is over 90 %, demonstrating the potential of these models to constrain cosmological parameters even when applied to real data.
Submitted 9 May, 2024; v1 submitted 23 October, 2023;
originally announced October 2023.
-
Explainable Depression Symptom Detection in Social Media
Authors:
Eliseo Bao,
Anxo Pérez,
Javier Parapar
Abstract:
Users of social platforms often perceive these sites as supportive spaces to post about their mental health issues. Those conversations contain important traces about individuals' health risks. Recently, researchers have exploited this online information to construct mental health detection models, which aim to identify users at risk on platforms like Twitter, Reddit or Facebook. Most of these models are centred on achieving good classification results, ignoring the explainability and interpretability of the decisions. Recent research has pointed out the importance of using clinical markers, such as the use of symptoms, to improve trust in the computational models by health professionals. In this paper, we propose using transformer-based architectures to detect and explain the appearance of depressive symptom markers in the users' writings. We present two approaches: i) train a model to classify, and another one to explain the classifier's decision separately and ii) unify the two tasks simultaneously using a single model. Additionally, for this latter manner, we also investigated the performance of recent conversational LLMs when using in-context learning. Our natural language explanations enable clinicians to interpret the models' decisions based on validated symptoms, enhancing trust in the automated process. We evaluate our approach using recent symptom-based datasets, employing both offline and expert-in-the-loop metrics to assess the quality of the explanations generated by our models. The experimental results show that it is possible to achieve good classification results while generating interpretable symptom-based explanations.
Submitted 20 August, 2024; v1 submitted 20 October, 2023;
originally announced October 2023.
-
Speeding-up Evolutionary Algorithms to solve Black-Box Optimization Problems
Authors:
Judith Echevarrieta,
Etor Arza,
Aritz Pérez
Abstract:
Population-based evolutionary algorithms are often considered when approaching computationally expensive black-box optimization problems. They employ a selection mechanism to choose the best solutions from a given population after comparing their objective values, which are then used to generate the next population. This iterative process explores the solution space efficiently, leading to improved solutions over time. However, these algorithms require a large number of evaluations to provide a quality solution, which might be computationally expensive when the evaluation cost is high. In some cases, it is possible to replace the original objective function with a less accurate approximation of lower cost. This introduces a trade-off between the evaluation cost and its accuracy.
In this paper, we propose a technique capable of choosing an appropriate approximate function cost during the execution of the optimization algorithm. The proposal finds the minimum evaluation cost at which the solutions are still properly ranked, and consequently, more evaluations can be computed in the same amount of time with minimal accuracy loss. An experimental section on four very different problems reveals that the proposed approach can reach the same objective value in less than half of the time in certain cases.
Submitted 29 January, 2024; v1 submitted 23 September, 2023;
originally announced September 2023.
-
Superpixels algorithms through network community detection
Authors:
Anthony Perez
Abstract:
Community detection is a powerful tool from complex networks analysis that finds applications in various research areas. Several image segmentation methods rely, for instance, on community detection algorithms as a black box in order to compute undersegmentations, i.e. a small number of regions that represent areas of interest of the image. However, to the best of our knowledge, the efficiency of such an approach w.r.t. superpixels, which aim at representing the image at a smaller level while preserving as much of the original information as possible, has been neglected so far. The only related work seems to be the one by Liu et al. (IET Image Processing, 2022), who developed a superpixels algorithm using a so-called modularity maximization approach, leading to relevant results. We follow this line of research by studying the efficiency of superpixels computed by state-of-the-art community detection algorithms on a 4-connected pixel graph, the so-called pixel-grid. We first detect communities on such a graph and then apply a simple merging procedure that yields the desired number of superpixels. As we shall see, such methods result in the computation of relevant superpixels, as emphasized by both qualitative and quantitative experiments, according to different widely-used metrics based on ground-truth comparison or on superpixels only. We observe that the choice of the community detection algorithm has a great impact on the number of communities and hence on the merging procedure. Similarly, small variations on the pixel-grid may provide different results from both qualitative and quantitative viewpoints. For the sake of completeness, we compare our results with those of several state-of-the-art superpixels algorithms as computed by Stutz et al. (Computer Vision and Image Understanding, 2018).
Submitted 27 August, 2023;
originally announced August 2023.
-
Integer Programming with GCD Constraints
Authors:
Rémy Defossez,
Christoph Haase,
Alessio Mansutti,
Guillermo A. Perez
Abstract:
We study the non-linear extension of integer programming with greatest common divisor constraints of the form $\gcd(f,g) \sim d$, where $f$ and $g$ are linear polynomials, $d$ is a positive integer, and $\sim$ is a relation among $\leq, =, \neq$ and $\geq$. We show that the feasibility problem for these systems is in NP, and that an optimal solution minimizing a linear objective function, if it exists, has polynomial bit length. To show these results, we identify an expressive fragment of the existential theory of the integers with addition and divisibility that admits solutions of polynomial bit length. It was shown by Lipshitz [Trans. Am. Math. Soc., 235, pp. 271-283, 1978] that this theory adheres to a local-to-global principle in the following sense: a formula $Φ$ is equi-satisfiable with a formula $Ψ$ in this theory such that $Ψ$ has a solution if and only if $Ψ$ has a solution modulo every prime $p$. We show that in our fragment, only a polynomial number of primes of polynomial bit length need to be considered, and that the solutions modulo prime numbers can be combined to yield a solution to $Φ$ of polynomial bit length. As a technical by-product, we establish a Chinese-remainder-type theorem for systems of congruences and non-congruences showing that solution sizes do not depend on the magnitude of the moduli of non-congruences.
Submitted 25 August, 2023;
originally announced August 2023.
-
DepreSym: A Depression Symptom Annotated Corpus and the Role of LLMs as Assessors of Psychological Markers
Authors:
Anxo Pérez,
Marcos Fernández-Pichel,
Javier Parapar,
David E. Losada
Abstract:
Computational methods for depression detection aim to mine traces of depression from online publications posted by Internet users. However, solutions trained on existing collections exhibit limited generalisation and interpretability. To tackle these issues, recent studies have shown that identifying depressive symptoms can lead to more robust models. The eRisk initiative fosters research on this area and has recently proposed a new ranking task focused on developing search methods to find sentences related to depressive symptoms. This search challenge relies on the symptoms specified by the Beck Depression Inventory-II (BDI-II), a questionnaire widely used in clinical practice. Based on the participant systems' results, we present the DepreSym dataset, consisting of 21580 sentences annotated according to their relevance to the 21 BDI-II symptoms. The labelled sentences come from a pool of diverse ranking methods, and the final dataset serves as a valuable resource for advancing the development of models that incorporate depressive markers such as clinical symptoms. Due to the complex nature of this relevance annotation, we designed a robust assessment methodology carried out by three expert assessors (including an expert psychologist). Additionally, we explore here the feasibility of employing recent Large Language Models (ChatGPT and GPT4) as potential assessors in this complex task. We undertake a comprehensive examination of their performance, determine their main limitations and analyze their role as a complement or replacement for human annotators.
Submitted 21 August, 2023;
originally announced August 2023.
-
Formally-Sharp DAgger for MCTS: Lower-Latency Monte Carlo Tree Search using Data Aggregation with Formal Methods
Authors:
Debraj Chakraborty,
Damien Busatto-Gaston,
Jean-François Raskin,
Guillermo A. Pérez
Abstract:
We study how to efficiently combine formal methods, Monte Carlo Tree Search (MCTS), and deep learning in order to produce high-quality receding horizon policies in large Markov Decision processes (MDPs). In particular, we use model-checking techniques to guide the MCTS algorithm in order to generate offline samples of high-quality decisions on a representative set of states of the MDP. Those samples can then be used to train a neural network that imitates the policy used to generate them. This neural network can either be used as a guide on a lower-latency MCTS online search, or alternatively be used as a full-fledged policy when minimal latency is required. We use statistical model checking to detect when additional samples are needed and to focus those additional samples on configurations where the learnt neural network policy differs from the (computationally-expensive) offline policy. We illustrate the use of our method on MDPs that model the Frozen Lake and Pac-Man environments -- two popular benchmarks to evaluate reinforcement-learning algorithms.
Submitted 15 August, 2023;
originally announced August 2023.
-
Termination in Concurrency, Revisited
Authors:
Joseph W. N. Paulus,
Jorge A. Pérez,
Daniele Nantes-Sobrinho
Abstract:
Termination is a central property in sequential programming models: a term is terminating if all its reduction sequences are finite. Termination is also important in concurrency in general, and for message-passing programs in particular. A variety of type systems that enforce termination by typing have been developed. In this paper, we rigorously compare several type systems for $π$-calculus processes from the unifying perspective of termination. Adopting session types as reference framework, we consider two different type systems: one follows Deng and Sangiorgi's weight-based approach; the other is Caires and Pfenning's Curry-Howard correspondence between linear logic and session types. Our technical results precisely connect these very different type systems, and shed light on the classes of client/server interactions they admit as correct.
Submitted 2 August, 2023;
originally announced August 2023.
-
CONSTRUCT: A Program Synthesis Approach for Reconstructing Control Algorithms from Embedded System Binaries in Cyber-Physical Systems
Authors:
Ali Shokri,
Alexandre Perez,
Souma Chowdhury,
Chen Zeng,
Gerald Kaloor,
Ion Matei,
Peter-Patel Schneider,
Akshith Gunasekaran,
Shantanu Rane
Abstract:
We introduce a novel approach to automatically synthesize a mathematical representation of the control algorithms implemented in industrial cyber-physical systems (CPS), given the embedded system binary. The output model can be used by subject matter experts to assess the system's compliance with the expected behavior and for a variety of forensic applications. Our approach first performs static analysis on decompiled binary files of the controller to create a sketch of the mathematical representation. Then, we perform an evolutionary-based search to find the correct semantic for the created representation, i.e., the control law. We demonstrate the effectiveness of the introduced approach in practice via three case studies conducted on two real-life industrial CPS.
Submitted 31 July, 2023;
originally announced August 2023.
-
Capsa: A Unified Framework for Quantifying Risk in Deep Neural Networks
Authors:
Sadhana Lolla,
Iaroslav Elistratov,
Alejandro Perez,
Elaheh Ahmadi,
Daniela Rus,
Alexander Amini
Abstract:
The modern pervasiveness of large-scale deep neural networks (NNs) is driven by their extraordinary performance on complex problems but is also plagued by their sudden, unexpected, and often catastrophic failures, particularly on challenging scenarios. Existing algorithms that provide risk-awareness to NNs are complex and ad-hoc. Specifically, these methods require significant engineering changes, are often developed only for particular settings, and are not easily composable. Here we present capsa, a framework for extending models with risk-awareness. Capsa provides a methodology for quantifying multiple forms of risk and composing different algorithms together to quantify different risk metrics in parallel. We validate capsa by implementing state-of-the-art uncertainty estimation algorithms within the capsa framework and benchmarking them on complex perception datasets. We demonstrate capsa's ability to easily compose aleatoric uncertainty, epistemic uncertainty, and bias estimation together in a single procedure, and show how this approach provides a comprehensive awareness of NN risk.
Submitted 31 July, 2023;
originally announced August 2023.
-
An improved kernelization algorithm for Trivially Perfect Editing
Authors:
Maël Dumas,
Anthony Perez
Abstract:
In the Trivially Perfect Editing problem one is given an undirected graph $G = (V,E)$ and an integer $k$ and seeks to add or delete at most $k$ edges in $G$ to obtain a trivially perfect graph. In a recent work, Dumas, Perez and Todinca [Algorithmica 2023] proved that this problem admits a kernel with $O(k^3)$ vertices. This result heavily relies on the fact that the size of trivially perfect modules can be bounded by $O(k^2)$ as shown by Drange and Pilipczuk [Algorithmica 2018]. To obtain their cubic vertex-kernel, Dumas, Perez and Todinca [Algorithmica 2023] then showed that a more intricate structure, a so-called comb, can be reduced to $O(k^2)$ vertices. In this work we show that the bound can be improved to $O(k)$ for both aforementioned structures and thus obtain a kernel with $O(k^2)$ vertices. Our approach relies on the straightforward yet powerful observation that any large enough structure contains unaffected vertices whose neighborhood remains unchanged by an editing of size $k$, implying strong structural properties.
Submitted 26 October, 2023; v1 submitted 29 June, 2023;
originally announced June 2023.
-
Challenges and Opportunities for RISC-V Architectures towards Genomics-based Workloads
Authors:
Gonzalo Gomez-Sanchez,
Aaron Call,
Xavier Teruel,
Lorena Alonso,
Ignasi Moran,
Miguel Angel Perez,
David Torrents,
Josep Ll. Berral
Abstract:
The use of large-scale supercomputing architectures is a hard requirement for scientific computing Big-Data applications. An example is genomics analytics, where millions of data transformations and tests per patient need to be done to find relevant clinical indicators. Therefore, to ensure open and broad access to high-performance technologies, governments and academia are pushing toward the introduction of novel computing architectures in large-scale scientific environments. This is the case of RISC-V, an open-source and royalty-free instruction-set architecture. To evaluate such technologies, here we present the Variant-Interaction Analytics use case benchmarking suite and datasets. Through this use case, we search for possible genetic interactions using computational and statistical methods, providing a representative case for heavy ETL (Extract, Transform, Load) data processing. Current implementations run on x86-based supercomputers (e.g. MareNostrum-IV at the Barcelona Supercomputing Center (BSC)), and future steps propose RISC-V as part of the next MareNostrum generations. Here we describe the Variant Interaction use case, highlighting the characteristics that leverage high-performance computing and indicating the caveats and challenges for upcoming RISC-V developments and designs, based on a first comparison between x86 and RISC-V architectures on real Variant Interaction executions over real hardware implementations.
Submitted 27 June, 2023;
originally announced June 2023.
-
Structural Restricted Boltzmann Machine for image denoising and classification
Authors:
Arkaitz Bidaurrazaga,
Aritz Pérez,
Roberto Santana
Abstract:
Restricted Boltzmann Machines are generative models that consist of a layer of hidden variables connected to another layer of visible units, and they are used to model the distribution over visible variables. In order to gain higher representational power, many hidden units are commonly used, which, in combination with a large number of visible units, leads to a high number of trainable parameters. In this work we introduce the Structural Restricted Boltzmann Machine model, which, taking advantage of the structure of the data at hand, constrains connections of hidden units to subsets of visible units in order to significantly reduce the number of trainable parameters without compromising performance. As a possible area of application, we focus on image modelling. Based on the nature of the images, the structure of the connections is given in terms of spatial neighbourhoods over the pixels of the image that constitute the visible variables of the model. We conduct extensive experiments on various image domains. Image denoising is evaluated with corrupted images from the MNIST dataset. The generative power of our models is compared to vanilla RBMs, as well as their classification performance, which is assessed with five different image domains. Results show that our proposed model trains faster and more stably, while also obtaining better results compared to an RBM with no constrained connections between its visible and hidden units.
Submitted 16 June, 2023;
originally announced June 2023.
-
Efficient Learning of Minimax Risk Classifiers in High Dimensions
Authors:
Kartheek Bondugula,
Santiago Mazuelas,
Aritz Pérez
Abstract:
High-dimensional data is common in multiple areas, such as health care and genomics, where the number of features can be tens of thousands. In such scenarios, the large number of features often leads to inefficient learning. Constraint generation methods have recently enabled efficient learning of L1-regularized support vector machines (SVMs). In this paper, we leverage such methods to obtain an efficient learning algorithm for the recently proposed minimax risk classifiers (MRCs). The proposed iterative algorithm also provides a sequence of worst-case error probabilities and performs feature selection. Experiments on multiple high-dimensional datasets show that the proposed algorithm is efficient in high-dimensional scenarios. In addition, the worst-case error probability provides useful information about the classifier performance, and the features selected by the algorithm are competitive with the state-of-the-art.
Submitted 11 June, 2023;
originally announced June 2023.
-
Monitoring Blackbox Implementations of Multiparty Session Protocols
Authors:
Bas van den Heuvel,
Jorge A. Pérez,
Rares A. Dobre
Abstract:
We present a framework for the distributed monitoring of networks of components that coordinate by message-passing, following multiparty session protocols specified as global types. We improve over prior works by (i) supporting components whose exact specification is unknown ("blackboxes") and (ii) covering protocols that cannot be analyzed by existing techniques. We first give a procedure for synthesizing monitors for blackboxes from global types, and precisely define when a blackbox correctly satisfies its global type. Then, we prove that monitored blackboxes are sound (they correctly follow the protocol) and transparent (blackboxes with and without monitors are behaviorally equivalent).
Submitted 3 October, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
-
Bi-Objective Lexicographic Optimization in Markov Decision Processes with Related Objectives
Authors:
Damien Busatto-Gaston,
Debraj Chakraborty,
Anirban Majumdar,
Sayan Mukherjee,
Guillermo A. Pérez,
Jean-François Raskin
Abstract:
We consider lexicographic bi-objective problems on Markov Decision Processes (MDPs), where we optimize one objective while guaranteeing optimality of another. We propose a two-stage technique for solving such problems when the objectives are related (in a way that we formalize). We instantiate our technique for two natural pairs of objectives: minimizing the (conditional) expected number of steps to a target while guaranteeing the optimal probability of reaching it; and maximizing the (conditional) expected average reward while guaranteeing an optimal probability of staying safe (w.r.t. some safe set of states). For the first combination of objectives, which covers the classical frozen lake environment from reinforcement learning, we also report on experiments performed using a prototype implementation of our algorithm and compare it with what can be obtained from state-of-the-art probabilistic model checkers solving optimal reachability.
Submitted 15 August, 2023; v1 submitted 16 May, 2023;
originally announced May 2023.
-
Automata with Timers
Authors:
Véronique Bruyère,
Guillermo A. Pérez,
Gaëtan Staquet,
Frits W. Vaandrager
Abstract:
In this work, we study properties of deterministic finite-state automata with timers, a subclass of timed automata proposed by Vaandrager et al. as a candidate for an efficiently learnable timed model. We first study the complexity of the configuration reachability problem for such automata and establish that it is PSPACE-complete. Then, as simultaneous timeouts (which we call races) can occur in timed runs of such automata, we study the problem of determining whether it is possible to modify the delays between the actions in a run so as to avoid such races. The absence of races is important for modelling purposes and to streamline learning of automata with timers. We provide an effective characterization of when an automaton is race-avoiding and establish that the related decision problem is in 3EXP and PSPACE-hard.
Submitted 12 May, 2023;
originally announced May 2023.
-
On the Fair Comparison of Optimization Algorithms in Different Machines
Authors:
Etor Arza,
Josu Ceberio,
Ekhiñe Irurozki,
Aritz Pérez
Abstract:
An experimental comparison of two or more optimization algorithms requires the same computational resources to be assigned to each algorithm. When a maximum runtime is set as the stopping criterion, all algorithms need to be executed on the same machine if they are to use the same resources. Unfortunately, the implementation code of the algorithms is not always available, which means that running the algorithms to be compared on the same machine is not always possible. And even if the code is available, some optimization algorithms might be costly to run, such as training large neural networks in the cloud.
In this paper, we consider the following problem: how do we compare the performance of a new optimization algorithm B with a known algorithm A in the literature if we only have the results (the objective values) and the runtime in each instance of algorithm A? Particularly, we present a methodology that enables a statistical analysis of the performance of algorithms executed in different machines. The proposed methodology has two parts. First, we propose a model that, given the runtime of an algorithm in a machine, estimates the runtime of the same algorithm in another machine. This model can be adjusted so that the probability of estimating a runtime longer than what it should be is arbitrarily low. Second, we introduce an adaptation of the one-sided sign test that uses a modified p-value and takes into account that probability. Such adaptation avoids increasing the probability of type I error associated with executing algorithms A and B in different machines.
Submitted 7 August, 2023; v1 submitted 12 May, 2023;
originally announced May 2023.
-
Graph-Based Reductions for Parametric and Weighted MDPs
Authors:
Kasper Engelen,
Guillermo A. Pérez,
Shrisha Rao
Abstract:
We study the complexity of reductions for weighted reachability in parametric Markov decision processes. That is, we say a state p is never worse than q if for all valuations of the polynomial indeterminates it is the case that the maximal expected weight that can be reached from p is greater than the same value from q. In terms of computational complexity, we establish that determining whether p is never worse than q is coETR-complete. On the positive side, we give a polynomial-time algorithm to compute the equivalence classes of the order we study for Markov chains. Additionally, we describe and implement two inference rules to under-approximate the never-worse relation and empirically show that it can be used as an efficient preprocessing step for the analysis of large Markov decision processes.
Submitted 9 May, 2023;
originally announced May 2023.
-
Wasserstein Auto-encoded MDPs: Formal Verification of Efficiently Distilled RL Policies with Many-sided Guarantees
Authors:
Florent Delgrange,
Ann Nowé,
Guillermo A. Pérez
Abstract:
Although deep reinforcement learning (DRL) has many success stories, the large-scale deployment of policies learned through these advanced techniques in safety-critical scenarios is hindered by their lack of formal guarantees. Variational Markov Decision Processes (VAE-MDPs) are discrete latent space models that provide a reliable framework for distilling formally verifiable controllers from any RL policy. While the related guarantees address relevant practical aspects such as the satisfaction of performance and safety properties, the VAE approach suffers from several learning flaws (posterior collapse, slow learning speed, poor dynamics estimates), primarily due to the absence of abstraction and representation guarantees to support latent optimization. We introduce the Wasserstein auto-encoded MDP (WAE-MDP), a latent space model that fixes those issues by minimizing a penalized form of the optimal transport between the behaviors of the agent executing the original policy and the distilled policy, for which the formal guarantees apply. Our approach yields bisimulation guarantees while learning the distilled policy, allowing concrete optimization of the abstraction and representation model quality. Our experiments show that, besides distilling policies up to 10 times faster, the latent model quality is indeed better in general. Moreover, we present experiments from a simple time-to-failure verification algorithm on the latent space. The fact that our approach enables such simple verification techniques highlights its applicability.
Submitted 21 April, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.