-
Who Let the Guards Out: Visual Support for Patrolling Games
Authors:
Matěj Lang,
Adam Štěpánek,
Róbert Zvara,
Vojtěch Řehák,
Barbora Kozlíková
Abstract:
Effective security patrol management is critical for ensuring safety in diverse environments such as art galleries, airports, and factories. The behavior of patrols in these situations can be modeled by patrolling games. They simulate the behavior of the patrol and adversary in the building, which is modeled as a graph of interconnected nodes representing rooms. The designers of algorithms solving the game face the problem of analyzing complex graph layouts with temporal dependencies. Therefore, appropriate visual support is crucial for them to work effectively. In this paper, we present a novel tool that helps the designers of patrolling games explore the outcomes of the proposed algorithms and approaches, evaluate their success rate, and propose modifications that can improve their solutions. Our tool offers an intuitive and interactive interface, featuring a detailed exploration of patrol routes and probabilities of taking them, simulation of patrols, and other requested features. In close collaboration with experts in designing patrolling games, we conducted three case studies demonstrating the usage and usefulness of our tool. The prototype of the tool, along with exemplary datasets, is available at https://gitlab.fi.muni.cz/formela/strategy-vizualizer.
Submitted 26 July, 2024;
originally announced July 2024.
-
Graph Neural Networks: A suitable Alternative to MLPs in Latent 3D Medical Image Classification?
Authors:
Johannes Kiechle,
Daniel M. Lang,
Stefan M. Fischer,
Lina Felsner,
Jan C. Peeken,
Julia A. Schnabel
Abstract:
Recent studies have underscored the capabilities of natural imaging foundation models to serve as powerful feature extractors, even in a zero-shot setting for medical imaging data. Most commonly, a shallow multi-layer perceptron (MLP) is appended to the feature extractor to facilitate end-to-end learning and downstream prediction tasks such as classification, thus representing the de facto standard. However, as graph neural networks (GNNs) have become a practicable choice for various tasks in medical research in the recent past, we direct attention to the question of how effective GNNs are compared to MLP prediction heads for the task of 3D medical image classification, proposing them as a potential alternative. In our experiments, we devise a subject-level graph for each volumetric dataset instance. Therein latent representations of all slices in the volume, encoded through a DINOv2 pretrained vision transformer (ViT), constitute the nodes and their respective node features. We use public datasets to compare the classification heads numerically and evaluate various graph construction and graph convolution methods in our experiments. Our findings show enhancements of the GNN in classification performance and substantial improvements in runtime compared to an MLP prediction head. Additional robustness evaluations further validate the promising performance of the GNN, promoting them as a suitable alternative to traditional MLP classification heads. Our code is publicly available at: https://github.com/compai-lab/2024-miccai-grail-kiechle
Submitted 24 July, 2024;
originally announced July 2024.
-
MedEdit: Counterfactual Diffusion-based Image Editing on Brain MRI
Authors:
Malek Ben Alaya,
Daniel M. Lang,
Benedikt Wiestler,
Julia A. Schnabel,
Cosmin I. Bercea
Abstract:
Denoising diffusion probabilistic models enable high-fidelity image synthesis and editing. In biomedicine, these models facilitate counterfactual image editing, producing pairs of images where one is edited to simulate hypothetical conditions. For example, they can model the progression of specific diseases, such as stroke lesions. However, current image editing techniques often fail to generate realistic biomedical counterfactuals, either by inadequately modeling indirect pathological effects like brain atrophy or by excessively altering the scan, which disrupts correspondence to the original images. Here, we propose MedEdit, a conditional diffusion model for medical image editing. MedEdit induces pathology in specific areas while balancing the modeling of disease effects and preserving the integrity of the original scan. We evaluated MedEdit on the Atlas v2.0 stroke dataset using Frechet Inception Distance and Dice scores, outperforming state-of-the-art diffusion-based methods such as Palette (by 45%) and SDEdit (by 61%). Additionally, clinical evaluations by a board-certified neuroradiologist confirmed that MedEdit generated realistic stroke scans indistinguishable from real ones. We believe this work will enable counterfactual image editing research to further advance the development of realistic and clinically useful imaging tools.
Submitted 21 July, 2024;
originally announced July 2024.
-
Enhancing the Utility of Privacy-Preserving Cancer Classification using Synthetic Data
Authors:
Richard Osuala,
Daniel M. Lang,
Anneliese Riess,
Georgios Kaissis,
Zuzanna Szafranowska,
Grzegorz Skorupko,
Oliver Diaz,
Julia A. Schnabel,
Karim Lekadir
Abstract:
Deep learning holds immense promise for aiding radiologists in breast cancer detection. However, achieving optimal model performance is hampered by limitations in availability and sharing of data commonly associated with patient privacy concerns. Such concerns are further exacerbated, as traditional deep learning models can inadvertently leak sensitive training information. This work addresses these challenges by exploring and quantifying the utility of privacy-preserving deep learning techniques, concretely, (i) differentially private stochastic gradient descent (DP-SGD) and (ii) fully synthetic training data generated by our proposed malignancy-conditioned generative adversarial network. We assess these methods via downstream malignancy classification of mammography masses using a transformer model. Our experimental results show that synthetic data augmentation can improve privacy-utility tradeoffs in differentially private model training. Further, model pretraining on synthetic data achieves remarkable performance, which can be further increased with DP-SGD fine-tuning across all privacy guarantees. With this first in-depth exploration of privacy-preserving deep learning in breast imaging, we address current and emerging clinical privacy requirements and pave the way towards the adoption of private high-utility deep diagnostic models. Our reproducible codebase is publicly available at https://github.com/RichardObi/mammo_dp.
Submitted 17 July, 2024;
originally announced July 2024.
-
Progressive Growing of Patch Size: Resource-Efficient Curriculum Learning for Dense Prediction Tasks
Authors:
Stefan M. Fischer,
Lina Felsner,
Richard Osuala,
Johannes Kiechle,
Daniel M. Lang,
Jan C. Peeken,
Julia A. Schnabel
Abstract:
In this work, we introduce Progressive Growing of Patch Size, a resource-efficient implicit curriculum learning approach for dense prediction tasks. Our curriculum approach is defined by growing the patch size during model training, which gradually increases the task's difficulty. We integrated our curriculum into the nnU-Net framework and evaluated the methodology on all 10 tasks of the Medical Segmentation Decathlon. With our approach, we are able to substantially reduce runtime, computational costs, and CO2 emissions of network training compared to classical constant patch size training. In our experiments, the curriculum approach resulted in improved convergence. We are able to outperform standard nnU-Net training, which is trained with constant patch size, in terms of Dice Score on 7 out of 10 MSD tasks while only spending roughly 50% of the original training runtime. To the best of our knowledge, our Progressive Growing of Patch Size is the first successful employment of a sample-length curriculum in the form of patch size in the field of computer vision. Our code is publicly available at https://github.com/compai-lab/2024-miccai-fischer.
Submitted 11 July, 2024; v1 submitted 10 July, 2024;
originally announced July 2024.
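The idea of growing the patch size during training, as described in the abstract above, can be illustrated with a small schedule helper. This is only a sketch under stated assumptions: the linear growth rule, the rounding to a multiple of `step`, and the function names `patch_size_schedule` and `patch_size_for_epoch` are hypothetical and are not taken from the paper or from nnU-Net.

```python
def patch_size_schedule(min_size, max_size, num_stages, step=8):
    """Return patch edge lengths that grow linearly from min_size to max_size.

    Each size is rounded down to a multiple of `step` (an assumed constraint,
    e.g. so the size stays divisible by a network's downsampling factor).
    """
    if num_stages < 2:
        return [max_size]
    sizes = []
    for i in range(num_stages):
        raw = min_size + (max_size - min_size) * i / (num_stages - 1)
        sizes.append(int(raw // step) * step)
    return sizes


def patch_size_for_epoch(epoch, total_epochs, sizes):
    """Pick the patch size for a given epoch by splitting training into equal stages."""
    stage = min(epoch * len(sizes) // total_epochs, len(sizes) - 1)
    return sizes[stage]


if __name__ == "__main__":
    sizes = patch_size_schedule(32, 128, 4)
    for epoch in (0, 25, 50, 99):
        print(epoch, patch_size_for_epoch(epoch, 100, sizes))
```

Early epochs see small, cheap patches and later epochs see the full patch size, which is where a runtime saving of the kind reported in the abstract would come from.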
-
Mask the Unknown: Assessing Different Strategies to Handle Weak Annotations in the MICCAI2023 Mediastinal Lymph Node Quantification Challenge
Authors:
Stefan M. Fischer,
Johannes Kiechle,
Daniel M. Lang,
Jan C. Peeken,
Julia A. Schnabel
Abstract:
Pathological lymph node delineation is crucial in cancer diagnosis, progression assessment, and treatment planning. The MICCAI 2023 Lymph Node Quantification Challenge published the first public dataset for pathological lymph node segmentation in the mediastinum. As lymph node annotations are expensive, the challenge was formed as a weakly supervised learning task, where only a subset of all lymph nodes in the training set have been annotated. For the challenge submission, multiple methods for training on these weakly supervised data were explored, including noisy label training, loss masking of unlabeled data, and an approach that integrated the TotalSegmentator toolbox as a form of pseudo labeling in order to reduce the number of unknown voxels. Furthermore, multiple public TCIA datasets were incorporated into the training to improve the performance of the deep learning model. Our submitted model achieved a Dice score of 0.628 and an average symmetric surface distance of 5.8 mm on the challenge test set. With our submitted model, we accomplished third rank in the MICCAI2023 LNQ challenge. A finding of our analysis was that the integration of all visible, including non-pathological, lymph nodes improved the overall segmentation performance on pathological lymph nodes of the test set. Furthermore, segmentation models trained only on clinically enlarged lymph nodes, as given in the challenge scenario, could not generalize to smaller pathological lymph nodes. The code and model for the challenge submission are available at https://gitlab.lrz.de/compai/MediastinalLymphNodeSegmentation.
Submitted 20 June, 2024;
originally announced June 2024.
-
Multi-modal Transfer Learning between Biological Foundation Models
Authors:
Juan Jose Garau-Luis,
Patrick Bordes,
Liam Gonzalez,
Masa Roller,
Bernardo P. de Almeida,
Lorenz Hexemer,
Christopher Blum,
Stefan Laurent,
Jan Grzegorzewski,
Maren Lang,
Thomas Pierrot,
Guillaume Richard
Abstract:
Biological sequences encode fundamental instructions for the building blocks of life, in the form of DNA, RNA, and proteins. Modeling these sequences is key to understanding disease mechanisms and is an active research area in computational biology. Recently, Large Language Models have shown great promise in solving certain biological tasks, but current approaches are limited to a single sequence modality (DNA, RNA, or protein). Key problems in genomics intrinsically involve multiple modalities, but it remains unclear how to adapt general-purpose sequence models to those cases. In this work we propose a multi-modal model that connects DNA, RNA, and proteins by leveraging information from different pre-trained modality-specific encoders. We demonstrate its capabilities by applying it to the largely unsolved problem of predicting how multiple RNA transcript isoforms originate from the same gene (i.e. same DNA sequence) and map to different transcription expression levels across various human tissues. We show that our model, dubbed IsoFormer, is able to accurately predict differential transcript expression, outperforming existing methods and leveraging the use of multiple modalities. Our framework also achieves efficient knowledge transfer from the encoders' pre-training as well as between modalities. We open-source our model, paving the way for new multi-modal gene expression approaches.
Submitted 20 June, 2024;
originally announced June 2024.
-
TREE: Tree Regularization for Efficient Execution
Authors:
Lena Schmid,
Daniel Biebert,
Christian Hakert,
Kuan-Hsun Chen,
Michel Lang,
Markus Pauly,
Jian-Jia Chen
Abstract:
The rise of machine learning methods on heavily resource constrained devices requires not only the choice of a suitable model architecture for the target platform, but also the optimization of the chosen model with regard to execution time consumption for inference in order to optimally utilize the available resources. Random forests and decision trees are shown to be a suitable model for such a scenario, since they are not only heavily tunable towards the total model size, but also offer a high potential for optimizing their executions according to the underlying memory architecture.
In addition to the straightforward strategy of enforcing shorter paths through decision trees and hence reducing the execution time for inference, hardware-aware implementations can optimize the execution time in an orthogonal manner. One particular hardware-aware optimization is to lay out the memory of decision trees in such a way that higher-probability paths are less likely to be evicted from system caches. This works particularly well when splits within tree nodes are uneven and have a high probability to visit one of the child nodes.
In this paper, we present a method to reduce path lengths by rewarding uneven probability distributions during the training of decision trees at the cost of a minimal accuracy degradation. Specifically, we regularize the impurity computation of the CART algorithm in order to favor not only low impurity, but also highly asymmetric distributions for the evaluation of split criteria and hence offer a high optimization potential for a memory architecture-aware implementation.
We show that, especially for binary classification data sets and data sets with many samples, this form of regularization can lead to a reduction of up to approximately four times in the execution time with a minimal accuracy degradation.
Submitted 18 June, 2024;
originally announced June 2024.
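The abstract above describes regularizing the CART impurity so that splits are rewarded for being uneven as well as pure. The abstract does not state the exact regularizer, so the following is a minimal sketch with an assumed penalty: the weighted Gini impurity plus `lam` times an "evenness" term that is 1 for a 50/50 split and 0 when all samples go to one child. The names `regularized_split_score` and `lam` are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def regularized_split_score(left, right, lam=0.1):
    """Score a candidate split; lower is better.

    weighted Gini impurity + lam * evenness, where evenness is an assumed
    stand-in for the paper's regularizer: 1.0 for a balanced split, 0.0 when
    every sample falls into one child.
    """
    n = len(left) + len(right)
    p_left = len(left) / n
    weighted = p_left * gini(left) + (1.0 - p_left) * gini(right)
    evenness = 1.0 - abs(2.0 * p_left - 1.0)
    return weighted + lam * evenness


if __name__ == "__main__":
    # Two equally pure splits: the uneven one receives the lower (better) score.
    print(regularized_split_score([0, 0], [1, 1], lam=0.5))
    print(regularized_split_score([0], [1, 1, 1], lam=0.5))
```

With `lam=0` the score reduces to the usual CART criterion; larger values trade impurity for asymmetry, which is the trade-off the abstract refers to as a minimal accuracy degradation.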
-
Towards Learning Contrast Kinetics with Multi-Condition Latent Diffusion Models
Authors:
Richard Osuala,
Daniel M. Lang,
Preeti Verma,
Smriti Joshi,
Apostolia Tsirikoglou,
Grzegorz Skorupko,
Kaisar Kushibar,
Lidia Garrucho,
Walter H. L. Pinaya,
Oliver Diaz,
Julia A. Schnabel,
Karim Lekadir
Abstract:
Contrast agents in dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) make it possible to localize tumors and observe their contrast kinetics, which is essential for cancer characterization and respective treatment decision-making. However, contrast agent administration is not only associated with adverse health risks, but also restricted for patients during pregnancy, and for those with kidney malfunction or other adverse reactions. With contrast uptake as a key biomarker for lesion malignancy, cancer recurrence risk, and treatment response, it becomes pivotal to reduce the dependency on intravenous contrast agent administration. To this end, we propose a multi-conditional latent diffusion model capable of acquisition time-conditioned image synthesis of DCE-MRI temporal sequences. To evaluate medical image synthesis, we additionally propose and validate the Fréchet radiomics distance as an image quality measure based on biomarker variability between synthetic and real imaging data. Our results demonstrate our method's ability to generate realistic multi-sequence fat-saturated breast DCE-MRI and uncover the emerging potential of deep learning based contrast kinetics simulation. We publicly share our accessible codebase at https://github.com/RichardObi/ccnet and provide a user-friendly library for Fréchet radiomics distance calculation at https://pypi.org/project/frd-score.
Submitted 17 July, 2024; v1 submitted 20 March, 2024;
originally announced March 2024.
-
Towards Top-Down Reasoning: An Explainable Multi-Agent Approach for Visual Question Answering
Authors:
Zeqing Wang,
Wentao Wan,
Qiqing Lao,
Runmeng Chen,
Minjie Lang,
Keze Wang,
Liang Lin
Abstract:
Recently, several methods have been proposed to augment large Vision Language Models (VLMs) for Visual Question Answering (VQA) simplicity by incorporating external knowledge from knowledge bases or visual clues derived from question decomposition. Although having achieved promising results, these methods still suffer from the challenge that VLMs cannot inherently understand the incorporated knowledge and might fail to generate the optimal answers. Contrarily, human cognition engages visual questions through a top-down reasoning process, systematically exploring relevant issues to derive a comprehensive answer. This not only facilitates an accurate answer but also provides a transparent rationale for the decision-making pathway. Motivated by this cognitive mechanism, we introduce a novel, explainable multi-agent collaboration framework designed to imitate human-like top-down reasoning by leveraging the expansive knowledge of Large Language Models (LLMs). Our framework comprises three agents, i.e., Responder, Seeker, and Integrator, each contributing uniquely to the top-down reasoning process. The VLM-based Responder generates the answer candidates for the question and gives responses to other issues. The Seeker, primarily based on LLM, identifies relevant issues related to the question to inform the Responder and constructs a Multi-View Knowledge Base (MVKB) for the given visual scene by leveraging the understanding capabilities of LLM. The Integrator agent combines information from the Seeker and the Responder to produce the final VQA answer. Through this collaboration mechanism, our framework explicitly constructs an MVKB for a specific visual scene and reasons answers in a top-down reasoning process. Extensive and comprehensive evaluations on diverse VQA datasets and VLMs demonstrate the superior applicability and interpretability of our framework over the existing compared methods.
Submitted 14 May, 2024; v1 submitted 28 November, 2023;
originally announced November 2023.
-
Less Power for More Learning: Restricting OCaml Features for Effective Teaching
Authors:
Max Lang,
Nico Petzendorfer
Abstract:
We present a framework for sandboxing and restricting features of the OCaml programming language to effectively automate the grading of programming exercises, scaling to hundreds of submissions. We describe how to disable language and library features that should not be used to solve a given exercise. We present an overview of an implementation of a mock IO system to allow testing of IO-related exercises in a controlled environment. Finally, we detail a number of security considerations to ensure submitted code remains sandboxed, allowing automatic grading to be trusted without manual verification. The source code of our implementation is publicly available.
Submitted 8 September, 2023;
originally announced September 2023.
-
Tailoring Stateless Model Checking for Event-Driven Multi-Threaded Programs
Authors:
Parosh Aziz Abdulla,
Mohamed Faouzi Atig,
Frederik Meyer Bønneland,
Sarbojit Das,
Bengt Jonsson,
Magnus Lång,
Konstantinos Sagonas
Abstract:
Event-driven multi-threaded programming is an important idiom for structuring concurrent computations. Stateless Model Checking (SMC) is an effective verification technique for multi-threaded programs, especially when coupled with Dynamic Partial Order Reduction (DPOR). Existing SMC techniques are often ineffective in handling event-driven programs, since they will typically explore all possible orderings of event processing, even when events do not conflict. We present Event-DPOR, a DPOR algorithm tailored to event-driven multi-threaded programs. It is based on Optimal-DPOR, an optimal DPOR algorithm for multi-threaded programs; we show how it can be extended for event-driven programs. We prove correctness of Event-DPOR for all programs, and optimality for a large subclass. One complication is that an operation in Event-DPOR, which checks for redundancy of new executions, is NP-hard, as we show in this paper; we address this by a sequence of inexpensive (but incomplete) tests which check for redundancy efficiently. Our implementation and experimental evaluation show that, in comparison with other tools in which handler threads are simulated using locks, Event-DPOR can be exponentially faster than other state-of-the-art DPOR algorithms on a variety of programs and manages to completely avoid unnecessary exploration of executions.
Submitted 29 July, 2023;
originally announced July 2023.
-
Awaiting for Godot: Stateless Model Checking that Avoids Executions where Nothing Happens
Authors:
Bengt Jonsson,
Magnus Lång,
Konstantinos Sagonas
Abstract:
Stateless Model Checking (SMC) is a verification technique for concurrent programs that checks for safety violations by exploring all possible thread schedulings. It is highly effective when coupled with Dynamic Partial Order Reduction (DPOR), which introduces an equivalence on schedulings and needs to explore only one in each equivalence class. Even with DPOR, SMC often spends unnecessary effort in exploring loop iterations that are pure, i.e., have no effect on the program state. We present techniques for making SMC with DPOR more effective on programs with pure loop iterations. The first is a static program analysis to detect loop purity and an associated program transformation, called Partial Loop Purity Elimination, that inserts assume statements to block pure loop iterations. Subsequently, some of these assumes are turned into await statements that completely remove many assume-blocked executions. Finally, we present an extension of the standard DPOR equivalence, obtained by weakening the conflict relation between events. All these techniques are incorporated into a new DPOR algorithm, Optimal-DPOR-Await, which can handle both awaits and the weaker conflict relation, is optimal in the sense that it explores exactly one execution in each equivalence class, and can also diagnose livelocks. Our implementation in Nidhugg shows that these techniques can significantly speed up the analysis of concurrent programs that are currently challenging for SMC tools, both for exploring their complete set of interleavings and for detecting concurrency errors in them.
Submitted 19 August, 2022;
originally announced August 2022.
-
Bluetooth Low Energy mesh network for power-limited, robust and reliable IoT services
Authors:
Davide Villa,
Chih-Kuang Lin,
Adam Kuenzi,
Michael Lang
Abstract:
Bluetooth Low Energy (BLE) is an emerging wireless technology created for short-range control and monitoring applications that is becoming increasingly widespread among Internet of Things (IoT) services because of its low cost and low energy consumption. In this paper, we propose a novel neighbor discovery scheme and failure recovery techniques for multi-path and single-path low-power and reliable BLE networks. By exploiting energy-efficient access control and fast and robust routing ideas with adaptive failure recovery, the proposed methods outperform the well-known flooding approach used by the BLE Mesh standard. We show varying improvements in packet latency and power consumption in event-driven simulations as network topology and traffic change. The proposed failure recovery approaches are optimized and demonstrated during the simulations, showing how the overall failure recovery latency and node power consumption vary across different use cases.
Submitted 8 August, 2022;
originally announced August 2022.
-
Multi-Objective Hyperparameter Optimization in Machine Learning -- An Overview
Authors:
Florian Karl,
Tobias Pielok,
Julia Moosbauer,
Florian Pfisterer,
Stefan Coors,
Martin Binder,
Lennart Schneider,
Janek Thomas,
Jakob Richter,
Michel Lang,
Eduardo C. Garrido-Merchán,
Juergen Branke,
Bernd Bischl
Abstract:
Hyperparameter optimization constitutes a large part of typical modern machine learning workflows. This arises from the fact that machine learning methods and corresponding preprocessing steps often only yield optimal performance when hyperparameters are properly tuned. But in many applications, we are not only interested in optimizing ML pipelines solely for predictive accuracy; additional metrics or constraints must be considered when determining an optimal configuration, resulting in a multi-objective optimization problem. This is often neglected in practice, due to a lack of knowledge and readily available software implementations for multi-objective hyperparameter optimization. In this work, we introduce the reader to the basics of multi-objective hyperparameter optimization and motivate its usefulness in applied ML. Furthermore, we provide an extensive survey of existing optimization strategies, both from the domain of evolutionary algorithms and Bayesian optimization. We illustrate the utility of MOO in several specific ML applications, considering objectives such as operating conditions, prediction time, sparseness, fairness, interpretability and robustness.
Submitted 6 June, 2024; v1 submitted 15 June, 2022;
originally announced June 2022.
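The multi-objective setting described in the abstract above has no single best configuration; the result is the set of non-dominated (Pareto-optimal) configurations. A minimal sketch of extracting that set follows, assuming every objective is to be minimized. The helper names `dominates` and `pareto_front` and the example objective values (validation error, prediction time in ms) are made up for illustration and are not results from the paper.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one.

    All objectives are assumed to be minimized.
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points not dominated by any other point, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    # Hypothetical (validation error, prediction time in ms) per configuration.
    configs = [(0.10, 30), (0.12, 10), (0.15, 12), (0.08, 50), (0.10, 40)]
    print(pareto_front(configs))
```

This brute-force filter is quadratic in the number of configurations, which is fine for inspecting a finished tuning run; the evolutionary and Bayesian methods surveyed in the paper address the harder problem of deciding which configurations to evaluate next.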
-
Automated Benchmark-Driven Design and Explanation of Hyperparameter Optimizers
Authors:
Julia Moosbauer,
Martin Binder,
Lennart Schneider,
Florian Pfisterer,
Marc Becker,
Michel Lang,
Lars Kotthoff,
Bernd Bischl
Abstract:
Automated hyperparameter optimization (HPO) has gained great popularity and is an important ingredient of most automated machine learning frameworks. The process of designing HPO algorithms, however, is still an unsystematic and manual process: Limitations of prior work are identified and the improvements proposed are -- even though guided by expert knowledge -- still somewhat arbitrary. This rarely allows for gaining a holistic understanding of which algorithmic components are driving performance, and carries the risk of overlooking good algorithmic design choices. We present a principled approach to automated benchmark-driven algorithm design applied to multifidelity HPO (MF-HPO): First, we formalize a rich space of MF-HPO candidates that includes, but is not limited to common HPO algorithms, and then present a configurable framework covering this space. To find the best candidate automatically and systematically, we follow a programming-by-optimization approach and search over the space of algorithm candidates via Bayesian optimization. We challenge whether the found design choices are necessary or could be replaced by more naive and simpler ones by performing an ablation analysis. We observe that using a relatively simple configuration, in some ways simpler than established methods, performs very well as long as some critical configuration parameters have the right value.
Submitted 29 November, 2021;
originally announced November 2021.
-
mlr3spatiotempcv: Spatiotemporal resampling methods for machine learning in R
Authors:
Patrick Schratz,
Marc Becker,
Michel Lang,
Alexander Brenning
Abstract:
Spatial and spatiotemporal machine-learning models require a suitable framework for their model assessment, model selection, and hyperparameter tuning, in order to avoid error estimation bias and over-fitting. This contribution reviews the state-of-the-art in spatial and spatiotemporal cross-validation, and introduces the R package mlr3spatiotempcv as an extension package of the machine-learning framework mlr3. Currently, various R packages implementing different spatiotemporal partitioning strategies exist: blockCV, CAST, skmeans and sperrorest. The goal of mlr3spatiotempcv is to gather the available spatiotemporal resampling methods in R and make them available to users through a simple and common interface. This is made possible by integrating the package directly into the mlr3 machine-learning framework, which already has support for generic non-spatiotemporal resampling methods such as random partitioning. One advantage is the use of a consistent nomenclature in an overarching machine-learning toolkit instead of a varying package-specific syntax, making it easier for users to choose from a variety of spatiotemporal resampling methods. This package avoids giving recommendations on which method to use in practice, as this decision depends on the predictive task at hand, the autocorrelation within the data, and the spatial structure of the sampling design or geographic objects being studied.
Submitted 2 November, 2022; v1 submitted 25 October, 2021;
originally announced October 2021.
-
Hyperparameter Optimization: Foundations, Algorithms, Best Practices and Open Challenges
Authors:
Bernd Bischl,
Martin Binder,
Michel Lang,
Tobias Pielok,
Jakob Richter,
Stefan Coors,
Janek Thomas,
Theresa Ullmann,
Marc Becker,
Anne-Laure Boulesteix,
Difan Deng,
Marius Lindauer
Abstract:
Most machine learning algorithms are configured by one or several hyperparameters that must be carefully chosen and often considerably impact performance. To avoid a time consuming and unreproducible manual trial-and-error process to find well-performing hyperparameter configurations, various automatic hyperparameter optimization (HPO) methods, e.g., based on resampling error estimation for supervised machine learning, can be employed. After introducing HPO from a general perspective, this paper reviews important HPO methods such as grid or random search, evolutionary algorithms, Bayesian optimization, Hyperband and racing. It gives practical recommendations regarding important choices to be made when conducting HPO, including the HPO algorithms themselves, performance evaluation, how to combine HPO with ML pipelines, runtime improvements, and parallelization. This work is accompanied by an appendix that contains information on specific software packages in R and Python, as well as information and recommended hyperparameter search spaces for specific learning algorithms. We also provide notebooks that demonstrate concepts from this work as supplementary files.
Submitted 24 November, 2021; v1 submitted 13 July, 2021;
originally announced July 2021.
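Of the HPO methods named in the abstract above, random search is the simplest to write down. The following is a minimal sketch, not code from the paper: the function `random_search`, the toy objective `toy_error`, and the hyperparameter names `lr` and `reg` are all hypothetical, and a real setup would replace the toy objective with a resampling-based error estimate of a learner.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Sample configurations uniformly from `space` and keep the best one.

    `space` maps each hyperparameter name to a (low, high) interval.
    Lower objective values are treated as better.
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    for _ in range(n_trials):
        config = {name: rng.uniform(low, high) for name, (low, high) in space.items()}
        score = objective(config)  # in practice: a resampling-based error estimate
        if score < best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_error(config):
    """Hypothetical stand-in for a validation error, minimal at lr=0.1, reg=1.0."""
    return (config["lr"] - 0.1) ** 2 + (config["reg"] - 1.0) ** 2


space = {"lr": (0.0, 1.0), "reg": (0.0, 2.0)}
best_config, best_score = random_search(toy_error, space, n_trials=200)
print(best_config, best_score)
```

Because each trial is independent of the others, the loop can be parallelized trivially, which is one reason random search is a common baseline.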
-
Employing an Adjusted Stability Measure for Multi-Criteria Model Fitting on Data Sets with Similar Features
Authors:
Andrea Bommert,
Jörg Rahnenführer,
Michel Lang
Abstract:
Fitting models with high predictive accuracy that include all relevant but no irrelevant or redundant features is a challenging task on data sets with similar (e.g. highly correlated) features. We propose the approach of tuning the hyperparameters of a predictive model in a multi-criteria fashion with respect to predictive accuracy and feature selection stability. We evaluate this approach based on both simulated and real data sets and we compare it to the standard approach of single-criteria tuning of the hyperparameters as well as to the state-of-the-art technique "stability selection". We conclude that our approach achieves the same or better predictive performance compared to the two established approaches. Considering the stability during tuning does not decrease the predictive accuracy of the resulting models. Our approach succeeds at selecting the relevant features while avoiding irrelevant or redundant features. The single-criteria approach fails at avoiding irrelevant or redundant features and the stability selection approach fails at selecting enough relevant features for achieving acceptable predictive accuracy. For our approach, for data sets with many similar features, the feature selection stability must be evaluated with an adjusted stability measure, that is, a measure that considers similarities between features. For data sets with only few similar features, an unadjusted stability measure suffices and is faster to compute.
Submitted 15 June, 2021;
originally announced June 2021.
-
Deep Learning Based HPV Status Prediction for Oropharyngeal Cancer Patients
Authors:
Daniel M. Lang,
Jan C. Peeken,
Stephanie E. Combs,
Jan J. Wilkens,
Stefan Bartzsch
Abstract:
We investigated the ability of deep learning models for imaging based HPV status detection. To overcome the problem of small medical datasets we used a transfer learning approach. A 3D convolutional network pre-trained on sports video clips was fine tuned such that full 3D information in the CT images could be exploited. The video pre-trained model was able to differentiate HPV-positive from HPV-negative cases with an area under the receiver operating characteristic curve (AUC) of 0.81 for an external test set. In comparison to a 3D convolutional neural network (CNN) trained from scratch and a 2D architecture pre-trained on ImageNet the video pre-trained model performed best.
Submitted 17 November, 2020;
originally announced November 2020.
-
An Open-Source Integration of Process Mining Features into the Camunda Workflow Engine: Data Extraction and Challenges
Authors:
Alessandro Berti,
Wil van der Aalst,
David Zang,
Magdalena Lang
Abstract:
Process mining provides techniques to improve the performance and compliance of operational processes. Although sometimes the term "workflow mining" is used, the application in the context of Workflow Management (WFM) and Business Process Management (BPM) systems is limited. The main reason is that WFM/BPM systems control the process, leaving less room for flexibility and the corresponding deviations. However, as this paper shows, it is easy to extract event data from systems like Camunda, one of the leading open-source WFM/BPM systems. Moreover, although the respective process engines control the process flow, process mining is still able to provide valuable insights, such as the analysis of the performance of the paths and the mining of the decision rules. This demo paper presents a process mining connector to Camunda that extracts event logs and process models, allowing for the application of existing process mining tools. We also analyzed the added value of different process mining techniques in the context of Camunda. We discuss a subset of process mining techniques that nicely complements the process intelligence capabilities of Camunda. Through this demo paper, we hope to boost the use of process mining among Camunda users.
Submitted 14 September, 2020;
originally announced September 2020.
-
mlr3proba: An R Package for Machine Learning in Survival Analysis
Authors:
Raphael Sonabend,
Franz J. Király,
Andreas Bender,
Bernd Bischl,
Michel Lang
Abstract:
As machine learning has become increasingly popular over the last few decades, so too has the number of machine learning interfaces for implementing these models. Whilst many R libraries exist for machine learning, very few offer extended support for survival analysis. This is problematic considering its importance in fields like medicine, bioinformatics, economics, engineering, and more. mlr3proba provides a comprehensive machine learning interface for survival analysis and connects with mlr3's general model tuning and benchmarking facilities to provide a systematic infrastructure for survival modeling and evaluation.
Submitted 14 December, 2020; v1 submitted 18 August, 2020;
originally announced August 2020.
-
Feature Selection Methods for Cost-Constrained Classification in Random Forests
Authors:
Rudolf Jagdhuber,
Michel Lang,
Jörg Rahnenführer
Abstract:
Cost-sensitive feature selection describes a feature selection problem, where features raise individual costs for inclusion in a model. These costs make it possible to incorporate disfavored aspects of features, e.g. failure rates of a measuring device or patient harm, into the model selection process. Random Forests define a particularly challenging problem for feature selection, as features are generally entangled in an ensemble of multiple trees, which makes a post hoc removal of features infeasible. Feature selection methods therefore often either focus on simple pre-filtering methods, or require many Random Forest evaluations along their optimization path, which drastically increases the computational complexity. To solve both issues, we propose Shallow Tree Selection, a novel fast and multivariate feature selection method that selects features from small tree structures. Additionally, we also adapt three standard feature selection algorithms for cost-sensitive learning by introducing a hyperparameter-controlled benefit-cost ratio criterion (BCR) for each method. In an extensive simulation study, we assess this criterion, and compare the proposed methods to multiple performance-based baseline alternatives on four artificial data settings and seven real-world data settings. We show that all methods using a hyperparameterized BCR criterion outperform the baseline alternatives. In a direct comparison between the proposed methods, each method indicates strengths in certain settings, but no one-fits-all solution exists. On a global average, we could identify preferable choices among our BCR based methods. Nevertheless, we conclude that a practical analysis should never rely on a single method only, but always compare different approaches to obtain the best results.
Submitted 17 August, 2020; v1 submitted 14 August, 2020;
originally announced August 2020.
-
Intrinsic-Extrinsic Convolution and Pooling for Learning on 3D Protein Structures
Authors:
Pedro Hermosilla,
Marco Schäfer,
Matěj Lang,
Gloria Fackelmann,
Pere Pau Vázquez,
Barbora Kozlíková,
Michael Krone,
Tobias Ritschel,
Timo Ropinski
Abstract:
Proteins perform a large variety of functions in living organisms, thus playing a key role in biology. As of now, available learning algorithms to process protein data do not consider several particularities of such data and/or do not scale well for large protein conformations. To fill this gap, we propose two new learning operations enabling deep 3D analysis of large-scale protein data. First, we introduce a novel convolution operator which considers both, the intrinsic (invariant under protein folding) as well as extrinsic (invariant under bonding) structure, by using $n$-D convolutions defined on both the Euclidean distance, as well as multiple geodesic distances between atoms in a multi-graph. Second, we enable a multi-scale protein analysis by introducing hierarchical pooling operators, exploiting the fact that proteins are a recombination of a finite set of amino acids, which can be pooled using shared pooling matrices. Lastly, we evaluate the accuracy of our algorithms on several large-scale data sets for common protein analysis tasks, where we outperform state-of-the-art methods.
Submitted 19 April, 2021; v1 submitted 13 July, 2020;
originally announced July 2020.
-
Learning to Link
Authors:
Maria-Florina Balcan,
Travis Dick,
Manuel Lang
Abstract:
Clustering is an important part of many modern data analysis pipelines, including network analysis and data retrieval. There are many different clustering algorithms developed by various communities, and it is often not clear which algorithm will give the best performance on a specific clustering task. Similarly, we often have multiple ways to measure distances between data points, and the best clustering performance might require a non-trivial combination of those metrics. In this work, we study data-driven algorithm selection and metric learning for clustering problems, where the goal is to simultaneously learn the best algorithm and metric for a specific application. The family of clustering algorithms we consider consists of parameterized linkage-based procedures that include single and complete linkage. The family of distance functions we learn over consists of convex combinations of base distance functions. We design efficient learning algorithms which receive samples from an application-specific distribution over clustering instances and simultaneously learn both a near-optimal distance and clustering algorithm from these classes. We also carry out a comprehensive empirical evaluation of our techniques showing that they can lead to significantly improved clustering performance.
Submitted 2 October, 2019; v1 submitted 1 July, 2019;
originally announced July 2019.
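The two parameterized families described in the abstract above can be illustrated with a small sketch. It assumes one particular parameterization, a linear interpolation between single and complete linkage controlled by `alpha` and a convex combination of two base distances controlled by `beta`; the abstract does not spell out the exact form, so both are assumptions. All function names and the toy points are hypothetical, and the sketch omits the learning of `alpha` and `beta`, which is the paper's actual contribution.

```python
from itertools import combinations


def mixed_distance(d0, d1, beta):
    """Convex combination of two base distance functions, with 0 <= beta <= 1."""
    return lambda x, y: (1 - beta) * d0(x, y) + beta * d1(x, y)


def interpolated_linkage(dist, alpha):
    """Cluster-to-cluster cost between single (alpha=0) and complete (alpha=1) linkage."""
    def merge_cost(a, b):
        pairwise = [dist(x, y) for x in a for y in b]
        return (1 - alpha) * min(pairwise) + alpha * max(pairwise)
    return merge_cost


def agglomerate(points, merge_cost, k):
    """Naive agglomerative clustering: merge the cheapest pair until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: merge_cost(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]  # i < j, so deleting j keeps i valid
        del clusters[j]
    return clusters


def l1(x, y):
    """Manhattan distance on 2D points (hypothetical base distance)."""
    return abs(x[0] - y[0]) + abs(x[1] - y[1])


def linf(x, y):
    """Chebyshev distance on 2D points (hypothetical base distance)."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))


points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9)]
dist = mixed_distance(l1, linf, beta=0.5)
clusters = agglomerate(points, interpolated_linkage(dist, alpha=0.3), k=2)
print(clusters)
```

Setting `alpha` to 0 or 1 recovers plain single or complete linkage, and `beta` to 0 or 1 recovers one of the base distances, so the search over both parameters contains the standard choices as special cases.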
-
High Dimensional Restrictive Federated Model Selection with multi-objective Bayesian Optimization over shifted distributions
Authors:
Xudong Sun,
Andrea Bommert,
Florian Pfisterer,
Jörg Rahnenführer,
Michel Lang,
Bernd Bischl
Abstract:
A novel machine learning optimization process coined Restrictive Federated Model Selection (RFMS) is proposed for scenarios in which, for example, data from healthcare units cannot leave the site it is situated on and it is forbidden to carry out training algorithms on remote data sites due to either technical or privacy and trust concerns. To carry out clinical research under this scenario, an analyst could train a machine learning model only on a local data site, but it is still possible to execute a statistical query at a certain cost in the form of sending a machine learning model to some of the remote data sites and getting the performance measures as feedback, possibly because prediction is usually much cheaper than training. Compared to federated learning, which optimizes the model parameters directly by carrying out training across all data sites, RFMS trains model parameters only on one local data site but optimizes hyper-parameters across other data sites jointly, since hyper-parameters play an important role in machine learning performance. The aim is to get a Pareto optimal model with respect to both local and remote unseen prediction losses, which could generalize well across data sites. In this work, we specifically consider high dimensional data with shifted distributions over data sites. As an initial investigation, Bayesian Optimization, especially multi-objective Bayesian Optimization, is used to guide an adaptive hyper-parameter optimization process to select models under the RFMS scenario. Empirical results show that solely using the local data site to tune hyper-parameters generalizes poorly across data sites, compared to methods that utilize the local and remote performances. Furthermore, in terms of dominated hypervolumes, multi-objective Bayesian Optimization algorithms show increased performance across multiple data sites among other candidates.
Submitted 8 August, 2019; v1 submitted 24 February, 2019;
originally announced February 2019.
-
A Multi-layer Gaussian Process for Motor Symptom Estimation in People with Parkinson's Disease
Authors:
Muriel Lang,
Franz M. J. Pfister,
Jakob Fröhner,
Kian Abedinpour,
Daniel Pichler,
Urban Fietzek,
Terry T. Um,
Dana Kulić,
Satoshi Endo,
Sandra Hirche
Abstract:
The assessment of Parkinson's disease (PD) poses a significant challenge as it is influenced by various factors which lead to a complex and fluctuating symptom manifestation. Thus, a frequent and objective PD assessment is highly valuable for effective health management of people with Parkinson's disease (PwP). Here, we propose a method for monitoring PwP by stochastically modeling the relationships between their wrist movements during unscripted daily activities and corresponding annotations about clinical displays of movement abnormalities. We approach the estimation of PD motor signs by independently modeling and hierarchically stacking Gaussian process models for three classes of commonly observed movement abnormalities in PwP including tremor, (non-tremulous) bradykinesia, and (non-tremulous) dyskinesia. We use clinically adopted severity measures as annotations for training the models, thus allowing our multi-layer Gaussian process prediction models to estimate not only their presence but also their severities. The experimental validation of our approach demonstrates strong agreement of the model predictions with these PD annotations. Our results show the proposed method produces promising results in objective monitoring of movement abnormalities of PD in the presence of arbitrary and unknown voluntary motions, and makes an important step towards continuous monitoring of PD in the home environment.
Submitted 27 September, 2018; v1 submitted 31 August, 2018;
originally announced August 2018.
-
Parkinson's Disease Assessment from a Wrist-Worn Wearable Sensor in Free-Living Conditions: Deep Ensemble Learning and Visualization
Authors:
Terry Taewoong Um,
Franz Michael Josef Pfister,
Daniel Christian Pichler,
Satoshi Endo,
Muriel Lang,
Sandra Hirche,
Urban Fietzek,
Dana Kulić
Abstract:
Parkinson's Disease (PD) is characterized by disorders in motor function such as freezing of gait, rest tremor, rigidity, and slowed and hyposcaled movements. Treatment with dopaminergic medication may alleviate those motor symptoms; however, side-effects may include uncontrolled movements, known as dyskinesia. In this paper, an automatic PD motor-state assessment in free-living conditions is proposed using an accelerometer in a wrist-worn wearable sensor. In particular, an ensemble of convolutional neural networks (CNNs) is applied to capture the large variability of daily-living activities and overcome the dissimilarity between training and test patients due to the inter-patient variability. In addition, class activation map (CAM), a visualization technique for CNNs, is applied to provide an interpretation of the results.
Submitted 8 August, 2018;
originally announced August 2018.
-
OpenML Benchmarking Suites
Authors:
Bernd Bischl,
Giuseppe Casalicchio,
Matthias Feurer,
Pieter Gijsbers,
Frank Hutter,
Michel Lang,
Rafael G. Mantovani,
Jan N. van Rijn,
Joaquin Vanschoren
Abstract:
Machine learning research depends on objectively interpretable, comparable, and reproducible algorithm benchmarks. We advocate the use of curated, comprehensive suites of machine learning tasks to standardize the setup, execution, and reporting of benchmarks. We enable this through software tools that help to create and leverage these benchmarking suites. These are seamlessly integrated into the OpenML platform, and accessible through interfaces in Python, Java, and R. OpenML benchmarking suites (a) are easy to use through standardized data formats, APIs, and client libraries; (b) come with extensive meta-information on the included datasets; and (c) allow benchmarks to be shared and reused in future studies. We then present a first, carefully curated and practical benchmarking suite for classification: the OpenML Curated Classification benchmarking suite 2018 (OpenML-CC18). Finally, we discuss use cases and applications which demonstrate the usefulness of OpenML benchmarking suites and the OpenML-CC18 in particular.
Submitted 22 November, 2021; v1 submitted 11 August, 2017;
originally announced August 2017.
-
Object Handover Prediction using Gaussian Processes clustered with Trajectory Classification
Authors:
Muriel Lang,
Satoshi Endo,
Oliver Dunkley,
Sandra Hirche
Abstract:
A robotic system which approximates the user intention and appropriate complementary motion is critical for successful human-robot interaction. While the existing wearable sensors can monitor human movements in real-time, prediction of human movement is a significant challenge due to its highly non-linear motions optimised through the redundancy in the degrees of freedom. Here, we demonstrate robustness of the Gaussian Process (GP) clustered with a stochastic classification technique for trajectory prediction using an object handover scenario. By parametrising real 6D hand movements during human-human object handover using dual quaternions, variations of handover configurations were classified in real-time and then the remaining hand trajectory was predicted using the GP. The results highlight that our method can classify the handover configuration at an average of $43.4\%$ of the trajectory and the final hand configuration can be predicted within the normal variation of human movement. In conclusion, we demonstrate that GPs combined with a stochastic classification technique are a robust tool for proactively estimating human motions for human-robot interaction.
Submitted 10 July, 2017;
originally announced July 2017.
-
MPG - A Framework for Reasoning on 6 DOF Pose Uncertainty
Authors:
Wendelin Feiten,
Muriel Lang
Abstract:
Reasoning about the pose, i.e. position and orientation of objects is one of the cornerstones of robotic manipulation under uncertainty. In a number of joint research projects our group is developing a robotic perception system that perceives and models an unprepared kitchen scenario with many objects. Since no single sensor or measurement provides sufficient information, a technique is needed to fuse a number of uncertain estimates of the pose, i.e. estimates with a widely stretched probability density function ($pdf$). The most frequently used approaches to describe the $pdfs$ are sample based description and multivariate normal (Gaussian) distributions. Sample based descriptions in 6D can describe basically any type of $pdfs$, but they require a large number of samples and there are no analytic formulae to fuse several $pdfs$. For Gaussian distributions these formulae exist, but the Gaussian distributions are unimodal and don't model widely spread distributions well. In this paper we present a framework for probabilistic modeling of 6D poses that combines the expressive power of the sample based description with the conciseness and algorithmic power of the Gaussian models. As parameterization of the 6D poses we select the dual quaternions, i.e. any pose is represented by two quaternions. The orientation part of a pose is described by a unit quaternion. The translation part is described by a purely imaginary quaternion. A basic probability density function over the poses is constructed by selecting a tangent point on the 3D sphere representing unit quaternions and taking the Cartesian set product of the tangent space with the 3D space of translations. In this 6D Euclidean space a 6D Gaussian distribution is defined. Projecting this Gaussian back to the unit sphere and renormalizing induces a distribution over 6D poses, called a Projected Gaussian.
Submitted 5 July, 2017;
originally announced July 2017.
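The dual-quaternion parameterization mentioned in the abstract above (a unit quaternion for orientation, a purely imaginary quaternion for translation) can be sketched as follows. This uses one common convention, dual part = 0.5 * t * r with Hamilton multiplication and (w, x, y, z) ordering; whether the paper uses exactly this convention is an assumption, and the function names are hypothetical. The sketch covers only the pose representation, not the Projected Gaussian distribution built on top of it.

```python
import math


def qmul(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def qconj(q):
    """Quaternion conjugate; for a unit quaternion this is also its inverse."""
    w, x, y, z = q
    return (w, -x, -y, -z)


def pose_to_dual_quaternion(axis, angle, translation):
    """Encode a rotation (axis, angle in radians) and a translation as (real, dual)."""
    norm = math.sqrt(sum(c * c for c in axis))
    s = math.sin(angle / 2) / norm
    real = (math.cos(angle / 2), axis[0] * s, axis[1] * s, axis[2] * s)  # unit quaternion
    t = (0.0, *translation)  # purely imaginary quaternion
    dual = tuple(0.5 * c for c in qmul(t, real))
    return real, dual


def translation_of(real, dual):
    """Recover the translation vector via t = 2 * dual * conj(real)."""
    t = qmul(tuple(2.0 * c for c in dual), qconj(real))
    return t[1:]


real, dual = pose_to_dual_quaternion((0.0, 0.0, 1.0), math.pi / 2, (1.0, 2.0, 3.0))
print(real, dual, translation_of(real, dual))
```

The round trip through `translation_of` works because the real part has unit norm, so multiplying by its conjugate cancels the rotation and leaves the purely imaginary translation quaternion.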
-
Data Augmentation of Wearable Sensor Data for Parkinson's Disease Monitoring using Convolutional Neural Networks
Authors:
Terry Taewoong Um,
Franz Michael Josef Pfister,
Daniel Pichler,
Satoshi Endo,
Muriel Lang,
Sandra Hirche,
Urban Fietzek,
Dana Kulić
Abstract:
While convolutional neural networks (CNNs) have been successfully applied to many challenging classification applications, they typically require large datasets for training. When the availability of labeled data is limited, data augmentation is a critical preprocessing step for CNNs. However, data augmentation for wearable sensor data has not been deeply investigated yet.
In this paper, various data augmentation methods for wearable sensor data are proposed. The proposed methods and CNNs are applied to the classification of the motor state of Parkinson's Disease patients, which is challenging due to small dataset size, noisy labels, and large intra-class variability. Appropriate augmentation improves the classification performance from 77.54\% to 86.88\%.
Submitted 8 November, 2017; v1 submitted 1 June, 2017;
originally announced June 2017.
-
OpenML: An R Package to Connect to the Machine Learning Platform OpenML
Authors:
Giuseppe Casalicchio,
Jakob Bossek,
Michel Lang,
Dominik Kirchhoff,
Pascal Kerschke,
Benjamin Hofner,
Heidi Seibold,
Joaquin Vanschoren,
Bernd Bischl
Abstract:
OpenML is an online machine learning platform where researchers can easily share data, machine learning tasks and experiments as well as organize them online to work and collaborate more efficiently. In this paper, we present an R package to interface with the OpenML platform and illustrate its usage in combination with the machine learning R package mlr. We show how the OpenML package allows R users to easily search, download and upload data sets and machine learning tasks. Furthermore, we also show how to upload results of experiments, share them with others and download results from other users. Beyond ensuring reproducibility of results, the OpenML platform automates much of the drudge work, speeds up research, facilitates collaboration and increases the users' visibility online.
Submitted 4 May, 2017; v1 submitted 5 January, 2017;
originally announced January 2017.
-
mlr Tutorial
Authors:
Julia Schiffner,
Bernd Bischl,
Michel Lang,
Jakob Richter,
Zachary M. Jones,
Philipp Probst,
Florian Pfisterer,
Mason Gallo,
Dominik Kirchhoff,
Tobias Kühn,
Janek Thomas,
Lars Kotthoff
Abstract:
This document provides an in-depth introduction to the mlr framework for machine learning experiments in R.
Submitted 17 September, 2016;
originally announced September 2016.
-
Modeling and Verification of Infinite Systems with Resources
Authors:
Martin Lang,
Christof Löding
Abstract:
We consider formal verification of recursive programs with resource consumption. We introduce prefix replacement systems with non-negative integer counters which can be incremented and reset to zero as a formal model for such programs. In these systems, we investigate bounds on the resource consumption for reachability questions. Motivated by this question, we introduce relational structures with resources and a quantitative first-order logic over these structures. We define resource automatic structures as a subclass of these structures and provide an effective method to compute the semantics of the logic on this subclass. Subsequently, we use this framework to solve the bounded reachability problem for resource prefix replacement systems. We achieve this result by extending the well-known saturation method to annotated prefix replacement systems. Finally, we provide a connection to the study of the logic cost-WMSO.
Submitted 13 December, 2013; v1 submitted 5 November, 2013;
originally announced November 2013.