Search | arXiv e-print repository

arXiv:2408.03569 [pdf, other]

Maximum a Posteriori Estimation for Linear Structural Dynamics Models Using Bayesian Optimization with Rational Polynomial Chaos Expansions

Authors: Felix Schneider, Iason Papaioannou, Bruno Sudret, Gerhard Müller

Abstract: Bayesian analysis enables combining prior knowledge with measurement data to learn model parameters. Commonly, one resorts to computing the maximum a posteriori (MAP) estimate, when only a point estimate of the parameters is of interest. We apply MAP estimation in the context of structural dynamic models, where the system response can be described by the frequency response function. To alleviate h… ▽ More Bayesian analysis enables combining prior knowledge with measurement data to learn model parameters. Commonly, one resorts to computing the maximum a posteriori (MAP) estimate, when only a point estimate of the parameters is of interest. We apply MAP estimation in the context of structural dynamic models, where the system response can be described by the frequency response function. To alleviate high computational demands from repeated expensive model calls, we utilize a rational polynomial chaos expansion (RPCE) surrogate model that expresses the system frequency response as a rational of two polynomials with complex coefficients. We propose an extension to an existing sparse Bayesian learning approach for RPCE based on Laplace's approximation for the posterior distribution of the denominator coefficients. Furthermore, we introduce a Bayesian optimization approach, which allows to adaptively enrich the experimental design throughout the optimization process of MAP estimation. Thereby, we utilize the expected improvement acquisition function as a means to identify sample points in the input space that are possibly associated with large objective function values. The acquisition function is estimated through Monte Carlo sampling based on the posterior distribution of the expansion coefficients identified in the sparse Bayesian learning process. By combining the sparsity-inducing learning procedure with the sequential experimental design, we effectively reduce the number of model evaluations in the MAP estimation problem. We demonstrate the applicability of the presented methods on the parameter updating problem of an algebraic two-degree-of-freedom system and the finite element model of a cross-laminated timber plate. △ Less

Submitted 7 August, 2024; originally announced August 2024.

arXiv:2407.06447 [pdf, other]

Geospatial Trajectory Generation via Efficient Abduction: Deployment for Independent Testing

Authors: Divyagna Bavikadi, Dyuman Aditya, Devendra Parkar, Paulo Shakarian, Graham Mueller, Chad Parvis, Gerardo I. Simari

Abstract: The ability to generate artificial human movement patterns while meeting location and time constraints is an important problem in the security community, particularly as it enables the study of the analog problem of detecting such patterns while maintaining privacy. We frame this problem as an instance of abduction guided by a novel parsimony function represented as an aggregate truth value over a… ▽ More The ability to generate artificial human movement patterns while meeting location and time constraints is an important problem in the security community, particularly as it enables the study of the analog problem of detecting such patterns while maintaining privacy. We frame this problem as an instance of abduction guided by a novel parsimony function represented as an aggregate truth value over an annotated logic program. This approach has the added benefit of affording explainability to an analyst user. By showing that any subset of such a program can provide a lower bound on this parsimony requirement, we are able to abduce movement trajectories efficiently through an informed (i.e., A*) search. We describe how our implementation was enhanced with the application of multiple techniques in order to be scaled and integrated with a cloud-based software stack that included bottom-up rule learning, geolocated knowledge graph retrieval/management, and interfaces with government systems for independently conducted government-run tests for which we provide results. We also report on our own experiments showing that we not only provide exact results but also scale to very large scenarios and provide realistic agent trajectories that can go undetected by machine learning anomaly detectors. △ Less

Submitted 8 July, 2024; originally announced July 2024.

Comments: Accepted at ICLP 2024

arXiv:2406.14082 [pdf, other]

FLoCoRA: Federated learning compression with low-rank adaptation

Authors: Lucas Grativol Ribeiro, Mathieu Leonardon, Guillaume Muller, Virginie Fresse, Matthieu Arzel

Abstract: Low-Rank Adaptation (LoRA) methods have gained popularity in efficient parameter fine-tuning of models containing hundreds of billions of parameters. In this work, instead, we demonstrate the application of LoRA methods to train small-vision models in Federated Learning (FL) from scratch. We first propose an aggregation-agnostic method to integrate LoRA within FL, named FLoCoRA, showing that the m… ▽ More Low-Rank Adaptation (LoRA) methods have gained popularity in efficient parameter fine-tuning of models containing hundreds of billions of parameters. In this work, instead, we demonstrate the application of LoRA methods to train small-vision models in Federated Learning (FL) from scratch. We first propose an aggregation-agnostic method to integrate LoRA within FL, named FLoCoRA, showing that the method is capable of reducing communication costs by 4.8 times, while having less than 1% accuracy degradation, for a CIFAR-10 classification task with a ResNet-8. Next, we show that the same method can be extended with an affine quantization scheme, dividing the communication cost by 18.6 times, while comparing it with the standard method, with still less than 1% of accuracy loss, tested with on a ResNet-18 model. Our formulation represents a strong baseline for message size reduction, even when compared to conventional model compression works, while also reducing the training memory requirements due to the low-rank adaptation. △ Less

Submitted 20 June, 2024; originally announced June 2024.

Journal ref: 32nd European Signal Processing Conference EUSIPCO, Aug 2024, Lyon, France

arXiv:2406.05313 [pdf, other]

Traversing Mars: Cooperative Informative Path Planning to Efficiently Navigate Unknown Scenes

Authors: Friedrich M. Rockenbauer, Jaeyoung Lim, Marcus G. Müller, Roland Siegwart, Lukas Schmid

Abstract: The ability to traverse an unknown environment is crucial for autonomous robot operations. However, due to the limited sensing capabilities and system constraints, approaching this problem with a single robot agent can be slow, costly, and unsafe. For example, in planetary exploration missions, the wear on the wheels of a rover from abrasive terrain should be minimized at all costs as reparations… ▽ More The ability to traverse an unknown environment is crucial for autonomous robot operations. However, due to the limited sensing capabilities and system constraints, approaching this problem with a single robot agent can be slow, costly, and unsafe. For example, in planetary exploration missions, the wear on the wheels of a rover from abrasive terrain should be minimized at all costs as reparations are infeasible. On the other hand, utilizing a scouting robot such as a micro aerial vehicle (MAV) has the potential to reduce wear and time costs and increasing safety of a follower robot. This work proposes a novel cooperative IPP framework that allows a scout (e.g., an MAV) to efficiently explore the minimum-cost-path for a follower (e.g., a rover) to reach the goal. We derive theoretic guarantees for our algorithm, and prove that the algorithm always terminates, always finds the optimal path if it exists, and terminates early when the found path is shown to be optimal or infeasible. We show in thorough experimental evaluation that the guarantees hold in practice, and that our algorithm is 22.5% quicker to find the optimal path and 15% quicker to terminate compared to existing methods. △ Less

Submitted 12 June, 2024; v1 submitted 7 June, 2024; originally announced June 2024.

Comments: 8 pages, 9 figures, code will be available at https://github.com/ethz-asl/scouting-ipp

arXiv:2406.03845 [pdf, other]

Open Problem: Active Representation Learning

Authors: Nikola Milosevic, Gesine Müller, Jan Huisken, Nico Scherf

Abstract: In this work, we introduce the concept of Active Representation Learning, a novel class of problems that intertwines exploration and representation learning within partially observable environments. We extend ideas from Active Simultaneous Localization and Mapping (active SLAM), and translate them to scientific discovery problems, exemplified by adaptive microscopy. We explore the need for a frame… ▽ More In this work, we introduce the concept of Active Representation Learning, a novel class of problems that intertwines exploration and representation learning within partially observable environments. We extend ideas from Active Simultaneous Localization and Mapping (active SLAM), and translate them to scientific discovery problems, exemplified by adaptive microscopy. We explore the need for a framework that derives exploration skills from representations that are in some sense actionable, aiming to enhance the efficiency and effectiveness of data collection and model building in the natural sciences. △ Less

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.04443 [pdf, other]

POV Learning: Individual Alignment of Multimodal Models using Human Perception

Authors: Simon Werner, Katharina Christ, Laura Bernardy, Marion G. Müller, Achim Rettinger

Abstract: Aligning machine learning systems with human expectations is mostly attempted by training with manually vetted human behavioral samples, typically explicit feedback. This is done on a population level since the context that is capturing the subjective Point-Of-View (POV) of a concrete person in a specific situational context is not retained in the data. However, we argue that alignment on an indiv… ▽ More Aligning machine learning systems with human expectations is mostly attempted by training with manually vetted human behavioral samples, typically explicit feedback. This is done on a population level since the context that is capturing the subjective Point-Of-View (POV) of a concrete person in a specific situational context is not retained in the data. However, we argue that alignment on an individual level can boost the subjective predictive performance for the individual user interacting with the system considerably. Since perception differs for each person, the same situation is observed differently. Consequently, the basis for decision making and the subsequent reasoning processes and observable reactions differ. We hypothesize that individual perception patterns can be used for improving the alignment on an individual level. We test this, by integrating perception information into machine learning systems and measuring their predictive performance wrt.~individual subjective assessments. For our empirical study, we collect a novel data set of multimodal stimuli and corresponding eye tracking sequences for the novel task of Perception-Guided Crossmodal Entailment and tackle it with our Perception-Guided Multimodal Transformer. Our findings suggest that exploiting individual perception signals for the machine learning of subjective human assessments provides a valuable cue for individual alignment. It does not only improve the overall predictive performance from the point-of-view of the individual user but might also contribute to steering AI systems towards every person's individual expectations and values. △ Less

Submitted 7 May, 2024; originally announced May 2024.

arXiv:2404.19354 [pdf, other]

PEFSL: A deployment Pipeline for Embedded Few-Shot Learning on a FPGA SoC

Authors: Lucas Grativol Ribeiro, Lubin Gauthier, Mathieu Leonardon, Jérémy Morlier, Antoine Lavrard-Meyer, Guillaume Muller, Virginie Fresse, Matthieu Arzel

Abstract: This paper tackles the challenges of implementing few-shot learning on embedded systems, specifically FPGA SoCs, a vital approach for adapting to diverse classification tasks, especially when the costs of data acquisition or labeling prove to be prohibitively high. Our contributions encompass the development of an end-to-end open-source pipeline for a few-shot learning platform for object classifi… ▽ More This paper tackles the challenges of implementing few-shot learning on embedded systems, specifically FPGA SoCs, a vital approach for adapting to diverse classification tasks, especially when the costs of data acquisition or labeling prove to be prohibitively high. Our contributions encompass the development of an end-to-end open-source pipeline for a few-shot learning platform for object classification on a FPGA SoCs. The pipeline is built on top of the Tensil open-source framework, facilitating the design, training, evaluation, and deployment of DNN backbones tailored for few-shot learning. Additionally, we showcase our work's potential by building and deploying a low-power, low-latency demonstrator trained on the MiniImageNet dataset with a dataflow architecture. The proposed system has a latency of 30 ms while consuming 6.2 W on the PYNQ-Z1 board. △ Less

Submitted 30 April, 2024; originally announced April 2024.

Journal ref: ISCAS 2024 : IEEE International Symposium on Circuits and Systems, May 2024, Singapore, Singapore

arXiv:2403.05441 [pdf, other]

Bayesian Hierarchical Probabilistic Forecasting of Intraday Electricity Prices

Authors: Daniel Nickelsen, Gernot Müller

Abstract: We present a first study of Bayesian forecasting of electricity prices traded on the German continuous intraday market which fully incorporates parameter uncertainty. A particularly large set of endogenous and exogenous covariables is used, handled through feature selection with Orthogonal Matching Pursuit (OMP) and regularising priors. Our target variable is the IDFull price index, forecasts are… ▽ More We present a first study of Bayesian forecasting of electricity prices traded on the German continuous intraday market which fully incorporates parameter uncertainty. A particularly large set of endogenous and exogenous covariables is used, handled through feature selection with Orthogonal Matching Pursuit (OMP) and regularising priors. Our target variable is the IDFull price index, forecasts are given in terms of posterior predictive distributions. For validation we use the exceedingly volatile electricity prices of 2022, which have hardly been the subject of forecasting studies before. As a benchmark model, we use all available intraday transactions at the time of forecast creation to compute a current value for the IDFull. According to the weak-form efficiency hypothesis, it would not be possible to significantly improve this benchmark built from last price information. We do, however, observe statistically significant improvement in terms of both point measures and probability scores. Finally, we challenge the declared gold standard of using LASSO for feature selection in electricity price forecasting by presenting strong statistical evidence that OMP leads to better forecasting performance. △ Less

Submitted 30 July, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: 22 pages, 14 figures, 4 tables. Revised version with an added schematic figure. Under review for Applied Energy

ACM Class: G.3; I.2.6; I.6.3; I.6.5

arXiv:2310.14693 [pdf, other]

Federated learning compression designed for lightweight communications

Authors: Lucas Grativol Ribeiro, Mathieu Leonardon, Guillaume Muller, Virginie Fresse, Matthieu Arzel

Abstract: Federated Learning (FL) is a promising distributed method for edge-level machine learning, particularly for privacysensitive applications such as those in military and medical domains, where client data cannot be shared or transferred to a cloud computing server. In many use-cases, communication cost is a major challenge in FL due to its natural intensive network usage. Client devices, such as sma… ▽ More Federated Learning (FL) is a promising distributed method for edge-level machine learning, particularly for privacysensitive applications such as those in military and medical domains, where client data cannot be shared or transferred to a cloud computing server. In many use-cases, communication cost is a major challenge in FL due to its natural intensive network usage. Client devices, such as smartphones or Internet of Things (IoT) nodes, have limited resources in terms of energy, computation, and memory. To address these hardware constraints, lightweight models and compression techniques such as pruning and quantization are commonly adopted in centralised paradigms. In this paper, we investigate the impact of compression techniques on FL for a typical image classification task. Going further, we demonstrate that a straightforward method can compresses messages up to 50% while having less than 1% of accuracy loss, competing with state-of-the-art techniques. △ Less

Submitted 23 October, 2023; originally announced October 2023.

Journal ref: IEEE 30th International Conference on Electronics, Circuits and Systems, Dec 2023, Istanbul, Turkey

arXiv:2310.02097 [pdf, other]

Leveraging Classic Deconvolution and Feature Extraction in Zero-Shot Image Restoration

Authors: Tomáš Chobola, Gesine Müller, Veit Dausmann, Anton Theileis, Jan Taucher, Jan Huisken, Tingying Peng

Abstract: Non-blind deconvolution aims to restore a sharp image from its blurred counterpart given an obtained kernel. Existing deep neural architectures are often built based on large datasets of sharp ground truth images and trained with supervision. Sharp, high quality ground truth images, however, are not always available, especially for biomedical applications. This severely hampers the applicability o… ▽ More Non-blind deconvolution aims to restore a sharp image from its blurred counterpart given an obtained kernel. Existing deep neural architectures are often built based on large datasets of sharp ground truth images and trained with supervision. Sharp, high quality ground truth images, however, are not always available, especially for biomedical applications. This severely hampers the applicability of current approaches in practice. In this paper, we propose a novel non-blind deconvolution method that leverages the power of deep learning and classic iterative deconvolution algorithms. Our approach combines a pre-trained network to extract deep features from the input image with iterative Richardson-Lucy deconvolution steps. Subsequently, a zero-shot optimisation process is employed to integrate the deconvolved features, resulting in a high-quality reconstructed image. By performing the preliminary reconstruction with the classic iterative deconvolution method, we can effectively utilise a smaller network to produce the final image, thus accelerating the reconstruction whilst reducing the demand for valuable computational resources. Our method demonstrates significant improvements in various real-world applications non-blind deconvolution tasks. △ Less

Submitted 3 October, 2023; originally announced October 2023.

arXiv:2309.01865 [pdf, other]

BigFUSE: Global Context-Aware Image Fusion in Dual-View Light-Sheet Fluorescence Microscopy with Image Formation Prior

Authors: Yu Liu, Gesine Muller, Nassir Navab, Carsten Marr, Jan Huisken, Tingying Peng

Abstract: Light-sheet fluorescence microscopy (LSFM), a planar illumination technique that enables high-resolution imaging of samples, experiences defocused image quality caused by light scattering when photons propagate through thick tissues. To circumvent this issue, dualview imaging is helpful. It allows various sections of the specimen to be scanned ideally by viewing the sample from opposing orientatio… ▽ More Light-sheet fluorescence microscopy (LSFM), a planar illumination technique that enables high-resolution imaging of samples, experiences defocused image quality caused by light scattering when photons propagate through thick tissues. To circumvent this issue, dualview imaging is helpful. It allows various sections of the specimen to be scanned ideally by viewing the sample from opposing orientations. Recent image fusion approaches can then be applied to determine in-focus pixels by comparing image qualities of two views locally and thus yield spatially inconsistent focus measures due to their limited field-of-view. Here, we propose BigFUSE, a global context-aware image fuser that stabilizes image fusion in LSFM by considering the global impact of photon propagation in the specimen while determining focus-defocus based on local image qualities. Inspired by the image formation prior in dual-view LSFM, image fusion is considered as estimating a focus-defocus boundary using Bayes Theorem, where (i) the effect of light scattering onto focus measures is included within Likelihood; and (ii) the spatial consistency regarding focus-defocus is imposed in Prior. The expectation-maximum algorithm is then adopted to estimate the focus-defocus boundary. Competitive experimental results show that BigFUSE is the first dual-view LSFM fuser that is able to exclude structured artifacts when fusing information, highlighting its abilities of automatic image fusion. △ Less

Submitted 3 November, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

Comments: paper in MICCAI 2023

arXiv:2307.07998 [pdf, other]

LUCYD: A Feature-Driven Richardson-Lucy Deconvolution Network

Authors: Tomáš Chobola, Gesine Müller, Veit Dausmann, Anton Theileis, Jan Taucher, Jan Huisken, Tingying Peng

Abstract: The process of acquiring microscopic images in life sciences often results in image degradation and corruption, characterised by the presence of noise and blur, which poses significant challenges in accurately analysing and interpreting the obtained data. This paper proposes LUCYD, a novel method for the restoration of volumetric microscopy images that combines the Richardson-Lucy deconvolution fo… ▽ More The process of acquiring microscopic images in life sciences often results in image degradation and corruption, characterised by the presence of noise and blur, which poses significant challenges in accurately analysing and interpreting the obtained data. This paper proposes LUCYD, a novel method for the restoration of volumetric microscopy images that combines the Richardson-Lucy deconvolution formula and the fusion of deep features obtained by a fully convolutional network. By integrating the image formation process into a feature-driven restoration model, the proposed approach aims to enhance the quality of the restored images whilst reducing computational costs and maintaining a high degree of interpretability. Our results demonstrate that LUCYD outperforms the state-of-the-art methods in both synthetic and real microscopy images, achieving superior performance in terms of image quality and generalisability. We show that the model can handle various microscopy modalities and different imaging conditions by evaluating it on two different microscopy datasets, including volumetric widefield and light-sheet microscopy. Our experiments indicate that LUCYD can significantly improve resolution, contrast, and overall quality of microscopy images. Therefore, it can be a valuable tool for microscopy image restoration and can facilitate further research in various microscopy applications. We made the source code for the model accessible under https://github.com/ctom2/lucyd-deconvolution. △ Less

Submitted 16 July, 2023; originally announced July 2023.

Comments: Accepted by 26th International Conference on Medical Image Computing and Computer Assisted Intervention

arXiv:2206.10670 [pdf, other]

SCIM: Simultaneous Clustering, Inference, and Mapping for Open-World Semantic Scene Understanding

Authors: Hermann Blum, Marcus G. Müller, Abel Gawel, Roland Siegwart, Cesar Cadena

Abstract: In order to operate in human environments, a robot's semantic perception has to overcome open-world challenges such as novel objects and domain gaps. Autonomous deployment to such environments therefore requires robots to update their knowledge and learn without supervision. We investigate how a robot can autonomously discover novel semantic classes and improve accuracy on known classes when explo… ▽ More In order to operate in human environments, a robot's semantic perception has to overcome open-world challenges such as novel objects and domain gaps. Autonomous deployment to such environments therefore requires robots to update their knowledge and learn without supervision. We investigate how a robot can autonomously discover novel semantic classes and improve accuracy on known classes when exploring an unknown environment. To this end, we develop a general framework for mapping and clustering that we then use to generate a self-supervised learning signal to update a semantic segmentation model. In particular, we show how clustering parameters can be optimized during deployment and that fusion of multiple observation modalities improves novel object discovery compared to prior work. Models, data, and implementations can be found at https://github.com/hermannsblum/scim △ Less

Submitted 20 September, 2022; v1 submitted 21 June, 2022; originally announced June 2022.

Comments: accepted at ISRR 2022

arXiv:2206.02523 [pdf, other]

Sparse Bayesian Learning for Complex-Valued Rational Approximations

Authors: Felix Schneider, Iason Papaioannou, Gerhard Müller

Abstract: Surrogate models are used to alleviate the computational burden in engineering tasks, which require the repeated evaluation of computationally demanding models of physical systems, such as the efficient propagation of uncertainties. For models that show a strongly non-linear dependence on their input parameters, standard surrogate techniques, such as polynomial chaos expansion, are not sufficient… ▽ More Surrogate models are used to alleviate the computational burden in engineering tasks, which require the repeated evaluation of computationally demanding models of physical systems, such as the efficient propagation of uncertainties. For models that show a strongly non-linear dependence on their input parameters, standard surrogate techniques, such as polynomial chaos expansion, are not sufficient to obtain an accurate representation of the original model response. Through applying a rational approximation instead, the approximation error can be efficiently reduced for models whose non-linearity is accurately described through a rational function. Specifically, our aim is to approximate complex-valued models. A common approach to obtain the coefficients in the surrogate is to minimize the sample-based error between model and surrogate in the least-square sense. In order to obtain an accurate representation of the original model and to avoid overfitting, the sample set has be two to three times the number of polynomial terms in the expansion. For models that require a high polynomial degree or are high-dimensional in terms of their input parameters, this number often exceeds the affordable computational cost. To overcome this issue, we apply a sparse Bayesian learning approach to the rational approximation. Through a specific prior distribution structure, sparsity is induced in the coefficients of the surrogate model. The denominator polynomial coefficients as well as the hyperparameters of the problem are determined through a type-II-maximum likelihood approach. We apply a quasi-Newton gradient-descent algorithm in order to find the optimal denominator coefficients and derive the required gradients through application of $\mathbb{CR}$-calculus. △ Less

Submitted 27 September, 2022; v1 submitted 6 June, 2022; originally announced June 2022.

Comments: 27 pages, 13 figures

arXiv:2205.06520 [pdf]

doi 10.1016/j.comgeo.2022.101941

Simplex Closing Probabilities in Directed Graphs

Authors: Florian Unger, Jonathan Krebs, Michael G. Müller

Abstract: Recent work in mathematical neuroscience has calculated the directed graph homology of the directed simplicial complex given by the brains sparse adjacency graph, the so called connectome. These biological connectomes show an abundance of both high-dimensional directed simplices and Betti-numbers in all viable dimensions - in contrast to Erdős-Rényi-graphs of comparable size and density. An anal… ▽ More Recent work in mathematical neuroscience has calculated the directed graph homology of the directed simplicial complex given by the brains sparse adjacency graph, the so called connectome. These biological connectomes show an abundance of both high-dimensional directed simplices and Betti-numbers in all viable dimensions - in contrast to Erdős-Rényi-graphs of comparable size and density. An analysis of synthetically trained connectomes reveals similar findings, raising questions about the graphs comparability and the nature of origin of the simplices. We present a new method capable of delivering insight into the emergence of simplices and thus simplicial abundance. Our approach allows to easily distinguish simplex-rich connectomes of different origin. The method relies on the novel concept of an almost-d-simplex, that is, a simplex missing exactly one edge, and consequently the almost-d-simplex closing probability by dimension. We also describe a fast algorithm to identify almost-d-simplices in a given graph. Applying this method to biological and artificial data allows us to identify a mechanism responsible for simplex emergence, and suggests this mechanism is responsible for the simplex signature of the excitatory subnetwork of a statistical reconstruction of the mouse primary visual cortex. Our highly optimised code for this new method is publicly available. △ Less

Submitted 13 May, 2022; originally announced May 2022.

MSC Class: 05E45 (Primary); 05C85 (Secondary); 05c20 (Secondary)

arXiv:2109.09690 [pdf, other]

Trust Your Robots! Predictive Uncertainty Estimation of Neural Networks with Sparse Gaussian Processes

Authors: Jongseok Lee, Jianxiang Feng, Matthias Humt, Marcus G. Müller, Rudolph Triebel

Abstract: This paper presents a probabilistic framework to obtain both reliable and fast uncertainty estimates for predictions with Deep Neural Networks (DNNs). Our main contribution is a practical and principled combination of DNNs with sparse Gaussian Processes (GPs). We prove theoretically that DNNs can be seen as a special case of sparse GPs, namely mixtures of GP experts (MoE-GP), and we devise a learn… ▽ More This paper presents a probabilistic framework to obtain both reliable and fast uncertainty estimates for predictions with Deep Neural Networks (DNNs). Our main contribution is a practical and principled combination of DNNs with sparse Gaussian Processes (GPs). We prove theoretically that DNNs can be seen as a special case of sparse GPs, namely mixtures of GP experts (MoE-GP), and we devise a learning algorithm that brings the derived theory into practice. In experiments from two different robotic tasks -- inverse dynamics of a manipulator and object detection on a micro-aerial vehicle (MAV) -- we show the effectiveness of our approach in terms of predictive uncertainty, improved scalability, and run-time efficiency on a Jetson TX2. We thus argue that our approach can pave the way towards reliable and fast robot learning systems with uncertainty awareness. △ Less

Submitted 21 September, 2021; v1 submitted 20 September, 2021; originally announced September 2021.

Comments: 12 pages, 6 figures and 1 table. Accepted at the 5th Conference on Robot Learning (CORL 2021), London, UK

arXiv:2109.05509 [pdf, other]

Towards Robust Monocular Visual Odometry for Flying Robots on Planetary Missions

Authors: Martin Wudenka, Marcus G. Müller, Nikolaus Demmel, Armin Wedler, Rudolph Triebel, Daniel Cremers, Wolfgang Stürzl

Abstract: In the future, extraterrestrial expeditions will not only be conducted by rovers but also by flying robots. The technical demonstration drone Ingenuity, that just landed on Mars, will mark the beginning of a new era of exploration unhindered by terrain traversability. Robust self-localization is crucial for that. Cameras that are lightweight, cheap and information-rich sensors are already used to… ▽ More In the future, extraterrestrial expeditions will not only be conducted by rovers but also by flying robots. The technical demonstration drone Ingenuity, that just landed on Mars, will mark the beginning of a new era of exploration unhindered by terrain traversability. Robust self-localization is crucial for that. Cameras that are lightweight, cheap and information-rich sensors are already used to estimate the ego-motion of vehicles. However, methods proven to work in man-made environments cannot simply be deployed on other planets. The highly repetitive textures present in the wastelands of Mars pose a huge challenge to descriptor matching based approaches. In this paper, we present an advanced robust monocular odometry algorithm that uses efficient optical flow tracking to obtain feature correspondences between images and a refined keyframe selection criterion. In contrast to most other approaches, our framework can also handle rotation-only motions that are particularly challenging for monocular odometry systems. Furthermore, we present a novel approach to estimate the current risk of scale drift based on a principal component analysis of the relative translation information matrix. This way we obtain an implicit measure of uncertainty. We evaluate the validity of our approach on all sequences of a challenging real-world dataset captured in a Mars-like environment and show that it outperforms state-of-the-art approaches. △ Less

Submitted 12 September, 2021; originally announced September 2021.

Comments: Accepted to IROS 2021. Updated version corresponding to IROS camera-ready. The source code is publicly available at: https://github.com/DLR-RM/granite

arXiv:2103.10158 [pdf, other]

TrivialAugment: Tuning-free Yet State-of-the-Art Data Augmentation

Authors: Samuel G. Müller, Frank Hutter

Abstract: Automatic augmentation methods have recently become a crucial pillar for strong model performance in vision tasks. While existing automatic augmentation methods need to trade off simplicity, cost and performance, we present a most simple baseline, TrivialAugment, that outperforms previous methods for almost free. TrivialAugment is parameter-free and only applies a single augmentation to each image… ▽ More Automatic augmentation methods have recently become a crucial pillar for strong model performance in vision tasks. While existing automatic augmentation methods need to trade off simplicity, cost and performance, we present a most simple baseline, TrivialAugment, that outperforms previous methods for almost free. TrivialAugment is parameter-free and only applies a single augmentation to each image. Thus, TrivialAugment's effectiveness is very unexpected to us and we performed very thorough experiments to study its performance. First, we compare TrivialAugment to previous state-of-the-art methods in a variety of image classification scenarios. Then, we perform multiple ablation studies with different augmentation spaces, augmentation methods and setups to understand the crucial requirements for its performance. Additionally, we provide a simple interface to facilitate the widespread adoption of automatic augmentation methods, as well as our full code base for reproducibility. Since our work reveals a stagnation in many parts of automatic augmentation research, we end with a short proposal of best practices for sustained future progress in automatic augmentation methods. △ Less

Submitted 17 August, 2021; v1 submitted 18 March, 2021; originally announced March 2021.

Comments: Accepted to ICCV 2021 as Oral

arXiv:2012.07259 [pdf, other]

AndroEvolve: Automated Update for Android Deprecated-API Usages

Authors: Stefanus Agus Haryono, Ferdian Thung, David Lo, Lingxiao Jiang, Julia Lawall, Hong Jin Kang, Lucas Serrano, Gilles Muller

Abstract: Android operating system (OS) is often updated, where each new version may involve API deprecation. Usages of deprecated APIs in Android apps need to be updated to ensure the apps' compatibility with the old and new versions of Android OS. In this work, we propose AndroEvolve, an automated tool to update usages of deprecated Android APIs, that addresses the limitations of the state-of-the-art tool… ▽ More Android operating system (OS) is often updated, where each new version may involve API deprecation. Usages of deprecated APIs in Android apps need to be updated to ensure the apps' compatibility with the old and new versions of Android OS. In this work, we propose AndroEvolve, an automated tool to update usages of deprecated Android APIs, that addresses the limitations of the state-of-the-art tool, CocciEvolve. AndroEvolve utilizes data flow analysis to solve the problem of out-of-method-boundary variables, and variable denormalization to remove the temporary variables introduced by CocciEvolve. We evaluated the accuracy of AndroEvolve using a dataset of 360 target files and 20 deprecated Android APIs, where AndroEvolve is able to produce 319 correct updates, compared to CocciEvolve which only produces 249 correct updates. We also evaluated the readability of AndroEvolve's update results using a manual and an automatic evaluation. Both evaluations demonstrated that the code produced by AndroEvolve has higher readability than CocciEvolve's. A video demonstration of AndroEvolve is available at https://youtu.be/siU0tuMITXI. △ Less

Submitted 11 February, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

arXiv:2012.01141 [pdf, other]

Algebraically-Informed Deep Networks (AIDN): A Deep Learning Approach to Represent Algebraic Structures

Authors: Mustafa Hajij, Ghada Zamzmi, Matthew Dawson, Greg Muller

Abstract: One of the central problems in the interface of deep learning and mathematics is that of building learning systems that can automatically uncover underlying mathematical laws from observed data. In this work, we make one step towards building a bridge between algebraic structures and deep learning, and introduce \textbf{AIDN}, \textit{Algebraically-Informed Deep Networks}. \textbf{AIDN} is a deep… ▽ More One of the central problems in the interface of deep learning and mathematics is that of building learning systems that can automatically uncover underlying mathematical laws from observed data. In this work, we make one step towards building a bridge between algebraic structures and deep learning, and introduce \textbf{AIDN}, \textit{Algebraically-Informed Deep Networks}. \textbf{AIDN} is a deep learning algorithm to represent any finitely-presented algebraic object with a set of deep neural networks. The deep networks obtained via \textbf{AIDN} are \textit{algebraically-informed} in the sense that they satisfy the algebraic relations of the presentation of the algebraic structure that serves as the input to the algorithm. Our proposed network can robustly compute linear and non-linear representations of most finitely-presented algebraic structures such as groups, associative algebras, and Lie algebras. We evaluate our proposed approach and demonstrate its applicability to algebraic and geometric objects that are significant in low-dimensional topology. In particular, we study solutions for the Yang-Baxter equations and their applications on braid groups. Further, we study the representations of the Temperley-Lieb algebra. Finally, we show, using the Reshetikhin-Turaev construction, how our proposed deep learning approach can be utilized to construct new link invariants. We believe the proposed approach would tread a path toward a promising future research in deep learning applied to algebraic and geometric structures. △ Less

Submitted 12 February, 2021; v1 submitted 2 December, 2020; originally announced December 2020.

arXiv:2011.05020 [pdf, other]

AndroEvolve: Automated Android API Update with Data Flow Analysis and Variable Denormalization

Authors: Stefanus A. Haryono, Ferdian Thung, David Lo, Lingxiao Jiang, Julia Lawall, Hong Jin Kang, Lucas Serrano, Gilles Muller

Abstract: The Android operating system is frequently updated, with each version bringing a new set of APIs. New versions may involve API deprecation; Android apps using deprecated APIs need to be updated to ensure the apps' compatibility withold and new versions of Android. Updating deprecated APIs is a time-consuming endeavor. Hence, automating the updates of Android APIs can be beneficial for developers.… ▽ More The Android operating system is frequently updated, with each version bringing a new set of APIs. New versions may involve API deprecation; Android apps using deprecated APIs need to be updated to ensure the apps' compatibility withold and new versions of Android. Updating deprecated APIs is a time-consuming endeavor. Hence, automating the updates of Android APIs can be beneficial for developers. CocciEvolve is the state-of-the-art approach for this automation. However, it has several limitations, including its inability to resolve out-of-method-boundary variables and the low code readability of its update due to the addition of temporary variables. In an attempt to further improve the performance of automated Android API update, we propose an approach named AndroEvolve, which addresses the limitations of CocciEvolve through the addition of data flow analysis and variable name denormalization. Data flow analysis enables AndroEvolve to resolve the value of any variable within the file scope. Variable name denormalization replaces temporary variables that may present in the CocciEvolve update with appropriate values in the target file. We have evaluated the performance of AndroEvolve and the readability of its updates on 360 target files. AndroEvolve produces 26.90% more instances of correct updates compared to CocciEvolve. Moreover, our manual and automated evaluation shows that AndroEvolve updates are more readable than CocciEvolve updates. △ Less

Submitted 10 November, 2020; originally announced November 2020.

arXiv:2006.11099 [pdf, other]

doi 10.1371/journal.pcbi.1009753

Cortical oscillations implement a backbone for sampling-based computation in spiking neural networks

Authors: Agnes Korcsak-Gorzo, Michael G. Müller, Andreas Baumbach, Luziwei Leng, Oliver Julien Breitwieser, Sacha J. van Albada, Walter Senn, Karlheinz Meier, Robert Legenstein, Mihai A. Petrovici

Abstract: Being permanently confronted with an uncertain world, brains have faced evolutionary pressure to represent this uncertainty in order to respond appropriately. Often, this requires visiting multiple interpretations of the available information or multiple solutions to an encountered problem. This gives rise to the so-called mixing problem: since all of these "valid" states represent powerful attrac… ▽ More Being permanently confronted with an uncertain world, brains have faced evolutionary pressure to represent this uncertainty in order to respond appropriately. Often, this requires visiting multiple interpretations of the available information or multiple solutions to an encountered problem. This gives rise to the so-called mixing problem: since all of these "valid" states represent powerful attractors, but between themselves can be very dissimilar, switching between such states can be difficult. We propose that cortical oscillations can be effectively used to overcome this challenge. By acting as an effective temperature, background spiking activity modulates exploration. Rhythmic changes induced by cortical oscillations can then be interpreted as a form of simulated tempering. We provide a rigorous mathematical discussion of this link and study some of its phenomenological implications in computer simulations. This identifies a new computational role of cortical oscillations and connects them to various phenomena in the brain, such as sampling-based probabilistic inference, memory replay, multisensory cue combination, and place cell flickering. △ Less

Submitted 4 April, 2022; v1 submitted 19 June, 2020; originally announced June 2020.

Comments: 34 pages, 9 figures

Journal ref: PLoS Comput Biol 18(3): e1009753 (2022)

arXiv:2006.00380 [pdf, ps, other]

Memory virtualization in virtualized systems: segmentation is better than paging

Authors: Boris Teabe, Peterson Yuhala, Alain Tchana, Fabien Hermenier, Daniel Hagimont, Gilles Muller

Abstract: The utilization of paging for virtual machine (VM) memory management is the root cause of memory virtualization overhead. This paper shows that paging is not necessary in the hypervisor. In fact, memory fragmentation, which explains paging utilization, is not an issue in virtualized datacenters thanks to VM memory demand patterns. Our solution Compromis, a novel Memory Management Unit, uses direct… ▽ More The utilization of paging for virtual machine (VM) memory management is the root cause of memory virtualization overhead. This paper shows that paging is not necessary in the hypervisor. In fact, memory fragmentation, which explains paging utilization, is not an issue in virtualized datacenters thanks to VM memory demand patterns. Our solution Compromis, a novel Memory Management Unit, uses direct segment for VM memory management combined with paging for VM's processes. The paper presents a systematic methodology for implementing Compromis in the hardware, the hypervisor and the datacenter scheduler. Evaluation results show that Compromis outperforms the two popular memory virtualization solutions: shadow paging and Extended Page Table by up to 30% and 370% respectively. △ Less

Submitted 30 May, 2020; originally announced June 2020.

arXiv:2005.13220 [pdf, other]

Automatic Android Deprecated-API Usage Update by Learning from Single Updated Example

Authors: Stefanus Agus Haryono, Ferdian Thung, Hong Jin Kang, Lucas Serrano, Gilles Muller, Julia Lawall, David Lo, Lingxiao Jiang

Abstract: Due to the deprecation of APIs in the Android operating system,developers have to update usages of the APIs to ensure that their applications work for both the past and current versions of Android.Such updates may be widespread, non-trivial, and time-consuming. Therefore, automation of such updates will be of great benefit to developers. AppEvolve, which is the state-of-the-art tool for automating… ▽ More Due to the deprecation of APIs in the Android operating system,developers have to update usages of the APIs to ensure that their applications work for both the past and current versions of Android.Such updates may be widespread, non-trivial, and time-consuming. Therefore, automation of such updates will be of great benefit to developers. AppEvolve, which is the state-of-the-art tool for automating such updates, relies on having before- and after-update examples to learn from. In this work, we propose an approach named CocciEvolve that performs such updates using only a single after-update example. CocciEvolve learns edits by extracting the relevant update to a block of code from an after-update example. From preliminary experiments, we find that CocciEvolve can successfully perform 96 out of 112 updates, with a success rate of 85%. △ Less

Submitted 27 May, 2020; originally announced May 2020.

Comments: 5 pages, 8 figures. Accepted in The International Conference on Program Comprehension (ICPC) 2020, ERA Track

ACM Class: I.2.2

arXiv:1902.10680 [pdf, other]

Analyzing the Perceived Severity of Cybersecurity Threats Reported on Social Media

Authors: Shi Zong, Alan Ritter, Graham Mueller, Evan Wright

Abstract: Breaking cybersecurity events are shared across a range of websites, including security blogs (FireEye, Kaspersky, etc.), in addition to social media platforms such as Facebook and Twitter. In this paper, we investigate methods to analyze the severity of cybersecurity threats based on the language that is used to describe them online. A corpus of 6,000 tweets describing software vulnerabilities is… ▽ More Breaking cybersecurity events are shared across a range of websites, including security blogs (FireEye, Kaspersky, etc.), in addition to social media platforms such as Facebook and Twitter. In this paper, we investigate methods to analyze the severity of cybersecurity threats based on the language that is used to describe them online. A corpus of 6,000 tweets describing software vulnerabilities is annotated with authors' opinions toward their severity. We show that our corpus supports the development of automatic classifiers with high precision for this task. Furthermore, we demonstrate the value of analyzing users' opinions about the severity of threats reported online as an early indicator of important software vulnerabilities. We present a simple, yet effective method for linking software vulnerabilities reported in tweets to Common Vulnerabilities and Exposures (CVEs) in the National Vulnerability Database (NVD). Using our predicted severity scores, we show that it is possible to achieve a Precision@50 of 0.86 when forecasting high severity vulnerabilities, significantly outperforming a baseline that is based on tweet volume. Finally we show how reports of severe vulnerabilities online are predictive of real-world exploits. △ Less

Submitted 3 May, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

Comments: Accepted at NAACL 2019

arXiv:1802.07094 [pdf, other]

Camera-based vehicle velocity estimation from monocular video

Authors: Moritz Kampelmühler, Michael G. Müller, Christoph Feichtenhofer

Abstract: This paper documents the winning entry at the CVPR2017 vehicle velocity estimation challenge. Velocity estimation is an emerging task in autonomous driving which has not yet been thoroughly explored. The goal is to estimate the relative velocity of a specific vehicle from a sequence of images. In this paper, we present a light-weight approach for directly regressing vehicle velocities from their t… ▽ More This paper documents the winning entry at the CVPR2017 vehicle velocity estimation challenge. Velocity estimation is an emerging task in autonomous driving which has not yet been thoroughly explored. The goal is to estimate the relative velocity of a specific vehicle from a sequence of images. In this paper, we present a light-weight approach for directly regressing vehicle velocities from their trajectories using a multilayer perceptron. Another contribution is an explorative study of features for monocular vehicle velocity estimation. We find that light-weight trajectory based features outperform depth and motion cues extracted from deep ConvNets, especially for far-distance predictions where current disparity and optical flow estimators are challenged significantly. Our light-weight approach is real-time capable on a single CPU and outperforms all competing entries in the velocity estimation challenge. On the test set, we report an average error of 1.12 m/s which is comparable to a (ground-truth) system that combines LiDAR and radar techniques to achieve an error of around 0.71 m/s. △ Less

Submitted 20 February, 2018; originally announced February 2018.

Comments: 8 pages, 5 figures, in CVWW2018

arXiv:1409.3367 [pdf, other]

HTML5 WebSocket protocol and its application to distributed computing

Authors: Gabriel L. Muller

Abstract: HTML5 WebSocket protocol brings real time communication in web browsers to a new level. Daily, new products are designed to stay permanently connected to the web. WebSocket is the technology enabling this revolution. WebSockets are supported by all current browsers, but it is still a new technology in constant evolution. WebSockets are slowly replacing older client-server communication technolog… ▽ More HTML5 WebSocket protocol brings real time communication in web browsers to a new level. Daily, new products are designed to stay permanently connected to the web. WebSocket is the technology enabling this revolution. WebSockets are supported by all current browsers, but it is still a new technology in constant evolution. WebSockets are slowly replacing older client-server communication technologies. As opposed to comet-like technologies WebSockets' remarkable performances is a result of the protocol's fully duplex nature and because it doesn't rely on HTTP communications. To begin with this paper studies the WebSocket protocol and different WebSocket servers implementations. This first theoretic part focuses more deeply on heterogeneous implementations and OpenCL. The second part is a benchmark of a new promising library. The real-time engine used for testing purposes is SocketCluster. SocketCluster provides a highly scalable WebSocket server that makes use of all available cpu cores on an instance. The scope of this work is reduced to vertical scaling of SocketCluster. △ Less

Submitted 11 September, 2014; originally announced September 2014.

arXiv:1407.4346 [pdf, ps, other]

doi 10.1145/2619090

Faults in Linux 2.6

Authors: Nicolas Palix, Gaël Thomas, Suman Saha, Christophe Calvès, Gilles Muller, Julia L. Lawall

Abstract: In August 2011, Linux entered its third decade. Ten years before, Chou et al. published a study of faults found by applying a static analyzer to Linux versions 1.0 through 2.4.1. A major result of their work was that the drivers directory contained up to 7 times more of certain kinds of faults than other directories. This result inspired numerous efforts on improving the reliability of driver code… ▽ More In August 2011, Linux entered its third decade. Ten years before, Chou et al. published a study of faults found by applying a static analyzer to Linux versions 1.0 through 2.4.1. A major result of their work was that the drivers directory contained up to 7 times more of certain kinds of faults than other directories. This result inspired numerous efforts on improving the reliability of driver code. Today, Linux is used in a wider range of environments, provides a wider range of services, and has adopted a new development and release model. What has been the impact of these changes on code quality? To answer this question, we have transported Chou et al.'s experiments to all versions of Linux 2.6; released between 2003 and 2011. We find that Linux has more than doubled in size during this period, but the number of faults per line of code has been decreasing. Moreover, the fault rate of drivers is now below that of other directories, such as arch. These results can guide further development and research efforts for the decade to come. To allow updating these results as Linux evolves, we define our experimental protocol and make our checkers available. △ Less

Submitted 16 July, 2014; originally announced July 2014.

Journal ref: ACM Transactions on Computer Systems 32, 2 (2014) 1--40

arXiv:1207.0405 [pdf, other]

MITRA: A Meta-Model for Information Flow in Trust and Reputation Architectures

Authors: Eugen Staab, Guillaume Muller

Abstract: We propose MITRA, a meta-model for the information flow in (computational) trust and reputation architectures. On an abstract level, MITRA describes the information flow as it is inherent in prominent trust and reputation models from the literature. We use MITRA to provide a structured comparison of these models. This makes it possible to get a clear overview of the complex research area. Furtherm… ▽ More We propose MITRA, a meta-model for the information flow in (computational) trust and reputation architectures. On an abstract level, MITRA describes the information flow as it is inherent in prominent trust and reputation models from the literature. We use MITRA to provide a structured comparison of these models. This makes it possible to get a clear overview of the complex research area. Furthermore, by doing so, we identify interesting new approaches for trust and reputation modeling that so far have not been investigated. △ Less

Submitted 2 July, 2012; originally announced July 2012.

Comments: 19 pages, 2 figures, 2 tables. Keywords: Computational Trust, Reputation Systems, Meta Model

MSC Class: 68T42 ACM Class: I.2.11

arXiv:0704.1373 [pdf, ps, other]

A Language-Based Approach for Improving the Robustness of Network Application Protocol Implementations

Authors: Burgy Laurent, Laurent Réveillère, Julia Lawall, Gilles Muller

Abstract: The secure and robust functioning of a network relies on the defect-free implementation of network applications. As network protocols have become increasingly complex, however, hand-writing network message processing code has become increasingly error-prone. In this paper, we present a domain-specific language, Zebu, for describing protocol message formats and related processing constraints. Fro… ▽ More The secure and robust functioning of a network relies on the defect-free implementation of network applications. As network protocols have become increasingly complex, however, hand-writing network message processing code has become increasingly error-prone. In this paper, we present a domain-specific language, Zebu, for describing protocol message formats and related processing constraints. From a Zebu specification, a compiler automatically generates stubs to be used by an application to parse network messages. Zebu is easy to use, as it builds on notations used in RFCs to describe protocol grammars. Zebu is also efficient, as the memory usage is tailored to application needs and message fragments can be specified to be processed on demand. Finally, Zebu-based applications are robust, as the Zebu compiler automatically checks specification consistency and generates parsing stubs that include validation of the message structure. Using a mutation analysis in the context of SIP and RTSP, we show that Zebu significantly improves application robustness. △ Less

Submitted 11 April, 2007; originally announced April 2007.

Showing 1–30 of 30 results for author: Müller, G