-
MyoGestic: EMG Interfacing Framework for Decoding Multiple Spared Degrees of Freedom of the Hand in Individuals with Neural Lesions
Authors:
Raul C. Sîmpetru,
Dominik I. Braun,
Arndt U. Simon,
Michael März,
Vlad Cnejevici,
Daniela Souza de Oliveira,
Nico Weber,
Jonas Walter,
Jörg Franke,
Daniel Höglinger,
Cosima Prahm,
Matthias Ponfick,
Alessandro Del Vecchio
Abstract:
Restoring limb motor function in individuals with spinal cord injury (SCI), stroke, or amputation remains a critical challenge, one which affects millions worldwide. Recent studies show through surface electromyography (EMG) that spared motor neurons can still be voluntarily controlled, even without visible limb movement. These signals can be decoded and used for motor intent estimation; however, current wearable solutions lack the necessary hardware and software for intuitive interfacing of the spared degrees of freedom after neural injuries. To address these limitations, we developed a wireless, high-density EMG bracelet, coupled with a novel software framework, MyoGestic. Our system allows rapid and tailored adaptability of machine learning models to the needs of the users, facilitating real-time decoding of multiple spared distinctive degrees of freedom. In our study, we successfully decoded the motor intent from two participants with SCI, two with spinal stroke, and three amputees in real-time, achieving several controllable degrees of freedom within minutes after wearing the EMG bracelet. We provide a proof-of-concept that these decoded signals can be used to control a digitally rendered hand, a wearable orthosis, a prosthesis, or a 2D cursor. Our framework promotes a participant-centered approach, allowing immediate feedback integration, thus enhancing the iterative development of myocontrol algorithms. The proposed open-source software framework, MyoGestic, allows researchers and patients to focus on the augmentation and training of the spared degrees of freedom after neural lesions, thus potentially bridging the gap between research and clinical application and advancing the development of intuitive EMG interfaces for diverse neural lesions.
Submitted 14 August, 2024;
originally announced August 2024.
-
Beware of Validation by Eye: Visual Validation of Linear Trends in Scatterplots
Authors:
Daniel Braun,
Remco Chang,
Michael Gleicher,
Tatiana von Landesberger
Abstract:
Visual validation of regression models in scatterplots is a common practice for assessing model quality, yet its efficacy remains unquantified. We conducted two empirical experiments to investigate individuals' ability to visually validate linear regression models (linear trends) and to examine the impact of common visualization designs on validation quality. The first experiment showed that the level of accuracy for visual estimation of slope (i.e., fitting a line to data) is higher than for visual validation of slope (i.e., accepting a shown line). Notably, we found bias toward slopes that are "too steep" in both cases. This led to the novel insight that participants naturally assessed regression with orthogonal distances between the points and the line (i.e., ODR regression) rather than the common vertical distances (OLS regression). In the second experiment, we investigated whether incorporating common designs for regression visualization (error lines, bounding boxes, and confidence intervals) would improve visual validation. Even though error lines reduced validation bias, results failed to show the desired improvements in accuracy for any design. Overall, our findings suggest caution in using visual model validation for linear trends in scatterplots.
Submitted 16 July, 2024;
originally announced July 2024.
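The contrast between vertical (OLS) and orthogonal (ODR) distances in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis code: the function names and the sample data are hypothetical, and the ODR slope uses the standard closed form for orthogonal regression with equal error variance in both axes. On positively correlated data the orthogonal fit is at least as steep as the OLS fit, which matches the direction of the "too steep" bias the abstract reports.

```python
import math


def _centered_sums(xs, ys):
    """Return the centered sums of squares and cross-products (Sxx, Syy, Sxy)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxx, syy, sxy


def ols_slope(xs, ys):
    """Slope that minimizes squared vertical distances (ordinary least squares)."""
    sxx, _, sxy = _centered_sums(xs, ys)
    return sxy / sxx


def odr_slope(xs, ys):
    """Slope that minimizes squared orthogonal distances.

    Assumes equal error variance in x and y and a non-zero covariance.
    """
    sxx, syy, sxy = _centered_sums(xs, ys)
    diff = syy - sxx
    return (diff + math.sqrt(diff ** 2 + 4 * sxy ** 2)) / (2 * sxy)


# Hypothetical scatterplot data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.5, 0.8, 2.9, 2.7, 4.6]

print("OLS slope:", ols_slope(xs, ys))
print("ODR slope:", odr_slope(xs, ys))
```

For perfectly collinear points the two slopes coincide; the gap grows as the scatter around the line increases.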
-
AGB-DE: A Corpus for the Automated Legal Assessment of Clauses in German Consumer Contracts
Authors:
Daniel Braun,
Florian Matthes
Abstract:
Legal tasks and datasets are often used as benchmarks for the capabilities of language models. However, openly available annotated datasets are rare. In this paper, we introduce AGB-DE, a corpus of 3,764 clauses from German consumer contracts that have been annotated and legally assessed by legal experts. Together with the data, we present a first baseline for the task of detecting potentially void clauses, comparing the performance of an SVM baseline with three fine-tuned open language models and the performance of GPT-3.5. Our results show the challenging nature of the task, with no approach exceeding an F1-score of 0.54. While the fine-tuned models often performed better with regard to precision, GPT-3.5 outperformed the other approaches with regard to recall. An analysis of the errors indicates that one of the main challenges could be the correct interpretation of complex clauses, rather than the decision boundaries of what is permissible and what is not.
Submitted 10 June, 2024;
originally announced June 2024.
-
Identifying Functionally Important Features with End-to-End Sparse Dictionary Learning
Authors:
Dan Braun,
Jordan Taylor,
Nicholas Goldowsky-Dill,
Lee Sharkey
Abstract:
Identifying the features learned by neural networks is a core challenge in mechanistic interpretability. Sparse autoencoders (SAEs), which learn a sparse, overcomplete dictionary that reconstructs a network's internal activations, have been used to identify these features. However, SAEs may learn more about the structure of the dataset than the computational structure of the network. There is therefore only indirect reason to believe that the directions found in these dictionaries are functionally important to the network. We propose end-to-end (e2e) sparse dictionary learning, a method for training SAEs that ensures the features learned are functionally important by minimizing the KL divergence between the output distributions of the original model and the model with SAE activations inserted. Compared to standard SAEs, e2e SAEs offer a Pareto improvement: They explain more network performance, require fewer total features, and require fewer simultaneously active features per datapoint, all with no cost to interpretability. We explore geometric and qualitative differences between e2e SAE features and standard SAE features. E2e dictionary learning brings us closer to methods that can explain network behavior concisely and accurately. We release our library for training e2e SAEs and reproducing our analysis at https://github.com/ApolloResearch/e2e_sae
Submitted 24 May, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
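The training signal described in the abstract above, a KL divergence between the output distribution of the original model and that of the model with SAE activations inserted, can be sketched for a single token position as follows. This is a minimal illustration and not the API of the released e2e_sae library: the function names and logit values are hypothetical, and the direction KL(original || SAE-patched) is an assumption, since the abstract does not state it.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return [z - log_norm for z in logits]


def kl_divergence(original_logits, patched_logits):
    """KL(P || Q), with P from the original model and Q from the patched model.

    The direction is an assumption made for this sketch.
    """
    log_p = log_softmax(original_logits)
    log_q = log_softmax(patched_logits)
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


# Hypothetical logits over a four-token vocabulary.
original = [2.0, 0.5, -1.0, 0.0]
patched = [1.6, 0.7, -0.8, 0.1]

print("KL(original || patched):", kl_divergence(original, patched))
```

The divergence is zero when the patched model reproduces the original output distribution exactly, so driving it down rewards dictionary features that preserve the network's behavior rather than only its activations.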
-
DAC-JAX: A JAX Implementation of the Descript Audio Codec
Authors:
David Braun
Abstract:
We present an open-source implementation of the Descript Audio Codec (DAC) using Google's JAX ecosystem of Flax, Optax, Orbax, AUX, and CLU. Our codebase enables the reuse of model weights from the original PyTorch DAC, and we confirm that the two implementations produce equivalent token sequences and decoded audio if given the same input. We provide a training and fine-tuning script which supports device parallelism, although we have only verified it using brief training runs with a small dataset. Even with limited GPU memory, the original DAC can compress or decompress a long audio file by processing it as a sequence of overlapping "chunks." We implement this feature in JAX and benchmark the performance on two types of GPUs. On a consumer-grade GPU, DAC-JAX outperforms the original DAC for compression and decompression at all chunk sizes. However, on a high-performance, cluster-based GPU, DAC-JAX outperforms the original DAC for small chunk sizes but performs worse for large chunks.
Submitted 19 May, 2024;
originally announced May 2024.
-
The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks
Authors:
Lucius Bushnaq,
Stefan Heimersheim,
Nicholas Goldowsky-Dill,
Dan Braun,
Jake Mendel,
Kaarel Hänni,
Avery Griffin,
Jörn Stöhler,
Magdalena Wache,
Marius Hobbhahn
Abstract:
Mechanistic interpretability aims to understand the behavior of neural networks by reverse-engineering their internal computations. However, current methods struggle to find clear interpretations of neural network activations because a decomposition of activations into computational features is missing. Individual neurons or model components do not cleanly correspond to distinct features or functions. We present a novel interpretability method that aims to overcome this limitation by transforming the activations of the network into a new basis - the Local Interaction Basis (LIB). LIB aims to identify computational features by removing irrelevant activations and interactions. Our method drops irrelevant activation directions and aligns the basis with the singular vectors of the Jacobian matrix between adjacent layers. It also scales features based on their importance for downstream computation, producing an interaction graph that shows all computationally-relevant features and interactions in a model. We evaluate the effectiveness of LIB on modular addition and CIFAR-10 models, finding that it identifies more computationally-relevant features that interact more sparsely, compared to principal component analysis. However, LIB does not yield substantial improvements in interpretability or interaction sparsity when applied to language models. We conclude that LIB is a promising theory-driven approach for analyzing neural networks, but in its current form is not applicable to large language models.
Submitted 20 May, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
-
Using Degeneracy in the Loss Landscape for Mechanistic Interpretability
Authors:
Lucius Bushnaq,
Jake Mendel,
Stefan Heimersheim,
Dan Braun,
Nicholas Goldowsky-Dill,
Kaarel Hänni,
Cindy Wu,
Marius Hobbhahn
Abstract:
Mechanistic Interpretability aims to reverse engineer the algorithms implemented by neural networks by studying their weights and activations. An obstacle to reverse engineering neural networks is that many of the parameters inside a network are not involved in the computation being implemented by the network. These degenerate parameters may obfuscate internal structure. Singular learning theory teaches us that neural network parameterizations are biased towards being more degenerate, and parameterizations with more degeneracy are likely to generalize further. We identify 3 ways that network parameters can be degenerate: linear dependence between activations in a layer; linear dependence between gradients passed back to a layer; ReLUs which fire on the same subset of datapoints. We also present a heuristic argument that modular networks are likely to be more degenerate, and we develop a metric for identifying modules in a network that is based on this argument. We propose that if we can represent a neural network in a way that is invariant to reparameterizations that exploit the degeneracies, then this representation is likely to be more interpretable, and we provide some evidence that such a representation is likely to have sparser interactions. We introduce the Interaction Basis, a tractable technique to obtain a representation that is invariant to degeneracies from linear dependence of activations or Jacobians.
Submitted 20 May, 2024; v1 submitted 17 May, 2024;
originally announced May 2024.
-
Efficient Black-Box Adversarial Attacks on Neural Text Detectors
Authors:
Vitalii Fishchuk,
Daniel Braun
Abstract:
Neural text detectors are models trained to detect whether a given text was generated by a language model or written by a human. In this paper, we investigate three simple and resource-efficient strategies (parameter tweaking, prompt engineering, and character-level mutations) to alter texts generated by GPT-3.5 that are unsuspicious or unnoticeable for humans but cause misclassification by neural text detectors. The results show that especially parameter tweaking and character-level mutations are effective strategies.
Submitted 3 November, 2023;
originally announced November 2023.
-
Qualitative and quantitative evaluation of a methodology for the Digital Twin creation of brownfield production systems
Authors:
Dominik Braun,
Nasser Jazdi,
Wolfgang Schloegl,
Michael Weyrich
Abstract:
The Digital Twin is a well-known concept of Industry 4.0 and is the cyber part of a cyber-physical production system, providing several benefits such as virtual commissioning or predictive maintenance. Existing production systems lack a Digital Twin, which has to be created manually in a time-consuming and error-prone process. Therefore, methods to create digital models of existing production systems and the relations between them were developed. This paper presents the implementation of the methodology for the creation of multi-disciplinary relations and a quantitative and qualitative evaluation of the benefits of the methodology.
Submitted 1 September, 2023;
originally announced October 2023.
-
CausalGPS: An R Package for Causal Inference With Continuous Exposures
Authors:
Naeem Khoshnevis,
Xiao Wu,
Danielle Braun
Abstract:
Quantifying the causal effects of continuous exposures on outcomes of interest is critical for social, economic, health, and medical research. However, most existing software packages focus on binary exposures. We develop the CausalGPS R package that implements a collection of algorithms to provide algorithmic solutions for causal inference with continuous exposures. CausalGPS implements a causal inference workflow, with algorithms based on generalized propensity scores (GPS) as the core, extending propensity scores (the probability of a unit being exposed given pre-exposure covariates) from binary to continuous exposures. As the first step, the package implements efficient and flexible estimations of the GPS, allowing multiple user-specified modeling options. As the second step, the package provides two ways to adjust for confounding: weighting and matching, generating weighted and matched data sets, respectively. Lastly, the package provides built-in functions to fit flexible parametric, semi-parametric, or non-parametric regression models on the weighted or matched data to estimate the exposure-response function relating the outcome with the exposures. The computationally intensive tasks are implemented in C++, and efficient shared-memory parallelization is achieved by OpenMP API. This paper outlines the main components of the CausalGPS R package and demonstrates its application to assess the effect of long-term exposure to PM2.5 on educational attainment using zip code-level data from the contiguous United States from 2000-2016.
Submitted 30 September, 2023;
originally announced October 2023.
-
Automatic Bat Call Classification using Transformer Networks
Authors:
Frank Fundel,
Daniel A. Braun,
Sebastian Gottwald
Abstract:
Automatically identifying bat species from their echolocation calls is a difficult but important task for monitoring bats and the ecosystem they live in. Major challenges in automatic bat call identification are high call variability, similarities between species, interfering calls and lack of annotated data. Many currently available models suffer from relatively poor performance on real-life data due to being trained on single call datasets and, moreover, are often too slow for real-time classification. Here, we propose a Transformer architecture for multi-label classification with potential applications in real-time classification scenarios. We train our model on synthetically generated multi-species recordings by merging multiple bat calls into a single recording with multiple simultaneous calls. Our approach achieves a single species accuracy of 88.92% (F1-score of 84.23%) and a multi species macro F1-score of 74.40% on our test set. In comparison to three other tools on the independent and publicly available dataset ChiroVox, our model achieves at least 25.82% better accuracy for single species classification and at least 6.9% better macro F1-score for multi species classification.
Submitted 20 September, 2023;
originally announced September 2023.
-
Reclaiming the Horizon: Novel Visualization Designs for Time-Series Data with Large Value Ranges
Authors:
Daniel Braun,
Rita Borgo,
Max Sondag,
Tatiana von Landesberger
Abstract:
We introduce two novel visualization designs to support practitioners in performing identification and discrimination tasks on large value ranges (i.e., several orders of magnitude) in time-series data: (1) The order of magnitude horizon graph, which extends the classic horizon graph; and (2) the order of magnitude line chart, which adapts the log-line chart. These new visualization designs visualize large value ranges by explicitly splitting the mantissa $m$ and exponent $e$ of a value $v = m \cdot 10^{e}$. We evaluate our novel designs against the most relevant state-of-the-art visualizations in an empirical user study. It focuses on four main tasks commonly employed in the analysis of time-series and large value ranges visualization: identification, discrimination, estimation, and trend detection. For each task we analyse error, confidence, and response time. The new order of magnitude horizon graph performs better than or equal to all other designs in identification, discrimination, and estimation tasks. Only for trend detection tasks, the more traditional horizon graphs reported better performance. Our results are domain-independent, only requiring time-series data with large value ranges.
Submitted 31 October, 2023; v1 submitted 18 July, 2023;
originally announced July 2023.
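The mantissa and exponent split that the abstract above builds on, v = m * 10^e, can be sketched as a small helper. This is an illustration under assumptions, not the authors' implementation: the function name is hypothetical, and normalizing the mantissa so that 1 <= |m| < 10 is a convention chosen here that the abstract does not specify.

```python
import math


def split_mantissa_exponent(value):
    """Split value into (m, e) with value == m * 10**e and 1 <= |m| < 10.

    Zero has no order of magnitude, so it is returned as (0.0, 0).
    """
    if value == 0:
        return 0.0, 0
    exponent = math.floor(math.log10(abs(value)))
    mantissa = value / 10 ** exponent
    # Guard against floating-point rounding near exact powers of ten.
    if abs(mantissa) >= 10:
        mantissa /= 10
        exponent += 1
    elif abs(mantissa) < 1:
        mantissa *= 10
        exponent -= 1
    return mantissa, exponent


# Hypothetical time-series values spanning several orders of magnitude.
for v in [0.05, 4200.0, -730000.0]:
    m, e = split_mantissa_exponent(v)
    print(v, "->", m, "x 10^", e)
```

A chart built on this split could then encode the exponent and the mantissa through separate visual channels, which is the general idea the two proposed designs share.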
-
Visual Validation versus Visual Estimation: A Study on the Average Value in Scatterplots
Authors:
Daniel Braun,
Ashley Suh,
Remco Chang,
Michael Gleicher,
Tatiana von Landesberger
Abstract:
We investigate the ability of individuals to visually validate statistical models in terms of their fit to the data. While visual model estimation has been studied extensively, visual model validation remains under-investigated. It is unknown how well people are able to visually validate models, and how their performance compares to visual and computational estimation. As a starting point, we conducted a study across two populations (crowdsourced and volunteers). Participants had to both visually estimate (i.e., draw) and visually validate (i.e., accept or reject) the frequently studied model of averages. Across both populations, the level of accuracy of the models that were considered valid was lower than the accuracy of the estimated models. We find that participants' validation and estimation were unbiased. Moreover, their natural critical point between accepting and rejecting a given mean value is close to the boundary of its 95% confidence interval, indicating that the visually perceived confidence interval corresponds to a common statistical standard. Our work contributes to the understanding of visual model validation and opens new research opportunities.
Submitted 2 January, 2024; v1 submitted 18 July, 2023;
originally announced July 2023.
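The 95% confidence interval that the abstract above uses as its reference point can be computed as in the following sketch. It is not the study's procedure: the function names and the sample are hypothetical, and the interval uses a normal approximation (mean plus or minus z times the standard error), whereas the paper may define the interval differently.

```python
import math
import statistics


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the mean of values."""
    n = len(values)
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    z = statistics.NormalDist().inv_cdf(0.5 + level / 2)
    return mean - z * standard_error, mean + z * standard_error


def accept_shown_mean(values, shown_mean, level=0.95):
    """Accept a displayed mean line if it falls inside the interval."""
    low, high = mean_confidence_interval(values, level)
    return low <= shown_mean <= high


# Hypothetical y-values of the points in one scatterplot.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print("95% CI for the mean:", mean_confidence_interval(sample))
print("Accept a line at 5.5?", accept_shown_mean(sample, 5.5))
print("Accept a line at 7.0?", accept_shown_mean(sample, 7.0))
```

Under this rule, a displayed mean is rejected once it leaves the interval, which is the kind of critical point the abstract compares participants' accept/reject behavior against.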
-
Challenges in Domain-Specific Abstractive Summarization and How to Overcome them
Authors:
Anum Afzal,
Juraj Vladika,
Daniel Braun,
Florian Matthes
Abstract:
Large Language Models work quite well with general-purpose data and many tasks in Natural Language Processing. However, they show several limitations when used for a task such as domain-specific abstractive text summarization. This paper identifies three of those limitations as research problems in the context of abstractive text summarization: 1) Quadratic complexity of transformer-based models with respect to the input text length; 2) Model Hallucination, which is a model's ability to generate factually incorrect text; and 3) Domain Shift, which happens when the distribution of the model's training and test corpus is not the same. Along with a discussion of the open research questions, this paper also provides an assessment of existing state-of-the-art techniques relevant to domain-specific text summarization to address the research gaps.
Submitted 3 July, 2023;
originally announced July 2023.
-
Proof-of-Turn: Blockchain consensus using a round-robin procedure as one possible solution for cutting costs in mobile games
Authors:
Dominik Braun
Abstract:
This master's thesis deals with Blockchain Technology in mobile turn-based peer-to-peer games. First, it investigates the capabilities of Blockchain Technology to be used for gaming applications. In this regard, among others, Proof-of-Mechanisms, Vote-based Consensus and several Performance Improvements are described. Second, several smart contracts are introduced to show the general feasibility of turn-based games hosted on Blockchain Technology. More specifically, Hidden transactions, Randomization, Piles of Cards, Fog of War elements, Data allocation improvements and other smart contracts are specified. Third, a special Proof-of-Turn consensus mechanism, based on the Blockchain Technology, is defined to enable game publishers to cut the costs of the game servers they provide. Herein, Byzantine Fault Tolerance, Peering, the CAP Theorem, and Interoperability, among other characteristics, are covered. Last, these measures shall additionally raise the trust level among the players in mobile turn-based games.
Submitted 14 April, 2023;
originally announced April 2023.
-
Design of a Variable Stiffness Spring with Human-Selectable Stiffness
Authors:
Chase W. Mathews,
David J. Braun
Abstract:
Springs are commonly used in wearable robotic devices to provide assistive joint torque without the need for motors and batteries. However, different tasks (such as walking or running) and different users (such as athletes with strong legs or the elderly with weak legs) necessitate different assistive joint torques, and therefore, springs with different stiffness. Variable stiffness springs are a special class of springs which can exert more or less torque upon the same deflection, provided that the user is able to change the stiffness of the spring. In this paper, we present a novel variable stiffness spring design in which the user can select a preferred spring stiffness similar to switching gears on a bicycle. Using a leg-swing experiment, we demonstrate that the user can increment and decrement spring stiffness in a large range to effectively assist the hip joint during leg oscillations. Variable stiffness springs with human-selectable stiffness could be key components of wearable devices which augment locomotion tasks, such as walking, running, and swimming.
Submitted 26 February, 2023;
originally announced February 2023.
-
Investigating Conversational Search Behavior For Domain Exploration
Authors:
Phillip Schneider,
Anum Afzal,
Juraj Vladika,
Daniel Braun,
Florian Matthes
Abstract:
Conversational search has evolved as a new information retrieval paradigm, marking a shift from traditional search systems towards interactive dialogues with intelligent search agents. This change especially affects exploratory information-seeking contexts, where conversational search systems can guide the discovery of unfamiliar domains. In these scenarios, users often find it difficult to express their information goals due to insufficient background knowledge. Conversational interfaces can provide assistance by eliciting information needs and narrowing down the search space. However, due to the complexity of information-seeking behavior, the design of conversational interfaces for retrieving information remains a great challenge. Although prior work has employed user studies to empirically ground the system design, most existing studies are limited to well-defined search tasks or known domains, thus being less exploratory in nature. Therefore, we conducted a laboratory study to investigate open-ended search behavior for navigation through unknown information landscapes. The study comprised 26 participants who were restricted in their search to a text chat interface. Based on the collected dialogue transcripts, we applied statistical analyses and process mining techniques to uncover general information-seeking patterns across five different domains. We not only identify core dialogue acts and their interrelations that enable users to discover domain knowledge, but also derive design suggestions for conversational search systems.
Submitted 27 February, 2023; v1 submitted 10 January, 2023;
originally announced January 2023.
-
Controllable Mechanical-domain Energy Accumulators
Authors:
Sung Y. Kim,
David J. Braun
Abstract:
Springs are efficient in storing and returning elastic potential energy but are unable to hold the energy they store in the absence of an external load. Lockable springs use clutches to hold elastic potential energy in the absence of an external load, but have not yet been widely adopted in applications, partly because clutches introduce design complexity, reduce energy efficiency, and typically do not afford high fidelity control over the energy stored by the spring. Here, we present the design of a novel lockable compression spring that uses a small capstan clutch to passively lock a mechanical spring. The capstan clutch can lock over 1000 N force at any arbitrary deflection, unlock the spring in less than 10 ms with a control force less than 1 % of the maximal spring force, and provide an 80 % energy storage and return efficiency (comparable to a highly efficient electric motor operated at constant nominal speed). By retaining the form factor of a regular spring while providing high-fidelity locking capability even under large spring forces, the proposed design could facilitate the development of energy-efficient spring-based actuators and robots.
Submitted 21 February, 2023; v1 submitted 29 December, 2022;
originally announced December 2022.
-
Novel Spring Mechanism Enables Iterative Energy Accumulation under Force and Deformation Constraints
Authors:
Cole A. Dempsey,
David J. Braun
Abstract:
Springs can provide force at zero net energy cost by recycling negative mechanical work to benefit motor-driven robots or spring-augmented humans. However, humans have limited force and range of motion, and motors have a limited ability to produce force. These limits constrain how much energy a conventional spring can store and, consequently, how much assistance a spring can provide. In this paper, we introduce an approach to accumulating negative work in assistive springs over several motion cycles. We show that, by utilizing a novel floating spring mechanism, the weight of a human or robot can be used to iteratively increase spring compression, irrespective of the potential energy stored by the spring. Decoupling the force required to compress a spring from the energy stored by a spring advances prior works, and could enable spring-driven robots and humans to perform physically demanding tasks without the use of large actuators.
Submitted 29 December, 2022;
originally announced December 2022.
-
Design of a Parallel Elastic Actuator with a Continuously-Adjustable Equilibrium Position
Authors:
Evangelos Chatziandreou,
Chase W. Mathews,
David J. Braun
Abstract:
In this paper, we present an adjustable-equilibrium parallel elastic actuator (AE-PEA). The actuator consists of a motor, an equilibrium adjusting mechanism, and a spring arranged into a cylindrical geometry, similar to a motor-gearbox assembly. The novel component of the actuator is the equilibrium adjusting mechanism which (i) does not require external energy to maintain the equilibrium position of the actuator even if the spring is deformed and (ii) enables equilibrium position control with low energy cost by rotating the spring while keeping it undeformed. Adjustable equilibrium parallel elastic actuators resolve the main limitation of parallel elastic actuators (PEAs) by enabling energy-efficient operation at different equilibrium positions, instead of being limited to energy-efficient operation at a single equilibrium position. We foresee the use of AE-PEAs in industrial robots, mobile robots, exoskeletons, and prostheses, where efficient oscillatory motion and gravity compensation at different positions are required.
Submitted 17 January, 2023; v1 submitted 14 December, 2022;
originally announced December 2022.
-
An Agent-based Realisation for a continuous Model Adaption Approach in intelligent Digital Twins
Authors:
Daniel Dittler,
Peter Lierhammer,
Dominik Braun,
Timo Müller,
Nasser Jazdi,
Michael Weyrich
Abstract:
The trend in industrial automation is towards networking, intelligence and autonomy. Digital Twins, which serve as virtual representations, are becoming increasingly important in this context. The Digital Twin of a modular production system contains many different models that are mostly created for specific applications and fulfil different requirements. Especially simulation models, which are created in the development phase, can be used during the operational phase for applications such as prognosis or operation-parallel simulation. Due to the high heterogeneity of the model landscape in the context of a modular production system, the plant operator is faced with the challenge of adapting the models in order to ensure an application-oriented realism in the event of changes to the asset and its environment or the addition of applications. Therefore, this paper proposes a concept for the continuous model adaption in the Digital Twin of a modular production system during the operational phase. The benefits are then demonstrated by an application scenario and an agent-based realisation.
Submitted 7 December, 2022;
originally announced December 2022.
-
Evaluating Unsupervised Text Classification: Zero-shot and Similarity-based Approaches
Authors:
Tim Schopf,
Daniel Braun,
Florian Matthes
Abstract:
Text classification of unseen classes is a challenging Natural Language Processing task and is mainly attempted using two different types of approaches. Similarity-based approaches attempt to classify instances based on similarities between text document representations and class description representations. Zero-shot text classification approaches aim to generalize knowledge gained from a training task by assigning appropriate labels of unknown classes to text documents. Although existing studies have already investigated individual approaches to these categories, the experiments in literature do not provide a consistent comparison. This paper addresses this gap by conducting a systematic evaluation of different similarity-based and zero-shot approaches for text classification of unseen classes. Different state-of-the-art approaches are benchmarked on four text classification datasets, including a new dataset from the medical domain. Additionally, novel SimCSE and SBERT-based baselines are proposed, as other baselines used in existing work yield weak classification results and are easily outperformed. Finally, the novel similarity-based Lbl2TransformerVec approach is presented, which outperforms previous state-of-the-art approaches in unsupervised text classification. Our experiments show that similarity-based approaches significantly outperform zero-shot approaches in most cases. Additionally, using SimCSE or SBERT embeddings instead of simpler text representations increases similarity-based classification results even further.
Submitted 31 January, 2023; v1 submitted 29 November, 2022;
originally announced November 2022.
-
Interpreting Neural Networks through the Polytope Lens
Authors:
Sid Black,
Lee Sharkey,
Leo Grinsztajn,
Eric Winsor,
Dan Braun,
Jacob Merizian,
Kip Parker,
Carlos Ramón Guevara,
Beren Millidge,
Gabriel Alfour,
Connor Leahy
Abstract:
Mechanistic interpretability aims to explain what a neural network has learned at a nuts-and-bolts level. What are the fundamental primitives of neural network representations? Previous mechanistic descriptions have used individual neurons or their linear combinations to understand the representations a network has learned. But there are clues that neurons and their linear combinations are not the correct fundamental units of description: directions cannot describe how neural networks use nonlinearities to structure their representations. Moreover, many instances of individual neurons and their combinations are polysemantic (i.e. they have multiple unrelated meanings). Polysemanticity makes interpreting the network in terms of neurons or directions challenging since we can no longer assign a specific feature to a neural unit. In order to find a basic unit of description that does not suffer from these problems, we zoom in beyond just directions to study the way that piecewise linear activation functions (such as ReLU) partition the activation space into numerous discrete polytopes. We call this perspective the polytope lens. The polytope lens makes concrete predictions about the behavior of neural networks, which we evaluate through experiments on both convolutional image classifiers and language models. Specifically, we show that polytopes can be used to identify monosemantic regions of activation space (while directions are not in general monosemantic) and that the density of polytope boundaries reflects semantic boundaries. We also outline a vision for what mechanistic interpretability might look like through the polytope lens.
Submitted 22 November, 2022;
originally announced November 2022.
-
High-energy-density 3D-printed Composite Springs for Lightweight and Energy-efficient Compliant Robots
Authors:
Amanda Sutrisno,
David J. Braun
Abstract:
Springs store mechanical energy similar to batteries storing electrical energy. However, conventional springs are heavy and store limited amounts of mechanical energy relative to batteries, i.e., they have low mass-energy-density. Next-generation 3D printing technology could potentially enable manufacturing low-cost lightweight springs with high energy storage capacity. Here we present a novel design of a high-energy-density 3D printed torsional spiral spring using structural optimization. By optimizing the internal structure of the spring we obtained a 45% increase in the mass energy density, compared to a torsional spiral spring of uniform thickness. Our result suggests that optimally designed 3D printed springs could enable robots to recycle more mechanical energy per unit mass, potentially reducing the energy required to control robots.
△ Less
Submitted 16 November, 2022;
originally announced November 2022.
-
Hierarchically Structured Task-Agnostic Continual Learning
Authors:
Heinke Hihn,
Daniel A. Braun
Abstract:
One notable weakness of current machine learning algorithms is the poor ability of models to solve new problems without forgetting previously acquired knowledge. The Continual Learning paradigm has emerged as a protocol to systematically investigate settings where the model sequentially observes samples generated by a series of tasks. In this work, we take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting. We derive this principle from a Bayesian perspective and show its connections to previous approaches to continual learning. Based on this principle, we propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths through the network which is governed by a gating policy. Equipped with a diverse and specialized set of parameters, each path can be regarded as a distinct sub-network that learns to solve tasks. To improve expert allocation, we introduce diversity objectives, which we evaluate in additional ablation studies. Importantly, our approach can operate in a task-agnostic way, i.e., it does not require task-specific knowledge, as is the case with many existing continual learning algorithms. Due to the general formulation based on generic utility functions, we can apply this optimality principle to a large variety of learning problems, including supervised learning, reinforcement learning, and generative modeling. We demonstrate the competitive performance of our method on continual reinforcement learning and variants of the MNIST, CIFAR-10, and CIFAR-100 datasets.
Submitted 14 November, 2022;
originally announced November 2022.
-
Lbl2Vec: An Embedding-Based Approach for Unsupervised Document Retrieval on Predefined Topics
Authors:
Tim Schopf,
Daniel Braun,
Florian Matthes
Abstract:
In this paper, we consider the task of retrieving documents with predefined topics from an unlabeled document dataset using an unsupervised approach. The proposed unsupervised approach requires only a small number of keywords describing the respective topics and no labeled document. Existing approaches either heavily relied on a large amount of additionally encoded world knowledge or on term-document frequencies. Contrariwise, we introduce a method that learns jointly embedded document and word vectors solely from the unlabeled document dataset in order to find documents that are semantically similar to the topics described by the keywords. The proposed method requires almost no text preprocessing but is simultaneously effective at retrieving relevant documents with high probability. When successively retrieving documents on different predefined topics from publicly available and commonly used datasets, we achieved an average area under the receiver operating characteristic curve value of 0.95 on one dataset and 0.92 on another. Further, our method can be used for multiclass document classification, without the need to assign labels to the dataset in advance. Compared with an unsupervised classification baseline, we increased F1 scores from 76.6 to 82.7 and from 61.0 to 75.1 on the respective datasets. For easy replication of our approach, we make the developed Lbl2Vec code publicly available as a ready-to-use tool under the 3-Clause BSD license.
Submitted 12 October, 2022;
originally announced October 2022.
-
A graph-based knowledge representation and pattern mining supporting the Digital Twin creation of existing manufacturing systems
Authors:
Dominik Braun,
Timo Müller,
Nada Sahlab,
Nasser Jazdi,
Wolfgang Schloegl,
Michael Weyrich
Abstract:
The creation of a Digital Twin for existing manufacturing systems, so-called brownfield systems, is a challenging task due to the needed expert knowledge about the structure of brownfield systems and the effort to realize the digital models. Several approaches and methods have already been proposed that at least partially digitalize the information about a brownfield manufacturing system. A Digital Twin requires linked information from multiple sources. This paper presents a graph-based approach to merge information from heterogeneous sources. Furthermore, the approach provides a way to automatically identify templates using graph structure analysis to facilitate further work with the resulting Digital Twin and its further enhancement.
Submitted 21 September, 2022;
originally announced September 2022.
-
Color Coding of Large Value Ranges Applied to Meteorological Data
Authors:
Daniel Braun,
Kerstin Ebell,
Vera Schemann,
Laura Pelchmann,
Susanne Crewell,
Rita Borgo,
Tatiana von Landesberger
Abstract:
This paper presents a novel color scheme designed to address the challenge of visualizing data series with large value ranges, where scale transformation provides limited support. We focus on meteorological data, where the presence of large value ranges is common. We apply our approach to meteorological scatterplots, as one of the most common plots used in this domain area. Our approach leverages the numerical representation of mantissa and exponent of the values to guide the design of novel "nested" color schemes, able to emphasize differences between magnitudes. Our user study evaluates the new designs, the state of the art color scales and representative color schemes used in the analysis of meteorological data: ColorCrafter, Viridis, and Rainbow. We assess accuracy, time and confidence in the context of discrimination (comparison) and interpretation (reading) tasks. Our proposed color scheme significantly outperforms the others in interpretation tasks, while showing comparable performances in discrimination tasks.
Submitted 24 October, 2022; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Countability constraints in order-theoretic approaches to computability
Authors:
Pedro Hack,
Daniel A. Braun,
Sebastian Gottwald
Abstract:
Computability on uncountable sets has no standard formalization, unlike that on countable sets, which is given by Turing machines. Some of the approaches to define computability in these sets rely on order-theoretic structures to translate such notions from Turing machines to uncountable spaces. Since these machines are used as a baseline for computability in these approaches, countability restrictions on the ordered structures are fundamental. Here, we show several relations between the usual countability restrictions in order-theoretic theories of computability and some more common order-theoretic countability constraints, like order density properties and functional characterizations of the order structure in terms of multi-utilities. As a result, we show how computability can be introduced in some order structures via countability order density and multi-utility constraints.
Submitted 28 May, 2024; v1 submitted 29 June, 2022;
originally announced June 2022.
-
Computation as uncertainty reduction: a simplified order-theoretic framework
Authors:
Pedro Hack,
Daniel A. Braun,
Sebastian Gottwald
Abstract:
Although there is a somewhat standard formalization of computability on countable sets given by Turing machines, the same cannot be said about uncountable sets. Among the approaches to define computability in these sets, order-theoretic structures have proven to be useful. Here, we discuss the mathematical structure needed to define computability using order-theoretic concepts. In particular, we introduce a more general framework and discuss its limitations compared to the previous one in domain theory. We expose four features in which the stronger requirements in the domain-theoretic structure allow improving upon the more general framework: computable elements, computable functions, model dependence of computability and complexity theory. Crucially, we show computability of elements in uncountable spaces can be defined in this new setup, and argue why this is not the case for computable functions. Moreover, we show the stronger setup diminishes the dependence of computability on the chosen order-theoretic structure and that, although a suitable complexity theory can be defined in the stronger framework and the more general one possesses a notion of computable elements, there appears to be no proper notion of element complexity in the latter.
Submitted 6 September, 2022; v1 submitted 28 June, 2022;
originally announced June 2022.
-
On a geometrical notion of dimension for partially ordered sets
Authors:
Pedro Hack,
Daniel A. Braun,
Sebastian Gottwald
Abstract:
The well-known notion of dimension for partial orders by Dushnik and Miller allows one to quantify the degree of incomparability and, thus, is regarded as a measure of complexity for partial orders. However, despite its usefulness, its definition is somewhat disconnected from the geometrical idea of dimension, where, essentially, the number of dimensions indicates how many real lines are required to represent the underlying partially ordered set.
Here, we introduce a variation of the Dushnik-Miller notion of dimension that is closer to geometry, the Debreu dimension, and show the following main results: (i) how to construct its building blocks under some countability restrictions, (ii) its relation to other notions of dimension in the literature, and (iii), as an application of the above, we improve on the classification of preordered spaces through real-valued monotones.
Submitted 2 September, 2022; v1 submitted 30 March, 2022;
originally announced March 2022.
-
The classification of preordered spaces in terms of monotones: complexity and optimization
Authors:
Pedro Hack,
Daniel A. Braun,
Sebastian Gottwald
Abstract:
The study of complexity and optimization in decision theory involves both partial and complete characterizations of preferences over decision spaces in terms of real-valued monotones. With this motivation, and following the recent introduction of new classes of monotones, like injective monotones or strict monotone multi-utilities, we present the classification of preordered spaces in terms of both the existence and cardinality of real-valued monotones and the cardinality of the quotient space. In particular, we take advantage of a characterization of real-valued monotones in terms of separating families of increasing sets in order to obtain a more complete classification consisting of classes that are strictly different from each other. As a result, we gain new insight into both complexity and optimization, and clarify their interplay in preordered spaces.
Submitted 14 August, 2022; v1 submitted 24 February, 2022;
originally announced February 2022.
-
N-QGN: Navigation Map from a Monocular Camera using Quadtree Generating Networks
Authors:
Daniel Braun,
Olivier Morel,
Pascal Vasseur,
Cédric Demonceaux
Abstract:
Monocular depth estimation has been a popular area of research for several years, especially since self-supervised networks have shown increasingly good results in bridging the gap with supervised and stereo methods. However, these approaches focus their interest on dense 3D reconstruction and sometimes on tiny details that are superfluous for autonomous navigation. In this paper, we propose to address this issue by estimating the navigation map under a quadtree representation. The objective is to create an adaptive depth map prediction that only extracts details that are essential for obstacle avoidance. Other 3D space, which leaves large room for navigation, will be provided with an approximate distance. Experiments on the KITTI dataset show that our method can significantly reduce the amount of output information without major loss of accuracy.
△ Less
Submitted 24 February, 2022;
originally announced February 2022.
-
Mechanization of Incidence Projective Geometry in Higher Dimensions, a Combinatorial Approach
Authors:
Pascal Schreck,
Nicolas Magaud,
David Braun
Abstract:
Several tools have been developed to enhance automation of theorem proving in the 2D plane. However, in 3D, only a few approaches have been studied, and to our knowledge, nothing has been done in higher dimensions. In this paper, we present a few examples of incidence geometry theorems in dimensions 3, 4, and 5. We then prove them with the help of a combinatorial prover based on matroid theory applied to geometry.
Submitted 3 January, 2022;
originally announced January 2022.
-
DawDreamer: Bridging the Gap Between Digital Audio Workstations and Python Interfaces
Authors:
David Braun
Abstract:
Audio production techniques which previously only existed in GUI-constrained digital audio workstations, livecoding environments, or C++ APIs are now accessible with our new Python module called DawDreamer. DawDreamer therefore bridges the gap between real sound engineers and coders imitating them with offline batch-processing. Like contemporary modules in this domain, DawDreamer can create directed acyclic graphs of audio processors such as VSTs which generate or manipulate audio streams. DawDreamer can also dynamically compile and execute code from Faust, a powerful signal processing language which can be deployed to many platforms and microcontrollers. We discuss DawDreamer's unique features in detail and potential applications across music information retrieval including source separation, transcription, and audio effect parameter inference. We provide fully cross-platform PyPI installers, a Linux Dockerfile, and an example Jupyter notebook.
Submitted 18 November, 2021;
originally announced November 2021.
-
Mixture-of-Variational-Experts for Continual Learning
Authors:
Heinke Hihn,
Daniel A. Braun
Abstract:
One weakness of machine learning algorithms is the poor ability of models to solve new problems without forgetting previously acquired knowledge. The Continual Learning (CL) paradigm has emerged as a protocol to systematically investigate settings where the model sequentially observes samples generated by a series of tasks. In this work, we take a task-agnostic view of continual learning and develop a hierarchical information-theoretic optimality principle that facilitates a trade-off between learning and forgetting. We discuss this principle from a Bayesian perspective and show its connections to previous approaches to CL. Based on this principle, we propose a neural network layer, called the Mixture-of-Variational-Experts layer, that alleviates forgetting by creating a set of information processing paths through the network which is governed by a gating policy. Due to the general formulation based on generic utility functions, we can apply this optimality principle to a large variety of learning problems, including supervised learning, reinforcement learning, and generative modeling. We demonstrate the competitive performance of our method in continual supervised learning and in continual reinforcement learning.
Submitted 1 March, 2022; v1 submitted 25 October, 2021;
originally announced October 2021.
-
Representing preorders with injective monotones
Authors:
Pedro Hack,
Daniel A. Braun,
Sebastian Gottwald
Abstract:
We introduce a new class of real-valued monotones in preordered spaces, injective monotones. We show that the class of preorders for which they exist lies in between the class of preorders with strict monotones and preorders with countable multi-utilities, improving upon the known classification of preordered spaces through real-valued monotones. We extend several well-known results for strict monotones (Richter-Peleg functions) to injective monotones, we provide a construction of injective monotones from countable multi-utilities, and relate injective monotones to classic results concerning Debreu denseness and order separability. Along the way, we connect our results to Shannon entropy and the uncertainty preorder, obtaining new insights into how they are related. In particular, we show how injective monotones can be used to generalize some appealing properties of Jaynes' maximum entropy principle, which is considered a basis for statistical inference and serves as a justification for many regularization techniques that appear throughout machine learning and decision theory.
Submitted 24 November, 2021; v1 submitted 30 July, 2021;
originally announced July 2021.
-
Prediction of Hereditary Cancers Using Neural Networks
Authors:
Zoe Guan,
Giovanni Parmigiani,
Danielle Braun,
Lorenzo Trippa
Abstract:
Family history is a major risk factor for many types of cancer. Mendelian risk prediction models translate family histories into cancer risk predictions based on knowledge of cancer susceptibility genes. These models are widely used in clinical practice to help identify high-risk individuals. Mendelian models leverage the entire family history, but they rely on many assumptions about cancer susceptibility genes that are either unrealistic or challenging to validate due to low mutation prevalence. Training more flexible models, such as neural networks, on large databases of pedigrees can potentially lead to accuracy gains. In this paper, we develop a framework to apply neural networks to family history data and investigate their ability to learn inherited susceptibility to cancer. While there is an extensive literature on neural networks and their state-of-the-art performance in many tasks, there is little work applying them to family history data. We propose adaptations of fully-connected neural networks and convolutional neural networks to pedigrees. In data simulated under Mendelian inheritance, we demonstrate that our proposed neural network models are able to achieve nearly optimal prediction performance. Moreover, when the observed family history includes misreported cancer diagnoses, neural networks are able to outperform the Mendelian BRCAPRO model embedding the correct inheritance laws. Using a large dataset of over 200,000 family histories, the Risk Service cohort, we train prediction models for future risk of breast cancer. We validate the models using data from the Cancer Genetics Network.
Submitted 25 June, 2021;
originally announced June 2021.
-
Transfer Learning as an Enabler of the Intelligent Digital Twin
Authors:
Benjamin Maschler,
Dominik Braun,
Nasser Jazdi,
Michael Weyrich
Abstract:
Digital Twins have been described as beneficial in many areas, such as virtual commissioning, fault prediction or reconfiguration planning. Equipping Digital Twins with artificial intelligence functionalities can greatly expand those beneficial applications or open up altogether new areas of application, among them cross-phase industrial transfer learning. In the context of machine learning, transfer learning represents a set of approaches that enhance learning new tasks based upon previously acquired knowledge. Here, knowledge is transferred from one lifecycle phase to another in order to reduce the amount of data or time needed to train a machine learning algorithm. Looking at common challenges in developing and deploying industrial machinery with deep learning functionalities, embracing this concept would offer several advantages: Using an intelligent Digital Twin, learning algorithms can be designed, configured and tested in the design phase before the physical system exists and real data can be collected. Once real data becomes available, the algorithms must merely be fine-tuned, significantly speeding up commissioning and reducing the probability of costly modifications. Furthermore, using the Digital Twin's simulation capabilities, virtually injecting rare faults in order to train an algorithm's response or using reinforcement learning, e.g. to teach a robot, becomes practically feasible. This article presents several cross-phase industrial transfer learning use cases utilizing intelligent Digital Twins. A real cyber-physical production system consisting of an automated welding machine and an automated guided vehicle equipped with a robot arm is used to illustrate the respective benefits.
Submitted 3 December, 2020;
originally announced December 2020.
-
Binary Classification: Counterbalancing Class Imbalance by Applying Regression Models in Combination with One-Sided Label Shifts
Authors:
Peter Bellmann,
Heinke Hihn,
Daniel A. Braun,
Friedhelm Schwenker
Abstract:
In many real-world pattern recognition scenarios, such as in medical applications, the corresponding classification tasks can be of an imbalanced nature. In the current study, we focus on binary, imbalanced classification tasks, i.e., binary classification tasks in which one of the two classes is under-represented (minority class) in comparison to the other class (majority class). In the literature, many different approaches have been proposed, such as under- or oversampling, to counter class imbalance. In the current work, we introduce a novel method, which addresses the issues of class imbalance. To this end, we first transfer the binary classification task to an equivalent regression task. Subsequently, we generate a set of negative and positive target labels, such that the corresponding regression task becomes balanced, with respect to the redefined target label set. We evaluate our approach on a number of publicly available data sets in combination with Support Vector Machines. Moreover, we compare our proposed method to one of the most popular oversampling techniques (SMOTE). Based on the detailed discussion of the presented outcomes of our experimental evaluation, we provide promising ideas for future research directions.
Submitted 30 November, 2020;
originally announced November 2020.
-
Specialization in Hierarchical Learning Systems
Authors:
Heinke Hihn,
Daniel A. Braun
Abstract:
Joining multiple decision-makers together is a powerful way to obtain more sophisticated decision-making systems, but requires addressing the questions of division of labor and specialization. We investigate to what extent information constraints in hierarchies of experts not only provide a principled method for regularization but also enforce specialization. In particular, we devise an information-theoretically motivated on-line learning rule that allows partitioning of the problem space into multiple sub-problems that can be solved by the individual experts. We demonstrate two different ways to apply our method: (i) partitioning problems based on individual data samples and (ii) based on sets of data samples representing tasks. Approach (i) equips the system with the ability to solve complex decision-making problems by finding an optimal combination of local expert decision-makers. Approach (ii) leads to decision-makers specialized in solving families of tasks, which equips the system with the ability to solve meta-learning problems. We show the broad applicability of our approach on a range of problems including classification, regression, density estimation, and reinforcement learning problems, both in the standard machine learning setup and in a meta-learning setting.
Submitted 3 November, 2020;
originally announced November 2020.
-
P2D: a self-supervised method for depth estimation from polarimetry
Authors:
Marc Blanchon,
Désiré Sidibé,
Olivier Morel,
Ralph Seulin,
Daniel Braun,
Fabrice Meriaudeau
Abstract:
Monocular depth estimation is a recurring subject in the field of computer vision. Its ability to describe scenes via a depth map while reducing the constraints related to the formulation of perspective geometry tends to favor its use. However, despite the constant improvement of algorithms, most methods exploit only colorimetric information. Consequently, robustness to events to which the modality is not sensitive, like specularity or transparency, is neglected. In response to this phenomenon, we propose using polarimetry as an input for a self-supervised monodepth network. Therefore, we propose exploiting polarization cues to encourage accurate reconstruction of scenes. Furthermore, we add a polarimetric regularization term to a state-of-the-art method to take specific advantage of the data. Our method is evaluated both qualitatively and quantitatively, demonstrating that the contribution of this new information, as well as an enhanced loss function, improves depth estimation results, especially for specular areas.
Submitted 15 July, 2020;
originally announced July 2020.
-
Construction and Elicitation of a Black Box Model in the Game of Bridge
Authors:
Véronique Ventos,
Daniel Braun,
Colin Deheeger,
Jean Pierre Desmoulins,
Jean Baptiste Fantun,
Swann Legras,
Alexis Rimbaud,
Céline Rouveirol,
Henry Soldano,
Solène Thépaut
Abstract:
We address the problem of building a decision model for a specific bidding situation in the game of Bridge. We propose the following multi-step methodology: i) build a set of examples for the decision problem and use simulations to associate a decision with each example; ii) use supervised relational learning to build an accurate and readable model; iii) perform a joint analysis between domain experts and data scientists to improve the learning language, including the production by experts of a handmade model; iv) build a better, more readable and accurate model.
Submitted 4 April, 2022; v1 submitted 4 May, 2020;
originally announced May 2020.
-
The Two Kinds of Free Energy and the Bayesian Revolution
Authors:
Sebastian Gottwald,
Daniel A. Braun
Abstract:
The concept of free energy has its origins in 19th century thermodynamics, but has recently found its way into the behavioral and neural sciences, where it has been promoted for its wide applicability and has even been suggested as a fundamental principle of understanding intelligent behavior and brain function. We argue that there are essentially two different notions of free energy in current models of intelligent agency, that can both be considered as applications of Bayesian inference to the problem of action selection: one that appears when trading off accuracy and uncertainty based on a general maximum entropy principle, and one that formulates action selection in terms of minimizing an error measure that quantifies deviations of beliefs and policies from given reference models. The first approach provides a normative rule for action selection in the face of model uncertainty or when information processing capabilities are limited. The second approach directly aims to formulate the action selection problem as an inference problem in the context of Bayesian brain theories, also known as Active Inference in the literature. We elucidate the main ideas and discuss critical technical and conceptual issues revolving around these two notions of free energy that both claim to apply at all levels of decision-making, from the high-level deliberation of reasoning down to the low-level information processing of perception.
Submitted 6 December, 2020; v1 submitted 24 April, 2020;
originally announced April 2020.
-
Neural-Network Heuristics for Adaptive Bayesian Quantum Estimation
Authors:
Lukas J. Fiderer,
Jonas Schuff,
Daniel Braun
Abstract:
Quantum metrology promises unprecedented measurement precision but suffers in practice from the limited availability of resources such as the number of probes, their coherence time, or non-classical quantum states. The adaptive Bayesian approach to parameter estimation allows for an efficient use of resources thanks to adaptive experiment design. For its practical success, fast numerical solutions for the Bayesian update and the adaptive experiment design are crucial. Here we show that neural networks can be trained to become fast and strong experiment-design heuristics using a combination of an evolutionary strategy and reinforcement learning. Neural-network heuristics are shown to outperform established heuristics for the technologically important example of frequency estimation of a qubit that suffers from dephasing. Our method of creating neural-network heuristics is very general and complements the well-studied sequential Monte-Carlo method for Bayesian updates to form a complete framework for adaptive Bayesian quantum estimation.
Submitted 7 April, 2021; v1 submitted 4 March, 2020;
originally announced March 2020.
-
Hierarchical Expert Networks for Meta-Learning
Authors:
Heinke Hihn,
Daniel A. Braun
Abstract:
The goal of meta-learning is to train a model on a variety of learning tasks, such that it can adapt to new problems within only a few iterations. Here we propose a principled information-theoretic model that optimally partitions the underlying problem space such that specialized expert decision-makers solve the resulting sub-problems. To drive this specialization we impose the same kind of information processing constraints both on the partitioning and the expert decision-makers. We argue that this specialization leads to efficient adaptation to new tasks. To demonstrate the generality of our approach we evaluate three meta-learning domains: image classification, regression, and reinforcement learning.
Submitted 9 September, 2020; v1 submitted 31 October, 2019;
originally announced November 2019.
-
Improving the dynamics of quantum sensors with reinforcement learning
Authors:
Jonas Schuff,
Lukas J. Fiderer,
Daniel Braun
Abstract:
Recently proposed quantum-chaotic sensors achieve quantum enhancements in measurement precision by applying nonlinear control pulses to the dynamics of the quantum sensor while using classical initial states that are easy to prepare. Here, we use the cross-entropy method of reinforcement learning to optimize the strength and position of control pulses. Compared to the quantum-chaotic sensors with periodic control pulses in the presence of superradiant damping, we find that decoherence can be fought even better and measurement precision can be enhanced further by optimizing the control. In some examples, we find enhancements in sensitivity by more than an order of magnitude. By visualizing the evolution of the quantum state, the mechanism exploited by the reinforcement learning method is identified as a kind of spin-squeezing strategy that is adapted to the superradiant damping.
Submitted 10 March, 2020; v1 submitted 22 August, 2019;
originally announced August 2019.
-
An Information-theoretic On-line Learning Principle for Specialization in Hierarchical Decision-Making Systems
Authors:
Heinke Hihn,
Sebastian Gottwald,
Daniel A. Braun
Abstract:
Information-theoretic bounded rationality describes utility-optimizing decision-makers whose limited information-processing capabilities are formalized by information constraints. One of the consequences of bounded rationality is that resource-limited decision-makers can join together to solve decision-making problems that are beyond the capabilities of each individual. Here, we study an information-theoretic principle that drives division of labor and specialization when decision-makers with information constraints are joined together. We devise an on-line learning rule of this principle that learns a partitioning of the problem space such that it can be solved by specialized linear policies. We demonstrate the approach for decision-making problems whose complexity exceeds the capabilities of individual decision-makers, but can be solved by combining the decision-makers optimally. The strength of the model is that it is abstract and principled, yet has direct applications in classification, regression, reinforcement learning and adaptive control.
Submitted 5 December, 2019; v1 submitted 26 July, 2019;
originally announced July 2019.
-
Using Machine Learning and Natural Language Processing to Review and Classify the Medical Literature on Cancer Susceptibility Genes
Authors:
Yujia Bao,
Zhengyi Deng,
Yan Wang,
Heeyoon Kim,
Victor Diego Armengol,
Francisco Acevedo,
Nofal Ouardaoui,
Cathy Wang,
Giovanni Parmigiani,
Regina Barzilay,
Danielle Braun,
Kevin S Hughes
Abstract:
PURPOSE: The medical literature relevant to germline genetics is growing exponentially. Clinicians need tools for monitoring and prioritizing the literature to understand the clinical implications of the pathogenic genetic variants. We developed and evaluated two machine learning models to classify abstracts as relevant to the penetrance (risk of cancer for germline mutation carriers) or prevalence of germline genetic mutations. METHODS: We conducted literature searches in PubMed and retrieved paper titles and abstracts to create an annotated dataset for training and evaluating the two machine learning classification models. Our first model is a support vector machine (SVM) which learns a linear decision rule based on the bag-of-ngrams representation of each title and abstract. Our second model is a convolutional neural network (CNN) which learns a complex nonlinear decision rule based on the raw title and abstract. We evaluated the performance of the two models on the classification of papers as relevant to penetrance or prevalence. RESULTS: For penetrance classification, we annotated 3740 paper titles and abstracts and used 60% for training the model, 20% for tuning the model, and 20% for evaluating the model. The SVM model achieves 89.53% accuracy (percentage of papers that were correctly classified) while the CNN model achieves 88.95% accuracy. For prevalence classification, we annotated 3753 paper titles and abstracts. The SVM model achieves 89.14% accuracy while the CNN model achieves 89.13% accuracy. CONCLUSION: Our models achieve high accuracy in classifying abstracts as relevant to penetrance or prevalence. By facilitating literature review, this tool could help clinicians and researchers keep abreast of the burgeoning knowledge of gene-cancer associations and keep the knowledge bases for clinical decision support tools up to date.
Submitted 24 April, 2019;
originally announced April 2019.
-
Bounded rational decision-making from elementary computations that reduce uncertainty
Authors:
Sebastian Gottwald,
Daniel A. Braun
Abstract:
In its most basic form, decision-making can be viewed as a computational process that progressively eliminates alternatives, thereby reducing uncertainty. Such processes are generally costly, meaning that the amount of uncertainty that can be reduced is limited by the amount of available computational resources. Here, we introduce the notion of elementary computation based on a fundamental principle for probability transfers that reduce uncertainty. Elementary computations can be considered as the inverse of Pigou-Dalton transfers applied to probability distributions, closely related to the concepts of majorization, T-transforms, and generalized entropies that induce a preorder on the space of probability distributions. As a consequence we can define resource cost functions that are order-preserving and therefore monotonic with respect to the uncertainty reduction. This leads to a comprehensive notion of decision-making processes with limited resources. Along the way, we prove several new results on majorization theory, as well as on entropy and divergence measures.
Submitted 8 April, 2019;
originally announced April 2019.