-
Differentiable Optimization of Similarity Scores Between Models and Brains
Authors:
Nathan Cloos,
Moufan Li,
Markus Siegel,
Scott L. Brincat,
Earl K. Miller,
Guangyu Robert Yang,
Christopher J. Cueva
Abstract:
What metrics should guide the development of more realistic models of the brain? One proposal is to quantify the similarity between models and brains using methods such as linear regression, Centered Kernel Alignment (CKA), and angular Procrustes distance. To better understand the limitations of these similarity measures we analyze neural activity recorded in five experiments on nonhuman primates, and optimize synthetic datasets to become more similar to these neural recordings. How similar can these synthetic datasets be to neural activity while failing to encode task-relevant variables? We find that some measures, like linear regression and CKA, differ from angular Procrustes and yield high similarity scores even when task-relevant variables cannot be linearly decoded from the synthetic datasets. Synthetic datasets optimized to maximize similarity scores initially learn the first principal component of the target dataset, but angular Procrustes captures higher variance dimensions much earlier than methods like linear regression and CKA. We show in both theory and simulations how these scores change when different principal components are perturbed. Finally, we jointly optimize multiple similarity scores to find their allowed ranges, and show that a high angular Procrustes similarity, for example, implies a high CKA score, but not the converse.
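For readers unfamiliar with two of the measures named in this abstract, the following is a minimal sketch using their commonly cited definitions for column-centered data matrices (samples by features). It is not the authors' implementation; the function names `center`, `linear_cka`, and `angular_procrustes` are illustrative assumptions, and both inputs are assumed to share the same number of samples.

```python
import numpy as np

def center(X):
    """Subtract each column's mean so every feature has zero mean across samples."""
    return X - X.mean(axis=0, keepdims=True)

def linear_cka(X, Y):
    """Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F), in [0, 1]."""
    X, Y = center(X), center(Y)
    cross = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return float(cross / (norm_x * norm_y))

def angular_procrustes(X, Y):
    """Angular Procrustes distance in radians: arccos(||X^T Y||_* / (||X||_F ||Y||_F))."""
    X, Y = center(X), center(Y)
    cos = np.linalg.norm(X.T @ Y, ord="nuc") / (
        np.linalg.norm(X, ord="fro") * np.linalg.norm(Y, ord="fro")
    )
    # Clip guards against round-off pushing the cosine slightly above 1.
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Usage: a rotated copy of X should score as maximally similar under both measures.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
Q, _ = np.linalg.qr(rng.normal(size=(10, 10)))  # random orthogonal matrix
Y = X @ Q
print(linear_cka(X, Y), angular_procrustes(X, Y))
```

Both measures are invariant to orthogonal transformations of the features, which is why the rotated copy is indistinguishable from the original here.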
Submitted 9 July, 2024;
originally announced July 2024.
-
An efficient pipeline to compute patient-specific cerebral aneurysm wall tension
Authors:
Mostafa Jamshidian,
Benjamin Zwick,
Arosha S Dissanayake,
Adam Wittek,
Timothy J Phillips,
Stephen Honeybul,
Graeme J Hankey,
Karol Miller
Abstract:
Cerebral aneurysm rupture, leading to subarachnoid hemorrhage with a high mortality rate, disproportionately affects younger populations, resulting in a significant loss of productive life years. A significant proportion of these deaths is due to aneurysmal re-bleeding within the first three days following the initial bleed, prior to treatment. While early aneurysm treatment is recommended, there is no consensus on the ideal timing, and emergency treatment offers only an incremental benefit at a significant cost. Although various multivariable prediction models have been proposed to provide personalized risk assessments, no validated patient-specific predictor is available to rationalize emergency treatment. Furthermore, no model has yet incorporated emerging computational biomechanics-based biomarkers such as wall tension. In this paper, we proposed and validated an efficient semi-automatic pipeline to compute patient-specific cerebral aneurysm wall tension as a potential biomarker for the likelihood of re-bleeding. Our pipeline uses the patient's computed tomography angiography (CTA) image obtained at the time of subarachnoid hemorrhage diagnosis to create a patient-specific biomechanical model of the cerebral aneurysm using the finite element method. A distinctive feature of our approach is the straightforward model creation and wall tension computation using shell finite elements, without requiring patient-specific material properties or aneurysm wall thickness. Our non-invasive, patient-specific method for cerebral aneurysm wall tension can potentially provide individualized risk prediction and enhance clinical decision-making.
Submitted 6 July, 2024;
originally announced July 2024.
-
Abdominal aortic aneurysm wall stress: A 7-line code in MATLAB and a one-click software application
Authors:
Mostafa Jamshidian,
Saeideh Sekhavat,
Adam Wittek,
Karol Miller
Abstract:
An abdominal aortic aneurysm (AAA) is a life-threatening condition characterized by the irreversible dilation of the lower aorta, usually detected incidentally during imaging for other health issues. Current clinical practice for managing AAA relies on a one-size-fits-all approach, based on the aneurysm's maximum diameter and growth rate, which can lead to underestimation or overestimation of AAA rupture risk. Patient-specific AAA wall stress, computed using biomechanical models derived from medical images without needing patient-specific material properties, has been widely investigated for developing individualized AAA rupture risk predictors. Therefore, AAA wall stress, determined reliably and quickly, has the potential to enhance patient-specific treatment plans. This paper presents a 7-line code, written in MATLAB using the Partial Differential Equation Toolbox, for AAA wall stress computations via finite element analysis. The code takes AAA wall geometry as input and outputs stress components over the AAA wall domain. Additionally, we present a one-click standalone software application for AAA wall stress computation, developed based on our 7-line code using MATLAB Compiler. After verification, we used our code to compute AAA wall stress in ten patients. Our analysis indicated that the 99th percentile of maximum principal stress across all patients ranged from 0.320 MPa to 0.522 MPa, with an average of 0.401 MPa and a standard deviation of 0.056 MPa. Moreover, for every case, the MATLAB simulation time was less than a minute on a laptop workstation.
Submitted 6 July, 2024;
originally announced July 2024.
-
Enhance the Image: Super Resolution using Artificial Intelligence in MRI
Authors:
Ziyu Li,
Zihan Li,
Haoxiang Li,
Qiuyun Fan,
Karla L. Miller,
Wenchuan Wu,
Akshay S. Chaudhari,
Qiyuan Tian
Abstract:
This chapter provides an overview of deep learning techniques for improving the spatial resolution of MRI, ranging from convolutional neural networks, generative adversarial networks, to more advanced models including transformers, diffusion models, and implicit neural representations. Our exploration extends beyond the methodologies to scrutinize the impact of super-resolved images on clinical and neuroscientific assessments. We also cover various practical topics such as network architectures, image evaluation metrics, network loss functions, and training data specifics, including downsampling methods for simulating low-resolution images and dataset selection. Finally, we discuss existing challenges and potential future directions regarding the feasibility and reliability of deep learning-based MRI super-resolution, with the aim to facilitate its wider adoption to benefit various clinical and neuroscientific applications.
Submitted 19 June, 2024;
originally announced June 2024.
-
FlowMM: Generating Materials with Riemannian Flow Matching
Authors:
Benjamin Kurt Miller,
Ricky T. Q. Chen,
Anuroop Sriram,
Brandon M Wood
Abstract:
Crystalline materials are a fundamental component in next-generation technologies, yet modeling their distribution presents unique computational challenges. Of the plausible arrangements of atoms in a periodic lattice only a vanishingly small percentage are thermodynamically stable, which is a key indicator of the materials that can be experimentally realized. Two fundamental tasks in this area are to (a) predict the stable crystal structure of a known composition of elements and (b) propose novel compositions along with their stable structures. We present FlowMM, a pair of generative models that achieve state-of-the-art performance on both tasks while being more efficient and more flexible than competing methods. We generalize Riemannian Flow Matching to suit the symmetries inherent to crystals: translation, rotation, permutation, and periodic boundary conditions. Our framework enables the freedom to choose the flow base distributions, drastically simplifying the problem of learning crystal structures compared with diffusion models. In addition to standard benchmarks, we validate FlowMM's generated structures with quantum chemistry calculations, demonstrating that it is about 3x more efficient, in terms of integration steps, at finding stable materials compared to previous open methods.
Submitted 7 June, 2024;
originally announced June 2024.
-
Kinematics of Abdominal Aortic Aneurysms
Authors:
Mostafa Jamshidian,
Adam Wittek,
Saeideh Sekhavat,
Karol Miller
Abstract:
A search in Scopus within "Article title, Abstract, Keywords" unveils 2,444 documents focused on the biomechanics of Abdominal Aortic Aneurysm (AAA), mostly on AAA wall stress. Only 24 documents investigated AAA kinematics, an important topic that could potentially offer insights into the biomechanics of AAA. In this paper, we present an image-based approach for patient-specific, in vivo, and non-invasive AAA kinematic analysis using the patient's time-resolved 3D computed tomography angiography (4D CTA) images. Our approach relies on regularized deformable image registration for estimating wall displacement, estimation of the local wall strain as the ratio of its normal displacement to its local radius of curvature, and local surface fitting with non-deterministic outlier detection for estimating the wall radius of curvature. We verified our approach against synthetic ground truth image data created by warping a 3D CTA image of AAA using a realistic displacement field obtained from a finite element biomechanical model. We applied our approach to assess AAA wall displacements and strains in ten patients. Our kinematic analysis results indicated that the 99th percentile of circumferential wall strain, among all patients, ranged from 3.16% to 7.31%, with an average of 5.36% and a standard deviation of 1.28%.
Submitted 22 May, 2024;
originally announced May 2024.
-
Human-interpretable clustering of short-text using large language models
Authors:
Justin K. Miller,
Tristram J. Alexander
Abstract:
Large language models have seen extraordinary growth in popularity due to their human-like content generation capabilities. We show that these models can also be used to successfully cluster human-generated content, with success defined through the measures of distinctiveness and interpretability. This success is validated by both human reviewers and ChatGPT, providing an automated means to close the 'validation gap' that has challenged short-text clustering. Comparing the machine and human approaches we identify the biases inherent in each, and question the reliance on human-coding as the 'gold standard'. We apply our methodology to Twitter bios and find characteristic ways humans describe themselves, agreeing well with prior specialist work, but with interesting differences characteristic of the medium used to express identity.
Submitted 12 May, 2024;
originally announced May 2024.
-
Active Learning of Dynamics Using Prior Domain Knowledge in the Sampling Process
Authors:
Kevin S. Miller,
Adam J. Thorpe,
Ufuk Topcu
Abstract:
We present an active learning algorithm for learning dynamics that leverages side information by explicitly incorporating prior domain knowledge into the sampling process. Our proposed algorithm guides the exploration toward regions that demonstrate high empirical discrepancy between the observed data and an imperfect prior model of the dynamics derived from side information. Through numerical experiments, we demonstrate that this strategy explores regions of high discrepancy and accelerates learning while simultaneously reducing model uncertainty. We rigorously prove that our active learning algorithm yields a consistent estimate of the underlying dynamics by providing an explicit rate of convergence for the maximum predictive variance. We demonstrate the efficacy of our approach on an under-actuated pendulum system and on the half-cheetah MuJoCo environment.
Submitted 25 March, 2024;
originally announced March 2024.
-
Towards Full Automation of Geometry Extraction for Biomechanical Analysis of Abdominal Aortic Aneurysm; Neural Network-Based versus Classical Methodologies
Authors:
Farah Alkhatib,
Mostafa Jamshidian,
Donatien Le Liepvre,
Florian Bernard,
Ludovic Minvielle,
Adam Wittek,
Karol Miller
Abstract:
In this study we investigated the impact of image segmentation methods on the results of stress computation in the wall of abdominal aortic aneurysms (AAAs). We compared wall stress distributions and magnitudes calculated from geometry models obtained from classical semi-automated segmentation versus automated neural network-based segmentation. Ten different AAA contrast-enhanced computed tomography (CT) images were semi-automatically segmented by an analyst, taking, depending on the quality of an image, between 15 and 40 minutes of human effort per patient. The same images were automatically segmented using PRAEVAorta 2, commercial software by NUREA (https://www.nurea-soft.com/), developed based on artificial intelligence (AI) algorithms, requiring only 1-2 minutes of computer time per patient. Aneurysm wall stress calculations performed using the BioPARR software (https://bioparr.mech.uwa.edu.au/) revealed that, compared to the classical semi-automated segmentation, the automatic neural network-based segmentation leads to equivalent stress distributions, and slightly higher peak and 99th percentile maximum principal stress values. This difference is due to consistently larger lumen surface areas in automatically segmented models as compared to classical semi-automated segmentations, resulting in greater total pressure load on the wall. Our findings are a stepping stone toward a fully automated pipeline for biomechanical analysis of AAAs, starting with CT scans and concluding with wall stress assessment, while at the same time highlighting the critical importance of repeatable and accurate segmentation of the lumen, a difficult problem often underestimated in the literature.
Submitted 11 March, 2024;
originally announced March 2024.
-
Open Meshed Anatomy: Towards a comprehensive finite element hexahedral mesh derived from open atlases
Authors:
Andy Trung Huynh,
Benjamin Zwick,
Michael Halle,
Adam Wittek,
Karol Miller
Abstract:
Computational simulations using methods such as the finite element (FE) method rely on high-quality meshes for achieving accurate results. This study introduces a method for creating a high-quality hexahedral mesh using the Open Anatomy Project's brain atlas. Our atlas-based FE hexahedral mesh of the brain mitigates potential inaccuracies and uncertainties due to segmentation - a process that often requires input of an inexperienced analyst. It accomplishes this by leveraging existing segmentation from the atlas. We further extend the mesh's usability by forming a two-way correspondence between the atlas and mesh. This feature facilitates property assignment for computational simulations and enhances result analysis within an anatomical context. We demonstrate the application of the mesh by solving the electroencephalography (EEG) forward problem. Our method simplifies the mesh creation process, reducing time and effort, and provides a more comprehensive and contextually enriched visualisation of simulation outcomes.
Submitted 22 February, 2024;
originally announced February 2024.
-
Dirichlet Active Learning
Authors:
Kevin Miller,
Ryan Murray
Abstract:
This work introduces Dirichlet Active Learning (DiAL), a Bayesian-inspired approach to the design of active learning algorithms. Our framework models feature-conditional class probabilities as a Dirichlet random field and lends observational strength between similar features in order to calibrate the random field. This random field can then be utilized in learning tasks: in particular, we can use current estimates of mean and variance to conduct classification and active learning in the context where labeled data is scarce. We demonstrate the applicability of this model to low-label rate graph learning by constructing "propagation operators" based upon the graph Laplacian, and offer computational studies demonstrating the method's competitiveness with the state of the art. Finally, we provide rigorous guarantees regarding the ability of this approach to ensure both exploration and exploitation, expressed respectively in terms of cluster exploration and increased attention to decision boundaries.
Submitted 9 November, 2023;
originally announced November 2023.
-
Searching for Optimal Runtime Assurance via Reachability and Reinforcement Learning
Authors:
Kristina Miller,
Christopher K. Zeitler,
William Shen,
Kerianne Hobbs,
Sayan Mitra,
John Schierman,
Mahesh Viswanathan
Abstract:
A runtime assurance system (RTA) for a given plant enables the exercise of an untrusted or experimental controller while assuring safety with a backup (or safety) controller. The relevant computational design problem is to create a logic that assures safety by switching to the safety controller as needed, while maximizing some performance criteria, such as the utilization of the untrusted controller. Existing RTA design strategies are well-known to be overly conservative and, in principle, can lead to safety violations. In this paper, we formulate the optimal RTA design problem and present a new approach for solving it. Our approach relies on reward shaping and reinforcement learning. It can guarantee safety and leverage machine learning technologies for scalability. We have implemented this algorithm and present experimental results comparing our approach with state-of-the-art reachability and simulation-based RTA approaches in a number of scenarios using aircraft models in 3D space with complex safety requirements. Our approach can guarantee safety while increasing utilization of the experimental controller over existing approaches.
Submitted 6 October, 2023;
originally announced October 2023.
-
Simulation-based Inference with the Generalized Kullback-Leibler Divergence
Authors:
Benjamin Kurt Miller,
Marco Federici,
Christoph Weniger,
Patrick Forré
Abstract:
In Simulation-based Inference, the goal is to solve the inverse problem when the likelihood is only known implicitly. Neural Posterior Estimation commonly fits a normalized density estimator as a surrogate model for the posterior. This formulation cannot easily fit unnormalized surrogates because it optimizes the Kullback-Leibler divergence. We propose to optimize a generalized Kullback-Leibler divergence that accounts for the normalization constant in unnormalized distributions. The objective recovers Neural Posterior Estimation when the model class is normalized and unifies it with Neural Ratio Estimation, combining both into a single objective. We investigate a hybrid model that offers the best of both worlds by learning a normalized base distribution and a learned ratio. We also present benchmark results.
Submitted 3 October, 2023;
originally announced October 2023.
-
Cluster-aware Semi-supervised Learning: Relational Knowledge Distillation Provably Learns Clustering
Authors:
Yijun Dong,
Kevin Miller,
Qi Lei,
Rachel Ward
Abstract:
Despite the empirical success and practical significance of (relational) knowledge distillation that matches (the relations of) features between teacher and student models, the corresponding theoretical interpretations remain limited for various knowledge distillation paradigms. In this work, we take an initial step toward a theoretical understanding of relational knowledge distillation (RKD), with a focus on semi-supervised classification problems. We start by casting RKD as spectral clustering on a population-induced graph unveiled by a teacher model. Via a notion of clustering error that quantifies the discrepancy between the predicted and ground truth clusterings, we illustrate that RKD over the population provably leads to low clustering error. Moreover, we provide a sample complexity bound for RKD with limited unlabeled samples. For semi-supervised learning, we further demonstrate the label efficiency of RKD through a general framework of cluster-aware semi-supervised learning that assumes low clustering errors. Finally, by unifying data augmentation consistency regularization into this cluster-aware framework, we show that despite the common effect of learning accurate clusterings, RKD facilitates a "global" perspective through spectral clustering, whereas consistency regularization focuses on a "local" perspective via expansion.
Submitted 23 October, 2023; v1 submitted 20 July, 2023;
originally announced July 2023.
-
Novel Batch Active Learning Approach and Its Application to Synthetic Aperture Radar Datasets
Authors:
James Chapman,
Bohan Chen,
Zheng Tan,
Jeff Calder,
Kevin Miller,
Andrea L. Bertozzi
Abstract:
Active learning improves the performance of machine learning methods by judiciously selecting a limited number of unlabeled data points to query for labels, with the aim of maximally improving the underlying classifier's performance. Recent gains have been made using sequential active learning for synthetic aperture radar (SAR) data (arXiv:2204.00005). In each iteration, sequential active learning selects a query set of size one, while batch active learning selects a query set of multiple data points. While batch active learning methods exhibit greater efficiency, the challenge lies in maintaining model accuracy relative to sequential active learning methods. We developed a novel, two-part approach for batch active learning: Dijkstra's Annulus Core-Set (DAC) for core-set generation and LocalMax for batch sampling. The batch active learning process that combines DAC and LocalMax achieves nearly the same accuracy as sequential active learning but is more efficient, with the gain proportional to the batch size. As an application, a pipeline is built based on transfer learning feature embedding, graph learning, DAC, and LocalMax to classify the FUSAR-Ship and OpenSARShip datasets. Our pipeline outperforms the state-of-the-art CNN-based methods.
Submitted 19 July, 2023;
originally announced July 2023.
-
RTAEval: A framework for evaluating runtime assurance logic
Authors:
Kristina Miller,
Christopher K. Zeitler,
William Shen,
Mahesh Viswanathan,
Sayan Mitra
Abstract:
Runtime assurance (RTA) addresses the problem of keeping an autonomous system safe while using an untrusted (or experimental) controller. This can be done via logic that explicitly switches between the untrusted controller and a safety controller, or logic that filters the input provided by the untrusted controller. While several tools implement specific instances of RTAs, there is currently no framework for evaluating different approaches. Given the importance of the RTA problem in building safe autonomous systems, an evaluation tool is needed. In this paper, we present the RTAEval framework as a low code framework that can be used to quickly evaluate different RTA logics for different types of agents in a variety of scenarios. RTAEval is designed to quickly create scenarios, run different RTA logics, and collect data that can be used to evaluate and visualize performance. In this paper, we describe different components of RTAEval and show how it can be used to create and evaluate scenarios involving multiple aircraft models.
Submitted 5 June, 2023;
originally announced June 2023.
-
Balancing Simulation-based Inference for Conservative Posteriors
Authors:
Arnaud Delaunoy,
Benjamin Kurt Miller,
Patrick Forré,
Christoph Weniger,
Gilles Louppe
Abstract:
Conservative inference is a major concern in simulation-based inference. It has been shown that commonly used algorithms can produce overconfident posterior approximations. Balancing has empirically proven to be an effective way to mitigate this issue. However, its application remains limited to neural ratio estimation. In this work, we extend balancing to any algorithm that provides a posterior density. In particular, we introduce a balanced version of both neural posterior estimation and contrastive neural ratio estimation. We show empirically that the balanced versions tend to produce conservative posterior approximations on a wide variety of benchmarks. In addition, we provide an alternative interpretation of the balancing condition in terms of the $χ^2$ divergence.
Submitted 21 April, 2023;
originally announced April 2023.
-
Autonomous Local Catalog Maintenance of Close Proximity Satellite Systems on Closed Natural Motion Trajectories
Authors:
Christopher W. Hays,
Kristina Miller,
Alexander Soderlund,
Sean Phillips,
Troy Henderson
Abstract:
To enable space mission sets like on-orbit servicing and manufacturing, agents in close proximity may be operating too close together for ground sensors to yield resolved localization solutions to operators. This leads to a requirement that the systems maintain a catalog of their local neighborhood; however, this may impose a large burden on each agent by requiring updating and maintenance of this catalog at each node. To alleviate this burden, this paper considers the case of a single satellite agent (a chief) updating a single catalog. More specifically, we consider the case of numerous satellite deputy agents in a local neighborhood of a chief, where the goal of the chief satellite is to maintain and update a catalog of all agents within this neighborhood through onboard measurements. We consider the agents having relative translational and attitude motion dynamics between the chief and deputy, with the chief centered at the origin of the frame. We provide an end-to-end solution to this problem by coupling a supervisory control method with a Bayesian filter that propagates the belief state and provides the catalog solutions to the supervisor. The goal of the supervisory controller is to determine which agent to look at and at which times while adhering to the constraints of the chief satellite. We validate this solution numerically with three agents.
Submitted 9 February, 2023; v1 submitted 1 February, 2023;
originally announced February 2023.
-
Poisson Reweighted Laplacian Uncertainty Sampling for Graph-based Active Learning
Authors:
Kevin Miller,
Jeff Calder
Abstract:
We show that uncertainty sampling is sufficient to achieve exploration versus exploitation in graph-based active learning, as long as the measure of uncertainty properly aligns with the underlying model and the model properly reflects uncertainty in unexplored regions. In particular, we use a recently developed algorithm, Poisson ReWeighted Laplace Learning (PWLL) for the classifier and we introduce an acquisition function designed to measure uncertainty in this graph-based classifier that identifies unexplored regions of the data. We introduce a diagonal perturbation in PWLL which produces exponential localization of solutions, and controls the exploration versus exploitation tradeoff in active learning. We use the well-posed continuum limit of PWLL to rigorously analyze our method, and present experimental results on a number of graph-based image classification problems.
Submitted 27 October, 2022;
originally announced October 2022.
-
Contrastive Neural Ratio Estimation for Simulation-based Inference
Authors:
Benjamin Kurt Miller,
Christoph Weniger,
Patrick Forré
Abstract:
Likelihood-to-evidence ratio estimation is usually cast as either a binary (NRE-A) or a multiclass (NRE-B) classification task. In contrast to the binary classification framework, the current formulation of the multiclass version has an intrinsic and unknown bias term, making otherwise informative diagnostics unreliable. We propose a multiclass framework free from the bias inherent to NRE-B at optimum, leaving us in the position to run diagnostics that practitioners depend on. It also recovers NRE-A in one corner case and NRE-B in the limiting case. For fair comparison, we benchmark the behavior of all algorithms in both familiar and novel training regimes: when jointly drawn data is unlimited, when data is fixed but prior draws are unlimited, and in the commonplace fixed data and parameters setting. Our investigations reveal that the highest performing models are distant from the competitors (NRE-A, NRE-B) in hyperparameter space. We make a recommendation for hyperparameters distinct from the previous models. We suggest two bounds on the mutual information as performance metrics for simulation-based inference methods, without the need for posterior samples, and provide experimental results. This version corrects a minor implementation error in $γ$, improving results.
Submitted 4 July, 2024; v1 submitted 10 October, 2022;
originally announced October 2022.
-
Automated modeling of brain bioelectric activity within the 3D Slicer environment
Authors:
Saima Safdar,
Benjamin Zwick,
George Bourantas,
Grand Joldes,
Damon Hyde,
Simon Warfield,
Adam Wittek,
Karol Miller
Abstract:
Electrocorticography (ECoG) or intracranial electroencephalography (iEEG) monitors electric potential directly on the surface of the brain and can be used to inform treatment planning for epilepsy surgery when paired with numerical modeling. For solving the inverse problem in epilepsy seizure onset localization, accurate solution of the iEEG forward problem is critical, which requires accurate representation of the patient's brain geometry and tissue electrical conductivity. In this study, we present an automatic framework for constructing the brain volume conductor model for solving the iEEG forward problem and visualizing the brain bioelectric field on a deformed patient-specific brain model within the 3D Slicer environment. We solve the iEEG forward problem on the predicted postoperative geometry using the finite element method (FEM), which accounts for patient-specific inhomogeneity and anisotropy of tissue conductivity. We use an epilepsy case study to illustrate the workflow of our framework, developed and integrated within 3D Slicer.
Submitted 27 July, 2022;
originally announced August 2022.
-
Generation of Patient-specific Structured Hexahedral Mesh of Aortic Aneurysm Wall
Authors:
Farah Alkhatib,
George C. Bourantas,
Adam Wittek,
Karol Miller
Abstract:
Abdominal Aortic Aneurysm (AAA) is an enlargement of the lower part of the aorta, the main artery, to 1.5 times its normal diameter. AAA can cause death if rupture occurs. Elective surgeries are recommended to prevent rupture based on geometrical measurements of AAA diameter and diameter growth rate. The reliability of these geometric parameters for predicting AAA rupture risk has been questioned, and biomechanical assessment has been proposed to distinguish between patients with high and low risk of rupture. Stress in the aneurysm wall is the main variable of interest in such assessment. Most studies use the finite element method to compute AAA stress. This requires discretising the patient-specific geometry (aneurysm wall and intraluminal thrombus, ILT) into finite elements/meshes. Tetrahedral elements are most commonly used as they can be generated in a seemingly automated and effortless way. In practice, however, due to complex aneurysm geometry, the process tends to require time-consuming mesh optimisation to ensure sufficiently high quality of the tetrahedral elements. Furthermore, ensuring solution convergence requires a large number of tetrahedral elements, which leads to long computation times. In this study, we focus on the generation of hexahedral meshes, as they are known to provide a converged solution with a smaller number of elements than tetrahedral meshes. Generation of hexahedral meshes for continua with complex/irregular geometry, such as aneurysms, requires analyst interaction. We propose a procedure for generating a high-quality patient-specific hexahedral discretisation of the aneurysm wall using the algorithms available in a commercial software package for mesh generation. For aneurysm cases, we demonstrate that the procedure facilitates patient-specific mesh generation within a timeframe consistent with clinical workflow constraints while requiring only limited input from the analyst.
Submitted 14 June, 2022; v1 submitted 13 June, 2022;
originally announced June 2022.
-
Towards on-sky adaptive optics control using reinforcement learning
Authors:
J. Nousiainen,
C. Rajani,
M. Kasper,
T. Helin,
S. Y. Haffert,
C. Vérinaud,
J. R. Males,
K. Van Gorkom,
L. M. Close,
J. D. Long,
A. D. Hedglen,
O. Guyon,
L. Schatz,
M. Kautz,
J. Lumbres,
A. Rodack,
J. M. Knight,
K. Miller
Abstract:
The direct imaging of potentially habitable exoplanets is one prime science case for the next generation of high contrast imaging instruments on ground-based extremely large telescopes. To reach this demanding science goal, the instruments are equipped with eXtreme Adaptive Optics (XAO) systems which will control thousands of actuators at a framerate of kilohertz to several kilohertz. Most of the habitable exoplanets are located at small angular separations from their host stars, where the current XAO systems' control laws leave strong residuals. Current AO control strategies like static matrix-based wavefront reconstruction and integrator control suffer from temporal delay error and are sensitive to mis-registration, i.e., to dynamic variations of the control system geometry. We aim to produce control methods that cope with these limitations, provide a significantly improved AO correction and, therefore, reduce the residual flux in the coronagraphic point spread function.
We extend previous work in Reinforcement Learning for AO. The improved method, called PO4AO, learns a dynamics model and optimizes a control neural network, called a policy. We introduce the method and study it through numerical simulations of XAO with Pyramid wavefront sensing for the 8-m and 40-m telescope aperture cases. We further implemented PO4AO and carried out experiments in a laboratory environment using MagAO-X at the Steward laboratory. PO4AO provides the desired performance by improving the coronagraphic contrast by a factor of 3-5 within the control region of the DM and Pyramid WFS, both in numerical simulations and in the laboratory. The presented method is also quick to train, i.e., on timescales of typically 5-10 seconds, and the inference time is sufficiently small (< ms) to be used in real-time control for XAO with currently available hardware even for extremely large telescopes.
Submitted 16 May, 2022;
originally announced May 2022.
-
Recovery by discretization corrected particle strength exchange (DC PSE) operators
Authors:
Benjamin F. Zwick,
George C. Bourantas,
Farah Alkhatib,
Adam Wittek,
Karol Miller
Abstract:
A new recovery technique based on discretization corrected particle strength exchange (DC PSE) operators is developed in this paper. DC PSE is a collocation method that can be used to compute derivatives directly at nodal points, instead of by projection from Gauss points as is done in many finite element-based recovery techniques. The proposed method is truly meshless and does not require patches of elements to be defined, which makes it generally applicable to point clouds and arbitrary element topologies. Numerical examples show that the proposed method is accurate and robust.
Submitted 15 April, 2022;
originally announced April 2022.
-
Graph-based Active Learning for Semi-supervised Classification of SAR Data
Authors:
Kevin Miller,
John Mauro,
Jason Setiadi,
Xoaquin Baca,
Zhan Shi,
Jeff Calder,
Andrea L. Bertozzi
Abstract:
We present a novel method for classification of Synthetic Aperture Radar (SAR) data by combining ideas from graph-based learning and neural network methods within an active learning framework. Graph-based methods in machine learning are based on a similarity graph constructed from the data. When the data consists of raw images composed of scenes, extraneous information can make the classification task more difficult. In recent years, neural network methods have been shown to provide a promising framework for extracting patterns from SAR images. These methods, however, require ample training data to avoid overfitting. At the same time, such training data are often unavailable for applications of interest, such as automatic target recognition (ATR) and SAR data. We use a Convolutional Neural Network Variational Autoencoder (CNNVAE) to embed SAR data into a feature space, and then construct a similarity graph from the embedded data and apply graph-based semi-supervised learning techniques. The CNNVAE feature embedding and graph construction require no labeled data, which reduces overfitting and improves the generalization performance of graph learning at low label rates. Furthermore, the method easily incorporates a human in the loop for active learning in the data-labeling process. We present promising results and compare them to other standard machine learning methods on the Moving and Stationary Target Acquisition and Recognition (MSTAR) dataset for ATR with small amounts of labeled data.
Submitted 30 March, 2022;
originally announced April 2022.
-
Generative Coarse-Graining of Molecular Conformations
Authors:
Wujie Wang,
Minkai Xu,
Chen Cai,
Benjamin Kurt Miller,
Tess Smidt,
Yusu Wang,
Jian Tang,
Rafael Gómez-Bombarelli
Abstract:
Coarse-graining (CG) of molecular simulations simplifies the particle representation by grouping selected atoms into pseudo-beads and drastically accelerates simulation. However, such a CG procedure induces information losses, which makes accurate backmapping, i.e., restoring fine-grained (FG) coordinates from CG coordinates, a long-standing challenge. Inspired by the recent progress in generative models and equivariant networks, we propose a novel model that rigorously embeds the vital probabilistic nature and geometric consistency requirements of the backmapping transformation. Our model encodes the FG uncertainties into an invariant latent space and decodes them back to FG geometries via equivariant convolutions. To standardize the evaluation of this domain, we provide three comprehensive benchmarks based on molecular dynamics trajectories. Experiments show that our approach always recovers more realistic structures and outperforms existing data-driven methods by a significant margin.
Submitted 16 June, 2022; v1 submitted 28 January, 2022;
originally announced January 2022.
-
Efficient and Reliable Overlay Networks for Decentralized Federated Learning
Authors:
Yifan Hua,
Kevin Miller,
Andrea L. Bertozzi,
Chen Qian,
Bao Wang
Abstract:
We propose near-optimal overlay networks based on $d$-regular expander graphs to accelerate decentralized federated learning (DFL) and improve its generalization. In DFL a massive number of clients are connected by an overlay network, and they solve machine learning problems collaboratively without sharing raw data. Our overlay network design integrates spectral graph theory and the theoretical convergence and generalization bounds for DFL. As such, our proposed overlay networks accelerate convergence, improve generalization, and enhance robustness to client failures in DFL with theoretical guarantees. Also, we present an efficient algorithm to convert a given graph to a practical overlay network and maintain the network topology after potential client failures. We numerically verify the advantages of DFL with our proposed networks on various benchmark tasks, ranging from image classification to language modeling using hundreds of clients.
Submitted 12 December, 2021;
originally announced December 2021.
-
Automatically detecting anomalous exoplanet transits
Authors:
Christoph J. Hönes,
Benjamin Kurt Miller,
Ana M. Heras,
Bernard H. Foing
Abstract:
Raw light curve data from exoplanet transits is too complex to naively apply traditional outlier detection methods. We propose an architecture which estimates a latent representation of both the main transit and residual deviations with a pair of variational autoencoders. We show, using two fabricated datasets, that our latent representations of anomalous transit residuals are significantly more amenable to outlier detection than raw data or the latent representation of a traditional variational autoencoder. We then apply our method to real exoplanet transit data. Our study is the first which automatically identifies anomalous exoplanet transit light curves. We additionally release three first-of-their-kind datasets to enable further research.
Submitted 16 November, 2021;
originally announced November 2021.
-
Fast and Credible Likelihood-Free Cosmology with Truncated Marginal Neural Ratio Estimation
Authors:
Alex Cole,
Benjamin Kurt Miller,
Samuel J. Witte,
Maxwell X. Cai,
Meiert W. Grootes,
Francesco Nattino,
Christoph Weniger
Abstract:
Sampling-based inference techniques are central to modern cosmological data analysis; these methods, however, scale poorly with dimensionality and typically require approximate or intractable likelihoods. In this paper we describe how Truncated Marginal Neural Ratio Estimation (TMNRE) (a new approach in so-called simulation-based inference) naturally evades these issues, improving the $(i)$ efficiency, $(ii)$ scalability, and $(iii)$ trustworthiness of the inferred posteriors. Using measurements of the Cosmic Microwave Background (CMB), we show that TMNRE can achieve converged posteriors using orders of magnitude fewer simulator calls than conventional Markov Chain Monte Carlo (MCMC) methods. Remarkably, the required number of samples is effectively independent of the number of nuisance parameters. In addition, a property called \emph{local amortization} allows the performance of rigorous statistical consistency checks that are not accessible to sampling-based methods. TMNRE promises to become a powerful tool for cosmological data analysis, particularly in the context of extended cosmologies, where the timescale required for conventional sampling-based inference methods to converge can greatly exceed that of simple cosmological models such as $Λ$CDM. To perform these computations, we use an implementation of TMNRE via the open-source code \texttt{swyft}.
Submitted 8 November, 2022; v1 submitted 15 November, 2021;
originally announced November 2021.
-
Provably Robust Model-Centric Explanations for Critical Decision-Making
Authors:
Cecilia G. Morales,
Nicholas Gisolfi,
Robert Edman,
James K. Miller,
Artur Dubrawski
Abstract:
We recommend using a model-centric, Boolean Satisfiability (SAT) formalism to obtain useful explanations of trained model behavior, different and complementary to what can be gleaned from LIME and SHAP, popular data-centric explanation tools in Artificial Intelligence (AI). We compare and contrast these methods, and show that data-centric methods may yield brittle explanations of limited practical utility. The model-centric framework, however, can offer actionable insights into risks of using AI models in practice. For critical applications of AI, split-second decision making is best informed by robust explanations that are invariant to properties of data, the capability offered by model-centric frameworks.
Submitted 26 October, 2021;
originally announced October 2021.
-
Model-Change Active Learning in Graph-Based Semi-Supervised Learning
Authors:
Kevin Miller,
Andrea L. Bertozzi
Abstract:
Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier. A challenge is to identify which points to label to best improve performance while limiting the number of new labels. "Model-change" active learning quantifies the resulting change incurred in the classifier by introducing the additional label(s). We pair this idea with graph-based semi-supervised learning methods that use the spectrum of the graph Laplacian matrix, which can be truncated to avoid prohibitively large computational and storage costs. We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution. We show a variety of multiclass examples that illustrate improved performance over the prior state of the art.
Submitted 14 October, 2021;
originally announced October 2021.
-
Truncated Marginal Neural Ratio Estimation
Authors:
Benjamin Kurt Miller,
Alex Cole,
Patrick Forré,
Gilles Louppe,
Christoph Weniger
Abstract:
Parametric stochastic simulators are ubiquitous in science, often featuring high-dimensional input parameters and/or an intractable likelihood. Performing Bayesian parameter inference in this context can be challenging. We present a neural simulation-based inference algorithm which simultaneously offers simulation efficiency and fast empirical posterior testability, which is unique among modern algorithms. Our approach is simulation efficient by simultaneously estimating low-dimensional marginal posteriors instead of the joint posterior and by proposing simulations targeted to an observation of interest via a prior suitably truncated by an indicator function. Furthermore, by estimating a locally amortized posterior our algorithm enables efficient empirical tests of the robustness of the inference results. Since scientists cannot access the ground truth, these tests are necessary for trusting inference in real-world applications. We perform experiments on a marginalized version of the simulation-based inference benchmark and two complex and narrow posteriors, highlighting the simulator efficiency of our algorithm as well as the quality of the estimated marginal posteriors.
Submitted 26 October, 2021; v1 submitted 2 July, 2021;
originally announced July 2021.
-
Formalizing Hall's Marriage Theorem in Lean
Authors:
Alena Gusakov,
Bhavik Mehta,
Kyle A. Miller
Abstract:
We formalize Hall's Marriage Theorem in the Lean theorem prover for inclusion in mathlib, which is a community-driven effort to build a unified mathematics library for Lean. One goal of the mathlib project is to contain all of the topics of a complete undergraduate mathematics education.
We provide three presentations of the main theorem statement: in terms of indexed families of finite sets, of relations on types, and of matchings in bipartite graphs. We also formalize a version of Kőnig's lemma (in terms of inverse limits) to boost the theorem to the case of countably infinite index sets. We give a description of the design of the recent mathlib library for simple graphs, and we also give a necessary and sufficient condition for a simple graph to carry a function.
Submitted 31 December, 2020;
originally announced January 2021.
-
Simulation-efficient marginal posterior estimation with swyft: stop wasting your precious time
Authors:
Benjamin Kurt Miller,
Alex Cole,
Gilles Louppe,
Christoph Weniger
Abstract:
We present algorithms (a) for nested neural likelihood-to-evidence ratio estimation, and (b) for simulation reuse via an inhomogeneous Poisson point process cache of parameters and corresponding simulations. Together, these algorithms enable automatic and extremely simulator-efficient estimation of marginal and joint posteriors. The algorithms are applicable to a wide range of physics and astronomy problems and typically offer an order of magnitude better simulator efficiency than traditional likelihood-based sampling methods. Our approach is an example of likelihood-free inference; thus it is also applicable to simulators which do not offer a tractable likelihood function. Simulator runs are never rejected and can be automatically reused in future analysis. As a functional prototype implementation, we provide the open-source software package swyft.
Submitted 27 November, 2020;
originally announced November 2020.
-
How Much Ad Viewability is Enough? The Effect of Display Ad Viewability on Advertising Effectiveness
Authors:
Christina Uhl,
Nadia Abou Nabout,
Klaus Miller
Abstract:
A large share of all online display advertisements (ads) are never seen by a human. For instance, an ad could appear below the page fold, where a user never scrolls. Yet, an ad is essentially ineffective if it is not at least somewhat viewable. Ad viewability - which refers to the pixel percentage-in-view and the exposure duration of an online display ad - has recently garnered great interest among digital advertisers and publishers. However, we know very little about the impact of ad viewability on advertising effectiveness. We work to close this gap by analyzing a large-scale observational data set with more than 350,000 ad impressions similar to the data sets that are typically available to digital advertisers and publishers. This analysis reveals that longer exposure durations (>10 seconds) and 100% visible pixels do not appear to be optimal in generating view-throughs. The highest view-through rates seem to be generated with relatively lower pixel/second-combinations of 50%/1, 50%/5, 75%/1, and 75%/5. However, this analysis does not account for user behavior that may be correlated with or even drive ad viewability and may therefore result in endogeneity issues. Consequently, we manipulated ad viewability in a randomized online experiment for a major European news website, finding the highest ad recognition rates among relatively higher pixel/second-combinations of 75%/10, 100%/5 and 100%/10. Everything below 75% or 5 seconds performs worse. Yet, we find that it may be sufficient to have either a long exposure duration or high pixel percentage-in-view to reach high advertising effectiveness. Our results provide guidance to advertisers enabling them to establish target viewability rates more appropriately and to publishers who wish to differentiate their viewability products.
Submitted 26 August, 2020;
originally announced August 2020.
-
Relevance of Rotationally Equivariant Convolutions for Predicting Molecular Properties
Authors:
Benjamin Kurt Miller,
Mario Geiger,
Tess E. Smidt,
Frank Noé
Abstract:
Equivariant neural networks (ENNs) are graph neural networks embedded in $\mathbb{R}^3$ and are well suited for predicting molecular properties. The ENN library e3nn has customizable convolutions, which can be designed to depend only on distances between points, or also on angular features, making them rotationally invariant, or equivariant, respectively. This paper studies the practical value of including angular dependencies for molecular property prediction directly via an ablation study with \texttt{e3nn} and the QM9 data set. We find that, for fixed network depth and parameter count, adding angular features decreased test error by an average of 23%. Meanwhile, increasing network depth decreased test error by only 4% on average, implying that rotationally equivariant layers are comparatively parameter efficient. We present an explanation of the accuracy improvement on the dipole moment, the target which benefited most from the introduction of angular features.
Submitted 24 November, 2020; v1 submitted 19 August, 2020;
originally announced August 2020.
-
Intracranial hemodynamics simulations: An efficient and accurate immersed boundary scheme
Authors:
D. S. Lampropoulos,
G. C. Bourantas,
B. F. Zwick,
G. C. Kagadis,
A. Wittek,
K. Miller,
V. C. Loukopoulos
Abstract:
Computational fluid dynamics (CFD) studies have been increasingly used for blood flow simulations in intracranial aneurysms (ICAs). However, despite the continuous progress of body-fitted CFD solvers, generating a high-quality mesh is still the bottleneck of the CFD simulation, and strongly affects the accuracy of the numerical solution. To overcome this challenge, which will allow performing CFD simulations efficiently for a large number of aneurysm cases, we use an Immersed Boundary (IB) method. The proposed scheme relies on Cartesian grids to solve the incompressible Navier-Stokes (N-S) equations, using a finite element solver, and Lagrangian points to discretize the immersed object. All grid generations are conducted through automated algorithms which require no user input. Consequently, we verify the proposed method by comparing our numerical findings (velocity values) with published experimental results. Finally, we test the ability of the scheme to efficiently handle hemodynamic simulations on complex geometries on a sample of four patient-specific intracranial aneurysms.
Submitted 27 July, 2020;
originally announced July 2020.
-
Posterior Consistency of Semi-Supervised Regression on Graphs
Authors:
Andrea L. Bertozzi,
Bamdad Hosseini,
Hao Li,
Kevin Miller,
Andrew M. Stuart
Abstract:
Graph-based semi-supervised regression (SSR) is the problem of estimating the value of a function on a weighted graph from its values (labels) on a small subset of the vertices. This paper is concerned with the consistency of SSR in the context of classification, in the setting where the labels have small noise and the underlying graph weighting is consistent with well-clustered nodes. We present a Bayesian formulation of SSR in which the weighted graph defines a Gaussian prior, using a graph Laplacian, and the labeled data defines a likelihood. We analyze the rate of contraction of the posterior measure around the ground truth in terms of parameters that quantify the small label error and inherent clustering in the graph. We obtain bounds on the rates of contraction and illustrate their sharpness through numerical experiments. The analysis also gives insight into the choice of hyperparameters that enter the definition of the prior.
Submitted 24 March, 2021; v1 submitted 24 July, 2020;
originally announced July 2020.
-
Efficient Graph-Based Active Learning with Probit Likelihood via Gaussian Approximations
Authors:
Kevin Miller,
Hao Li,
Andrea L. Bertozzi
Abstract:
We present a novel adaptation of active learning to graph-based semi-supervised learning (SSL) under non-Gaussian Bayesian models. We present an approximation of non-Gaussian distributions to adapt previously Gaussian-based acquisition functions to these more general cases. We develop an efficient rank-one update for applying "look-ahead" based methods as well as model retraining. We also introduce a novel "model change" acquisition function based on these approximations that further expands the available collection of active learning acquisition functions for such methods.
Submitted 21 July, 2020;
originally announced July 2020.
-
Deep Reinforcement Learning and its Neuroscientific Implications
Authors:
Matthew Botvinick,
Jane X. Wang,
Will Dabney,
Kevin J. Miller,
Zeb Kurth-Nelson
Abstract:
The emergence of powerful artificial intelligence is defining new research directions in neuroscience. To date, this research has focused largely on deep neural networks trained using supervised learning, in tasks such as image classification. However, there is another area of recent AI work which has so far received less attention from neuroscientists, but which may have profound neuroscientific implications: deep reinforcement learning. Deep RL offers a comprehensive framework for studying the interplay among learning, representation and decision-making, offering to the brain sciences a new set of research tools and a wide range of novel hypotheses. In the present review, we provide a high-level introduction to deep RL, discuss some of its initial applications to neuroscience, and survey its wider implications for research on brain and behavior, concluding with a list of opportunities for next-stage research.
Submitted 7 July, 2020;
originally announced July 2020.
-
Immersed boundary finite element method for blood flow simulation
Authors:
G. C. Bourantas,
D. L. Lampropoulos,
B. F. Zwick,
V. C. Loukopoulos,
A. Wittek,
K. Miller
Abstract:
We present an efficient and accurate immersed boundary (IB) finite element (FE) solver for numerically solving incompressible Navier-Stokes equations. Particular emphasis is given to internal flows with complex geometries (blood flow in the vasculature system). IB methods are computationally costly for internal flows, mainly due to the large percentage of grid points that lie outside the flow domain. In this study, we apply a local refinement strategy, along with a domain reduction approach in order to reduce the grid that covers the flow domain and increase the percentage of the grid nodes that fall inside the flow domain. The proposed method utilizes an efficient and accurate FE solver with the incremental pressure correction scheme (IPCS), along with the boundary condition enforced IB method to numerically solve the transient, incompressible Navier-Stokes flow equations. We verify the accuracy of the numerical method using the analytical solution for Poiseuille flow in a cylinder. We further examine the accuracy and applicability of the proposed method by considering flow within complex geometries, such as blood flow in aneurysmal vessels and the aorta, flow configurations which would otherwise be extremely difficult to solve by most IB methods. Our method offers high accuracy, as demonstrated by the verification examples, and high efficiency, as demonstrated through the solution of blood flow within complex geometry on an off-the-shelf laptop computer.
Submitted 4 July, 2020;
originally announced July 2020.
-
Finding Symmetry Breaking Order Parameters with Euclidean Neural Networks
Authors:
Tess E. Smidt,
Mario Geiger,
Benjamin Kurt Miller
Abstract:
Curie's principle states that "when effects show certain asymmetry, this asymmetry must be found in the causes that gave rise to them". We demonstrate that symmetry equivariant neural networks uphold Curie's principle and can be used to articulate many symmetry-relevant scientific questions into simple optimization problems. We prove these properties mathematically and demonstrate them numerically by training a Euclidean symmetry equivariant neural network to learn symmetry-breaking input to deform a square into a rectangle and to generate octahedra tilting patterns in perovskites.
Submitted 26 October, 2020; v1 submitted 4 July, 2020;
originally announced July 2020.
-
Simple and robust element-free Galerkin method with interpolating shape functions for finite deformation elasticity
Authors:
George Bourantas,
Benjamin F. Zwick,
Grand Joldes,
Adam Wittek,
Karol Miller
Abstract:
In this paper, we present a meshless method belonging to the family of element-free Galerkin (EFG) methods. The distinguishing feature of the presented meshless method is that it allows accurate enforcement of essential boundary conditions. The method uses a total Lagrangian formulation with explicit time integration to facilitate code simplicity and robust computations in applications that involve large deformations and non-linear materials. We use a regularized weight function, which closely approximates the Kronecker delta, to generate interpolating shape functions. The imposition of the prescribed displacements on the boundary becomes as straightforward as in the finite element (FE) method. The effectiveness and accuracy of the proposed method are demonstrated using 3D numerical examples that include cylinder indentation by 70% of its initial height, and indentation of the brain.
Submitted 14 May, 2020;
originally announced May 2020.
-
System-Level Predictive Maintenance: Review of Research Literature and Gap Analysis
Authors:
Kyle Miller,
Artur Dubrawski
Abstract:
This paper reviews current literature in the field of predictive maintenance from the system point of view. We differentiate the existing capabilities of condition estimation and failure risk forecasting as currently applied to simple components, from the capabilities needed to solve the same tasks for complex assets. System-level analysis faces more complex latent degradation states; it has to comprehensively account for active maintenance programs at each component level and consider coupling between different maintenance actions, while reflecting increased monetary and safety costs for system failures. As a result, methods that are effective for forecasting risk and informing maintenance decisions regarding individual components do not readily scale to provide reliable sub-system or system level insights. A novel holistic modeling approach is needed to incorporate available structural and physical knowledge and naturally handle the complexities of actively fielded and maintained assets.
Submitted 11 May, 2020;
originally announced May 2020.
-
Recognizing Affiliation: Using Behavioural Traces to Predict the Quality of Social Interactions in Online Games
Authors:
Julian Frommel,
Valentin Sagl,
Ansgar E. Depping,
Colby Johanson,
Matthew K. Miller,
Regan L. Mandryk
Abstract:
Online social interactions in multiplayer games can be supportive and positive or toxic and harmful; however, few methods can easily assess interpersonal interaction quality in games. We use behavioural traces to predict affiliation between dyadic strangers, facilitated through their social interactions in an online gaming setting. We collected audio, video, in-game, and self-report data from 23 dyads, extracted 75 features, trained Random Forest and Support Vector Machine models, and evaluated their performance predicting binary (high/low) as well as continuous affiliation toward a partner. The models can predict both binary and continuous affiliation with up to 79.1% accuracy (F1) and 20.1% explained variance (R2) on unseen data, with features based on verbal communication demonstrating the highest potential. Our findings can inform the design of multiplayer games and game communities, and guide the development of systems for matchmaking and mitigating toxic behaviour in online games.
Submitted 6 March, 2020;
originally announced March 2020.
-
Cell-based Maximum Entropy Approximants for Three Dimensional Domains: Application in Large Strain Elastodynamics using the Meshless Total Lagrangian Explicit Dynamics Method
Authors:
Konstantinos A. Mountris,
George C. Bourantas,
Daniel Millán,
Grand R. Joldes,
Karol Miller,
Esther Pueyo,
Adam Wittek
Abstract:
We present the Cell-based Maximum Entropy (CME) approximants in E3 space by constructing the smooth approximation distance function to polyhedral surfaces. CME is a meshfree approximation method combining the properties of the Maximum Entropy approximants and the compact support of element-based interpolants. The method is evaluated in problems of large strain elastodynamics for three-dimensional (3D) continua using the well-established Meshless Total Lagrangian Explicit Dynamics (MTLED) method. The accuracy and efficiency of the method are assessed in several numerical examples in terms of computational time, accuracy in boundary conditions imposition, and strain energy density error. Due to the smoothness of CME basis functions, the numerical stability in explicit time integration is preserved for large time steps. The challenging task of essential boundary conditions imposition in non-interpolating meshless methods (e.g., Moving Least Squares) is eliminated in CME due to the weak Kronecker-delta property. The essential boundary conditions are imposed directly, similar to the Finite Element Method. CME is proven a valuable alternative to other meshless and element-based methods for large-scale elastodynamics in 3D.
Submitted 29 October, 2019; v1 submitted 13 May, 2019;
originally announced May 2019.
-
Meta-learning of Sequential Strategies
Authors:
Pedro A. Ortega,
Jane X. Wang,
Mark Rowland,
Tim Genewein,
Zeb Kurth-Nelson,
Razvan Pascanu,
Nicolas Heess,
Joel Veness,
Alex Pritzel,
Pablo Sprechmann,
Siddhant M. Jayakumar,
Tom McGrath,
Kevin Miller,
Mohammad Azar,
Ian Osband,
Neil Rabinowitz,
András György,
Silvia Chiappa,
Simon Osindero,
Yee Whye Teh,
Hado van Hasselt,
Nando de Freitas,
Matthew Botvinick,
Shane Legg
Abstract:
In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class. Our goal is to equip the reader with the conceptual foundations of this tool for building new, scalable agents that operate on broad domains. To do so, we present basic algorithmic templates for building near-optimal predictors and reinforcement learners which behave as if they had a probabilistic model that allowed them to efficiently exploit task structure. Furthermore, we recast memory-based meta-learning within a Bayesian framework, showing that the meta-learned strategies are near-optimal because they amortize Bayes-filtered data, where the adaptation is implemented in the memory dynamics as a state-machine of sufficient statistics. Essentially, memory-based meta-learning translates the hard problem of probabilistic sequential inference into a regression problem.
Submitted 18 July, 2019; v1 submitted 8 May, 2019;
originally announced May 2019.
-
Biomechanical modeling and computer simulation of the brain during neurosurgery
Authors:
K. Miller,
G. R. Joldes,
G. Bourantas,
S. K. Warfield,
D. E. Hyde,
R. Kikinis,
A. Wittek
Abstract:
Computational biomechanics of the brain for neurosurgery is an emerging area of research recently gaining in importance and practical applications. This review paper presents the contributions of the Intelligent Systems for Medicine Laboratory and its collaborators to this field, discussing the modeling approaches adopted and the methods developed for obtaining the numerical solutions. We adopt a physics-based modeling approach, and describe the brain deformation in mechanical terms (such as displacements, strains and stresses), which can be computed using a biomechanical model, by solving a continuum mechanics problem. We present our modeling approaches related to geometry creation, boundary conditions, loading and material properties. From the point of view of solution methods, we advocate the use of fully nonlinear modeling approaches, capable of capturing very large deformations and nonlinear material behavior. We discuss finite element and meshless domain discretization, the use of the Total Lagrangian formulation of continuum mechanics, and explicit time integration for solving both time-accurate and steady state problems. We present the methods developed for handling contacts and for warping 3D medical images using the results of our simulations. We present two examples to showcase these methods: brain shift estimation for image registration and brain deformation computation for neuronavigation in epilepsy treatment.
Submitted 1 April, 2019;
originally announced April 2019.
-
Suite of Meshless Algorithms for Accurate Computation of Soft Tissue Deformation for Surgical Simulation
Authors:
Grand Joldes,
George Bourantas,
Benjamin Zwick,
Habib Chowdhury,
Adam Wittek,
Sudip Agrawal,
Konstantinos Mountris,
Damon Hyde,
Simon K. Warfield,
Karol Miller
Abstract:
The ability to predict patient-specific soft tissue deformations is key for computer-integrated surgery systems and the core enabling technology for a new era of personalized medicine. Element-Free Galerkin (EFG) methods are better suited for solving soft tissue deformation problems than the finite element method (FEM) due to their capability of handling large deformation while also eliminating the necessity of creating a complex predefined mesh. Nevertheless, meshless methods based on the EFG formulation exhibit three major limitations: i) meshless shape functions using higher order basis cannot always be computed for arbitrarily distributed nodes (irregular node placement is crucial for facilitating automated discretization of complex geometries); ii) imposition of the Essential Boundary Conditions (EBC) is not straightforward; and, iii) numerical (Gauss) integration in space is not exact as meshless shape functions are not polynomial. This paper presents a suite of Meshless Total Lagrangian Explicit Dynamics (MTLED) algorithms incorporating a Modified Moving Least Squares (MMLS) method for interpolating scattered data both for visualization and for numerical computations of soft tissue deformation, a novel way of imposing EBC for explicit time integration, and an adaptive numerical integration procedure within the Meshless Total Lagrangian Explicit Dynamics algorithm. The appropriateness and effectiveness of the proposed methods are demonstrated using comparisons with the established non-linear procedures from commercial finite element software ABAQUS and experiments with very large deformations. To demonstrate the translational benefits of MTLED we also present a realistic brain-shift computation.
Submitted 13 June, 2019; v1 submitted 12 March, 2019;
originally announced March 2019.
-
Greedy Multi-Channel Neighbor Discovery
Authors:
Niels Karowski,
Konstantin Miller,
Adam Wolisz
Abstract:
The accelerating penetration of physical environments by objects with information processing and wireless communication capabilities requires approaches to find potential communication partners and discover services. In the present work, we focus on passive discovery approaches in multi-channel wireless networks based on overhearing periodic beacon transmissions of neighboring devices which are otherwise agnostic to the discovery process. We propose a family of low-complexity algorithms that generate listening schedules guaranteed to discover all neighbors. Depending on the beacon periods, the presented approaches simultaneously optimize the worst case discovery time, the mean discovery time, and the mean number of neighbors discovered until any arbitrary point in time. The presented algorithms are fully compatible with technologies such as IEEE 802.11 and IEEE 802.15.4. Complementing the proposed low-complexity algorithms, we formulate the problem of computing discovery schedules that minimize the mean discovery time for arbitrary beacon periods as an integer linear problem. We study the performance of the proposed approaches analytically, by means of numerical experiments, and by extensively simulating them under realistic conditions. We observe that the generated listening schedules significantly - by up to a factor of 4 for the mean discovery time, and by up to 300% for the mean number of neighbors discovered until each point in time - outperform the Passive Scan, a discovery approach defined in the IEEE 802.15.4 standard. Based on the gained insights, we discuss how the selection of the beacon periods influences the efficiency of the discovery process, and provide recommendations for the design of systems and protocols.
Submitted 13 July, 2018;
originally announced July 2018.