Search | arXiv e-print repository

The GeometricKernels Package: Heat and Matérn Kernels for Geometric Learning on Manifolds, Meshes, and Graphs

Authors: Peter Mostowsky, Vincent Dutordoir, Iskander Azangulov, Noémie Jaquier, Michael John Hutchinson, Aditya Ravuri, Leonel Rozo, Alexander Terenin, Viacheslav Borovitskiy

Abstract: Kernels are a fundamental technical primitive in machine learning. In recent years, kernel-based methods such as Gaussian processes are becoming increasingly important in applications where quantifying uncertainty is of key interest. In settings that involve structured data defined on graphs, meshes, manifolds, or other related spaces, defining kernels with good uncertainty-quantification behavior, and computing their value numerically, is less straightforward than in the Euclidean setting. To address this difficulty, we present GeometricKernels, a software package which implements the geometric analogs of classical Euclidean squared exponential - also known as heat - and Matérn kernels, which are widely-used in settings where uncertainty is of key interest. As a byproduct, we obtain the ability to compute Fourier-feature-type expansions, which are widely used in their own right, on a wide set of geometric spaces. Our implementation supports automatic differentiation in every major current framework simultaneously via a backend-agnostic design. In this companion paper to the package and its documentation, we outline the capabilities of the package and present an illustrated example of its interface. We also include a brief overview of the theory the package is built upon and provide some historic context in the appendix.

Submitted 10 July, 2024; originally announced July 2024.

arXiv:2407.07818 [pdf, other]

The Misclassification Likelihood Matrix: Some Classes Are More Likely To Be Misclassified Than Others

Authors: Daniel Sikar, Artur Garcez, Robin Bloomfield, Tillman Weyde, Kaleem Peeroo, Naman Singh, Maeve Hutchinson, Dany Laksono, Mirela Reljan-Delaney

Abstract: This study introduces the Misclassification Likelihood Matrix (MLM) as a novel tool for quantifying the reliability of neural network predictions under distribution shifts. The MLM is obtained by leveraging softmax outputs and clustering techniques to measure the distances between the predictions of a trained neural network and class centroids. By analyzing these distances, the MLM provides a comprehensive view of the model's misclassification tendencies, enabling decision-makers to identify the most common and critical sources of errors. The MLM allows for the prioritization of model improvements and the establishment of decision thresholds based on acceptable risk levels. The approach is evaluated on the MNIST dataset using a Convolutional Neural Network (CNN) and a perturbed version of the dataset to simulate distribution shifts. The results demonstrate the effectiveness of the MLM in assessing the reliability of predictions and highlight its potential in enhancing the interpretability and risk mitigation capabilities of neural networks. The implications of this work extend beyond image classification, with ongoing applications in autonomous systems, such as self-driving cars, to improve the safety and reliability of decision-making in complex, real-world environments.

Submitted 13 August, 2024; v1 submitted 10 July, 2024; originally announced July 2024.

Comments: 9 pages, 7 figures, 1 table

arXiv:2402.08667 [pdf, other]

Target Score Matching

Authors: Valentin De Bortoli, Michael Hutchinson, Peter Wirnsberger, Arnaud Doucet

Abstract: Denoising Score Matching estimates the score of a noised version of a target distribution by minimizing a regression loss and is widely used to train the popular class of Denoising Diffusion Models. A well known limitation of Denoising Score Matching, however, is that it yields poor estimates of the score at low noise levels. This issue is particularly unfavourable for problems in the physical sciences and for Monte Carlo sampling tasks for which the score of the clean original target is known. Intuitively, estimating the score of a slightly noised version of the target should be a simple task in such cases. In this paper, we address this shortcoming and show that it is indeed possible to leverage knowledge of the target score. We present a Target Score Identity and corresponding Target Score Matching regression loss which allows us to obtain score estimates admitting favourable properties at low noise levels.

Submitted 13 February, 2024; originally announced February 2024.

arXiv:2402.06320 [pdf, other]

Particle Denoising Diffusion Sampler

Authors: Angus Phillips, Hai-Dang Dau, Michael John Hutchinson, Valentin De Bortoli, George Deligiannidis, Arnaud Doucet

Abstract: Denoising diffusion models have become ubiquitous for generative modeling. The core idea is to transport the data distribution to a Gaussian by using a diffusion. Approximate samples from the data distribution are then obtained by estimating the time-reversal of this diffusion using score matching ideas. We follow here a similar strategy to sample from unnormalized probability densities and compute their normalizing constants. However, the time-reversed diffusion is here simulated by using an original iterative particle scheme relying on a novel score matching loss. Contrary to standard denoising diffusion models, the resulting Particle Denoising Diffusion Sampler (PDDS) provides asymptotically consistent estimates under mild assumptions. We demonstrate PDDS on multimodal and high dimensional sampling tasks.

Submitted 15 June, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

Comments: To be published in ICML 2024. 37 pages, 20 figures, 3 tables, 5 algorithms

arXiv:2307.05439 [pdf, other]

Metropolis Sampling for Constrained Diffusion Models

Authors: Nic Fishman, Leo Klarner, Emile Mathieu, Michael Hutchinson, Valentin de Bortoli

Abstract: Denoising diffusion models have recently emerged as the predominant paradigm for generative modelling on image domains. In addition, their extension to Riemannian manifolds has facilitated a range of applications across the natural sciences. While many of these problems stand to benefit from the ability to specify arbitrary, domain-informed constraints, this setting is not covered by the existing (Riemannian) diffusion model methodology. Recent work has attempted to address this issue by constructing novel noising processes based on the reflected Brownian motion and logarithmic barrier methods. However, the associated samplers are either computationally burdensome or only apply to convex subsets of Euclidean space. In this paper, we introduce an alternative, simple noising scheme based on Metropolis sampling that affords substantial gains in computational efficiency and empirical performance compared to the earlier samplers. Of independent interest, we prove that this new process corresponds to a valid discretisation of the reflected Brownian motion. We demonstrate the scalability and flexibility of our approach on a range of problem settings with convex and non-convex constraints, including applications from geospatial modelling, robotics and protein design.

Submitted 9 November, 2023; v1 submitted 11 July, 2023; originally announced July 2023.

Comments: NeurIPS 2023

arXiv:2307.05431 [pdf, other]

Geometric Neural Diffusion Processes

Authors: Emile Mathieu, Vincent Dutordoir, Michael J. Hutchinson, Valentin De Bortoli, Yee Whye Teh, Richard E. Turner

Abstract: Denoising diffusion models have proven to be a flexible and effective paradigm for generative modelling. Their recent extension to infinite dimensional Euclidean spaces has allowed for the modelling of stochastic processes. However, many problems in the natural sciences incorporate symmetries and involve data living in non-Euclidean spaces. In this work, we extend the framework of diffusion models to incorporate a series of geometric priors in infinite-dimension modelling. We do so by a) constructing a noising process which admits, as limiting distribution, a geometric Gaussian process that transforms under the symmetry group of interest, and b) approximating the score with a neural network that is equivariant w.r.t. this group. We show that with these conditions, the generative functional model admits the same symmetry. We demonstrate scalability and capacity of the model, using a novel Langevin-based conditional sampler, to fit complex scalar and vector fields, with Euclidean and spherical codomain, on synthetic and real-world weather data.

Submitted 11 July, 2023; originally announced July 2023.

arXiv:2304.05364 [pdf, other]

Diffusion Models for Constrained Domains

Authors: Nic Fishman, Leo Klarner, Valentin De Bortoli, Emile Mathieu, Michael Hutchinson

Abstract: Denoising diffusion models are a novel class of generative algorithms that achieve state-of-the-art performance across a range of domains, including image generation and text-to-image tasks. Building on this success, diffusion models have recently been extended to the Riemannian manifold setting, broadening their applicability to a range of problems from the natural and engineering sciences. However, these Riemannian diffusion models are built on the assumption that their forward and backward processes are well-defined for all times, preventing them from being applied to an important set of tasks that consider manifolds defined via a set of inequality constraints. In this work, we introduce a principled framework to bridge this gap. We present two distinct noising processes based on (i) the logarithmic barrier metric and (ii) the reflected Brownian motion induced by the constraints. As existing diffusion model techniques cannot be applied in this setting, we derive new tools to define such models in our framework. We then demonstrate the practical utility of our methods on a number of synthetic and real-world tasks, including applications from robotics and protein design.

Submitted 7 March, 2024; v1 submitted 11 April, 2023; originally announced April 2023.

Comments: Published in Transactions on Machine Learning Research (07/2023)

arXiv:2209.14125 [pdf, other]

Spectral Diffusion Processes

Authors: Angus Phillips, Thomas Seror, Michael Hutchinson, Valentin De Bortoli, Arnaud Doucet, Emile Mathieu

Abstract: Score-based generative modelling (SGM) has proven to be a very effective method for modelling densities on finite-dimensional spaces. In this work we propose to extend this methodology to learn generative models over functional spaces. To do so, we represent functional data in spectral space to dissociate the stochastic part of the processes from their space-time part. Using dimensionality reduction techniques we then sample from their stochastic component using finite dimensional SGM. We demonstrate our method's effectiveness for modelling various multimodal datasets.

Submitted 28 November, 2022; v1 submitted 28 September, 2022; originally announced September 2022.

Comments: 17 pages, 11 figures, Score-based Method Workshop at 36th Conference on Neural Information Processing Systems (NeurIPS 2022)

arXiv:2207.03024 [pdf, other]

Riemannian Diffusion Schrödinger Bridge

Authors: James Thornton, Michael Hutchinson, Emile Mathieu, Valentin De Bortoli, Yee Whye Teh, Arnaud Doucet

Abstract: Score-based generative models exhibit state of the art performance on density estimation and generative modeling tasks. These models typically assume that the data geometry is flat, yet recent extensions have been developed to synthesize data living on Riemannian manifolds. Existing methods to accelerate sampling of diffusion models are typically not applicable in the Riemannian setting and Riemannian score-based methods have not yet been adapted to the important task of interpolation of datasets. To overcome these issues, we introduce Riemannian Diffusion Schrödinger Bridge. Our proposed method generalizes the Diffusion Schrödinger Bridge introduced by De Bortoli et al. (2021) to the non-Euclidean setting and extends Riemannian score-based models beyond the first time reversal. We validate our proposed method on synthetic data and real Earth and climate data.

Submitted 6 July, 2022; originally announced July 2022.

Comments: Accepted to Continuous Time Methods for Machine Learning, ICML 2022

arXiv:2205.02260 [pdf, other]

Multivariate Prediction Intervals for Random Forests

Authors: Brendan Folie, Maxwell Hutchinson

Abstract: Accurate uncertainty estimates can significantly improve the performance of iterative design of experiments, as in Sequential and Reinforcement learning. For many such problems in engineering and the physical sciences, the design task depends on multiple correlated model outputs as objectives and/or constraints. To better solve these problems, we propose a recalibrated bootstrap method to generate multivariate prediction intervals for bagged models and show that it is well-calibrated. We apply the recalibrated bootstrap to a simulated sequential learning problem with multiple objectives and show that it leads to a marked decrease in the number of iterations required to find a satisfactory candidate. This indicates that the recalibrated bootstrap could be a valuable tool for practitioners using machine learning to optimize systems with multiple competing targets.

Submitted 19 May, 2022; v1 submitted 4 May, 2022; originally announced May 2022.

Comments: 9 pages, 4 figures. Submitted to NeurIPS 2022

arXiv:2202.02763 [pdf, other]

Riemannian Score-Based Generative Modelling

Authors: Valentin De Bortoli, Emile Mathieu, Michael Hutchinson, James Thornton, Yee Whye Teh, Arnaud Doucet

Abstract: Score-based generative models (SGMs) are a powerful class of generative models that exhibit remarkable empirical performance. Score-based generative modelling (SGM) consists of a "noising" stage, whereby a diffusion is used to gradually add Gaussian noise to data, and a generative model, which entails a "denoising" process defined by approximating the time-reversal of the diffusion. Existing SGMs assume that data is supported on a Euclidean space, i.e. a manifold with flat geometry. In many domains such as robotics, geoscience or protein modelling, data is often naturally described by distributions living on Riemannian manifolds and current SGM techniques are not appropriate. We introduce here Riemannian Score-based Generative Models (RSGMs), a class of generative models extending SGMs to Riemannian manifolds. We demonstrate our approach on a variety of manifolds, and in particular with earth and climate science spherical data.

Submitted 22 November, 2022; v1 submitted 6 February, 2022; originally announced February 2022.

Comments: NeurIPS 2022 camera-ready

arXiv:2110.14423 [pdf, other]

Vector-valued Gaussian Processes on Riemannian Manifolds via Gauge Independent Projected Kernels

Authors: Michael Hutchinson, Alexander Terenin, Viacheslav Borovitskiy, So Takao, Yee Whye Teh, Marc Peter Deisenroth

Abstract: Gaussian processes are machine learning models capable of learning unknown functions in a way that represents uncertainty, thereby facilitating construction of optimal decision-making systems. Motivated by a desire to deploy Gaussian processes in novel areas of science, a rapidly-growing line of research has focused on constructively extending these models to handle non-Euclidean domains, including Riemannian manifolds, such as spheres and tori. We propose techniques that generalize this class to model vector fields on Riemannian manifolds, which are important in a number of application areas in the physical sciences. To do so, we present a general recipe for constructing gauge independent kernels, which induce Gaussian vector fields, i.e. vector-valued Gaussian processes coherent with geometry, from scalar-valued Riemannian kernels. We extend standard Gaussian process training methods, such as variational inference, to this setting. This enables vector-valued Gaussian processes on Riemannian manifolds to be trained using standard methods and makes them accessible to machine learning practitioners.

Submitted 25 November, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

Journal ref: Advances in Neural Information Processing Systems, 2021

arXiv:2101.02798 [pdf, other]

doi 10.1145/3415264.3425464

Enhanced Direct Delta Mush

Authors: Serguei Kalentchouk, Michael Hutchinson, Deepak Tolani

Abstract: Direct Delta Mush is a novel skinning deformation technique introduced by Le and Lewis (2019). It generalizes the iterative Delta Mush algorithm of Mancewicz et al (2014), providing a direct solution with improved efficiency and control. Compared to Linear Blend Skinning, Direct Delta Mush offers better quality of deformations and ease of authoring at comparable performance. However, Direct Delta Mush does not handle non-rigid joint transformations correctly which limits its application for most production environments. This paper presents an extension to Direct Delta Mush that integrates the non-rigid part of joint transformations into the algorithm. In addition, the paper also describes practical considerations for computing the orthogonal component of the transformation and stability issues observed during the implementation and testing.

Submitted 7 January, 2021; originally announced January 2021.

Journal ref: SA '20 Posters: SIGGRAPH Asia 2020 Posters, December 2020, Article No. 29

arXiv:2012.10885 [pdf, other]

LieTransformer: Equivariant self-attention for Lie Groups

Authors: Michael Hutchinson, Charline Le Lan, Sheheryar Zaidi, Emilien Dupont, Yee Whye Teh, Hyunjik Kim

Abstract: Group equivariant neural networks are used as building blocks of group invariant neural networks, which have been shown to improve generalisation performance and data efficiency through principled parameter sharing. Such works have mostly focused on group equivariant convolutions, building on the result that group equivariant linear maps are necessarily convolutions. In this work, we extend the scope of the literature to self-attention, that is emerging as a prominent building block of deep learning models. We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups. We demonstrate the generality of our approach by showing experimental results that are competitive to baseline methods on a wide range of tasks: shape counting on point clouds, molecular property regression and modelling particle trajectories under Hamiltonian dynamics.

Submitted 16 June, 2021; v1 submitted 20 December, 2020; originally announced December 2020.

arXiv:2011.12916 [pdf, other]

Equivariant Learning of Stochastic Fields: Gaussian Processes and Steerable Conditional Neural Processes

Authors: Peter Holderrieth, Michael Hutchinson, Yee Whye Teh

Abstract: Motivated by objects such as electric fields or fluid streams, we study the problem of learning stochastic fields, i.e. stochastic processes whose samples are fields like those occurring in physics and engineering. Considering general transformations such as rotations and reflections, we show that spatial invariance of stochastic fields requires an inference model to be equivariant. Leveraging recent advances from the equivariance literature, we study equivariance in two classes of models. Firstly, we fully characterise equivariant Gaussian processes. Secondly, we introduce Steerable Conditional Neural Processes (SteerCNPs), a new, fully equivariant member of the Neural Process family. In experiments with Gaussian process vector fields, images, and real-world weather data, we observe that SteerCNPs significantly improve the performance of previous models and equivariance leads to improvements in transfer learning tasks.

Submitted 17 July, 2021; v1 submitted 25 November, 2020; originally announced November 2020.

Journal ref: Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021

arXiv:2011.04551 [pdf, other]

Think Global, Act Local: Gossip and Client Audits in Verifiable Data Structures

Authors: Sarah Meiklejohn, Pavel Kalinnikov, Cindy S. Lin, Martin Hutchinson, Gary Belvin, Mariana Raykova, Al Cutter

Abstract: In recent years, there has been increasing recognition of the benefits of having services provide auditable logs of data, as demonstrated by the deployment of Certificate Transparency and the development of other transparency projects. Most proposed systems, however, rely on a gossip protocol by which users can be assured that they have the same view of the log, but the few gossip protocols that do exist today are not suited for near-term deployment. Furthermore, they assume the presence of global sets of auditors, who must be blindly trusted to correctly perform their roles, in order to achieve their stated transparency goals. In this paper, we address both of these issues by proposing a gossip protocol and a verifiable registry, Mog, in which users can perform their own auditing themselves. We prove the security of our protocols and demonstrate via experimental evaluations that they are performant in a variety of potential near-term deployments.

Submitted 9 November, 2020; originally announced November 2020.

arXiv:2010.06647 [pdf, other]

doi 10.1109/ACCESS.2021.3115476

Video Action Understanding

Authors: Matthew Hutchinson, Vijay Gadepally

Abstract: Many believe that the successes of deep learning on image understanding problems can be replicated in the realm of video understanding. However, due to the scale and temporal nature of video, the span of video understanding problems and the set of proposed deep learning solutions is arguably wider and more diverse than those of their 2D image siblings. Finding, identifying, and predicting actions are a few of the most salient tasks in this emerging and rapidly evolving field. With a pedagogical emphasis, this tutorial introduces and systematizes fundamental topics, basic concepts, and notable examples in supervised video action understanding. Specifically, we clarify a taxonomy of action problems, catalog and highlight video datasets, describe common video data preparation methods, present the building blocks of state-of-the art deep learning model architectures, and formalize domain-specific metrics to baseline proposed solutions. This tutorial is intended to be accessible to a general computer science audience and assumes a conceptual understanding of supervised learning.

Submitted 3 October, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: Accepted for publication in IEEE Access

Journal ref: IEEE Access, Vol. 9, 2021

arXiv:2008.09037 [pdf]

doi 10.1109/HPEC43674.2020.9286249

Accuracy and Performance Comparison of Video Action Recognition Approaches

Authors: Matthew Hutchinson, Siddharth Samsi, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Micheal Houle, Matthew Hubbell, Micheal Jones, Jeremy Kepner, Andrew Kirby, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Albert Reuther, Charles Yee, Vijay Gadepally

Abstract: Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparison of accuracy and computational performance results remain clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. Additionally, we compare computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.

Submitted 20 August, 2020; originally announced August 2020.

Comments: Accepted for publication at IEEE HPEC 2020

arXiv:1911.10563 [pdf, other]

Differentially Private Federated Variational Inference

Authors: Mrinank Sharma, Michael Hutchinson, Siddharth Swaroop, Antti Honkela, Richard E. Turner

Abstract: In many real-world applications of machine learning, data are distributed across many clients and cannot leave the devices they are stored on. Furthermore, each client's data, computational resources and communication constraints may be very different. This setting is known as federated learning, in which privacy is a key concern. Differential privacy is commonly used to provide mathematical privacy guarantees. This work, to the best of our knowledge, is the first to consider federated, differentially private, Bayesian learning. We build on Partitioned Variational Inference (PVI) which was recently developed to support approximate Bayesian inference in the federated setting. We modify the client-side optimisation of PVI to provide an (ε, δ)-DP guarantee. We show that it is possible to learn moderately private logistic regression models in the federated setting that achieve similar performance to models trained non-privately on centralised data.

Submitted 24 November, 2019; originally announced November 2019.

Comments: Privacy in Machine Learning Workshop (PriML 2019) at the 33rd Conference on Neural Information Processing Systems (NeurIPS)

arXiv:1711.05099 [pdf, ps, other]

Overcoming data scarcity with transfer learning

Authors: Maxwell L. Hutchinson, Erin Antono, Brenna M. Gibbons, Sean Paradiso, Julia Ling, Bryce Meredig

Abstract: Despite increasing focus on data publication and discovery in materials science and related fields, the global view of materials data is highly sparse. This sparsity encourages training models on the union of multiple datasets, but simple unions can prove problematic as (ostensibly) equivalent properties may be measured or computed differently depending on the data source. These hidden contextual differences introduce irreducible errors into analyses, fundamentally limiting their accuracy. Transfer learning, where information from one dataset is used to inform a model on another, can be an effective tool for bridging sparse data while preserving the contextual differences in the underlying measurements. Here, we describe and compare three techniques for transfer learning: multi-task, difference, and explicit latent variable architectures. We show that difference architectures are most accurate in the multi-fidelity case of mixed DFT and experimental band gaps, while multi-task most improves classification performance of color with band gaps. For activation energies of steps in NO reduction, the explicit latent variable method is not only the most accurate, but also enjoys cancellation of errors in functions that depend on multiple tasks. These results motivate the publication of high quality materials datasets that encode transferable information, independent of industrial or academic interest in the particular labels, and encourage further development and application of transfer learning methods to materials informatics problems.

Submitted 2 November, 2017; originally announced November 2017.

arXiv:1711.00404 [pdf, ps, other]

Building Data-driven Models with Microstructural Images: Generalization and Interpretability

Authors: Julia Ling, Maxwell Hutchinson, Erin Antono, Brian DeCost, Elizabeth A. Holm, Bryce Meredig

Abstract: As data-driven methods rise in popularity in materials science applications, a key question is how these machine learning models can be used to understand microstructure. Given the importance of process-structure-property relations throughout materials science, it seems logical that models that can leverage microstructural data would be more capable of predicting property information. While there have been some recent attempts to use convolutional neural networks to understand microstructural images, these early studies have focused only on which featurizations yield the highest machine learning model accuracy for a single data set. This paper explores the use of convolutional neural networks for classifying microstructure with a more holistic set of objectives in mind: generalization between data sets, number of features required, and interpretability.

Submitted 1 November, 2017; originally announced November 2017.

arXiv:1706.02970 [pdf, ps, other]

doi 10.1145/2938615.2938617

On the Strong Scaling of the Spectral Element Solver Nek5000 on Petascale Systems

Authors: Nicolas Offermans, Oana Marin, Michel Schanen, Jing Gong, Paul Fischer, Philipp Schlatter, Aleks Obabko, Adam Peplinksi, Maxwell Hutchinson, Elia Merzari

Abstract: The present work is targeted at performing a strong scaling study of the high-order spectral element fluid dynamics solver Nek5000. Prior studies indicated a recommendable metric for strong scalability from a theoretical viewpoint, which we test here extensively on three parallel machines with different performance characteristics and interconnect networks, namely Mira (IBM Blue Gene/Q), Beskow (Cray XC40) and Titan (Cray XK7). The test cases considered for the simulations correspond to a turbulent flow in a straight pipe at four different friction Reynolds numbers $Re_τ$ = 180, 360, 550 and 1000. Considering the linear model for parallel communication we quantify the machine characteristics in order to better assess the scaling behaviors of the code. Subsequently, sampling and profiling tools are used to measure the computation and communication times over a large range of compute cores. We also study the effect of the two coarse grid solvers XXT and AMG on the computational time. Super-linear scaling due to a reduction in cache misses is observed on each computer. The strong scaling limit is attained for roughly 5,000 - 10,000 degrees of freedom per core on Mira, 30,000 - 50,000 on Beskow, with only a small impact of the problem size for both machines, and ranges between 10,000 and 220,000 depending on the problem size on Titan. This work aims at being a reference for Nek5000 users and also serves as a basis for potential issues to address as the community heads towards exascale supercomputers.

Submitted 9 June, 2017; originally announced June 2017.

Comments: 10 pages, 9 figures, Proceedings of the Exascale Applications and Software Conference 2016 (EASC '16, Stockholm)

arXiv:1306.5977 [pdf, ps, other]

doi 10.1016/j.tcs.2015.03.019

Enumeration of octagonal tilings

Authors: Maxwell Hutchinson, Michael Widom

Abstract: Random tilings are interesting as idealizations of atomistic models of quasicrystals and for their connection to problems in combinatorics and algorithms. Of particular interest is the tiling entropy density, which measures the relation of the number of distinct tilings to the number of constituent tiles. Tilings by squares and 45 degree rhombi receive special attention as presumably the simplest model that has not yet been solved exactly in the thermodynamic limit. However, an exact enumeration formula can be evaluated for tilings in finite regions with fixed boundaries. We implement this algorithm in an efficient manner, enabling the investigation of larger regions of parameter space than previously were possible. Our new results appear to yield monotone increasing and decreasing lower and upper bounds on the fixed boundary entropy density that converge toward S = 0.36021(3).

Submitted 16 March, 2015; v1 submitted 24 June, 2013; originally announced June 2013.

Journal ref: Theoretical Computer Science (2015), pp. 40-50
