-
Improving engagement, diversity, and retention in computer science with RadGrad: Results of a case study
Authors:
Philip M. Johnson,
Carleton Moore,
Peter Leong,
Seungoh Paek
Abstract:
RadGrad is a curriculum initiative implemented via an application that combines features of social networks, degree planners, individual learning plans, and serious games. RadGrad redefines traditional meanings of "progress" and "success" in the undergraduate computer science degree program in an attempt to improve engagement, retention, and diversity. In this paper, we describe the RadGrad Projec…
▽ More
RadGrad is a curriculum initiative implemented via an application that combines features of social networks, degree planners, individual learning plans, and serious games. RadGrad redefines traditional meanings of "progress" and "success" in the undergraduate computer science degree program in an attempt to improve engagement, retention, and diversity. In this paper, we describe the RadGrad Project and report on an evaluation study designed to assess the impact of RadGrad on student engagement, diversity, and retention. We also present opportunities and challenges that result from the use of the system.
△ Less
Submitted 27 June, 2024;
originally announced July 2024.
-
Tensor cumulants for statistical inference on invariant distributions
Authors:
Dmitriy Kunisky,
Cristopher Moore,
Alexander S. Wein
Abstract:
Many problems in high-dimensional statistics appear to have a statistical-computational gap: a range of values of the signal-to-noise ratio where inference is information-theoretically possible, but (conjecturally) computationally intractable. A canonical such problem is Tensor PCA, where we observe a tensor $Y$ consisting of a rank-one signal plus Gaussian noise. Multiple lines of work suggest th…
▽ More
Many problems in high-dimensional statistics appear to have a statistical-computational gap: a range of values of the signal-to-noise ratio where inference is information-theoretically possible, but (conjecturally) computationally intractable. A canonical such problem is Tensor PCA, where we observe a tensor $Y$ consisting of a rank-one signal plus Gaussian noise. Multiple lines of work suggest that Tensor PCA becomes computationally hard at a critical value of the signal's magnitude. In particular, below this transition, no low-degree polynomial algorithm can detect the signal with high probability; conversely, various spectral algorithms are known to succeed above this transition. We unify and extend this work by considering tensor networks, orthogonally invariant polynomials where multiple copies of $Y$ are "contracted" to produce scalars, vectors, matrices, or other tensors. We define a new set of objects, tensor cumulants, which provide an explicit, near-orthogonal basis for invariant polynomials of a given degree. This basis lets us unify and strengthen previous results on low-degree hardness, giving a combinatorial explanation of the hardness transition and of a continuum of subexponential-time algorithms that work below it, and proving tight lower bounds against low-degree polynomials for recovering rather than just detecting the signal. It also lets us analyze a new problem of distinguishing between different tensor ensembles, such as Wigner and Wishart tensors, establishing a sharp computational threshold and giving evidence of a new statistical-computational gap in the Central Limit Theorem for random tensors. Finally, we believe these cumulants are valuable mathematical objects in their own right: they generalize the free cumulants of free probability theory from matrices to tensors, and share many of their properties, including additivity under additive free convolution.
△ Less
Submitted 29 April, 2024;
originally announced April 2024.
-
The Common Core Ontologies
Authors:
Mark Jensen,
Giacomo De Colle,
Sean Kindya,
Cameron More,
Alexander P. Cox,
John Beverley
Abstract:
The Common Core Ontologies (CCO) are designed as a mid-level ontology suite that extends the Basic Formal Ontology. CCO has since been increasingly adopted by a broad group of users and applications and is proposed as the first standard mid-level ontology. Despite these successes, documentation of the contents and design patterns of the CCO has been comparatively minimal. This paper is a step towa…
▽ More
The Common Core Ontologies (CCO) are designed as a mid-level ontology suite that extends the Basic Formal Ontology. CCO has since been increasingly adopted by a broad group of users and applications and is proposed as the first standard mid-level ontology. Despite these successes, documentation of the contents and design patterns of the CCO has been comparatively minimal. This paper is a step toward providing enhanced documentation for the mid-level ontology suite through a discussion of the contents of the eleven ontologies that collectively comprise the Common Core Ontology suite.
△ Less
Submitted 15 August, 2024; v1 submitted 26 April, 2024;
originally announced April 2024.
-
Understanding the PULSAR Effect in Combined Radiotherapy and Immunotherapy through Attention Mechanisms with a Transformer Model
Authors:
Hao Peng,
Casey Moore,
Debabrata Saha,
Steve Jiang,
Robert Timmerman
Abstract:
PULSAR (personalized, ultra-fractionated stereotactic adaptive radiotherapy) is the adaptation of stereotactic ablative radiotherapy towards personalized cancer management. For the first time, we applied a transformer-based attention mechanism to investigate the underlying interactions between combined PULSAR and PD-L1 blockade immunotherapy based on a murine cancer model (Lewis Lung Carcinoma, LL…
▽ More
PULSAR (personalized, ultra-fractionated stereotactic adaptive radiotherapy) is the adaptation of stereotactic ablative radiotherapy towards personalized cancer management. For the first time, we applied a transformer-based attention mechanism to investigate the underlying interactions between combined PULSAR and PD-L1 blockade immunotherapy based on a murine cancer model (Lewis Lung Carcinoma, LLC). The proposed approach is able to predict the trend of tumor volume change semi-quantitatively, and excels in identifying the potential causal relationships through both self-attention and cross-attention scores.
△ Less
Submitted 6 March, 2024;
originally announced March 2024.
-
Pyclipse, a library for deidentification of free-text clinical notes
Authors:
Callandra Moore,
Jonathan Ranisau,
Walter Nelson,
Jeremy Petch,
Alistair Johnson
Abstract:
Automated deidentification of clinical text data is crucial due to the high cost of manual deidentification, which has been a barrier to sharing clinical text and the advancement of clinical natural language processing. However, creating effective automated deidentification tools faces several challenges, including issues in reproducibility due to differences in text processing, evaluation methods…
▽ More
Automated deidentification of clinical text data is crucial due to the high cost of manual deidentification, which has been a barrier to sharing clinical text and the advancement of clinical natural language processing. However, creating effective automated deidentification tools faces several challenges, including issues in reproducibility due to differences in text processing, evaluation methods, and a lack of consistency across clinical domains and institutions. To address these challenges, we propose the pyclipse framework, a unified and configurable evaluation procedure to streamline the comparison of deidentification algorithms. Pyclipse serves as a single interface for running open-source deidentification algorithms on local clinical data, allowing for context-specific evaluation. To demonstrate the utility of pyclipse, we compare six deidentification algorithms across four public and two private clinical text datasets. We find that algorithm performance consistently falls short of the results reported in the original papers, even when evaluated on the same benchmark dataset. These discrepancies highlight the complexity of accurately assessing and comparing deidentification algorithms, emphasizing the need for a reproducible, adjustable, and extensible framework like pyclipse. Our framework lays the foundation for a unified approach to evaluate and improve deidentification tools, ultimately enhancing patient protection in clinical natural language processing.
△ Less
Submitted 5 November, 2023;
originally announced November 2023.
-
A Self-Supervised Approach to Land Cover Segmentation
Authors:
Charles Moore,
Dakota Hester
Abstract:
Land use/land cover change (LULC) maps are integral resources in earth science and agricultural research. Due to the nature of such maps, the creation of LULC maps is often constrained by the time and human resources necessary to accurately annotate satellite imagery and remote sensing data. While computer vision models that perform semantic segmentation to create detailed labels from such data ar…
▽ More
Land use/land cover change (LULC) maps are integral resources in earth science and agricultural research. Due to the nature of such maps, the creation of LULC maps is often constrained by the time and human resources necessary to accurately annotate satellite imagery and remote sensing data. While computer vision models that perform semantic segmentation to create detailed labels from such data are not uncommon, litle research has been done on self-supervised and unsupervised approaches to labelling LULC maps without the use of ground-truth masks. Here, we demonstrate a self-supervised method of land cover segmentation that has no need for high-quality ground truth labels. The proposed deep learning employs a frozen pre-trained ViT backbone transferred from DINO in a STEGO architecture and is fine-tuned using a custom dataset consisting of very high resolution (VHR) sattelite imagery. After only 10 epochs of fine-tuning, an accuracy of roughly 52% was observed across 5 samples, signifying the feasibility of self-supervised models for the automated labelling of VHR LULC maps.
△ Less
Submitted 27 October, 2023;
originally announced October 2023.
-
URA*: Uncertainty-aware Path Planning using Image-based Aerial-to-Ground Traversability Estimation for Off-road Environments
Authors:
Charles Moore,
Shaswata Mitra,
Nisha Pillai,
Marc Moore,
Sudip Mittal,
Cindy Bethel,
Jingdao Chen
Abstract:
A major challenge with off-road autonomous navigation is the lack of maps or road markings that can be used to plan a path for autonomous robots. Classical path planning methods mostly assume a perfectly known environment without accounting for the inherent perception and sensing uncertainty from detecting terrain and obstacles in off-road environments. Recent work in computer vision and deep neur…
▽ More
A major challenge with off-road autonomous navigation is the lack of maps or road markings that can be used to plan a path for autonomous robots. Classical path planning methods mostly assume a perfectly known environment without accounting for the inherent perception and sensing uncertainty from detecting terrain and obstacles in off-road environments. Recent work in computer vision and deep neural networks has advanced the capability of terrain traversability segmentation from raw images; however, the feasibility of using these noisy segmentation maps for navigation and path planning has not been adequately explored. To address this problem, this research proposes an uncertainty-aware path planning method, URA* using aerial images for autonomous navigation in off-road environments. An ensemble convolutional neural network (CNN) model is first used to perform pixel-level traversability estimation from aerial images of the region of interest. The traversability predictions are represented as a grid of traversal probability values. An uncertainty-aware planner is then applied to compute the best path from a start point to a goal point given these noisy traversal probability estimates. The proposed planner also incorporates replanning techniques to allow rapid replanning during online robot operation. The proposed method is evaluated on the Massachusetts Road Dataset, the DeepGlobe dataset, as well as a dataset of aerial images from off-road proving grounds at Mississippi State University. Results show that the proposed image segmentation and planning methods outperform conventional planning algorithms in terms of the quality and feasibility of the initial path, as well as the quality of replanned paths.
△ Less
Submitted 15 September, 2023;
originally announced September 2023.
-
ssVERDICT: Self-Supervised VERDICT-MRI for Enhanced Prostate Tumour Characterisation
Authors:
Snigdha Sen,
Saurabh Singh,
Hayley Pye,
Caroline M. Moore,
Hayley Whitaker,
Shonit Punwani,
David Atkinson,
Eleftheria Panagiotaki,
Paddy J. Slator
Abstract:
Purpose: Demonstrating and assessing self-supervised machine learning fitting of the VERDICT (Vascular, Extracellular and Restricted DIffusion for Cytometry in Tumours) model for prostate. Methods: We derive a self-supervised neural network for fitting VERDICT (ssVERDICT) that estimates parameter maps without training data. We compare the performance of ssVERDICT to two established baseline method…
▽ More
Purpose: Demonstrating and assessing self-supervised machine learning fitting of the VERDICT (Vascular, Extracellular and Restricted DIffusion for Cytometry in Tumours) model for prostate. Methods: We derive a self-supervised neural network for fitting VERDICT (ssVERDICT) that estimates parameter maps without training data. We compare the performance of ssVERDICT to two established baseline methods for fitting diffusion MRI models: conventional nonlinear least squares (NLLS) and supervised deep learning. We do this quantitatively on simulated data, by comparing the Pearson's correlation coefficient, mean-squared error (MSE), bias, and variance with respect to the simulated ground truth. We also calculate in vivo parameter maps on a cohort of 20 prostate cancer patients and compare the methods' performance in discriminating benign from cancerous tissue via Wilcoxon's signed-rank test. Results: In simulations, ssVERDICT outperforms the baseline methods (NLLS and supervised DL) in estimating all the parameters from the VERDICT prostate model in terms of Pearson's correlation coefficient, bias, and MSE. In vivo, ssVERDICT shows stronger lesion conspicuity across all parameter maps, and improves discrimination between benign and cancerous tissue over the baseline methods. Conclusion: ssVERDICT significantly outperforms state-of-the-art methods for VERDICT model fitting, and shows for the first time, fitting of a complex three-compartment biophysical model with machine learning without the requirement of explicit training labels.
△ Less
Submitted 27 September, 2023; v1 submitted 12 September, 2023;
originally announced September 2023.
-
Towards Trustworthy Artificial Intelligence for Equitable Global Health
Authors:
Hong Qin,
Jude Kong,
Wandi Ding,
Ramneek Ahluwalia,
Christo El Morr,
Zeynep Engin,
Jake Okechukwu Effoduh,
Rebecca Hwa,
Serena Jingchuan Guo,
Laleh Seyyed-Kalantari,
Sylvia Kiwuwa Muyingo,
Candace Makeda Moore,
Ravi Parikh,
Reva Schwartz,
Dongxiao Zhu,
Xiaoqian Wang,
Yiye Zhang
Abstract:
Artificial intelligence (AI) can potentially transform global health, but algorithmic bias can exacerbate social inequities and disparity. Trustworthy AI entails the intentional design to ensure equity and mitigate potential biases. To advance trustworthy AI in global health, we convened a workshop on Fairness in Machine Intelligence for Global Health (FairMI4GH). The event brought together a glob…
▽ More
Artificial intelligence (AI) can potentially transform global health, but algorithmic bias can exacerbate social inequities and disparity. Trustworthy AI entails the intentional design to ensure equity and mitigate potential biases. To advance trustworthy AI in global health, we convened a workshop on Fairness in Machine Intelligence for Global Health (FairMI4GH). The event brought together a global mix of experts from various disciplines, community health practitioners, policymakers, and more. Topics covered included managing AI bias in socio-technical systems, AI's potential impacts on global health, and balancing data privacy with transparency. Panel discussions examined the cultural, political, and ethical dimensions of AI in global health. FairMI4GH aimed to stimulate dialogue, facilitate knowledge transfer, and spark innovative solutions. Drawing from NIST's AI Risk Management Framework, it provided suggestions for handling AI risks and biases. The need to mitigate data biases from the research design stage, adopt a human-centered approach, and advocate for AI transparency was recognized. Challenges such as updating legal frameworks, managing cross-border data sharing, and motivating developers to reduce bias were acknowledged. The event emphasized the necessity of diverse viewpoints and multi-dimensional dialogue for creating a fair and ethical AI framework for equitable global health.
△ Less
Submitted 10 September, 2023;
originally announced September 2023.
-
A model for efficient dynamical ranking in networks
Authors:
Andrea Della Vecchia,
Kibidi Neocosmos,
Daniel B. Larremore,
Cristopher Moore,
Caterina De Bacco
Abstract:
We present a physics-inspired method for inferring dynamic rankings in directed temporal networks - networks in which each directed and timestamped edge reflects the outcome and timing of a pairwise interaction. The inferred ranking of each node is real-valued and varies in time as each new edge, encoding an outcome like a win or loss, raises or lowers the node's estimated strength or prestige, as…
▽ More
We present a physics-inspired method for inferring dynamic rankings in directed temporal networks - networks in which each directed and timestamped edge reflects the outcome and timing of a pairwise interaction. The inferred ranking of each node is real-valued and varies in time as each new edge, encoding an outcome like a win or loss, raises or lowers the node's estimated strength or prestige, as is often observed in real scenarios including sequences of games, tournaments, or interactions in animal hierarchies. Our method works by solving a linear system of equations and requires only one parameter to be tuned. As a result, the corresponding algorithm is scalable and efficient. We test our method by evaluating its ability to predict interactions (edges' existence) and their outcomes (edges' directions) in a variety of applications, including both synthetic and real data. Our analysis shows that in many cases our method's performance is better than existing methods for predicting dynamic rankings and interaction outcomes.
△ Less
Submitted 9 August, 2024; v1 submitted 25 July, 2023;
originally announced July 2023.
-
Dataset balancing can hurt model performance
Authors:
R. Channing Moore,
Daniel P. W. Ellis,
Eduardo Fonseca,
Shawn Hershey,
Aren Jansen,
Manoj Plakal
Abstract:
Machine learning from training data with a skewed distribution of examples per class can lead to models that favor performance on common classes at the expense of performance on rare ones. AudioSet has a very wide range of priors over its 527 sound event classes. Classification performance on AudioSet is usually evaluated by a simple average over per-class metrics, meaning that performance on rare…
▽ More
Machine learning from training data with a skewed distribution of examples per class can lead to models that favor performance on common classes at the expense of performance on rare ones. AudioSet has a very wide range of priors over its 527 sound event classes. Classification performance on AudioSet is usually evaluated by a simple average over per-class metrics, meaning that performance on rare classes is equal in importance to the performance on common ones. Several recent papers have used dataset balancing techniques to improve performance on AudioSet. We find, however, that while balancing improves performance on the public AudioSet evaluation data it simultaneously hurts performance on an unpublished evaluation set collected under the same conditions. By varying the degree of balancing, we show that its benefits are fragile and depend on the evaluation set. We also do not find evidence indicating that balancing improves rare class performance relative to common classes. We therefore caution against blind application of balancing, as well as against paying too much attention to small improvements on a public evaluation set.
△ Less
Submitted 30 June, 2023;
originally announced July 2023.
-
Generative Adversarial Networks for Scintillation Signal Simulation in EXO-200
Authors:
S. Li,
I. Ostrovskiy,
Z. Li,
L. Yang,
S. Al Kharusi,
G. Anton,
I. Badhrees,
P. S. Barbeau,
D. Beck,
V. Belov,
T. Bhatta,
M. Breidenbach,
T. Brunner,
G. F. Cao,
W. R. Cen,
C. Chambers,
B. Cleveland,
M. Coon,
A. Craycraft,
T. Daniels,
L. Darroch,
S. J. Daugherty,
J. Davis,
S. Delaquis,
A. Der Mesrobian-Kabakian
, et al. (65 additional authors not shown)
Abstract:
Generative Adversarial Networks trained on samples of simulated or actual events have been proposed as a way of generating large simulated datasets at a reduced computational cost. In this work, a novel approach to perform the simulation of photodetector signals from the time projection chamber of the EXO-200 experiment is demonstrated. The method is based on a Wasserstein Generative Adversarial N…
▽ More
Generative Adversarial Networks trained on samples of simulated or actual events have been proposed as a way of generating large simulated datasets at a reduced computational cost. In this work, a novel approach to perform the simulation of photodetector signals from the time projection chamber of the EXO-200 experiment is demonstrated. The method is based on a Wasserstein Generative Adversarial Network - a deep learning technique allowing for implicit non-parametric estimation of the population distribution for a given set of objects. Our network is trained on real calibration data using raw scintillation waveforms as input. We find that it is able to produce high-quality simulated waveforms an order of magnitude faster than the traditional simulation approach and, importantly, generalize from the training sample and discern salient high-level features of the data. In particular, the network correctly deduces position dependency of scintillation light response in the detector and correctly recognizes dead photodetector channels. The network output is then integrated into the EXO-200 analysis framework to show that the standard EXO-200 reconstruction routine processes the simulated waveforms to produce energy distributions comparable to that of real waveforms. Finally, the remaining discrepancies and potential ways to improve the approach further are highlighted.
△ Less
Submitted 8 May, 2023; v1 submitted 11 March, 2023;
originally announced March 2023.
-
Improving Contrastive Learning on Visually Homogeneous Mars Rover Images
Authors:
Isaac Ronald Ward,
Charles Moore,
Kai Pak,
Jingdao Chen,
Edwin Goh
Abstract:
Contrastive learning has recently demonstrated superior performance to supervised learning, despite requiring no training labels. We explore how contrastive learning can be applied to hundreds of thousands of unlabeled Mars terrain images, collected from the Mars rovers Curiosity and Perseverance, and from the Mars Reconnaissance Orbiter. Such methods are appealing since the vast majority of Mars…
▽ More
Contrastive learning has recently demonstrated superior performance to supervised learning, despite requiring no training labels. We explore how contrastive learning can be applied to hundreds of thousands of unlabeled Mars terrain images, collected from the Mars rovers Curiosity and Perseverance, and from the Mars Reconnaissance Orbiter. Such methods are appealing since the vast majority of Mars images are unlabeled as manual annotation is labor intensive and requires extensive domain knowledge. Contrastive learning, however, assumes that any given pair of distinct images contain distinct semantic content. This is an issue for Mars image datasets, as any two pairs of Mars images are far more likely to be semantically similar due to the lack of visual diversity on the planet's surface. Making the assumption that pairs of images will be in visual contrast - when they are in fact not - results in pairs that are falsely considered as negatives, impacting training performance. In this study, we propose two approaches to resolve this: 1) an unsupervised deep clustering step on the Mars datasets, which identifies clusters of images containing similar semantic content and corrects false negative errors during training, and 2) a simple approach which mixes data from different domains to increase visual diversity of the total training dataset. Both cases reduce the rate of false negative pairs, thus minimizing the rate in which the model is incorrectly penalized during contrastive training. These modified approaches remain fully unsupervised end-to-end. To evaluate their performance, we add a single linear layer trained to generate class predictions based on these contrastively-learned features and demonstrate increased performance compared to supervised models; observing an improvement in classification accuracy of 3.06% using only 10% of the labeled data.
△ Less
Submitted 17 October, 2022;
originally announced October 2022.
-
Disordered Systems Insights on Computational Hardness
Authors:
David Gamarnik,
Cristopher Moore,
Lenka Zdeborová
Abstract:
In this review article, we discuss connections between the physics of disordered systems, phase transitions in inference problems, and computational hardness. We introduce two models representing the behavior of glassy systems, the spiked tensor model and the generalized linear model. We discuss the random (non-planted) versions of these problems as prototypical optimization problems, as well as t…
▽ More
In this review article, we discuss connections between the physics of disordered systems, phase transitions in inference problems, and computational hardness. We introduce two models representing the behavior of glassy systems, the spiked tensor model and the generalized linear model. We discuss the random (non-planted) versions of these problems as prototypical optimization problems, as well as the planted versions (with a hidden solution) as prototypical problems in statistical inference and learning. Based on ideas from physics, many of these problems have transitions where they are believed to jump from easy (solvable in polynomial time) to hard (requiring exponential time). We discuss several emerging ideas in theoretical computer science and statistics that provide rigorous evidence for hardness by proving that large classes of algorithms fail in the conjectured hard regime. This includes the overlap gap property, a particular mathematization of clustering or dynamical symmetry-breaking, which can be used to show that many algorithms that are local or robust to changes in their input fail. We also discuss the sum-of-squares hierarchy, which places bounds on proofs or algorithms that use low-degree polynomials such as standard spectral methods and semidefinite relaxations, including the Sherrington-Kirkpatrick model. Throughout the manuscript, we present connections to the physics of disordered systems and associated replica symmetry breaking properties.
△ Less
Submitted 18 October, 2022; v1 submitted 15 October, 2022;
originally announced October 2022.
-
Collaborative Quantization Embeddings for Intra-Subject Prostate MR Image Registration
Authors:
Ziyi Shen,
Qianye Yang,
Yuming Shen,
Francesco Giganti,
Vasilis Stavrinides,
Richard Fan,
Caroline Moore,
Mirabela Rusu,
Geoffrey Sonn,
Philip Torr,
Dean Barratt,
Yipeng Hu
Abstract:
Image registration is useful for quantifying morphological changes in longitudinal MR images from prostate cancer patients. This paper describes a development in improving the learning-based registration algorithms, for this challenging clinical application often with highly variable yet limited training data. First, we report that the latent space can be clustered into a much lower dimensional sp…
▽ More
Image registration is useful for quantifying morphological changes in longitudinal MR images from prostate cancer patients. This paper describes a development in improving the learning-based registration algorithms, for this challenging clinical application often with highly variable yet limited training data. First, we report that the latent space can be clustered into a much lower dimensional space than that commonly found as bottleneck features at the deep layer of a trained registration network. Based on this observation, we propose a hierarchical quantization method, discretizing the learned feature vectors using a jointly-trained dictionary with a constrained size, in order to improve the generalisation of the registration networks. Furthermore, a novel collaborative dictionary is independently optimised to incorporate additional prior information, such as the segmentation of the gland or other regions of interest, in the latent quantized space. Based on 216 real clinical images from 86 prostate cancer patients, we show the efficacy of both the designed components. Improved registration accuracy was obtained with statistical significance, in terms of both Dice on gland and target registration error on corresponding landmarks, the latter of which achieved 5.46 mm, an improvement of 28.7\% from the baseline without quantization. Experimental results also show that the difference in performance was indeed minimised between training and testing data.
△ Less
Submitted 14 July, 2022; v1 submitted 13 July, 2022;
originally announced July 2022.
-
Compressing the chronology of a temporal network with graph commutators
Authors:
Andrea J. Allen,
Cristopher Moore,
Laurent Hébert-Dufresne
Abstract:
Studies of dynamics on temporal networks often represent the network as a series of "snapshots," static networks active for short durations of time. We argue that successive snapshots can be aggregated if doing so has little effect on the overlying dynamics. We propose a method to compress network chronologies by progressively combining pairs of snapshots whose matrix commutators have the smallest…
▽ More
Studies of dynamics on temporal networks often represent the network as a series of "snapshots," static networks active for short durations of time. We argue that successive snapshots can be aggregated if doing so has little effect on the overlying dynamics. We propose a method to compress network chronologies by progressively combining pairs of snapshots whose matrix commutators have the smallest dynamical effect. We apply this method to epidemic modeling on real contact tracing data and find that it allows for significant compression while remaining faithful to the epidemic dynamics.
△ Less
Submitted 29 March, 2024; v1 submitted 23 May, 2022;
originally announced May 2022.
-
The spectrum of the Grigoriev-Laurent pseudomoments
Authors:
Dmitriy Kunisky,
Cristopher Moore
Abstract:
Grigoriev (2001) and Laurent (2003) independently showed that the sum-of-squares hierarchy of semidefinite programs does not exactly represent the hypercube $\{\pm 1\}^n$ until degree at least $n$ of the hierarchy. Laurent also observed that the pseudomoment matrices her proof constructs appear to have surprisingly simple and recursively structured spectra as $n$ increases. While several new proof…
▽ More
Grigoriev (2001) and Laurent (2003) independently showed that the sum-of-squares hierarchy of semidefinite programs does not exactly represent the hypercube $\{\pm 1\}^n$ until degree at least $n$ of the hierarchy. Laurent also observed that the pseudomoment matrices her proof constructs appear to have surprisingly simple and recursively structured spectra as $n$ increases. While several new proofs of the Grigoriev-Laurent lower bound have since appeared, Laurent's observations have remained unproved. We give yet another, representation-theoretic proof of the lower bound, which also yields exact formulae for the eigenvalues of the Grigoriev-Laurent pseudomoments. Using these, we prove and elaborate on Laurent's observations.
Our arguments have two features that may be of independent interest. First, we show that the Grigoriev-Laurent pseudomoments are a special case of a Gram matrix construction of pseudomoments proposed by Bandeira and Kunisky (2020). Second, we find a new realization of the irreducible representations of the symmetric group corresponding to Young diagrams with two rows, as spaces of multivariate polynomials that are multiharmonic with respect to an equilateral simplex.
△ Less
Submitted 10 March, 2022;
originally announced March 2022.
-
Effective Resistance for Pandemics: Mobility Network Sparsification for High-Fidelity Epidemic Simulation
Authors:
Alexander M. Mercier,
Samuel V. Scarpino,
Cristopher Moore
Abstract:
Network science has increasingly become central to the field of epidemiology and our ability to respond to infectious disease threats. However, many networks derived from modern datasets are not just large, but dense, with a high ratio of edges to nodes. This includes human mobility networks where most locations have a large number of links to many other locations. Simulating large-scale epidemics…
▽ More
Network science has increasingly become central to the field of epidemiology and our ability to respond to infectious disease threats. However, many networks derived from modern datasets are not just large, but dense, with a high ratio of edges to nodes. This includes human mobility networks where most locations have a large number of links to many other locations. Simulating large-scale epidemics requires substantial computational resources and in many cases is practically infeasible. One way to reduce the computational cost of simulating epidemics on these networks is sparsification, where a representative subset of edges is selected based on some measure of their importance. We test several sparsification strategies, ranging from naive thresholding to random sampling of edges, on mobility data from the U.S. Following recent work in computer science, we find that the most accurate approach uses the effective resistances of edges, which prioritizes edges that are the only efficient way to travel between their endpoints. The resulting sparse network preserves many aspects of the behavior of an SIR model, including both global quantities, like the epidemic size, and local details of stochastic events, including the probability each node becomes infected and its distribution of arrival times. This holds even when the sparse network preserves fewer than $10\%$ of the edges of the original network. In addition to its practical utility, this method helps illuminate which links of a weighted, undirected network are most important to disease spread.
△ Less
Submitted 26 July, 2022; v1 submitted 3 November, 2021;
originally announced November 2021.
-
Belief propagation for permutations, rankings, and partial orders
Authors:
George T. Cantwell,
Cristopher Moore
Abstract:
Many datasets give partial information about an ordering or ranking by indicating which team won a game, which item a user prefers, or who infected whom. We define a continuous spin system whose Gibbs distribution is the posterior distribution on permutations, given a probabilistic model of these interactions. Using the cavity method we derive a belief propagation algorithm that computes the margi…
▽ More
Many datasets give partial information about an ordering or ranking by indicating which team won a game, which item a user prefers, or who infected whom. We define a continuous spin system whose Gibbs distribution is the posterior distribution on permutations, given a probabilistic model of these interactions. Using the cavity method we derive a belief propagation algorithm that computes the marginal distribution of each node's position. In addition, the Bethe free energy lets us approximate the number of linear extensions of a partial order and perform model selection between competing probabilistic models, such as the Bradley-Terry-Luce model of noisy comparisons and its cousins.
△ Less
Submitted 5 May, 2022; v1 submitted 1 October, 2021;
originally announced October 2021.
-
Indirectly Supervised English Sentence Break Prediction Using Paragraph Break Probability Estimates
Authors:
Robert C. Moore
Abstract:
This report explores the use of paragraph break probability estimates to help predict the location of sentence breaks in English natural language text. We show that a sentence break predictor based almost solely on paragraph break probability estimates can achieve high accuracy on this task. This sentence break predictor is trained almost entirely on a large amount of naturally occurring text with…
▽ More
This report explores the use of paragraph break probability estimates to help predict the location of sentence breaks in English natural language text. We show that a sentence break predictor based almost solely on paragraph break probability estimates can achieve high accuracy on this task. This sentence break predictor is trained almost entirely on a large amount of naturally occurring text without sentence break annotations, with only a small amount of annotated data needed to tune two hyperparameters. We also show that even better results can be achieved across in-domain and out-of-domain test data, if paragraph break probability signals are combined with a support vector machine classifier trained on a somewhat larger amount of sentence-break-annotated data. Numerous related issues are addressed along the way.
△ Less
Submitted 24 September, 2021;
originally announced September 2021.
-
Reconstruction of Random Geometric Graphs: Breaking the Omega(r) distortion barrier
Authors:
Varsha Dani,
Josep Díaz,
Thomas P. Hayes,
Cristopher Moore
Abstract:
Embedding graphs in a geographical or latent space, i.e.\ inferring locations for vertices in Euclidean space or on a smooth manifold or submanifold, is a common task in network analysis, statistical inference, and graph visualization. We consider the classic model of random geometric graphs where $n$ points are scattered uniformly in a square of area $n$, and two points have an edge between them…
▽ More
Embedding graphs in a geographical or latent space, i.e.\ inferring locations for vertices in Euclidean space or on a smooth manifold or submanifold, is a common task in network analysis, statistical inference, and graph visualization. We consider the classic model of random geometric graphs where $n$ points are scattered uniformly in a square of area $n$, and two points have an edge between them if and only if their Euclidean distance is less than $r$. The reconstruction problem then consists of inferring the vertex positions, up to the symmetries of the square, given only the adjacency matrix of the resulting graph. We give an algorithm that, if $r=n^α$ for any $α> 0$, with high probability reconstructs the vertex positions with a maximum error of $O(n^β)$ where $β=1/2-(4/3)α$, until $α\ge 3/8$ where $β=0$ and the error becomes $O(\sqrt{\log n})$. This improves over earlier results, which were unable to reconstruct with error less than $r$. Our method estimates Euclidean distances using a hybrid of graph distances and short-range estimates based on the number of common neighbors. We extend our results to the surface of the sphere in $\R^3$ and to hypercubes in any constant fixed dimension. Additionally we examine the extent to which reconstruction is still possible when the original adjacency lists have had a subset of the edges independently deleted at random.
△ Less
Submitted 17 May, 2022; v1 submitted 29 July, 2021;
originally announced July 2021.
-
The Benefit Of Temporally-Strong Labels In Audio Event Classification
Authors:
Shawn Hershey,
Daniel P W Ellis,
Eduardo Fonseca,
Aren Jansen,
Caroline Liu,
R Channing Moore,
Manoj Plakal
Abstract:
To reveal the importance of temporal precision in ground truth audio event labels, we collected precise (~0.1 sec resolution) "strong" labels for a portion of the AudioSet dataset. We devised a temporally strong evaluation set (including explicit negatives of varying difficulty) and a small strong-labeled training subset of 67k clips (compared to the original dataset's 1.8M clips labeled at 10 sec…
▽ More
To reveal the importance of temporal precision in ground truth audio event labels, we collected precise (~0.1 sec resolution) "strong" labels for a portion of the AudioSet dataset. We devised a temporally strong evaluation set (including explicit negatives of varying difficulty) and a small strong-labeled training subset of 67k clips (compared to the original dataset's 1.8M clips labeled at 10 sec resolution). We show that fine-tuning with a mix of weak and strongly labeled data can substantially improve classifier performance, even when evaluated using only the original weak labels. For a ResNet50 architecture, d' on the strong evaluation data including explicit negatives improves from 1.13 to 1.41. The new labels are available as an update to AudioSet.
△ Less
Submitted 14 May, 2021;
originally announced May 2021.
-
Stochastic Properties of EIP-1559 Basefees
Authors:
Ian C. Moore,
Jagdeep Sidhu
Abstract:
EIP-1559 is a new proposed pricing mechanism for the Ethereum protocol developed to bring stability to fluctuating gas prices. To properly understand this as a stochastic process, it is necessary to develop the mathematical foundations to understand under what conditions the base fee gas price outcomes behave as a stationary process, and when it does not. Understanding these mathematical fundament…
▽ More
EIP-1559 is a new proposed pricing mechanism for the Ethereum protocol developed to bring stability to fluctuating gas prices. To properly understand this as a stochastic process, it is necessary to develop the mathematical foundations to understand under what conditions the base fee gas price outcomes behave as a stationary process, and when it does not. Understanding these mathematical fundamentals is critical to properly engineering a stable system.
△ Less
Submitted 20 May, 2021; v1 submitted 7 May, 2021;
originally announced May 2021.
-
Self-Supervised Learning from Automatically Separated Sound Scenes
Authors:
Eduardo Fonseca,
Aren Jansen,
Daniel P. W. Ellis,
Scott Wisdom,
Marco Tagliasacchi,
John R. Hershey,
Manoj Plakal,
Shawn Hershey,
R. Channing Moore,
Xavier Serra
Abstract:
Real-world sound scenes consist of time-varying collections of sound sources, each generating characteristic sound events that are mixed together in audio recordings. The association of these constituent sound events with their mixture and each other is semantically constrained: the sound scene contains the union of source classes and not all classes naturally co-occur. With this motivation, this…
▽ More
Real-world sound scenes consist of time-varying collections of sound sources, each generating characteristic sound events that are mixed together in audio recordings. The association of these constituent sound events with their mixture and each other is semantically constrained: the sound scene contains the union of source classes and not all classes naturally co-occur. With this motivation, this paper explores the use of unsupervised automatic sound separation to decompose unlabeled sound scenes into multiple semantically-linked views for use in self-supervised contrastive learning. We find that learning to associate input mixtures with their automatically separated outputs yields stronger representations than past approaches that use the mixtures alone. Further, we discover that optimal source separation is not required for successful contrastive learning by demonstrating that a range of separation system convergence states all lead to useful and often complementary example transformations. Our best system incorporates these unsupervised separation models into a single augmentation front-end and jointly optimizes similarity maximization and coincidence prediction objectives across the views. The result is an unsupervised audio representation that rivals state-of-the-art alternatives on the established shallow AudioSet classification benchmark.
△ Less
Submitted 14 September, 2021; v1 submitted 5 May, 2021;
originally announced May 2021.
-
Morphological Change Forecasting for Prostate Glands using Feature-based Registration and Kernel Density Extrapolation
Authors:
Qianye Yang,
Tom Vercauteren,
Yunguan Fu,
Francesco Giganti,
Nooshin Ghavami,
Vasilis Stavrinides,
Caroline Moore,
Matt Clarkson,
Dean Barratt,
Yipeng Hu
Abstract:
Organ morphology is a key indicator for prostate disease diagnosis and prognosis. For instance, In longitudinal study of prostate cancer patients under active surveillance, the volume, boundary smoothness and their changes are closely monitored on time-series MR image data. In this paper, we describe a new framework for forecasting prostate morphological changes, as the ability to detect such chan…
▽ More
Organ morphology is a key indicator for prostate disease diagnosis and prognosis. For instance, In longitudinal study of prostate cancer patients under active surveillance, the volume, boundary smoothness and their changes are closely monitored on time-series MR image data. In this paper, we describe a new framework for forecasting prostate morphological changes, as the ability to detect such changes earlier than what is currently possible may enable timely treatment or avoiding unnecessary confirmatory biopsies. In this work, an efficient feature-based MR image registration is first developed to align delineated prostate gland capsules to quantify the morphological changes using the inferred dense displacement fields (DDFs). We then propose to use kernel density estimation (KDE) of the probability density of the DDF-represented \textit{future morphology changes}, between current and future time points, before the future data become available. The KDE utilises a novel distance function that takes into account morphology, stage-of-progression and duration-of-change, which are considered factors in such subject-specific forecasting. We validate the proposed approach on image masks unseen to registration network training, without using any data acquired at the future target time points. The experiment results are presented on a longitudinal data set with 331 images from 73 patients, yielding an average Dice score of 0.865 on a holdout set, between the ground-truth and the image masks warped by the KDE-predicted-DDFs.
△ Less
Submitted 16 January, 2021;
originally announced January 2021.
-
Spectral Planting and the Hardness of Refuting Cuts, Colorability, and Communities in Random Graphs
Authors:
Afonso S. Bandeira,
Jess Banks,
Dmitriy Kunisky,
Cristopher Moore,
Alexander S. Wein
Abstract:
We study the problem of efficiently refuting the k-colorability of a graph, or equivalently certifying a lower bound on its chromatic number. We give formal evidence of average-case computational hardness for this problem in sparse random regular graphs, showing optimality of a simple spectral certificate. This evidence takes the form of a computationally-quiet planting: we construct a distributio…
▽ More
We study the problem of efficiently refuting the k-colorability of a graph, or equivalently certifying a lower bound on its chromatic number. We give formal evidence of average-case computational hardness for this problem in sparse random regular graphs, showing optimality of a simple spectral certificate. This evidence takes the form of a computationally-quiet planting: we construct a distribution of d-regular graphs that has significantly smaller chromatic number than a typical regular graph drawn uniformly at random, while providing evidence that these two distributions are indistinguishable by a large class of algorithms. We generalize our results to the more general problem of certifying an upper bound on the maximum k-cut.
This quiet planting is achieved by minimizing the effect of the planted structure (e.g. colorings or cuts) on the graph spectrum. Specifically, the planted structure corresponds exactly to eigenvectors of the adjacency matrix. This avoids the pushout effect of random matrix theory, and delays the point at which the planting becomes visible in the spectrum or local statistics. To illustrate this further, we give similar results for a Gaussian analogue of this problem: a quiet version of the spiked model, where we plant an eigenspace rather than adding a generic low-rank perturbation.
Our evidence for computational hardness of distinguishing two distributions is based on three different heuristics: stability of belief propagation, the local statistics hierarchy, and the low-degree likelihood ratio. Of independent interest, our results include general-purpose bounds on the low-degree likelihood ratio for multi-spiked matrix models, and an improved low-degree analysis of the stochastic block model.
△ Less
Submitted 27 August, 2020;
originally announced August 2020.
-
Addressing Missing Labels in Large-Scale Sound Event Recognition Using a Teacher-Student Framework With Loss Masking
Authors:
Eduardo Fonseca,
Shawn Hershey,
Manoj Plakal,
Daniel P. W. Ellis,
Aren Jansen,
R. Channing Moore,
Xavier Serra
Abstract:
The study of label noise in sound event recognition has recently gained attention with the advent of larger and noisier datasets. This work addresses the problem of missing labels, one of the big weaknesses of large audio datasets, and one of the most conspicuous issues for AudioSet. We propose a simple and model-agnostic method based on a teacher-student framework with loss masking to first ident…
▽ More
The study of label noise in sound event recognition has recently gained attention with the advent of larger and noisier datasets. This work addresses the problem of missing labels, one of the big weaknesses of large audio datasets, and one of the most conspicuous issues for AudioSet. We propose a simple and model-agnostic method based on a teacher-student framework with loss masking to first identify the most critical missing label candidates, and then ignore their contribution during the learning process. We find that a simple optimisation of the training label set improves recognition performance without additional computation. We discover that most of the improvement comes from ignoring a critical tiny portion of the missing labels. We also show that the damage done by missing labels is larger as the training set gets smaller, yet it can still be observed even when training with massive amounts of audio. We believe these insights can generalize to other large-scale datasets.
△ Less
Submitted 25 July, 2020; v1 submitted 2 May, 2020;
originally announced May 2020.
-
A spatial agent based model for simulating and optimizing networked eco-industrial systems
Authors:
J. Raimbault,
J. Broere,
M. Somveille,
J. M. Serna,
E. Strombom,
C. Moore,
B. Zhu,
L. Sugar
Abstract:
Industrial symbiosis involves creating integrated cycles of by-products and waste between networks of industrial actors in order to maximize economic value, while at the same time minimizing environmental strain. In such a network, the global environmental strain is no longer equal to the sum of the environmental strain of the individual actors, but it is dependent on how well the network performs…
▽ More
Industrial symbiosis involves creating integrated cycles of by-products and waste between networks of industrial actors in order to maximize economic value, while at the same time minimizing environmental strain. In such a network, the global environmental strain is no longer equal to the sum of the environmental strain of the individual actors, but it is dependent on how well the network performs as a whole. The development of methods to understand, manage or optimize such networks remains an open issue. In this paper we put forward a simulation model of by-product flow between industrial actors. The goal is to introduce a method for modelling symbiotic exchanges from a macro perspective. The model takes into account the effect of two main mechanisms on a multi-objective optimization of symbiotic processes. First it allows us to study the effect of geographical properties of the economic system, said differently, where actors are divided in space. Second, it allows us to study the effect of clustering complementary actors together as a function of distance, by means of a spatial correlation between the actors' by-products. Our simulations unveil patterns that are relevant for macro-level policy. First, our results show that the geographical properties are an important factor for the macro performance of symbiotic processes. Second, spatial correlations, which can be interpreted as planned clusters such as Eco-industrial parks, can lead to a very effective macro performance, but only if these are strictly implemented. Finally, we provide a proof of concept by comparing the model to real world data from the European Pollutant Release and Transfer Register database using georeferencing of the companies in the dataset. This work opens up research opportunities in interactive data-driven models and platforms to support real-world implementation of industrial symbiosis.
△ Less
Submitted 31 March, 2020;
originally announced March 2020.
-
The Planted Matching Problem: Phase Transitions and Exact Results
Authors:
Mehrdad Moharrami,
Cristopher Moore,
Jiaming Xu
Abstract:
We study the problem of recovering a planted matching in randomly weighted complete bipartite graphs $K_{n,n}$. For some unknown perfect matching $M^*$, the weight of an edge is drawn from one distribution $P$ if $e \in M^*$ and another distribution $Q$ if $e \notin M^*$. Our goal is to infer $M^*$, exactly or approximately, from the edge weights. In this paper we take $P=\exp(λ)$ and…
▽ More
We study the problem of recovering a planted matching in randomly weighted complete bipartite graphs $K_{n,n}$. For some unknown perfect matching $M^*$, the weight of an edge is drawn from one distribution $P$ if $e \in M^*$ and another distribution $Q$ if $e \notin M^*$. Our goal is to infer $M^*$, exactly or approximately, from the edge weights. In this paper we take $P=\exp(λ)$ and $Q=\exp(1/n)$, in which case the maximum-likelihood estimator of $M^*$ is the minimum-weight matching $M_{\text{min}}$. We obtain precise results on the overlap between $M^*$ and $M_{\text{min}}$, i.e., the fraction of edges they have in common. For $λ\ge 4$ we have almost perfect recovery, with overlap $1-o(1)$ with high probability. For $λ< 4$ the expected overlap is an explicit function $α(λ) < 1$: we compute it by generalizing Aldous' celebrated proof of the $ζ(2)$ conjecture for the un-planted model, using local weak convergence to relate $K_{n,n}$ to a type of weighted infinite tree, and then deriving a system of differential equations from a message-passing algorithm on this tree.
△ Less
Submitted 9 November, 2020; v1 submitted 18 December, 2019;
originally announced December 2019.
-
Linear Consistency for Proof-of-Stake Blockchains
Authors:
Erica Blum,
Aggelos Kiayias,
Cristopher Moore,
Saad Quader,
Alexander Russell
Abstract:
The blockchain data structure maintained via the longest-chain rule---popularized by Bitcoin---is a powerful algorithmic tool for consensus algorithms. Such algorithms achieve consistency for blocks in the chain as a function of their depth from the end of the chain. While the analysis of Bitcoin guarantees consistency with error $2^{-k}$ for blocks of depth $O(k)$, the state-of-the-art of proof-o…
▽ More
The blockchain data structure maintained via the longest-chain rule---popularized by Bitcoin---is a powerful algorithmic tool for consensus algorithms. Such algorithms achieve consistency for blocks in the chain as a function of their depth from the end of the chain. While the analysis of Bitcoin guarantees consistency with error $2^{-k}$ for blocks of depth $O(k)$, the state-of-the-art of proof-of-stake (PoS) blockchains suffers from a quadratic dependence on $k$: these protocols, exemplified by Ouroboros (Crypto 2017), Ouroboros Praos (Eurocrypt 2018) and Sleepy Consensus (Asiacrypt 2017), can only establish that depth $Θ(k^2)$ is sufficient. Whether this quadratic gap is an intrinsic limitation of PoS---due to issues such as the nothing-at-stake problem---has been an urgent open question, as deployed PoS blockchains further rely on consistency for protocol correctness.
We give an axiomatic theory of blockchain dynamics that permits rigorous reasoning about the longest-chain rule and achieve, in broad generality, $Θ(k)$ dependence on depth in order to achieve consistency error $2^{-k}$. In particular, for the first time, we show that PoS protocols can match proof-of-work protocols for linear consistency. We analyze the associated stochastic process, give a recursive relation for the critical functionals of this process, and derive tail bounds in both i.i.d. and martingale settings via associated generating functions.
△ Less
Submitted 22 November, 2019;
originally announced November 2019.
-
Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision
Authors:
Aren Jansen,
Daniel P. W. Ellis,
Shawn Hershey,
R. Channing Moore,
Manoj Plakal,
Ashok C. Popat,
Rif A. Saurous
Abstract:
Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on multimodal unsupervised learning (as infants) and active learning (as children). With this motivation, we present a learning framework for sound representation and…
▽ More
Humans do not acquire perceptual abilities in the way we train machines. While machine learning algorithms typically operate on large collections of randomly-chosen, explicitly-labeled examples, human acquisition relies more heavily on multimodal unsupervised learning (as infants) and active learning (as children). With this motivation, we present a learning framework for sound representation and recognition that combines (i) a self-supervised objective based on a general notion of unimodal and cross-modal coincidence, (ii) a clustering objective that reflects our need to impose categorical structure on our experiences, and (iii) a cluster-based active learning procedure that solicits targeted weak supervision to consolidate categories into relevant semantic classes. By training a combined sound embedding/clustering/classification network according to these criteria, we achieve a new state-of-the-art unsupervised audio representation and demonstrate up to a 20-fold reduction in the number of labels required to reach a desired classification performance.
△ Less
Submitted 13 November, 2019;
originally announced November 2019.
-
Percolation is Odd
Authors:
Stephan Mertens,
Cristopher Moore
Abstract:
We prove a remarkable combinatorial symmetry in the number of spanning configurations in site percolation: for a large class of lattices, the number of spanning configurations with an odd or even number of occupied sites differs by $\pm 1$. In particular, this symmetry implies that the total number of spanning configurations is always odd, independent of the size or shape of the lattice. The class…
▽ More
We prove a remarkable combinatorial symmetry in the number of spanning configurations in site percolation: for a large class of lattices, the number of spanning configurations with an odd or even number of occupied sites differs by $\pm 1$. In particular, this symmetry implies that the total number of spanning configurations is always odd, independent of the size or shape of the lattice. The class of lattices that share this symmetry includes the square lattice and the hypercubic lattice in any dimension, with a wide variety of boundary conditions.
△ Less
Submitted 4 December, 2019; v1 submitted 3 September, 2019;
originally announced September 2019.
-
Lecture Notes on Automata, Languages, and Grammars
Authors:
Cristopher Moore
Abstract:
These lecture notes are intended as a supplement to Moore and Mertens' The Nature of Computation or as a standalone resource, and are available to anyone who wants to use them. Comments are welcome, and please let me know if you use these notes in a course. There are 61 exercises.
I emphasize that automata are elementary playgrounds where we can explore the issues of deterministic and nondetermi…
▽ More
These lecture notes are intended as a supplement to Moore and Mertens' The Nature of Computation or as a standalone resource, and are available to anyone who wants to use them. Comments are welcome, and please let me know if you use these notes in a course. There are 61 exercises.
I emphasize that automata are elementary playgrounds where we can explore the issues of deterministic and nondeterministic computation. Unlike P vs. NP, we can prove that nondeterminism is equivalent to determinism, or strictly more powerful than determinism, in finite-state and push-down automata respectively. I also correct several historical and aesthetic injustices: in particular, the Myhill-Nerode theorem and the idea of building minimal DFAs from equivalence classes of prefixes is restored to its rightful place above the Pumping Lemma for regular languages. I also discuss the Pumping Lemma for context-free languages, and briefly discuss counter automata, queue automata, and the connection between unambiguous context-free languages and algebraic generating functions.
△ Less
Submitted 29 July, 2019;
originally announced July 2019.
-
Fast computation of loudness using a deep neural network
Authors:
Josef Schlittenlacher,
Richard E. Turner,
Brian C. J. Moore
Abstract:
The present paper introduces a deep neural network (DNN) for predicting the instantaneous loudness of a sound from its time waveform. The DNN was trained using the output of a more complex model, called the Cambridge loudness model. While a modern PC can perform a few hundred loudness computations per second using the Cambridge loudness model, it can perform more than 100,000 per second using the…
▽ More
The present paper introduces a deep neural network (DNN) for predicting the instantaneous loudness of a sound from its time waveform. The DNN was trained using the output of a more complex model, called the Cambridge loudness model. While a modern PC can perform a few hundred loudness computations per second using the Cambridge loudness model, it can perform more than 100,000 per second using the DNN, allowing real-time calculation of loudness. The root-mean-square deviation between the predictions of instantaneous loudness level using the two models was less than 0.5 phon for unseen types of sound. We think that the general approach of simulating a complex perceptual model by a much faster DNN can be applied to other perceptual models to make them run in real time.
△ Less
Submitted 24 May, 2019;
originally announced May 2019.
-
The Kikuchi Hierarchy and Tensor PCA
Authors:
Alexander S. Wein,
Ahmed El Alaoui,
Cristopher Moore
Abstract:
For the tensor PCA (principal component analysis) problem, we propose a new hierarchy of increasingly powerful algorithms with increasing runtime. Our hierarchy is analogous to the sum-of-squares (SOS) hierarchy but is instead inspired by statistical physics and related algorithms such as belief propagation and AMP (approximate message passing). Our level-$\ell$ algorithm can be thought of as a li…
▽ More
For the tensor PCA (principal component analysis) problem, we propose a new hierarchy of increasingly powerful algorithms with increasing runtime. Our hierarchy is analogous to the sum-of-squares (SOS) hierarchy but is instead inspired by statistical physics and related algorithms such as belief propagation and AMP (approximate message passing). Our level-$\ell$ algorithm can be thought of as a linearized message-passing algorithm that keeps track of $\ell$-wise dependencies among the hidden variables. Specifically, our algorithms are spectral methods based on the Kikuchi Hessian, which generalizes the well-studied Bethe Hessian to the higher-order Kikuchi free energies.
It is known that AMP, the flagship algorithm of statistical physics, has substantially worse performance than SOS for tensor PCA. In this work we 'redeem' the statistical physics approach by showing that our hierarchy gives a polynomial-time algorithm matching the performance of SOS. Our hierarchy also yields a continuum of subexponential-time algorithms, and we prove that these achieve the same (conjecturally optimal) tradeoff between runtime and statistical power as SOS. Our proofs are much simpler than prior work, and also apply to the related problem of refuting random $k$-XOR formulas. The results we present here apply to tensor PCA for tensors of all orders, and to $k$-XOR when $k$ is even.
Our methods suggest a new avenue for systematically obtaining optimal algorithms for Bayesian inference problems, and our results constitute a step toward unifying the statistical physics and sum-of-squares approaches to algorithm design.
△ Less
Submitted 1 October, 2019; v1 submitted 8 April, 2019;
originally announced April 2019.
-
A Deep Neural Network for Pixel-Level Electromagnetic Particle Identification in the MicroBooNE Liquid Argon Time Projection Chamber
Authors:
MicroBooNE collaboration,
C. Adams,
M. Alrashed,
R. An,
J. Anthony,
J. Asaadi,
A. Ashkenazi,
M. Auger,
S. Balasubramanian,
B. Baller,
C. Barnes,
G. Barr,
M. Bass,
F. Bay,
A. Bhat,
K. Bhattacharya,
M. Bishai,
A. Blake,
T. Bolton,
L. Camilleri,
D. Caratelli,
I. Caro Terrazas,
R. Carr,
R. Castillo Fernandez,
F. Cavanna
, et al. (148 additional authors not shown)
Abstract:
We have developed a convolutional neural network (CNN) that can make a pixel-level prediction of objects in image data recorded by a liquid argon time projection chamber (LArTPC) for the first time. We describe the network design, training techniques, and software tools developed to train this network. The goal of this work is to develop a complete deep neural network based data reconstruction cha…
▽ More
We have developed a convolutional neural network (CNN) that can make a pixel-level prediction of objects in image data recorded by a liquid argon time projection chamber (LArTPC) for the first time. We describe the network design, training techniques, and software tools developed to train this network. The goal of this work is to develop a complete deep neural network based data reconstruction chain for the MicroBooNE detector. We show the first demonstration of a network's validity on real LArTPC data using MicroBooNE collection plane images. The demonstration is performed for stopping muon and a $ν_μ$ charged current neutral pion data samples.
△ Less
Submitted 22 August, 2018;
originally announced August 2018.
-
Weakly-Supervised Convolutional Neural Networks for Multimodal Image Registration
Authors:
Yipeng Hu,
Marc Modat,
Eli Gibson,
Wenqi Li,
Nooshin Ghavami,
Ester Bonmati,
Guotai Wang,
Steven Bandula,
Caroline M. Moore,
Mark Emberton,
Sébastien Ourselin,
J. Alison Noble,
Dean C. Barratt,
Tom Vercauteren
Abstract:
One of the fundamental challenges in supervised learning for multimodal image registration is the lack of ground-truth for voxel-level spatial correspondence. This work describes a method to infer voxel-level transformation from higher-level correspondence information contained in anatomical labels. We argue that such labels are more reliable and practical to obtain for reference sets of image pai…
▽ More
One of the fundamental challenges in supervised learning for multimodal image registration is the lack of ground-truth for voxel-level spatial correspondence. This work describes a method to infer voxel-level transformation from higher-level correspondence information contained in anatomical labels. We argue that such labels are more reliable and practical to obtain for reference sets of image pairs than voxel-level correspondence. Typical anatomical labels of interest may include solid organs, vessels, ducts, structure boundaries and other subject-specific ad hoc landmarks. The proposed end-to-end convolutional neural network approach aims to predict displacement fields to align multiple labelled corresponding structures for individual image pairs during the training, while only unlabelled image pairs are used as the network input for inference. We highlight the versatility of the proposed strategy, for training, utilising diverse types of anatomical labels, which need not to be identifiable over all training image pairs. At inference, the resulting 3D deformable image registration algorithm runs in real-time and is fully-automated without requiring any anatomical labels or initialisation. Several network architecture variants are compared for registering T2-weighted magnetic resonance images and 3D transrectal ultrasound images from prostate cancer patients. A median target registration error of 3.6 mm on landmark centroids and a median Dice of 0.87 on prostate glands are achieved from cross-validation experiments, in which 108 pairs of multimodal images from 76 patients were tested with high-quality anatomical labels.
△ Less
Submitted 9 July, 2018;
originally announced July 2018.
-
Adversarial Deformation Regularization for Training Image Registration Neural Networks
Authors:
Yipeng Hu,
Eli Gibson,
Nooshin Ghavami,
Ester Bonmati,
Caroline M. Moore,
Mark Emberton,
Tom Vercauteren,
J. Alison Noble,
Dean C. Barratt
Abstract:
We describe an adversarial learning approach to constrain convolutional neural network training for image registration, replacing heuristic smoothness measures of displacement fields often used in these tasks. Using minimally-invasive prostate cancer intervention as an example application, we demonstrate the feasibility of utilizing biomechanical simulations to regularize a weakly-supervised anato…
▽ More
We describe an adversarial learning approach to constrain convolutional neural network training for image registration, replacing heuristic smoothness measures of displacement fields often used in these tasks. Using minimally-invasive prostate cancer intervention as an example application, we demonstrate the feasibility of utilizing biomechanical simulations to regularize a weakly-supervised anatomical-label-driven registration network for aligning pre-procedural magnetic resonance (MR) and 3D intra-procedural transrectal ultrasound (TRUS) images. A discriminator network is optimized to distinguish the registration-predicted displacement fields from the motion data simulated by finite element analysis. During training, the registration network simultaneously aims to maximize similarity between anatomical labels that drives image alignment and to minimize an adversarial generator loss that measures divergence between the predicted- and simulated deformation. The end-to-end trained network enables efficient and fully-automated registration that only requires an MR and TRUS image pair as input, without anatomical labels or simulated data during inference. 108 pairs of labelled MR and TRUS images from 76 prostate cancer patients and 71,500 nonlinear finite-element simulations from 143 different patients were used for this study. We show that, with only gland segmentation as training labels, the proposed method can help predict physically plausible deformation without any other smoothness penalty. Based on cross-validation experiments using 834 pairs of independent validation landmarks, the proposed adversarial-regularized registration achieved a target registration error of 6.3 mm that is significantly lower than those from several other regularization methods.
△ Less
Submitted 27 May, 2018;
originally announced May 2018.
-
A Review of Literature on Parallel Constraint Solving
Authors:
Ian P. Gent,
Ciaran McCreesh,
Ian Miguel,
Neil C. A. Moore,
Peter Nightingale,
Patrick Prosser,
Chris Unsworth
Abstract:
As multicore computing is now standard, it seems irresponsible for constraints researchers to ignore the implications of it. Researchers need to address a number of issues to exploit parallelism, such as: investigating which constraint algorithms are amenable to parallelisation; whether to use shared memory or distributed computation; whether to use static or dynamic decomposition; and how to best…
▽ More
As multicore computing is now standard, it seems irresponsible for constraints researchers to ignore the implications of it. Researchers need to address a number of issues to exploit parallelism, such as: investigating which constraint algorithms are amenable to parallelisation; whether to use shared memory or distributed computation; whether to use static or dynamic decomposition; and how to best exploit portfolios and cooperating search. We review the literature, and see that we can sometimes do quite well, some of the time, on some instances, but we are far from a general solution. Yet there seems to be little overall guidance that can be given on how best to exploit multicore computers to speed up constraint solving. We hope at least that this survey will provide useful pointers to future researchers wishing to correct this situation.
Under consideration in Theory and Practice of Logic Programming (TPLP).
△ Less
Submitted 29 March, 2018;
originally announced March 2018.
-
Eccentric, nonspinning, inspiral, Gaussian-process merger approximant for the detection and characterization of eccentric binary black hole mergers
Authors:
E. A. Huerta,
C. J. Moore,
Prayush Kumar,
Daniel George,
Alvin J. K. Chua,
Roland Haas,
Erik Wessel,
Daniel Johnson,
Derek Glennon,
Adam Rebei,
A. Miguel Holgado,
Jonathan R. Gair,
Harald P. Pfeiffer
Abstract:
We present $\texttt{ENIGMA}$, a time domain, inspiral-merger-ringdown waveform model that describes non-spinning binary black holes systems that evolve on moderately eccentric orbits. The inspiral evolution is described using a consistent combination of post-Newtonian theory, self-force and black hole perturbation theory. Assuming eccentric binaries that circularize prior to coalescence, we smooth…
▽ More
We present $\texttt{ENIGMA}$, a time domain, inspiral-merger-ringdown waveform model that describes non-spinning binary black holes systems that evolve on moderately eccentric orbits. The inspiral evolution is described using a consistent combination of post-Newtonian theory, self-force and black hole perturbation theory. Assuming eccentric binaries that circularize prior to coalescence, we smoothly match the eccentric inspiral with a stand-alone, quasi-circular merger, which is constructed using machine learning algorithms that are trained with quasi-circular numerical relativity waveforms. We show that $\texttt{ENIGMA}$ reproduces with excellent accuracy the dynamics of quasi-circular compact binaries. We validate $\texttt{ENIGMA}$ using a set of $\texttt{Einstein Toolkit}$ eccentric numerical relativity waveforms, which describe eccentric binary black hole mergers with mass-ratios between $1 \leq q \leq 5.5$, and eccentricities $e_0 \lesssim 0.2$ ten orbits before merger. We use this model to explore in detail the physics that can be extracted with moderately eccentric, non-spinning binary black hole mergers. We use $\texttt{ENIGMA}$ to show that GW150914, GW151226, GW170104, GW170814 and GW170608 can be effectively recovered with spinning, quasi-circular templates if the eccentricity of these events at a gravitational wave frequency of 10Hz satisfies $e_0\leq \{0.175,\, 0.125,\,0.175,\,0.175,\, 0.125\}$, respectively. We show that if these systems have eccentricities $e_0\sim 0.1$ at a gravitational wave frequency of 10Hz, they can be misclassified as quasi-circular binaries due to parameter space degeneracies between eccentricity and spin corrections. Using our catalog of eccentric numerical relativity simulations, we discuss the importance of including higher-order waveform multipoles in gravitational wave searches of eccentric binary black hole mergers.
△ Less
Submitted 24 January, 2018; v1 submitted 16 November, 2017;
originally announced November 2017.
-
Unsupervised Learning of Semantic Audio Representations
Authors:
Aren Jansen,
Manoj Plakal,
Ratheet Pandya,
Daniel P. W. Ellis,
Shawn Hershey,
Jiayang Liu,
R. Channing Moore,
Rif A. Saurous
Abstract:
Even in the absence of any explicit semantic annotation, vast collections of audio recordings provide valuable information for learning the categorical structure of sounds. We consider several class-agnostic semantic constraints that apply to unlabeled nonspeech audio: (i) noise and translations in time do not change the underlying sound category, (ii) a mixture of two sound events inherits the ca…
▽ More
Even in the absence of any explicit semantic annotation, vast collections of audio recordings provide valuable information for learning the categorical structure of sounds. We consider several class-agnostic semantic constraints that apply to unlabeled nonspeech audio: (i) noise and translations in time do not change the underlying sound category, (ii) a mixture of two sound events inherits the categories of the constituents, and (iii) the categories of events in close temporal proximity are likely to be the same or related. Without labels to ground them, these constraints are incompatible with classification loss functions. However, they may still be leveraged to identify geometric inequalities needed for triplet loss-based training of convolutional neural networks. The result is low-dimensional embeddings of the input spectrograms that recover 41% and 84% of the performance of their fully-supervised counterparts when applied to downstream query-by-example sound retrieval and sound event classification tasks, respectively. Moreover, in limited-supervision settings, our unsupervised embeddings double the state-of-the-art classification performance.
△ Less
Submitted 6 November, 2017;
originally announced November 2017.
-
Label-driven weakly-supervised learning for multimodal deformable image registration
Authors:
Yipeng Hu,
Marc Modat,
Eli Gibson,
Nooshin Ghavami,
Ester Bonmati,
Caroline M. Moore,
Mark Emberton,
J. Alison Noble,
Dean C. Barratt,
Tom Vercauteren
Abstract:
Spatially aligning medical images from different modalities remains a challenging task, especially for intraoperative applications that require fast and robust algorithms. We propose a weakly-supervised, label-driven formulation for learning 3D voxel correspondence from higher-level label correspondence, thereby bypassing classical intensity-based image similarity measures. During training, a conv…
▽ More
Spatially aligning medical images from different modalities remains a challenging task, especially for intraoperative applications that require fast and robust algorithms. We propose a weakly-supervised, label-driven formulation for learning 3D voxel correspondence from higher-level label correspondence, thereby bypassing classical intensity-based image similarity measures. During training, a convolutional neural network is optimised by outputting a dense displacement field (DDF) that warps a set of available anatomical labels from the moving image to match their corresponding counterparts in the fixed image. These label pairs, including solid organs, ducts, vessels, point landmarks and other ad hoc structures, are only required at training time and can be spatially aligned by minimising a cross-entropy function of the warped moving label and the fixed label. During inference, the trained network takes a new image pair to predict an optimal DDF, resulting in a fully-automatic, label-free, real-time and deformable registration. For interventional applications where large global transformation prevails, we also propose a neural network architecture to jointly optimise the global- and local displacements. Experiment results are presented based on cross-validating registrations of 111 pairs of T2-weighted magnetic resonance images and 3D transrectal ultrasound images from prostate cancer patients with a total of over 4000 anatomical labels, yielding a median target registration error of 4.2 mm on landmark centroids and a median Dice of 0.88 on prostate glands.
△ Less
Submitted 24 December, 2017; v1 submitted 5 November, 2017;
originally announced November 2017.
-
Minimum Circuit Size, Graph Isomorphism, and Related Problems
Authors:
Eric Allender,
Joshua A. Grochow,
Dieter van Melkebeek,
Cristopher Moore,
Andrew Morgan
Abstract:
We study the computational power of deciding whether a given truth-table can be described by a circuit of a given size (the Minimum Circuit Size Problem, or MCSP for short), and of the variant denoted as MKTP where circuit size is replaced by a polynomially-related Kolmogorov measure. All prior reductions from supposedly-intractable problems to MCSP / MKTP hinged on the power of MCSP / MKTP to dis…
▽ More
We study the computational power of deciding whether a given truth-table can be described by a circuit of a given size (the Minimum Circuit Size Problem, or MCSP for short), and of the variant denoted as MKTP where circuit size is replaced by a polynomially-related Kolmogorov measure. All prior reductions from supposedly-intractable problems to MCSP / MKTP hinged on the power of MCSP / MKTP to distinguish random distributions from distributions produced by hardness-based pseudorandom generator constructions. We develop a fundamentally different approach inspired by the well-known interactive proof system for the complement of Graph Isomorphism (GI). It yields a randomized reduction with zero-sided error from GI to MKTP. We generalize the result and show that GI can be replaced by any isomorphism problem for which the underlying group satisfies some elementary properties. Instantiations include Linear Code Equivalence, Permutation Group Conjugacy, and Matrix Subspace Conjugacy. Along the way we develop encodings of isomorphism classes that are efficiently decodable and achieve compression that is at or near the information-theoretic optimum; those encodings may be of independent interest.
△ Less
Submitted 26 October, 2017;
originally announced October 2017.
-
A physical model for efficient ranking in networks
Authors:
Caterina De Bacco,
Daniel B. Larremore,
Cristopher Moore
Abstract:
We present a physically-inspired model and an efficient algorithm to infer hierarchical rankings of nodes in directed networks. It assigns real-valued ranks to nodes rather than simply ordinal ranks, and it formalizes the assumption that interactions are more likely to occur between individuals with similar ranks. It provides a natural statistical significance test for the inferred hierarchy, and…
▽ More
We present a physically-inspired model and an efficient algorithm to infer hierarchical rankings of nodes in directed networks. It assigns real-valued ranks to nodes rather than simply ordinal ranks, and it formalizes the assumption that interactions are more likely to occur between individuals with similar ranks. It provides a natural statistical significance test for the inferred hierarchy, and it can be used to perform inference tasks such as predicting the existence or direction of edges. The ranking is obtained by solving a linear system of equations, which is sparse if the network is; thus the resulting algorithm is extremely efficient and scalable. We illustrate these findings by analyzing real and synthetic data, including datasets from animal behavior, faculty hiring, social support networks, and sports tournaments. We show that our method often outperforms a variety of others, in both speed and accuracy, in recovering the underlying ranks and predicting edge directions.
△ Less
Submitted 13 June, 2018; v1 submitted 3 September, 2017;
originally announced September 2017.
-
Intraoperative Organ Motion Models with an Ensemble of Conditional Generative Adversarial Networks
Authors:
Yipeng Hu,
Eli Gibson,
Tom Vercauteren,
Hashim U. Ahmed,
Mark Emberton,
Caroline M. Moore,
J. Alison Noble,
Dean C. Barratt
Abstract:
In this paper, we describe how a patient-specific, ultrasound-probe-induced prostate motion model can be directly generated from a single preoperative MR image. Our motion model allows for sampling from the conditional distribution of dense displacement fields, is encoded by a generative neural network conditioned on a medical image, and accepts random noise as additional input. The generative net…
▽ More
In this paper, we describe how a patient-specific, ultrasound-probe-induced prostate motion model can be directly generated from a single preoperative MR image. Our motion model allows for sampling from the conditional distribution of dense displacement fields, is encoded by a generative neural network conditioned on a medical image, and accepts random noise as additional input. The generative network is trained by a minimax optimisation with a second discriminative neural network, tasked to distinguish generated samples from training motion data. In this work, we propose that 1) jointly optimising a third conditioning neural network that pre-processes the input image, can effectively extract patient-specific features for conditioning; and 2) combining multiple generative models trained separately with heuristically pre-disjointed training data sets can adequately mitigate the problem of mode collapse. Trained with diagnostic T2-weighted MR images from 143 real patients and 73,216 3D dense displacement fields from finite element simulations of intraoperative prostate motion due to transrectal ultrasound probe pressure, the proposed models produced physically-plausible patient-specific motion of prostate glands. The ability to capture biomechanically simulated motion was evaluated using two errors representing generalisability and specificity of the model. The median values, calculated from a 10-fold cross-validation, were 2.8+/-0.3 mm and 1.7+/-0.1 mm, respectively. We conclude that the introduced approach demonstrates the feasibility of applying state-of-the-art machine learning algorithms to generate organ motion models from patient images, and shows significant promise for future research.
△ Less
Submitted 5 September, 2017;
originally announced September 2017.
-
Designing Strassen's algorithm
Authors:
Joshua A. Grochow,
Cristopher Moore
Abstract:
In 1969, Strassen shocked the world by showing that two n x n matrices could be multiplied in time asymptotically less than $O(n^3)$. While the recursive construction in his algorithm is very clear, the key gain was made by showing that 2 x 2 matrix multiplication could be performed with only 7 multiplications instead of 8. The latter construction was arrived at by a process of elimination and app…
▽ More
In 1969, Strassen shocked the world by showing that two n x n matrices could be multiplied in time asymptotically less than $O(n^3)$. While the recursive construction in his algorithm is very clear, the key gain was made by showing that 2 x 2 matrix multiplication could be performed with only 7 multiplications instead of 8. The latter construction was arrived at by a process of elimination and appears to come out of thin air. Here, we give the simplest and most transparent proof of Strassen's algorithm that we are aware of, using only a simple unitary 2-design and a few easy lines of calculation. Moreover, using basic facts from the representation theory of finite groups, we use 2-designs coming from group orbits to generalize our construction to all n (although the resulting algorithms aren't optimal for n at least 3).
△ Less
Submitted 30 August, 2017;
originally announced August 2017.
-
The Lovász Theta Function for Random Regular Graphs and Community Detection in the Hard Regime
Authors:
Jess Banks,
Robert Kleinberg,
Cristopher Moore
Abstract:
We derive upper and lower bounds on the degree $d$ for which the Lovász $\vartheta$ function, or equivalently sum-of-squares proofs with degree two, can refute the existence of a $k$-coloring in random regular graphs $G_{n,d}$. We show that this type of refutation fails well above the $k$-colorability transition, and in particular everywhere below the Kesten-Stigum threshold. This is consistent wi…
▽ More
We derive upper and lower bounds on the degree $d$ for which the Lovász $\vartheta$ function, or equivalently sum-of-squares proofs with degree two, can refute the existence of a $k$-coloring in random regular graphs $G_{n,d}$. We show that this type of refutation fails well above the $k$-colorability transition, and in particular everywhere below the Kesten-Stigum threshold. This is consistent with the conjecture that refuting $k$-colorability, or distinguishing $G_{n,d}$ from the planted coloring model, is hard in this region. Our results also apply to the disassortative case of the stochastic block model, adding evidence to the conjecture that there is a regime where community detection is computationally hard even though it is information-theoretically possible. Using orthogonal polynomials, we also provide explicit upper bounds on $\vartheta(\overline{G})$ for regular graphs of a given girth, which may be of independent interest.
△ Less
Submitted 28 August, 2017; v1 submitted 2 May, 2017;
originally announced May 2017.
-
The Computer Science and Physics of Community Detection: Landscapes, Phase Transitions, and Hardness
Authors:
Cristopher Moore
Abstract:
Community detection in graphs is the problem of finding groups of vertices which are more densely connected than they are to the rest of the graph. This problem has a long history, but it is undergoing a resurgence of interest due to the need to analyze social and biological networks. While there are many ways to formalize it, one of the most popular is as an inference problem, where there is a "g…
▽ More
Community detection in graphs is the problem of finding groups of vertices which are more densely connected than they are to the rest of the graph. This problem has a long history, but it is undergoing a resurgence of interest due to the need to analyze social and biological networks. While there are many ways to formalize it, one of the most popular is as an inference problem, where there is a "ground truth" community structure built into the graph somehow. The task is then to recover the ground truth knowing only the graph.
Recently it was discovered, first heuristically in physics and then rigorously in probability and computer science, that this problem has a phase transition at which it suddenly becomes impossible. Namely, if the graph is too sparse, or the probabilistic process that generates it is too noisy, then no algorithm can find a partition that is correlated with the planted one---or even tell if there are communities, i.e., distinguish the graph from a purely random one with high probability. Above this information-theoretic threshold, there is a second threshold beyond which polynomial-time algorithms are known to succeed; in between, there is a regime in which community detection is possible, but conjectured to require exponential time.
For computer scientists, this field offers a wealth of new ideas and open questions, with connections to probability and combinatorics, message-passing algorithms, and random matrix theory. Perhaps more importantly, it provides a window into the cultures of statistical physics and statistical inference, and how those cultures think about distributions of instances, landscapes of solutions, and hardness.
△ Less
Submitted 23 August, 2017; v1 submitted 1 February, 2017;
originally announced February 2017.
-
Community detection, link prediction, and layer interdependence in multilayer networks
Authors:
Caterina De Bacco,
Eleanor A. Power,
Daniel B. Larremore,
Cristopher Moore
Abstract:
Complex systems are often characterized by distinct types of interactions between the same entities. These can be described as a multilayer network where each layer represents one type of interaction. These layers may be interdependent in complicated ways, revealing different kinds of structure in the network. In this work we present a generative model, and an efficient expectation-maximization al…
▽ More
Complex systems are often characterized by distinct types of interactions between the same entities. These can be described as a multilayer network where each layer represents one type of interaction. These layers may be interdependent in complicated ways, revealing different kinds of structure in the network. In this work we present a generative model, and an efficient expectation-maximization algorithm, which allows us to perform inference tasks such as community detection and link prediction in this setting. Our model assumes overlapping communities that are common between the layers, while allowing these communities to affect each layer in a different way, including arbitrary mixtures of assortative, disassortative, or directed structure. It also gives us a mathematically principled way to define the interdependence between layers, by measuring how much information about one layer helps us predict links in another layer. In particular, this allows us to bundle layers together to compress redundant information, and identify small groups of layers which suffice to predict the remaining layers accurately. We illustrate these findings by analyzing synthetic data and two real multilayer networks, one representing social support relationships among villagers in South India and the other representing shared genetic substrings material between genes of the malaria parasite.
△ Less
Submitted 20 August, 2018; v1 submitted 5 January, 2017;
originally announced January 2017.
-
Matrix multiplication algorithms from group orbits
Authors:
Joshua A. Grochow,
Cristopher Moore
Abstract:
We show how to construct highly symmetric algorithms for matrix multiplication. In particular, we consider algorithms which decompose the matrix multiplication tensor into a sum of rank-1 tensors, where the decomposition itself consists of orbits under some finite group action. We show how to use the representation theory of the corresponding group to derive simple constraints on the decomposition…
▽ More
We show how to construct highly symmetric algorithms for matrix multiplication. In particular, we consider algorithms which decompose the matrix multiplication tensor into a sum of rank-1 tensors, where the decomposition itself consists of orbits under some finite group action. We show how to use the representation theory of the corresponding group to derive simple constraints on the decomposition, which we solve by hand for n=2,3,4,5, recovering Strassen's algorithm (in a particularly symmetric form) and new algorithms for larger n. While these new algorithms do not improve the known upper bounds on tensor rank or the matrix multiplication exponent, they are beautiful in their own right, and we point out modifications of this idea that could plausibly lead to further improvements. Our constructions also suggest further patterns that could be mined for new algorithms, including a tantalizing connection with lattices. In particular, using lattices we give the most transparent proof to date of Strassen's algorithm; the same proof works for all n, to yield a decomposition with $n^3 - n + 1$ terms.
△ Less
Submitted 12 December, 2016; v1 submitted 5 December, 2016;
originally announced December 2016.