-
Automatic Prediction of Amyotrophic Lateral Sclerosis Progression using Longitudinal Speech Transformer
Authors:
Liming Wang,
Yuan Gong,
Nauman Dawalatabad,
Marco Vilela,
Katerina Placek,
Brian Tracey,
Yishu Gong,
Alan Premasiri,
Fernando Vieira,
James Glass
Abstract:
Automatic prediction of amyotrophic lateral sclerosis (ALS) disease progression provides a more efficient and objective alternative to manual approaches. We propose ALS longitudinal speech transformer (ALST), a neural network-based automatic predictor of ALS disease progression from longitudinal speech recordings of ALS patients. By taking advantage of high-quality pretrained speech features and longitudinal information in the recordings, our best model achieves 91.0% AUC, improving upon the previous best model by 5.6% relative on the ALS TDI dataset. Careful analysis reveals that ALST is capable of fine-grained and interpretable predictions of ALS progression, especially for distinguishing between rarer and more severe cases. Code is publicly available.
Submitted 26 June, 2024;
originally announced June 2024.
-
Towards practical reinforcement learning for tokamak magnetic control
Authors:
Brendan D. Tracey,
Andrea Michi,
Yuri Chervonyi,
Ian Davies,
Cosmin Paduraru,
Nevena Lazic,
Federico Felici,
Timo Ewalds,
Craig Donner,
Cristian Galperti,
Jonas Buchli,
Michael Neunert,
Andrea Huber,
Jonathan Evens,
Paula Kurylowicz,
Daniel J. Mankowitz,
Martin Riedmiller,
The TCV Team
Abstract:
Reinforcement learning (RL) has shown promising results for real-time control systems, including the domain of plasma magnetic control. However, there are still significant drawbacks compared to traditional feedback control approaches for magnetic confinement. In this work, we address key drawbacks of the RL method: achieving higher control accuracy for desired plasma properties, reducing the steady-state error, and decreasing the required time to learn new tasks. We build on top of \cite{degrave2022magnetic}, and present algorithmic improvements to the agent architecture and training procedure. We present simulation results that show up to 65% improvement in shape accuracy, achieve substantial reduction in the long-term bias of the plasma current, and additionally reduce the training time required to learn new tasks by a factor of 3 or more. We present new experiments using the upgraded RL-based controllers on the TCV tokamak, which validate the simulation results achieved, and point the way towards routinely achieving accurate discharges using the RL approach.
Submitted 5 October, 2023; v1 submitted 21 July, 2023;
originally announced July 2023.
-
From Motor Control to Team Play in Simulated Humanoid Football
Authors:
Siqi Liu,
Guy Lever,
Zhe Wang,
Josh Merel,
S. M. Ali Eslami,
Daniel Hennes,
Wojciech M. Czarnecki,
Yuval Tassa,
Shayegan Omidshafiei,
Abbas Abdolmaleki,
Noah Y. Siegel,
Leonard Hasenclever,
Luke Marris,
Saran Tunyasuvunakool,
H. Francis Song,
Markus Wulfmeier,
Paul Muller,
Tuomas Haarnoja,
Brendan D. Tracey,
Karl Tuyls,
Thore Graepel,
Nicolas Heess
Abstract:
Intelligent behaviour in the physical world exhibits structure at multiple spatial and temporal scales. Although movements are ultimately executed at the level of instantaneous muscle tensions or joint torques, they must be selected to serve goals defined on much longer timescales, and in terms of relations that extend far beyond the body itself, ultimately involving coordination with other agents. Recent research in artificial intelligence has shown the promise of learning-based approaches to the respective problems of complex movement, longer-term planning and multi-agent coordination. However, there is limited research aimed at their integration. We study this problem by training teams of physically simulated humanoid avatars to play football in a realistic virtual environment. We develop a method that combines imitation learning, single- and multi-agent reinforcement learning and population-based training, and makes use of transferable representations of behaviour for decision making at different levels of abstraction. In a sequence of stages, players first learn to control a fully articulated body to perform realistic, human-like movements such as running and turning; they then acquire mid-level football skills such as dribbling and shooting; finally, they develop awareness of others and play as a team, bridging the gap between low-level motor control at a timescale of milliseconds, and coordinated goal-directed behaviour as a team at the timescale of tens of seconds. We investigate the emergence of behaviours at different levels of abstraction, as well as the representations that underlie these behaviours using several analysis techniques, including statistics from real-world sports analytics. Our work constitutes a complete demonstration of integrated decision-making at multiple scales in a physically embodied multi-agent setting. See project video at https://youtu.be/KHMwq9pv7mg.
Submitted 25 May, 2021;
originally announced May 2021.
-
Real World Games Look Like Spinning Tops
Authors:
Wojciech Marian Czarnecki,
Gauthier Gidel,
Brendan Tracey,
Karl Tuyls,
Shayegan Omidshafiei,
David Balduzzi,
Max Jaderberg
Abstract:
This paper investigates the geometrical properties of real world games (e.g. Tic-Tac-Toe, Go, StarCraft II). We hypothesise that their geometrical structure resembles a spinning top, with the upright axis representing transitive strength, and the radial axis, which corresponds to the number of cycles that exist at a particular transitive strength, representing the non-transitive dimension. We prove the existence of this geometry for a wide class of real world games, exposing their temporal nature. Additionally, we show that this unique structure also has consequences for learning: it clarifies why populations of strategies are necessary for training of agents, and how population size relates to the structure of the game. Finally, we empirically validate these claims by using a selection of nine real world two-player zero-sum symmetric games, showing 1) the spinning top structure is revealed and can be easily re-constructed by using a new method of Nash clustering to measure the interaction between transitive and cyclical strategy behaviour, and 2) the effect that population size has on the convergence in these games.
Submitted 17 June, 2020; v1 submitted 20 April, 2020;
originally announced April 2020.
-
Caveats for information bottleneck in deterministic scenarios
Authors:
Artemy Kolchinsky,
Brendan D. Tracey,
Steven Van Kuyk
Abstract:
Information bottleneck (IB) is a method for extracting information from one random variable $X$ that is relevant for predicting another random variable $Y$. To do so, IB identifies an intermediate "bottleneck" variable $T$ that has low mutual information $I(X;T)$ and high mutual information $I(Y;T)$. The "IB curve" characterizes the set of bottleneck variables that achieve maximal $I(Y;T)$ for a given $I(X;T)$, and is typically explored by maximizing the "IB Lagrangian", $I(Y;T) - βI(X;T)$. In some cases, $Y$ is a deterministic function of $X$, including many classification problems in supervised learning where the output class $Y$ is a deterministic function of the input $X$. We demonstrate three caveats when using IB in any situation where $Y$ is a deterministic function of $X$: (1) the IB curve cannot be recovered by maximizing the IB Lagrangian for different values of $β$; (2) there are "uninteresting" trivial solutions at all points of the IB curve; and (3) for multi-layer classifiers that achieve low prediction error, different layers cannot exhibit a strict trade-off between compression and prediction, contrary to a recent proposal. We also show that when $Y$ is a small perturbation away from being a deterministic function of $X$, these three caveats arise in an approximate way. To address problem (1), we propose a functional that, unlike the IB Lagrangian, can recover the IB curve in all cases. We demonstrate the three caveats on the MNIST dataset.
Submitted 8 February, 2019; v1 submitted 22 August, 2018;
originally announced August 2018.
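As an illustration of the quantities named in the abstract above, the following minimal sketch evaluates $I(X;T)$, $I(Y;T)$ and the IB Lagrangian $I(Y;T) - βI(X;T)$ for small discrete distributions. It is not the authors' code: the function names, the toy distribution (four equiprobable inputs with $Y = \lfloor X/2 \rfloor$) and the three encoders are assumptions chosen only to show a deterministic $Y = f(X)$ setting.

```python
import math

def mutual_information(joint):
    """I(A;B) in nats for a joint distribution given as joint[a][b]."""
    p_a = [sum(row) for row in joint]
    p_b = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (p_a[a] * p_b[b]))
        for a, row in enumerate(joint)
        for b, p in enumerate(row)
        if p > 0
    )

def ib_lagrangian(p_xy, p_t_given_x, beta):
    """Return (I(X;T), I(Y;T), I(Y;T) - beta * I(X;T)) for an encoder p(t|x)."""
    n_x, n_y, n_t = len(p_xy), len(p_xy[0]), len(p_t_given_x[0])
    p_x = [sum(row) for row in p_xy]
    # Joint of X and T induced by the encoder.
    p_xt = [[p_x[x] * p_t_given_x[x][t] for t in range(n_t)] for x in range(n_x)]
    # Joint of Y and T, using the Markov chain Y - X - T.
    p_yt = [
        [sum(p_xy[x][y] * p_t_given_x[x][t] for x in range(n_x)) for t in range(n_t)]
        for y in range(n_y)
    ]
    i_xt, i_yt = mutual_information(p_xt), mutual_information(p_yt)
    return i_xt, i_yt, i_yt - beta * i_xt

# Hypothetical toy problem: X uniform on {0,1,2,3}, Y = X // 2 (deterministic).
p_xy = [[0.25 if y == x // 2 else 0.0 for y in range(2)] for x in range(4)]

identity_enc = [[1.0 if t == x else 0.0 for t in range(4)] for x in range(4)]    # T = X
label_enc = [[1.0 if t == x // 2 else 0.0 for t in range(2)] for x in range(4)]  # T = Y
constant_enc = [[1.0] for _ in range(4)]                                          # T carries nothing

for name, enc in [("identity", identity_enc), ("label", label_enc), ("constant", constant_enc)]:
    print(name, ib_lagrangian(p_xy, enc, beta=0.5))
```

In this toy case the encoder $T = Y$ keeps all of $I(Y;T)$ while halving $I(X;T)$ relative to $T = X$, which is the kind of compression the Lagrangian rewards.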
-
Deep Reinforcement Learning for Event-Driven Multi-Agent Decision Processes
Authors:
Kunal Menda,
Yi-Chun Chen,
Justin Grana,
James W. Bono,
Brendan D. Tracey,
Mykel J. Kochenderfer,
David Wolpert
Abstract:
The incorporation of macro-actions (temporally extended actions) into multi-agent decision problems has the potential to address the curse of dimensionality associated with such decision problems. Since macro-actions last for stochastic durations, multiple agents executing decentralized policies in cooperative environments must act asynchronously. We present an algorithm that modifies generalized advantage estimation for temporally extended actions, allowing a state-of-the-art policy optimization algorithm to optimize policies in Dec-POMDPs in which agents act asynchronously. We show that our algorithm is capable of learning optimal policies in two cooperative domains, one involving real-time bus holding control and one involving wildfire fighting with unmanned aircraft. Our algorithm works by framing problems as "event-driven decision processes," which are scenarios in which the sequence and timing of actions and events are random and governed by an underlying stochastic process. In addition to optimizing policies with continuous state and action spaces, our algorithm also facilitates the use of event-driven simulators, which do not require time to be discretized into time-steps. We demonstrate the benefit of using event-driven simulation in the context of multiple agents taking asynchronous actions. We show that fixed time-step simulation risks obfuscating the sequence in which closely separated events occur, adversely affecting the policies learned. In addition, we show that arbitrarily shrinking the time-step scales poorly with the number of agents.
Submitted 29 May, 2019; v1 submitted 19 September, 2017;
originally announced September 2017.
-
On the Fusion of Compton Scatter and Attenuation Data for Limited-view X-ray Tomographic Applications
Authors:
Hamideh Rezaee,
Brian Tracey,
Eric L. Miller
Abstract:
In this paper we demonstrate the utility of fusing energy-resolved observations of Compton scattered photons with traditional attenuation data for the joint recovery of mass density and photoelectric absorption in the context of limited view tomographic imaging applications. We begin with the development of a physical and associated numerical model for the Compton scatter process. Using this model, we propose a variational approach recovering these two material properties. In addition to the typical data-fidelity terms, the optimization functional includes regularization for both the mass density and photoelectric coefficients. We consider a novel edge-preserving method in the case of mass density. To aid in the recovery of the photoelectric information, we draw on our recent method in \cite{r15} and employ a non-local regularization scheme that builds on the fact that mass density is more stably imaged. Simulation results demonstrate clear advantages associated with the use of both scattered photon data and energy resolved information in mapping the two material properties of interest. Specifically, comparing images obtained using only conventional attenuation data with those where we employ only Compton scatter photons and images formed from the combination of the two, shows that taking advantage of both types of data for reconstruction provides far more accurate results.
Submitted 5 July, 2017;
originally announced July 2017.
-
Estimating Mixture Entropy with Pairwise Distances
Authors:
Artemy Kolchinsky,
Brendan D. Tracey
Abstract:
Mixture distributions arise in many parametric and non-parametric settings -- for example, in Gaussian mixture models and in non-parametric estimation. It is often necessary to compute the entropy of a mixture, but, in most cases, this quantity has no closed-form expression, making some form of approximation necessary. We propose a family of estimators based on a pairwise distance function between mixture components, and show that this estimator class has many attractive properties. For many distributions of interest, the proposed estimators are efficient to compute, differentiable in the mixture parameters, and become exact when the mixture components are clustered. We prove this family includes lower and upper bounds on the mixture entropy. The Chernoff $α$-divergence gives a lower bound when chosen as the distance function, with the Bhattacharyya distance providing the tightest lower bound for components that are symmetric and members of a location family. The Kullback-Leibler divergence gives an upper bound when used as the distance function. We provide closed-form expressions of these bounds for mixtures of Gaussians, and discuss their applications to the estimation of mutual information. We then demonstrate that our bounds are significantly tighter than well-known existing bounds using numeric simulations. This estimator class is very useful in optimization problems involving maximization/minimization of entropy and mutual information, such as MaxEnt and rate distortion problems.
Submitted 22 August, 2018; v1 submitted 7 June, 2017;
originally announced June 2017.
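The abstract above describes estimators built from a pairwise distance between mixture components. The sketch below is one reading of that idea, not the authors' code. It assumes the estimator has the form $\hat{H}_D = \sum_i c_i H(p_i) - \sum_i c_i \ln \sum_j c_j e^{-D(p_i \| p_j)}$ (the abstract does not state the formula), and it restricts to one-dimensional Gaussian components with a shared variance, for which the Kullback-Leibler divergence is $(\mu_i-\mu_j)^2/(2\sigma^2)$ and the Bhattacharyya distance is $(\mu_i-\mu_j)^2/(8\sigma^2)$. All function and variable names are hypothetical.

```python
import math

def pairwise_entropy_estimate(weights, component_entropies, dist):
    """Assumed form: sum_i c_i H(p_i) - sum_i c_i ln sum_j c_j exp(-D(p_i||p_j))."""
    n = len(weights)
    estimate = sum(c * h for c, h in zip(weights, component_entropies))
    for i in range(n):
        inner = sum(weights[j] * math.exp(-dist(i, j)) for j in range(n))
        estimate -= weights[i] * math.log(inner)
    return estimate

def gaussian_mixture_bounds(weights, means, sigma):
    """Lower (Bhattacharyya) and upper (KL) estimates for equal-variance 1-D Gaussians."""
    h_component = 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)
    entropies = [h_component] * len(weights)
    kl = lambda i, j: (means[i] - means[j]) ** 2 / (2 * sigma ** 2)
    bhattacharyya = lambda i, j: (means[i] - means[j]) ** 2 / (8 * sigma ** 2)
    lower = pairwise_entropy_estimate(weights, entropies, bhattacharyya)
    upper = pairwise_entropy_estimate(weights, entropies, kl)
    return lower, upper

# Hypothetical example: two equally weighted components, two units apart.
lower, upper = gaussian_mixture_bounds([0.5, 0.5], [0.0, 2.0], sigma=1.0)
print(f"lower = {lower:.4f} nats, upper = {upper:.4f} nats")
```

Under these assumptions the two estimates coincide when all components are identical (both reduce to the component entropy) and when the components are far apart (both approach the component entropy plus the entropy of the weights), in line with the abstract's statement that the estimators become exact for clustered components.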
-
Nonlinear Information Bottleneck
Authors:
Artemy Kolchinsky,
Brendan D. Tracey,
David H. Wolpert
Abstract:
Information bottleneck (IB) is a technique for extracting information in one random variable $X$ that is relevant for predicting another random variable $Y$. IB works by encoding $X$ in a compressed "bottleneck" random variable $M$ from which $Y$ can be accurately decoded. However, finding the optimal bottleneck variable involves a difficult optimization problem, which until recently has been considered for only two limited cases: discrete $X$ and $Y$ with small state spaces, and continuous $X$ and $Y$ with a Gaussian joint distribution (in which case optimal encoding and decoding maps are linear). We propose a method for performing IB on arbitrarily-distributed discrete and/or continuous $X$ and $Y$, while allowing for nonlinear encoding and decoding maps. Our approach relies on a novel non-parametric upper bound for mutual information. We describe how to implement our method using neural networks. We then show that it achieves better performance than the recently-proposed "variational IB" method on several real-world datasets.
Submitted 30 November, 2019; v1 submitted 5 May, 2017;
originally announced May 2017.
-
Modeling Social Organizations as Communication Networks
Authors:
David Wolpert,
Justin Grana,
Brendan Tracey,
Tim Kohler,
Artemy Kolchinsky
Abstract:
We identify the "organization" of a human social group as the communication network(s) within that group. We then introduce three theoretical approaches to analyzing what determines the structures of human organizations. All three approaches adopt a group-selection perspective, so that the group's network structure is (approximately) optimal, given the information-processing limitations of agents within the social group, and the exogenous welfare function of the overall group. In the first approach we use a new sub-field of telecommunications theory called network coding, and focus on a welfare function that involves the ability of the organization to convey information among the agents. In the second approach we focus on a scenario where agents within the organization must allocate their future communication resources when the state of the future environment is uncertain. We show how this formulation can be solved with a linear program. In the third approach, we introduce an information synthesis problem in which agents within an organization receive information from various sources and must decide how to transform such information and transmit the results to other agents in the organization. We propose leveraging the computational power of neural networks to solve such problems. These three approaches formalize and synthesize work in fields including anthropology, archeology, economics and psychology that deal with organization structure, theory of the firm, span of control and cognitive limits on communication.
Submitted 14 February, 2017;
originally announced February 2017.
-
Dynamics of beneficial epidemics
Authors:
Andrew Berdahl,
Christa Brelsford,
Caterina De Bacco,
Marion Dumas,
Vanessa Ferdinand,
Joshua A. Grochow,
Laurent Hébert-Dufresne,
Yoav Kallus,
Christopher P. Kempes,
Artemy Kolchinsky,
Daniel B. Larremore,
Eric Libby,
Eleanor A. Power,
Caitlin A. Stern,
Brendan Tracey
Abstract:
Pathogens can spread epidemically through populations. Beneficial contagions, such as viruses that enhance host survival or technological innovations that improve quality of life, also have the potential to spread epidemically. How do the dynamics of beneficial biological and social epidemics differ from those of detrimental epidemics? We investigate this question using three theoretical approaches. First, in the context of population genetics, we show that a horizontally-transmissible element that increases fitness, such as viral DNA, spreads superexponentially through a population, more quickly than a beneficial mutation. Second, in the context of behavioral epidemiology, we show that infections that cause increased connectivity lead to superexponential fixation in the population. Third, in the context of dynamic social networks, we find that preferences for increased global infection accelerate spread and produce superexponential fixation, but preferences for local assortativity halt epidemics by disconnecting the infected from the susceptible. We conclude that the dynamics of beneficial biological and social epidemics are characterized by the rapid spread of beneficial elements, which is facilitated in biological systems by horizontal transmission and in social systems by active spreading behavior of infected individuals.
Submitted 17 February, 2017; v1 submitted 7 April, 2016;
originally announced April 2016.
-
Predicting the behavior of interacting humans by fusing data from multiple sources
Authors:
Erik J. Schlicht,
Ritchie Lee,
David H. Wolpert,
Mykel J. Kochenderfer,
Brendan Tracey
Abstract:
Multi-fidelity methods combine inexpensive low-fidelity simulations with costly but high-fidelity simulations to produce an accurate model of a system of interest at minimal cost. They have proven useful in modeling physical systems and have been applied to engineering problems such as wing-design optimization. During human-in-the-loop experimentation, it has become increasingly common to use online platforms, like Mechanical Turk, to run low-fidelity experiments to gather human performance data in an efficient manner. One concern with these experiments is that the results obtained from the online environment generalize poorly to the actual domain of interest. To address this limitation, we extend traditional multi-fidelity approaches to allow us to combine fewer data points from high-fidelity human-in-the-loop experiments with plentiful but less accurate data from low-fidelity experiments to produce accurate models of how humans interact. We present both model-based and model-free methods, and summarize the predictive performance of each method under different conditions.
Submitted 9 August, 2014;
originally announced August 2014.
-
Stabilizing dual-energy X-ray computed tomography reconstructions using patch-based regularization
Authors:
Brian H. Tracey,
Eric L. Miller
Abstract:
Recent years have seen growing interest in exploiting dual- and multi-energy measurements in computed tomography (CT) in order to characterize material properties as well as object shape. Material characterization is performed by decomposing the scene into constitutive basis functions, such as Compton scatter and photoelectric absorption functions. While well motivated physically, the joint recovery of the spatial distribution of photoelectric and Compton properties is severely complicated by the fact that the data are several orders of magnitude more sensitive to Compton scatter coefficients than to photoelectric absorption, so small errors in Compton estimates can create large artifacts in the photoelectric estimate. To address these issues, we propose a model-based iterative approach which uses patch-based regularization terms to stabilize inversion of photoelectric coefficients, and solve the resulting problem through the use of computationally attractive Alternating Direction Method of Multipliers (ADMM) solution techniques. Using simulations and experimental data acquired on a commercial scanner, we demonstrate that the proposed processing can lead to more stable material property estimates which should aid materials characterization in future dual- and multi-energy CT systems.
Submitted 25 March, 2014;
originally announced March 2014.
-
Blockwise SURE Shrinkage for Non-Local Means
Authors:
Yue Wu,
Brian Tracey,
Premkumar Natarajan,
Joseph P. Noonan
Abstract:
In this letter, we investigate the shrinkage problem for non-local means (NLM) image denoising. In particular, we derive the closed form of the optimal blockwise shrinkage for NLM that minimizes Stein's unbiased risk estimator (SURE). We also propose a constant complexity algorithm allowing fast blockwise shrinkage. Simulation results show that the proposed blockwise shrinkage method improves NLM performance in attaining higher peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM), and makes NLM more robust against parameter changes. Similar ideas may be applicable to other patchwise image denoising techniques.
Submitted 18 May, 2013;
originally announced May 2013.
-
Cyber-Physical Security: A Game Theory Model of Humans Interacting over Control Systems
Authors:
Scott Backhaus,
Russell Bent,
James Bono,
Ritchie Lee,
Brendan Tracey,
David Wolpert,
Dongping Xie,
Yildiray Yildiz
Abstract:
Recent years have seen increased interest in the design and deployment of smart grid devices and control algorithms. Each of these smart communicating devices represents a potential access point for an intruder, spurring research into intruder prevention and detection. However, no security measures are complete, and intruding attackers will compromise smart grid devices, leading to the attacker and the system operator interacting via the grid and its control systems. The outcome of these machine-mediated human-human interactions will depend on the design of the physical and control systems mediating the interactions. If these outcomes can be predicted via simulation, they can be used as a tool for designing attack-resilient grids and control systems. However, accurate predictions require good models of not just the physical and control systems, but also of the human decision making. In this manuscript, we present an approach to develop such tools, i.e. models of the decisions of the cyber-physical intruder who is attacking the systems and the system operator who is defending them, and demonstrate its usefulness for design.
Submitted 15 April, 2013;
originally announced April 2013.
-
Probabilistic Non-Local Means
Authors:
Yue Wu,
Brian Tracey,
Premkumar Natarajan,
Joseph P. Noonan
Abstract:
In this paper, we propose a so-called probabilistic non-local means (PNLM) method for image denoising. Our main contributions are: 1) we point out defects of the weight function used in the classic NLM; 2) we successfully derive all theoretical statistics of patch-wise differences for Gaussian noise; and 3) we employ this prior information and formulate the probabilistic weights truly reflecting the similarity between two noisy patches. The probabilistic nature of the new weight function also provides a theoretical basis to choose thresholds rejecting dissimilar patches for fast computations. Our simulation results indicate the PNLM outperforms the classic NLM and many recent NLM variants in terms of peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) index. Encouraging improvements are also found when we replace the NLM weights with the probabilistic weights in tested NLM variants.
Submitted 22 February, 2013;
originally announced February 2013.
-
James-Stein Type Center Pixel Weights for Non-Local Means Image Denoising
Authors:
Yue Wu,
Brian Tracey,
Joseph P. Noonan
Abstract:
Non-Local Means (NLM) and variants have been proven to be effective and robust in many image denoising tasks. In this letter, we study the parameter selection problem of center pixel weights (CPW) in NLM. Our key contributions are: 1) we give a novel formulation of the CPW problem from the statistical shrinkage perspective; 2) we introduce the James-Stein type CPWs for NLM; and 3) we propose a new adaptive CPW that is locally tuned for each image pixel. Our experimental results showed that compared to existing CPW solutions, the new proposed CPWs are more robust and effective under various noise levels. In particular, the NLM with the James-Stein type CPWs attains higher means with smaller variances in terms of the peak signal-to-noise ratio, implying they improve the NLM robustness and make it less sensitive to parameter selection.
Submitted 7 November, 2012;
originally announced November 2012.
-
Counter-Factual Reinforcement Learning: How to Model Decision-Makers That Anticipate The Future
Authors:
Ritchie Lee,
David H. Wolpert,
James Bono,
Scott Backhaus,
Russell Bent,
Brendan Tracey
Abstract:
This paper introduces a novel framework for modeling interacting humans in a multi-stage game. This "iterated semi network-form game" framework has the following desirable characteristics: (1) boundedly rational players, (2) strategic players (i.e., players account for one another's reward functions when predicting one another's behavior), and (3) computational tractability even on real-world systems. We achieve these benefits by combining concepts from game theory and reinforcement learning. To be precise, we extend the boundedly rational "level-K reasoning" model to apply to games over multiple stages. Our extension allows the decomposition of the overall modeling problem into a series of smaller ones, each of which can be solved by standard reinforcement learning algorithms. We call this hybrid approach "level-K reinforcement learning". We investigate these ideas in a cyber battle scenario over a smart power grid and discuss the relationship between the behavior predicted by our model and what one might expect of real human defenders and attackers.
Submitted 3 July, 2012;
originally announced July 2012.
-
Predicting the behavior of interacting humans by fusing data from multiple sources
Authors:
Erik J. Schlicht,
Ritchie Lee,
David H. Wolpert,
Mykel J. Kochenderfer,
Brendan Tracey
Abstract:
Multi-fidelity methods combine inexpensive low-fidelity simulations with costly but high-fidelity simulations to produce an accurate model of a system of interest at minimal cost. They have proven useful in modeling physical systems and have been applied to engineering problems such as wing-design optimization. During human-in-the-loop experimentation, it has become increasingly common to use online platforms, like Mechanical Turk, to run low-fidelity experiments to gather human performance data in an efficient manner. One concern with these experiments is that the results obtained from the online environment generalize poorly to the actual domain of interest. To address this limitation, we extend traditional multi-fidelity approaches to allow us to combine fewer data points from high-fidelity human-in-the-loop experiments with plentiful but less accurate data from low-fidelity experiments to produce accurate models of how humans interact. We present both model-based and model-free methods, and summarize the predictive performance of each method under different conditions.
Submitted 26 June, 2012;
originally announced June 2012.
-
Using Supervised Learning to Improve Monte Carlo Integral Estimation
Authors:
Brendan Tracey,
David Wolpert,
Juan J. Alonso
Abstract:
Monte Carlo (MC) techniques are often used to estimate integrals of a multivariate function using randomly generated samples of the function. In light of the increasing interest in uncertainty quantification and robust design applications in aerospace engineering, the calculation of expected values of such functions (e.g., performance measures) becomes important. However, MC techniques often suffer from high variance and slow convergence as the number of samples increases. In this paper we present Stacked Monte Carlo (StackMC), a new method for post-processing an existing set of MC samples to improve the associated integral estimate. StackMC is based on the supervised learning techniques of fitting functions and cross validation. It should reduce the variance of any type of Monte Carlo integral estimate (simple sampling, importance sampling, quasi-Monte Carlo, MCMC, etc.) without adding bias. We report on an extensive set of experiments confirming that the StackMC estimate of an integral is more accurate than both the associated unprocessed Monte Carlo estimate and an estimate based on a functional fit to the MC samples. These experiments run over a wide variety of integration spaces, numbers of sample points, dimensions, and fitting functions. In particular, we apply StackMC in estimating the expected value of the fuel burn metric of future commercial aircraft and in estimating sonic boom loudness measures. We compare the efficiency of StackMC with that of more standard methods and show that, for negligible additional computational cost, significant increases in accuracy are gained.
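The combination of function fitting and cross validation described in this abstract can be illustrated with a simplified sketch. It is not the authors' implementation: it assumes a one-dimensional integrand sampled uniformly on [0, 1], uses a polynomial as the fitting function, and fixes the fit's weight to 1 rather than tuning it from held-out data. The name `stacked_mc_estimate` and its parameters are hypothetical.

```python
import numpy as np

def stacked_mc_estimate(x, f_vals, n_folds=5, degree=3):
    """Fold-wise fit-plus-residual estimate of the integral of f over [0, 1].

    x       : samples drawn uniformly from [0, 1] (assumption of this sketch)
    f_vals  : function values f(x) at those samples
    n_folds : number of cross-validation folds
    degree  : degree of the polynomial fitting function (assumption)
    """
    n = len(x)
    folds = np.array_split(np.arange(n), n_folds)
    estimates, sizes = [], []
    for held_out in folds:
        train = np.setdiff1d(np.arange(n), held_out)
        # Fit g on the training folds only, so it is independent of held-out samples.
        coeffs = np.polyfit(x[train], f_vals[train], degree)
        # Integrate the fitted polynomial exactly over [0, 1].
        antideriv = np.polyint(coeffs)
        g_integral = np.polyval(antideriv, 1.0) - np.polyval(antideriv, 0.0)
        # Correct the fit with the mean held-out residual f - g.
        residual = f_vals[held_out] - np.polyval(coeffs, x[held_out])
        estimates.append(g_integral + residual.mean())
        sizes.append(len(held_out))
    # Average the per-fold estimates, weighted by fold size.
    return float(np.average(estimates, weights=sizes))

rng = np.random.default_rng(0)
x = rng.uniform(size=200)
print(stacked_mc_estimate(x, np.exp(x)))  # compare with the exact value e - 1
```

Because each fit is evaluated only on samples it was not trained on, the residual correction in this sketch does not bias the estimate, and the variance depends on how well the fit tracks the integrand rather than on the integrand itself.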
Submitted 24 August, 2011;
originally announced August 2011.