-
SeamPose: Repurposing Seams as Capacitive Sensors in a Shirt for Upper-Body Pose Tracking
Authors:
Tianhong Catherine Yu,
Manru Mary Zhang,
Peter He,
Chi-Jung Lee,
Cassidy Cheesman,
Saif Mahmud,
Ruidong Zhang,
François Guimbretière,
Cheng Zhang
Abstract:
Seams are areas of overlapping fabric formed by stitching two or more pieces of fabric together in the cut-and-sew apparel manufacturing process. In SeamPose, we repurposed seams as capacitive sensors in a shirt for continuous upper-body pose estimation. Compared to previous all-textile motion-capturing garments that place the electrodes on the clothing surface, our solution leverages existing seams inside of a shirt by machine-sewing insulated conductive threads over the seams. The unique invisibilities and placements of the seams afford the sensing shirt to look and wear similarly as a conventional shirt while providing exciting pose-tracking capabilities. To validate this approach, we implemented a proof-of-concept untethered shirt with 8 capacitive sensing seams. With a 12-participant user study, our customized deep-learning pipeline accurately estimates the relative (to the pelvis) upper-body 3D joint positions with a mean per joint position error (MPJPE) of 6.0 cm. SeamPose represents a step towards unobtrusive integration of smart clothing for everyday pose estimation.
Submitted 6 August, 2024; v1 submitted 17 June, 2024;
originally announced June 2024.
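The SeamPose abstract above reports a pelvis-relative mean per joint position error (MPJPE). As a rough illustration of how that metric is usually computed, here is a minimal sketch. It is not the authors' evaluation code: the function name `mpjpe`, the choice of joint 0 as the pelvis, and averaging over all joints including the root are assumptions made for this example.

```python
import math

def mpjpe(pred, truth, root_index=0):
    """Mean per-joint position error between two poses.

    Each pose is a list of (x, y, z) joint positions. Both poses are first
    expressed relative to the root joint (assumed here to be the pelvis),
    so a pure translation of either pose does not change the error.
    """
    def relative(pose):
        rx, ry, rz = pose[root_index]
        return [(x - rx, y - ry, z - rz) for x, y, z in pose]

    rel_pred, rel_truth = relative(pred), relative(truth)
    # Euclidean distance per joint, then the mean over joints.
    errors = [math.dist(p, t) for p, t in zip(rel_pred, rel_truth)]
    return sum(errors) / len(errors)

# Hypothetical two-joint poses in centimetres: pelvis plus one other joint.
truth = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
pred = [(1.0, 1.0, 1.0), (1.0, 4.0, 5.0)]  # shifted by (1, 1, 1), second joint off by (0, 3, 4)
error_cm = mpjpe(pred, truth)
```

In this toy case the root joint contributes zero error after centring and the second joint is 5 cm off, so the mean over the two joints is 2.5 cm.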
-
Online/Offline Learning to Enable Robust Beamforming: Limited Feedback Meets Deep Generative Models
Authors:
Ying Li,
Zhidi Lin,
Kai Li,
Michael Minyi Zhang
Abstract:
Robust beamforming is a pivotal technique in massive multiple-input multiple-output (MIMO) systems as it mitigates interference among user equipment (UE). One current risk-neutral approach to robust beamforming is the stochastic weighted minimum mean square error method (WMMSE). However, this method necessitates statistical channel information, which is typically inaccessible, particularly in fifth-generation new radio frequency division duplex cellular systems with limited feedback. To tackle this challenge, we propose a novel approach that leverages a channel variational auto-encoder (CVAE) to simulate channel behaviors using limited feedback, eliminating the need for specific distribution assumptions present in existing methods. To seamlessly integrate model learning into practical wireless communication systems, this paper introduces two learning strategies to prepare the CVAE model for practical deployment. Firstly, motivated by the digital twin technology, we advocate employing a high-performance channel simulator to generate training data, enabling pretraining of the proposed CVAE while ensuring non-disruption to the practical wireless communication system. Moreover, we present an alternative online method for CVAE learning, where online training data is sourced based on channel estimations using Type II codebook. Numerical results demonstrate the effectiveness of these strategies, highlighting their exceptional performance in channel generation and robust beamforming applications.
Submitted 9 April, 2024;
originally announced April 2024.
-
Preventing Model Collapse in Gaussian Process Latent Variable Models
Authors:
Ying Li,
Zhidi Lin,
Feng Yin,
Michael Minyi Zhang
Abstract:
Gaussian process latent variable models (GPLVMs) are a versatile family of unsupervised learning models commonly used for dimensionality reduction. However, common challenges in modeling data with GPLVMs include inadequate kernel flexibility and improper selection of the projection noise, leading to a type of model collapse characterized by vague latent representations that do not reflect the underlying data structure. This paper addresses these issues by, first, theoretically examining the impact of projection variance on model collapse through the lens of a linear GPLVM. Second, we tackle model collapse due to inadequate kernel flexibility by integrating the spectral mixture (SM) kernel and a differentiable random Fourier feature (RFF) kernel approximation, which ensures computational scalability and efficiency through off-the-shelf automatic differentiation tools for learning the kernel hyperparameters, projection variance, and latent representations within the variational inference framework. The proposed GPLVM, named advisedRFLVM, is evaluated across diverse datasets and consistently outperforms various salient competing models, including state-of-the-art variational autoencoders (VAEs) and other GPLVM variants, in terms of informative latent representations and missing data imputation.
Submitted 18 June, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
From Concept to Field Tests: Accelerated Development of Multi-AUV Missions Using a High-Fidelity Faster-than-Real-Time Simulator
Authors:
Timothy R. Player,
Arjo Chakravarty,
Mabel M. Zhang,
Ben Yair Raanan,
Brian Kieft,
Yanwu Zhang,
Brett Hobson
Abstract:
We designed and validated a novel simulator for efficient development of multi-robot marine missions. To accelerate development of cooperative behaviors, the simulator models the robots' operating conditions with moderately high fidelity and runs significantly faster than real time, including acoustic communications, dynamic environmental data, and high-resolution bathymetry in large worlds. The simulator's ability to exceed a real-time factor (RTF) of 100 has been stress-tested with a robust continuous integration suite and was used to develop a multi-robot field experiment.
Submitted 17 November, 2023;
originally announced November 2023.
-
Online Student-$t$ Processes with an Overall-local Scale Structure for Modelling Non-stationary Data
Authors:
Taole Sha,
Michael Minyi Zhang
Abstract:
Time-dependent data often exhibit characteristics, such as non-stationarity and heavy-tailed errors, that would be inappropriate to model with the typical assumptions used in popular models. Thus, more flexible approaches are required to be able to accommodate such issues. To this end, we propose a Bayesian mixture of student-$t$ processes with an overall-local scale structure for the covariance. Moreover, we use a sequential Monte Carlo (SMC) sampler in order to perform online inference as data arrive in real-time. We demonstrate the superiority of our proposed approach compared to typical Gaussian process-based models on real-world data sets in order to prove the necessity of using mixtures of student-$t$ processes.
Submitted 1 November, 2023;
originally announced November 2023.
-
A Bayesian Non-parametric Approach to Generative Models: Integrating Variational Autoencoder and Generative Adversarial Networks using Wasserstein and Maximum Mean Discrepancy
Authors:
Forough Fazeli-Asl,
Michael Minyi Zhang
Abstract:
Generative models have emerged as a promising technique for producing high-quality images that are indistinguishable from real images. Generative adversarial networks (GANs) and variational autoencoders (VAEs) are two of the most prominent and widely studied generative models. GANs have demonstrated excellent performance in generating sharp realistic images and VAEs have shown strong abilities to generate diverse images. However, GANs suffer from ignoring a large portion of the possible output space which does not represent the full diversity of the target distribution, and VAEs tend to produce blurry images. To fully capitalize on the strengths of both models while mitigating their weaknesses, we employ a Bayesian non-parametric (BNP) approach to merge GANs and VAEs. Our procedure incorporates both Wasserstein and maximum mean discrepancy (MMD) measures in the loss function to enable effective learning of the latent space and generate diverse and high-quality samples. By fusing the discriminative power of GANs with the reconstruction capabilities of VAEs, our novel model achieves superior performance in various generative tasks, such as anomaly detection and data augmentation. Furthermore, we enhance the model's capability by employing an extra generator in the code space, which enables us to explore areas of the code space that the VAE might have overlooked. With a BNP perspective, we can model the data distribution using an infinite-dimensional space, which provides greater flexibility in the model and reduces the risk of overfitting. By utilizing this framework, we can enhance the performance of both GANs and VAEs to create a more robust generative model suitable for various applications.
Submitted 27 August, 2023;
originally announced August 2023.
-
Transmission spectroscopy of the lowest-density gas giant: metals and a potential extended outflow in HAT-P-67b
Authors:
Aaron Bello-Arufe,
Heather A. Knutson,
João M. Mendonça,
Michael M. Zhang,
Samuel H. C. Cabot,
Alexander D. Rathcke,
Ana Ulla,
Shreyas Vissapragada,
Lars A. Buchhave
Abstract:
Extremely low-density exoplanets are tantalizing targets for atmospheric characterization because of their promisingly large signals in transmission spectroscopy. We present the first analysis of the atmosphere of the lowest-density gas giant currently known, HAT-P-67 b. This inflated Saturn-mass exoplanet sits at the boundary between hot and ultrahot gas giants, where thermal dissociation of molecules begins to dominate atmospheric composition. We observed a transit of HAT-P-67 b at high spectral resolution with CARMENES and searched for atomic and molecular species using cross-correlation and likelihood mapping. Furthermore, we explored potential atmospheric escape by targeting H$α$ and the metastable helium line. We detect Ca II and Na I with significances of 13.2$σ$ and 4.6$σ$, respectively. Unlike in several ultrahot Jupiters, we do not measure a day-to-night wind. The large line depths of Ca II suggest that the upper atmosphere may be more ionized than models predict. We detect strong variability in H$α$ and the helium triplet during the observations. These signals suggest the possible presence of an extended planetary outflow that causes an early ingress and late egress. In the averaged transmission spectrum, we measure redshifted absorption at the $\sim 3.8\%$ and $\sim 4.5\%$ level in the H$α$ and He I triplet lines, respectively. From an isothermal Parker wind model, we derive a mass loss rate of $\dot{M} \sim 10^{13}~\rm{g/s}$ and an outflow temperature of $T \sim 9900~\rm{K}$. However, due to the lack of a longer out-of-transit baseline in our data, additional observations are needed to rule out stellar variability as the source of the H$α$ and He signals.
Submitted 12 July, 2023;
originally announced July 2023.
-
Bayesian Non-linear Latent Variable Modeling via Random Fourier Features
Authors:
Michael Minyi Zhang,
Gregory W. Gundersen,
Barbara E. Engelhardt
Abstract:
The Gaussian process latent variable model (GPLVM) is a popular probabilistic method used for nonlinear dimension reduction, matrix factorization, and state-space modeling. Inference for GPLVMs is computationally tractable only when the data likelihood is Gaussian. Moreover, inference for GPLVMs has typically been restricted to obtaining maximum a posteriori point estimates, which can lead to overfitting, or variational approximations, which mischaracterize the posterior uncertainty. Here, we present a method to perform Markov chain Monte Carlo (MCMC) inference for generalized Bayesian nonlinear latent variable modeling. The crucial insight necessary to generalize GPLVMs to arbitrary observation models is that we approximate the kernel function in the Gaussian process mappings with random Fourier features; this allows us to compute the gradient of the posterior in closed form with respect to the latent variables. We show that we can generalize GPLVMs to non-Gaussian observations, such as Poisson, negative binomial, and multinomial distributions, using our random feature latent variable model (RFLVM). Our generalized RFLVMs perform on par with state-of-the-art latent variable models on a wide range of applications, including motion capture, images, and text data for the purpose of estimating the latent structure and imputing the missing data of these complex data sets.
Submitted 14 June, 2023;
originally announced June 2023.
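The abstract above rests on approximating a Gaussian process kernel with random Fourier features. The sketch below illustrates that approximation in isolation and is not the RFLVM implementation. It assumes a squared-exponential (RBF) kernel, which the abstract does not specify, and the names `rbf_kernel` and `make_rff` are hypothetical.

```python
import math
import random

def rbf_kernel(x, y, lengthscale=1.0):
    """Exact RBF kernel k(x, y) = exp(-||x - y||^2 / (2 * lengthscale^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * lengthscale ** 2))

def make_rff(dim, num_features, lengthscale=1.0, seed=0):
    """Return a feature map z(.) such that z(x) . z(y) approximates rbf_kernel(x, y)."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density, N(0, I / lengthscale^2).
    weights = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
               for _ in range(num_features)]
    # Random phases, uniform on [0, 2*pi).
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, phases)]

    return features

features = make_rff(dim=2, num_features=5000)
x, y = (0.3, -0.2), (0.1, 0.4)
approx = sum(a * b for a, b in zip(features(x), features(y)))
exact = rbf_kernel(x, y)
```

Because the approximate kernel is an inner product of features, a model built on it is linear in `features(x)`, and that linearity is what gives the closed-form gradients with respect to the latent variables that the abstract mentions.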
-
A Semi-Bayesian Nonparametric Estimator of the Maximum Mean Discrepancy Measure: Applications in Goodness-of-Fit Testing and Generative Adversarial Networks
Authors:
Forough Fazeli-Asl,
Michael Minyi Zhang,
Lizhen Lin
Abstract:
A classic inferential statistical problem is the goodness-of-fit (GOF) test. Such a test can be challenging when the hypothesized parametric model has an intractable likelihood and its distributional form is not available. Bayesian methods for GOF can be appealing due to their ability to incorporate expert knowledge through prior distributions.
However, standard Bayesian methods for this test often require strong distributional assumptions on the data and their relevant parameters. To address this issue, we propose a semi-Bayesian nonparametric (semi-BNP) procedure in the context of the maximum mean discrepancy (MMD) measure that can be applied to the GOF test. Our method introduces a novel Bayesian estimator for the MMD, enabling the development of a measure-based hypothesis test for intractable models. Through extensive experiments, we demonstrate that our proposed test outperforms frequentist MMD-based methods by achieving a lower false rejection and acceptance rate of the null hypothesis. Furthermore, we showcase the versatility of our approach by embedding the proposed estimator within a generative adversarial network (GAN) framework. It facilitates a robust BNP learning approach as another significant application of our method. With our BNP procedure, this new GAN approach can enhance sample diversity and improve inferential accuracy compared to traditional techniques.
Submitted 10 November, 2023; v1 submitted 5 March, 2023;
originally announced March 2023.
-
DAVE Aquatic Virtual Environment: Toward a General Underwater Robotics Simulator
Authors:
Mabel M. Zhang,
Woen-Sug Choi,
Jessica Herman,
Duane Davis,
Carson Vogt,
Michael McCarrin,
Yadunund Vijay,
Dharini Dutia,
William Lew,
Steven Peters,
Brian Bingham
Abstract:
We present DAVE Aquatic Virtual Environment (DAVE), an open source simulation stack for underwater robots, sensors, and environments. Conventional robotics simulators are not designed to address unique challenges that come with the marine environment, including but not limited to environment conditions that vary spatially and temporally, impaired or challenging perception, and the unavailability of data in a generally unexplored environment. Given the variety of sensors and platforms, wheels are often reinvented for specific use cases that inevitably resist wider adoption.
Building on existing simulators, we provide a framework to help speed up the development and evaluation of algorithms that would otherwise require expensive and time-consuming operations at sea. The framework includes basic building blocks (e.g., new vehicles, water-tracking Doppler Velocity Logger, physics-based multibeam sonar) as well as development tools (e.g., dynamic bathymetry spawning, ocean currents), which allows the user to focus on methodology rather than software infrastructure. We demonstrate usage through example scenarios, bathymetric data import, user interfaces for data inspection and motion planning for manipulation, and visualizations.
Submitted 6 September, 2022;
originally announced September 2022.
-
Sparse Infinite Random Feature Latent Variable Modeling
Authors:
Michael Minyi Zhang
Abstract:
We propose a non-linear, Bayesian non-parametric latent variable model where the latent space is assumed to be sparse and infinite dimensional a priori using an Indian buffet process prior. A posteriori, the number of instantiated dimensions in the latent space is guaranteed to be finite. The purpose of placing the Indian buffet process on the latent variables is to: 1.) Automatically and probabilistically select the number of latent dimensions. 2.) Impose sparsity in the latent space, where the Indian buffet process will select which elements are exactly zero. Our proposed model allows for sparse, non-linear latent variable modeling where the number of latent dimensions is selected automatically. Inference is made tractable using the random Fourier approximation and we can easily implement posterior inference through Markov chain Monte Carlo sampling. This approach is amenable to many observation models beyond the Gaussian setting. We demonstrate the utility of our method on a variety of synthetic, biological and text datasets and show that we can obtain superior test set performance compared to previous latent variable models.
Submitted 26 May, 2022; v1 submitted 19 May, 2022;
originally announced May 2022.
-
Arbitrary Distribution Modeling with Censorship in Real-Time Bidding Advertising
Authors:
Xu Li,
Michelle Ma Zhang,
Youjun Tong,
Zhenya Wang
Abstract:
The purpose of Inventory Pricing is to bid the right prices to online ad opportunities, which is crucial for a Demand-Side Platform (DSP) to win advertising auctions in Real-Time Bidding (RTB). In the planning stage, advertisers need the forecast of probabilistic models to make bidding decisions. However, most of the previous works made strong assumptions on the distribution form of the winning price, which reduced their accuracy and weakened their ability to make generalizations. Though some works recently tried to fit the distribution directly, their complex structure lacked efficiency on online inference. In this paper, we devise a novel loss function, Neighborhood Likelihood Loss (NLL), collaborating with a proposed framework, Arbitrary Distribution Modeling (ADM), to predict the winning price distribution under censorship with no pre-assumption required. We conducted experiments on two real-world experimental datasets and one large-scale, non-simulated production dataset in our system. Experiments showed that ADM outperformed the baselines both on algorithm and business metrics. By replaying historical data of the production environment, this method was shown to lead to good yield in our system. Without any pre-assumed specific distribution form, ADM showed significant advantages in effectiveness and efficiency, demonstrating its great capability in modeling sophisticated price landscapes.
Submitted 26 October, 2021;
originally announced October 2021.
-
New $α$-Emitting Isotope $^{214}$U and Abnormal Enhancement of $α$-Particle Clustering in Lightest Uranium Isotopes
Authors:
Z. Y. Zhang,
H. B. Yang,
M. H. Huang,
Z. G. Gan,
C. X. Yuan,
C. Qi,
A. N. Andreyev,
M. L. Liu,
L. Ma,
M. M. Zhang,
Y. L. Tian,
Y. S. Wang,
J. G. Wang,
C. L. Yang,
G. S. Li,
Y. H. Qiang,
W. Q. Yang,
R. F. Chen,
H. B. Zhang,
Z. W. Lu,
X. X. Xu,
L. M. Duan,
H. R. Yang,
W. X. Huang,
Z. Liu,
et al. (17 additional authors not shown)
Abstract:
A new $α$-emitting isotope $^{214}$U, produced by fusion-evaporation reaction $^{182}$W($^{36}$Ar, 4n)$^{214}$U, was identified by employing the gas-filled recoil separator SHANS and recoil-$α$ correlation technique. More precise $α$-decay properties of even-even nuclei $^{216,218}$U were also measured in reactions of $^{40}$Ar, $^{40}$Ca with $^{180, 182, 184}$W targets. By combining the experimental data, improved $α$-decay reduced widths $δ^2$ for the even-even Po--Pu nuclei in the vicinity of magic neutron number $N=126$ were deduced. Their systematic trends are discussed in terms of $N_{p}N_{n}$ scheme in order to study the influence of proton-neutron interaction on $α$ decay in this region of nuclei. It is strikingly found that the reduced widths of $^{214,216}$U are significantly enhanced by a factor of two as compared with the $N_{p}N_{n}$ systematics for the $84 \leq Z \leq 90$ and $N<126$ even-even nuclei. The abnormal enhancement is interpreted by the strong monopole interaction between the valence protons and neutrons occupying the $π1f_{7/2}$ and $ν1f_{5/2}$ spin-orbit partner orbits, which is supported by a large-scale shell model calculation.
Submitted 15 January, 2021;
originally announced January 2021.
-
Accelerated Algorithms for Convex and Non-Convex Optimization on Manifolds
Authors:
Lizhen Lin,
Bayan Saparbayeva,
Michael Minyi Zhang,
David B. Dunson
Abstract:
We propose a general scheme for solving convex and non-convex optimization problems on manifolds. The central idea is that, by adding a multiple of the squared retraction distance to the objective function in question, we "convexify" the objective function and solve a series of convex sub-problems in the optimization procedure. One of the key challenges for optimization on manifolds is the difficulty of verifying the complexity of the objective function, e.g., whether the objective function is convex or non-convex, and the degree of non-convexity. Our proposed algorithm adapts to the level of complexity in the objective function. We show that when the objective function is convex, the algorithm provably converges to the optimum and leads to accelerated convergence. When the objective function is non-convex, the algorithm will converge to a stationary point. Our proposed method unifies insights from Nesterov's original idea for accelerating gradient descent algorithms with recent developments in optimization algorithms in Euclidean space. We demonstrate the utility of our algorithms on several manifold optimization tasks such as estimating intrinsic and extrinsic Fréchet means on spheres and low-rank matrix factorization with Grassmann manifolds applied to the Netflix rating data set.
Submitted 17 October, 2020;
originally announced October 2020.
-
Latent variable modeling with random features
Authors:
Gregory W. Gundersen,
Michael Minyi Zhang,
Barbara E. Engelhardt
Abstract:
Gaussian process-based latent variable models are flexible and theoretically grounded tools for nonlinear dimension reduction, but generalizing to non-Gaussian data likelihoods within this nonlinear framework is statistically challenging. Here, we use random features to develop a family of nonlinear dimension reduction models that are easily extensible to non-Gaussian data likelihoods; we call these random feature latent variable models (RFLVMs). By approximating a nonlinear relationship between the latent space and the observations with a function that is linear with respect to random features, we induce closed-form gradients of the posterior distribution with respect to the latent variable. This allows the RFLVM framework to support computationally tractable nonlinear latent variable models for a variety of data likelihoods in the exponential family without specialized derivations. Our generalized RFLVMs produce results comparable with other state-of-the-art dimension reduction methods on diverse types of data, including neural spike train recordings, images, and text data.
Submitted 19 June, 2020;
originally announced June 2020.
-
Distributed, partially collapsed MCMC for Bayesian Nonparametrics
Authors:
Avinava Dubey,
Michael Minyi Zhang,
Eric P. Xing,
Sinead A. Williamson
Abstract:
Bayesian nonparametric (BNP) models provide elegant methods for discovering underlying latent features within a data set, but inference in such models can be slow. We exploit the fact that completely random measures, which commonly used models like the Dirichlet process and the beta-Bernoulli process can be expressed as, are decomposable into independent sub-measures. We use this decomposition to partition the latent measure into a finite measure containing only instantiated components, and an infinite measure containing all other components. We then select different inference algorithms for the two components: uncollapsed samplers mix well on the finite measure, while collapsed samplers mix well on the infinite, sparsely occupied tail. The resulting hybrid algorithm can be applied to a wide class of models, and can be easily distributed to allow scalable inference without sacrificing asymptotic convergence guarantees.
Submitted 4 March, 2020; v1 submitted 15 January, 2020;
originally announced January 2020.
-
The Orbit of WASP-12b is Decaying
Authors:
Samuel W. Yee,
Joshua N. Winn,
Heather A. Knutson,
Kishore C. Patra,
Shreyas Vissapragada,
Michael M. Zhang,
Matthew J. Holman,
Avi Shporer,
Jason T. Wright
Abstract:
WASP-12b is a transiting hot Jupiter on a 1.09-day orbit around a late-F star. Since the planet's discovery in 2008, the time interval between transits has been decreasing by $29\pm 2$ msec year$^{-1}$. This is a possible sign of orbital decay, although the previously available data left open the possibility that the planet's orbit is slightly eccentric and is undergoing apsidal precession. Here, we present new transit and occultation observations that provide more decisive evidence for orbital decay, which is favored over apsidal precession by a $Δ\mathrm{BIC}$ of 22.3 or Bayes factor of 70,000. We also present new radial-velocity data that rule out the Rømer effect as the cause of the period change. This makes WASP-12 the first planetary system for which we can be confident that the orbit is decaying. The decay timescale for the orbit is $P/\dot{P} = 3.25\pm 0.23$ Myr. Interpreting the decay as the result of tidal dissipation, the modified stellar tidal quality factor is $Q'_\star = 1.8 \times10^{5}$.
Submitted 20 November, 2019;
originally announced November 2019.
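The figures quoted in the WASP-12b abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the rounded values given in the abstract (a 1.09-day period, a 29 msec per year decrease, and a ΔBIC of 22.3). The relation Bayes factor ≈ exp(ΔBIC / 2) is a common approximation assumed here for illustration; the abstract does not state how the authors computed their value.

```python
import math

# Values quoted in the abstract (rounded as given there).
period_days = 1.09          # orbital period in days
pdot_ms_per_year = -29.0    # change in the transit interval, msec per year
delta_bic = 22.3            # BIC difference favoring orbital decay

SECONDS_PER_DAY = 86_400.0

period_s = period_days * SECONDS_PER_DAY
pdot_s_per_year = pdot_ms_per_year / 1000.0

# Decay timescale P / |dP/dt|, converted from years to Myr.
timescale_myr = period_s / abs(pdot_s_per_year) / 1e6

# Assumed approximation: Bayes factor ~ exp(delta_BIC / 2).
bayes_factor = math.exp(delta_bic / 2.0)

print(f"decay timescale ~ {timescale_myr:.2f} Myr")
print(f"Bayes factor ~ {bayes_factor:.0f}")
```

Both results are consistent with the abstract: the timescale comes out near 3.25 Myr and the Bayes factor near 70,000.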
-
Probabilistic Time of Arrival Localization
Authors:
Fernando Perez-Cruz,
Pablo M. Olmos,
Michael Minyi Zhang,
Howard Huang
Abstract:
In this paper, we take a new approach for time of arrival geo-localization. We show that the main sources of error in metropolitan areas are due to environmental imperfections that bias our solutions, and that we can rely on a probabilistic model to learn and compensate for them. The resulting localization error is validated using measurements from a live LTE cellular network to be less than 10 meters, representing an order-of-magnitude improvement.
Submitted 15 October, 2019;
originally announced October 2019.
-
Sequential Gaussian Processes for Online Learning of Nonstationary Functions
Authors:
Michael Minyi Zhang,
Bianca Dumitrascu,
Sinead A. Williamson,
Barbara E. Engelhardt
Abstract:
Many machine learning problems can be framed in the context of estimating functions, and often these are time-dependent functions that are estimated in real-time as observations arrive. Gaussian processes (GPs) are an attractive choice for modeling real-valued nonlinear functions due to their flexibility and uncertainty quantification. However, the typical GP regression model suffers from several drawbacks: 1) Conventional GP inference scales $O(N^{3})$ with respect to the number of observations; 2) Updating a GP model sequentially is not trivial; and 3) Covariance kernels typically enforce stationarity constraints on the function, while GPs with non-stationary covariance kernels are often intractable to use in practice. To overcome these issues, we propose a sequential Monte Carlo algorithm to fit infinite mixtures of GPs that capture non-stationary behavior while allowing for online, distributed inference. Our approach empirically improves performance over state-of-the-art methods for online GP estimation in the presence of non-stationarity in time-series data. To demonstrate the utility of our proposed online Gaussian process mixture-of-experts approach in applied settings, we show that we can successfully implement an optimization algorithm using online Gaussian process bandits.
Submitted 6 May, 2023; v1 submitted 23 May, 2019;
originally announced May 2019.
-
A New Class of Time Dependent Latent Factor Models with Applications
Authors:
Sinead A. Williamson,
Michael Minyi Zhang,
Paul Damien
Abstract:
In many applications, observed data are influenced by some combination of latent causes. For example, suppose sensors are placed inside a building to record responses such as temperature, humidity, power consumption and noise levels. These random, observed responses are typically affected by many unobserved, latent factors (or features) within the building such as the number of individuals, the turning on and off of electrical devices, power surges, etc. These latent factors are usually present for a contiguous period of time before disappearing; further, multiple factors could be present at a time. This paper develops new probabilistic methodology and inference methods for random object generation influenced by latent features exhibiting temporal persistence. Every datum is associated with subsets of a potentially infinite number of hidden, persistent features that account for temporal dynamics in an observation. The ensuing class of dynamic models constructed by adapting the Indian Buffet Process --- a probability measure on the space of random, unbounded binary matrices --- finds use in a variety of applications arising in operations, signal processing, biomedicine, marketing, image analysis, etc. Illustrations using synthetic and real data are provided.
Submitted 17 April, 2019;
originally announced April 2019.
-
Communication Efficient Parallel Algorithms for Optimization on Manifolds
Authors:
Bayan Saparbayeva,
Michael Minyi Zhang,
Lizhen Lin
Abstract:
The last decade has witnessed an explosion in the development of models, theory and computational algorithms for "big data" analysis. In particular, distributed computing has served as a natural and dominating paradigm for statistical inference. However, the existing literature on parallel inference almost exclusively focuses on Euclidean data and parameters. While this assumption is valid for many applications, it is increasingly common to encounter problems where the data or the parameters lie on a non-Euclidean space, such as a manifold. Our work aims to fill a critical gap in the literature by generalizing parallel inference algorithms to optimization on manifolds. We show that our proposed algorithm is both communication efficient and carries theoretical convergence guarantees. In addition, we demonstrate the performance of our algorithm on the estimation of Fréchet means on simulated spherical data and on the low-rank matrix completion problem over Grassmann manifolds applied to the Netflix prize data set.
Submitted 1 November, 2018; v1 submitted 25 October, 2018;
originally announced October 2018.
-
Scalable Angular Discriminative Deep Metric Learning for Face Recognition
Authors:
Bowen Wu,
Huaming Wu,
Monica M. Y. Zhang
Abstract:
With the development of deep learning, Deep Metric Learning (DML) has achieved great improvements in face recognition. Specifically, the widely used softmax loss in the training process often brings large intra-class variations, and feature normalization is only exploited in the testing process to compute the pair similarities. To bridge the gap, we constrain the intra-class cosine similarity between the features and weight vectors in softmax loss to be larger than a margin in the training step, and extend it from four aspects. First, we explore the effect of a hard sample mining strategy. To alleviate the human labor of adjusting the margin hyper-parameter, a self-adaptive margin updating strategy is proposed. Then, a normalized version is given to take full advantage of the cosine similarity constraint. Furthermore, we enhance the former constraint to force the intra-class cosine similarity to be larger than the mean inter-class cosine similarity with a margin in the exponential feature projection space. Extensive experiments on Labeled Faces in the Wild (LFW), YouTube Faces (YTF) and IARPA Janus Benchmark A (IJB-A) datasets demonstrate that the proposed methods outperform the mainstream DML methods and approach the state-of-the-art performance.
Submitted 30 April, 2018; v1 submitted 29 April, 2018;
originally announced April 2018.
-
On fully residually-$\mathcal{R}$ groups
Authors:
Inna Bumagin,
Ming Ming Zhang
Abstract:
We consider the class $\mathcal{R}$ of finitely generated toral relatively hyperbolic groups. We show that groups from $\mathcal{R}$ are commutative transitive and generalize a theorem proved by Benjamin Baumslag to this class. We also discuss two definitions of (fully) residually-$\mathcal{C}$ groups and prove the equivalence of the two definitions for $\mathcal{C}=\mathcal{R}$. This is a generalization of the similar result obtained by Ol'shanskii for $\mathcal{C}$ being the class of torsion-free hyperbolic groups. Let $Γ\in\mathcal{R}$ be non-abelian and non-elementary. We prove that every finitely generated fully residually-$Γ$ group embeds into a group from $\mathcal{R}$. On the other hand, we give an example of a finitely generated torsion-free fully residually-$\mathcal{H}$ group that does not embed into a group from $\mathcal{R}$; $\mathcal{H}$ is the class of hyperbolic groups.
Submitted 13 January, 2018;
originally announced January 2018.
-
Accelerated Parallel Non-conjugate Sampling for Bayesian Non-parametric Models
Authors:
Michael Minyi Zhang,
Sinead A. Williamson,
Fernando Perez-Cruz
Abstract:
Inference of latent feature models in the Bayesian nonparametric setting is generally difficult, especially in high dimensional settings, because it usually requires proposing features from some prior distribution. In special cases, where the integration is tractable, we can sample new feature assignments according to a predictive likelihood. We present a novel method to accelerate the mixing of latent variable model inference by proposing feature locations based on the data, as opposed to the prior. First, we introduce an accelerated feature proposal mechanism that we show is a valid MCMC algorithm for posterior inference. Next, we propose an approximate inference strategy to perform accelerated inference in parallel. A two-stage algorithm that combines the two approaches provides a computationally attractive method that can quickly reach local convergence to the posterior distribution of our model, while allowing us to exploit parallelization.
Submitted 29 April, 2022; v1 submitted 19 May, 2017;
originally announced May 2017.
-
Parallel Markov Chain Monte Carlo for the Indian Buffet Process
Authors:
Michael M. Zhang,
Avinava Dubey,
Sinead A. Williamson
Abstract:
Indian Buffet Process based models are an elegant way for discovering underlying features within a data set, but inference in such models can be slow. Inferring underlying features using Markov chain Monte Carlo either relies on an uncollapsed representation, which leads to poor mixing, or on a collapsed representation, which leads to a quadratic increase in computational complexity. Existing attempts at distributing inference have introduced additional approximation within the inference procedure. In this paper we present a novel algorithm to perform asymptotically exact parallel Markov chain Monte Carlo inference for Indian Buffet Process models. We take advantage of the fact that the features are conditionally independent under the beta-Bernoulli process. Because of this conditional independence, we can partition the features into two parts: one part containing only the finitely many instantiated features and the other part containing the infinite tail of uninstantiated features. For the finite partition, parallel inference is simple given the instantiation of features. But for the infinite tail, performing uncollapsed MCMC leads to poor mixing and hence we collapse out the features. The resulting hybrid sampler, while being parallel, produces samples asymptotically from the true posterior.
Submitted 9 March, 2017;
originally announced March 2017.
-
Active End-Effector Pose Selection for Tactile Object Recognition through Monte Carlo Tree Search
Authors:
Mabel M. Zhang,
Nikolay Atanasov,
Kostas Daniilidis
Abstract:
This paper considers the problem of active object recognition using touch only. The focus is on adaptively selecting a sequence of wrist poses that achieves accurate recognition by enclosure grasps. It seeks to minimize the number of touches and maximize recognition confidence. The actions are formulated as wrist poses relative to each other, making the algorithm independent of absolute workspace coordinates. The optimal sequence is approximated by Monte Carlo tree search. We demonstrate results in a physics engine and on a real robot. In the physics engine, most object instances were recognized in at most 16 grasps. On a real robot, our method recognized objects in 2--9 grasps and outperformed a greedy baseline.
Submitted 29 July, 2017; v1 submitted 28 February, 2017;
originally announced March 2017.
-
Embarrassingly Parallel Inference for Gaussian Processes
Authors:
Michael Minyi Zhang,
Sinead A. Williamson
Abstract:
Training Gaussian process-based models typically involves an $O(N^3)$ computational bottleneck due to inverting the covariance matrix. Popular methods for overcoming this matrix inversion problem cannot adequately model all types of latent functions, and are often not parallelizable. However, judicious choice of model structure can ameliorate this problem. A mixture-of-experts model that uses a mixture of $K$ Gaussian processes offers modeling flexibility and opportunities for scalable inference. Our embarrassingly parallel algorithm combines low-dimensional matrix inversions with importance sampling to yield a flexible, scalable mixture-of-experts model that offers comparable performance to Gaussian process regression at a much lower computational cost.
Submitted 3 March, 2020; v1 submitted 27 February, 2017;
originally announced February 2017.
-
Robust and Parallel Bayesian Model Selection
Authors:
Michael Minyi Zhang,
Henry Lam,
Lizhen Lin
Abstract:
Effective and accurate model selection is an important problem in modern data analysis. One of the major challenges is the computational burden required to handle large data sets that cannot be stored or processed on one machine. Another challenge one may encounter is the presence of outliers and contaminations that damage the inference quality. The parallel "divide and conquer" model selection strategy divides the observations of the full data set into roughly equal subsets and performs inference and model selection independently on each subset. After local subset inference, this method aggregates the posterior model probabilities or other model/variable selection criteria to obtain a final model using the geometric median. This approach leads to improved concentration in finding the "correct" model and model parameters and is also provably robust to outliers and data contamination.
Submitted 22 March, 2018; v1 submitted 19 October, 2016;
originally announced October 2016.
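To illustrate why aggregating subset results with a geometric median is robust, here is a minimal sketch using Weiszfeld iterations on vectors of posterior model probabilities. It is not the authors' implementation: the Euclidean distance, the function name `geometric_median`, and the example probabilities are all assumptions made for illustration, and the paper may define the median in a different metric space.

```python
import math

def geometric_median(points, iters=200, eps=1e-9):
    """Weiszfeld iterations for the Euclidean geometric median of vectors."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    current = [sum(p[d] for p in points) / len(points) for d in range(dim)]
    for _ in range(iters):
        # Inverse-distance weights; eps guards against division by zero.
        weights = [1.0 / max(math.dist(p, current), eps) for p in points]
        total = sum(weights)
        updated = [
            sum(w * p[d] for w, p in zip(weights, points)) / total
            for d in range(dim)
        ]
        if math.dist(updated, current) < eps:
            return updated
        current = updated
    return current

# Hypothetical posterior probabilities over three candidate models,
# one vector per data subset; the last subset is contaminated.
subset_posteriors = [
    [0.80, 0.15, 0.05],
    [0.78, 0.17, 0.05],
    [0.82, 0.13, 0.05],
    [0.79, 0.16, 0.05],
    [0.05, 0.05, 0.90],
]

median = geometric_median(subset_posteriors)
mean = [sum(p[d] for p in subset_posteriors) / len(subset_posteriors)
        for d in range(3)]
print("geometric median:", [round(x, 3) for x in median])
print("arithmetic mean: ", [round(x, 3) for x in mean])
```

In this toy setup the arithmetic mean is pulled noticeably toward the contaminated subset, while the geometric median stays close to the four subsets that agree.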