SeamPose: Repurposing Seams as Capacitive Sensors in a Shirt for Upper-Body Pose Tracking

Authors: Tianhong Catherine Yu, Manru Mary Zhang, Peter He, Chi-Jung Lee, Cassidy Cheesman, Saif Mahmud, Ruidong Zhang, François Guimbretière, Cheng Zhang

Abstract: Seams are areas of overlapping fabric formed by stitching two or more pieces of fabric together in the cut-and-sew apparel manufacturing process. In SeamPose, we repurposed seams as capacitive sensors in a shirt for continuous upper-body pose estimation. Compared to previous all-textile motion-capturing garments that place the electrodes on the clothing surface, our solution leverages existing seams inside of a shirt by machine-sewing insulated conductive threads over the seams. The unique invisibilities and placements of the seams afford the sensing shirt to look and wear similarly as a conventional shirt while providing exciting pose-tracking capabilities. To validate this approach, we implemented a proof-of-concept untethered shirt with 8 capacitive sensing seams. With a 12-participant user study, our customized deep-learning pipeline accurately estimates the relative (to the pelvis) upper-body 3D joint positions with a mean per joint position error (MPJPE) of 6.0 cm. SeamPose represents a step towards unobtrusive integration of smart clothing for everyday pose estimation.

Submitted 6 August, 2024; v1 submitted 17 June, 2024; originally announced June 2024.
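
Note: the abstract above reports a pelvis-relative MPJPE. The sketch below shows one common way such a metric is computed. It is an illustration only and not the authors' evaluation code; the array layout, the root-joint index, and the choice to include the root joint in the average are assumptions.

```python
import numpy as np

def mpjpe(pred, truth, root=0):
    """Mean per joint position error after re-expressing joints relative to a root.

    pred, truth: arrays of shape (frames, joints, 3) in the same length unit.
    root: index of the root joint (assumed here to be the pelvis).
    The root joint itself is included in the mean and always contributes zero.
    """
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # Subtract the root position so only relative joint positions are compared.
    pred_rel = pred - pred[:, root:root + 1, :]
    truth_rel = truth - truth[:, root:root + 1, :]
    # Euclidean distance per joint, averaged over frames and joints.
    return np.linalg.norm(pred_rel - truth_rel, axis=-1).mean()

# Toy data: one frame, three joints, one joint displaced by a 3-4-5 offset.
truth = np.zeros((1, 3, 3))
pred = np.zeros((1, 3, 3))
pred[0, 1] = [3.0, 4.0, 0.0]
error = mpjpe(pred, truth)
# A rigid translation of the whole prediction leaves the metric unchanged.
shifted_error = mpjpe(pred + 10.0, truth)
```

Because positions are taken relative to the root, the metric ignores global translation of the body and measures only pose error.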

arXiv:2404.06055 [pdf, ps, other]

Online/Offline Learning to Enable Robust Beamforming: Limited Feedback Meets Deep Generative Models

Authors: Ying Li, Zhidi Lin, Kai Li, Michael Minyi Zhang

Abstract: Robust beamforming is a pivotal technique in massive multiple-input multiple-output (MIMO) systems as it mitigates interference among user equipment (UE). One current risk-neutral approach to robust beamforming is the stochastic weighted minimum mean square error method (WMMSE). However, this method necessitates statistical channel information, which is typically inaccessible, particularly in fifth-generation new radio frequency division duplex cellular systems with limited feedback. To tackle this challenge, we propose a novel approach that leverages a channel variational auto-encoder (CVAE) to simulate channel behaviors using limited feedback, eliminating the need for specific distribution assumptions present in existing methods. To seamlessly integrate model learning into practical wireless communication systems, this paper introduces two learning strategies to prepare the CVAE model for practical deployment. Firstly, motivated by the digital twin technology, we advocate employing a high-performance channel simulator to generate training data, enabling pretraining of the proposed CVAE while ensuring non-disruption to the practical wireless communication system. Moreover, we present an alternative online method for CVAE learning, where online training data is sourced based on channel estimations using Type II codebook. Numerical results demonstrate the effectiveness of these strategies, highlighting their exceptional performance in channel generation and robust beamforming applications.

Submitted 9 April, 2024; originally announced April 2024.

arXiv:2404.01697 [pdf, other]

Preventing Model Collapse in Gaussian Process Latent Variable Models

Authors: Ying Li, Zhidi Lin, Feng Yin, Michael Minyi Zhang

Abstract: Gaussian process latent variable models (GPLVMs) are a versatile family of unsupervised learning models commonly used for dimensionality reduction. However, common challenges in modeling data with GPLVMs include inadequate kernel flexibility and improper selection of the projection noise, leading to a type of model collapse characterized by vague latent representations that do not reflect the underlying data structure. This paper addresses these issues by, first, theoretically examining the impact of projection variance on model collapse through the lens of a linear GPLVM. Second, we tackle model collapse due to inadequate kernel flexibility by integrating the spectral mixture (SM) kernel and a differentiable random Fourier feature (RFF) kernel approximation, which ensures computational scalability and efficiency through off-the-shelf automatic differentiation tools for learning the kernel hyperparameters, projection variance, and latent representations within the variational inference framework. The proposed GPLVM, named advisedRFLVM, is evaluated across diverse datasets and consistently outperforms various salient competing models, including state-of-the-art variational autoencoders (VAEs) and other GPLVM variants, in terms of informative latent representations and missing data imputation.

Submitted 18 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: International Conference on Machine Learning (ICML), 2024

arXiv:2311.10377 [pdf, other]

doi 10.1109/ICRA48891.2023.10160447

From Concept to Field Tests: Accelerated Development of Multi-AUV Missions Using a High-Fidelity Faster-than-Real-Time Simulator

Authors: Timothy R. Player, Arjo Chakravarty, Mabel M. Zhang, Ben Yair Raanan, Brian Kieft, Yanwu Zhang, Brett Hobson

Abstract: We designed and validated a novel simulator for efficient development of multi-robot marine missions. To accelerate development of cooperative behaviors, the simulator models the robots' operating conditions with moderately high fidelity and runs significantly faster than real time, including acoustic communications, dynamic environmental data, and high-resolution bathymetry in large worlds. The simulator's ability to exceed a real-time factor (RTF) of 100 has been stress-tested with a robust continuous integration suite and was used to develop a multi-robot field experiment.

Submitted 17 November, 2023; originally announced November 2023.

Journal ref: IEEE International Conference on Robotics and Automation (ICRA), London, United Kingdom, 2023, pp. 3102-3108

arXiv:2311.00564 [pdf, other]

Online Student-$t$ Processes with an Overall-local Scale Structure for Modelling Non-stationary Data

Authors: Taole Sha, Michael Minyi Zhang

Abstract: Time-dependent data often exhibit characteristics, such as non-stationarity and heavy-tailed errors, that would be inappropriate to model with the typical assumptions used in popular models. Thus, more flexible approaches are required to be able to accommodate such issues. To this end, we propose a Bayesian mixture of student-$t$ processes with an overall-local scale structure for the covariance. Moreover, we use a sequential Monte Carlo (SMC) sampler in order to perform online inference as data arrive in real-time. We demonstrate the superiority of our proposed approach compared to typical Gaussian process-based models on real-world data sets in order to prove the necessity of using mixtures of student-$t$ processes.

Submitted 1 November, 2023; originally announced November 2023.

Comments: 9 pages, 5 figures

MSC Class: 62F15

arXiv:2308.14048 [pdf, other]

A Bayesian Non-parametric Approach to Generative Models: Integrating Variational Autoencoder and Generative Adversarial Networks using Wasserstein and Maximum Mean Discrepancy

Authors: Forough Fazeli-Asl, Michael Minyi Zhang

Abstract: Generative models have emerged as a promising technique for producing high-quality images that are indistinguishable from real images. Generative adversarial networks (GANs) and variational autoencoders (VAEs) are two of the most prominent and widely studied generative models. GANs have demonstrated excellent performance in generating sharp realistic images and VAEs have shown strong abilities to generate diverse images. However, GANs suffer from ignoring a large portion of the possible output space which does not represent the full diversity of the target distribution, and VAEs tend to produce blurry images. To fully capitalize on the strengths of both models while mitigating their weaknesses, we employ a Bayesian non-parametric (BNP) approach to merge GANs and VAEs. Our procedure incorporates both Wasserstein and maximum mean discrepancy (MMD) measures in the loss function to enable effective learning of the latent space and generate diverse and high-quality samples. By fusing the discriminative power of GANs with the reconstruction capabilities of VAEs, our novel model achieves superior performance in various generative tasks, such as anomaly detection and data augmentation. Furthermore, we enhance the model's capability by employing an extra generator in the code space, which enables us to explore areas of the code space that the VAE might have overlooked. With a BNP perspective, we can model the data distribution using an infinite-dimensional space, which provides greater flexibility in the model and reduces the risk of overfitting. By utilizing this framework, we can enhance the performance of both GANs and VAEs to create a more robust generative model suitable for various applications.

Submitted 27 August, 2023; originally announced August 2023.

arXiv:2307.06356 [pdf, other]

doi 10.3847/1538-3881/acd935

Transmission spectroscopy of the lowest-density gas giant: metals and a potential extended outflow in HAT-P-67b

Authors: Aaron Bello-Arufe, Heather A. Knutson, João M. Mendonça, Michael M. Zhang, Samuel H. C. Cabot, Alexander D. Rathcke, Ana Ulla, Shreyas Vissapragada, Lars A. Buchhave

Abstract: Extremely low-density exoplanets are tantalizing targets for atmospheric characterization because of their promisingly large signals in transmission spectroscopy. We present the first analysis of the atmosphere of the lowest-density gas giant currently known, HAT-P-67 b. This inflated Saturn-mass exoplanet sits at the boundary between hot and ultrahot gas giants, where thermal dissociation of molecules begins to dominate atmospheric composition. We observed a transit of HAT-P-67 b at high spectral resolution with CARMENES and searched for atomic and molecular species using cross-correlation and likelihood mapping. Furthermore, we explored potential atmospheric escape by targeting H$α$ and the metastable helium line. We detect Ca II and Na I with significances of 13.2$σ$ and 4.6$σ$, respectively. Unlike in several ultrahot Jupiters, we do not measure a day-to-night wind. The large line depths of Ca II suggest that the upper atmosphere may be more ionized than models predict. We detect strong variability in H$α$ and the helium triplet during the observations. These signals suggest the possible presence of an extended planetary outflow that causes an early ingress and late egress. In the averaged transmission spectrum, we measure redshifted absorption at the $\sim 3.8\%$ and $\sim 4.5\%$ level in the H$α$ and He I triplet lines, respectively. From an isothermal Parker wind model, we derive a mass loss rate of $\dot{M} \sim 10^{13}~\rm{g/s}$ and an outflow temperature of $T \sim 9900~\rm{K}$. However, due to the lack of a longer out-of-transit baseline in our data, additional observations are needed to rule out stellar variability as the source of the H$α$ and He signals.

Submitted 12 July, 2023; originally announced July 2023.

Comments: The Astronomical Journal, in press. 17 pages, 9 figures

arXiv:2306.08352 [pdf, other]

Bayesian Non-linear Latent Variable Modeling via Random Fourier Features

Authors: Michael Minyi Zhang, Gregory W. Gundersen, Barbara E. Engelhardt

Abstract: The Gaussian process latent variable model (GPLVM) is a popular probabilistic method used for nonlinear dimension reduction, matrix factorization, and state-space modeling. Inference for GPLVMs is computationally tractable only when the data likelihood is Gaussian. Moreover, inference for GPLVMs has typically been restricted to obtaining maximum a posteriori point estimates, which can lead to overfitting, or variational approximations, which mischaracterize the posterior uncertainty. Here, we present a method to perform Markov chain Monte Carlo (MCMC) inference for generalized Bayesian nonlinear latent variable modeling. The crucial insight necessary to generalize GPLVMs to arbitrary observation models is that we approximate the kernel function in the Gaussian process mappings with random Fourier features; this allows us to compute the gradient of the posterior in closed form with respect to the latent variables. We show that we can generalize GPLVMs to non-Gaussian observations, such as Poisson, negative binomial, and multinomial distributions, using our random feature latent variable model (RFLVM). Our generalized RFLVMs perform on par with state-of-the-art latent variable models on a wide range of applications, including motion capture, images, and text data for the purpose of estimating the latent structure and imputing the missing data of these complex data sets.

Submitted 14 June, 2023; originally announced June 2023.
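
Note: the abstract above relies on approximating a kernel function with random Fourier features. The sketch below illustrates the standard random Fourier feature approximation of a squared-exponential (RBF) kernel. It is not the RFLVM model or the authors' code; the function names, the RBF kernel choice, the lengthscale, and the feature count are assumptions made for illustration.

```python
import numpy as np

def rff_features(X, n_features, lengthscale, rng):
    """Map X (n, d) to random Fourier features whose inner products
    approximate an RBF kernel with the given lengthscale."""
    d = X.shape[1]
    # Frequencies drawn from the spectral density of the RBF kernel.
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_features))
    # Random phases, uniform on [0, 2*pi).
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def rbf_kernel(X, lengthscale):
    """Exact RBF kernel matrix, used here only as a reference."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
Z = rff_features(X, n_features=5000, lengthscale=1.0, rng=rng)
# Z @ Z.T is linear in the features yet approximates the nonlinear kernel.
max_err = np.max(np.abs(Z @ Z.T - rbf_kernel(X, lengthscale=1.0)))
```

Since the approximate kernel is an inner product of explicit features, a model built on it is linear in those features, which is what makes gradients with respect to the inputs straightforward to write down.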

arXiv:2303.02637 [pdf, other]

A Semi-Bayesian Nonparametric Estimator of the Maximum Mean Discrepancy Measure: Applications in Goodness-of-Fit Testing and Generative Adversarial Networks

Authors: Forough Fazeli-Asl, Michael Minyi Zhang, Lizhen Lin

Abstract: A classic inferential statistical problem is the goodness-of-fit (GOF) test. Such a test can be challenging when the hypothesized parametric model has an intractable likelihood and its distributional form is not available. Bayesian methods for GOF can be appealing due to their ability to incorporate expert knowledge through prior distributions. However, standard Bayesian methods for this test often require strong distributional assumptions on the data and their relevant parameters. To address this issue, we propose a semi-Bayesian nonparametric (semi-BNP) procedure in the context of the maximum mean discrepancy (MMD) measure that can be applied to the GOF test. Our method introduces a novel Bayesian estimator for the MMD, enabling the development of a measure-based hypothesis test for intractable models. Through extensive experiments, we demonstrate that our proposed test outperforms frequentist MMD-based methods by achieving a lower false rejection and acceptance rate of the null hypothesis. Furthermore, we showcase the versatility of our approach by embedding the proposed estimator within a generative adversarial network (GAN) framework. It facilitates a robust BNP learning approach as another significant application of our method. With our BNP procedure, this new GAN approach can enhance sample diversity and improve inferential accuracy compared to traditional techniques.

Submitted 10 November, 2023; v1 submitted 5 March, 2023; originally announced March 2023.

Comments: Typos corrected, Secondary (simulation and theoretical) results added, Additional discussion added, references added
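
Note: for orientation, the sketch below computes the ordinary frequentist (biased, V-statistic) estimate of the squared MMD with an RBF kernel, which is the kind of baseline the abstract above compares against. It is not the semi-BNP estimator proposed in the paper; the kernel, bandwidth, and sample sizes are assumptions chosen for illustration.

```python
import numpy as np

def rbf(A, B, bandwidth):
    """RBF kernel matrix between the rows of A (n, d) and B (m, d)."""
    sq_dists = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))

def mmd2_biased(X, Y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between samples X and Y."""
    return (rbf(X, X, bandwidth).mean()
            + rbf(Y, Y, bandwidth).mean()
            - 2.0 * rbf(X, Y, bandwidth).mean())

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
Y_same = rng.normal(size=(200, 2))              # same distribution as X
Y_shifted = rng.normal(loc=2.0, size=(200, 2))  # mean shifted by 2 per axis

mmd_same = mmd2_biased(X, Y_same)
mmd_shifted = mmd2_biased(X, Y_shifted)
```

A goodness-of-fit test built on this quantity rejects the hypothesized model when the estimate is large relative to its null distribution; the toy data give a much larger value for the shifted sample than for the matching one.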

arXiv:2209.02862 [pdf, other]

DAVE Aquatic Virtual Environment: Toward a General Underwater Robotics Simulator

Authors: Mabel M. Zhang, Woen-Sug Choi, Jessica Herman, Duane Davis, Carson Vogt, Michael McCarrin, Yadunund Vijay, Dharini Dutia, William Lew, Steven Peters, Brian Bingham

Abstract: We present DAVE Aquatic Virtual Environment (DAVE), an open source simulation stack for underwater robots, sensors, and environments. Conventional robotics simulators are not designed to address unique challenges that come with the marine environment, including but not limited to environment conditions that vary spatially and temporally, impaired or challenging perception, and the unavailability of data in a generally unexplored environment. Given the variety of sensors and platforms, wheels are often reinvented for specific use cases that inevitably resist wider adoption. Building on existing simulators, we provide a framework to help speed up the development and evaluation of algorithms that would otherwise require expensive and time-consuming operations at sea. The framework includes basic building blocks (e.g., new vehicles, water-tracking Doppler Velocity Logger, physics-based multibeam sonar) as well as development tools (e.g., dynamic bathymetry spawning, ocean currents), which allows the user to focus on methodology rather than software infrastructure. We demonstrate usage through example scenarios, bathymetric data import, user interfaces for data inspection and motion planning for manipulation, and visualizations.

Submitted 6 September, 2022; originally announced September 2022.

Comments: Accepted to IEEE/OES Autonomous Underwater Vehicles Symposium (AUV) 2022

arXiv:2205.09909 [pdf, other]

Sparse Infinite Random Feature Latent Variable Modeling

Authors: Michael Minyi Zhang

Abstract: We propose a non-linear, Bayesian non-parametric latent variable model where the latent space is assumed to be sparse and infinite dimensional a priori using an Indian buffet process prior. A posteriori, the number of instantiated dimensions in the latent space is guaranteed to be finite. The purpose of placing the Indian buffet process on the latent variables is to: 1.) Automatically and probabilistically select the number of latent dimensions. 2.) Impose sparsity in the latent space, where the Indian buffet process will select which elements are exactly zero. Our proposed model allows for sparse, non-linear latent variable modeling where the number of latent dimensions is selected automatically. Inference is made tractable using the random Fourier approximation and we can easily implement posterior inference through Markov chain Monte Carlo sampling. This approach is amenable to many observation models beyond the Gaussian setting. We demonstrate the utility of our method on a variety of synthetic, biological and text datasets and show that we can obtain superior test set performance compared to previous latent variable models.

Submitted 26 May, 2022; v1 submitted 19 May, 2022; originally announced May 2022.

arXiv:2110.13587 [pdf, other]

Arbitrary Distribution Modeling with Censorship in Real-Time Bidding Advertising

Authors: Xu Li, Michelle Ma Zhang, Youjun Tong, Zhenya Wang

Abstract: The purpose of Inventory Pricing is to bid the right prices to online ad opportunities, which is crucial for a Demand-Side Platform (DSP) to win advertising auctions in Real-Time Bidding (RTB). In the planning stage, advertisers need the forecast of probabilistic models to make bidding decisions. However, most of the previous works made strong assumptions on the distribution form of the winning price, which reduced their accuracy and weakened their ability to make generalizations. Though some works recently tried to fit the distribution directly, their complex structure lacked efficiency on online inference. In this paper, we devise a novel loss function, Neighborhood Likelihood Loss (NLL), collaborating with a proposed framework, Arbitrary Distribution Modeling (ADM), to predict the winning price distribution under censorship with no pre-assumption required. We conducted experiments on two real-world experimental datasets and one large-scale, non-simulated production dataset in our system. Experiments showed that ADM outperformed the baselines both on algorithm and business metrics. By replaying historical data of the production environment, this method was shown to lead to good yield in our system. Without any pre-assumed specific distribution form, ADM showed significant advantages in effectiveness and efficiency, demonstrating its great capability in modeling sophisticated price landscapes.

Submitted 26 October, 2021; originally announced October 2021.

arXiv:2101.06023 [pdf, other]

doi 10.1103/PhysRevLett.126.152502

New $α$-Emitting Isotope $^{214}$U and Abnormal Enhancement of $α$-Particle Clustering in Lightest Uranium Isotopes

Authors: Z. Y. Zhang, H. B. Yang, M. H. Huang, Z. G. Gan, C. X. Yuan, C. Qi, A. N. Andreyev, M. L. Liu, L. Ma, M. M. Zhang, Y. L. Tian, Y. S. Wang, J. G. Wang, C. L. Yang, G. S. Li, Y. H. Qiang, W. Q. Yang, R. F. Chen, H. B. Zhang, Z. W. Lu, X. X. Xu, L. M. Duan, H. R. Yang, W. X. Huang, Z. Liu, et al. (17 additional authors not shown)

Abstract: A new $α$-emitting isotope $^{214}$U, produced by fusion-evaporation reaction $^{182}$W($^{36}$Ar, 4n)$^{214}$U, was identified by employing the gas-filled recoil separator SHANS and recoil-$α$ correlation technique. More precise $α$-decay properties of even-even nuclei $^{216,218}$U were also measured in reactions of $^{40}$Ar, $^{40}$Ca with $^{180, 182, 184}$W targets. By combining the experimental data, improved $α$-decay reduced widths $δ^2$ for the even-even Po--Pu nuclei in the vicinity of magic neutron number $N=126$ were deduced. Their systematic trends are discussed in terms of $N_{p}N_{n}$ scheme in order to study the influence of proton-neutron interaction on $α$ decay in this region of nuclei. It is strikingly found that the reduced widths of $^{214,216}$U are significantly enhanced by a factor of two as compared with the $N_{p}N_{n}$ systematics for the $84 \leq Z \leq 90$ and $N<126$ even-even nuclei. The abnormal enhancement is interpreted by the strong monopole interaction between the valence protons and neutrons occupying the $π1f_{7/2}$ and $ν1f_{5/2}$ spin-orbit partner orbits, which is supported by a large-scale shell model calculation.

Submitted 15 January, 2021; originally announced January 2021.

Journal ref: Phys. Rev. Lett. 126, 152502 (2021)

arXiv:2010.08908 [pdf, other]

Accelerated Algorithms for Convex and Non-Convex Optimization on Manifolds

Authors: Lizhen Lin, Bayan Saparbayeva, Michael Minyi Zhang, David B. Dunson

Abstract: We propose a general scheme for solving convex and non-convex optimization problems on manifolds. The central idea is that, by adding a multiple of the squared retraction distance to the objective function in question, we "convexify" the objective function and solve a series of convex sub-problems in the optimization procedure. One of the key challenges for optimization on manifolds is the difficulty of verifying the complexity of the objective function, e.g., whether the objective function is convex or non-convex, and the degree of non-convexity. Our proposed algorithm adapts to the level of complexity in the objective function. We show that when the objective function is convex, the algorithm provably converges to the optimum and leads to accelerated convergence. When the objective function is non-convex, the algorithm will converge to a stationary point. Our proposed method unifies insights from Nesterov's original idea for accelerating gradient descent algorithms with recent developments in optimization algorithms in Euclidean space. We demonstrate the utility of our algorithms on several manifold optimization tasks such as estimating intrinsic and extrinsic Fréchet means on spheres and low-rank matrix factorization with Grassmann manifolds applied to the Netflix rating data set.

Submitted 17 October, 2020; originally announced October 2020.

arXiv:2006.11145 [pdf, other]

Latent variable modeling with random features

Authors: Gregory W. Gundersen, Michael Minyi Zhang, Barbara E. Engelhardt

Abstract: Gaussian process-based latent variable models are flexible and theoretically grounded tools for nonlinear dimension reduction, but generalizing to non-Gaussian data likelihoods within this nonlinear framework is statistically challenging. Here, we use random features to develop a family of nonlinear dimension reduction models that are easily extensible to non-Gaussian data likelihoods; we call these random feature latent variable models (RFLVMs). By approximating a nonlinear relationship between the latent space and the observations with a function that is linear with respect to random features, we induce closed-form gradients of the posterior distribution with respect to the latent variable. This allows the RFLVM framework to support computationally tractable nonlinear latent variable models for a variety of data likelihoods in the exponential family without specialized derivations. Our generalized RFLVMs produce results comparable with other state-of-the-art dimension reduction methods on diverse types of data, including neural spike train recordings, images, and text data.

Submitted 19 June, 2020; originally announced June 2020.

Comments: 21 pages, 7 figures

arXiv:2001.05591 [pdf, other]

Distributed, partially collapsed MCMC for Bayesian Nonparametrics

Authors: Avinava Dubey, Michael Minyi Zhang, Eric P. Xing, Sinead A. Williamson

Abstract: Bayesian nonparametric (BNP) models provide elegant methods for discovering underlying latent features within a data set, but inference in such models can be slow. We exploit the fact that completely random measures, which commonly used models like the Dirichlet process and the beta-Bernoulli process can be expressed as, are decomposable into independent sub-measures. We use this decomposition to partition the latent measure into a finite measure containing only instantiated components, and an infinite measure containing all other components. We then select different inference algorithms for the two components: uncollapsed samplers mix well on the finite measure, while collapsed samplers mix well on the infinite, sparsely occupied tail. The resulting hybrid algorithm can be applied to a wide class of models, and can be easily distributed to allow scalable inference without sacrificing asymptotic convergence guarantees.

Submitted 4 March, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

Comments: To appear in the 23rd International Conference on Artificial Intelligence and Statistics

Journal ref: Artificial Intelligence and Statistics, 108:3685-3695, 2020

arXiv:1911.09131 [pdf, other]

doi 10.3847/2041-8213/ab5c16

The Orbit of WASP-12b is Decaying

Authors: Samuel W. Yee, Joshua N. Winn, Heather A. Knutson, Kishore C. Patra, Shreyas Vissapragada, Michael M. Zhang, Matthew J. Holman, Avi Shporer, Jason T. Wright

Abstract: WASP-12b is a transiting hot Jupiter on a 1.09-day orbit around a late-F star. Since the planet's discovery in 2008, the time interval between transits has been decreasing by $29\pm 2$ msec year$^{-1}$. This is a possible sign of orbital decay, although the previously available data left open the possibility that the planet's orbit is slightly eccentric and is undergoing apsidal precession. Here, we present new transit and occultation observations that provide more decisive evidence for orbital decay, which is favored over apsidal precession by a $\Delta\mathrm{BIC}$ of 22.3 or Bayes factor of 70,000. We also present new radial-velocity data that rule out the Rømer effect as the cause of the period change. This makes WASP-12 the first planetary system for which we can be confident that the orbit is decaying. The decay timescale for the orbit is $P/\dot{P} = 3.25\pm 0.23$ Myr. Interpreting the decay as the result of tidal dissipation, the modified stellar tidal quality factor is $Q'_\star = 1.8 \times10^{5}$.

Submitted 20 November, 2019; originally announced November 2019.

Comments: 16 pages, 6 tables, 5 figures, accepted to AJ
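
The decay timescale quoted in the WASP-12b abstract can be sanity-checked from the two rounded numbers it gives, the 1.09-day period and the 29 msec per year decrease in the transit interval. The short sketch below does only that arithmetic. The function name `decay_timescale_myr` is hypothetical, and the paper's own value comes from a full timing fit rather than this back-of-the-envelope division.

```python
SECONDS_PER_DAY = 86_400.0


def decay_timescale_myr(period_days, period_change_ms_per_year):
    """Return |P / Pdot| in Myr from the period and its rate of change."""
    period_seconds = period_days * SECONDS_PER_DAY
    change_seconds_per_year = period_change_ms_per_year * 1e-3
    years = period_seconds / change_seconds_per_year
    return years / 1e6


# Rounded values taken from the abstract (magnitude of the decrease only).
timescale = decay_timescale_myr(period_days=1.09, period_change_ms_per_year=29.0)
print(f"{timescale:.2f} Myr")  # prints "3.25 Myr"
```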

arXiv:1910.06569 [pdf, other]

doi 10.1109/LSP.2019.2944005

Probabilistic Time of Arrival Localization

Authors: Fernando Perez-Cruz, Pablo M. Olmos, Michael Minyi Zhang, Howard Huang

Abstract: In this paper, we take a new approach for time of arrival geo-localization. We show that the main sources of error in metropolitan areas are due to environmental imperfections that bias our solutions, and that we can rely on a probabilistic model to learn and compensate for them. The resulting localization error is validated using measurements from a live LTE cellular network to be less than 10 meters, representing an order-of-magnitude improvement.

Submitted 15 October, 2019; originally announced October 2019.

Comments: IEEE Signal Processing Letters, 2019

arXiv:1905.10003 [pdf, other]

doi 10.1109/TSP.2023.3267992

Sequential Gaussian Processes for Online Learning of Nonstationary Functions

Authors: Michael Minyi Zhang, Bianca Dumitrascu, Sinead A. Williamson, Barbara E. Engelhardt

Abstract: Many machine learning problems can be framed in the context of estimating functions, and often these are time-dependent functions that are estimated in real-time as observations arrive. Gaussian processes (GPs) are an attractive choice for modeling real-valued nonlinear functions due to their flexibility and uncertainty quantification. However, the typical GP regression model suffers from several drawbacks: 1) Conventional GP inference scales $O(N^{3})$ with respect to the number of observations; 2) Updating a GP model sequentially is not trivial; and 3) Covariance kernels typically enforce stationarity constraints on the function, while GPs with non-stationary covariance kernels are often intractable to use in practice. To overcome these issues, we propose a sequential Monte Carlo algorithm to fit infinite mixtures of GPs that capture non-stationary behavior while allowing for online, distributed inference. Our approach empirically improves performance over state-of-the-art methods for online GP estimation in the presence of non-stationarity in time-series data. To demonstrate the utility of our proposed online Gaussian process mixture-of-experts approach in applied settings, we show that we can successfully implement an optimization algorithm using online Gaussian process bandits.

Submitted 6 May, 2023; v1 submitted 23 May, 2019; originally announced May 2019.

Journal ref: IEEE Transactions on Signal Processing, vol. 71, pp. 1539-1550, 2023

arXiv:1904.08548 [pdf, other]

A New Class of Time Dependent Latent Factor Models with Applications

Authors: Sinead A. Williamson, Michael Minyi Zhang, Paul Damien

Abstract: In many applications, observed data are influenced by some combination of latent causes. For example, suppose sensors are placed inside a building to record responses such as temperature, humidity, power consumption and noise levels. These random, observed responses are typically affected by many unobserved, latent factors (or features) within the building such as the number of individuals, the turning on and off of electrical devices, power surges, etc. These latent factors are usually present for a contiguous period of time before disappearing; further, multiple factors could be present at a time. This paper develops new probabilistic methodology and inference methods for random object generation influenced by latent features exhibiting temporal persistence. Every datum is associated with subsets of a potentially infinite number of hidden, persistent features that account for temporal dynamics in an observation. The ensuing class of dynamic models constructed by adapting the Indian Buffet Process --- a probability measure on the space of random, unbounded binary matrices --- finds use in a variety of applications arising in operations, signal processing, biomedicine, marketing, image analysis, etc. Illustrations using synthetic and real data are provided.

Submitted 17 April, 2019; originally announced April 2019.

Journal ref: Journal of Machine Learning Research 21(27):1-24, 2020

arXiv:1810.11155 [pdf, other]

Communication Efficient Parallel Algorithms for Optimization on Manifolds

Authors: Bayan Saparbayeva, Michael Minyi Zhang, Lizhen Lin

Abstract: The last decade has witnessed an explosion in the development of models, theory and computational algorithms for "big data" analysis. In particular, distributed computing has served as a natural and dominating paradigm for statistical inference. However, the existing literature on parallel inference almost exclusively focuses on Euclidean data and parameters. While this assumption is valid for many applications, it is increasingly more common to encounter problems where the data or the parameters lie on a non-Euclidean space, like a manifold for example. Our work aims to fill a critical gap in the literature by generalizing parallel inference algorithms to optimization on manifolds. We show that our proposed algorithm is both communication efficient and carries theoretical convergence guarantees. In addition, we demonstrate the performance of our algorithm to the estimation of Fréchet means on simulated spherical data and the low-rank matrix completion problem over Grassmann manifolds applied to the Netflix prize data set.

Submitted 1 November, 2018; v1 submitted 25 October, 2018; originally announced October 2018.

Comments: 15 pages

arXiv:1804.10899 [pdf, ps, other]

Scalable Angular Discriminative Deep Metric Learning for Face Recognition

Authors: Bowen Wu, Huaming Wu, Monica M. Y. Zhang

Abstract: With the development of deep learning, Deep Metric Learning (DML) has achieved great improvements in face recognition. Specifically, the widely used softmax loss in the training process often brings large intra-class variations, and feature normalization is only exploited in the testing process to compute the pair similarities. To bridge the gap, we impose the intra-class cosine similarity between the features and weight vectors in softmax loss larger than a margin in the training step, and extend it from four aspects. First, we explore the effect of a hard sample mining strategy. To alleviate the human labor of adjusting the margin hyper-parameter, a self-adaptive margin updating strategy is proposed. Then, a normalized version is given to take full advantage of the cosine similarity constraint. Furthermore, we enhance the former constraint to force the intra-class cosine similarity larger than the mean inter-class cosine similarity with a margin in the exponential feature projection space. Extensive experiments on Labeled Faces in the Wild (LFW), YouTube Faces (YTF) and IARPA Janus Benchmark A (IJB-A) datasets demonstrate that the proposed methods outperform the mainstream DML methods and approach the state-of-the-art performance.

Submitted 30 April, 2018; v1 submitted 29 April, 2018; originally announced April 2018.

arXiv:1801.04475 [pdf, ps, other]

doi 10.1080/00927872.2015.1065844

On fully residually-$\mathcal{R}$ groups

Authors: Inna Bumagin, Ming Ming Zhang

Abstract: We consider the class $\mathcal{R}$ of finitely generated toral relatively hyperbolic groups. We show that groups from $\mathcal{R}$ are commutative transitive and generalize a theorem proved by Benjamin Baumslag to this class. We also discuss two definitions of (fully) residually-$\mathcal{C}$ groups and prove the equivalence of the two definitions for $\mathcal{C}=\mathcal{R}$. This is a generalization of the similar result obtained by Ol'shanskii for $\mathcal{C}$ being the class of torsion-free hyperbolic groups. Let $\Gamma\in\mathcal{R}$ be non-abelian and non-elementary. We prove that every finitely generated fully residually-$\Gamma$ group embeds into a group from $\mathcal{R}$. On the other hand, we give an example of a finitely generated torsion-free fully residually-$\mathcal{H}$ group that does not embed into a group from $\mathcal{R}$; $\mathcal{H}$ is the class of hyperbolic groups.

Submitted 13 January, 2018; originally announced January 2018.

Comments: 13 pages

Journal ref: Communications in Algebra, 44 (2016) 2813-2827

arXiv:1705.07178 [pdf, other]

doi 10.1007/s11222-022-10108-z

Accelerated Parallel Non-conjugate Sampling for Bayesian Non-parametric Models

Authors: Michael Minyi Zhang, Sinead A. Williamson, Fernando Perez-Cruz

Abstract: Inference of latent feature models in the Bayesian nonparametric setting is generally difficult, especially in high dimensional settings, because it usually requires proposing features from some prior distribution. In special cases, where the integration is tractable, we can sample new feature assignments according to a predictive likelihood. We present a novel method to accelerate the mixing of latent variable model inference by proposing feature locations based on the data, as opposed to the prior. First, we introduce an accelerated feature proposal mechanism that we show is a valid MCMC algorithm for posterior inference. Next, we propose an approximate inference strategy to perform accelerated inference in parallel. A two-stage algorithm that combines the two approaches provides a computationally attractive method that can quickly reach local convergence to the posterior distribution of our model, while allowing us to exploit parallelization.

Submitted 29 April, 2022; v1 submitted 19 May, 2017; originally announced May 2017.

Comments: To appear in Statistics & Computing

Journal ref: Statistics and Computing, Vol. 32, Num. 50, 2022

arXiv:1703.03457 [pdf, other]

Parallel Markov Chain Monte Carlo for the Indian Buffet Process

Authors: Michael M. Zhang, Avinava Dubey, Sinead A. Williamson

Abstract: Indian Buffet Process based models are an elegant way for discovering underlying features within a data set, but inference in such models can be slow. Inferring underlying features using Markov chain Monte Carlo either relies on an uncollapsed representation, which leads to poor mixing, or on a collapsed representation, which leads to a quadratic increase in computational complexity. Existing attempts at distributing inference have introduced additional approximation within the inference procedure. In this paper we present a novel algorithm to perform asymptotically exact parallel Markov chain Monte Carlo inference for Indian Buffet Process models. We take advantage of the fact that the features are conditionally independent under the beta-Bernoulli process. Because of this conditional independence, we can partition the features into two parts: one part containing only the finitely many instantiated features and the other part containing the infinite tail of uninstantiated features. For the finite partition, parallel inference is simple given the instantiation of features. But for the infinite tail, performing uncollapsed MCMC leads to poor mixing and hence we collapse out the features. The resulting hybrid sampler, while being parallel, produces samples asymptotically from the true posterior.

Submitted 9 March, 2017; originally announced March 2017.

Comments: Workshop paper in Bayesian Nonparametrics: The Next Generation, NIPS 2015
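
For readers unfamiliar with the Indian Buffet Process named in the entry above, the sketch below draws a binary feature matrix from the standard IBP prior using the usual sequential "customers and dishes" construction. It illustrates only the prior, not the paper's parallel sampler. The function names, the concentration value `alpha=2.0`, and the small Poisson helper are assumptions made for this example.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ibp(num_rows, alpha, seed=0):
    """Draw a binary matrix from the IBP prior with concentration alpha."""
    rng = random.Random(seed)
    rows = []    # ragged rows, padded to a common width at the end
    counts = []  # counts[k] = number of earlier rows that use feature k
    for i in range(1, num_rows + 1):
        row = []
        # Reuse existing feature k with probability counts[k] / i.
        for k in range(len(counts)):
            take = 1 if rng.random() < counts[k] / i else 0
            row.append(take)
            counts[k] += take
        # Instantiate Poisson(alpha / i) brand-new features for this row.
        new_features = sample_poisson(alpha / i, rng)
        row.extend([1] * new_features)
        counts.extend([1] * new_features)
        rows.append(row)
    width = len(counts)
    return [row + [0] * (width - len(row)) for row in rows]


Z = sample_ibp(num_rows=10, alpha=2.0)
for row in Z:
    print("".join(str(v) for v in row))
```

The columns present in `Z` correspond to what the abstract calls instantiated features. The infinite tail is everything the loop has not yet created.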

arXiv:1703.00095 [pdf, other]

Active End-Effector Pose Selection for Tactile Object Recognition through Monte Carlo Tree Search

Authors: Mabel M. Zhang, Nikolay Atanasov, Kostas Daniilidis

Abstract: This paper considers the problem of active object recognition using touch only. The focus is on adaptively selecting a sequence of wrist poses that achieves accurate recognition by enclosure grasps. It seeks to minimize the number of touches and maximize recognition confidence. The actions are formulated as wrist poses relative to each other, making the algorithm independent of absolute workspace coordinates. The optimal sequence is approximated by Monte Carlo tree search. We demonstrate results in a physics engine and on a real robot. In the physics engine, most object instances were recognized in at most 16 grasps. On a real robot, our method recognized objects in 2--9 grasps and outperformed a greedy baseline.

Submitted 29 July, 2017; v1 submitted 28 February, 2017; originally announced March 2017.

Comments: Accepted to International Conference on Intelligent Robots and Systems (IROS) 2017

arXiv:1702.08420 [pdf, other]

Embarrassingly Parallel Inference for Gaussian Processes

Authors: Michael Minyi Zhang, Sinead A. Williamson

Abstract: Training Gaussian process-based models typically involves an $O(N^3)$ computational bottleneck due to inverting the covariance matrix. Popular methods for overcoming this matrix inversion problem cannot adequately model all types of latent functions, and are often not parallelizable. However, judicious choice of model structure can ameliorate this problem. A mixture-of-experts model that uses a mixture of $K$ Gaussian processes offers modeling flexibility and opportunities for scalable inference. Our embarrassingly parallel algorithm combines low-dimensional matrix inversions with importance sampling to yield a flexible, scalable mixture-of-experts model that offers comparable performance to Gaussian process regression at a much lower computational cost.

Submitted 3 March, 2020; v1 submitted 27 February, 2017; originally announced February 2017.

Journal ref: Journal of Machine Learning Research 20, no. 169 (2019): 1-26
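
A rough way to see why a mixture of $K$ Gaussian processes can be cheaper than a single one is to compare inversion costs. The comparison below assumes the $N$ observations are split evenly across the $K$ experts and ignores the cost of importance sampling. Both are simplifying assumptions made for this illustration, not statements from the paper.

```latex
\[
\underbrace{O\!\left(N^{3}\right)}_{\text{one GP on all } N \text{ points}}
\quad\text{versus}\quad
\underbrace{K \cdot O\!\left(\left(\tfrac{N}{K}\right)^{3}\right)
  = O\!\left(\tfrac{N^{3}}{K^{2}}\right)}_{K \text{ experts with } N/K \text{ points each}}
\]
```

Under those assumptions, $K = 10$ experts cut the inversion cost by a factor of about 100, and the $K$ inversions are independent, so they can run in parallel.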

arXiv:1610.06194 [pdf, other]

doi 10.1016/j.csda.2018.05.016

Robust and Parallel Bayesian Model Selection

Authors: Michael Minyi Zhang, Henry Lam, Lizhen Lin

Abstract: Effective and accurate model selection is an important problem in modern data analysis. One of the major challenges is the computational burden required to handle large data sets that cannot be stored or processed on one machine. Another challenge one may encounter is the presence of outliers and contaminations that damage the inference quality. The parallel "divide and conquer" model selection strategy divides the observations of the full data set into roughly equal subsets and performs inference and model selection independently on each subset. After local subset inference, this method aggregates the posterior model probabilities or other model/variable selection criteria to obtain a final model by using the notion of geometric median. This approach leads to improved concentration in finding the "correct" model and model parameters and also is provably robust to outliers and data contamination.

Submitted 22 March, 2018; v1 submitted 19 October, 2016; originally announced October 2016.

Journal ref: Computational Statistics & Data Analysis, Volume 127, 2018, Pages 229-247, ISSN 0167-9473
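
The aggregation step described in the abstract above combines per-subset posterior model probabilities through a geometric median. The sketch below shows why that choice resists a single contaminated subset. It uses the Weiszfeld fixed-point iteration, which is one common way to compute a geometric median and is an assumption of this example rather than a detail confirmed by the source. The probability vectors are made-up illustrative values.

```python
import math


def geometric_median(points, iterations=200, eps=1e-12):
    """Approximate the geometric median of equal-length vectors via Weiszfeld updates."""
    dim = len(points[0])
    # Start from the coordinate-wise mean.
    estimate = [sum(p[j] for p in points) / len(points) for j in range(dim)]
    for _ in range(iterations):
        # Inverse-distance weights; eps guards against division by zero.
        weights = [1.0 / max(math.dist(p, estimate), eps) for p in points]
        total = sum(weights)
        estimate = [
            sum(w * p[j] for w, p in zip(weights, points)) / total
            for j in range(dim)
        ]
    return estimate


# Hypothetical posterior probabilities for two candidate models from five subsets;
# the last subset is contaminated and strongly prefers the second model.
subset_probs = [[0.8, 0.2], [0.8, 0.2], [0.8, 0.2], [0.8, 0.2], [0.0, 1.0]]
median = geometric_median(subset_probs)
mean = [sum(p[j] for p in subset_probs) / len(subset_probs) for j in range(2)]
print("median:", [round(v, 3) for v in median])
print("mean:  ", [round(v, 3) for v in mean])
```

In this toy case the mean is pulled toward the contaminated subset, while the geometric median stays with the majority.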

Showing 1–28 of 28 results for author: Zhang, M M