Showing 1–16 of 16 results for author: Wan, C

Searching in archive stat.
  1. arXiv:2202.04105  [pdf, other]

    cs.LG stat.ML

    Hierarchical Dependency Constrained Tree Augmented Naive Bayes Classifiers for Hierarchical Feature Spaces

    Authors: Cen Wan, Alex A. Freitas

    Abstract: The Tree Augmented Naive Bayes (TAN) classifier is a type of probabilistic graphical model that constructs a single-parent dependency tree to estimate the distribution of the data. In this work, we propose two novel Hierarchical dependency-based Tree Augmented Naive Bayes algorithms, i.e. Hie-TAN and Hie-TAN-Lite. Both methods exploit the pre-defined parent-child (generalisation-specialisation) re…

    Submitted 8 February, 2022; originally announced February 2022.

  2. arXiv:2112.05045  [pdf, other]

    stat.ME

    Multi-Kink Quantile Regression for Longitudinal Data with Application to the Progesterone Data Analysis

    Authors: Chuang Wan, Wei Zhong, Wenyang Zhang, Changliang Zou

    Abstract: Motivated by investigating the relationship between progesterone and the days in a menstrual cycle in a longitudinal study, we propose a multi-kink quantile regression model for longitudinal data analysis. It relaxes the linearity condition and assumes different regression forms in different regions of the domain of the threshold covariate. In this paper, we first propose a multi-kink quantile reg…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: 22 pages, 3 figures

  3. arXiv:2109.00539  [pdf, other]

    stat.ME cs.LG

    Spatially and Robustly Hybrid Mixture Regression Model for Inference of Spatial Dependence

    Authors: Wennan Chang, Pengtao Dang, Changlin Wan, Xiaoyu Lu, Yue Fang, Tong Zhao, Yong Zang, Bo Li, Chi Zhang, Sha Cao

    Abstract: In this paper, we propose a Spatial Robust Mixture Regression model to investigate the relationship between a response variable and a set of explanatory variables over the spatial domain, assuming that the relationships may exhibit complex spatially dynamic patterns that cannot be captured by constant regression coefficients. Our method integrates the robust finite mixture Gaussian regression mode…

    Submitted 28 September, 2021; v1 submitted 1 September, 2021; originally announced September 2021.

    Comments: Accepted by ICDM IEEE 2021

  4. arXiv:2009.02305  [pdf, other]

    stat.ME

    Composite Estimation for Quantile Regression Kink Models with Longitudinal Data

    Authors: Chuang Wan

    Abstract: Kink model is developed to analyze the data where the regression function is two-stage linear but intersects at an unknown threshold. In quantile regression with longitudinal data, previous work assumed that the unknown threshold parameters or kink points are heterogeneous across different quantiles. However, the location where kink effect happens tend to be the same across different quantiles, esp…

    Submitted 4 September, 2020; originally announced September 2020.

  5. arXiv:2008.06635  [pdf, other]

    cs.LG stat.ML

    Orthogonalized SGD and Nested Architectures for Anytime Neural Networks

    Authors: Chengcheng Wan, Henry Hoffmann, Shan Lu, Michael Maire

    Abstract: We propose a novel variant of SGD customized for training network architectures that support anytime behavior: such networks produce a series of increasingly accurate outputs over time. Efficient architectural designs for these networks focus on re-using internal state; subnetworks must produce representations relevant for both immediate prediction as well as refinement by subsequent network stage…

    Submitted 14 August, 2020; originally announced August 2020.

    Comments: ICML 2020

  6. arXiv:2007.15821  [pdf, other]

    cs.LG cs.CG stat.ML

    Geometric All-Way Boolean Tensor Decomposition

    Authors: Changlin Wan, Wennan Chang, Tong Zhao, Sha Cao, Chi Zhang

    Abstract: Boolean tensor has been broadly utilized in representing high dimensional logical data collected on spatial, temporal and/or other relational domains. Boolean Tensor Decomposition (BTD) factorizes a binary tensor into the Boolean sum of multiple rank-1 tensors, which is an NP-hard problem. Existing BTD methods have been limited by their high computational cost, in applications to large scale or hi…

    Submitted 26 October, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: NeurIPS 2020

  7. arXiv:2007.15816  [pdf, other]

    cs.LG stat.ML

    Denoising individual bias for a fairer binary submatrix detection

    Authors: Changlin Wan, Wennan Chang, Tong Zhao, Sha Cao, Chi Zhang

    Abstract: Low rank representation of binary matrix is powerful in disentangling sparse individual-attribute associations, and has received wide applications. Existing binary matrix factorization (BMF) or co-clustering (CC) methods often assume i.i.d background noise. However, this assumption could be easily violated in real data, where heterogeneous row- or column-wise probability of binary entries results…

    Submitted 9 August, 2020; v1 submitted 30 July, 2020; originally announced July 2020.

    Comments: Accepted at CIKM 2020

  8. arXiv:2007.09720  [pdf, ps, other]

    stat.ME cs.LG

    Supervised clustering of high dimensional data using regularized mixture modeling

    Authors: Wennan Chang, Changlin Wan, Yong Zang, Chi Zhang, Sha Cao

    Abstract: Identifying relationships between molecular variations and their clinical presentations has been challenged by the heterogeneous causes of a disease. It is imperative to unveil the relationship between the high dimensional molecular manifestations and the clinical presentations, while taking into account the possible heterogeneity of the study subjects. We proposed a novel supervised clustering al…

    Submitted 19 July, 2020; originally announced July 2020.

  9. arXiv:2006.09977  [pdf, other]

    cs.IR cs.CL cs.LG stat.ML

    A novel sentence embedding based topic detection method for micro-blog

    Authors: Cong Wan, Shan Jiang, Cuirong Wang, Cong Wang, Changming Xu, Xianxia Chen, Ying Yuan

    Abstract: Topic detection is a challenging task, especially without knowing the exact number of topics. In this paper, we present a novel approach based on neural network to detect topics in the micro-blogging dataset. We use an unsupervised neural sentence embedding model to map the blogs to an embedding space. Our model is a weighted power mean word embedding model, and the weights are calculated by atten…

    Submitted 10 June, 2020; originally announced June 2020.

  10. arXiv:2006.07924  [pdf, other]

    stat.ME

    Estimation and Inference for Multi-Kink Quantile Regression

    Authors: Wei Zhong, Chuang Wan, Wenyang Zhang

    Abstract: The Multi-Kink Quantile Regression (MKQR) model is an important tool for analyzing data with heterogeneous conditional distributions, especially when quantiles of response variable are of interest, due to its robustness to outliers and heavy-tailed errors in the response. It assumes different linear quantile regression forms in different regions of the domain of the threshold covariate but are sti…

    Submitted 14 June, 2020; originally announced June 2020.

    Comments: 39 pages, 4 figures

  11. arXiv:2003.05731  [pdf, other]

    cs.LG cs.DC cs.IR stat.ML

    SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection

    Authors: Yue Zhao, Xiyang Hu, Cheng Cheng, Cong Wang, Changlin Wan, Wen Wang, Jianing Yang, Haoping Bai, Zheng Li, Cao Xiao, Yunlong Wang, Zhi Qiao, Jimeng Sun, Leman Akoglu

    Abstract: Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples with numerous high-stake applications including fraud detection and intrusion detection. Due to the lack of ground truth labels, practitioners often have to build a large number of unsupervised, heterogeneous models (i.e., different algorithms with varying hyperparameters) for further c…

    Submitted 4 March, 2021; v1 submitted 10 March, 2020; originally announced March 2020.

    Comments: Proceedings of the 4th Conference on Machine Learning and Systems (MLSys). The code is available at http://github.com/yzhao062/SUOD. arXiv admin note: text overlap with arXiv:2002.03222

  12. arXiv:1909.03991  [pdf, other]

    cs.LG cs.CG stat.ML

    Fast And Efficient Boolean Matrix Factorization By Geometric Segmentation

    Authors: Changlin Wan, Wennan Chang, Tong Zhao, Mengya Li, Sha Cao, Chi Zhang

    Abstract: Boolean matrix has been used to represent digital information in many fields, including bank transaction, crime records, natural language processing, protein-protein interaction, etc. Boolean matrix factorization (BMF) aims to find an approximation of a binary matrix as the Boolean product of two low rank Boolean matrices, which could generate vast amount of information for the patterns of relatio…

    Submitted 10 February, 2020; v1 submitted 9 September, 2019; originally announced September 2019.

    Comments: Accepted at AAAI 2020

  13. arXiv:1908.07483  [pdf, other]

    cs.LG eess.SP stat.ML

    Sensor-Based Estimation of Dim Light Melatonin Onset (DLMO) Using Features of Two Time Scales

    Authors: Cheng Wan, Andrew W. McHill, Elizabeth Klerman, Akane Sano

    Abstract: Circadian rhythms influence multiple essential biological activities including sleep, performance, and mood. The dim light melatonin onset (DLMO) is the gold standard for measuring human circadian phase (i.e., timing). The collection of DLMO is expensive and time-consuming since multiple saliva or blood samples are required overnight in special conditions, and the samples must then be assayed for…

    Submitted 1 March, 2022; v1 submitted 20 August, 2019; originally announced August 2019.

    Comments: 16 pages, 6 figures, 4 tables, ACM Transactions on Computing for Healthcare

  14. arXiv:1904.07998  [pdf, other]

    cs.LG stat.ML

    SynC: A Unified Framework for Generating Synthetic Population with Gaussian Copula

    Authors: Colin Wan, Zheng Li, Alicia Guo, Yue Zhao

    Abstract: Synthetic population generation is the process of combining multiple socioeconomic and demographic datasets from different sources and/or granularity levels, and downscaling them to an individual level. Although it is a fundamental step for many data science tasks, an efficient and standard framework is absent. In this study, we propose a multi-stage framework called SynC (Synthetic Population via…

    Submitted 10 November, 2019; v1 submitted 16 April, 2019; originally announced April 2019.

  15. arXiv:1702.03613  [pdf]

    cs.LG stat.AP

    A Multi-model Combination Approach for Probabilistic Wind Power Forecasting

    Authors: You Lin, Ming Yang, Can Wan, Jianhui Wang, Yonghua Song

    Abstract: Short-term probabilistic wind power forecasting can provide critical quantified uncertainty information of wind generation for power system operation and control. As the complicated characteristics of wind power prediction error, it would be difficult to develop a universal forecasting model dominating over other alternative models. Therefore, a novel multi-model combination (MMC) approach for sho…

    Submitted 12 February, 2017; originally announced February 2017.

  16. arXiv:1211.2945  [pdf]

    stat.AP cs.NE

    The application of a perceptron model to classify an individual's response to a proposed loading dose regimen of Warfarin

    Authors: Cen Wan, Irina V. Biktasheva, Steven Lane

    Abstract: The dose regimen of Warfarin is separated into two phases. Firstly a loading dose is given, which is designed to bring the International Normalisation Ratio (INR) to within therapeutic range. Then a stable maintenance dose is given to maintain the INR within therapeutic range. In the United Kingdom (UK) the loading dose is usually given as three individual daily doses, the standard loading dose be…

    Submitted 13 November, 2012; originally announced November 2012.

    Comments: 12 pages, 5 figures, 1 table

    MSC Class: 68T05; 92C50

    ACM Class: I.2.1; I.5.1; I.5.2; J.3