Showing 1–34 of 34 results for author: Huo, X

Searching in archive cs.
  1. arXiv:2407.16682  [pdf, other]

    cs.CV

    SAM-CP: Marrying SAM with Composable Prompts for Versatile Segmentation

    Authors: Pengfei Chen, Lingxi Xie, Xinyue Huo, Xuehui Yu, Xiaopeng Zhang, Yingfei Sun, Zhenjun Han, Qi Tian

    Abstract: The Segment Anything model (SAM) has shown a generalized ability to group image pixels into patches, but applying it to semantic-aware segmentation still faces major challenges. This paper presents SAM-CP, a simple approach that establishes two types of composable prompts beyond SAM and composes them for versatile segmentation. Specifically, given a set of classes (in texts) and a set of SAM patch…

    Submitted 23 July, 2024; originally announced July 2024.

  2. arXiv:2406.14900  [pdf, other]

    cs.IR

    Decoding Matters: Addressing Amplification Bias and Homogeneity Issue for LLM-based Recommendation

    Authors: Keqin Bao, Jizhi Zhang, Yang Zhang, Xinyue Huo, Chong Chen, Fuli Feng

    Abstract: Adapting Large Language Models (LLMs) for recommendation requires careful consideration of the decoding process, given the inherent differences between generating items and natural language. Existing approaches often directly apply LLMs' original decoding methods. However, we find these methods encounter significant challenges: 1) amplification bias -- where standard length normalization inflates…

    Submitted 21 June, 2024; originally announced June 2024.

  3. arXiv:2405.14051  [pdf, ps, other]

    cs.LG math.ST

    A Concentration Inequality for Maximum Mean Discrepancy (MMD)-based Statistics and Its Application in Generative Models

    Authors: Yijin Ni, Xiaoming Huo

    Abstract: Maximum Mean Discrepancy (MMD) is a probability metric that has found numerous applications in machine learning. In this work, we focus on its application in generative models, including the minimum MMD estimator, Generative Moment Matching Network (GMMN), and Generative Adversarial Network (GAN). In these cases, MMD is part of an objective function in a minimization or min-max optimization proble…

    Submitted 22 May, 2024; originally announced May 2024.

  4. arXiv:2404.18214  [pdf, other]

    cs.IR cs.AI cs.HC

    Contrastive Learning Method for Sequential Recommendation based on Multi-Intention Disentanglement

    Authors: Zeyu Hu, Yuzhi Xiao, Tao Huang, Xuanrong Huo

    Abstract: Sequential recommendation is one of the important branches of recommender systems, aiming to achieve personalized recommended items for the future through the analysis and prediction of users' ordered historical interactive behaviors. However, along with the growth of the user volume and the increasingly rich behavioral information, how to understand and disentangle the user's interactive multi-int…

    Submitted 8 May, 2024; v1 submitted 28 April, 2024; originally announced April 2024.

  5. arXiv:2403.12187  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Approximation of RKHS Functionals by Neural Networks

    Authors: Tian-Yi Zhou, Namjoon Suh, Guang Cheng, Xiaoming Huo

    Abstract: Motivated by the abundance of functional data such as time series and images, there has been a growing interest in integrating such data into neural networks and learning maps from function spaces to R (i.e., functionals). In this paper, we study the approximation of functionals on reproducing kernel Hilbert spaces (RKHS's) using neural networks. We establish the universality of the approximation…

    Submitted 18 March, 2024; originally announced March 2024.

  6. arXiv:2402.15272  [pdf, other]

    cs.CV cs.AI

    EMIFF: Enhanced Multi-scale Image Feature Fusion for Vehicle-Infrastructure Cooperative 3D Object Detection

    Authors: Zhe Wang, Siqi Fan, Xiaoliang Huo, Tongda Xu, Yan Wang, Jingjing Liu, Yilun Chen, Ya-Qin Zhang

    Abstract: In autonomous driving, cooperative perception makes use of multi-view cameras from both vehicles and infrastructure, providing a global vantage point with rich semantic context of road conditions beyond a single vehicle viewpoint. Currently, two major challenges persist in vehicle-infrastructure cooperative 3D (VIC3D) object detection: 1) inherent pose errors when fusing multi-view images, cause…

    Submitted 23 February, 2024; originally announced February 2024.

    Comments: 7 pages, 8 figures. Accepted by ICRA 2024. arXiv admin note: text overlap with arXiv:2303.10975

  7. arXiv:2401.15262  [pdf, other]

    math.ST cs.LG stat.ME stat.ML

    Asymptotic Behavior of Adversarial Training Estimator under $\ell_\infty$-Perturbation

    Authors: Yiling Xie, Xiaoming Huo

    Abstract: Adversarial training has been proposed to hedge against adversarial attacks in machine learning and statistical models. This paper focuses on adversarial training under $\ell_\infty$-perturbation, which has recently attracted much research attention. The asymptotic behavior of the adversarial training estimator is investigated in the generalized linear model. The results imply that the limiting di…

    Submitted 26 January, 2024; originally announced January 2024.

  8. arXiv:2401.04286  [pdf, ps, other]

    stat.ML cs.LG

    Universal Consistency of Wide and Deep ReLU Neural Networks and Minimax Optimal Convergence Rates for Kolmogorov-Donoho Optimal Function Classes

    Authors: Hyunouk Ko, Xiaoming Huo

    Abstract: In this paper, we prove the universal consistency of wide and deep ReLU neural network classifiers trained on the logistic loss. We also give sufficient conditions for a class of probability measures for which classifiers based on neural networks achieve minimax optimal rates of convergence. The result applies to a wide range of known function classes. In particular, while most previous works impo…

    Submitted 30 January, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

  9. arXiv:2311.15111  [pdf, other]

    cs.CV

    UAE: Universal Anatomical Embedding on Multi-modality Medical Images

    Authors: Xiaoyu Bai, Fan Bai, Xiaofei Huo, Jia Ge, Jingjing Lu, Xianghua Ye, Ke Yan, Yong Xia

    Abstract: Identifying specific anatomical structures (e.g., lesions or landmarks) in medical images plays a fundamental role in medical image analysis. Exemplar-based landmark detection methods are receiving increasing attention since they can detect arbitrary anatomical points in inference while not needing landmark annotations in training. They use self-supervised learning to acquire a discrimina…

    Submitted 18 January, 2024; v1 submitted 25 November, 2023; originally announced November 2023.

  10. arXiv:2310.10767  [pdf, ps, other]

    cs.LG stat.ML

    Wide Neural Networks as Gaussian Processes: Lessons from Deep Equilibrium Models

    Authors: Tianxiang Gao, Xiaokai Huo, Hailiang Liu, Hongyang Gao

    Abstract: Neural networks with wide layers have attracted significant attention due to their equivalence to Gaussian processes, enabling perfect fitting of training data while maintaining generalization performance, known as benign overfitting. However, existing results mainly focus on shallow or finite-depth networks, necessitating a comprehensive analysis of wide neural networks with infinite-depth layers…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted by NeurIPS 2023

  11. arXiv:2309.15075  [pdf, other]

    stat.ML cs.LG math.ST

    On Excess Risk Convergence Rates of Neural Network Classifiers

    Authors: Hyunouk Ko, Namjoon Suh, Xiaoming Huo

    Abstract: The recent success of neural networks in pattern recognition and classification problems suggests that neural networks possess qualities distinct from other more classical classifiers such as SVMs or boosting classifiers. This paper studies the performance of plug-in classifiers based on neural networks in a binary classification setting as measured by their excess risks. Compared to the typical s…

    Submitted 26 September, 2023; originally announced September 2023.

  12. arXiv:2308.08030  [pdf, other]

    stat.ML cs.LG math.ST

    Classification of Data Generated by Gaussian Mixture Models Using Deep ReLU Networks

    Authors: Tian-Yi Zhou, Xiaoming Huo

    Abstract: This paper studies the binary classification of unbounded data from ${\mathbb R}^d$ generated under Gaussian Mixture Models (GMMs) using deep ReLU neural networks. We obtain – for the first time – non-asymptotic upper bounds and convergence rates of the excess risk (excess misclassification error) for the classification without restrictions on model parameters. The…

    Submitted 15 August, 2023; originally announced August 2023.

  13. arXiv:2307.05109  [pdf, other]

    cs.LG stat.ML

    Conformalization of Sparse Generalized Linear Models

    Authors: Etash Kumar Guha, Eugene Ndiaye, Xiaoming Huo

    Abstract: Given a sequence of observable variables $\{(x_1, y_1), \ldots, (x_n, y_n)\}$, the conformal prediction method estimates a confidence set for $y_{n+1}$ given $x_{n+1}$ that is valid for any finite sample size by merely assuming that the joint distribution of the data is permutation invariant. Although attractive, computing such a set is computationally infeasible in most regression problems. Indee…

    Submitted 11 July, 2023; originally announced July 2023.

    Comments: ICML 2023

  14. arXiv:2307.03535  [pdf, other]

    cs.CV

    Matching in the Wild: Learning Anatomical Embeddings for Multi-Modality Images

    Authors: Xiaoyu Bai, Fan Bai, Xiaofei Huo, Jia Ge, Tony C. W. Mok, Zi Li, Minfeng Xu, Jingren Zhou, Le Lu, Dakai Jin, Xianghua Ye, Jingjing Lu, Ke Yan

    Abstract: Radiotherapists require accurate registration of MR/CT images to effectively use information from both modalities. In a typical registration pipeline, rigid or affine transformations are applied to roughly align the fixed and moving images before proceeding with the deformation step. While recent learning-based methods have shown promising results in the rigid/affine step, these methods often requ…

    Submitted 7 July, 2023; originally announced July 2023.

  15. arXiv:2305.18789  [pdf, other]

    cs.LG

    Generalization Bounds for Magnitude-Based Pruning via Sparse Matrix Sketching

    Authors: Etash Kumar Guha, Prasanjit Dubey, Xiaoming Huo

    Abstract: In this paper, we derive a novel bound on the generalization error of Magnitude-Based pruning of overparameterized neural networks. Our work builds on the bounds in Arora et al. [2018] where the error depends on one, the approximation induced by pruning, and two, the number of parameters in the pruned model, and improves upon standard norm-based generalization bounds. The pruned estimates obtained…

    Submitted 24 June, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Added code for reproducibility; Minor changes

  16. arXiv:2303.15579  [pdf, other]

    stat.ML cs.LG

    Adjusted Wasserstein Distributionally Robust Estimator in Statistical Learning

    Authors: Yiling Xie, Xiaoming Huo

    Abstract: We propose an adjusted Wasserstein distributionally robust estimator -- based on a nonlinear transformation of the Wasserstein distributionally robust (WDRO) estimator in statistical learning. The classic WDRO estimator is asymptotically biased, while our adjusted WDRO estimator is asymptotically unbiased, resulting in a smaller asymptotic mean squared error. Further, under certain conditions, our…

    Submitted 9 May, 2024; v1 submitted 27 March, 2023; originally announced March 2023.

  17. arXiv:2303.10975  [pdf, other]

    cs.CV

    VIMI: Vehicle-Infrastructure Multi-view Intermediate Fusion for Camera-based 3D Object Detection

    Authors: Zhe Wang, Siqi Fan, Xiaoliang Huo, Tongda Xu, Yan Wang, Jingjing Liu, Yilun Chen, Ya-Qin Zhang

    Abstract: In autonomous driving, Vehicle-Infrastructure Cooperative 3D Object Detection (VIC3D) makes use of multi-view cameras from both vehicles and traffic infrastructure, providing a global vantage point with rich semantic context of road conditions beyond a single vehicle viewpoint. Two major challenges prevail in VIC3D: 1) inherent calibration noise when fusing multi-view images, caused by time asynch…

    Submitted 20 March, 2023; originally announced March 2023.

    Comments: 8 pages, 9 figures

  18. arXiv:2303.09083  [pdf, other]

    cs.CV

    Focus on Your Target: A Dual Teacher-Student Framework for Domain-adaptive Semantic Segmentation

    Authors: Xinyue Huo, Lingxi Xie, Wengang Zhou, Houqiang Li, Qi Tian

    Abstract: We study unsupervised domain adaptation (UDA) for semantic segmentation. Currently, a popular UDA framework lies in self-training which endows the model with two-fold abilities: (i) learning reliable semantics from the labeled images in the source domain, and (ii) adapting to the target domain via generating pseudo labels on the unlabeled images. We find that, by decreasing/increasing the proporti…

    Submitted 16 March, 2023; originally announced March 2023.

    Comments: 12 pages, 7 figures, 10 tables

  19. arXiv:2303.03583  [pdf, other]

    cs.CV

    Calibration-free BEV Representation for Infrastructure Perception

    Authors: Siqi Fan, Zhe Wang, Xiaoliang Huo, Yan Wang, Jingjing Liu

    Abstract: Effective BEV object detection on infrastructure can greatly improve traffic scene understanding and vehicle-to-infrastructure (V2I) cooperative perception. However, cameras installed on infrastructure have various postures, and previous BEV detection methods rely on accurate calibration, which is difficult for practical applications due to inevitable natural factors (e.g., wind and snow). In this…

    Submitted 13 April, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

  20. arXiv:2212.01259  [pdf, other]

    stat.ML cs.LG

    Covariance Estimators for the ROOT-SGD Algorithm in Online Learning

    Authors: Yiling Luo, Xiaoming Huo, Yajun Mei

    Abstract: Online learning naturally arises in many statistical and machine learning problems. The most widely used methods in online learning are stochastic first-order algorithms. Among this family of algorithms, there is a recently developed algorithm, Recursive One-Over-T SGD (ROOT-SGD). ROOT-SGD is advantageous in that it converges at a non-asymptotically fast rate, and its estimator further converges t…

    Submitted 2 December, 2022; originally announced December 2022.

  21. arXiv:2210.14184  [pdf, other]

    stat.ML cs.LG

    Learning Ability of Interpolating Deep Convolutional Neural Networks

    Authors: Tian-Yi Zhou, Xiaoming Huo

    Abstract: It is frequently observed that overparameterized neural networks generalize well. Regarding such phenomena, existing theoretical work is mainly devoted to linear settings or fully-connected neural networks. This paper studies the learning ability of an important family of deep neural networks, deep convolutional neural networks (DCNNs), under both underparameterized and overparameterized settings. We…

    Submitted 16 August, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

  22. arXiv:2209.10218  [pdf, other]

    eess.IV cs.CV

    HiFuse: Hierarchical Multi-Scale Feature Fusion Network for Medical Image Classification

    Authors: Xiangzuo Huo, Gang Sun, Shengwei Tian, Yan Wang, Long Yu, Jun Long, Wendong Zhang, Aolun Li

    Abstract: Medical image classification has developed rapidly under the impetus of the convolutional neural network (CNN). Due to the fixed size of the receptive field of the convolution kernel, it is difficult to capture the global features of medical images. Although the self-attention-based Transformer can model long-range dependencies, it has high computational complexity and lacks local inductive bias.…

    Submitted 21 September, 2022; originally announced September 2022.

  23. arXiv:2208.14893  [pdf, other]

    cs.CV

    Improving RGB-D Point Cloud Registration by Learning Multi-scale Local Linear Transformation

    Authors: Ziming Wang, Xiaoliang Huo, Zhenghao Chen, Jing Zhang, Lu Sheng, Dong Xu

    Abstract: Point cloud registration aims at estimating the geometric transformation between two point cloud scans, in which point-wise correspondence estimation is the key to its success. In addition to previous methods that seek correspondences by hand-crafted or learnt geometric features, recent point cloud registration methods have tried to apply RGB-D data to achieve more accurate correspondence. However…

    Submitted 31 August, 2022; v1 submitted 31 August, 2022; originally announced August 2022.

    Comments: Accepted to ECCV 2022, supplementary materials included

  24. The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models

    Authors: Yiling Luo, Xiaoming Huo, Yajun Mei

    Abstract: We study the Stochastic Gradient Descent (SGD) algorithm in nonparametric statistics: kernel regression in particular. The directional bias property of SGD, which is known in the linear regression setting, is generalized to the kernel regression. More specifically, we prove that SGD with moderate and annealing step-size converges along the direction of the eigenvector that corresponds to the large…

    Submitted 29 April, 2022; originally announced May 2022.

  25. Implicit Regularization Properties of Variance Reduced Stochastic Mirror Descent

    Authors: Yiling Luo, Xiaoming Huo, Yajun Mei

    Abstract: In machine learning and statistical data analysis, we often run into an objective function that is a summation: the number of terms in the summation possibly is equal to the sample size, which can be enormous. In such a setting, the stochastic mirror descent (SMD) algorithm is a numerically efficient method -- each iteration involving a very small subset of the data. The variance reduction version of…

    Submitted 29 April, 2022; originally announced May 2022.

  26. arXiv:2204.02684  [pdf, other]

    cs.CV

    Domain-Agnostic Prior for Transfer Semantic Segmentation

    Authors: Xinyue Huo, Lingxi Xie, Hengtong Hu, Wengang Zhou, Houqiang Li, Qi Tian

    Abstract: Unsupervised domain adaptation (UDA) is an important topic in the computer vision community. The key difficulty lies in defining a common property between the source and target domains so that the source-domain features can align with the target-domain semantics. In this paper, we present a simple and effective mechanism that regularizes cross-domain representation learning with a domain-agnostic…

    Submitted 20 April, 2022; v1 submitted 6 April, 2022; originally announced April 2022.

    Comments: Accepted by CVPR 2022

  27. arXiv:2203.00813  [pdf, other]

    stat.ML cs.DS math.OC

    An Accelerated Stochastic Algorithm for Solving the Optimal Transport Problem

    Authors: Yiling Xie, Yiling Luo, Xiaoming Huo

    Abstract: A primal-dual accelerated stochastic gradient descent with variance reduction algorithm (PDASGD) is proposed to solve linear-constrained optimization problems. PDASGD could be applied to solve the discrete optimal transport (OT) problem and enjoys the best-known computational complexity -- $\widetilde{\mathcal{O}}(n^2/ε)$, where $n$ is the number of atoms, and $ε>0$ is the accuracy. In the literat…

    Submitted 29 May, 2023; v1 submitted 1 March, 2022; originally announced March 2022.

    Comments: Compared with previous versions, both theoretical complexity and numerical performances have been improved for solving the OT problem in this version

  28. arXiv:2201.12533  [pdf, other]

    cs.CV

    Light field Rectification based on relative pose estimation

    Authors: Xiao Huo, Dongyang Jin, Saiping Zhang, Fuzheng Yang

    Abstract: Hand-held light field (LF) cameras have unique advantages in computer vision such as 3D scene reconstruction and depth estimation. However, the related applications are limited by the ultra-small baseline, e.g., leading to the extremely low depth resolution in reconstruction. To solve this problem, we propose to rectify LF to obtain a large baseline. Specifically, the proposed method aligns two LF…

    Submitted 29 January, 2022; originally announced January 2022.

  29. arXiv:2011.09941  [pdf, other]

    cs.CV cs.LG

    Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations

    Authors: Xinyue Huo, Lingxi Xie, Longhui Wei, Xiaopeng Zhang, Hao Li, Zijie Yang, Wengang Zhou, Houqiang Li, Qi Tian

    Abstract: Contrastive learning has achieved great success in self-supervised visual representation learning, but existing approaches mostly ignored spatial information which is often crucial for visual representation. This paper presents heterogeneous contrastive learning (HCL), an effective approach that adds spatial information to the encoding stage to alleviate the learning inconsistency between the cont…

    Submitted 19 November, 2020; originally announced November 2020.

    Comments: 10 pages, 4 figures, 6 tables

  30. arXiv:2010.13934  [pdf, other]

    stat.ML cs.LG stat.CO

    Accelerate the Warm-up Stage in the Lasso Computation via a Homotopic Approach

    Authors: Yujie Zhao, Xiaoming Huo

    Abstract: In optimization, it is known that when the objective functions are strictly convex and well-conditioned, gradient-based approaches can be extremely effective, e.g., achieving the exponential rate of convergence. On the other hand, the existing Lasso-type estimator in general cannot achieve the optimal rate due to the undesirable behavior of the absolute function at the origin. A homotopic method i…

    Submitted 6 March, 2023; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: 19 pages, 3 figures, 3 tables

  31. arXiv:2006.13461  [pdf, other]

    cs.CV

    ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Medical Image Segmentation

    Authors: Xinyue Huo, Lingxi Xie, Jianzhong He, Zijie Yang, Qi Tian

    Abstract: In medical image analysis, semi-supervised learning is an effective method to extract knowledge from a small amount of labeled data and a large amount of unlabeled data. This paper focuses on a popular pipeline known as self learning, and points out a weakness named lazy learning that refers to the difficulty for a model to learn from the pseudo labels generated by itself. To alleviate this issue,…

    Submitted 6 August, 2020; v1 submitted 24 June, 2020; originally announced June 2020.

  32. arXiv:2001.03734  [pdf, other]

    cs.CV

    A Two-step Calibration Method for Unfocused Light Field Camera Based on Projection Model Analysis

    Authors: Dongyang Jin, Saiping Zhang, Xiao Huo, Wei Zhang, Fuzheng Yang

    Abstract: Accurately calibrating a light field camera is essential to its applications. Rapid progress has been made in this area in the past decades. In this paper, detailed analysis was first performed towards the state of the art projection models for calibration which were further interpreted in three representations, including the correspondence between rays and pixels, 3D physical points and pixels and…

    Submitted 10 March, 2021; v1 submitted 11 January, 2020; originally announced January 2020.

    Comments: 11 pages, 9 figures

    ACM Class: I.4

  33. arXiv:1912.00524  [pdf, other]

    stat.ML cs.LG

    Factor Analysis on Citation, Using a Combined Latent and Logistic Regression Model

    Authors: Namjoon Suh, Xiaoming Huo, Eric Heim, Lee Seversky

    Abstract: We propose a combined model, which integrates the latent factor model and the logistic regression model, for the citation network. It is noticed that neither a latent factor model nor a logistic regression model alone is sufficient to capture the structure of the data. The proposed model has a latent (i.e., factor analysis) model to represent the main technological trends (a.k.a., factors), and a…

    Submitted 1 December, 2019; originally announced December 2019.

    Comments: Citation network, matrix decomposition, latent variable model, logistic regression model, convex optimization, alternating direction method of multiplier

  34. arXiv:1511.01443  [pdf, ps, other]

    stat.ME cs.DC stat.ML

    A Distributed One-Step Estimator

    Authors: Cheng Huang, Xiaoming Huo

    Abstract: Distributed statistical inference has recently attracted enormous attention. Much existing work focuses on the averaging estimator. We propose a one-step approach to enhance a simple-averaging based distributed estimator. We derive the corresponding asymptotic properties of the newly proposed estimator. We find that the proposed one-step estimator enjoys the same asymptotic properties as the centr…

    Submitted 10 November, 2015; v1 submitted 4 November, 2015; originally announced November 2015.

    Comments: 31 pages