Showing 1–50 of 63 results for author: Lu, S

Searching in archive stat.
  1. arXiv:2405.15920  [pdf, other]

    cs.LG stat.ML

    SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning

    Authors: Shuai Zhang, Heshan Devaka Fernando, Miao Liu, Keerthiram Murugesan, Songtao Lu, Pin-Yu Chen, Tianyi Chen, Meng Wang

    Abstract: This paper studies the transfer reinforcement learning (RL) problem where multiple RL problems have different reward functions but share the same underlying transition dynamics. In this setting, the Q-function of each RL problem (task) can be decomposed into a successor feature (SF) and a reward mapping: the former characterizes the transition dynamics, and the latter characterizes the task-specif…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.16173

  2. arXiv:2402.10456  [pdf, other]

    stat.ML cs.LG stat.AP stat.ME

    Generative Modeling for Tabular Data via Penalized Optimal Transport Network

    Authors: Wenhui Sophia Lu, Chenyang Zhong, Wing Hung Wong

    Abstract: The task of precisely learning the probability distribution of rows within tabular data and producing authentic synthetic samples is both crucial and non-trivial. Wasserstein generative adversarial network (WGAN) marks a notable improvement in generative modeling, addressing the challenges faced by its predecessor, generative adversarial network. However, due to the mixed data types and multimodal…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: 37 pages, 23 figures

  3. arXiv:2402.03167  [pdf, other]

    math.OC cs.LG stat.ML

    Decentralized Bilevel Optimization over Graphs: Loopless Algorithmic Update and Transient Iteration Complexity

    Authors: Boao Kong, Shuchen Zhu, Songtao Lu, Xinmeng Huang, Kun Yuan

    Abstract: Stochastic bilevel optimization (SBO) is becoming increasingly essential in machine learning due to its versatility in handling nested structures. To address large-scale SBO, decentralized approaches have emerged as effective paradigms in which nodes communicate with immediate neighbors without a central server, thereby improving communication efficiency and enhancing algorithmic robustness. Howev…

    Submitted 26 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: 37 pages, 6 figures

  4. arXiv:2401.06980  [pdf, other]

    cs.CL cs.LG stat.ML

    Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization

    Authors: A F M Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen

    Abstract: In this paper, we present a novel bilevel optimization-based approach to training acoustic models for automatic speech recognition (ASR) tasks that we term bi-level joint unsupervised and supervised training (BL-JUST). BL-JUST employs a lower and upper level optimization with an unsupervised loss and a supervised loss respectively, leveraging recent advances in penalty-based bilevel op…

    Submitted 13 January, 2024; originally announced January 2024.

    Comments: This paper has been accepted in ICASSP-2024 conference

  5. arXiv:2309.12425  [pdf, other]

    stat.ME math.ST

    Principal Stratification with Continuous Post-Treatment Variables: Nonparametric Identification and Semiparametric Estimation

    Authors: Sizhu Lu, Zhichao Jiang, Peng Ding

    Abstract: Post-treatment variables often complicate causal inference. They appear in many scientific problems, including noncompliance, truncation by death, mediation, and surrogate endpoint evaluation. Principal stratification is a strategy to address these challenges by adjusting for the potential values of the post-treatment variables, defined as the principal strata. It allows for characterizing treatme…

    Submitted 3 April, 2024; v1 submitted 21 September, 2023; originally announced September 2023.

  6. arXiv:2309.01465  [pdf, other]

    stat.ME

    Identifiability and estimation of the competing risks model under exclusion restrictions

    Authors: Munir Hiabu, Simon M. S. LU, Ralf A. Wilke

    Abstract: The non-identifiability of the competing risks model requires researchers to work with restrictions on the model to obtain informative results. We present a new identifiability solution based on an exclusion restriction. Many areas of applied research use methods that rely on exclusion restrictions. It appears natural to also use them for the identifiability of competing risks models. By imposing…

    Submitted 4 September, 2023; originally announced September 2023.

  7. arXiv:2306.00023  [pdf]

    cs.LG cs.AI stat.AP

    Predicting Heart Disease and Reducing Survey Time Using Machine Learning Algorithms

    Authors: Salahaldeen Rababa, Asma Yamin, Shuxia Lu, Ashraf Obaidat

    Abstract: Currently, many researchers and analysts are working toward medical diagnosis enhancement for various diseases. Heart disease is one of the common diseases that can be considered a significant cause of mortality worldwide. Early detection of heart disease significantly helps in reducing the risk of heart failure. Consequently, the Centers for Disease Control and Prevention (CDC) conducts a health-…

    Submitted 30 May, 2023; originally announced June 2023.

  8. arXiv:2305.17643  [pdf, other]

    stat.ME

    Flexible sensitivity analysis for causal inference in observational studies subject to unmeasured confounding

    Authors: Sizhu Lu, Peng Ding

    Abstract: Causal inference with observational studies often suffers from unmeasured confounding, yielding biased estimators based on the unconfoundedness assumption. Sensitivity analysis assesses how the causal conclusions change with respect to different degrees of unmeasured confounding. Most existing sensitivity analysis methods work well for specific types of statistical estimation or testing strategies…

    Submitted 29 March, 2024; v1 submitted 28 May, 2023; originally announced May 2023.

  9. arXiv:2303.09700  [pdf, other]

    cs.SI cs.AI cs.LG stat.AP

    Delayed and Indirect Impacts of Link Recommendations

    Authors: Han Zhang, Shangen Lu, Yixin Wang, Mihaela Curmei

    Abstract: The impacts of link recommendations on social networks are challenging to evaluate, and so far they have been studied in limited settings. Observational studies are restricted in the kinds of causal questions they can answer and naive A/B tests often lead to biased evaluations due to unaccounted network interference. Furthermore, evaluations in simulation settings are often limited to static netwo…

    Submitted 16 March, 2023; originally announced March 2023.

  10. arXiv:2208.04886  [pdf]

    stat.ME

    Direct and diffuse shading factors modelling for the most representative agrivoltaic system layouts

    Authors: Sebastian Zainali, Silvia Ma Lu, Bengt Stridh, Anders Avelin, Stefano Amaducci, Michele Colauzzi, Pietro Elia Campana

    Abstract: Agrivoltaic systems are becoming more popular as a critical technology for attaining several sustainable development goals such as affordable and clean energy, zero hunger, clean water and sanitation, and climate action. However, understanding the shading effects on crops is fundamental to choosing an optimal agrivoltaic system as a wrong choice could lead to severe crop reductions. In this study,…

    Submitted 9 August, 2022; originally announced August 2022.

  11. arXiv:2207.13283  [pdf, other]

    cs.LG math.OC stat.ML

    INTERACT: Achieving Low Sample and Communication Complexities in Decentralized Bilevel Learning over Networks

    Authors: Zhuqing Liu, Xin Zhang, Prashant Khanduri, Songtao Lu, Jia Liu

    Abstract: In recent years, decentralized bilevel optimization problems have received increasing attention in the networking and machine learning communities thanks to their versatility in modeling decentralized learning problems over peer-to-peer networks (e.g., multi-agent meta-learning, multi-agent reinforcement learning, personalized training, and Byzantine-resilient learning). However, for decentralized…

    Submitted 5 October, 2022; v1 submitted 27 July, 2022; originally announced July 2022.

  12. arXiv:2206.13482  [pdf, other]

    cs.LG math.OC stat.ML

    Understanding Benign Overfitting in Gradient-Based Meta Learning

    Authors: Lisha Chen, Songtao Lu, Tianyi Chen

    Abstract: Meta learning has demonstrated tremendous success in few-shot learning with limited supervised data. In those settings, the meta model is usually overparameterized. While the conventional statistical learning theory suggests that overparameterized models tend to overfit, empirical evidence reveals that overparameterized meta learning methods still work well -- a phenomenon often called "benign ove…

    Submitted 9 November, 2022; v1 submitted 27 June, 2022; originally announced June 2022.

  13. arXiv:2106.07115  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Understanding Latent Correlation-Based Multiview Learning and Self-Supervision: An Identifiability Perspective

    Authors: Qi Lyu, Xiao Fu, Weiran Wang, Songtao Lu

    Abstract: Multiple views of data, both naturally acquired (e.g., image and audio) and artificially produced (e.g., via adding different noise to data samples), have proven useful in enhancing representation learning. Natural views are often handled by multiview analysis tools, e.g., (deep) canonical correlation analysis [(D)CCA], while the artificial ones are frequently used in self-supervised learning (SSL…

    Submitted 8 April, 2022; v1 submitted 13 June, 2021; originally announced June 2021.

    Comments: Accepted to ICLR 2022 Spotlight, 37 pages, 11 figures

  14. arXiv:2103.11254  [pdf, other]

    cs.LG stat.AP

    Understanding Heart-Failure Patients EHR Clinical Features via SHAP Interpretation of Tree-Based Machine Learning Model Predictions

    Authors: Shuyu Lu, Ruoyu Chen, Wei Wei, Xinghua Lu

    Abstract: Heart failure (HF) is a major cause of mortality. Accurately monitoring HF progress and adjusting therapies are critical for improving patient outcomes. An experienced cardiologist can make accurate HF stage diagnoses based on a combination of symptoms, signs, and lab results from the electronic health records (EHR) of a patient, without directly measuring heart function. We examined whether machine le…

    Submitted 20 March, 2021; originally announced March 2021.

    Comments: Submitted to AMIA 2021 Annual Symposium

  15. arXiv:2103.06872  [pdf, other]

    quant-ph cond-mat.str-el cs.LG stat.ML

    Tensor networks and efficient descriptions of classical data

    Authors: Sirui Lu, Márton Kanász-Nagy, Ivan Kukuljan, J. Ignacio Cirac

    Abstract: We investigate the potential of tensor network based machine learning methods to scale to large image and text data sets. For that, we study how the mutual information between a subregion and its complement scales with the subsystem size $L$, similarly to how it is done in quantum many-body physics. We find that for text, the mutual information scales as a power law $L^\nu$ with a close to volume la…

    Submitted 11 March, 2021; originally announced March 2021.

    Comments: 16 pages, 6 figures

  16. arXiv:2102.06933  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Revisiting Smoothed Online Learning

    Authors: Lijun Zhang, Wei Jiang, Shiyin Lu, Tianbao Yang

    Abstract: In this paper, we revisit the problem of smoothed online learning, in which the online learner suffers both a hitting cost and a switching cost, and target two performance metrics: competitive ratio and dynamic regret with switching cost. To bound the competitive ratio, we assume the hitting cost is known to the learner in each round, and investigate the simple idea of balancing the two costs by…

    Submitted 18 May, 2021; v1 submitted 13 February, 2021; originally announced February 2021.

  17. arXiv:2102.03794  [pdf, other]

    cs.LG cs.CV stat.ML

    A self-adaptive and robust fission clustering algorithm via heat diffusion and maximal turning angle

    Authors: Yu Han, Shizhan Lu, Haiyan Xu

    Abstract: Cluster analysis, which focuses on the grouping and categorization of similar elements, is widely used in various fields of research. A novel and fast clustering algorithm, the fission clustering algorithm, was proposed in recent years. In this article, we propose a robust fission clustering (RFC) algorithm and a self-adaptive noise identification method. The RFC and the self-adaptive noise identificati…

    Submitted 7 February, 2021; originally announced February 2021.

    Comments: 11 pages, 8 figures

  18. arXiv:2011.12581  [pdf, other]

    cs.LG stat.ML

    Overcoming Catastrophic Forgetting via Direction-Constrained Optimization

    Authors: Yunfei Teng, Anna Choromanska, Murray Campbell, Songtao Lu, Parikshit Ram, Lior Horesh

    Abstract: This paper studies a new design of the optimization algorithm for training deep learning models with a fixed architecture of the classification network in a continual learning framework. The training data is non-stationary and the non-stationarity is imposed by a sequence of distinct tasks. We first analyze a deep model trained on only one learning task in isolation and identify a region in networ…

    Submitted 1 July, 2022; v1 submitted 25 November, 2020; originally announced November 2020.

  19. arXiv:2009.13714  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Learning to Generate Image Source-Agnostic Universal Adversarial Perturbations

    Authors: Pu Zhao, Parikshit Ram, Songtao Lu, Yuguang Yao, Djallel Bouneffouf, Xue Lin, Sijia Liu

    Abstract: Adversarial perturbations are critical for certifying the robustness of deep learning models. A universal adversarial perturbation (UAP) can simultaneously attack multiple images, and thus offers a more unified threat model, obviating an image-wise attack algorithm. However, the existing UAP generator is underdeveloped when images are drawn from different image sources (e.g., with different image…

    Submitted 17 August, 2022; v1 submitted 28 September, 2020; originally announced September 2020.

  20. arXiv:2009.01027  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    DARTS-: Robustly Stepping out of Performance Collapse Without Indicators

    Authors: Xiangxiang Chu, Xiaoxing Wang, Bo Zhang, Shun Lu, Xiaolin Wei, Junchi Yan

    Abstract: Despite the fast development of differentiable architecture search (DARTS), it suffers from long-standing performance instability, which severely limits its application. Existing robustifying methods draw clues from the resulting deteriorated behavior instead of identifying its cause. Various indicators such as Hessian eigenvalues are proposed as a signal to stop searching before the per…

    Submitted 15 January, 2021; v1 submitted 2 September, 2020; originally announced September 2020.

    Comments: Accepted to ICLR2021

  21. arXiv:2008.06635  [pdf, other]

    cs.LG stat.ML

    Orthogonalized SGD and Nested Architectures for Anytime Neural Networks

    Authors: Chengcheng Wan, Henry Hoffmann, Shan Lu, Michael Maire

    Abstract: We propose a novel variant of SGD customized for training network architectures that support anytime behavior: such networks produce a series of increasingly accurate outputs over time. Efficient architectural designs for these networks focus on re-using internal state; subnetworks must produce representations relevant for both immediate prediction as well as refinement by subsequent network stage…

    Submitted 14 August, 2020; originally announced August 2020.

    Comments: ICML 2020

  22. arXiv:2007.01208  [pdf]

    cs.LG stat.ML

    Exponentially Weighted l_2 Regularization Strategy in Constructing Reinforced Second-order Fuzzy Rule-based Model

    Authors: Congcong Zhang, Sung-Kwun Oh, Witold Pedrycz, Zunwei Fu, Shanzhen Lu

    Abstract: In the conventional Takagi-Sugeno-Kang (TSK)-type fuzzy models, constant or linear functions are usually utilized as the consequent parts of the fuzzy rules, but they cannot effectively describe the behavior within local regions defined by the antecedent parts. In this article, a theoretical and practical design methodology is developed to address this problem. First, the information granulation (…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: 22 pages

  23. arXiv:2007.00756  [pdf, other]

    stat.AP q-bio.PE

    An Early Warning Approach to Monitor COVID-19 Activity with Multiple Digital Traces in Near Real-Time

    Authors: Nicole E. Kogan, Leonardo Clemente, Parker Liautaud, Justin Kaashoek, Nicholas B. Link, Andre T. Nguyen, Fred S. Lu, Peter Huybers, Bernd Resch, Clemens Havas, Andreas Petutschnig, Jessica Davis, Matteo Chinazzi, Backtosch Mustafa, William P. Hanage, Alessandro Vespignani, Mauricio Santillana

    Abstract: Non-pharmaceutical interventions (NPIs) have been crucial in curbing COVID-19 in the United States (US). Consequently, relaxing NPIs through a phased re-opening of the US amid still-high levels of COVID-19 susceptibility could lead to new epidemic waves. This calls for a COVID-19 early warning system. Here we evaluate multiple digital data streams as early warning indicators of increasing or decre…

    Submitted 3 July, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

  24. arXiv:2007.00169  [pdf, other]

    cs.LG stat.ML

    Regularly Updated Deterministic Policy Gradient Algorithm

    Authors: Shuai Han, Wenbo Zhou, Shuai Lü, Jiayu Yu

    Abstract: The Deep Deterministic Policy Gradient (DDPG) algorithm is one of the most well-known reinforcement learning methods. However, this method is inefficient and unstable in practical applications. On the other hand, the bias and variance of the Q estimation in the target function are sometimes difficult to control. This paper proposes a Regularly Updated Deterministic (RUD) policy gradient algorithm for…

    Submitted 30 June, 2020; originally announced July 2020.

  25. arXiv:2006.10980  [pdf, other]

    cs.LG cs.AI stat.ML

    NROWAN-DQN: A Stable Noisy Network with Noise Reduction and Online Weight Adjustment for Exploration

    Authors: Shuai Han, Wenbo Zhou, Jing Liu, Shuai Lü

    Abstract: Deep reinforcement learning has been applied more and more widely nowadays, especially in various complex control tasks. Effective exploration for noisy networks is one of the most important issues in deep reinforcement learning. Noisy networks tend to produce stable outputs for agents. However, this tendency is not always enough to find a stable policy for an agent, which decreases efficiency and…

    Submitted 19 June, 2020; originally announced June 2020.

  26. arXiv:2006.08141  [pdf, other]

    math.OC cs.LG stat.ML

    Non-convex Min-Max Optimization: Applications, Challenges, and Recent Theoretical Advances

    Authors: Meisam Razaviyayn, Tianjian Huang, Songtao Lu, Maher Nouiehed, Maziar Sanjabi, Mingyi Hong

    Abstract: The min-max optimization problem, also known as the saddle point problem, is a classical optimization problem which is also studied in the context of zero-sum games. Given a class of objective functions, the goal is to find a value for the argument which leads to a small objective value even for the worst case function in the given class. Min-max optimization problems have recently become very pop…

    Submitted 18 August, 2020; v1 submitted 15 June, 2020; originally announced June 2020.

    Journal ref: IEEE Signal Processing Magazine (Volume: 37, Issue: 5, Sept. 2020)

  27. Exploring the Connection Between Binary and Spiking Neural Networks

    Authors: Sen Lu, Abhronil Sengupta

    Abstract: On-chip edge intelligence has necessitated the exploration of algorithmic techniques to reduce the compute requirements of current machine learning frameworks. This work aims to bridge the recent algorithmic progress in training Binary Neural Networks and Spiking Neural Networks - both of which are driven by the same motivation and yet synergies between the two have not been fully explored. We sho…

    Submitted 21 May, 2020; v1 submitted 23 February, 2020; originally announced February 2020.

  28. arXiv:2002.04101  [pdf, ps, other]

    econ.EM stat.AP stat.ME

    Sequential Monitoring of Changes in Housing Prices

    Authors: Lajos Horváth, Zhenya Liu, Shanglin Lu

    Abstract: We propose a sequential monitoring scheme to find structural breaks in real estate markets. The changes in the real estate prices are modeled by a combination of linear and autoregressive terms. The monitoring scheme is based on a detector and a suitably chosen boundary function. If the detector crosses the boundary function, a structural break is detected. We provide the asymptotics for the proce…

    Submitted 10 February, 2020; originally announced February 2020.

    Comments: 47 pages, 12 figures

  29. arXiv:2002.02085  [pdf, ps, other]

    cs.LG stat.ML

    Minimizing Dynamic Regret and Adaptive Regret Simultaneously

    Authors: Lijun Zhang, Shiyin Lu, Tianbao Yang

    Abstract: Regret minimization is treated as the golden rule in the traditional study of online learning. However, regret minimization algorithms tend to converge to the static optimum, thus being suboptimal for changing environments. To address this limitation, new performance measures, including dynamic regret and adaptive regret, have been proposed to guide the design of online algorithms. The former one a…

    Submitted 5 February, 2020; originally announced February 2020.

  30. arXiv:2001.05887  [pdf, other]

    cs.LG cs.CV stat.ML

    MixPath: A Unified Approach for One-shot Neural Architecture Search

    Authors: Xiangxiang Chu, Shun Lu, Xudong Li, Bo Zhang

    Abstract: Blending multiple convolutional kernels has proved advantageous in neural architecture design. However, current two-stage neural architecture search methods are mainly limited to single-path search spaces. How to efficiently search models of multi-path structures remains a difficult problem. In this paper, we are motivated to train a one-shot multi-path supernet to accurately evaluate the candidate…

    Submitted 19 July, 2023; v1 submitted 16 January, 2020; originally announced January 2020.

    Comments: ICCV2023

  31. arXiv:2001.04786  [pdf, other]

    cs.LG math.OC stat.ML

    Distributed Learning in the Non-Convex World: From Batch to Streaming Data, and Beyond

    Authors: Tsung-Hui Chang, Mingyi Hong, Hoi-To Wai, Xinwei Zhang, Songtao Lu

    Abstract: Distributed learning has become a critical enabler of the massively connected world envisioned by many. This article discusses four key elements of scalable distributed processing and real-time intelligence --- problems, data, communication and computation. Our aim is to provide a fresh and unique perspective about how these elements should work together in an effective and coherent manner. In par…

    Submitted 14 January, 2020; originally announced January 2020.

    Comments: Submitted to IEEE Signal Processing Magazine Special Issue on Distributed, Streaming Machine Learning; THC, MH, HTW contributed equally

  32. arXiv:1912.11477  [pdf, other]

    cs.LG cs.CV cs.DC stat.ML

    Self-adaption grey DBSCAN clustering

    Authors: Shizhan Lu

    Abstract: Clustering analysis, a classical issue in data mining, is widely used in various research areas. This article aims at proposing a self-adaption grey DBSCAN clustering (SAG-DBSCAN) algorithm. First, the grey relational matrix is used to obtain the grey local density indicator, and then this indicator is applied to make self-adapting noise identification for obtaining a dense subset of clustering da…

    Submitted 23 December, 2019; originally announced December 2019.

    Comments: 8 pages, 4 figures, 4 tables. arXiv admin note: text overlap with arXiv:1906.11416

  33. arXiv:1912.01792  [pdf, ps, other]

    cs.LG cs.NE stat.ML

    Learn Electronic Health Records by Fully Decentralized Federated Learning

    Authors: Songtao Lu, Yawen Zhang, Yunlong Wang, Christina Mack

    Abstract: Federated learning opens a number of research opportunities due to its high communication efficiency in distributed training problems within a star network. In this paper, we focus on improving the communication efficiency for fully decentralized federated learning over a graph, where the algorithm performs local updates for several iterations and then enables communications among the nodes. In su…

    Submitted 9 December, 2019; v1 submitted 3 December, 2019; originally announced December 2019.

  34. arXiv:1911.04602  [pdf, other]

    stat.ME

    A clustered Gaussian process model for computer experiments

    Authors: Chih-Li Sung, Benjamin Haaland, Youngdeok Hwang, Siyuan Lu

    Abstract: A Gaussian process has been one of the important approaches for emulating computer simulations. However, the stationarity assumption for a Gaussian process and the intractability for large-scale datasets limit its availability in practice. In this article, we propose a clustered Gaussian process model which segments the input data into multiple clusters, in each of which a Gaussian process model is…

    Submitted 5 November, 2020; v1 submitted 11 November, 2019; originally announced November 2019.

  35. arXiv:1910.10375  [pdf, other]

    stat.AP

    Statistical Modeling for Spatio-Temporal Data from Stochastic Convection-Diffusion Processes

    Authors: Xiao Liu, Kyongmin Yeo, Siyuan Lu

    Abstract: This paper proposes a physical-statistical modeling approach for spatio-temporal data arising from a class of stochastic convection-diffusion processes. Such processes are widely found in scientific and engineering applications where fundamental physics imposes critical constraints on how data can be modeled and how models should be interpreted. The idea of spectrum decomposition is employed to ap…

    Submitted 5 August, 2020; v1 submitted 23 October, 2019; originally announced October 2019.

  36. arXiv:1910.10196  [pdf, other]

    cs.LG stat.ML

    No-regret Non-convex Online Meta-Learning

    Authors: Zhenxun Zhuang, Yunlong Wang, Kezi Yu, Songtao Lu

    Abstract: The online meta-learning framework is designed for the continual lifelong learning setting. It bridges two fields: meta-learning which tries to extract prior knowledge from past tasks for fast learning of future tasks, and online-learning which deals with the sequential setting where problems are revealed one by one. In this paper, we generalize the original framework from convex to non-convex set…

    Submitted 18 February, 2020; v1 submitted 22 October, 2019; originally announced October 2019.

  37. arXiv:1910.05857  [pdf, ps, other]

    math.OC cs.DC cs.LG eess.SP stat.ML

    Improving the Sample and Communication Complexity for Decentralized Non-Convex Optimization: A Joint Gradient Estimation and Tracking Approach

    Authors: Haoran Sun, Songtao Lu, Mingyi Hong

    Abstract: Many modern large-scale machine learning problems benefit from decentralized and stochastic optimization. Recent works have shown that utilizing both decentralized computing and local stochastic gradient estimates can outperform state-of-the-art centralized algorithms, in applications involving highly non-convex problems, such as training deep neural networks. In this work, we propose a decentra…

    Submitted 13 October, 2019; originally announced October 2019.

    Journal ref: Published at the International Conference on Machine Learning (ICML 2020)

  38. arXiv:1909.13806  [pdf, other]

    cs.LG math.OC stat.ML

    Min-Max Optimization without Gradients: Convergence and Applications to Adversarial ML

    Authors: Sijia Liu, Songtao Lu, Xiangyi Chen, Yao Feng, Kaidi Xu, Abdullah Al-Dujaili, Minyi Hong, Una-May O'Reilly

    Abstract: In this paper, we study the problem of constrained robust (min-max) optimization in a black-box setting, where the desired optimizer cannot access the gradients of the objective function but may query its values. We present a principled optimization framework, integrating a zeroth-order (ZO) gradient estimator with an alternating projected stochastic gradient descent-ascent method, where the former…

    Submitted 16 June, 2020; v1 submitted 30 September, 2019; originally announced September 2019.

    Comments: ICML 2020

  39. arXiv:1909.02187  [pdf, ps, other]

    cs.LG stat.ML

    Adaptive and Efficient Algorithms for Tracking the Best Expert

    Authors: Shiyin Lu, Lijun Zhang

    Abstract: In this paper, we consider the problem of prediction with expert advice in dynamic environments. We choose tracking regret as the performance metric and develop two adaptive and efficient algorithms with data-dependent tracking regret bounds. The first algorithm achieves a second-order tracking regret bound, which improves existing first-order bounds. The second algorithm enjoys a path-length boun…

    Submitted 8 February, 2020; v1 submitted 4 September, 2019; originally announced September 2019.

  40. arXiv:1907.04450  [pdf, ps, other]

    math.OC cs.CC stat.ML

    SNAP: Finding Approximate Second-Order Stationary Solutions Efficiently for Non-convex Linearly Constrained Problems

    Authors: Songtao Lu, Meisam Razaviyayn, Bo Yang, Kejun Huang, Mingyi Hong

    Abstract: This paper proposes low-complexity algorithms for finding approximate second-order stationary points (SOSPs) of problems with smooth non-convex objective and linear constraints. While finding (approximate) SOSPs is computationally intractable, we first show that generic instances of the problem can be solved efficiently. More specifically, for a generic problem instance, certain strict complementa…

    Submitted 9 July, 2019; originally announced July 2019.

  41. arXiv:1906.11416  [pdf, other]

    cs.LG cs.CV stat.ML

    Clustering by the way of atomic fission

    Authors: Shizhan Lu

    Abstract: Cluster analysis, which focuses on the grouping and categorization of similar elements, is widely used in various fields of research. Inspired by the phenomenon of atomic fission, a novel density-based clustering algorithm is proposed in this paper, called fission clustering (FC). It focuses on mining the dense families of a dataset and utilizes the information of the distance matrix to fissure clus…

    Submitted 26 June, 2019; originally announced June 2019.

    Comments: 9 pages, 3 figures

    Journal ref: IEEE ACCESS 2020

  42. arXiv:1906.07304  [pdf, other]

    cs.LG cs.AI cs.SC stat.ML

    Neurally-Guided Structure Inference

    Authors: Sidi Lu, Jiayuan Mao, Joshua B. Tenenbaum, Jiajun Wu

    Abstract: Most structure inference methods either rely on exhaustive search or are purely data-driven. Exhaustive search robustly infers the structure of arbitrarily complex data, but it is slow. Data-driven methods allow efficient inference, but do not generalize when test data have more complex structures than training data. In this paper, we propose a hybrid inference algorithm, the Neurally-Guided Struc…

    Submitted 15 August, 2019; v1 submitted 17 June, 2019; originally announced June 2019.

    Comments: Proceedings of the 36th International Conference on Machine Learning (ICML 2019). First two authors contributed equally. Project page: http://ngsi.csail.mit.edu

    Journal ref: PMLR(2019)97: 4144--4153

  43. arXiv:1905.12879  [pdf, other]

    cs.LG stat.ML

    Multi-Objective Generalized Linear Bandits

    Authors: Shiyin Lu, Guanghui Wang, Yao Hu, Lijun Zhang

    Abstract: In this paper, we study the multi-objective bandits (MOB) problem, where a learner repeatedly selects one arm to play and then receives a reward vector consisting of multiple objectives. MOB has found many real-world applications as varied as online recommendation and network routing. On the other hand, these applications typically contain contextual information that can guide the learning process…

    Submitted 30 May, 2019; originally announced May 2019.

  44. arXiv:1905.12766  [pdf, other]

    stat.ML cs.LG

    Noisy and Incomplete Boolean Matrix Factorization via Expectation Maximization

    Authors: Lifan Liang, Songjian Lu

    Abstract: Probabilistic approach to Boolean matrix factorization can provide solutions robust against noise and missing values with linear computational complexity. However, the assumption about latent factors can be problematic in real-world applications. This study proposed a new probabilistic algorithm free of assumptions of latent factors, while retaining the advantages of previous algorithms. Real data exp…

    Submitted 29 May, 2019; originally announced May 2019.

  45. arXiv:1905.05917  [pdf, other]

    cs.LG math.OC stat.ML

    Adaptivity and Optimality: A Universal Algorithm for Online Convex Optimization

    Authors: Guanghui Wang, Shiyin Lu, Lijun Zhang

    Abstract: In this paper, we study adaptive online convex optimization, and aim to design a universal algorithm that achieves optimal regret bounds for multiple common types of loss functions. Existing universal methods are limited in the sense that they are optimal for only a subclass of loss functions. To address this limitation, we propose a novel online method, namely Maler, which enjoys the optimal…

    Submitted 14 May, 2019; originally announced May 2019.

    Comments: 12 pages, 2 figures

  46. arXiv:1905.02957  [pdf, other]

    cs.LG stat.ML

    SAdam: A Variant of Adam for Strongly Convex Functions

    Authors: Guanghui Wang, Shiyin Lu, Weiwei Tu, Lijun Zhang

    Abstract: The Adam algorithm has become extremely popular for large-scale machine learning. Under the convexity condition, it has been proved to enjoy a data-dependent $O(\sqrt{T})$ regret bound where $T$ is the time horizon. However, whether strong convexity can be utilized to further improve the performance remains an open problem. In this paper, we give an affirmative answer by developing a variant of Adam (…

    Submitted 8 May, 2019; originally announced May 2019.

    Comments: 19 pages, 9 figures

  47. arXiv:1904.03779  [pdf, ps, other]

    cs.LG stat.ML

    Cluster Developing 1-Bit Matrix Completion

    Authors: Chengkun Zhang, Junbin Gao, Stephen Lu

    Abstract: Matrix completion has a long history of use as the core technique of recommender systems. In particular, 1-bit matrix completion, which treats the prediction as a "Recommended" or "Not Recommended" question, has proved its significance and validity in the field. However, while customers and products aggregate into interacted clusters, state-of-the-art model-based 1-bit recommender sy…

    Submitted 7 April, 2019; originally announced April 2019.

    Comments: 16 Pages

  48. arXiv:1903.11385  [pdf, ps, other]

    eess.SP cs.LG stat.ML

    Signal Demodulation with Machine Learning Methods for Physical Layer Visible Light Communications: Prototype Platform, Open Dataset and Algorithms

    Authors: Shuai Ma, Jiahui Dai, Songtao Lu, Hang Li, Han Zhang, Chun Du, Shiyin Li

    Abstract: In this paper, we investigate the design and implementation of machine learning (ML) based demodulation methods in the physical layer of visible light communication (VLC) systems. We build a flexible hardware prototype of an end-to-end VLC system, from which the received signals are collected as the real data. The dataset, which contains eight types of modulated signals, is available online. Then,…

    Submitted 13 March, 2019; originally announced March 2019.

  49. arXiv:1903.04297  [pdf, ps, other]

    eess.SP cs.LG stat.ML

    Deep Learning for Signal Demodulation in Physical Layer Wireless Communications: Prototype Platform, Open Dataset, and Analytics

    Authors: Hongmei Wang, Zhenzhen Wu, Shuai Ma, Songtao Lu, Han Zhang, Guoru Ding, Shiyin Li

    Abstract: In this paper, we investigate deep learning (DL)-enabled signal demodulation methods and establish the first open dataset of real modulated signals for wireless communication systems. Specifically, we propose a flexible communication prototype platform for measuring a real modulation dataset. Then, based on the measured dataset, two DL-based demodulators, called deep belief network (DBN)-support vec…

    Submitted 8 March, 2019; originally announced March 2019.

  50. Hybrid Block Successive Approximation for One-Sided Non-Convex Min-Max Problems: Algorithms and Applications

    Authors: Songtao Lu, Ioannis Tsaknakis, Mingyi Hong, Yongxin Chen

    Abstract: The min-max problem, also known as the saddle point problem, is a class of optimization problems which minimizes and maximizes two subsets of variables simultaneously. This class of problems can be used to formulate a wide range of signal processing and communication (SPCOM) problems. Despite its popularity, most existing theory for this class has been mainly developed for problems with certain sp…

    Submitted 16 March, 2021; v1 submitted 21 February, 2019; originally announced February 2019.