Showing 1–11 of 11 results for author: Samorodnitsky, G

Searching in archive cs.
  1. arXiv:2309.03818  [pdf, other]

    stat.ML cs.LG

    Empirical Risk Minimization for Losses without Variance

    Authors: Guanhua Fang, Ping Li, Gennady Samorodnitsky

    Abstract: This paper considers an empirical risk minimization problem under heavy-tailed settings, where data does not have finite variance, but only has $p$-th moment with $p \in (1,2)$. Instead of using estimation procedure based on truncated observed data, we choose the optimizer by minimizing the risk value. Those risk values can be robustly estimated via using the remarkable Catoni's method (Catoni, 20…

    Submitted 7 September, 2023; originally announced September 2023.

  2. arXiv:2307.11280  [pdf, other]

    cs.LG cs.CR cs.DS

    Epsilon*: Privacy Metric for Machine Learning Models

    Authors: Diana M. Negoescu, Humberto Gonzalez, Saad Eddin Al Orjany, Jilei Yang, Yuliia Lut, Rahul Tandra, Xiaowen Zhang, Xinyi Zheng, Zach Douglas, Vidita Nolkha, Parvez Ahammad, Gennady Samorodnitsky

    Abstract: We introduce Epsilon*, a new privacy metric for measuring the privacy risk of a single model instance prior to, during, or after deployment of privacy mitigation strategies. The metric requires only black-box access to model predictions, does not require training data re-sampling or model re-training, and can be used to measure the privacy risk of models not trained with differential privacy. Epsi…

    Submitted 9 February, 2024; v1 submitted 20 July, 2023; originally announced July 2023.

  3. arXiv:2306.13824  [pdf, other]

    cs.CR cs.DS cs.LG

    Adaptive Privacy Composition for Accuracy-first Mechanisms

    Authors: Ryan Rogers, Gennady Samorodnitsky, Zhiwei Steven Wu, Aaditya Ramdas

    Abstract: In many practical applications of differential privacy, practitioners seek to provide the best privacy guarantees subject to a target level of accuracy. A recent line of work by Ligett et al. '17 and Whitehouse et al. '22 has developed such accuracy-first mechanisms by leveraging the idea of noise reduction that adds correlated noise to the sufficient statistic in a private computation and produce…

    Submitted 5 December, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

  4. arXiv:2306.04902  [pdf, other]

    cs.DS cs.LG math.ST

    A Cover Time Study of a non-Markovian Algorithm

    Authors: Guanhua Fang, Gennady Samorodnitsky, Zhiqiang Xu

    Abstract: Given a traversal algorithm, cover time is the expected number of steps needed to visit all nodes in a given graph. A smaller cover time means a higher exploration efficiency of traversal algorithm. Although random walk algorithms have been studied extensively in the existing literature, there has been no cover time result for any non-Markovian method. In this work, we stand on a theoretical persp…

    Submitted 11 August, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 25 pages

  5. arXiv:2211.13172  [pdf, other]

    stat.ML cs.LG math.ST stat.ME

    Kernel PCA for multivariate extremes

    Authors: Marco Avella-Medina, Richard A. Davis, Gennady Samorodnitsky

    Abstract: We propose kernel PCA as a method for analyzing the dependence structure of multivariate extremes and demonstrate that it can be a powerful tool for clustering and dimension reduction. Our work provides some theoretical insight into the preimages obtained by kernel PCA, demonstrating that under certain conditions they can effectively identify clusters in the data. We build on these new insights to…

    Submitted 23 November, 2022; v1 submitted 23 November, 2022; originally announced November 2022.

  6. arXiv:2211.08311  [pdf, ps, other]

    stat.ML cs.LG

    On Penalization in Stochastic Multi-armed Bandits

    Authors: Guanhua Fang, Ping Li, Gennady Samorodnitsky

    Abstract: We study an important variant of the stochastic multi-armed bandit (MAB) problem, which takes penalization into consideration. Instead of directly maximizing cumulative expected reward, we need to balance between the total reward and fairness level. In this paper, we present some new insights in MAB and formulate the problem in the penalization framework, where rigorous penalized regret can be wel…

    Submitted 15 November, 2022; originally announced November 2022.

  7. arXiv:2208.03185  [pdf, ps, other]

    math.ST cs.LG stat.ML

    Catoni-style Confidence Sequences under Infinite Variance

    Authors: Sujay Bhatt, Guanhua Fang, Ping Li, Gennady Samorodnitsky

    Abstract: In this paper, we provide an extension of confidence sequences for settings where the variance of the data-generating distribution does not exist or is infinite. Confidence sequences furnish confidence intervals that are valid at arbitrary data-dependent stopping times, naturally having a wide range of applications. We first establish a lower bound for the width of the Catoni-style confidence sequ…

    Submitted 5 August, 2022; originally announced August 2022.

    Comments: 10 pages

  8. arXiv:2204.07821  [pdf, other]

    math.ST cs.CG math.AT stat.ML

    Detection of Small Holes by the Scale-Invariant Robust Density-Aware Distance (RDAD) Filtration

    Authors: Chunyin Siu, Gennady Samorodnitsky, Christina Lee Yu, Andrey Yao

    Abstract: A novel topological-data-analytical (TDA) method is proposed to distinguish, from noise, small holes surrounded by high-density regions of a probability density function. The proposed method is robust against additive noise and outliers. Traditional TDA tools, like those based on the distance filtration, often struggle to distinguish small features from noise, because both have short persistences.…

    Submitted 30 March, 2024; v1 submitted 16 April, 2022; originally announced April 2022.

    Comments: 39 pages, 38 figs, J Appl. and Comput. Topology (2024). GitHub: [github.com/c-siu/RDAD]. Published version: [rdcu.be/dCXLa]. Diff of v2/3: added publication info, NO post-submission improvements (Cor2-3 rephrased and proven, setup of Sec4.1 explained, complexity computed in Sec6.1, Thm5 simplified, comparison with DTM in Sec1,8, streamlining), so no change in pdf. Diff of v1/2: more thms, more discussion on conformality, fewer egs

    MSC Class: 62R40; 55N31; 52R40; 68T09

  9. arXiv:2111.07799  [pdf, other]

    stat.ML cs.LG math.ST

    Spectral learning of multivariate extremes

    Authors: Marco Avella Medina, Richard A. Davis, Gennady Samorodnitsky

    Abstract: We propose a spectral clustering algorithm for analyzing the dependence structure of multivariate extremes. More specifically, we focus on the asymptotic dependence of multivariate extremes characterized by the angular or spectral measure in extreme value theory. Our work studies the theoretical performance of spectral clustering based on a random $k$-nearest neighbor graph constructed from an ext…

    Submitted 1 August, 2023; v1 submitted 15 November, 2021; originally announced November 2021.

  10. arXiv:2109.04433  [pdf, ps, other]

    stat.ML cs.LG

    Extreme Bandits using Robust Statistics

    Authors: Sujay Bhatt, Ping Li, Gennady Samorodnitsky

    Abstract: We consider a multi-armed bandit problem motivated by situations where only the extreme values, as opposed to expected values in the classical bandit setting, are of interest. We propose distribution free algorithms using robust statistics and characterize the statistical properties. We show that the provided algorithms achieve vanishing extremal regret under weaker conditions than existing algori…

    Submitted 9 September, 2021; originally announced September 2021.

  11. arXiv:1308.1009  [pdf, ps, other]

    cs.LG cs.DS cs.IR

    Sign Stable Projections, Sign Cauchy Projections and Chi-Square Kernels

    Authors: Ping Li, Gennady Samorodnitsky, John Hopcroft

    Abstract: The method of stable random projections is popular for efficiently computing the Lp distances in high dimension (where 0<p<=2), using small space. Because it adopts nonadaptive linear projections, this method is naturally suitable when the data are collected in a dynamic streaming fashion (i.e., turnstile data streams). In this paper, we propose to use only the signs of the projected data and anal…

    Submitted 5 August, 2013; originally announced August 2013.