Showing 1–8 of 8 results for author: Scheipl, F

Searching in archive cs.
  1. arXiv:2310.12806  [pdf, other]

    stat.ML cs.LG

    DCSI -- An improved measure of cluster separability based on separation and connectedness

    Authors: Jana Gauss, Fabian Scheipl, Moritz Herrmann

    Abstract: Whether class labels in a given data set correspond to meaningful clusters is crucial for the evaluation of clustering algorithms using real-world data sets. This property can be quantified by separability measures. The central aspects of separability for density-based clustering are between-class separation and within-class connectedness, and neither classification-based complexity measures nor c…

    Submitted 1 July, 2024; v1 submitted 19 October, 2023; originally announced October 2023.

  2. arXiv:2207.00510  [pdf, other]

    cs.LG

    Enhancing cluster analysis via topological manifold learning

    Authors: Moritz Herrmann, Daniyal Kazempour, Fabian Scheipl, Peer Kröger

    Abstract: We discuss topological aspects of cluster analysis and show that inferring the topological structure of a dataset before clustering it can considerably enhance cluster detection: theoretical arguments and empirical evidence show that clustering embedding vectors, representing the structure of a data manifold instead of the observed feature vectors themselves, is highly beneficial. To demonstrate,…

    Submitted 1 July, 2022; originally announced July 2022.

    Comments: 43 pages, 10 figures

  3. arXiv:2207.00367  [pdf, other]

    stat.ML cs.LG

    A geometric framework for outlier detection in high-dimensional data

    Authors: Moritz Herrmann, Florian Pfisterer, Fabian Scheipl

    Abstract: Outlier or anomaly detection is an important task in data analysis. We discuss the problem from a geometrical perspective and provide a framework that exploits the metric structure of a data set. Our approach rests on the manifold assumption, i.e., that the observed, nominally high-dimensional data lie on a much lower dimensional manifold and that this intrinsic structure can be inferred with mani…

    Submitted 29 July, 2022; v1 submitted 1 July, 2022; originally announced July 2022.

    Comments: 24 pages, 6 figures, extended introduction, contribution, and discussion sections, additional experiments added

  4. arXiv:2109.06849  [pdf, other]

    stat.ML cs.LG stat.CO

    A geometric perspective on functional outlier detection

    Authors: Moritz Herrmann, Fabian Scheipl

    Abstract: We consider functional outlier detection from a geometric perspective, specifically: for functional data sets drawn from a functional manifold which is defined by the data's modes of variation in amplitude and phase. Based on this manifold, we develop a conceptualization of functional outlier detection that is more widely applicable and realistic than previously proposed. Our theoretical and exper…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: 40 pages, 20 figures

  5. arXiv:2107.14330  [pdf, ps, other]

    cs.CY cs.LG

    Developing Open Source Educational Resources for Machine Learning and Data Science

    Authors: Ludwig Bothmann, Sven Strickroth, Giuseppe Casalicchio, David Rügamer, Marius Lindauer, Fabian Scheipl, Bernd Bischl

    Abstract: Education should not be a privilege but a common good. It should be openly accessible to everyone, with as few barriers as possible; even more so for key technologies such as Machine Learning (ML) and Data Science (DS). Open Educational Resources (OER) are a crucial factor for greater educational equity. In this paper, we describe the specific requirements for OER in ML and DS and argue that it is…

    Submitted 10 August, 2021; v1 submitted 28 July, 2021; originally announced July 2021.

    Comments: 6 pages

    Journal ref: Proceedings of the Third Teaching Machine Learning and Artificial Intelligence Workshop, PMLR 207:1-6, 2022

  6. arXiv:2012.11987  [pdf, other]

    stat.ML cs.LG

    Unsupervised Functional Data Analysis via Nonlinear Dimension Reduction

    Authors: Moritz Herrmann, Fabian Scheipl

    Abstract: In recent years, manifold methods have moved into focus as tools for dimension reduction. Assuming that the high-dimensional data actually lie on or close to a low-dimensional nonlinear manifold, these methods have shown convincing results in several settings. This manifold assumption is often reasonable for functional data, i.e., data representing continuously observed functions, as well. However…

    Submitted 22 December, 2020; originally announced December 2020.

    Comments: 29 pages, 11 figures

  7. arXiv:2006.15442  [pdf, other]

    stat.ML cs.LG stat.CO

    A General Machine Learning Framework for Survival Analysis

    Authors: Andreas Bender, David Rügamer, Fabian Scheipl, Bernd Bischl

    Abstract: The modeling of time-to-event data, also known as survival analysis, requires specialized methods that can deal with censoring and truncation, time-varying features and effects, and that extend to settings with multiple competing events. However, many machine learning methods for survival analysis only consider the standard setting with right-censored data and proportional hazards assumption. The…

    Submitted 17 April, 2021; v1 submitted 27 June, 2020; originally announced June 2020.

  8. arXiv:1911.07511  [pdf, other]

    stat.ML cs.LG

    Benchmarking time series classification -- Functional data vs machine learning approaches

    Authors: Florian Pfisterer, Laura Beggel, Xudong Sun, Fabian Scheipl, Bernd Bischl

    Abstract: Time series classification problems have drawn increasing attention in the machine learning and statistical community. Closely related is the field of functional data analysis (FDA): it refers to the range of problems that deal with the analysis of data that is continuously indexed over some domain. While often employing different methods, both fields strive to answer similar questions, a common e…

    Submitted 24 February, 2021; v1 submitted 18 November, 2019; originally announced November 2019.