Showing 1–21 of 21 results for author: Plant, C

  1. arXiv:2408.00056  [pdf, other]

    cs.LG cs.IR physics.chem-ph

    Temporal Subspace Clustering for Molecular Dynamics Data

    Authors: Anna Beer, Martin Heinrigs, Claudia Plant, Ira Assent

    Abstract: We introduce MOSCITO (MOlecular Dynamics Subspace Clustering with Temporal Observance), a subspace clustering for molecular dynamics data. MOSCITO groups those timesteps of a molecular dynamics trajectory together into clusters in which the molecule has similar conformations. In contrast to state-of-the-art methods, MOSCITO takes advantage of sequential relationships found in time series data. Unl…

    Submitted 31 July, 2024; originally announced August 2024.

    Comments: Accepted as a research paper at BIOKDD 2024

    ACM Class: I.5.3; H.3.3; J.2

  2. arXiv:2406.18589  [pdf, other]

    cs.CV cs.LG

    Text-Guided Alternative Image Clustering

    Authors: Andreas Stephan, Lukas Miklautz, Collin Leiber, Pedro Henrique Luz de Araujo, Dominik Répás, Claudia Plant, Benjamin Roth

    Abstract: Traditional image clustering techniques only find a single grouping within visual data. In particular, they do not provide a possibility to explicitly define multiple types of clustering. This work explores the potential of large vision-language models to facilitate alternative image clustering. We propose Text-Guided Alternative Image Consensus Clustering (TGAICC), a novel approach that leverages…

    Submitted 7 June, 2024; originally announced June 2024.

  3. arXiv:2406.03614  [pdf]

    cs.LG cs.CL q-fin.RM

    Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs

    Authors: Alexander Bakumenko, Kateřina Hlaváčková-Schindler, Claudia Plant, Nina C. Hubig

    Abstract: Detecting anomalies in general ledger data is of utmost importance to ensure trustworthiness of financial records. Financial audits increasingly rely on machine learning (ML) algorithms to identify irregular or potentially fraudulent journal entries, each characterized by a varying number of transactions. In machine learning, heterogeneity in feature dimensions adds significant complexity to data…

    Submitted 5 June, 2024; originally announced June 2024.

  4. arXiv:2403.09171  [pdf, other]

    cs.LG cs.AI

    ADEdgeDrop: Adversarial Edge Dropping for Robust Graph Neural Networks

    Authors: Zhaoliang Chen, Zhihao Wu, Ylli Sadikaj, Claudia Plant, Hong-Ning Dai, Shiping Wang, Yiu-Ming Cheung, Wenzhong Guo

    Abstract: Although Graph Neural Networks (GNNs) have exhibited the powerful ability to gather graph-structured information from neighborhood nodes via various message-passing mechanisms, the performance of GNNs is limited by poor generalization and fragile robustness caused by noisy and redundant graph data. As a prominent solution, Graph Augmentation Learning (GAL) has recently received increasing attentio…

    Submitted 14 August, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

  5. arXiv:2402.02996  [pdf, other]

    cs.LG cs.CV

    Text-Guided Image Clustering

    Authors: Andreas Stephan, Lukas Miklautz, Kevin Sidak, Jan Philip Wahle, Bela Gipp, Claudia Plant, Benjamin Roth

    Abstract: Image clustering divides a collection of images into meaningful groups, typically interpreted post-hoc via human-given annotations. Those are usually in the form of text, begging the question of using text as an abstraction for image clustering. Current image clustering methods, however, neglect the use of generated textual descriptions. We, therefore, propose Text-Guided Image Clustering, i.e., g…

    Submitted 19 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

    Comments: Accepted to EACL 2024

  6. Extension of the Dip-test Repertoire -- Efficient and Differentiable p-value Calculation for Clustering

    Authors: Lena G. M. Bauer, Collin Leiber, Christian Böhm, Claudia Plant

    Abstract: Over the last decade, the Dip-test of unimodality has gained increasing interest in the data mining community as it is a parameter-free statistical test that reliably rates the modality in one-dimensional samples. It returns a so-called Dip-value and a corresponding probability for the sample's unimodality (Dip-p-value). These two values share a sigmoidal relationship. However, the specific transf…

    Submitted 19 December, 2023; originally announced December 2023.

    Journal ref: Proceedings of the 2023 SIAM International Conference on Data Mining (SDM) (pp. 109-117). Society for Industrial and Applied Mathematics

  7. Automatic Parameter Selection for Non-Redundant Clustering

    Authors: Collin Leiber, Dominik Mautz, Claudia Plant, Christian Böhm

    Abstract: High-dimensional datasets often contain multiple meaningful clusterings in different subspaces. For example, objects can be clustered either by color, weight, or size, revealing different interpretations of the given dataset. A variety of approaches are able to identify such non-redundant clusterings. However, most of these methods require the user to specify the expected number of subspaces and c…

    Submitted 19 December, 2023; originally announced December 2023.

    Journal ref: Proceedings of the 2022 SIAM International Conference on Data Mining (SDM) (pp. 226-234). Society for Industrial and Applied Mathematics

  8. Spectral Clustering of Attributed Multi-relational Graphs

    Authors: Ylli Sadikaj, Yllka Velaj, Sahar Behzadi, Claudia Plant

    Abstract: Graph clustering aims at discovering a natural grouping of the nodes such that similar nodes are assigned to a common cluster. Many different algorithms have been proposed in the literature: for simple graphs, for graphs with attributes associated to nodes, and for graphs where edges represent different types of relations among nodes. However, complex data in many domains can be represented as bot…

    Submitted 3 November, 2023; originally announced November 2023.

    Journal ref: Association for Computing Machinery, Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 21, Virtual Event, Singapore, August 2021, Pages 1431-1440

  9. arXiv:2304.07014  [pdf, other]

    cs.LG cs.AI

    AGNN: Alternating Graph-Regularized Neural Networks to Alleviate Over-Smoothing

    Authors: Zhaoliang Chen, Zhihao Wu, Zhenghong Lin, Shiping Wang, Claudia Plant, Wenzhong Guo

    Abstract: Graph Convolutional Network (GCN) with the powerful capacity to explore graph-structural data has gained noticeable success in recent years. Nonetheless, most of the existing GCN-based models suffer from the notorious over-smoothing issue, owing to which shallow networks are extensively adopted. This may be problematic for complex graph datasets because a deeper GCN should be beneficial to propaga…

    Submitted 14 April, 2023; originally announced April 2023.

  10. arXiv:2304.06336  [pdf, other]

    cs.LG cs.AI

    Attributed Multi-order Graph Convolutional Network for Heterogeneous Graphs

    Authors: Zhaoliang Chen, Zhihao Wu, Luying Zhong, Claudia Plant, Shiping Wang, Wenzhong Guo

    Abstract: Heterogeneous graph neural networks aim to discover discriminative node embeddings and relations from multi-relational networks. One challenge of heterogeneous graph learning is the design of learnable meta-paths, which significantly influences the quality of learned embeddings. Thus, in this paper, we propose an Attributed Multi-Order Graph Convolutional Network (AMOGCN), which automatically studie…

    Submitted 18 April, 2023; v1 submitted 13 April, 2023; originally announced April 2023.

  11. arXiv:2212.06642  [pdf, other]

    cs.LG physics.ao-ph

    AWT -- Clustering Meteorological Time Series Using an Aggregated Wavelet Tree

    Authors: Christina Pacher, Irene Schicker, Rosmarie deWit, Katerina Hlavackova-Schindler, Claudia Plant

    Abstract: Both clustering and outlier detection play an important role for meteorological measurements. We present the AWT algorithm, a clustering algorithm for time series data that also performs implicit outlier detection during the clustering. AWT integrates ideas of several well-known K-Means clustering algorithms. It chooses the number of clusters automatically based on a user-defined threshold paramet…

    Submitted 13 December, 2022; originally announced December 2022.

    Comments: 11 pages; Extended version of the paper published at IEEE DSAA 2022

  12. Multi-view Graph Convolutional Networks with Differentiable Node Selection

    Authors: Zhaoliang Chen, Lele Fu, Shunxin Xiao, Shiping Wang, Claudia Plant, Wenzhong Guo

    Abstract: Multi-view data containing complementary and consensus information can facilitate representation learning by exploiting the intact integration of multi-view features. Because most objects in real world often have underlying connections, organizing multi-view data as heterogeneous graphs is beneficial to extracting latent information among different objects. Due to the powerful capability to gather…

    Submitted 13 August, 2023; v1 submitted 9 December, 2022; originally announced December 2022.

    Journal ref: ACM Transactions on Knowledge Discovery from Data 18, 1, Article 6 (January 2024), 21 pages

  13. arXiv:2211.09155  [pdf, other]

    cs.CV cs.AI cs.LG

    Learnable Graph Convolutional Network and Feature Fusion for Multi-view Learning

    Authors: Zhaoliang Chen, Lele Fu, Jie Yao, Wenzhong Guo, Claudia Plant, Shiping Wang

    Abstract: In practical applications, multi-view data depicting objectives from assorted perspectives can facilitate the accuracy increase of learning algorithms. However, given multi-view data, there is limited work for learning discriminative node relationships and graph information simultaneously via graph convolutional network that has drawn the attention from considerable researchers in recent years. Mo…

    Submitted 16 November, 2022; originally announced November 2022.

  14. Deep Clustering With Consensus Representations

    Authors: Lukas Miklautz, Martin Teuffenbach, Pascal Weber, Rona Perjuci, Walid Durani, Christian Böhm, Claudia Plant

    Abstract: The field of deep clustering combines deep learning and clustering to learn representations that improve both the learned representation and the performance of the considered clustering method. Most existing deep clustering methods are designed for a single clustering method, e.g., k-means, spectral clustering, or Gaussian mixture models, but it is well known that no clustering algorithm works bes…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted by the IEEE International Conference on Data Mining (ICDM) 2022

  15. arXiv:2206.06714  [pdf, other]

    cs.CV

    Interpretable Gait Recognition by Granger Causality

    Authors: Michal Balazia, Katerina Hlavackova-Schindler, Petr Sojka, Claudia Plant

    Abstract: Which joint interactions in the human gait cycle can be used as biometric characteristics? Most current methods on gait recognition suffer from the lack of interpretability. We propose an interpretable feature representation of gait sequences by the graphical Granger causal inference. Gait sequence of a person in the standardized motion capture format, constituting a set of 3D joint spatial trajec…

    Submitted 7 December, 2022; v1 submitted 14 June, 2022; originally announced June 2022.

    Comments: Preprint. Full paper accepted at the IEEE/IAPR International Conference on Pattern Recognition (ICPR), Montreal, Canada, August 2022. 7 pages

    MSC Class: 68T05; 68T10 ACM Class: I.5

  16. arXiv:2206.06124  [pdf, other]

    cs.LG math.ST

    Causal Discovery in Hawkes Processes by Minimum Description Length

    Authors: Amirkasra Jalaldoust, Katerina Hlavackova-Schindler, Claudia Plant

    Abstract: Hawkes processes are a special class of temporal point processes which exhibit a natural notion of causality, as occurrence of events in the past may increase the probability of events in the future. Discovery of the underlying influence network among the dimensions of multi-dimensional temporal processes is of high importance in disciplines where a high-frequency data is to model, e.g. in financi…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: 10 pages, 3 figures; Will be published in Proceedings of the 36th AAAI Conference

  17. arXiv:2112.04845  [pdf, other]

    cs.DC

    High performance computing on Android devices -- a case study

    Authors: Robert Fritze, Claudia Plant

    Abstract: High performance computing for low power devices can be useful to speed up calculations on processors that use a lower clock rate than computers for which energy efficiency is not an issue. In this trial, different high performance techniques for Android devices have been compared, with a special focus on the use of the GPU. Although not officially supported, the OpenCL framework can be used on An…

    Submitted 9 December, 2021; originally announced December 2021.

    ACM Class: C.1.4; D.1.3

  18. arXiv:2112.04800  [pdf, other]

    cs.DC cs.LG

    GPU backed Data Mining on Android Devices

    Authors: Robert Fritze, Claudia Plant

    Abstract: Choosing an appropriate programming paradigm for high-performance computing on low-power devices can be useful to speed up calculations. Many Android devices have an integrated GPU and - although not officially supported - the OpenCL framework can be used on Android devices for addressing these GPUs. OpenCL supports thread and data parallelism. Applications that use the GPU must account for the fa…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: 11 pages

    ACM Class: D.1.3; C.1.4

  19. arXiv:2104.13323  [pdf, other]

    cs.LG

    Network Embedding via Deep Prediction Model

    Authors: Xin Sun, Zenghui Song, Yongbo Yu, Junyu Dong, Claudia Plant, Christian Boehm

    Abstract: Network-structured data becomes ubiquitous in daily life and is growing at a rapid pace. It presents great challenges to feature engineering due to the high non-linearity and sparsity of the data. The local and global structure of the real-world networks can be reflected by dynamical transfer behaviors among nodes. This paper proposes a network embedding framework to capture the transfer behaviors…

    Submitted 27 April, 2021; originally announced April 2021.

  20. arXiv:2011.03479  [pdf, other]

    cs.LG cs.DC

    Massively Parallel Graph Drawing and Representation Learning

    Authors: Christian Böhm, Claudia Plant

    Abstract: To fully exploit the performance potential of modern multi-core processors, machine learning and data mining algorithms for big data must be parallelized in multiple ways. Today's CPUs consist of multiple cores, each following an independent thread of control, and each equipped with multiple arithmetic units which can perform the same operation on a vector of multiple data objects. Graph embedding…

    Submitted 6 November, 2020; originally announced November 2020.

    Journal ref: IEEE BigData 2020

  21. arXiv:2003.11079  [pdf, other]

    cs.LG cs.HC cs.SI stat.ML

    Incorporating User's Preference into Attributed Graph Clustering

    Authors: Wei Ye, Dominik Mautz, Christian Boehm, Ambuj Singh, Claudia Plant

    Abstract: Graph clustering has been studied extensively on both plain graphs and attributed graphs. However, all these methods need to partition the whole graph to find cluster structures. Sometimes, based on domain knowledge, people may have information about a specific target region in the graph and only want to find a single cluster concentrated on this local region. Such a task is called local clusterin…

    Submitted 24 March, 2020; originally announced March 2020.