
Showing 1–10 of 10 results for author: Ksieniewicz, P

Searching in archive cs.
  1. arXiv:2407.10807  [pdf, other]

    cs.CL cs.LG

    Employing Sentence Space Embedding for Classification of Data Stream from Fake News Domain

    Authors: Paweł Zyblewski, Jakub Klikowski, Weronika Borek-Marciniec, Paweł Ksieniewicz

    Abstract: Tabular data is considered the last unconquered castle of deep learning, yet the task of data stream classification is stated to be an equally important and demanding research area. Due to the temporal constraints, it is assumed that deep learning methods are not the optimal solution for application in this field. However, excluding the entire -- and prevalent -- group of methods seems rather rash…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: 8 pages, 8 figures

  2. arXiv:2406.10255  [pdf, other]

    cs.CL cs.SI

    WarCov -- Large multilabel and multimodal dataset from social platform

    Authors: Weronika Borek-Marciniec, Pawel Zyblewski, Jakub Klikowski, Pawel Ksieniewicz

    Abstract: In the classification tasks, from raw data acquisition to the curation of a dataset suitable for use in evaluating machine learning models, a series of steps - often associated with high costs - are necessary. In the case of Natural Language Processing, initial cleaning and conversion can be performed automatically, but obtaining labels still requires the rationalized input of human experts. As a…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 13 pages, 6 figures

  3. arXiv:2404.07776  [pdf, other]

    cs.LG

    Unsupervised Concept Drift Detection based on Parallel Activations of Neural Network

    Authors: Joanna Komorniczak, Paweł Ksieniewicz

    Abstract: Practical applications of artificial intelligence increasingly often have to deal with the streaming properties of real data, which, considering the time factor, are subject to phenomena such as periodicity and more or less chaotic degeneration - resulting directly in the concept drifts. The modern concept drift detectors almost always assume immediate access to labels, which due to their cost, li…

    Submitted 11 April, 2024; originally announced April 2024.

  4. arXiv:2402.06331  [pdf, other]

    cs.LG cs.CV

    Taking Class Imbalance Into Account in Open Set Recognition Evaluation

    Authors: Joanna Komorniczak, Pawel Ksieniewicz

    Abstract: In recent years Deep Neural Network-based systems are not only increasing in popularity but also receive growing user trust. However, due to the closed-world assumption of such systems, they cannot recognize samples from unknown classes and often induce an incorrect label with high confidence. Presented work looks at the evaluation of methods for Open Set Recognition, focusing on the impact of cla…

    Submitted 9 February, 2024; originally announced February 2024.

  5. torchosr -- a PyTorch extension package for Open Set Recognition models evaluation in Python

    Authors: Joanna Komorniczak, Pawel Ksieniewicz

    Abstract: The article presents the torchosr package - a Python package compatible with PyTorch library - offering tools and methods dedicated to Open Set Recognition in Deep Neural Networks. The package offers two state-of-the-art methods in the field, a set of functions for handling base sets and generation of derived sets for the Open Set Recognition task (where some classes are considered unknown and use…

    Submitted 16 May, 2023; originally announced May 2023.

  6. arXiv:2207.06709  [pdf, ps, other]

    cs.LG

    problexity -- an open-source Python library for binary classification problem complexity assessment

    Authors: Joanna Komorniczak, Pawel Ksieniewicz

    Abstract: The classification problem's complexity assessment is an essential element of many topics in the supervised learning domain. It plays a significant role in meta-learning -- becoming the basis for determining meta-attributes or multi-criteria optimization -- allowing the evaluation of the training set resampling without needing to rebuild the recognition model. The tools currently available for the…

    Submitted 14 July, 2022; originally announced July 2022.

    Comments: 20 pages, 1 figure

  7. arXiv:2206.11867  [pdf, other]

    cs.CL cs.LG

    Lifelong Learning Natural Language Processing Approach for Multilingual Data Classification

    Authors: Jędrzej Kozal, Michał Leś, Paweł Zyblewski, Paweł Ksieniewicz, Michał Woźniak

    Abstract: The abundance of information in digital media, which in today's world is the main source of knowledge about current events for the masses, makes it possible to spread disinformation on a larger scale than ever before. Consequently, there is a need to develop novel fake news detection approaches capable of adapting to changing factual contexts and generalizing previously or concurrently acquired kn…

    Submitted 25 May, 2022; originally announced June 2022.

  8. arXiv:2112.10150  [pdf, ps, other]

    cs.LG

    Active Weighted Aging Ensemble for Drifted Data Stream Classification

    Authors: Michał Woźniak, Paweł Zyblewski, Paweł Ksieniewicz

    Abstract: One of the significant problems of streaming data classification is the occurrence of concept drift, consisting of the change of probabilistic characteristics of the classification task. This phenomenon destabilizes the performance of the classification model and seriously degrades its quality. An appropriate strategy counteracting this phenomenon is required to adapt the classifier to the changin…

    Submitted 19 December, 2021; originally announced December 2021.

    Comments: 29 pages, 3 figures

  9. arXiv:2101.01142  [pdf, other]

    cs.CL cs.SI

    Advanced Machine Learning Techniques for Fake News (Online Disinformation) Detection: A Systematic Mapping Study

    Authors: Michal Choras, Konstantinos Demestichas, Agata Gielczyk, Alvaro Herrero, Pawel Ksieniewicz, Konstantina Remoundou, Daniel Urda, Michal Wozniak

    Abstract: Fake news has now grown into a big problem for societies and also a major challenge for people fighting disinformation. This phenomenon plagues democratic elections, reputations of individual persons or organizations, and has negatively impacted citizens, (e.g., during the COVID-19 pandemic in the US or Brazil). Hence, developing effective tools to fight this phenomenon by employing advanced Machi…

    Submitted 28 December, 2020; originally announced January 2021.

  10. arXiv:2001.11077  [pdf, ps, other]

    cs.LG cs.CV stat.ML

    stream-learn -- open-source Python library for difficult data stream batch analysis

    Authors: Paweł Ksieniewicz, Paweł Zyblewski

    Abstract: stream-learn is a Python package compatible with scikit-learn and developed for the drifting and imbalanced data stream analysis. Its main component is a stream generator, which allows to produce a synthetic data stream that may incorporate each of the three main concept drift types (i.e. sudden, gradual and incremental drift) in their recurring or non-recurring versions. The package allows conduc…

    Submitted 29 January, 2020; originally announced January 2020.