Showing 1–4 of 4 results for author: Jasinska-Kobus, K

  1. arXiv:2407.10992  [pdf, other]

    cs.CL cs.LG

    AlleNoise -- large-scale text classification benchmark dataset with real-world label noise

    Authors: Alicja Rączkowska, Aleksandra Osowska-Kurczab, Jacek Szczerbiński, Kalina Jasinska-Kobus, Klaudia Nazarko

    Abstract: Label noise remains a challenge for training robust classification models. Most methods for mitigating label noise have been benchmarked using primarily datasets with synthetic noise. While the need for datasets with realistic noise distribution has partially been addressed by web-scraped benchmarks such as WebVision and Clothing1M, those benchmarks are restricted to the computer vision domain. Wi…

    Submitted 24 June, 2024; originally announced July 2024.

  2. Propensity-scored Probabilistic Label Trees

    Authors: Marek Wydmuch, Kalina Jasinska-Kobus, Rohit Babbar, Krzysztof Dembczyński

    Abstract: Extreme multi-label classification (XMLC) refers to the task of tagging instances with small subsets of relevant labels coming from an extremely large set of all possible labels. Recently, XMLC has been widely applied to diverse web applications such as automatic content labeling, online advertising, or recommendation systems. In such environments, label distribution is often highly imbalanced, co…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: The extended version of SIGIR '21 Short Research Paper

  3. arXiv:2009.11218  [pdf, ps, other]

    cs.LG stat.ML

    Probabilistic Label Trees for Extreme Multi-label Classification

    Authors: Kalina Jasinska-Kobus, Marek Wydmuch, Krzysztof Dembczynski, Mikhail Kuznetsov, Robert Busa-Fekete

    Abstract: Extreme multi-label classification (XMLC) is a learning task of tagging instances with a small subset of relevant labels chosen from an extremely large pool of possible labels. Problems of this scale can be efficiently handled by organizing labels as a tree, like in hierarchical softmax used for multi-class problems. In this paper, we thoroughly investigate probabilistic label trees (PLTs) which c…

    Submitted 23 September, 2020; originally announced September 2020.

  4. arXiv:2007.04451  [pdf, ps, other]

    cs.LG stat.ML

    Online probabilistic label trees

    Authors: Kalina Jasinska-Kobus, Marek Wydmuch, Devanathan Thiruvenkatachari, Krzysztof Dembczyński

    Abstract: We introduce online probabilistic label trees (OPLTs), an algorithm that trains a label tree classifier in a fully online manner without any prior knowledge about the number of training instances, their features and labels. OPLTs are characterized by low time and space complexity as well as strong theoretical guarantees. They can be used for online multi-label and multi-class classification, inclu…

    Submitted 26 March, 2021; v1 submitted 8 July, 2020; originally announced July 2020.

    Comments: Accepted at AISTATS 2021