Showing 1–7 of 7 results for author: Toselli, A

Searching in archive cs.
  1. What distinguishes conspiracy from critical narratives? A computational analysis of oppositional discourse

    Authors: Damir Korenčić, Berta Chulvi, Xavier Bonet Casals, Alejandro Toselli, Mariona Taulé, Paolo Rosso

    Abstract: The current prevalence of conspiracy theories on the internet is a significant issue, tackled by many computational approaches. However, these approaches fail to recognize the relevance of distinguishing between texts which contain a conspiracy theory and texts which are simply critical and oppose mainstream narratives. Furthermore, little attention is usually paid to the role of inter-group confl…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: submitted to the Expert Systems journal

    ACM Class: I.2.7; J.4

  2. End-to-End Page-Level Assessment of Handwritten Text Recognition

    Authors: Enrique Vidal, Alejandro H. Toselli, Antonio Ríos-Vila, Jorge Calvo-Zaragoza

    Abstract: The evaluation of Handwritten Text Recognition (HTR) systems has traditionally used metrics based on the edit distance between HTR and ground truth (GT) transcripts, at both the character and word levels. This is very adequate when the experimental protocol assumes that both GT and HTR text lines are the same, which allows edit distances to be independently computed to each given line. Driven by r…

    Submitted 21 May, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

    Comments: Published in Pattern Recognition

    ACM Class: I.5.4

  3. arXiv:2212.02352

    cs.CL

    Fake News and Hate Speech: Language in Common

    Authors: Berta Chulvi, Alejandro Toselli, Paolo Rosso

    Abstract: In this paper we raise the research question of whether fake news and hate speech spreaders share common patterns in language. We compute a novel index, the ingroup vs outgroup index, in three different datasets and we show that both phenomena share an "us vs them" narrative.

    Submitted 5 December, 2022; originally announced December 2022.

    Comments: 2 pages

  4. arXiv:2206.13342

    cs.CV cs.CL cs.IR cs.LG

    Open Set Classification of Untranscribed Handwritten Documents

    Authors: José Ramón Prieto, Juan José Flores, Enrique Vidal, Alejandro H. Toselli, David Garrido, Carlos Alonso

    Abstract: Huge amounts of digital page images of important manuscripts are preserved in archives worldwide. The amounts are so large that it is generally unfeasible for archivists to adequately tag most of the documents with the required metadata so as to allow proper organization of the archives and effective exploration by scholars and the general public. The class or "typology" of a document is perhaps t…

    Submitted 20 June, 2022; originally announced June 2022.

  5. Digital Editions as Distant Supervision for Layout Analysis of Printed Books

    Authors: Alejandro H. Toselli, Si Wu, David A. Smith

    Abstract: Archivists, textual scholars, and historians often produce digital editions of historical documents. Using markup schemes such as those of the Text Encoding Initiative and EpiDoc, these digital editions often record documents' semantic regions (such as notes and figures) and physical features (such as page and line breaks) as well as transcribing their textual content. We describe methods for expl…

    Submitted 23 December, 2021; originally announced December 2021.

    Comments: 15 pages, 2 figures. International Conference on Document Analysis and Recognition. Springer, Cham, 2021

  6. arXiv:2106.08499

    cs.CV cs.AI cs.LG

    ICDAR 2021 Competition on Components Segmentation Task of Document Photos

    Authors: Celso A. M. Lopes Junior, Ricardo B. das Neves Junior, Byron L. D. Bezerra, Alejandro H. Toselli, Donato Impedovo

    Abstract: This paper describes the short-term competition on the Components Segmentation Task of Document Photos that was prepared in the context of the 16th International Conference on Document Analysis and Recognition (ICDAR 2021). This competition aims to bring together researchers working in the field of identification document image processing and provides them a suitable benchmark to compare their tec…

    Submitted 8 July, 2021; v1 submitted 15 June, 2021; originally announced June 2021.

    Comments: 15 pages; 5 figures; Accepted at ICDAR 2021: 16th International Conference on Document Analysis and Recognition

  7. arXiv:2104.04556

    cs.IR

    A Probabilistic Framework for Lexicon-based Keyword Spotting in Handwritten Text Images

    Authors: E. Vidal, A. H. Toselli, J. Puigcerver

    Abstract: Query by String Keyword Spotting (KWS) is here considered as a key technology for indexing large collections of handwritten text images to allow fast textual access to the contents of these collections. Under this perspective, a probabilistic framework for lexicon-based KWS in text images is presented. The presentation aims at providing a tutorial view that helps to understand the relations betwee…

    Submitted 9 April, 2021; originally announced April 2021.

    Comments: 42 pages, 35 headers, 16 figures/tables

    Report number: Tech. rep., UPV (2017)