Showing 1–2 of 2 results for author: Manrique-Gómez, L

  1. arXiv:2407.12852  [pdf, other]

    cs.CL cs.AI

    Historical Ink: Semantic Shift Detection for 19th Century Spanish

    Authors: Tony Montes, Laura Manrique-Gómez, Rubén Manrique

    Abstract: This paper explores the evolution of word meanings in 19th-century Spanish texts, with an emphasis on Latin American Spanish, using computational linguistics techniques. It addresses the Semantic Shift Detection (SSD) task, which is crucial for understanding linguistic evolution, particularly in historical contexts. The study focuses on analyzing a set of Spanish target words. To achieve this, a 1…

    Submitted 18 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

    Comments: 13 pages; added a preprint-reference URL

    Report number: montes-etal-2024-historical ACM Class: I.2.7

    Journal ref: ACL, Proceedings of the 5th Workshop on Computational Approaches to Historical Language Change, pages 29-41, 2024

  2. arXiv:2407.12838  [pdf, other]

    cs.CL cs.DL

    Historical Ink: 19th Century Latin American Spanish Newspaper Corpus with LLM OCR Correction

    Authors: Laura Manrique-Gómez, Tony Montes, Rubén Manrique

    Abstract: This paper presents two significant contributions: first, a novel dataset of 19th-century Latin American press texts, which addresses the lack of specialized corpora for historical and linguistic analysis in this region. Second, it introduces a framework for OCR error correction and linguistic surface form detection in digitized corpora, utilizing a Large Language Model. This framework is adaptabl…

    Submitted 3 July, 2024; originally announced July 2024.

    ACM Class: I.2.7