Showing 1–5 of 5 results for author: Akhondi, S

Searching in archive cs.
  1. arXiv:2311.15402  [pdf, other]

    cs.CL

    Learning Section Weights for Multi-Label Document Classification

    Authors: Maziar Moradi Fard, Paula Sorrolla Bayod, Kiomars Motarjem, Mohammad Alian Nejadi, Saber Akhondi, Camilo Thorne

    Abstract: Multi-label document classification is a traditional task in NLP. Compared to single-label classification, each document can be assigned multiple classes. This problem is crucially important in various domains, such as tagging scientific articles. Documents are often structured into several sections such as abstract and title. Current approaches treat different sections equally for multi-label cla…

    Submitted 26 November, 2023; originally announced November 2023.

    Comments: 7 pages, 4 figures, 5 tables

  2. arXiv:2311.14633  [pdf, other]

    cs.CV cs.AI

    One Strike, You're Out: Detecting Markush Structures in Low Signal-to-Noise Ratio Images

    Authors: Thomas Jurriaans, Kinga Szarkowska, Eric Nalisnick, Markus Schwoerer, Camilo Thorne, Saber Akhondi

    Abstract: Modern research increasingly relies on automated methods to assist researchers. An example of this is Optical Chemical Structure Recognition (OCSR), which aids chemists in retrieving information about chemicals from large amounts of documents. Markush structures are chemical structures that cannot be parsed correctly by OCSR and cause errors. The focus of this research was to propose and test a no…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: 15 pages, 9 tables, 16 figures

  3. arXiv:2306.13379  [pdf, other]

    cs.CL

    Stress Testing BERT Anaphora Resolution Models for Reaction Extraction in Chemical Patents

    Authors: Chieling Yueh, Evangelos Kanoulas, Bruno Martins, Camilo Thorne, Saber Akhondi

    Abstract: The high volume of published chemical patents and the importance of a timely acquisition of their information gives rise to automating information extraction from chemical patents. Anaphora resolution is an important component of comprehensive information extraction, and is critical for extracting reactions. In chemical patents, there are five anaphoric relations of interest: co-reference, transfo…

    Submitted 23 June, 2023; originally announced June 2023.

  4. arXiv:2010.12912  [pdf, other]

    cs.CL cs.LG

    Word Embeddings for Chemical Patent Natural Language Processing

    Authors: Camilo Thorne, Saber Akhondi

    Abstract: We evaluate chemical patent word embeddings against known biomedical embeddings and show that they outperform the latter extrinsically and intrinsically. We also show that using contextualized embeddings can induce predictive models of reasonable performance for this domain over a relatively small gold standard.

    Submitted 24 October, 2020; originally announced October 2020.

    Comments: Extended version of an extended abstract presented (and reviewed) at the Latinx Workshop at ICML 2020

    MSC Class: 68T50 ACM Class: I.2.7; J.3; C.4

  5. arXiv:1907.02679  [pdf, other]

    cs.CL

    Improving Chemical Named Entity Recognition in Patents with Contextualized Word Embeddings

    Authors: Zenan Zhai, Dat Quoc Nguyen, Saber A. Akhondi, Camilo Thorne, Christian Druckenbrodt, Trevor Cohn, Michelle Gregory, Karin Verspoor

    Abstract: Chemical patents are an important resource for chemical information. However, few chemical Named Entity Recognition (NER) systems have been evaluated on patent documents, due in part to their structural and linguistic complexity. In this paper, we explore the NER performance of a BiLSTM-CRF model utilising pre-trained word embeddings, character-level word representations and contextualized ELMo wo…

    Submitted 5 July, 2019; originally announced July 2019.