Showing 1–50 of 63 results for author: Demeester, T

  1. arXiv:2408.06266

    cs.LG cs.AI cs.CL

    Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment

    Authors: Karel D'Oosterlinck, Winnie Xu, Chris Develder, Thomas Demeester, Amanpreet Singh, Christopher Potts, Douwe Kiela, Shikib Mehri

    Abstract: Large Language Models (LLMs) are often aligned using contrastive alignment objectives and preference pair datasets. The interaction between model, paired data, and objective makes alignment a complicated procedure, sometimes producing subpar results. We study this and find that (i) preference data gives a better learning signal when the underlying responses are contrastive, and (ii) alignment obje…

    Submitted 29 August, 2024; v1 submitted 12 August, 2024; originally announced August 2024.

  2. arXiv:2408.04303

    cs.CL cs.LG

    Trans-Tokenization and Cross-lingual Vocabulary Transfers: Language Adaptation of LLMs for Low-Resource NLP

    Authors: François Remy, Pieter Delobelle, Hayastan Avetisyan, Alfiya Khabibullina, Miryam de Lhoneux, Thomas Demeester

    Abstract: The development of monolingual language models for low- and mid-resource languages continues to be hindered by the difficulty in sourcing high-quality training data. In this study, we present a novel cross-lingual vocabulary transfer strategy, trans-tokenization, designed to tackle this challenge and enable more efficient language adaptation. Our approach focuses on adapting a high-resource monolin…

    Submitted 8 August, 2024; originally announced August 2024.

    Comments: Accepted at COLM 2024

  3. arXiv:2403.09481

    cs.AI

    Clinical Reasoning over Tabular Data and Text with Bayesian Networks

    Authors: Paloma Rabaey, Johannes Deleu, Stefan Heytens, Thomas Demeester

    Abstract: Bayesian networks are well-suited for clinical reasoning on tabular data, but are less compatible with natural language data, for which neural networks provide a successful framework. This paper compares and discusses strategies to augment Bayesian networks with neural text representations, in both a generative and a discriminative manner. This is illustrated with simulation results for a primary ca…

    Submitted 23 May, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: AI in Medicine 2024

  4. arXiv:2401.12652

    cs.CE q-fin.CP

    From Numbers to Words: Multi-Modal Bankruptcy Prediction Using the ECL Dataset

    Authors: Henri Arno, Klaas Mulier, Joke Baeck, Thomas Demeester

    Abstract: In this paper, we present ECL, a novel multi-modal dataset containing the textual and numerical data from corporate 10K filings and associated binary bankruptcy labels. Furthermore, we develop and critically evaluate several classical and neural bankruptcy prediction models using this dataset. Our findings suggest that the information contained in each data modality is complementary for bankruptcy…

    Submitted 23 January, 2024; originally announced January 2024.

    Comments: Presented at the 6th Workshop on Financial Technology and Natural Language Processing (FinNLP) @ IJCNLP-AACL 2023 in Bali, Indonesia

  5. arXiv:2401.12178

    cs.CL cs.AI

    In-Context Learning for Extreme Multi-Label Classification

    Authors: Karel D'Oosterlinck, Omar Khattab, François Remy, Thomas Demeester, Chris Develder, Christopher Potts

    Abstract: Multi-label classification problems with thousands of classes are hard to solve with in-context learning alone, as language models (LMs) might lack prior knowledge about the precise classes or how to assign them, and it is generally infeasible to demonstrate every class in a prompt. We propose a general program, Infer–Retrieve–Rank, that defines multi-step interactions between LMs and…

    Submitted 22 January, 2024; originally announced January 2024.

  6. arXiv:2312.07837

    cs.LG stat.ML

    The Real Deal Behind the Artificial Appeal: Inferential Utility of Tabular Synthetic Data

    Authors: Alexander Decruyenaere, Heidelinde Dehaene, Paloma Rabaey, Christiaan Polet, Johan Decruyenaere, Stijn Vansteelandt, Thomas Demeester

    Abstract: Recent advances in generative models facilitate the creation of synthetic data to be made available for research in privacy-sensitive contexts. However, the analysis of synthetic data raises a unique set of methodological challenges. In this work, we highlight the importance of inferential utility and provide empirical evidence against naive inference from synthetic data, whereby synthetic data ar…

    Submitted 12 June, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

    Comments: Accepted for the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024), *joint first authors

  7. arXiv:2311.18434

    cs.LG cond-mat.dis-nn

    Exploring the Temperature-Dependent Phase Transition in Modern Hopfield Networks

    Authors: Felix Koulischer, Cédric Goemaere, Tom van der Meersch, Johannes Deleu, Thomas Demeester

    Abstract: The recent discovery of a connection between Transformers and Modern Hopfield Networks (MHNs) has reignited the study of neural networks from a physical energy-based perspective. This paper focuses on the pivotal effect of the inverse temperature hyperparameter β on the distribution of energy minima of the MHN. To achieve this, the distribution of energy minima is tracked in a simplified MHN in…

    Submitted 30 November, 2023; originally announced November 2023.

    Comments: Accepted as poster for Associative Memory and Hopfield Networks workshop at NeurIPS23

  8. arXiv:2311.16075

    cs.CL cs.AI cs.IR

    BioLORD-2023: Semantic Textual Representations Fusing LLM and Clinical Knowledge Graph Insights

    Authors: François Remy, Kris Demuynck, Thomas Demeester

    Abstract: In this study, we investigate the potential of Large Language Models to complement biomedical knowledge graphs in the training of semantic models for the biomedical and clinical domains. Drawing on the wealth of the UMLS knowledge graph and harnessing cutting-edge Large Language Models, we propose a new state-of-the-art approach for obtaining high-fidelity representations of biomedical concepts an…

    Submitted 27 November, 2023; originally announced November 2023.

    Comments: Preprint of upcoming journal article

  9. arXiv:2311.15673

    cs.LG cs.NE

    Accelerating Hopfield Network Dynamics: Beyond Synchronous Updates and Forward Euler

    Authors: Cédric Goemaere, Johannes Deleu, Thomas Demeester

    Abstract: The Hopfield network serves as a fundamental energy-based model in machine learning, capturing memory retrieval dynamics through an ordinary differential equation (ODE). The model's output, the equilibrium point of the ODE, is traditionally computed via synchronous updates using the forward Euler method. This paper aims to overcome some of the disadvantages of this approach. We propose a conceptua…

    Submitted 21 August, 2024; v1 submitted 27 November, 2023; originally announced November 2023.

    Comments: Accepted at the ML-DE Workshop at ECAI 2024

  10. arXiv:2311.15047

    cs.LG cs.NE

    Training a Hopfield Variational Autoencoder with Equilibrium Propagation

    Authors: Tom Van Der Meersch, Johannes Deleu, Thomas Demeester

    Abstract: On dedicated analog hardware, equilibrium propagation is an energy-efficient alternative to backpropagation. In spite of its theoretical guarantees, its application in the AI domain remains limited to the discriminative setting. Meanwhile, despite its high computational demands, generative AI is on the rise. In this paper, we demonstrate the application of Equilibrium Propagation in training a var…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: Associative Memory & Hopfield Networks in 2023 (NeurIPS 2023 workshop)

  11. arXiv:2311.10905

    cs.CL cs.AI

    Flexible Model Interpretability through Natural Language Model Editing

    Authors: Karel D'Oosterlinck, Thomas Demeester, Chris Develder, Christopher Potts

    Abstract: Model interpretability and model editing are crucial goals in the age of large language models. Interestingly, there exists a link between these two goals: if a method is able to systematically edit model behavior with regard to a human concept of interest, this editor method can help make internal representations more interpretable by pointing towards relevant representations and systematically m…

    Submitted 17 November, 2023; originally announced November 2023.

    Comments: Extended Abstract -- work in progress. BlackboxNLP2023

  12. arXiv:2311.06549

    cs.CL

    Zero-Shot Cross-Lingual Sentiment Classification under Distribution Shift: an Exploratory Study

    Authors: Maarten De Raedt, Semere Kiros Bitew, Fréderic Godin, Thomas Demeester, Chris Develder

    Abstract: The brittleness of finetuned language model performance on out-of-distribution (OOD) test samples in unseen domains has been well-studied for English, yet is unexplored for multi-lingual models. Therefore, we study generalization to OOD test data specifically in zero-shot cross-lingual transfer settings, analyzing performance impacts of both language and domain shifts between train and test data.…

    Submitted 11 November, 2023; originally announced November 2023.

    Comments: The 3rd Workshop on Multilingual Representation Learning (MRL@EMNLP2023)

  13. arXiv:2310.15636

    cs.CL cs.AI

    Career Path Prediction using Resume Representation Learning and Skill-based Matching

    Authors: Jens-Joris Decorte, Jeroen Van Hautte, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: The impact of person-job fit on job satisfaction and performance is widely acknowledged, which highlights the importance of providing workers with next steps at the right time in their career. This task of predicting the next step in a career is known as career path prediction, and has diverse applications such as turnover prevention and internal job mobility. Existing methods to career path predi…

    Submitted 24 October, 2023; originally announced October 2023.

    Comments: Accepted to the 3rd Workshop on Recommender Systems for Human Resources (RecSys in HR 2023) as part of RecSys 2023

  14. EmoTwiCS: A Corpus for Modelling Emotion Trajectories in Dutch Customer Service Dialogues on Twitter

    Authors: Sofie Labat, Thomas Demeester, Véronique Hoste

    Abstract: Due to the rise of user-generated content, social media is increasingly adopted as a channel to deliver customer service. Given the public character of these online platforms, the automatic detection of emotions forms an important application in monitoring customer satisfaction and preventing negative word-of-mouth. This paper introduces EmoTwiCS, a corpus of 9,489 Dutch customer service dialogues…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Preprint to Language Resources and Evaluation Journal

  15. arXiv:2310.06165

    cs.CL cs.AI

    CAW-coref: Conjunction-Aware Word-level Coreference Resolution

    Authors: Karel D'Oosterlinck, Semere Kiros Bitew, Brandon Papineau, Christopher Potts, Thomas Demeester, Chris Develder

    Abstract: State-of-the-art coreference resolution systems depend on multiple LLM calls per document and are thus prohibitively expensive for many use cases (e.g., information extraction with large corpora). The leading word-level coreference system (WL-coref) attains 96.6% of these SOTA systems' performance while being much more efficient. In this work, we identify a routine yet important failure case of W…

    Submitted 19 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: Accepted at CRAC 2023

  16. arXiv:2310.03477

    cs.CL cs.AI

    Tik-to-Tok: Translating Language Models One Token at a Time: An Embedding Initialization Strategy for Efficient Language Adaptation

    Authors: François Remy, Pieter Delobelle, Bettina Berendt, Kris Demuynck, Thomas Demeester

    Abstract: Training monolingual language models for low- and mid-resource languages is made challenging by limited and often inadequate pretraining data. In this study, we propose a novel model conversion strategy to address this issue, adapting high-resource monolingual language models to a new target language. By generalizing over a word translation dictionary encompassing both the source and target langua…

    Submitted 5 October, 2023; originally announced October 2023.

    Comments: As first reviewed at TACL

  17. arXiv:2307.16338

    cs.CL

    Distractor generation for multiple-choice questions with predictive prompting and large language models

    Authors: Semere Kiros Bitew, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: Large Language Models (LLMs) such as ChatGPT have demonstrated remarkable performance across various tasks and have garnered significant attention from both researchers and practitioners. However, in an educational context, we still observe a performance gap in generating distractors -- i.e., plausible yet incorrect answers -- with LLMs for multiple-choice questions (MCQs). In this study, we propo…

    Submitted 30 July, 2023; originally announced July 2023.

    Comments: 16 pages, Accepted at the 1st International Tutorial and Workshop on Responsible Knowledge Discovery in Education

  18. arXiv:2307.10778

    cs.CL

    Extreme Multi-Label Skill Extraction Training using Large Language Models

    Authors: Jens-Joris Decorte, Severine Verlinden, Jeroen Van Hautte, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: Online job ads serve as a valuable source of information for skill requirements, playing a crucial role in labor market analysis and e-recruitment processes. Since such ads are typically formatted in free text, natural language processing (NLP) technologies are required to automatically process them. We specifically focus on the task of detecting skills (mentioned literally, or implicitly describe…

    Submitted 20 July, 2023; originally announced July 2023.

    Comments: Accepted to the International workshop on AI for Human Resources and Public Employment Services (AI4HR&PES) as part of ECML-PKDD 2023

  19. arXiv:2306.01584

    cs.CL

    Learning from Partially Annotated Data: Example-aware Creation of Gap-filling Exercises for Language Learning

    Authors: Semere Kiros Bitew, Johannes Deleu, A. Seza Doğruöz, Chris Develder, Thomas Demeester

    Abstract: Since performing exercises (including, e.g., practice tests) forms a crucial component of learning, and creating such exercises requires non-trivial effort from the teacher, there is great value in automatic exercise generation in digital tools in education. In this paper, we particularly focus on automatic creation of gap-filling exercises for language learning, specifically grammar exercises. S…

    Submitted 15 June, 2023; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 12 pages, Accepted in the 18th Workshop on Innovative Use of NLP for Building Educational Applications

  20. arXiv:2306.00665

    cs.CL

    Automatic Glossary of Clinical Terminology: a Large-Scale Dictionary of Biomedical Definitions Generated from Ontological Knowledge

    Authors: François Remy, Thomas Demeester

    Abstract: Background: More than 400,000 biomedical concepts and some of their relationships are contained in SnomedCT, a comprehensive biomedical ontology. However, their concept names are not always readily interpretable by non-experts, or patients looking at their own electronic health records (EHR). Clear definitions or descriptions in understandable language are often not available. Therefore, generatin…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: Accepted at the BioNLP 2023 workshop

  21. arXiv:2305.19783

    cs.CL

    IDAS: Intent Discovery with Abstractive Summarization

    Authors: Maarten De Raedt, Fréderic Godin, Thomas Demeester, Chris Develder

    Abstract: Intent discovery is the task of inferring latent intents from a set of unlabeled utterances, and is a useful step towards the efficient creation of new conversational agents. We show that recent competitive methods in intent discovery can be outperformed by clustering utterances based on abstractive summaries, i.e., "labels", that retain the core elements while removing non-essential information.…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: The 5th Workshop on NLP for Conversational AI (NLP4ConvAI@ACL)

  22. arXiv:2305.13395

    cs.CL

    BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for Real-World Pharmacovigilance

    Authors: Karel D'Oosterlinck, François Remy, Johannes Deleu, Thomas Demeester, Chris Develder, Klim Zaporojets, Aneiss Ghodsi, Simon Ellershaw, Jack Collins, Christopher Potts

    Abstract: Timely and accurate extraction of Adverse Drug Events (ADE) from biomedical literature is paramount for public safety, but involves slow and costly manual labor. We set out to improve drug safety monitoring (pharmacovigilance, PV) through the use of Natural Language Processing (NLP). We introduce BioDEX, a large-scale resource for Biomedical adverse Drug Event Extraction, rooted in the historical…

    Submitted 20 October, 2023; v1 submitted 22 May, 2023; originally announced May 2023.

    Comments: 28 pages. EMNLP Findings 2023

  23. arXiv:2305.06801

    cs.CL

    Detecting Idiomatic Multiword Expressions in Clinical Terminology using Definition-Based Representation Learning

    Authors: François Remy, Alfiya Khabibullina, Thomas Demeester

    Abstract: This paper shines a light on the potential of definition-based semantic models for detecting idiomatic and semi-idiomatic multiword expressions (MWEs) in clinical terminology. Our study focuses on biomedical entities defined in the UMLS ontology and aims to help prioritize the translation efforts of these entities. In particular, we develop an effective tool for scoring the idiomaticity of biomedi…

    Submitted 11 May, 2023; originally announced May 2023.

    Comments: Best Paper Award @ MWE 2023

  24. arXiv:2302.02500

    cs.CL

    TempEL: Linking Dynamically Evolving and Newly Emerging Entities

    Authors: Klim Zaporojets, Lucie-Aimee Kaffee, Johannes Deleu, Thomas Demeester, Chris Develder, Isabelle Augenstein

    Abstract: In our continuously evolving world, entities change over time and new, previously non-existing or unknown, entities appear. We study how this evolutionary scenario impacts the performance on a well-established entity linking (EL) task. For that study, we introduce TempEL, an entity linking dataset that consists of time-stratified English Wikipedia snapshots from 2013 to 2022, from which we collect…

    Submitted 5 February, 2023; originally announced February 2023.

  25. arXiv:2211.08243

    cs.LG

    Neural Bayesian Network Understudy

    Authors: Paloma Rabaey, Cedric De Boom, Thomas Demeester

    Abstract: Bayesian Networks may be appealing for clinical decision-making due to their inclusion of causal knowledge, but their practical adoption remains limited as a result of their inability to deal with unstructured data. While neural networks do not have this limitation, they are not interpretable and are inherently unable to deal with causal structure in the input space. Our goal is to build neural ne…

    Submitted 15 November, 2022; originally announced November 2022.

    Comments: 12 pages, submitted to NeurIPS 2022 Workshop on Causal Machine Learning for Real-World Impact (CML4Impact 2022)

  26. Learning to Reuse Distractors to support Multiple Choice Question Generation in Education

    Authors: Semere Kiros Bitew, Amir Hadifar, Lucas Sterckx, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: Multiple choice questions (MCQs) are widely used in digital learning systems, as they allow for automating the assessment process. However, due to the increased digital literacy of students and the advent of social media platforms, MCQ tests are widely shared online, and teachers are continuously challenged to create new questions, which is an expensive and time-consuming task. A particularly sens…

    Submitted 13 December, 2022; v1 submitted 25 October, 2022; originally announced October 2022.

    Comments: 24 pages and 4 figures. Accepted for publication in IEEE Transactions on Learning Technologies

  27. arXiv:2210.11892

    cs.CL cs.IR

    BioLORD: Learning Ontological Representations from Definitions (for Biomedical Concepts and their Textual Descriptions)

    Authors: François Remy, Kris Demuynck, Thomas Demeester

    Abstract: This work introduces BioLORD, a new pre-training strategy for producing meaningful representations for clinical sentences and biomedical concepts. State-of-the-art methodologies operate by maximizing the similarity in representation of names referring to the same concept, and preventing collapse through contrastive learning. However, because biomedical names are not always self-explanatory, it som…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: Accepted in Findings of EMNLP 2022

  28. arXiv:2210.11805

    cs.CL

    Robustifying Sentiment Classification by Maximally Exploiting Few Counterfactuals

    Authors: Maarten De Raedt, Fréderic Godin, Chris Develder, Thomas Demeester

    Abstract: For text classification tasks, finetuned language models perform remarkably well. Yet, they tend to rely on spurious patterns in training data, thus limiting their performance on out-of-distribution (OOD) test data. Among recent models aiming to avoid this spurious pattern problem, adding extra counterfactual samples to the training data has proven to be very effective. Yet, counterfactual data ge…

    Submitted 21 October, 2022; originally announced October 2022.

    Comments: EMNLP 2022

  29. arXiv:2210.06104

    cs.CL

    EduQG: A Multi-format Multiple Choice Dataset for the Educational Domain

    Authors: Amir Hadifar, Semere Kiros Bitew, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: We introduce a high-quality dataset that contains 3,397 samples comprising (i) multiple choice questions, (ii) answers (including distractors), and (iii) their source documents, from the educational domain. Each question is phrased in two forms, normal and close. Correct answers are linked to source documents with sentence-level annotations. Thus, our versatile dataset can be used for both questio…

    Submitted 12 October, 2022; originally announced October 2022.

  30. arXiv:2209.05987

    cs.CL

    Design of Negative Sampling Strategies for Distantly Supervised Skill Extraction

    Authors: Jens-Joris Decorte, Jeroen Van Hautte, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: Skills play a central role in the job market and many human resources (HR) processes. In the wake of other digital experiences, today's online job market has candidates expecting to see the right opportunities based on their skill set. Similarly, enterprises increasingly need to use data to guarantee that the skills within their workforce remain future-proof. However, structured information about…

    Submitted 13 September, 2022; originally announced September 2022.

    Comments: Accepted to the 2nd Workshop on Recommender Systems for Human Resources (RecSys in HR 2022) as part of RecSys 2022

  31. arXiv:2208.11334

    cs.CL q-fin.CP

    Next-Year Bankruptcy Prediction from Textual Data: Benchmark and Baselines

    Authors: Henri Arno, Klaas Mulier, Joke Baeck, Thomas Demeester

    Abstract: Models for bankruptcy prediction are useful in several real-world scenarios, and multiple research contributions have been devoted to the task, based on structured (numerical) as well as unstructured (textual) data. However, the lack of a common benchmark dataset and evaluation strategy impedes the objective comparison between models. This paper introduces such a benchmark for the unstructured dat…

    Submitted 24 August, 2022; originally announced August 2022.

    Comments: Presented at the 4th Workshop on Financial Technology and Natural Language Processing (FinNLP) @ IJCAI-ECAI 2022 in Vienna, Austria

  32. CookDial: A dataset for task-oriented dialogs grounded in procedural documents

    Authors: Yiwei Jiang, Klim Zaporojets, Johannes Deleu, Thomas Demeester, Chris Develder

    Abstract: This work presents a new dialog dataset, CookDial, that facilitates research on task-oriented dialog systems with procedural knowledge understanding. The corpus contains 260 human-to-human task-oriented dialogs in which an agent, given a recipe document, guides the user to cook a dish. Dialogs in CookDial exhibit two unique features: (i) procedural alignment between the dialog flow and supporting…

    Submitted 17 June, 2022; originally announced June 2022.

    Comments: The dataset and codes are available at https://github.com/YiweiJiang2015/CookDial

    Journal ref: Applied Intelligence, 1-19 (2022)

  33. arXiv:2109.09605

    cs.CL

    JobBERT: Understanding Job Titles through Skills

    Authors: Jens-Joris Decorte, Jeroen Van Hautte, Thomas Demeester, Chris Develder

    Abstract: Job titles form a cornerstone of today's human resources (HR) processes. Within online recruitment, they allow candidates to understand the contents of a vacancy at a glance, while internal HR departments use them to organize and structure many of their processes. As job titles are a compact, convenient, and readily available data source, modeling them with high accuracy can greatly benefit many H…

    Submitted 20 September, 2021; originally announced September 2021.

    Comments: Accepted to the International workshop on Fair, Effective And Sustainable Talent management using data science (FEAST) as part of ECML-PKDD 2021

  34. arXiv:2108.13530

    cs.CL

    Towards Consistent Document-level Entity Linking: Joint Models for Entity Linking and Coreference Resolution

    Authors: Klim Zaporojets, Johannes Deleu, Yiwei Jiang, Thomas Demeester, Chris Develder

    Abstract: We consider the task of document-level entity linking (EL), where it is important to make consistent decisions for entity mentions over the full document jointly. We aim to leverage explicit "connections" among mentions within the document itself: we propose to join the EL task with that of coreference resolution (coref). This is complementary to related works that exploit either (i) implicit docu…

    Submitted 1 July, 2022; v1 submitted 30 August, 2021; originally announced August 2021.

  35. arXiv:2107.02286

    cs.CL

    Injecting Knowledge Base Information into End-to-End Joint Entity and Relation Extraction and Coreference Resolution

    Authors: Severine Verlinden, Klim Zaporojets, Johannes Deleu, Thomas Demeester, Chris Develder

    Abstract: We consider a joint information extraction (IE) model, solving named entity recognition, coreference resolution and relation extraction jointly over the whole document. In particular, we study how to inject information from a knowledge base (KB) into such an IE model, based on unsupervised entity linking. The KB entity representations used are learned from either (i) hyperlinked text documents (Wikiped…

    Submitted 5 July, 2021; originally announced July 2021.

  36. arXiv:2104.07944

    cs.CL cs.AI

    A Million Tweets Are Worth a Few Points: Tuning Transformers for Customer Service Tasks

    Authors: Amir Hadifar, Sofie Labat, Véronique Hoste, Chris Develder, Thomas Demeester

    Abstract: In online domain-specific customer service applications, many companies struggle to deploy advanced NLP models successfully, due to the limited availability of and noise in their datasets. While prior research demonstrated the potential of migrating large open-domain pretrained models for domain-specific tasks, the appropriate (pre)training strategies have not yet been rigorously evaluated in such…

    Submitted 16 April, 2021; originally announced April 2021.

  37. arXiv:2104.03630

    cs.CL cs.LG

    A Simple Geometric Method for Cross-Lingual Linguistic Transformations with Pre-trained Autoencoders

    Authors: Maarten De Raedt, Fréderic Godin, Pieter Buteneers, Chris Develder, Thomas Demeester

    Abstract: Powerful sentence encoders trained for multiple languages are on the rise. These systems are capable of embedding a wide range of linguistic properties into vector representations. While explicit probing tasks can be used to verify the presence of specific linguistic properties, it is unclear whether the vector representations can be manipulated to indirectly steer such properties. For efficient l…

    Submitted 21 September, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: EMNLP 2021 - Short Paper Track

  38. arXiv:2009.12626

    cs.CL

    DWIE: an entity-centric dataset for multi-task document-level information extraction

    Authors: Klim Zaporojets, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: This paper presents DWIE, the 'Deutsche Welle corpus for Information Extraction', a newly created multi-task dataset that combines four main Information Extraction (IE) annotation subtasks: (i) Named Entity Recognition (NER), (ii) Coreference Resolution, (iii) Relation Extraction (RE), and (iv) Entity Linking. DWIE is conceived as an entity-centric dataset that describes interactions and propertie…

    Submitted 9 March, 2021; v1 submitted 26 September, 2020; originally announced September 2020.

  39. Solving Arithmetic Word Problems by Scoring Equations with Recursive Neural Networks

    Authors: Klim Zaporojets, Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder

    Abstract: Solving arithmetic word problems is a cornerstone task in assessing language understanding and reasoning capabilities in NLP systems. Recent works use automatic extraction and ranking of candidate solution equations providing the answer to arithmetic word problems. In this work, we explore novel approaches to score such candidate solution equations using tree-structured recursive neural network (T…

    Submitted 9 March, 2021; v1 submitted 11 September, 2020; originally announced September 2020.

    Journal ref: Expert Systems with Applications, 174 (2021) 114704

  40. arXiv:2001.06296

    eess.SP cs.LG stat.ML

    Overly Optimistic Prediction Results on Imbalanced Data: a Case Study of Flaws and Benefits when Applying Over-sampling

    Authors: Gilles Vandewiele, Isabelle Dehaene, György Kovács, Lucas Sterckx, Olivier Janssens, Femke Ongenae, Femke De Backere, Filip De Turck, Kristien Roelens, Johan Decruyenaere, Sofie Van Hoecke, Thomas Demeester

    Abstract: Information extracted from electrohysterography recordings could potentially prove to be an interesting additional source of information to estimate the risk of preterm birth. Recently, a large number of studies have reported near-perfect results to distinguish between recordings of patients that will deliver term or preterm using a public resource, called the Term/Preterm Electrohysterogram datab…

    Submitted 28 November, 2020; v1 submitted 15 January, 2020; originally announced January 2020.

    Journal ref: Artificial Intelligence in Medicine. 111 (2021). 101987

  41. Block-wise Dynamic Sparseness

    Authors: Amir Hadifar, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: Neural networks have achieved state of the art performance across a wide variety of machine learning tasks, often with large and computation-heavy models. Inducing sparseness as a way to reduce the memory and computation footprint of these models has seen significant research attention in recent years. In this paper, we present a new method for dynamic sparseness, whereby part of the comput…

    Submitted 14 January, 2020; originally announced January 2020.

  42. arXiv:1911.09431  [pdf, ps, other]

    cs.LG eess.SY stat.ML

    System Identification with Time-Aware Neural Sequence Models

    Authors: Thomas Demeester

    Abstract: Established recurrent neural networks are well-suited to solve a wide variety of prediction tasks involving discrete sequences. However, they do not perform as well in the task of dynamical system identification, when dealing with observations from continuous variables that are unevenly sampled in time, for example due to missing observations. We show how such neural sequence models can be adapted…

    Submitted 21 November, 2019; originally announced November 2019.

    Comments: 34th AAAI Conference on Artificial Intelligence (AAAI 2020)

  43. arXiv:1907.08194  [pdf, other]

    cs.AI

    Neural Probabilistic Logic Programming in DeepProbLog

    Authors: Robin Manhaeve, Sebastijan Dumančić, Angelika Kimmig, Thomas Demeester, Luc De Raedt

    Abstract: We introduce DeepProbLog, a neural probabilistic logic programming language that incorporates deep learning by means of neural predicates. We show how existing inference and learning techniques of the underlying probabilistic logic programming language ProbLog can be adapted for the new language. We theoretically and experimentally demonstrate that DeepProbLog supports (i) both symbolic and subsym…

    Submitted 23 September, 2019; v1 submitted 18 July, 2019; originally announced July 2019.

    Comments: Extended version of DeepProbLog: Neural Probabilistic Logic Programming (previously published at NeurIPS 2018). arXiv admin note: text overlap with arXiv:1805.10872

  44. arXiv:1903.05396  [pdf, other]

    cs.CL

    Sub-event detection from Twitter streams as a sequence labeling problem

    Authors: Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder

    Abstract: This paper introduces improved methods for sub-event detection in social media streams, by applying neural sequence models not only on the level of individual posts, but also directly on the stream level. Current approaches to identify sub-events within a given event, such as a goal during a soccer match, essentially do not exploit the sequential nature of social media streams. We address this sho…

    Submitted 13 March, 2019; originally announced March 2019.

    Comments: NAACL 2019

  45. arXiv:1808.09551  [pdf, ps, other]

    cs.CL cs.AI cs.LG

    Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?

    Authors: Fréderic Godin, Kris Demuynck, Joni Dambre, Wesley De Neve, Thomas Demeester

    Abstract: Character-level features are currently used in different neural network-based natural language processing algorithms. However, little is known about the character-level patterns those models learn. Moreover, models are often compared only quantitatively while a qualitative analysis is missing. In this paper, we investigate which character-level patterns neural networks learn and if those patterns…

    Submitted 28 August, 2018; originally announced August 2018.

    Comments: Accepted at EMNLP 2018

  46. arXiv:1808.08720  [pdf, other]

    cs.LG cs.AI cs.CL stat.ML

    Predefined Sparseness in Recurrent Sequence Models

    Authors: Thomas Demeester, Johannes Deleu, Fréderic Godin, Chris Develder

    Abstract: Inducing sparseness while training neural networks has been shown to yield models with a lower memory footprint but similar effectiveness to dense models. However, sparseness is typically induced starting from a dense model, and thus this advantage does not hold during training. We propose techniques to enforce sparseness upfront in recurrent sequence models for NLP applications, to also benefit t…

    Submitted 27 August, 2018; originally announced August 2018.

    Comments: The SIGNLL Conference on Computational Natural Language Learning (CoNLL, 2018)

  47. arXiv:1808.06876  [pdf, other]

    cs.CL

    Adversarial training for multi-context joint entity and relation extraction

    Authors: Giannis Bekoulis, Johannes Deleu, Thomas Demeester, Chris Develder

    Abstract: Adversarial training (AT) is a regularization method that can be used to improve the robustness of neural network methods by adding small perturbations in the training data. We show how to use AT for the tasks of entity recognition and relation extraction. In particular, we demonstrate that applying AT to a general purpose baseline model for jointly extracting entities and relations, allows improv…

    Submitted 14 January, 2019; v1 submitted 21 August, 2018; originally announced August 2018.

    Comments: EMNLP 2018, code is available at https://github.com/bekou/multihead_joint_entity_relation_extraction

  48. arXiv:1806.09439  [pdf, other]

    cs.CL

    Prior Attention for Style-aware Sequence-to-Sequence Models

    Authors: Lucas Sterckx, Johannes Deleu, Chris Develder, Thomas Demeester

    Abstract: We extend sequence-to-sequence models with the possibility to control the characteristics or style of the generated output, via attention that is generated a priori (before decoding) from a latent code vector. After training an initial attention-based sequence-to-sequence model, we use a variational auto-encoder conditioned on representations of input sequences and a latent code vector space to ge…

    Submitted 25 June, 2018; originally announced June 2018.

    Comments: 6 pages, 6 figures

  49. arXiv:1806.08727  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    Jack the Reader - A Machine Reading Framework

    Authors: Dirk Weissenborn, Pasquale Minervini, Tim Dettmers, Isabelle Augenstein, Johannes Welbl, Tim Rocktäschel, Matko Bošnjak, Jeff Mitchell, Thomas Demeester, Pontus Stenetorp, Sebastian Riedel

    Abstract: Many Machine Reading and Natural Language Understanding tasks require reading supporting text in order to answer questions. For example, in Question Answering, the supporting text can be newswire or Wikipedia articles; in Natural Language Inference, premises can be seen as the supporting text and hypotheses as questions. Providing a set of useful primitives operating in a single framework of relat…

    Submitted 19 June, 2018; originally announced June 2018.

    Comments: Proceedings of the Annual Meeting of the Association for Computational Linguistics (ACL 2018), System Demonstrations

  50. arXiv:1805.10872  [pdf, other]

    cs.AI

    DeepProbLog: Neural Probabilistic Logic Programming

    Authors: Robin Manhaeve, Sebastijan Dumančić, Angelika Kimmig, Thomas Demeester, Luc De Raedt

    Abstract: We introduce DeepProbLog, a probabilistic logic programming language that incorporates deep learning by means of neural predicates. We show how existing inference and learning techniques can be adapted for the new language. Our experiments demonstrate that DeepProbLog supports both symbolic and subsymbolic representations and inference, 1) program induction, 2) probabilistic (logic) programming, a…

    Submitted 12 December, 2018; v1 submitted 28 May, 2018; originally announced May 2018.

    Comments: Accepted for spotlight at NeurIPS 2018