
Showing 1–9 of 9 results for author: Christopoulou, F

  1. arXiv:2407.09450  [pdf, other]

    cs.AI cs.CL cs.LG q-bio.NC

    Human-like Episodic Memory for Infinite Context LLMs

    Authors: Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee, Fenia Christopoulou, Gerasimos Lampouras, Haitham Bou-Ammar, Jun Wang

    Abstract: Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts, limiting their ability to maintain coherence and accuracy over long sequences. In contrast, the human brain excels at organising and retrieving episodic experiences across vast temporal scales, spanning a lifetime. In this work, we introduce EM-LLM, a novel approach that integrat…

    Submitted 12 July, 2024; originally announced July 2024.

  2. arXiv:2402.05783  [pdf, other]

    cs.CL

    Text-to-Code Generation with Modality-relative Pre-training

    Authors: Fenia Christopoulou, Guchun Zhang, Gerasimos Lampouras

    Abstract: Large pre-trained language models have recently been expanded and applied to programming language tasks with great success, often through further pre-training of a strictly-natural language model--where training sequences typically contain both natural and (linearised) programming language. Such approaches effectively map both modalities of the sequence into the same embedding space. However, prog…

    Submitted 12 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Accepted at EACL 2024. 15 pages, 5 figures, 6 tables

  3. arXiv:2210.12540  [pdf, other]

    cs.CL

    EntityCS: Improving Zero-Shot Cross-lingual Transfer with Entity-Centric Code Switching

    Authors: Chenxi Whitehouse, Fenia Christopoulou, Ignacio Iacobacci

    Abstract: Accurate alignment between languages is fundamental for improving cross-lingual pre-trained language models (XLMs). Motivated by the natural phenomenon of code-switching (CS) in multilingual speakers, CS has been used as an effective data augmentation method that offers language alignment at the word- or phrase-level, in contrast to sentence-level via parallel instances. Existing approaches either…

    Submitted 13 February, 2023; v1 submitted 22 October, 2022; originally announced October 2022.

    Comments: Findings of EMNLP 2022

    Journal ref: Findings of the Association for Computational Linguistics: EMNLP 2022 (6698-6714)

  4. arXiv:2210.12499  [pdf, other]

    cs.CL

    Training Dynamics for Curriculum Learning: A Study on Monolingual and Cross-lingual NLU

    Authors: Fenia Christopoulou, Gerasimos Lampouras, Ignacio Iacobacci

    Abstract: Curriculum Learning (CL) is a technique of training models via ranking examples in a typically increasing difficulty trend with the aim of accelerating convergence and improving generalisability. Current approaches for Natural Language Understanding (NLU) tasks use CL to improve in-distribution data performance often via heuristic-oriented or task-agnostic difficulties. In this work, instead, we e…

    Submitted 24 November, 2022; v1 submitted 22 October, 2022; originally announced October 2022.

    Comments: 17 pages, 4 figures, 6 tables. To appear in EMNLP 2022

  5. arXiv:2207.11280  [pdf, other]

    cs.LG cs.AI cs.CL cs.PL cs.SE

    PanGu-Coder: Program Synthesis with Function-Level Language Modeling

    Authors: Fenia Christopoulou, Gerasimos Lampouras, Milan Gritta, Guchun Zhang, Yinpeng Guo, Zhongqi Li, Qi Zhang, Meng Xiao, Bo Shen, Lin Li, Hao Yu, Li Yan, Pingyi Zhou, Xin Wang, Yuchi Ma, Ignacio Iacobacci, Yasheng Wang, Guangtai Liang, Jiansheng Wei, Xin Jiang, Qianxiang Wang, Qun Liu

    Abstract: We present PanGu-Coder, a pretrained decoder-only language model adopting the PanGu-Alpha architecture for text-to-code generation, i.e. the synthesis of programming language solutions given a natural language problem description. We train PanGu-Coder using a two-stage strategy: the first stage employs Causal Language Modelling (CLM) to pre-train on raw programming language data, while the second…

    Submitted 22 July, 2022; originally announced July 2022.

    Comments: 27 pages

  6. arXiv:2104.08225  [pdf, other]

    cs.CL

    Distantly Supervised Relation Extraction with Sentence Reconstruction and Knowledge Base Priors

    Authors: Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou

    Abstract: We propose a multi-task, probabilistic approach to facilitate distantly supervised relation extraction by bringing closer the representations of sentences that contain the same Knowledge Base pairs. To achieve this, we bias the latent space of sentences via a Variational Autoencoder (VAE) that is trained jointly with a relation classifier. The latent code guides the pair representations and influe…

    Submitted 16 April, 2021; originally announced April 2021.

    Comments: 16 pages, 9 figures, Accepted as a long paper at NAACL 2021

  7. arXiv:1909.00228  [pdf, other]

    cs.CL

    Connecting the Dots: Document-level Neural Relation Extraction with Edge-oriented Graphs

    Authors: Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou

    Abstract: Document-level relation extraction is a complex human process that requires logical inference to extract relationships between named entities in text. Existing approaches use graph-based neural models with words as nodes and edges as relations between them, to encode relations across sentences. These models are node-based, i.e., they form pair representations based solely on the two target node re…

    Submitted 31 August, 2019; originally announced September 2019.

    Comments: 12 pages, 5 figures, 6 tables. Accepted in EMNLP-IJCNLP 2019

  8. arXiv:1906.04684  [pdf, other]

    cs.CL cs.IR

    Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network

    Authors: Sunil Kumar Sahu, Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou

    Abstract: Inter-sentence relation extraction deals with a number of complex semantic relationships in documents, which require local, non-local, syntactic and semantic dependencies. Existing methods do not fully exploit such dependencies. We present a novel inter-sentence relation extraction model that builds a labelled edge graph convolutional neural network model on a document-level graph. The graph is co…

    Submitted 11 June, 2019; originally announced June 2019.

    Comments: Accepted in Association for Computational Linguistics (ACL) 2019. 8 pages, 3 figures, 3 tables

  9. arXiv:1902.07023  [pdf, other]

    cs.CL

    A Walk-based Model on Entity Graphs for Relation Extraction

    Authors: Fenia Christopoulou, Makoto Miwa, Sophia Ananiadou

    Abstract: We present a novel graph-based neural network model for relation extraction. Our model treats multiple pairs in a sentence simultaneously and considers interactions among them. All the entities in a sentence are placed as nodes in a fully-connected graph structure. The edges are represented with position-aware contexts around the entity pairs. In order to consider different relation paths between…

    Submitted 13 March, 2020; v1 submitted 19 February, 2019; originally announced February 2019.

    Comments: 8 pages, 2 figures, 2 tables

    Journal ref: Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 2018, pages 81-88