Showing 1–4 of 4 results for author: Petcu, R

  1. arXiv:2407.04485 [pdf, other]

    cs.CL cs.LG

    Leveraging Graph Structures to Detect Hallucinations in Large Language Models

    Authors: Noa Nonkes, Sergei Agaronian, Evangelos Kanoulas, Roxana Petcu

    Abstract: Large language models are extensively applied across a wide range of tasks, such as customer support, content creation, educational tutoring, and providing financial guidance. However, a well-known drawback is their predisposition to generate hallucinations. This damages the trustworthiness of the information these models provide, impacting decision-making and user confidence. We propose a method…

    Submitted 5 July, 2024; originally announced July 2024.

    Journal ref: Proceedings of the TextGraphs-17 Workshop, ACL 2024

  2. arXiv:2405.13003 [pdf, other]

    cs.CL cs.AI cs.IR

    A Survey on Recent Advances in Conversational Data Generation

    Authors: Heydar Soudani, Roxana Petcu, Evangelos Kanoulas, Faegheh Hasibi

    Abstract: Recent advancements in conversational systems have significantly enhanced human-machine interactions across various domains. However, training these systems is challenging due to the scarcity of specialized dialogue data. Traditionally, conversational datasets were created through crowdsourcing, but this method has proven costly, limited in scale, and labor-intensive. As a solution, the developmen…

    Submitted 12 May, 2024; originally announced May 2024.

  3. arXiv:2402.14888 [pdf, other]

    cs.LG cs.AI cs.CL

    Efficient data selection employing Semantic Similarity-based Graph Structures for model training

    Authors: Roxana Petcu, Subhadeep Maji

    Abstract: Recent developments in natural language processing (NLP) have highlighted the need for substantial amounts of data for models to capture textual information accurately. This raises concerns regarding the computational resources and time required for training such models. This paper introduces Semantics for data SAliency in Model performance Estimation (SeSaME). It is an efficient data sampling mec…

    Submitted 22 February, 2024; originally announced February 2024.

    Comments: ICML 2023 Workshop: Sampling and Optimization in Discrete Space

  4. arXiv:2402.11633 [pdf, other]

    cs.CL

    Self-seeding and Multi-intent Self-instructing LLMs for Generating Intent-aware Information-Seeking dialogs

    Authors: Arian Askari, Roxana Petcu, Chuan Meng, Mohammad Aliannejadi, Amin Abolghasemi, Evangelos Kanoulas, Suzan Verberne

    Abstract: Identifying user intents in information-seeking dialogs is crucial for a system to meet user's information needs. Intent prediction (IP) is challenging and demands sufficient dialogs with human-labeled intents for training. However, manually annotating intents is resource-intensive. While large language models (LLMs) have been shown to be effective in generating synthetic data, there is no study o…

    Submitted 18 February, 2024; originally announced February 2024.