Zum Hauptinhalt springen

Showing 1–4 of 4 results for author: Dhamecha, T I

Searching in archive cs. Search in all archives.
.
  1. arXiv:2407.09879  [pdf, other

    cs.CL

    sPhinX: Sample Efficient Multilingual Instruction Fine-Tuning Through N-shot Guided Prompting

    Authors: Sanchit Ahuja, Kumar Tanmay, Hardik Hansrajbhai Chauhan, Barun Patra, Kriti Aggarwal, Luciano Del Corro, Arindam Mitra, Tejas Indulal Dhamecha, Ahmed Awadallah, Monojit Choudhary, Vishrav Chaudhary, Sunayana Sitaram

    Abstract: Despite the remarkable success of LLMs in English, there is a significant gap in performance in non-English languages. In order to address this, we introduce a novel recipe for creating a multilingual synthetic instruction tuning dataset, sPhinX, which is created by selectively translating instruction response pairs from English into 50 languages. We test the effectiveness of sPhinX by using it to… ▽ More

    Submitted 16 July, 2024; v1 submitted 13 July, 2024; originally announced July 2024.

    Comments: 20 pages, 12 tables, 5 figures

  2. arXiv:2301.01015  [pdf, other

    cs.CV cs.AI cs.CL

    Semi-Structured Object Sequence Encoders

    Authors: Rudra Murthy V, Riyaz Bhat, Chulaka Gunasekara, Siva Sankalp Patel, Hui Wan, Tejas Indulal Dhamecha, Danish Contractor, Marina Danilevsky

    Abstract: In this paper we explore the task of modeling semi-structured object sequences; in particular, we focus our attention on the problem of developing a structure-aware input representation for such sequences. Examples of such data include user activity on websites, machine logs, and many others. This type of data is often represented as a sequence of sets of key-value pairs over time and can present… ▽ More

    Submitted 22 May, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

  3. arXiv:2109.10534  [pdf, other

    cs.CL

    Role of Language Relatedness in Multilingual Fine-tuning of Language Models: A Case Study in Indo-Aryan Languages

    Authors: Tejas Indulal Dhamecha, Rudra Murthy V, Samarth Bharadwaj, Karthik Sankaranarayanan, Pushpak Bhattacharyya

    Abstract: We explore the impact of leveraging the relatedness of languages that belong to the same family in NLP models using multilingual fine-tuning. We hypothesize and validate that multilingual fine-tuning of pre-trained language models can yield better performance on downstream NLP applications, compared to models fine-tuned on individual languages. A first of its kind detailed study is presented to tr… ▽ More

    Submitted 22 September, 2021; originally announced September 2021.

    Comments: Accepted in EMNLP 2021

  4. arXiv:1902.09183  [pdf, other

    cs.CL

    Joint Multi-Domain Learning for Automatic Short Answer Grading

    Authors: Swarnadeep Saha, Tejas I. Dhamecha, Smit Marvaniya, Peter Foltz, Renuka Sindhgatta, Bikram Sengupta

    Abstract: One of the fundamental challenges towards building any intelligent tutoring system is its ability to automatically grade short student answers. A typical automatic short answer grading system (ASAG) grades student answers across multiple domains (or subjects). Grading student answers requires building a supervised machine learning model that evaluates the similarity of the student answer with the… ▽ More

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: 11 pages