Showing 1–7 of 7 results for author: Vaidhya, T

Searching in archive cs.
  1. arXiv:2407.12327  [pdf, other]

    cs.LG cs.AI cs.CL

    Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models

    Authors: Ayush Kaushal, Tejas Pandey, Tejas Vaidhya, Aaryan Bhagat, Irina Rish

    Abstract: Post-training quantization is the leading method for addressing memory-related bottlenecks in LLM inference, but unfortunately, it suffers from significant performance degradation below 4-bit precision. An alternative approach involves training compressed models directly at a low bitwidth (e.g., binary or ternary models). However, the performance, training dynamics, and scaling trends of such mode…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 32 pages, 12 figures, and 10 tables

    MSC Class: 68T30 ACM Class: I.2.6; I.2.7

  2. arXiv:2309.14021  [pdf, other]

    cs.CL cs.AI

    LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression

    Authors: Ayush Kaushal, Tejas Vaidhya, Irina Rish

    Abstract: Low Rank Decomposition of matrix - splitting a large matrix into a product of two smaller matrix offers a means for compression that reduces the parameters of a model without sparsification, and hence delivering more speedup on modern hardware. Moreover, unlike quantization, the compressed linear layers remain fully differentiable and all the parameters trainable, while being able to leverage the…

    Submitted 25 September, 2023; originally announced September 2023.

    Comments: 9 pages

  3. arXiv:2202.13758  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG cs.LO

    Logical Fallacy Detection

    Authors: Zhijing Jin, Abhinav Lalwani, Tejas Vaidhya, Xiaoyu Shen, Yiwen Ding, Zhiheng Lyu, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

    Abstract: Reasoning is central to human intelligence. However, fallacious arguments are common, and some exacerbate problems such as spreading misinformation about climate change. In this paper, we propose the task of logical fallacy detection, and provide a new dataset (Logic) of logical fallacies generally found in text, together with an additional challenge set for detecting logical fallacies in climate…

    Submitted 11 December, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

    Comments: EMNLP 2022 Findings

  4. arXiv:2110.03618  [pdf, other]

    cs.CL cs.AI cs.LG

    Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning for NLP

    Authors: Zhijing Jin, Julius von Kügelgen, Jingwei Ni, Tejas Vaidhya, Ayush Kaushal, Mrinmaya Sachan, Bernhard Schölkopf

    Abstract: The principle of independent causal mechanisms (ICM) states that generative processes of real world data consist of independent modules which do not influence or inform each other. While this idea has led to fruitful developments in the field of causal inference, it is not widely-known in the NLP community. In this work, we argue that the causal direction of the data collection process bears nontr…

    Submitted 19 October, 2021; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: EMNLP 2021 (Oral)

  5. arXiv:2101.05494  [pdf, ps, other]

    cs.CL

    Hostility Detection in Hindi leveraging Pre-Trained Language Models

    Authors: Ojasv Kamal, Adarsh Kumar, Tejas Vaidhya

    Abstract: Hostile content on social platforms is ever increasing. This has led to the need for proper detection of hostile posts so that appropriate action can be taken to tackle them. Though a lot of work has been done recently in the English Language to solve the problem of hostile content online, similar works in Indian Languages are quite hard to find. This paper presents a transfer learning based appro…

    Submitted 14 January, 2021; originally announced January 2021.

  6. Domain specific BERT representation for Named Entity Recognition of lab protocol

    Authors: Tejas Vaidhya, Ayush Kaushal

    Abstract: Supervised models trained to predict properties from representations have been achieving high accuracy on a variety of tasks. For instance, the BERT family seems to work exceptionally well on the downstream task from NER tagging to the range of other linguistic tasks. But the vocabulary used in the medical field contains a lot of different tokens used only in the medical industry such as the name…

    Submitted 21 December, 2020; originally announced December 2020.

    Comments: EMNLP 2020 Workshop; 5 pages

    MSC Class: 68T50 ACM Class: I.2.7

  7. Leveraging Event Specific and Chunk Span features to Extract COVID Events from tweets

    Authors: Ayush Kaushal, Tejas Vaidhya

    Abstract: Twitter has acted as an important source of information during disasters and pandemic, especially during the times of COVID-19. In this paper, we describe our system entry for WNUT 2020 Shared Task-3. The task was aimed at automating the extraction of a variety of COVID-19 related events from Twitter, such as individuals who recently contracted the virus, someone with symptoms who were denied test…

    Submitted 17 December, 2020; originally announced December 2020.

    Comments: EMNLP 2020 Workshop, Oral, 8 pages

    MSC Class: 68T50 ACM Class: I.2.7