Showing 1–11 of 11 results for author: Sharma, A S

Searching in archive cs.
  1. arXiv:2408.01416  [pdf, other]

    cs.LG cs.AI

    The Quest for the Right Mediator: A History, Survey, and Theoretical Grounding of Causal Interpretability

    Authors: Aaron Mueller, Jannik Brinkmann, Millicent Li, Samuel Marks, Koyena Pal, Nikhil Prakash, Can Rager, Aruna Sankaranarayanan, Arnab Sen Sharma, Jiuding Sun, Eric Todd, David Bau, Yonatan Belinkov

    Abstract: Interpretability provides a toolset for understanding how and why neural networks behave in certain ways. However, there is little unity in the field: most studies employ ad-hoc evaluations and do not share theoretical foundations, making it difficult to measure progress and compare the pros and cons of different techniques. Furthermore, while mechanistic understanding is frequently discussed, the…

    Submitted 2 August, 2024; originally announced August 2024.

  2. arXiv:2407.14561  [pdf, other]

    cs.LG cs.AI

    NNsight and NDIF: Democratizing Access to Foundation Model Internals

    Authors: Jaden Fiotto-Kaufman, Alexander R Loftus, Eric Todd, Jannik Brinkmann, Caden Juang, Koyena Pal, Can Rager, Aaron Mueller, Samuel Marks, Arnab Sen Sharma, Francesca Lucchetti, Michael Ripa, Adam Belfki, Nikhil Prakash, Sumeet Multani, Carla Brodley, Arjun Guha, Jonathan Bell, Byron Wallace, David Bau

    Abstract: The enormous scale of state-of-the-art foundation models has limited their accessibility to scientists, because customized experiments at large model sizes require costly hardware and complex engineering that is impractical for most researchers. To alleviate these problems, we introduce NNsight, an open-source Python package with a simple, flexible API that can express interventions on any PyTorch…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: Code at https://nnsight.net

  3. arXiv:2404.03646  [pdf, other]

    cs.CL

    Locating and Editing Factual Associations in Mamba

    Authors: Arnab Sen Sharma, David Atkinson, David Bau

    Abstract: We investigate the mechanisms of factual recall in the Mamba state space model. Our work is inspired by previous findings in autoregressive transformer language models suggesting that their knowledge recall is localized to particular modules at specific token locations; we therefore ask whether factual recall in Mamba can be similarly localized. To investigate this, we conduct four lines of experi…

    Submitted 2 August, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

    Comments: 18 pages, COLM-2024

  4. arXiv:2310.15213  [pdf, other]

    cs.CL cs.LG

    Function Vectors in Large Language Models

    Authors: Eric Todd, Millicent L. Li, Arnab Sen Sharma, Aaron Mueller, Byron C. Wallace, David Bau

    Abstract: We report the presence of a simple neural mechanism that represents an input-output function as a vector within autoregressive transformer language models (LMs). Using causal mediation analysis on a diverse range of in-context-learning (ICL) tasks, we find that a small number attention heads transport a compact representation of the demonstrated task, which we call a function vector (FV). FVs are…

    Submitted 25 February, 2024; v1 submitted 23 October, 2023; originally announced October 2023.

    Comments: ICLR 2024. 52 pages, 30 figures, 23 tables. Code and data at https://functions.baulab.info

  5. arXiv:2308.09124  [pdf, other]

    cs.CL

    Linearity of Relation Decoding in Transformer Language Models

    Authors: Evan Hernandez, Arnab Sen Sharma, Tal Haklay, Kevin Meng, Martin Wattenberg, Jacob Andreas, Yonatan Belinkov, David Bau

    Abstract: Much of the knowledge encoded in transformer language models (LMs) may be expressed in terms of relations: relations between words and their synonyms, entities and their attributes, etc. We show that, for a subset of relations, this computation is well-approximated by a single linear transformation on the subject representation. Linear relation representations may be obtained by constructing a fir…

    Submitted 15 February, 2024; v1 submitted 17 August, 2023; originally announced August 2023.

  6. arXiv:2210.07286  [pdf, other]

    cs.HC

    Augmenting Online Classes with an Attention Tracking Tool May Improve Student Engagement

    Authors: Arnab Sen Sharma, Mohammad Ruhul Amin, Muztaba Fuad

    Abstract: Online remote learning has certain advantages, such as higher flexibility and greater inclusiveness. However, a caveat is the teachers' limited ability to monitor student interaction during an online class, especially while teachers are sharing their screens. We have taken feedback from 12 teachers experienced in teaching undergraduate-level online classes on the necessity of an attention tracking…

    Submitted 13 October, 2022; originally announced October 2022.

    Comments: 18 pages, 10 figures

  7. arXiv:2210.07229  [pdf, other]

    cs.CL cs.LG

    Mass-Editing Memory in a Transformer

    Authors: Kevin Meng, Arnab Sen Sharma, Alex Andonian, Yonatan Belinkov, David Bau

    Abstract: Recent work has shown exciting promise in updating large language models with new memories, so as to replace obsolete information or add specialized knowledge. However, this line of work is predominantly limited to updating single associations. We develop MEMIT, a method for directly updating a language model with many memories, demonstrating experimentally that it can scale up to thousands of ass…

    Submitted 1 August, 2023; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: 18 pages, 11 figures. Code and data at https://memit.baulab.info

  8. arXiv:2206.00372  [pdf]

    cs.CL

    BD-SHS: A Benchmark Dataset for Learning to Detect Online Bangla Hate Speech in Different Social Contexts

    Authors: Nauros Romim, Mosahed Ahmed, Md. Saiful Islam, Arnab Sen Sharma, Hriteshwar Talukder, Mohammad Ruhul Amin

    Abstract: Social media platforms and online streaming services have spawned a new breed of Hate Speech (HS). Due to the massive amount of user-generated content on these sites, modern machine learning techniques are found to be feasible and cost-effective to tackle this problem. However, linguistically diverse datasets covering different social contexts in which offensive language is typically used are requ…

    Submitted 1 June, 2022; originally announced June 2022.

  9. arXiv:2112.01902  [pdf, other]

    cs.CL

    HS-BAN: A Benchmark Dataset of Social Media Comments for Hate Speech Detection in Bangla

    Authors: Nauros Romim, Mosahed Ahmed, Md Saiful Islam, Arnab Sen Sharma, Hriteshwar Talukder, Mohammad Ruhul Amin

    Abstract: In this paper, we present HS-BAN, a binary class hate speech (HS) dataset in Bangla language consisting of more than 50,000 labeled comments, including 40.17% hate and rest are non hate speech. While preparing the dataset a strict and detailed annotation guideline was followed to reduce human annotation bias. The HS dataset was also preprocessed linguistically to extract different types of slang c…

    Submitted 3 December, 2021; originally announced December 2021.

    Comments: Submitted to ICON 21 (Rejected)

  10. Presenting a Larger Up-to-date Movie Dataset and Investigating the Effects of Pre-released Attributes on Gross Revenue

    Authors: Arnab Sen Sharma, Tirtha Roy, Sadique Ahmmod Rifat, Maruf Ahmed Mridul

    Abstract: Movie-making has become one of the most costly and risky endeavors in the entertainment industry. Continuous change in the preference of the audience makes it harder to predict what kind of movie will be financially successful at the box office. So, it is no wonder that cautious, intelligent stakeholders and large production houses will always want to know the probable revenue that will be generat…

    Submitted 7 December, 2021; v1 submitted 13 October, 2021; originally announced October 2021.

    Journal ref: Journal of Computer Science, Volume 17 No. 10, 2021, 870-888

  11. arXiv:1911.11062  [pdf, other]

    cs.IR cs.CL cs.LG

    Automatic Detection of Satire in Bangla Documents: A CNN Approach Based on Hybrid Feature Extraction Model

    Authors: Arnab Sen Sharma, Maruf Ahmed Mridul, Md Saiful Islam

    Abstract: Widespread of satirical news in online communities is an ongoing trend. The nature of satires is so inherently ambiguous that sometimes it's too hard even for humans to understand whether it's actually satire or not. So, research interest has grown in this field. The purpose of this research is to detect Bangla satirical news spread in online news portals as well as social media. In this paper, we…

    Submitted 19 November, 2019; originally announced November 2019.

    Comments: 5 pages, Conference paper