Showing 1–1 of 1 results for author: Chatha, A

Searching in archive cs.
  1. arXiv:2310.06200  [pdf, other]

    cs.CL cs.LG

    The Importance of Prompt Tuning for Automated Neuron Explanations

    Authors: Justin Lee, Tuomas Oikarinen, Arjun Chatha, Keng-Chi Chang, Yilan Chen, Tsui-Wei Weng

    Abstract: Recent advances have greatly increased the capabilities of large language models (LLMs), but our understanding of the models and their safety has not progressed as fast. In this paper we aim to understand LLMs deeper by studying their individual neurons. We build upon previous work showing large language models such as GPT-4 can be useful in explaining what each neuron in a language model does. Sp…

    Submitted 11 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.