Showing 1–4 of 4 results for author: Penn, G

  1. arXiv:2407.03779  [pdf, other]

    cs.CL cs.LG

    Functional Faithfulness in the Wild: Circuit Discovery with Differentiable Computation Graph Pruning

    Authors: Lei Yu, Jingcheng Niu, Zining Zhu, Gerald Penn

    Abstract: In this paper, we introduce a comprehensive reformulation of the task known as Circuit Discovery, along with DiscoGP, a novel and effective algorithm based on differentiable masking for discovering circuits. Circuit discovery is the task of interpreting the computational mechanisms of language models (LMs) by dissecting their functions and capabilities into sparse subnetworks (circuits). We identi…

    Submitted 4 July, 2024; originally announced July 2024.

  2. arXiv:2405.02421  [pdf, other]

    cs.CL

    What does the Knowledge Neuron Thesis Have to do with Knowledge?

    Authors: Jingcheng Niu, Andrew Liu, Zining Zhu, Gerald Penn

    Abstract: We reassess the Knowledge Neuron (KN) Thesis: an interpretation of the mechanism underlying the ability of large language models to recall facts from a training corpus. This nascent thesis proposes that facts are recalled from the training corpus through the MLP weights in a manner resembling key-value memory, implying in effect that "knowledge" is stored in the network. Furthermore, by modifying…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: ICLR 2024 (Spotlight)

  3. arXiv:1901.00072  [pdf, ps, other]

    cs.LG cs.CL cs.SD stat.ML

    Exploring spectro-temporal features in end-to-end convolutional neural networks

    Authors: Sean Robertson, Gerald Penn, Yingxue Wang

    Abstract: Triangular, overlapping Mel-scaled filters ("f-banks") are the current standard input for acoustic models that exploit their input's time-frequency geometry, because they provide a psycho-acoustically motivated time-frequency geometry for a speech signal. F-bank coefficients are provably robust to small deformations in the scale. In this paper, we explore two ways in which filter banks can be adju…

    Submitted 31 December, 2018; originally announced January 2019.

  4. arXiv:cs/0007013  [pdf, ps, other]

    cs.CL cs.PL

    Applying Constraint Handling Rules to HPSG

    Authors: Gerald Penn

    Abstract: Constraint Handling Rules (CHR) have provided a realistic solution to an over-arching problem in many fields that deal with constraint logic programming: how to combine recursive functions or relations with constraints while avoiding non-termination problems. This paper focuses on some other benefits that CHR, specifically their implementation in SICStus Prolog, have provided to computational li…

    Submitted 7 July, 2000; originally announced July 2000.

    Comments: To appear in Proceedings of the First Workshop on Rule-Based Constraint Reasoning and Programming, CL2000; 14 pages

    ACM Class: I.2.7; D.3.3