-
LLM Circuit Analyses Are Consistent Across Training and Scale
Authors:
Curt Tigges,
Michael Hanna,
Qinan Yu,
Stella Biderman
Abstract:
Most currently deployed large language models (LLMs) undergo continuous training or additional finetuning. By contrast, most research into LLMs' internal mechanisms focuses on models at one snapshot in time (the end of pre-training), raising the question of whether their results generalize to real-world settings. Existing studies of mechanisms over time focus on encoder-only or toy models, which differ significantly from most deployed models. In this study, we track how model mechanisms, operationalized as circuits, emerge and evolve across 300 billion tokens of training in decoder-only LLMs, in models ranging from 70 million to 2.8 billion parameters. We find that task abilities and the functional components that support them emerge consistently at similar token counts across scale. Moreover, although such components may be implemented by different attention heads over time, the overarching algorithm that they implement remains. Surprisingly, both these algorithms and the types of components involved therein can replicate across model scale. These results suggest that circuit analyses conducted on small models at the end of pre-training can provide insights that still apply after additional pre-training and over model scale.
Submitted 15 July, 2024;
originally announced July 2024.
-
Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion
Authors:
Dylan Zhang,
Curt Tigges,
Zory Zhang,
Stella Biderman,
Maxim Raginsky,
Talia Ringer
Abstract:
This paper investigates the ability of transformer-based models to learn structural recursion from examples. Recursion is a universal concept in both natural and formal languages. Structural recursion is central to the programming language and formal mathematics tasks where symbolic tools currently excel beyond neural models, such as inferring semantic relations between datatypes and emulating program behavior. We introduce a general framework that nicely connects the abstract concepts of structural recursion in the programming language domain to concrete sequence modeling problems and learned models' behavior. The framework includes a representation that captures the general syntax of structural recursion, coupled with two different frameworks for understanding their semantics: one that is more natural from a programming languages perspective and one that helps bridge that perspective with a mechanistic understanding of the underlying transformer architecture.
With our framework as a powerful conceptual tool, we identify different issues under various set-ups. The models trained to emulate recursive computations do not fully capture the recursion and instead fit shortcut algorithms, so they cannot solve certain edge cases that are under-represented in the training distribution. In addition, it is difficult for state-of-the-art large language models (LLMs) to mine recursive rules from in-context demonstrations. Meanwhile, these LLMs fail in interesting ways when emulating reduction (step-wise computation) of the recursive function.
Submitted 23 January, 2024;
originally announced January 2024.
-
Linear Representations of Sentiment in Large Language Models
Authors:
Curt Tigges,
Oskar John Hollinsworth,
Atticus Geiger,
Neel Nanda
Abstract:
Sentiment is a pervasive feature in natural language text, yet it is an open question how sentiment is represented within Large Language Models (LLMs). In this study, we reveal that across a range of models, sentiment is represented linearly: a single direction in activation space mostly captures the feature across a range of tasks with one extreme for positive and the other for negative. Through causal interventions, we isolate this direction and show it is causally relevant in both toy tasks and real world datasets such as Stanford Sentiment Treebank. Through this case study we model a thorough investigation of what a single direction means on a broad data distribution.
We further uncover the mechanisms that involve this direction, highlighting the roles of a small subset of attention heads and neurons. Finally, we discover a phenomenon which we term the summarization motif: sentiment is not solely represented on emotionally charged words, but is additionally summarized at intermediate positions without inherent sentiment, such as punctuation and names. We show that in Stanford Sentiment Treebank zero-shot classification, 76% of above-chance classification accuracy is lost when ablating the sentiment direction, nearly half of which (36%) is due to ablating the summarized sentiment direction exclusively at comma positions.
Submitted 23 October, 2023;
originally announced October 2023.
-
Can Transformers Learn to Solve Problems Recursively?
Authors:
Shizhuo Dylan Zhang,
Curt Tigges,
Stella Biderman,
Maxim Raginsky,
Talia Ringer
Abstract:
Neural networks have in recent years shown promise for helping software engineers write programs and even formally verify them. While semantic information plays a crucial part in these processes, it remains unclear to what degree popular neural architectures like transformers are capable of modeling that information. This paper examines the behavior of neural networks learning algorithms relevant to programs and formal verification proofs through the lens of mechanistic interpretability, focusing in particular on structural recursion. Structural recursion is at the heart of tasks on which symbolic tools currently outperform neural models, like inferring semantic relations between datatypes and emulating program behavior. We evaluate the ability of transformer models to learn to emulate the behavior of structurally recursive functions from input-output examples. Our evaluation includes empirical and conceptual analyses of the limitations and capabilities of transformer models in approximating these functions, as well as reconstructions of the "shortcut" algorithms the model learns. By reconstructing these algorithms, we are able to correctly predict 91 percent of failure cases for one of the approximated functions. Our work provides a new foundation for understanding the behavior of neural networks that fail to solve the very tasks they are trained for.
Submitted 25 June, 2023; v1 submitted 24 May, 2023;
originally announced May 2023.
-
Assembling a ring-shaped crystal in a microfabricated surface ion trap
Authors:
Boyan Tabakov,
Francisco Benito,
Matthew Blain,
Craig R. Clark,
Susan Clark,
Raymond A. Haltli,
Peter Maunz,
Jonathan D. Sterk,
Chris Tigges,
Daniel Stick
Abstract:
We report on experiments with a microfabricated surface trap designed for trapping a chain of ions in a ring. Uniform ion separation over most of the ring is achieved with a rotationally symmetric design and by measuring and suppressing undesired electric fields. After minimizing these fields the ions are confined primarily by an rf trapping pseudo-potential and their mutual Coulomb repulsion. The ring-shaped crystal consists of approximately 400 Ca+ ions with an estimated average separation of 9 μm.
Submitted 26 January, 2015;
originally announced January 2015.
-
Characterization of fluorescence collection optics integrated with a micro-fabricated surface electrode ion trap
Authors:
Craig R. Clark,
Chin-wen Chou,
A. R. Ellis,
Jeff Hunker,
Shanalyn A. Kemme,
Peter Maunz,
Boyan Tabakov,
Chris Tigges,
Daniel L. Stick
Abstract:
One of the outstanding challenges for ion trap quantum information processing is to accurately detect the states of many ions in a scalable fashion. In the particular case of surface traps, geometric constraints make imaging perpendicular to the surface appealing for light collection at multiple locations with minimal cross-talk. In this report we describe an experiment integrating Diffractive Optic Elements (DOEs) with surface electrode traps, connected through in-vacuum multi-mode fibers. The square DOEs reported here were all designed with solid angle collection efficiencies of 3.58%; with all losses included, a detection efficiency of 0.388% (1.02% excluding the PMT loss) was measured with a single Ca+ ion. The presence of the DOE had minimal effect on the stability of the ion, both in temporal variation of stray electric fields and in motional heating rates.
Submitted 21 May, 2013;
originally announced May 2013.
-
Design, Fabrication, and Experimental Demonstration of Junction Surface Ion Traps
Authors:
D. L. Moehring,
C. Highstrete,
D. Stick,
K. M. Fortier,
R. Haltli,
C. Tigges,
M. G. Blain
Abstract:
We present the design, fabrication, and experimental implementation of surface ion traps with Y-shaped junctions. The traps are designed to minimize the pseudopotential variations in the junction region at the symmetric intersection of three linear segments. We experimentally demonstrate robust linear and junction shuttling with greater than one million round-trip shuttles without ion loss. By minimizing the direct line of sight between trapped ions and dielectric surfaces, negligible day-to-day and trap-to-trap variations are observed. In addition to high-fidelity single-ion shuttling, multiple-ion chains survive splitting, ion-position swapping, and recombining routines. The development of two-dimensional trapping structures is an important milestone for ion-trap quantum computing and quantum simulations.
Submitted 9 May, 2011;
originally announced May 2011.
-
Demonstration of a microfabricated surface electrode ion trap
Authors:
D Stick,
K M Fortier,
R Haltli,
C Highstrete,
D L Moehring,
C Tigges,
M G Blain
Abstract:
In this paper we present the design, modeling, and experimental testing of surface electrode ion traps fabricated in a heterostructure configuration comprising a silicon substrate, silicon dioxide insulators, and aluminum electrodes. This linear trap has a geometry with symmetric RF leads, two interior DC electrodes, and 40 individual lateral DC electrodes. Plasma enhanced chemical vapor deposition (PECVD) was used to grow silicon dioxide pillars to electrically separate overhung aluminum electrodes from an aluminum ground plane. In addition to fabrication, we report techniques for modeling the control voltage solutions and the successful demonstration of trapping and shuttling ions in two identically constructed traps.
Submitted 16 November, 2010; v1 submitted 5 August, 2010;
originally announced August 2010.