-
The Mysterious Case of Neuron 1512: Injectable Realignment Architectures Reveal Internal Characteristics of Meta's Llama 2 Model
Authors:
Brenden Smith,
Dallin Baker,
Clayton Chase,
Myles Barney,
Kaden Parker,
Makenna Allred,
Peter Hu,
Alex Evans,
Nancy Fulda
Abstract:
Large Language Models (LLMs) have an unrivaled and invaluable ability to "align" their output to a diverse range of human preferences, by mirroring them in the text they generate. The internal characteristics of such models, however, remain largely opaque. This work presents the Injectable Realignment Model (IRM) as a novel approach to language model interpretability and explainability. Inspired by earlier work on Neural Programming Interfaces, we construct and train a small network -- the IRM -- to induce emotion-based alignments within a 7B parameter LLM architecture. The IRM outputs are injected via layerwise addition at various points during the LLM's forward pass, thus modulating its behavior without changing the weights of the original model. This isolates the alignment behavior from the complex mechanisms of the transformer model. Analysis of the trained IRM's outputs reveals a curious pattern. Across more than 24 training runs and multiple alignment datasets, patterns of IRM activations align themselves in striations associated with a neuron's index within each transformer layer, rather than being associated with the layers themselves. Further, a single neuron index (1512) is strongly correlated with all tested alignments. This result, although initially counterintuitive, is directly attributable to design choices present within almost all commercially available transformer architectures, and highlights a potential weak point in Meta's pretrained Llama 2 models. It also demonstrates the value of the IRM architecture for language model analysis and interpretability. Our code and datasets are available at https://github.com/DRAGNLabs/injectable-alignment-model
Submitted 4 July, 2024;
originally announced July 2024.
-
Nonequilibrium Casimir effects of nonreciprocal surface waves
Authors:
Chinmay Khandekar,
Siddharth Buddhiraju,
Paul R. Wilkinson,
James K. Gimzewski,
Alejandro W. Rodriguez,
Charles Chase,
Shanhui Fan
Abstract:
We show that an isotropic dipolar particle in the vicinity of a substrate made of nonreciprocal plasmonic materials can experience a lateral Casimir force and torque when the particle's temperature differs from that of the slab and the environment. We connect the existence of the lateral force to the asymmetric dispersion of nonreciprocal surface polaritons and the existence of the lateral torque to the spin-momentum locking of such surface waves. Using the formalism of fluctuational electrodynamics, we show that the features of lateral force and torque should be experimentally observable using a substrate of doped Indium Antimonide (InSb) placed in an external magnetic field, and for a variety of dielectric particles. Interestingly, we also find that the directions of the lateral force and the torque depend on the constituent materials of the particles, which suggests a sorting mechanism based on lateral nonequilibrium Casimir physics.
Submitted 19 June, 2021;
originally announced June 2021.
-
Laser Optomechanics
Authors:
Weijian Yang,
S. Adair Gerke,
Kar Wei Ng,
Yi Rao,
Christopher Chase,
Connie J. Chang-Hasnain
Abstract:
Cavity optomechanics explores the coupling between the optical field and the mechanical oscillation to induce cooling and regenerative oscillation in a mechanical oscillator. So far, optomechanics relies on the detuning between the cavity and an external pump laser, where the laser acts only as a power supply. Here, we report a new scheme with mutual coupling between a mechanical oscillator that supports a mirror of a vertical-cavity surface-emitting laser (VCSEL) and the optical field, greatly enhancing the light-matter energy transfer. In this work, we used an ultra-light-weight (130 pg) high-contrast-grating (HCG) mirror in a VCSEL, whose reflectivity spectrum is designed to facilitate strong optomechanical coupling, to demonstrate optomechanically-induced regenerative oscillation of the laser optomechanical cavity with > 550 nm self-oscillation amplitude of the micro-mechanical oscillator, two to three orders of magnitude larger than typical. This new scheme not only offers an efficient approach for high-speed wavelength-swept sources, but also has far-reaching significance in the realization of quantum entanglement of macroscopic objects and ultrasensitive measurement of displacements and forces.
Submitted 26 February, 2015;
originally announced February 2015.
-
Randomly Broken Nuclei and Disordered Systems
Authors:
K. C. Chase,
P. Bhattacharyya,
A. Z. Mekjian
Abstract:
Similarities between models of fragmenting nuclei and disordered systems in condensed matter suggest corresponding methods. Several theoretical models of fragmentation investigated in this fashion show marked differences, indicating possible new methods for distinguishing models using yield data. Applying nuclear methods to disordered systems also yields interesting results.
Submitted 13 August, 1997;
originally announced August 1997.
-
Canonical and Microcanonical Ensemble Approaches to Bose-Einstein Condensation: The Thermodynamics of Particles in Harmonic Traps
Authors:
K. C. Chase,
A. Z. Mekjian,
L. Zamick
Abstract:
The thermodynamic properties of bosons moving in a harmonic trap in an arbitrary number of dimensions are investigated in the grand canonical, canonical and microcanonical ensembles by applying combinatorial techniques developed earlier in statistical nuclear fragmentation models. Thermodynamic functions such as the energy and specific heat are computed exactly in these ensembles. The occupation of the ground or condensed state is also obtained exactly, and signals clearly the phase transition. The application of these techniques to fermionic systems is also briefly discussed.
Submitted 10 August, 1997;
originally announced August 1997.
-
Studies in the statistical and thermal properties of hadronic matter under some extreme conditions
Authors:
K. C. Chase,
A. Z. Mekjian,
P. Meenakshisundaram
Abstract:
The thermal and statistical properties of hadronic matter under some extreme conditions are investigated using an exactly solvable canonical ensemble model. A unified model describing both the fragmentation of nuclei and the thermal properties of hadronic matter is developed. Simple expressions are obtained for quantities such as the hadronic equation of state, specific heat, compressibility, entropy, and excitation energy as a function of temperature and density. These expressions encompass the fermionic aspect of nucleons, such as degeneracy pressure and Fermi energy at low temperatures and the ideal gas laws at high temperatures and low density. Expressions are developed which connect these two extremes with behavior that resembles an ideal Bose gas with its associated Bose condensation. In the thermodynamic limit, an infinite cluster exists below a certain critical condition in a manner similar to the sudden appearance of the infinite cluster in percolation theory. The importance of multiplicity fluctuations is discussed and some recent data from the EOS collaboration on critical point behavior of nuclei can be accounted for using simple expressions obtained from the model.
Submitted 26 September, 1996;
originally announced September 1996.
-
Critical point multiplicities and multiplicity fluctuations in heavy ion collisions
Authors:
K. C. Chase,
A. Z. Mekjian
Abstract:
An exactly solvable model of nuclear fragmentation is shown to lead to a simple connection between the critical point multiplicity $\langle m \rangle_{c}$ and the critical point exponent $\tau$ recently reported by the EOS collaboration. The importance of multiplicity fluctuations for critical point behavior is also discussed.
Submitted 3 October, 1995;
originally announced October 1995.
-
Exact methods for Campi plots
Authors:
K. C. Chase,
A. Z. Mekjian
Abstract:
We introduce for canonical fragmentation models an exact method for computing expectation values which exclude the largest cluster. This method allows for the computation of the reduced multiplicity and other quantities of interest introduced by Campi, and a comparison shows that the percolation model and a recent canonical model differ for the most part only in small respects in these ensemble averages.
Submitted 23 August, 1995;
originally announced August 1995.
-
Heated nuclear matter, condensation phenomena and the hadronic equation of state
Authors:
K. C. Chase,
A. Z. Mekjian
Abstract:
The thermodynamic properties of heated nuclear matter are explored using an exactly solvable canonical ensemble model. This model reduces to the results of an ideal Fermi gas at low temperatures. At higher temperatures, the fragmentation of the nuclear matter into clusters of nucleons leads to features that resemble a Bose gas. Some parallels of this model with the phenomena of Bose condensation and with percolation phenomena are discussed. A simple expression for the hadronic equation of state is obtained from the model.
Submitted 21 June, 1995;
originally announced June 1995.
-
A Fully Isotopic Model of Fragmentation
Authors:
K. C. Chase,
A. Z. Mekjian
Abstract:
A general model for the fragmentation of a two-component system (e.g. protons and neutrons) is proposed and solved exactly. The extension of this model to any number of components is also shown to be exactly solvable. A connection between these models and the permutation group is discussed. The notion of isotopic equivalence is defined in order to evaluate the equivalence of these models to earlier one-component models. All the one-component models considered in earlier papers are shown to be equivalent to a particular class of two-component models. A simplified model applicable to the case of nuclear fragmentation is introduced and analyzed. Modifications to this model to include effects such as pairing and Coulomb interactions are discussed.
Submitted 24 February, 1994;
originally announced February 1994.