Showing 1–1 of 1 results for author: Guevara, C R

  1. arXiv:2211.12312

    cs.LG cs.AI

    Interpreting Neural Networks through the Polytope Lens

    Authors: Sid Black, Lee Sharkey, Leo Grinsztajn, Eric Winsor, Dan Braun, Jacob Merizian, Kip Parker, Carlos Ramón Guevara, Beren Millidge, Gabriel Alfour, Connor Leahy

    Abstract: Mechanistic interpretability aims to explain what a neural network has learned at a nuts-and-bolts level. What are the fundamental primitives of neural network representations? Previous mechanistic descriptions have used individual neurons or their linear combinations to understand the representations a network has learned. But there are clues that neurons and their linear combinations are not the…

    Submitted 22 November, 2022; originally announced November 2022.

    Comments: 22/11/22 initial upload