transformer-lens

Here are 26 public repositories matching this topic...

Mechanistic interpretability study comparing modular addition and subtraction circuits in 1-layer attention-only transformers via activation patching, logit lens, SVD circuit analysis, Fourier feature analysis, and causal scrubbing across three training stages.

  • Updated May 2, 2026
  • Python

Inspired by Alvin Lucier's "I Am Sitting in a Room" (1969), this project applies an analogous rendering process to GPT-2 Small: the model's activation tensor is excited through iterative forward-pass feedback, repeated 500 times. As semantic content dissolves, dominant attractor states emerge, revealing the model's naked inner voice.

  • Updated Jul 2, 2026
  • Jupyter Notebook
