MSc Thesis: Bridging mechanistic interpretability circuits to faithful natural language explanations using ERASER evaluation metrics
msc-thesis explainability gpt-2 natural-language-explanations mechanistic-interpretability transformer-lens eraser-metrics
Updated May 30, 2026 - Jupyter Notebook