Expand XAI notebooks with perturbation, TCAV, and attention vs attribution by aravind-3105 · Pull Request #28 · VectorInstitute/interpretability-llms-agents

aravind-3105 · 2026-03-12T21:42:03Z

Summary

This pull request significantly expands the xai_refresher module's documentation and dependencies to support four new advanced notebooks (perturbation, robustness, TCAV, and attention vs attribution) and clarifies their interconnections. It also updates the recommended setup, adds required libraries, and reorganizes further reading references for clarity and completeness.

Clickup Ticket(s): Link(s) if applicable.

Type of Change

🐛 Bug fix (non-breaking change that fixes an issue)
✨ New feature (non-breaking change that adds functionality)
💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
📝 Documentation update
🔧 Refactoring (no functional changes)
⚡ Performance improvement
🧪 Test improvements
🔒 Security fix

Changes Made

Notebooks and Documentation Enhancements:

Added detailed descriptions and sequencing for four new advanced notebooks: perturbation-based attribution for vision and text, TCAV concept-level interpretability, and attention vs attribution analysis. Provided a summary table and cross-notebook connections to clarify learning progression and shared insights.
Updated the recommended setup to specify additional dependencies (captum, transformers, datasets, bertviz) needed for the new notebooks.
Revised the "Getting Started" section to include the new notebooks in the suggested order of completion, ensuring a logical learning path.

Dependencies and Environment:

Added captum, huggingface-hub, and bertviz to the ref1-refresher-interpretability dependency group in pyproject.toml to support the new notebooks.
Updated the Jupyter kernel specification in concept_grounding.ipynb for consistency with the new environment.

Reference Materials:

Reorganized and expanded the "Further Reading" section to group papers by theme (LIME, SHAP, perturbation, bias, TCAV, attention, concept grounding) and include recent and foundational works relevant to the new content.

Testing

Tests pass locally (uv run pytest tests/)
Type checking passes (uv run mypy <src_dir>)
Linting passes (uv run ruff check src_dir/)
Manual testing performed (describe below)

Manual testing details:

Screenshots/Recordings

Related Issues

Deployment Notes

Checklist

Code follows the project's style guidelines
Self-review of code completed
Documentation updated (if applicable)
No sensitive information (API keys, credentials) exposed

aravind-3105 · 2026-03-12T21:46:27Z

@shainarazavi I've added the extra notebooks we planned in order to expand the refresher implementation. The main files to check are:

perturbation_robustness_and_bias_text.ipynb
attention_vs_attribution.ipynb

The remaining changes to the previous notebooks are minor, mostly aesthetic updates.

Copilot

Pull request overview

This PR expands the implementations/xai_refresher module documentation and environment setup to support additional advanced XAI notebooks (perturbation/robustness, TCAV, attention vs attribution), and updates the dependency group/lockfile accordingly.

Changes:

Added new interpretability dependencies to the ref1-refresher-interpretability group and updated uv.lock.
Updated implementations/xai_refresher/README.md to document notebooks 5–8 and reorganize further reading.
Adjusted the Jupyter kernelspec metadata in concept_grounding.ipynb.

Reviewed changes

Copilot reviewed 3 out of 8 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `uv.lock` | Adds new resolved packages for the expanded notebook environment (but currently contains duplicated package entries). |
| `pyproject.toml` | Extends `ref1-refresher-interpretability` with new libraries for the added notebooks (but currently missing `datasets`). |
| `implementations/xai_refresher/concept_grounding.ipynb` | Updates kernelspec metadata to match the environment naming used by newer notebooks. |
| `implementations/xai_refresher/README.md` | Documents new notebook sequence, interconnections, and expands/reorganizes references. |

Copilot · 2026-03-12T21:47:08Z

pyproject.toml

+    "captum>=0.8.0",
+    "huggingface-hub>=1.3.4",
+    "bertviz>=1.4.1",


ref1-refresher-interpretability is missing the datasets dependency, but the new notebooks/docs require it (e.g., tcav_concept_sensitivity.ipynb imports datasets). Please add datasets to this dependency group and regenerate uv.lock so the environment matches the README/setup instructions.

shainarazavi · 2026-03-16T01:26:47Z

@aravind-3105 Some of them are PRs to run locally. Let's meet tomorrow, do a dry run, and get it done.

shainarazavi

We can dry run. However, from a quick review: check the `model` vs `clf_model` naming collision. Also, is `def gradient_token_importance(text, target_class=None):` defined twice?
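
One quick way to answer the "defined twice" question is to parse the notebook's code and count the top-level `def` statements. The following is a minimal sketch, assuming the code cells have already been concatenated into one string; `find_duplicate_defs` and `example_source` are hypothetical names for illustration and are not part of this PR.

```python
import ast
from collections import Counter


def find_duplicate_defs(source: str) -> dict[str, int]:
    """Return top-level function names that are defined more than once."""
    tree = ast.parse(source)
    counts = Counter(
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical notebook code in which a later cell repeats the definition.
example_source = '''
def gradient_token_importance(text, target_class=None):
    return []

def helper():
    return None

def gradient_token_importance(text, target_class=None):
    return [0.0]
'''

duplicates = find_duplicate_defs(example_source)
print(duplicates)
```

In Python the later definition silently replaces the earlier one, so a duplicate changes behaviour depending on which cells have been run. Cells containing IPython magics (for example lines starting with `%`) would need to be filtered out before calling `ast.parse`.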

shainarazavi

That line appears twice, once for the train results and once for the test features:
`data = torch.load(test_features_path, map_location="cpu")`
Both calls omit `weights_only=True`, meaning arbitrary pickle code can execute on load. Is it a design choice?
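
To illustrate the concern, here is a small standard-library sketch of why unrestricted unpickling is risky and how an allow-list unpickler blocks it, which is the general idea behind `weights_only=True`. The classes `Exploit` and `AllowListUnpickler` and the helper `safe_loads` are hypothetical names for this example; this is not PyTorch's actual implementation.

```python
import io
import pickle


class Exploit:
    """Stand-in for a malicious payload: unpickling it calls a function."""

    def __reduce__(self):
        # On load, pickle would call print(...); a real attack could call
        # something far more harmful in its place.
        return (print, ("arbitrary code ran during unpickling",))


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not explicitly allowed."""

    ALLOWED = {("builtins", "dict"), ("builtins", "list")}

    def find_class(self, module, name):
        if (module, name) in self.ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def safe_loads(data: bytes):
    """Load pickled bytes through the allow-list unpickler."""
    return AllowListUnpickler(io.BytesIO(data)).load()


# Plain containers of numbers need no globals, so they load normally.
plain = safe_loads(pickle.dumps({"features": [1.0, 2.0]}))

# The payload references builtins.print, which is not on the allow list.
payload = pickle.dumps(Exploit())
try:
    safe_loads(payload)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

In the notebooks, the corresponding change would presumably be to pass `weights_only=True` to both `torch.load` calls, provided the saved files contain only tensors and plain containers.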

shainarazavi

Great work Aravind, we can dry run. Just a quick question about this line:
`num_layers = len(model(**tokenizer("test", return_tensors="pt").to(device)).hidden_states)`
It runs a full forward pass just to count layers, which is wasteful. Also, `profession_concept` is defined twice, I think once as full sentences and once as single words. We can pick one form and define it once.
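
A possible alternative, sketched under the assumption that the model is a standard Hugging Face transformer whose config exposes `num_hidden_layers` and whose `hidden_states` tuple holds the embedding output plus one entry per layer. `expected_hidden_state_count` and `fake_config` are hypothetical names; in the notebook the argument would be `model.config`.

```python
from types import SimpleNamespace


def expected_hidden_state_count(config) -> int:
    """Entries in `hidden_states`: the embedding output plus one per layer."""
    return config.num_hidden_layers + 1


# Stand-in config for illustration only; no model or forward pass is needed.
fake_config = SimpleNamespace(num_hidden_layers=12)
num_layers = expected_hidden_state_count(fake_config)
print(num_layers)
```

Reading the value from the config avoids tokenizing and running the model, but it is worth confirming once against the actual `hidden_states` length for the specific architecture used.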

shainarazavi · 2026-03-17T01:10:15Z

@aravind-3105 Thanks for the great work! Please feel free to merge the PR at your convenience.

aravind-3105 requested review from Copilot and shainarazavi March 12, 2026 21:42

aravind-3105 self-assigned this Mar 12, 2026

aravind-3105 added the enhancement New feature or request label Mar 12, 2026

Copilot started reviewing on behalf of aravind-3105 March 12, 2026 21:42

Copilot AI reviewed Mar 12, 2026

shainarazavi reviewed Mar 16, 2026

aravind-3105 force-pushed the ref-imp1-expand branch from d13d16b to 3522393 Compare March 17, 2026 19:27

aravind-3105 added 17 commits March 17, 2026 19:25

Add perturbation & robustness notebooks for text and vision expanding implementation 1

6b5040b

Update XAI Refresher README to include new perturbation notebooks and dependencies

d2277ff

Add new TCAV notebook and README to reflect the addition

23cb694

Add notebook on attention vs attribution with bertviz

4d11b93

Update XAI Refresher README to include new notebook on attention vs attribution and adjust library dependencies

30dd39f

Update cache directory settings and kernel specifications in notebooks

fa82427

Add perturbation & robustness notebooks for text and vision expanding implementation 1

64d0e3b

Fix evaluation function call in 05_evaluation.ipynb (imp4) notebook and update README for clarity on index-url requirements

99db37c

Update uv.lock

7a93683

Remove unused asyncio import from 05_evaluation.ipynb

33398f5

Remove huggingface-hub dependency from pyproject.toml

21c046b

Refactor imports and improve code formatting in tcav_concept_sensitivity.ipynb

f9e8f69

Refactor imports and improve code formatting in attention, perturbation notebooks

96ba026

Refactor attention_vs_attribution.ipynb by reorganizing code structure and improving function definitions

8c39314

Refactor code structure and improve formatting in attention_vs_attribution and perturbation_robustness_captum_image notebooks

93248bc

Fix model loading code and clean up output in perturbation robustness and attention notebooks

4f95c58

Address code checks.

20c0351

aravind-3105 force-pushed the ref-imp1-expand branch from 18bff8b to 20c0351 Compare March 18, 2026 00:11

Add data image dog.jpeg

cfbb5bd

aravind-3105 merged commit e9ef86a into main Mar 18, 2026
2 checks passed

aravind-3105 deleted the ref-imp1-expand branch March 18, 2026 16:46

aravind-3105 merged 18 commits into main from ref-imp1-expand

3 participants
