Conversation

@athrael-soju athrael-soju commented Nov 17, 2025

This is my take on adding interpretability to ColModernVBert and ColIdefics3Processor. I'm happy to take suggestions for improvement if you think it's useful; if not, I'm happy to close it.

doc_1

[Images: original page and per-token similarity maps for the query "What is the dividend pay out in 2012" (tokens 0-7), plus an unlabeled token 8.]

doc_2

[Images: original page and per-token similarity maps for the query "What is the name of the person in the CC field" (tokens 0-10), plus an unlabeled token 11.]

doc_3

[Images: original page and per-token similarity maps for the query "What is the personnel costs in the 4 th year" (tokens 0-9), plus an unlabeled token 10.]
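
For readers unfamiliar with per-token maps like the ones attached here, the following is a minimal, dependency-free sketch of the general idea: score every image patch embedding against one query token embedding and reshape the scores into a grid. The function name `similarity_maps`, the toy embeddings, and the use of a plain dot product are illustrative assumptions, not the code in this PR.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def similarity_maps(
    query_embs: List[Vector],
    patch_embs: List[Vector],
    n_rows: int,
    n_cols: int,
) -> List[List[List[float]]]:
    """Return one n_rows x n_cols score grid per query token.

    Assumes patch_embs is ordered row by row (row-major).
    """
    if len(patch_embs) != n_rows * n_cols:
        raise ValueError("patch count does not match the grid size")
    maps = []
    for q in query_embs:
        scores = [dot(q, p) for p in patch_embs]
        # Reshape the flat score list into grid rows.
        maps.append([scores[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)])
    return maps


# Toy input: 2 query tokens, a 2x2 patch grid, embedding dimension 2.
query_embs = [[1.0, 0.0], [0.0, 1.0]]
patch_embs = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]
maps = similarity_maps(query_embs, patch_embs, n_rows=2, n_cols=2)
```

Each entry of `maps` can then be upsampled and overlaid on the page image, one overlay per query token.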

@athrael-soju athrael-soju marked this pull request as draft November 17, 2025 21:25
Author

athrael-soju commented Nov 17, 2025

@ManuelFay @paultltc @tonywu71 happy for any feedback you can provide.

@athrael-soju athrael-soju marked this pull request as ready for review November 18, 2025 18:34
@athrael-soju
Author

As I understand it, the "bleeding" happens when we upsample the 64×64 similarity grid to overlay it on the original image, and it is harmless. Happy to be corrected.
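
To illustrate the upsampling effect described above, here is a small pure-Python sketch that compares nearest-neighbour and bilinear upsampling on a toy 3×3 grid with a single hot cell. It assumes the overlay uses a smooth (bilinear-style) interpolation, which may differ from what the PR actually does; both helper functions are hypothetical.

```python
from typing import List

Grid = List[List[float]]


def upsample_nearest(grid: Grid, factor: int) -> Grid:
    """Repeat each cell factor x factor times (no blending between cells)."""
    h, w = len(grid), len(grid[0])
    return [[grid[r // factor][c // factor] for c in range(w * factor)]
            for r in range(h * factor)]


def upsample_bilinear(grid: Grid, factor: int) -> Grid:
    """Bilinear upsampling using pixel centres, clamped at the borders."""
    h, w = len(grid), len(grid[0])
    out = []
    for r in range(h * factor):
        # Map the output pixel centre back into source coordinates.
        y = min(max((r + 0.5) / factor - 0.5, 0.0), h - 1)
        y0 = int(y)
        y1 = min(y0 + 1, h - 1)
        wy = y - y0
        row = []
        for c in range(w * factor):
            x = min(max((c + 0.5) / factor - 0.5, 0.0), w - 1)
            x0 = int(x)
            x1 = min(x0 + 1, w - 1)
            wx = x - x0
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


# One hot cell in the middle of a 3x3 grid, upsampled 4x to 12x12.
grid = [[0.0, 0.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0]]
nearest = upsample_nearest(grid, 4)
bilinear = upsample_bilinear(grid, 4)

nearest_nonzero = sum(v > 0 for row in nearest for v in row)
bilinear_nonzero = sum(v > 0 for row in bilinear for v in row)
```

In this toy case the nearest-neighbour result keeps the hot cell inside its own 4×4 footprint, while the bilinear result spreads non-zero values into neighbouring cells. The peak still falls inside the original cell, so the blending changes how the overlay looks but not where the maximum is.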

Move generate_interpretability_maps.py from tests/models/modernvbert/ to examples/interpretability/colmodernvbert/ to better reflect its purpose as an example script rather than a test.
…bility functionality

Add handling for None longest_edge parameter in ColModernVBertProcessor and ColIdefics3Processor to support research use cases where resizing should be disabled. When longest_edge is None, original image dimensions are preserved.
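
A minimal sketch of the behaviour described in this commit message, assuming the resize keeps the aspect ratio and scales the longer side to `longest_edge`. The helper `target_size` is hypothetical and is not the processor's actual method; the real processors may round or clamp differently.

```python
from typing import Optional, Tuple


def target_size(width: int, height: int, longest_edge: Optional[int]) -> Tuple[int, int]:
    """Return the (width, height) an image would be resized to.

    When longest_edge is None, resizing is disabled and the original
    dimensions are returned unchanged.
    """
    if longest_edge is None:
        return width, height
    # Scale so that the longer side equals longest_edge (assumed behaviour).
    scale = longest_edge / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))


unchanged = target_size(1000, 500, None)
resized = target_size(1000, 500, 500)
```

With `longest_edge=None` the patch grid follows the original image dimensions, which is the research use case the commit message mentions.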

Add get_local_image_mask() method to ColIdefics3Processor to identify local image tokens while excluding global patch tokens, enabling better spatial correspondence for interpretability analysis.
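
The idea behind such a mask can be sketched as follows. This is not the PR's implementation of get_local_image_mask(): the token IDs are made up, and the sketch assumes a single image whose global patch tokens come after a dedicated marker token in the sequence.

```python
from typing import List

IMAGE_TOKEN_ID = 1      # hypothetical ID of the image placeholder token
GLOBAL_MARKER_ID = 2    # hypothetical ID of the marker preceding the global image


def local_image_mask(input_ids: List[int]) -> List[bool]:
    """Mark image tokens that belong to local tiles, not the global image.

    Assumes one image per sequence, with all global patch tokens placed
    after the global marker token.
    """
    mask = []
    in_global = False
    for tok in input_ids:
        if tok == GLOBAL_MARKER_ID:
            in_global = True
        mask.append(tok == IMAGE_TOKEN_ID and not in_global)
    return mask


# Two local tiles of two image tokens each, then the global image, then text.
input_ids = [0, 1, 1, 0, 1, 1, 2, 1, 1, 0, 9]
mask = local_image_mask(input_ids)
```

Keeping only the masked positions leaves tokens that map one-to-one onto regions of the page, which is what a spatial similarity map needs.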
@athrael-soju
Author

Updated with lots of maps. Would appreciate a review!
