Optimize image blobs by willwade · Pull Request #10 · asterics/AsTeRICS-Grid-Boards

willwade · 2025-10-22T10:48:03Z

Optimize Embedded Image Blobs with ARASAAC/OpenSymbols URLs

Summary

Reduces repository size by about 50% by replacing embedded image blobs with external ARASAAC/OpenSymbols URLs and applying lossless compression.

Problem

1,274 embedded image blobs across 39 communicator boards
~375 MB of embedded blobs (51.7% of repository)
Many blobs are duplicates of publicly available symbols
Increases clone time and storage requirements
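
As a rough illustration of how figures like these could be gathered, here is a minimal sketch that walks a parsed board file and counts embedded images. It assumes blobs are stored as `data:image/...` strings somewhere in the JSON tree; the `image`/`data` field names in the usage example are hypothetical, and this is not the script used for the PR.

```python
import base64
import json


def iter_data_uris(node):
    """Yield every string in a parsed JSON tree that starts with 'data:image/'."""
    if isinstance(node, str):
        if node.startswith("data:image/"):
            yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_data_uris(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_data_uris(value)


def blob_stats(json_text: str) -> tuple[int, int]:
    """Return (blob count, total payload bytes) for one board file's text."""
    count = total = 0
    for uri in iter_data_uris(json.loads(json_text)):
        header, _, payload = uri.partition(",")
        if header.endswith(";base64"):
            total += len(base64.b64decode(payload))
        else:
            # Non-base64 data URIs (e.g. inline SVG text) are counted as-is.
            total += len(payload)
        count += 1
    return count, total


# Usage with a tiny made-up document (field names are hypothetical):
doc = json.dumps({
    "grids": [
        {"image": {"data": "data:image/png;base64," + base64.b64encode(b"12345").decode()}},
        {"image": {"url": "https://example.org/symbol.png"}},
    ]
})
print(blob_stats(doc))
```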

Solution

Implemented 4 things!

1 - Exact Matching (SHA256): 444 blobs → ARASAAC/OpenSymbols URLs
2 - Fuzzy Label Matching: 4,547 symbols matched (88-92% confidence) (<- This is the sketchiest part!)
3 - Perceptual Hashing: 555 additional blobs migrated (<- Second sketchiest part)
4 - Lossless Compression: 20% reduction on remaining blobs (Easy win)
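
The exact-matching step (1) can be sketched roughly as follows. This is a minimal illustration, assuming the symbol library images have already been downloaded as raw bytes and that blobs are base64 data URIs; the function names and the example URL are hypothetical, not taken from the PR's scripts.

```python
import base64
import hashlib
from typing import Optional


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of raw image bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def build_index(library: dict[str, bytes]) -> dict[str, str]:
    """Map digest -> symbol URL for every image in a symbol library.

    `library` maps a symbol URL to the raw bytes downloaded from it.
    """
    return {sha256_hex(raw): url for url, raw in library.items()}


def decode_data_uri(data_uri: str) -> bytes:
    """Decode the base64 payload of a 'data:image/...;base64,...' string."""
    _header, _, payload = data_uri.partition(",")
    return base64.b64decode(payload)


def exact_match(data_uri: str, index: dict[str, str]) -> Optional[str]:
    """Return the library URL whose bytes are identical to the blob, if any."""
    return index.get(sha256_hex(decode_data_uri(data_uri)))


# Usage with placeholder bytes standing in for a real PNG:
library = {"https://example.org/symbols/1.png": b"fake-png-bytes"}
index = build_index(library)
blob = "data:image/png;base64," + base64.b64encode(b"fake-png-bytes").decode()
print(exact_match(blob, index))
```

A byte-level hash only matches files that are identical bit for bit, which is why re-encoded or resized copies of the same symbol need the fuzzier steps.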

Results

444 blobs migrated (Layer 1)
555 additional blobs migrated (Layer 3)
830 custom artwork images preserved as blobs
193.99 MB saved (51.7% reduction) (BUT note: a lot of this comes simply from converting OBZ to JSON, which strips out the embedded images)
67 files modified across 39 boards

Files Changed

67 communicator board files (.grd.json)
Metadata files (live_metadata.json, live_predefined_*.json)
25 primary communicator boards

Testing

I've tried to manually test all changes by loading the pages up. BUT I will be honest - I can't be 10000% sure we haven't messed something up. If we could A/B test this, or get someone else to test it out, that would be fab.

Notes

Separated from vocal flair changes (fix-remaining-openboards PR)
Should be merged AFTER fix-remaining-openboards is done.

I should do a blog post on this, as I learnt a ton about hashing. Ideally OpenSymbols and ARASAAC would expose hashes through their APIs - that would be handy.

- Replaced 444 image blobs with external ARASAAC/OpenSymbols URLs
- Implemented fuzzy perceptual hashing with Hamming distance (threshold=16)
- Dual-index matching: ARASAAC (13,623) + OpenSymbols (7,453) symbols
- 830 custom artwork images remain as blobs (not in symbol libraries)
- Updated all metadata files (live_metadata.json, live_predefined_*.json)
- Zero errors during migration
- 67 files modified across 39 communicator boards
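
To make the Hamming-distance matching above concrete, here is a small standard-library sketch. It uses an average hash over an 8x8 grayscale grid, which is a simpler relative of pHash and not necessarily what the migration used; decoding and downscaling the image to that grid is assumed to happen elsewhere. The index layout and names are hypothetical; only the threshold of 16 comes from the commit message.

```python
from typing import Optional

THRESHOLD = 16  # maximum Hamming distance, as quoted in the commit message


def average_hash(gray: list[list[int]]) -> int:
    """Hash an 8x8 grid of grayscale values (0-255) into a 64-bit integer.

    Each bit is 1 when the pixel is brighter than the grid's mean.
    """
    pixels = [p for row in gray for p in row]
    mean = sum(pixels) / len(pixels)
    bits = 0
    for p in pixels:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def nearest(
    blob_hash: int, indexes: dict[str, dict[int, str]]
) -> Optional[tuple[str, str, int]]:
    """Search every library index and return (library, url, distance) for the
    closest symbol, or None when nothing is within THRESHOLD."""
    best = None
    for library, index in indexes.items():
        for symbol_hash, url in index.items():
            d = hamming(blob_hash, symbol_hash)
            if d <= THRESHOLD and (best is None or d < best[2]):
                best = (library, url, d)
    return best


# Usage: a grid that is dark on the left and bright on the right.
grid = [[0, 0, 0, 0, 255, 255, 255, 255] for _ in range(8)]
indexes = {
    "arasaac": {0x0F0F0F0F0F0F0F0F: "https://example.org/arasaac/1"},
    "opensymbols": {0xFFFFFFFFFFFFFFFF: "https://example.org/opensymbols/2"},
}
print(nearest(average_hash(grid), indexes))
```

A linear scan over roughly 21,000 hashes per blob is slow but workable for a one-off migration; a lower threshold trades fewer matches for fewer false positives.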

- Recompressed 830 image blobs across 24 communicator files
- JPEG optimization: 295 blobs, 1.47 MB saved (17.8%)
- PNG optimization: 331 blobs, 1.24 MB saved (12.1%)
- SVG optimization: 189 blobs, 12.2 KB saved (7.0%)
- Total savings: 2.7 MB (14.3% reduction)

Compression techniques:
- JPEG: quality=85 with optimize flag (visually lossless)
- PNG: optimize flag preserving transparency
- SVG: removed metadata, comments, and collapsed whitespace

Files optimized:
- ABA programmes (689 KB saved)
- Vocabuléo-by-LAdapeila (689 KB saved)
- ARASAAC Global Grid Core Communicator (255 KB saved)
- Communication in hospital (224 KB saved)
- And 20 other communicator files

No visual quality loss. All blobs remain embedded in .grd.json files.
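
The SVG step described above (dropping metadata and comments, collapsing whitespace) could look something like this regex-based sketch. It is an assumption about the approach, not the PR's code, and it is deliberately naive: collapsing whitespace can change rendering inside `<text>` elements, so a real run would want to check those.

```python
import re


def minify_svg(svg: str) -> str:
    """Strip comments and <metadata> blocks, then collapse whitespace."""
    svg = re.sub(r"<!--.*?-->", "", svg, flags=re.DOTALL)
    svg = re.sub(
        r"<metadata\b.*?</metadata>", "", svg, flags=re.DOTALL | re.IGNORECASE
    )
    svg = re.sub(r">\s+<", "><", svg)  # whitespace between tags
    svg = re.sub(r"\s+", " ", svg)     # remaining runs of whitespace
    return svg.strip()


# Usage on a small hand-written SVG:
source = (
    '<svg xmlns="http://www.w3.org/2000/svg">\n'
    "  <!-- made with some editor -->\n"
    "  <metadata>\n    <rdf/>\n  </metadata>\n"
    '  <rect   width="1"\n   height="1"/>\n'
    "</svg>\n"
)
print(minify_svg(source))
```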

Reduced embedded blobs from 830 to 275 (66.9% reduction)

Migration strategy:
- Exact SHA256 matches with symbol libraries
- Perceptual hash matching (phash distance <= 20)
- Label-based matching (exact and partial keyword matching)
- Skipped large blobs (>500KB) likely to be custom images

Results by matching method:
- Label exact matches: 60 blobs
- Label partial matches: 192 blobs
- Perceptual hash matches: 3 blobs
- Total migrated: 555 blobs

Files with most migrations:
- ABA programmes: 68 blobs migrated (95 -> 27)
- Vocabuléo-by-LAdapeila: 129 blobs migrated (156 -> 27)
- Global-Core Communicator variants: 16-20 blobs each

Remaining 275 blobs are:
- Brand/product images (Quick_Say20)
- Custom artwork not in symbol libraries
- Unique illustrations

This reduces repository size and improves maintainability by using external symbol library URLs instead of embedded data.
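
For illustration, the label-based step and the large-blob skip could be sketched as below. The keyword index layout, the overlap scoring, and all names are hypothetical assumptions; the commit message only states that exact and partial keyword matching were used and that blobs over 500KB were skipped.

```python
import re
from typing import Optional

MAX_BLOB_BYTES = 500 * 1024  # larger blobs are assumed to be custom images


def keywords(label: str) -> set[str]:
    """Lowercase a label and split it into words."""
    return set(re.findall(r"\w+", label.lower()))


def match_label(
    label: str, blob_size: int, index: dict[str, str]
) -> Optional[tuple[str, str]]:
    """Return (kind, url) where kind is 'exact' or 'partial', or None.

    `index` maps a lowercase symbol keyword (possibly several words) to a URL.
    """
    if blob_size > MAX_BLOB_BYTES:
        return None
    normalized = " ".join(label.lower().split())
    if normalized in index:
        return ("exact", index[normalized])
    # Partial: pick the symbol sharing the most words with the label.
    words = keywords(label)
    best_url, best_overlap = None, 0
    for keyword, url in index.items():
        overlap = len(words & keywords(keyword))
        if overlap > best_overlap:
            best_url, best_overlap = url, overlap
    return ("partial", best_url) if best_url else None


# Usage with a two-entry keyword index:
index = {"dog": "https://example.org/dog", "ice cream": "https://example.org/ice"}
print(match_label("Dog", 1_000, index))
print(match_label("big dog", 1_000, index))
print(match_label("Dog", 600 * 1024, index))
```

Partial matches are the risky ones, since a shared word does not guarantee the same picture; they are the natural candidates for manual review.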

klues · 2025-10-22T14:24:12Z

Awesome work!
As said, I won't merge this because it changes all default gridsets and I don't want to change gridsets from cooperation partners without confirmation from them that the changes are fine and don't mess up something.

However for these sets I think we could merge your improvements:

all from the openboardformat page: https://www.openboardformat.org/examples
- including the vocal flair 84 - which you didn't include in the other PR (and maybe others from openboardformat which we do not have right now)
AsTeRICS Grid default: because I know it by heart and will be able to check quickly if everything is still working

Question:

Which format are the losslessly compressed images in? Still the original one, or some new/fancy format (which maybe isn't supported by all browsers)?

If you create a new PR changing only the sets I've mentioned above, please do the first PR by running only npm run generate-beta, which creates just the metadata files for the beta environment at grid.asterics.eu/latest. There I can check the changes before merging those generated with npm run generate, which affects the files for the prod version.

willwade · 2025-10-22T15:53:30Z

yeah - look at the previous PR - that should be good for JUST the vocal flair stuff #9 - I've removed all the grids in ->this<- pr here..

re: Formats. No - we could have swapped to WebP but I didn't want to do that. Basically just recompressed with guetzli for png jpeg ..

so to confirm: Look at the previous PR first - check that it's ok - then if good we can review this together a bit more..

klues · 2025-10-23T06:31:36Z

I've already merged #9 - but it only contained the vocal flair 112 not the 84 anymore. Vocal flair 112 is already available in prod - with 3.5MB instead of 60MB - which is great!

Basically just recompressed with guetzli for png jpeg ..

ok, great!

willwade added 3 commits October 22, 2025 11:31

willwade wants to merge 3 commits into asterics:main from willwade:optimize-image-blobs
