feat: Deduplicate tagged blobs #5512
rts/motoko-rts/src/persistence.rs (Outdated)

    /// Setter method for the dedup table.
    pub(crate) unsafe fn set_dedup_table_ptr(dedup_table: Value) {
        let metadata = PersistentMetadata::get();
        (*metadata).dedup_table = dedup_table;
I believe here a write barrier would be needed.
fixed, thank you!
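For readers unfamiliar with the term, here is a minimal toy sketch of what a write barrier can look like in an incremental collector. It is not the motoko-rts code: `Heap`, `ObjId`, and `write_with_barrier` are hypothetical names, and the pre-update (snapshot-at-the-beginning) flavor shown is only one common design, assumed here for illustration.

```rust
// Toy model only: these types and names are hypothetical, not the motoko-rts API.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ObjId(usize);

struct Heap {
    /// True while an incremental mark phase is in progress.
    marking: bool,
    /// Objects the collector still has to visit.
    mark_stack: Vec<ObjId>,
}

impl Heap {
    /// Pre-update barrier: before a pointer field is overwritten during
    /// marking, remember the old target so the collector still visits it.
    fn write_with_barrier(&mut self, field: &mut Option<ObjId>, new_value: Option<ObjId>) {
        if self.marking {
            if let Some(old) = *field {
                self.mark_stack.push(old);
            }
        }
        // The actual store happens only after the old value has been recorded.
        *field = new_value;
    }
}
```

A plain assignment such as the one in the diff above skips the recording step, which is the kind of gap a barrier is meant to close.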
Great stuff. Just the one comment on barrier
This PR adds (tagged) blob deduplication support. The main issue it solves is that all external calls to a Motoko canister go through Candid deserialization, so blobs passed as arguments end up as fresh blobs on the Motoko heap. Calling a method multiple times with the same blob argument therefore creates multiple copies of that blob.

To achieve deduplication, this PR does the following:

* In `internals.mo`, it creates a fixed-size hash table that resolves collisions via chaining.
* It sets up a thin RTS interface to set/get the hash table allocated in `internals.mo`, so that the RTS layer tracks the table, the table is not garbage collected, and it survives upgrades.
* The hash table stores weak references pointing to the actual objects; once an object is garbage collected, its weak reference points to null.
* It adds a thin client interface (in `prim.mo`) to walk the hash table, check which deduplicated blobs are alive or dead, and prune the dead ones if needed.
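The scheme described above (fixed-size table, chained buckets, weak references, pruning) can be sketched roughly as follows. This is an illustrative model in plain Rust, not the PR's Motoko or RTS code: `DedupTable`, `BUCKETS`, `dedup`, and `prune` are hypothetical names, and `Rc`/`Weak` reference counting stands in for the garbage collector, so a blob counts as "collected" once its last strong reference is dropped.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::rc::{Rc, Weak};

/// Fixed number of buckets (an arbitrary value chosen for this sketch).
const BUCKETS: usize = 8;

/// Hypothetical stand-in for the dedup table; each bucket is a chain of
/// weak references to blobs.
struct DedupTable {
    buckets: Vec<Vec<Weak<[u8]>>>,
}

impl DedupTable {
    fn new() -> Self {
        DedupTable { buckets: (0..BUCKETS).map(|_| Vec::new()).collect() }
    }

    /// Map blob contents to a bucket index.
    fn bucket_of(bytes: &[u8]) -> usize {
        let mut hasher = DefaultHasher::new();
        bytes.hash(&mut hasher);
        (hasher.finish() as usize) % BUCKETS
    }

    /// Return the live blob equal to `bytes` if one is registered,
    /// otherwise allocate a new blob and register a weak reference to it.
    fn dedup(&mut self, bytes: &[u8]) -> Rc<[u8]> {
        let chain = &mut self.buckets[Self::bucket_of(bytes)];
        for weak in chain.iter() {
            // `upgrade` yields None once the blob has been freed.
            if let Some(live) = weak.upgrade() {
                if &*live == bytes {
                    return live;
                }
            }
        }
        let fresh: Rc<[u8]> = Rc::from(bytes);
        chain.push(Rc::downgrade(&fresh));
        fresh
    }

    /// Remove entries whose blobs are dead; returns how many were removed.
    fn prune(&mut self) -> usize {
        let mut removed = 0;
        for chain in self.buckets.iter_mut() {
            let before = chain.len();
            chain.retain(|weak| weak.strong_count() > 0);
            removed += before - chain.len();
        }
        removed
    }
}
```

Because the table holds only weak references, it never keeps a blob alive on its own; dead entries linger in their chains until `prune` is called, which is why a walk-and-prune interface is useful.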