scouten-adobe
Collaborator

This is a sketch of the changes I'd like to see made to the Reader interface to make CAWG more directly part of the reading process from clients' point of view.

@scouten-adobe scouten-adobe self-assigned this Sep 2, 2025

codecov bot commented Sep 2, 2025

Codecov Report

❌ Patch coverage is 14.54545% with 47 lines in your changes missing coverage. Please review.
✅ Project coverage is 28.13%. Comparing base (c4fc7a6) to head (aeab1bc).

Files with missing lines            Patch %   Lines
sdk/src/manifest.rs                 6.25%     30 Missing ⚠️
sdk/src/reader.rs                   28.57%    15 Missing ⚠️
sdk/src/manifest_store_report.rs    0.00%     1 Missing ⚠️
sdk/src/validation_status.rs        0.00%     1 Missing ⚠️

❗ There is a different number of reports uploaded between BASE (c4fc7a6) and HEAD (aeab1bc).

HEAD has 2 uploads less than BASE
Flag    BASE (c4fc7a6)    HEAD (aeab1bc)
        3                 1
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1370       +/-   ##
===========================================
- Coverage   78.41%   28.13%   -50.29%     
===========================================
  Files         162      144       -18     
  Lines       39515    27439    -12076     
===========================================
- Hits        30986     7719    -23267     
- Misses       8529    19720    +11191     


// Run CAWG post-validation - this is async and requires a runtime.
fn post_validate(result: Result<C2paReader, c2pa::Error>) -> Result<C2paReader, c2pa::Error> {
    if true {
Contributor

Will you need sync APIs here?

let mut reader = Reader::from_file(dest)?;

reader.post_validate_async(&CawgValidator {}).await?;
let reader = Reader::from_file_async(dest).await?;
Contributor

Why not keep the sync interface too? I am not sure we can make everything async

Self::from_store(store, &validation_log)
let /* mut */ result = Self::from_store(store, &validation_log)?;
if false {
// QUESTION: What to do if we're in the _sync version and there
Contributor

Good question. I wouldn't want everything to become async. I wonder if we need the sync interface for some native bindings? It's not great but maybe we need a blocking API? Or a cawg validation opt-out for the sync path?
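
To make the "blocking API" option concrete, here is a minimal sketch of a sync entry point that blocks on its async counterpart. Everything in it is an assumption for illustration: `SketchReader` is a hypothetical stand-in (not the SDK's `Reader`), and `block_on` is a small hand-rolled executor built only on the standard library. A real implementation would more likely reuse whatever runtime the C bindings already depend on.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the thread that is parked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drives a future to completion on the current thread (std only).
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Re-poll after every wake-up; spurious wake-ups are harmless.
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical stand-in for a reader; NOT the c2pa `Reader`.
#[derive(Debug, PartialEq)]
pub struct SketchReader {
    pub path: String,
    pub cawg_validated: bool,
}

impl SketchReader {
    /// Hypothetical async constructor that also runs identity validation.
    pub async fn from_file_async(path: &str) -> Result<Self, String> {
        if path.is_empty() {
            return Err("empty path".to_string());
        }
        Ok(Self { path: path.to_string(), cawg_validated: true })
    }

    /// Sync wrapper: blocks the calling thread on the async version.
    pub fn from_file(path: &str) -> Result<Self, String> {
        block_on(Self::from_file_async(path))
    }
}
```

With this shape, `SketchReader::from_file("image.jpg")` returns the same value as the async constructor. The trade-off is that the calling thread is parked until validation finishes, which can stall a caller that is itself running inside an async runtime, so an opt-out for the sync path may still be worth having.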

@scouten-adobe scouten-adobe changed the title Add CAWG validation to reader feat: Add CAWG validation to reader Sep 11, 2025

codspeed-hq bot commented Sep 11, 2025

CodSpeed Performance Report

Merging #1370 will not alter performance

Comparing scouten/cai-9212-add-cawg-to-reader (aeab1bc) with main (c4fc7a6)

Summary

✅ 16 untouched
⏩ 2 skipped [1]

Footnotes

  1. 2 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@scouten-adobe scouten-adobe changed the title feat: Add CAWG validation to reader feat: Add CAWG validation to Reader Sep 12, 2025
@scouten-adobe scouten-adobe force-pushed the scouten/cai-9212-add-cawg-to-reader branch from ba322ff to 5270494 Compare September 12, 2025 21:44
@scouten-adobe scouten-adobe marked this pull request as ready for review September 12, 2025 21:50
@scouten-adobe
Collaborator Author

I think I've solved the sync/async issues by leaving the existing approach in the C API unchanged. I've added a new convenience function (async only) which does the combined test. That's what we were already doing in the C API wrapper functions, so I've removed my changes to those functions.

@tmathern tmathern requested a review from emensch September 12, 2025 22:03

#[test]
fn test_reader_post_validate() -> Result<()> {
if false {
Collaborator Author

Just noticed this was still here. Will remove shortly.

    format: &str,
    stream: impl Read + Seek,
) -> Result<Reader> {
    let mut reader = Self::from_stream_async(format, stream).await?;
Contributor

The implementations in
#[cfg(target_arch = "wasm32")]
pub async fn from_stream_with_cawg_async

and

#[cfg(not(target_arch = "wasm32"))]
pub async fn from_stream_with_cawg_async

are the same. Is there no way to reuse one instead of duplicating it? No trick to work around the differences in the required traits on the stream?

Collaborator Author

I wish … I wish …

Sadly, there is no such trick. (I've looked.)

Comment on lines 225 to 227
if get_settings_value::<bool>("verify.verify_after_reading")? {
    reader.post_validate_async(&CawgValidator {}).await?;
}
Contributor

So without verify.verify_after_reading, this will behave the same as Reader::from_stream_async?

Unless I'm mistaken, that feels a bit unintuitive to me. I'd think that calling from_stream_with_cawg_async would be "enough" from an end-user perspective to opt in to CAWG validation.

Collaborator

@gpeacock gpeacock left a comment

Why not just say that Reader::from_stream_async() supports CAWG but Reader::from_stream() doesn't yet? The same would go for Reader::from_file(). I don't see why we should add new CAWG-specific methods here.
Eventually, the sync methods can hopefully support CAWG too.

Another option is that we can configure the cawg support with a setting. So if you have cawg enabled we always do the cawg call and if you don't there's no extra overhead.

Maybe there's a separate (or the same) option that allows the sync methods to block on the async call. You can disable the setting if you don't want that behavior.
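
The settings-driven behavior described in this comment can be sketched as a small decision function. This is only an illustration under stated assumptions: `CawgSettings`, its two fields, `CallPath`, `CawgAction`, and `plan_cawg` are all hypothetical names (not SDK settings keys or types), and the `has_identity_assertion` check is an added assumption so that manifests without a CAWG identity assertion incur no extra work.

```rust
/// Hypothetical settings; field names are illustrative, not SDK settings keys.
#[derive(Clone, Copy, Debug)]
pub struct CawgSettings {
    /// Master switch: when false, CAWG validation never runs.
    pub cawg_enabled: bool,
    /// When true, sync entry points may block on the async validation.
    pub allow_blocking_in_sync: bool,
}

/// Which kind of entry point the caller used.
#[derive(Clone, Copy, Debug)]
pub enum CallPath {
    Sync,
    Async,
}

/// What the reader should do about CAWG validation.
#[derive(Debug, PartialEq)]
pub enum CawgAction {
    Skip,
    RunAsync,
    BlockOnAsync,
}

/// Decides how (or whether) to run CAWG validation for one read.
pub fn plan_cawg(
    settings: CawgSettings,
    path: CallPath,
    has_identity_assertion: bool,
) -> CawgAction {
    // No overhead when the feature is off or there is nothing to validate.
    if !settings.cawg_enabled || !has_identity_assertion {
        return CawgAction::Skip;
    }
    match path {
        CallPath::Async => CawgAction::RunAsync,
        CallPath::Sync if settings.allow_blocking_in_sync => CawgAction::BlockOnAsync,
        // Sync caller opted out of blocking: CAWG is not validated here.
        CallPath::Sync => CawgAction::Skip,
    }
}
```

One consequence of this design is that the sync path silently skips validation when blocking is disabled, so a real implementation would probably want to record that fact in the validation results rather than return an unqualified success.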

Self::from_store(store, &validation_log)
let /* mut */ result = Self::from_store(store, &validation_log)?;
if false {
// QUESTION: What to do if we're in the _sync version and there
Collaborator

I think we need to support both sync and async identity validation if we are going to include it at this level in the API. If it is part of validation, it can't just be async validation.

}

// Run CAWG post-validation - this is async and requires a runtime.
fn post_validate(result: Result<C2paReader, c2pa::Error>) -> Result<C2paReader, c2pa::Error> {
Collaborator

PostValidate could be useful for validating and formatting any 3rd party assertions we don't have in the SDK.
Ideally there would be some way to integrate this into the definition of an Assertion such that you would just need to add an assertion handler to the SDK, but that's a problem for tomorrow.
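
As a rough picture of the "assertion handler" idea, the sketch below registers handlers by assertion label and dispatches post-validation to them. It is purely hypothetical: `AssertionHandler`, `HandlerRegistry`, `NoteHandler`, and the `com.example.note` label are invented for illustration, and the sketch is synchronous for brevity even though the PR's `post_validate_async` is async.

```rust
use std::collections::HashMap;

/// Hypothetical handler for one third-party assertion type.
pub trait AssertionHandler {
    /// The assertion label this handler understands.
    fn label(&self) -> &str;
    /// Validates raw assertion data and returns a formatted summary.
    fn validate(&self, data: &[u8]) -> Result<String, String>;
}

/// Hypothetical registry that maps assertion labels to handlers.
pub struct HandlerRegistry {
    handlers: HashMap<String, Box<dyn AssertionHandler>>,
}

impl HandlerRegistry {
    pub fn new() -> Self {
        Self { handlers: HashMap::new() }
    }

    /// Adds a handler, replacing any earlier one with the same label.
    pub fn register(&mut self, handler: Box<dyn AssertionHandler>) {
        self.handlers.insert(handler.label().to_string(), handler);
    }

    /// Returns `None` when no handler is registered for `label`.
    pub fn post_validate(&self, label: &str, data: &[u8]) -> Option<Result<String, String>> {
        self.handlers.get(label).map(|h| h.validate(data))
    }
}

/// Example handler: accepts any UTF-8 payload under a made-up label.
pub struct NoteHandler;

impl AssertionHandler for NoteHandler {
    fn label(&self) -> &str {
        "com.example.note"
    }

    fn validate(&self, data: &[u8]) -> Result<String, String> {
        std::str::from_utf8(data)
            .map(|s| format!("note: {s}"))
            .map_err(|e| e.to_string())
    }
}
```

Under this shape, supporting a new third-party assertion would only require registering another handler, which is roughly the integration the comment hopes for.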

/// [CAWG identity]: https://cawg.io/identity
/// [from_stream_with_cawg_async()]: Self::from_stream_with_cawg_async
#[cfg(not(target_arch = "wasm32"))]
pub async fn from_stream_with_cawg_async(
Collaborator Author

TO DO: Delete this and fold back into the from_stream_async call. Add a note (depending on outcome) that says not supported on sync.

Collaborator Author

@scouten-adobe scouten-adobe Sep 17, 2025

TO DO: If CAWG identity assertion is encountered, do the asynchronous blocking call in the sync implementation (borrow from the C bindings). Gavin may add a setting to disable if the blocking behavior is undesirable.

@scouten-adobe
Collaborator Author

TO DO: Review c2patool usage.

@scouten-adobe
Collaborator Author

TO DO: Look into whether verify-on-sign performs CAWG identity assertion verification and whether it becomes async as a result.
