Skip to content

Support asset collection policies #50

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
Aug 4, 2025
Merged

Conversation

joachimvh
Copy link
Contributor

Adds support for policies targeting asset collections, and registering resources to automatically be part of those asset collections based on registered relations. docs/collections.md has a more extensive explanation.

This PR should not be merged immediately but exists to clearly show the new commits compared to the branch it started from.

@joachimvh joachimvh force-pushed the feat/collection-policies branch from c6a72fb to 5e38229 Compare July 24, 2025 13:07
@joachimvh joachimvh changed the base branch from fix/ODRL-evaluator to main July 24, 2025 13:07
@joachimvh
Copy link
Contributor Author

joachimvh commented Jul 24, 2025

The PR now has full support for asset collections (and some related changes). Some issues that were handled:

  • Resources can't be registered if their parents aren't registered, as the parent UMA ID is needed for the relation registration. To not cause too much delay, resources are still registered immediately without their relations, and updated with the relation once the parent registration is finished.
  • Due to the extra work when creating new resources, it was possible that registration was not finished before the first request came in. To resolve this, the function that fetches the ticket now waits when it sees that the relevant resource is still in the process of being registered.
  • When using an external container to store policy data, which is done in the demo setup, this caused an infinite dependency chain where the UMA tried to access the container to see the policies, but the RS then asked the UMA server if the request was allowed. The setup has been changed so this container does not use UMA for authorization.
  • UMA IDs are now used during all relevant calls instead of resource IDs (even though these are currently the same). This uncovered some issues. For example, when trying to create a resource, the target used to be the potential new resource. This is now impossible since there is no UMA ID yet for this resource. To handle this situation, the create scope now targets the first parent container that does exist when trying to create a new resource.

@joachimvh joachimvh requested a review from termontwouter July 24, 2025 13:32
@joachimvh joachimvh marked this pull request as ready for review July 24, 2025 13:33
@joachimvh joachimvh force-pushed the feat/collection-policies branch from 5e38229 to d5a4df1 Compare July 25, 2025 09:47
termontwouter
termontwouter previously approved these changes Jul 28, 2025
Copy link
Collaborator

@termontwouter termontwouter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @joachimvh 🙏 Can be merged i.m.o., if we transfer the remaining problems/discussions to issues.


Minor general remark, same as in #57: the example/test-scaffolding in packages/uma/config/rules should probably not really be part of the config folder.

Comment on lines 33 to 38
ex:assetCollection a odrl:AssetCollection ;
odrl:source <http://localhost:3000/container/> ;
odrl_p:relation ldp:contains .
```

The above policy gives Alice read permission on all resources in `http://localhost:3000/container/`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe specify that the role of <http://localhost:3000/container/> is the source asset identifier here, and not the resource identifier.

The keys are the relations and the values are arrays of scopes.
* `resource_relations`: A key/value map linking this resource to others through relations.
The keys are the relations and the values are the UMA IDs of the relation targets.
If the relation starts with `^` a reverse relation is implied.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I thought we decided on using @reverse for this 🤔 In any case, that's what's in the latest spec says, so it should be implemented at some point (with the carret behaviour still supported if needed).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it in the code to use @reverse but forgot to update the documentation. Woopsie.

Comment on lines 66 to 69
The new field `resource_defaults` tells the server that all containers for
the `http://www.w3.org/ns/ldp#contains` relation
that have this resource as the source,
have `read` as an available scope.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current spec is written with the idea that there is only a single container (per source) for any given relation. I don't think I made that explicit (yet), but I don't see what the value would be of having mulltiple (and I do see potential problems that come with it). WDYT?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it is more: "The new field resource_defaults tells the server that other resources can relate to this resource via the ldp:contains relation; and that read policies defined on the collection those related resources will be inherited by them."

Comment on lines 123 to 129
Resource registration happens asynchronously.
As a consequence,
it is possible, when registering a resource,
that the registration of its parent container was not yet completed.
The UMA ID of this parent is necessary to link to the correct relation though.
The current solution is to register the resource without the relations and retry twice:
once immediately after the registration is successful, and once after 30 seconds.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay for the current solution, but can we not use the RS knowledge of the relation (i.e., resource B is created as child of resource A) to wait more targetted, instead of interval-based?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or work with a FIFO queue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also outdated. I completely forgot to update the documentation after making necessary changes to have everything working. 🙈 It now works with an event-based system as I mention in my initial comment above.

Comment on lines 139 to 144
The current implementation instead generates a ticket targeting the first existing (grand)parent container,
and requests the `create` scope.

Solid does have some edge case situations that are no longer covered this way.
For example, trying to read a non-existent resource would give a different response
depending on if the client was allowed to read it or not (404 vs 401/403).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm think this is quite a good/robust solution.

Don't fully agree with the assessment of uncovered Solid cases... Using WAC/ACP, it is also the case that either [A] the ACR explicitly says create on "non-existent" resource X is (dis)allowed, which is similar as that URI already being present as an asset in UMA1; or [B] the 404/401|3 decision is infered from the ACR's policies on the container of X, which is similar to the current behaviour of the UMA implementation described here.

Footnotes

  1. The terminology "(non-)existent" is misleading; it confounds two different aspects of a resource: whether it is registered in the assets/ACRs (i.e., does the idea/concept of the resource "exist" or not), and whether the resource has an actual content and representation (i.e., has it been initiated or not). Nothing in UMA precludes the possibly that an RS registers resources that are already known, but are not yet initiated with content, similarly to how ACRs can contain rules about such resources.

Comment on lines 153 to 156
One of the sides of the relation would be the source of a collection,
and the other side would then be part of that collection.
Currently, we have no clear way to indicate which is which,
so for now the server will always assume the subject of the triple is the collection source.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, not sure I get the point here 🤔 The direction is indicated by the presense or absense of @reverse (or the carret); is that not sufficient?

  • resource_relations { ldp:contains: [ other ] } indicates that the current resource (being registered) contains the other resource;
  • resource_relations { @reverse { ldp:contains: [ other ] } } indicates the other resource contains the current resource.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So again, mea culpa on not updating this documentation document. 😇 I'm pretty sure I wrote this before the actual spec was there.

@joachimvh
Copy link
Contributor Author

@termontwouter I rewrote the collections.md document so it actually matches how the server works now.

Comment on lines 55 to 57
The `resource_relations` field indicates that the resource `assets:1234c`
has the `http://www.w3.org/ns/ldp#contains` relation with as target the newly registered resource,
and the other entry indicates that the new resource has the relation `my:other:relation` to `assets:5678`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's a mistake here... The spec says:

Keys of the [resource_relations] object represent relations from the resource being registered to the already existing resources.

That means that { "resource_relations": { "http://www.w3.org/ns/ldp#contains": [ "assets:1234" ] } } expresses a containment relation from the newly registered resource to assets:1234, not the other way around (and vice versa for the other relation).

I believe we decided on that to keep things more intuitive/readable 😅

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the spec I indeed think I mixed up the direction of relations for both resource_defaults and resource_relations. Because above I also say that resource_defaults implies that the newly registered resource is the the source (aka, subject) while the spec says

A JSON object representing the possible relations that future resources might have towards the newly registered resource

So now I'll have to go look if I only mixed it up here or also everywhere in the code. 😀

So just to make sure everything is interpreted correctly, for a standard Solid resource /foo/bar/, should the resource registration contain the following fields?

{
  "resource_defaults": {
    {
      "@reverse": { "ldp:contains": [ "..." ] }
    }
  },
  "resource_relations": {
    "@reverse": { "ldp:contains": [ "/foo/" ] }
  }
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, according to the spec it should contain

{
  "resource_defaults": {
    "ldp:contains": [ "..." ]
  },
  "resource_relations": {
    "@reverse": {
      "ldp:contains": [ "/foo/" ]
    }
  }
}

We can look for alternatives if this is too complex, but i.m.o. it is the most intuitive choice when using such a JSON(-LD) message:

  • Resource defaults are relations from the new asset to future parts of the corresponding collection (represented by an anonymous "blueprint" node with the inherited scopes, as it were).

  • Resource relations are relations from the new resource to a collection source.

Expressed like above, the JSON(-LD) property/key/field aligns with those directions, that is, if we start reading from left to right and interpret @reverse as "virtually switching" that direction.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at how I implemented it, this is actually how I did it, so at least when writing code I interpreted it correctly. I see now that I should have read the spec a bit further for my comment above where it says

from potential future resources to the resource being registered

I was confused by the sentence above which states

possible relations that future resources might have towards the newly registered resource

Which I interpreted as the reverse. I would have expected "possible relations that the newly registered resource might have towards future resources" then (although I guess semantically reverse relations also count as relations).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright! So if that's indeed how you already programmed it, I will just update the phrasing in the spec to (hopefully) be less confusing 😉

<urn:1:2:3> a odrl:AssetCollection ;
odrl:source <my:new:resource> ;
odrl_p:relation <http://www.w3.org/ns/ldp#contains> ;
odrl_p:reverse false .
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just as a reminder, this is something we need to reiterate on. A boolean flag indicating reverse is semantically not very pretty. The spec currently says to store the relation as it is registered, but offers no guidance on how to deal with reverse ones. Would it be hard to implement it as a blank node with something like owl:inverseOf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only time this is used is during registration (or an update) to check if a container with certain characteristics already exists, so it can be whatever you want. So instead of odrl_p:reverse you would prefer odrl_p:relation [ owl:inverseOf <http://www.w3.org/ns/ldp#contains> ]?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so. Makes more sense to me

<my:new:resource> odrl:partOf <collection:12345> ;
odrl:partOf <collection:5678:reverse> .
```
This assumes that the collection IDs used above, `collection:12345` and `collection:5678:reverse`, already exist.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any semantics behind the [assetname:reverse] format?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, in practice it will all be UUIDs I think but this was more readable for the example.

the `http://www.w3.org/ns/ldp#contains` relation
that have this resource as the source,
have `read` as an available scope.
The `resource_relations` field indicates that the resource `assets:1234c`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The `resource_relations` field indicates that the resource `assets:1234c`
The `resource_relations` field indicates that the resource `assets:1234`

Typo?

@joachimvh joachimvh force-pushed the feat/collection-policies branch from c7e15e0 to 0beb92b Compare August 4, 2025 08:31
@joachimvh joachimvh force-pushed the feat/collection-policies branch from 0beb92b to dff3823 Compare August 4, 2025 08:58
@joachimvh joachimvh merged commit 462118f into main Aug 4, 2025
3 checks passed
@joachimvh joachimvh deleted the feat/collection-policies branch August 4, 2025 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants