-
Couldn't load subscription status.
- Fork 399
initial implementation of wildcard provenence for tree borrows #4630
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
initial implementation of wildcard provenence for tree borrows #4630
Conversation
basic tests for wildcard provenance wildcard tracking data structure & expose_tag implementation basic wildcard accesses fix compilation errors & first working testcase comments & protector test update wildcard tracking when neccessary & fix ui tests remove either state move exposed_as to node & use own AccessLevel enum deallocation through wildcards wildcard reborrowing Location struct & comments test for correctly activating through wildcards & fix updating idempotent foreign access compelete verify function use check_nondet helper in a few more places
|
Thank you for contributing to Miri! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a first round of comments. I didn't get to the wildcard.rs file yet. Also you did some really deep surgery in this code, much deeper than I expected, so we may have to discuss this in person because this is code I didn't write and therefore don't remember all that much about how it works.
Some general points that I didn't comment on everywhere, but that should be fixed everywhere:
- Please write complete, correctly capitalized and punctuated sentences.
- Please leave empty lines between functions and between types (except when you define a group of types that should be seen as a single unit).
src/bin/miri.rs
Outdated
| rustc_args.push(arg); | ||
| } | ||
| } | ||
| // Tree Borrows implies strict provenance, and is not compatible with native calls. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment is now outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
native functions are only incompatible with strict provenance? So this check can be removed completely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean?
"Tree Borrows implies strict provenance" is not true any more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with the nativelib interface. So I'm unsure if there is any other reason (besides strict provenance) why tree borrows would be incompatible with nativelib.
I assume there isn't, and I will remove the comment along with the check directly after it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, we can now allow TB + native_lib. (I didn't even realize the check can be removed entirely, I just noted that the comment states an incorrect implication.)
src/borrow_tracker/mod.rs
Outdated
| // The body of this loop needs `borrow_tracker` immutably | ||
| // so we can't move this code inside the following `end_call`. | ||
|
|
||
| // TODO support protected wildcard pointers |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear what "support" means.
| /// Which tag the access that caused this error was made through, i.e. | ||
| /// which tag was used to read/write/deallocate. | ||
| pub accessed_info: &'node NodeDebugInfo, | ||
| pub accessed_info: Option<&'node NodeDebugInfo>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs an explanation of what None means.
| machine: &MiriMachine<'tcx>, | ||
| ) -> InterpResult<'tcx> { | ||
| // TODO: for now we bail out on wildcard pointers. Eventually we should | ||
| // handle them as much as we can. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment seems outdated now?
| let orig_tag = match parent_prov { | ||
| ProvenanceExtra::Wildcard => return interp_ok(place.ptr().provenance), // TODO: handle wildcard pointers | ||
| ProvenanceExtra::Concrete(tag) => tag, | ||
| let (orig_tag, provenance) = match parent_prov { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So orig_tag is exactly the same as parent_prov except you converted it from ProvenanceExtra to Option...? What's the point of that?
Also, provenance is way to imprecise as a name. There's so many provenances floating around here, this isn't useful. I'd suggest new_prov.
| pub fn dealloc( | ||
| &mut self, | ||
| tag: BorTag, | ||
| tag: Option<BorTag>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's odd that dealloc changes its signature like this but perform_access does not.
| for (perms_range, Location { perms, .. }) in | ||
| self.rperms.iter_mut(access_range.start, access_range.size) | ||
| { | ||
| let tag = if let Some(tag) = tag { | ||
| tag | ||
| } else { | ||
| // the order in which we check if any nodes are invalidated doesnt matter, | ||
| // so we use the root as a default tag | ||
| self.nodes.get(self.root).unwrap().tag | ||
| }; | ||
| TreeVisitor { | ||
| nodes: &mut self.nodes, | ||
| tag_mapping: &self.tag_mapping, | ||
| perms, | ||
| wildcard_accesses: None, | ||
| } | ||
| .traverse_this_parents_children_other( | ||
| tag, | ||
| // visit all children, skipping none | ||
| |_| ContinueTraversal::Recurse, | ||
| |args: NodeAppArgs<'_>| -> Result<(), TransitionError> { | ||
| let NodeAppArgs { node, perm, .. } = args; | ||
| let perm = perm.get().copied().unwrap_or_else(|| node.default_location_state()); | ||
| if global.borrow().protected_tags.get(&node.tag) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like you entirely changed the old logic here... what is happening?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes are mostly whitespace.
Only the let tag = ... on line 787 and wildcard_accesses: None, on line 798 are actual changes.
| if node.is_exposed { | ||
| return None; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this make sense? This is definitely non-trivial, therefore needs a comment.
| /// methods for wildcard borrows | ||
| impl<'tcx> Tree { | ||
| /// analogous to `perform_access`, but we do not know from which exposed reference the access happens. | ||
| pub fn perform_wildcard_access( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be in wildcard.rs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is in tree.rs mostly because it accesses private functions of LocationState (perform_access, skip_if_known_noop, record_new_access)
| /// We do not know the accessed pointer, but we know that it is a child of this pointer | ||
| WildcardChildAccess, | ||
| /// We do not know the accessed pointer, but we know that it is foreign to this pointer | ||
| WildcardForeignAccess, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does it make sense to have these as new variants in this type, rather than having a separate type for the wildcard traversal?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This way I can reuse the code of perms.rs and LocationState::perform_access directly, which both take a AccessRelatedness as an argument. Also, no code seems to rely on the concrete value of AccessRelatedness. Only if its foreign or child access.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So could this enum be simplified to just Local vs Foreign?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be part of the traversal adjustment preparation PR.
| // Keep original provenance. | ||
| return interp_ok(place.ptr().provenance); | ||
| } | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a ptr_try_get_alloc_id above (which github doesn't let me add a comment to -- thanks github) which is subtle and I think not entirely correct. For a wildcard pointer, this will resolve the pointer to an AllocId if it can. If ptr_size is 0, however, that might not be the only legal AllocId!
If the size is 0 on a wildcard pointer I think we have to bail early, there's not much we can do.
ptr_size=0 support remove unneccessary nativelib check use provenance extra instead of option unify perform_access
| /// with possible lazy initialization. | ||
| /// | ||
| /// NOTE: same guarantees on entry initialisation as for `perms` | ||
| pub wildcard_accesses: UniValMap<WildcardAccessTracking>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a more extensive comment for why it is a separate map.
| /// childrens max_child_access/max_foreign_access | ||
| #[derive(Debug, Clone, Default, PartialEq, Eq)] | ||
| pub struct WildcardAccessTracking { | ||
| /// how many of this nodes direct children have `max_child_access==Write` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| /// how many of this nodes direct children have `max_child_access==Write` | |
| /// how many of this node's direct children have `max_child_access==Write` |
| /// were relative to the pointer the access happened from | ||
| #[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
| pub enum WildcardAccessRelatedness { | ||
| /// the access definitively happened through a child pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"pointer" is confusing here, do you mean "node"?
| #[derive(Clone, Copy, Debug, PartialEq, Eq)] | ||
| pub enum WildcardAccessRelatedness { | ||
| /// the access definitively happened through a child pointer | ||
| ChildAccess, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When you use "child" to mean the reflexive transitive closure, please say "local".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please factor the changes to the visitor that give the callbacks access to the entire tree into a separate commit/PR. We also don't need two layers of visitors, just adjust the existing NodeAppArgs to have the extra state.
| AccessKind::Write => self.write_access_relatedness(exposed_as), | ||
| } | ||
| } | ||
| /// from where relative to this pointer a read access could happen |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"this pointer"/"this node" is confusing here since you don't actually identify a node here.
What you mean is "a node with this wildcard info" or so?
| /// this function calculates the siblings `max_child_access`, both of the other fields need to be passed as arguments | ||
| /// | ||
| /// * `other_factors`: we only ever change one of these values. The max value of the other fields we dont change should be passed through the `other_factors` parameter | ||
| /// * `old_access_type`,`access_type`: we change the parameter not covered by `other_factors` from `old_access_type` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"we change" seems to mean "the caller changed/changes" or so? It sounds like this function is doing the changing.
| /// pushes children onto the stack, if their `max_foreign_access` field needs to be updated | ||
| /// | ||
| /// the `max_foreign_access` fields is set based on the max of the parents `max_foreign_access`, | ||
| /// `exposed_as` and its siblings `max_child_access`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 of these aren't fields though...
| perms: &UniValMap<LocationState>, | ||
| wildcard_accesses: &mut UniValMap<WildcardAccessTracking>, | ||
| ) { | ||
| /// pushes children onto the stack, if their `max_foreign_access` field needs to be updated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a much more extensive comment:
- on some node (say which!), one of these "fields" (properties? they aren't all fields) changed, from old_access_type to new_access type I think?
- the other two stay the same, and are stored in other_factors
- but then there's some more complicated stuff for siblings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if there's three modes to call this function (depending on what changed), every caller needs to document which mode it is using and why the special requirements for that mode are upheld.
But somehow the function doesn't even know what the mode is so... is there some more uniform way to do this? Can the function be called with just the one info on how the foreign access mode that can come in via the parent edge changed?
| access_type: WildcardAccessLevel, | ||
| old_access_type: WildcardAccessLevel, | ||
| access: WildcardAccessTracking, | ||
| mut children: impl Iterator<Item = UniIndex>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess all these have a common parent? Is it all children of that parent?
Is that parent the node that the main comment refers to, whose properties changed?
| /* other factors */ src_access.max_foreign_access, | ||
| access_type, | ||
| old_access_type, | ||
| src_access.clone(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just pass a reference instead of cloning.
EDIT: Or maybe not, turns out that is not always the parent node info...
| other_factors: WildcardAccessLevel, | ||
| access_type: WildcardAccessLevel, | ||
| old_access_type: WildcardAccessLevel, | ||
| access: WildcardAccessTracking, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment should say what this is. Also maybe call it something involving parent in its name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, just replace the entire thing by the read and write counts.
| // Read -> Write | ||
| // Write -> Read | ||
| // Write -> None | ||
| access.child_writes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it okay to ignore child_reads here, given that the read permissions can also change?
| other_factors: WildcardAccessLevel, | ||
| access_type: WildcardAccessLevel, | ||
| old_access_type: WildcardAccessLevel, | ||
| access: WildcardAccessTracking, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, just replace the entire thing by the read and write counts.
| old_access_type: WildcardAccessLevel, | ||
| access_type: WildcardAccessLevel, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call these old_exposed_as and new_exposed_as?
| // dont change (for parents child_permissions and for the other children foreign permissions) | ||
| { | ||
| // we need to keep track of how the previous permissions changed | ||
| let mut prev_old_access = old_access_type; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
child_old_access? Also what exactly is the invariant on this field? It's instantiated based on an exposed_as but gets updated with max_child?
| { | ||
| // we need to keep track of how the previous permissions changed | ||
| let mut prev_old_access = old_access_type; | ||
| let mut prev = id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
child (the one we come from in our upwards traversal)?
| /// We do not know the accessed pointer, but we know that it is a child of this pointer | ||
| WildcardChildAccess, | ||
| /// We do not know the accessed pointer, but we know that it is foreign to this pointer | ||
| WildcardForeignAccess, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could also be part of the traversal adjustment preparation PR.
| // is defined by this child. So we only need to update this one child | ||
| stack.push(( | ||
| children | ||
| .find(|id| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should give a unique node, right?
I'd say make this filter, call next once to get the node, then call it again to test that it is unique.
| ) { | ||
| // find root node | ||
| let mut root = id; | ||
| while let Some(parent) = nodes.get(root).and_then(|n| n.parent) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| while let Some(parent) = nodes.get(root).and_then(|n| n.parent) { | |
| while let Some(parent) = nodes.get(root).unwrap().parent { |
| .map(|child| { | ||
| let node = nodes.get(child).unwrap(); | ||
| let perm = perms.get(child).map(LocationState::permission); | ||
| let access = wildcard_accesses.get(child).unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This uses the data we are checking (wildcard_accesses) to check said data... doesn't that risk that we miss some inconsistencies?
For instance, what if we have a tree where no node is exposed, but there's a node somewhere that says "one of my children is exposed" and there's exactly one child that says "my parent is exposed"? Would we catch that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should probably rename max_other_children to max_access_siblings to make this code clearer.
|
A
| \
B C
So in the example A has child_reads=1 and B has max_foreign_access=Read (everything else 0)?
This will fail, because B and C both have max_child_access()==None, therefore A should be child_reads=0.
It would also fail because for B the parents max_foreign_access, exposed_as and its siblings (C) max_child_access() are all None, meaning B's max_foreign_access should also be None
The invariants verify_consistency checks should be described in the WildcardAccessTracking struct definition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will fail, because B and C both have max_child_access()==None, therefore A should be child_reads=0.
It would also fail because for B the parents max_foreign_access, exposed_as and its siblings (C) max_child_access() are all None, meaning B's max_foreign_access should also be None
Sounds good then, thanks!
|
@rustbot author |
|
Reminder, once the PR becomes ready for a review, use |
| // What kind of access caused this error (read, write, reborrow, deallocation) | ||
| pub access_cause: AccessCause, | ||
| /// Which tag the access that caused this error was made through, i.e. | ||
| /// Which tag if any the access that caused this error was made through, i.e. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be something like (if any) or , if any,.
|
#4654 contains the updates to the TreeVisitor and AccessRelatedness |
This makes strict consistency requirement between wildcard tracking datastructure and the rest of the tree looser, giving us more flexibility in how we update it.
|
@rustbot ready |
| /// Represensts the maximum access level that is possible. | ||
| /// | ||
| /// Note that we derive Ord and PartialOrd, so the order in which variants are listed below matters: | ||
| /// None < Read < Write. Do not change that order. | ||
| #[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash, Debug, Default)] | ||
| pub enum WildcardAccessLevel { | ||
| #[default] | ||
| None, | ||
| Read, | ||
| Write, | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There exists an identical enum in foreign_access_skipping.rs called IdempotentForeignAccess.
I'm unsure if they should be combined into one single AccessLevel type.
Or maybe just rename WildcardAccessLevel to MaxAccessLevel as it isn't really specific to wildcard accesses.
initial support for wildcard writes in tree borrows.