Part of #2129
Foundational method that takes a vector that meets vctrs's newly written up native storage requirements, and strips away all extraneous attributes not natively handled by vctrs methods.
Not being used in place of `vec_data()` quite yet, but that is the goal. We will then soft-deprecate `vec_data()` and start to move away from it in favor of this here in vctrs and in dplyr/tidyr.

It will also be used in `vec_proxy()` on the output of a user's proxy method. This ensures that:

It also seems likely that there is room for both `vctrs::vec_unstructure()` and `rlang::unstructure()`.

`vctrs::vec_unstructure()` rules:

- `names`
- `names`
- `dim` and `dimnames[[1]]` (note, only row names)
- `names`, `row.names`, and a `class` of `"data.frame"`

`rlang::unstructure()` rules:

- `names`
- `names`
- `names`
- `dim` and `dimnames` (note, all `dimnames`)

Notable differences between the two:

- Only the row names of `dimnames` are kept in `vec_unstructure()`, but all of `dimnames` are kept in `unstructure()`, because base R operations propagate all of `dimnames`
- Data frames keep their `"data.frame"` class in `vec_unstructure()` but are treated like lists in `unstructure()`
- `NULL` is allowed in `unstructure()` but not `vec_unstructure()`
- `environment` and all other types are allowed in `unstructure()` but not `vec_unstructure()`. The rationale for allowing them in `unstructure()` is that in `structure()` you can pass in an environment and add attributes to it, so there should be a way to remove them as well. But no attributes on an environment are ever "critical", so you just clear them.

For practical usage of `rlang::unstructure()`:

- `+`, where you'd want to strip off the rray class but retain all of `dimnames` before delegating to base R's own `+` method, where `dimnames` are propagated
- `dplyr:::dplyr_new_list()` and `tidyr:::tidyr_new_list()`, where I often pass in a data frame and expect this to unstructure that into a named list with no extra attributes

It is quite fast, so we might be able to get away without `vec_proxy_unsafe()`, not sure yet. Notably, this uses R's ALTREP wrapper types to avoid a copy of large objects (since only attributes are being manipulated).
But proxy methods were already quite fast, so maybe not.
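To make the `rlang::unstructure()` behaviour described above concrete, here is a rough base R sketch. `sketch_unstructure()` is a hypothetical helper written only for illustration; it is not the rlang or vctrs implementation, it does not use ALTREP wrappers, and it assumes the retained attributes are exactly `names`, `dim`, and `dimnames`.

```r
# Hypothetical illustration only; NOT the rlang::unstructure() implementation.
# Assumption: the attributes worth keeping are `names`, `dim`, and `dimnames`.
sketch_unstructure <- function(x) {
  attrs <- attributes(x)
  # Drop everything except the retained attributes (class, row.names, etc. go away)
  keep <- attrs[names(attrs) %in% c("names", "dim", "dimnames")]
  attributes(x) <- keep
  x
}

# An array-like object with a made-up class and an extra attribute
m <- structure(
  1:4,
  dim = c(2L, 2L),
  dimnames = list(c("r1", "r2"), c("c1", "c2")),
  class = "my_rray",
  extra = "metadata"
)
plain <- sketch_unstructure(m)

# Base R's `+` propagates all of `dimnames` on the plain matrix
res <- plain + 1L

# A data frame is treated like a list: only `names` survives
df <- data.frame(a = 1:2, b = c("x", "y"), stringsAsFactors = FALSE)
lst <- sketch_unstructure(df)
```

In this sketch `plain` is an ordinary matrix that still carries both row and column names, and `lst` is a bare named list, which mirrors the `+` and `dplyr_new_list()` use cases listed above.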
I imagine that in something like `vec_c()` we would use `vec_proxy()` on the `out` object we create (because we want to `vec_restore()` at the end), but we'd use `vec_proxy_unsafe()` on all of the elements before copying them over (because we don't care about their extraneous attributes, we just want the C compatible form that we can copy from).
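The flow imagined above could look roughly like the following sketch, written against existing exported vctrs functions. `sketch_c()` is a hypothetical helper, not vctrs's `vec_c()`. Since `vec_proxy_unsafe()` is only a proposal here, plain `vec_proxy()` stands in for it on the elements. The sketch assumes at least one input and ignores names.

```r
library(vctrs)

# Hypothetical illustration of the proxy/restore flow; NOT vctrs's vec_c().
# Assumes at least one input and ignores names.
sketch_c <- function(...) {
  xs <- unname(list(...))
  ptype <- do.call(vec_ptype_common, xs)
  sizes <- vapply(xs, vec_size, integer(1))

  # Proxy the output container, because it gets restored at the end
  out <- vec_proxy(vec_init(ptype, sum(sizes)))

  start <- 1L
  for (i in seq_along(xs)) {
    if (sizes[[i]] == 0L) next
    idx <- seq.int(start, length.out = sizes[[i]])
    # Stand-in for the proposed vec_proxy_unsafe(): only the storage is needed
    elt <- vec_proxy(vec_cast(xs[[i]], ptype))
    vec_slice(out, idx) <- elt
    start <- start + sizes[[i]]
  }

  # Restore the original type on the combined proxy
  vec_restore(out, ptype)
}

combined <- sketch_c(1L, 2.5, 3L)
dates <- sketch_c(as.Date("2020-01-01"), as.Date("2020-01-02"))
```

The design point is that only `out` needs a proxy that round-trips through `vec_restore()`; the elements are read from and then discarded, so a cheaper proxy would be enough for them.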