-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
PEP 694: Abstract file upload mechanisms #4431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
I spent some time to sketch this out as a Finite State Machine in pypi/warehouse#18174, and will be using what I learned to refine this a bit! |
I'm planning on taking a closer look later today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple comments, but overall the changes look good to me.
I just wanted to note a few things I came across when reading the entire PEP (with these changes incorporated-- and probably a number of these came from my original PEP so that's on me :D ).
- The PEP specifies that the endpoint for PyPI will be
https://upload.pypi.org/2.0
. I would probably remove that from the PEP and let PyPI decide what it's endpoint will be (in particular the/2.0
part is somewhat confusing given the PEP also uses conneg). - The PEP calls out the inability to parallelize or resume an upload as a problem to be solved, and then later states the PEP solves all the identified problems. The TUS based approach didn't really solve parallelization to begin with, and the removal of TUS means the PEP also doesn't solve resuming an uploading. I think that's fine, but we should update the wording.
- The content type handling is kind of wonky I think. Much like PEP 691 the client can use the
Accept
header to request a particular content type from the server, and the server includes the full version number in themeta.api-version
key in the response. However, the requests appear to only be using themeta.api-version
key. I think ideally we want to have requests using a correctContent-Type
for their request (and thelatest
wouldn't be supported here), and have the server use that for handling the request data. We probably want to also explicitly require that themeta.api-version
matches theContent-Type
for major version.
While this PEP would no longer directly implement resumable or parallel uploads, it does solve the problem of how to address them. Individual file-upload sessions may occur in parallel if a server chooses to implement a mechanism that can support it, and similarly resumable uploads can be implemented as a mechanism. I'll clarify it in the "The new upload API...", 05d7fc2 |
Realization while specifying |
See: 924a27d |
See: f8469cf |
Co-authored-by: Donald Stufft <[email protected]>
Again, this is very understandable, but it is likely to mean that this PEP would bake in HTTP Basic Auth with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ewdurbin - apologies for the long delay, but I've finally completed my review of the updated PR text.
TL;DR: LGTM, thank you!
I do have a number of small suggestions for wording and such, and a question or two, but overall, this looks great and I approve. Next, I'll go through and resolve any outstanding conversations, and I'll update the branch via the GH web UI. Then I think it's just a matter of you going through the suggestions and question, and then we can merge. We can talk off-line about posting to DPO after the PR lands.
Interesting use case 😄. Forward compatibility is different than allowing index-specific variations. I'd be more concerned about the latter introducing variations that aren't standardized, making evolution of the standard more difficult. Forward compatible evolution of the standard without a major version bump should be fine, although a minor version bump probably still makes sense, so clients know what they're interacting with. Regarding the admonition against adding new upload mechanisms without a major version bump, with the above caveat in place, it makes sense to only require a major version bump if a standard mechanism is removed, although frankly I don't expect that to happen in practice.
Like a webhook? That could make sense, e.g. add a URL to the (upload & file-upload) session request payload when they are created. I think we'd have to evaluate a) security implications (e.g. we wouldn't want a package uploader to trigger push notifications to some unrelated service); b) complexity for implementations such as PyPI which don't AFAIK currently have infra to do that. Maybe for 2.0 we can just leave it at polling and think about push notifications for a future rev? |
I thought the same thing when I read this section, but didn't comment, so I'm glad you did! If this change is made, I'd recommend ISO 8601 UTC date time for consistency. |
Co-authored-by: Ee Durbin <[email protected]>
Um, I committed the suggestion about not allowing server specific mechanisms to bump the major or minor versions. I hope I didn't mess up the rest of the text (GH always surprises me here). Looks like it did resolve the thread though. |
I've taken some time to consider this a little bit more before finalizing my proposed changes... @konstin would specifying it as requiring the endpoint to implement RFC 7235 |
…he file upload session status
… an open question
OK, I have completed all of the changes I am confident in and they are ready for review, @warsaw, @dstufft . I have also prepared two PRs, that I am less confident in merging here:
I would appreciate input from @konstin on if these two change sets address the concerns previously raised, and input from @warsaw and @dstufft on if these changes would be acceptable to the pep. |
I really like both of these @ewdurbin. I've added some questions and suggestions. |
say _something_ about authentication for upload 2.0
On a little further consideration, I am on the fence about generic capabilities. I worry about the ability for them to really fracture upload 2.0 implementations across the ecosystem, particularly if they are truly generic rather than allowing for certain specific extensions and implementation. The only real reason I would like to see them as part of this pep so that the pep does not have to own specifications on webhook style notifications. |
I'm kind of the fence about the callback stuff? Like I see a use case for it, but I also feel like it's probably a lot of added complexity and it feels like the majority of use cases aren't going to be able to take advantage of it. I suspect the vast bulk of publishing is done with something like Personally I'd leave it out (at least for now) as a lot of extra complexity for little gain, and we can always add it at a later time if it turns out there's really a big desire for it. However, the proposed mechanism looks fine to me, so if folks really want it now I don't have any real problem with it as it stands. |
Other than the callback comment above, everything in here looks good to me. |
I had a quick call with Barry. My plan is to drop anything about callbacks and extensions via capabilities to an open question to be addressed in a future backwards compatible spec. |
Add documentation about webhook notifications and capabilities extensions that were discussed but deferred to avoid complexity and ecosystem fragmentation during initial Upload 2.0 rollout.
Thanks for addressing both of those! I agree that the complexity for callbacks is too high here. |
This attempts to defer the implementation details of getting the bits of a given artifact from the client to the server.
The primary motivation here is to decouple this PEP from resumable/multi-part uploads, provide flexibility to implementers of PEP 694, and allow for new upload mechanisms without a PEP cycle.
📚 Documentation preview 📚: https://pep-previews--4431.org.readthedocs.build/