-
Notifications
You must be signed in to change notification settings - Fork 35
Add ability to control path inside PEX file #42
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
1e0b0e7 to
3650d94
Compare
3650d94 to
f3a6b64
Compare
|
This is great! I actually cooked up a similar hack that I called "pythonpath" for our own Python monorepo. I'd be happy to post my change for comparison, but this looks very similar! I'd love to help get something merged upstream! |
| dpath = dpath[3:] | ||
|
|
||
| for prefix in strip_pex_path_prefixes: | ||
| if dpath.startswith(prefix): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you have the same bug here that I had: a path can match two prefixes.
Consider strip_prefixes set to: ['aaa', 'bbb']
If the files are the following:
aaa/bbb/foo(expect to strip aaa and producebbb/foo)bbb/aaa/bar(expect to strip bbb and produceaaa/bar)
However, depending on the order the prefixes are applied, you will probably get one of foo or bar as an output, since it strips both.
To fix it, in my version I strip the LONGEST prefix match, which is a bit annoying but makes in unambiguous, and works when there are "overlapping" strip prefixes (e.g. strip root1 and root1/foo/bar/subroot)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am unsure if it's less unambiguous then stripping by the first matched prefix which is what this patch does.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah crap I missed the break on the next line. Silly me, sorry.
Which one will be first though? It will depend on the order deps are specified in BUILD files. I guess that could actually be useful in some cases.
At any rate: Not matching more than one prefix is really the important part.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not deps but first in the strip_pex_path_prefixes list, the stripping afterall apply to all files in runfiles which can be either deps, srcs or even data attribute, if I recall well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! I see: This adds the attribute to pex_binary. I see! I added it to pex_library since we wanted to have some pex_library make the python package whatever, no matter where it lives in the Bazel workspace.
Anyway: Great minds think alike! I agree that something like this should end up as part of these rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the benefit of adding this to pex_library versus pex_binary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be useful to have it apply to data from pex_library rules; otherwise you would have to manually propagate that information along to the final pex_binary in some situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... Now that you mention it: I'm pretty sure my version of this does not actually consider the data attribute, and only applies it to the srcs attribute. I think the internal code where I'm using this doesn't use any data files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I've been playing with this and I definitely think that applying it to the pex_library (both srcs and data) makes the most sense as it removes the need for duplicating code. In my case I have two binaries that make use of the same library and I'd have to strip the same prefix in both binaries. In Buck this is called base_module maybe we can adopt the same name (see: https://buckbuild.com/rule/python_library.html#base_module).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A counter-argument will be that you wouldn’t include test data in the
pex_library? B/c if I understood correctly having this attribute in
pex_library only requires that you will need to define anything that
eligible to striping in the pex_library rule, correct?
On the other hand, having it in pex_binary and pex_test have direct effect
on the resulting Pex since this later are the rules that create them, and
yes, I agree that interface wise it's not DRY (and personally I don't like it too :)) however IMHO it gives users more control.
First off, thanks for this project it's have been helpful a lot.
Background
I am working on a project to migrate our monorepo from Make to Bazel where the repo layout is:
However with the default behaviour of this extension, PEX files generated end up packed with a path that requires us to change all import statements in our code which is not ideal, since it's a lot of changes and we want to allow smooth migration to Bazel with as few impact on the existent code as possible, hence this changeset which allows having a slight control on the path inside the PEX files, it's meant to be used as follow:
Example
Please advise if there is a better way of doing this.