-
-
Notifications
You must be signed in to change notification settings - Fork 29
Add support for new illustrations APIs in libzim 9.4.0 #233
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: libzim_9.4.0
Are you sure you want to change the base?
Conversation
53b6486 to
4da244a
Compare
4da244a to
070b003
Compare
35ad235 to
bfd945f
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## libzim_9.4.0 #233 +/- ##
================================================
+ Coverage 93.77% 94.23% +0.46%
================================================
Files 1 1
Lines 546 607 +61
================================================
+ Hits 512 572 +60
- Misses 34 35 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
bfd945f to
de46d68
Compare
|
Ready for review, wheels build still failing on ubuntu arm64 for "known reasons" but rest is in place so ready to be analyzed |
rgaudin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @benoit74,
When we designed this on libzim, it was very clear that:
- attributes are an optional, for-the-future feature that we have no real use case for. It's still not in the spec.
- independant width and height are important now
- scale is important now
We thus requested to be able to directly add illustration directly using width, height, scale. That's not the case with this.
If there are libzim primitives to do it, bind them, if it's the same, add the necessary python glue.
On the reader side, it's less important but I feel like a lot is left on the hands of the libzim-user.
How do I know if 48x48 has various scales? I have to loop through all the getIllustrationInfos() results…
libzim/libzim.pyx
Outdated
| """Get the illustration Metadata item of the archive. | ||
|
|
||
| Args: | ||
| size: Optional size of the illustration (for backward compatibility). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If optional, the default should be mentioned
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is optional in the sense that you provide either directly the size or the full info. It is anyway not consistent with add_illustration, and I prefer when there is one single arg to match libzim, so let's move to this.
tests/test_libzim_creator.py
Outdated
| assert zim.has_illustration(48) is True | ||
| assert zim.has_illustration(96) is True | ||
| assert bytes(zim.get_illustration_item(size=48).content) == favicon_data | ||
| assert bytes(zim.get_illustration_item(size=96).content) == favicon_data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be wise to have different fixtures. Would save some weird bug potential
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Created fixture (with new Pillow dependency to generate them on-the-fly rather than painfully defining all of them in the code).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: removed Pillow dependency, it is a pain to install on all our supported plaftorms
libzim/zim.pxd
Outdated
| void finishZimCreation() except + nogil | ||
| void setMainPath(string mainPath) | ||
| void addIllustration(unsigned int size, string content) except + nogil | ||
| void addIllustration(const IllustrationInfo& ii, string content) nogil except + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't nogil supposed to be last?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch, forgot about it in some rebase probably ^^
|
Thank you! Regarding addIllustrations, there is "only" four variants available in the libzim: https://github.com/openzim/libzim/blob/8ef511ba0de6a8180d79fbc3c7edcfec4202ae5e/include/zim/writer/creator.h#L165-L197 ; I did not exposed the "contentProvider" variants because the one previously present was not exposed either. I could add the two missing, but it will change nothing to your point. I generally second your remarks and frustrations about how "complex" stuff still is with current PR, but I considered we've agreed python-libzim should be a very thin wrapper on top of the libzim, and everything more "complex" should go into the python-scraperlib. I feel like your python-glue is something which should be implemented in the python-scraperlib. For instance even with only this PR, it is already possible to do things like below which are not "that" awful. IllustrationInfo is just a structure to hold all details about the illustration: Details point are all valid. |
I don't think so but this is debatable of course. scraperlib is not intended for many users. It is, as its name implies, scraper-oriented and quite large and have many dependencies. We can do as we please there because we are the main users and requiring explicit extra types for this is OK there. We're already enforcing a lot of stuff, because we're the ones using it. pylibzim on the other end is to be used by hopefully many users. It has zero runtime dependency, is lightweight and lean. It just seems silly to push this uncomfortable change to our users but let's keep this PR like this and open a separate ticket to discuss whether we want to make an exception or not. |
839a33e to
7db4684
Compare
|
Simple changes pushed, but I'm not comfortable with merging this if you are not comfortable with this PR "design" ; changing this afterwards might means breaking the API so needs a major, uncomfortable changes for "users", etc ... I see no urgency in pushing these changes to the python-libzim, since we have no use-case so far, and would prefer to have a decision comfortable for everyone. And I somehow share your concerns, especially on the reader side. I'm less embarrassed by the fact that we need the new Since python-libzim is anyway not used by our readers (so far, unfortunately?), maybe a simple middle ground would be:
WDYT? |
|
I'm fine with this simple glue strategy. It's dumb enough to not be a problem, as you wrote. Regarding the more complex reader usage, there's a ticket for that on libzim, it will definitely be implemented there but first step was to allow addition of multiple scales and width. |
|
Hum, I'm afraid that in fact this simple glue strategy is not that simple, because it will break the API or introduce dirty glue code for Currently second argument to add_illustration is the content. If content can be either second or third or fourth depending on other arguments, this is either brittle (we count arguments to know where content is) or breaking (we force content to be a named argument). Do you have suggestion about how to handle this? Should we introduce different method names? I'm puzzled Same kind of issue with the scale in |
|
Well libzim 9.4.0 did break the API (version doesn't tell because they don't respect semver) so I see no reason we wouldn't allow ourselves to. Also, it's a binding not a pure copy which it can't be, especially since python doesn't support overloading natively. To me it would only make sense to have the content the first param and the rest afterwards. It could even be a single func with What do you think? |
Fix #232
To be merged after #234 (which is simpler / narrower scope)
Changes:
illustrationmodule with theIllustrationInfoclass, used byreaderandwritermodulesreader.Archive.get_illustration_item: get "default" and get specific sizes/scalesreader.Archive.get_illustrations_infoswriter.Creator.add_illustration: pass detailed illustration infos instead of one fixed size