Add new unidecode_translate method #79
Open
This method behaves similarly to `unidecode_expect_nonascii`, but it uses a preloaded translation dict, built from the `xNNN.py` files in the `unidecode` folder. This dictionary is then fed to `str.translate`.

It throws the same errors as `unidecode`, but only checks surrogates if the `check_surrogates` param is True.

Since it requires loading the dictionary on every initialization (I could not generate a cache for this case), it is slower than `unidecode_expect_nonascii` for use in the utility, but when used in applications which convert many strings, it is faster.

Here are the results of `benchmark.py` when run with each configuration (I just replaced the internal calls to each of those methods):

`unidecode`:

`unidecode_translate` with `check_surrogates=True`:

`unidecode_translate` with `check_surrogates=False`:

It is also faster for big strings, which can be seen in the following benchmark:

Note that the tests located in the `tests` folder also work for the `unidecode_translate` method given that `check_surrogates=True`. Note that the cases where it compares the exception context to `None` fail (even with the usage of `raise ... from None`), but this can be easily solved by storing the exception object in a variable and raising it outside the try-except block.
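
To illustrate the exception-context point: `raise ... from None` sets `__suppress_context__` but still leaves `__context__` pointing at the exception that was being handled, so a test comparing `__context__` to `None` fails. The sketch below contrasts that with the deferred raise. `LookupFailed`, `TABLE`, and both function names are hypothetical and stand in for the real code in this PR.

```python
class LookupFailed(Exception):
    """Hypothetical error type, standing in for the library's own exception."""


# Hypothetical one-entry table; the real one is built from the xNNN.py files.
TABLE = {"\u00e9": "e"}


def lookup_from_none(ch):
    """Raise inside the except block, using `from None`."""
    try:
        return TABLE[ch]
    except KeyError:
        # __context__ is still set to the KeyError; only its display is suppressed.
        raise LookupFailed(ch) from None


def lookup_deferred(ch):
    """Store the exception and raise it after the try-except block has ended."""
    error = None
    try:
        return TABLE[ch]
    except KeyError:
        error = LookupFailed(ch)
    # No exception is being handled at this point, so __context__ stays None.
    raise error


def context_of(func, ch):
    """Return the __context__ of the LookupFailed raised by func(ch)."""
    try:
        func(ch)
    except LookupFailed as exc:
        return exc.__context__
    raise AssertionError("expected LookupFailed")
```

With these definitions, `context_of(lookup_from_none, "x")` returns the original `KeyError`, while `context_of(lookup_deferred, "x")` returns `None`.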
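
For readers unfamiliar with the translation-dict approach described at the top, here is a minimal sketch of the idea. It assumes each `xNNN.py` section is a tuple of up to 256 replacement strings indexed by the low byte of the code point; `make_section`, `build_table`, `SECTIONS`, and `translate_sketch` are hypothetical names, and the three-entry data is illustrative only, not the actual code or tables in this PR.

```python
def make_section(entries):
    """Build a 256-slot tuple from {position: replacement} (hypothetical helper)."""
    data = [None] * 256
    for position, replacement in entries.items():
        data[position] = replacement
    return tuple(data)


def build_table(sections):
    """Flatten {section_number: 256-slot tuple} into a dict for str.translate."""
    table = {}
    for section, replacements in sections.items():
        for position, replacement in enumerate(replacements):
            if replacement is not None:
                # Code point = section number in the high bits, position in the low byte.
                table[(section << 8) | position] = replacement
    return table


# Illustrative data only: U+00E9 and U+00FC in section 0x00, U+0416 in section 0x04.
SECTIONS = {
    0x00: make_section({0xE9: "e", 0xFC: "u"}),
    0x04: make_section({0x16: "Zh"}),
}

# Built once up front; this is the cost paid on every initialization.
TABLE = build_table(SECTIONS)


def translate_sketch(text):
    """Transliterate via str.translate; characters missing from TABLE pass through."""
    return text.translate(TABLE)
```

Because the table is built once and `str.translate` does the per-character work in C, the up-front cost is amortized when many strings, or very long ones, are converted.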