Closed
Labels
triage: Issue needs triage
Description
Describe the bug
I was running:
ocrmypdf --force-ocr gehrke98algebraic.pdf gehrke98algebraic_clean.pdf
and got the following error with both the Ubuntu 24.04 package and the GitHub source:
An exception occurred while executing the pipeline _sync.py:473
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 409, in run_pipeline
optimize_messages = exec_concurrent(context, executor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 315, in exec_concurrent
pdf, messages = post_process(pdf, context, executor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 247, in post_process
return optimize_pdf(pdf_out, context, executor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_pipeline.py", line 1009, in optimize_pdf
output_pdf, messages = context.plugin_manager.hook.optimize_pdf(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/pluggy/_hooks.py", line 501, in __call__
return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/pluggy/_manager.py", line 119, in _hookexec
return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/pluggy/_callers.py", line 138, in _multicall
raise exception.with_traceback(exception.__traceback__)
File "/usr/lib/python3/dist-packages/pluggy/_callers.py", line 102, in _multicall
res = hook_impl.function(*args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/builtin_plugins/optimize.py", line 145, in optimize_pdf
result_path = optimize(input_pdf, output_pdf, context, save_settings, executor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/optimize.py", line 695, in optimize
convert_to_jbig2(pdf, jbig2_groups, root, options, executor)
File "/usr/lib/python3/dist-packages/ocrmypdf/optimize.py", line 429, in convert_to_jbig2
_produce_jbig2_images(jbig2_groups, root, options, executor)
File "/usr/lib/python3/dist-packages/ocrmypdf/optimize.py", line 394, in _produce_jbig2_images
executor(
File "/usr/lib/python3/dist-packages/ocrmypdf/_concurrent.py", line 86, in __call__
self._execute(
File "/usr/lib/python3/dist-packages/ocrmypdf/builtin_plugins/concurrency.py", line 138, in _execute
result = future.result()
^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/lib/python3.12/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_exec/jbig2enc.py", line 61, in convert_single_mp
return convert_single(
^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_exec/jbig2enc.py", line 56, in convert_single
proc.check_returncode()
File "/usr/lib/python3.12/subprocess.py", line 502, in check_returncode
raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.ad8hklkw/images/00000083.tif')]' returned non-zero exit status 1.
It works when I add -O0, but that produces a much larger file.
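As a stopgap, the -O0 fallback can be scripted so that optimization is only given up when the default run actually fails. This is a minimal sketch, assuming `ocrmypdf` is on `PATH` and exits non-zero on this crash; `build_ocrmypdf_cmd` and `run_with_fallback` are hypothetical helper names, not part of OCRmyPDF.

```python
import subprocess
from pathlib import Path


def build_ocrmypdf_cmd(src, dst, optimize=None):
    """Build the argv list for the command used in this report.

    optimize=None keeps OCRmyPDF's default optimization level;
    optimize=0 adds -O0, the workaround mentioned above.
    """
    cmd = ["ocrmypdf", "--force-ocr"]
    if optimize is not None:
        cmd.append(f"-O{optimize}")
    cmd += [str(Path(src)), str(Path(dst))]
    return cmd


def run_with_fallback(src, dst, runner=subprocess.run):
    """Try the default run first; retry with -O0 only if it exits non-zero."""
    proc = runner(build_ocrmypdf_cmd(src, dst))
    if proc.returncode == 0:
        return proc
    # Assumption: the failure is the optimization crash reported here.
    return runner(build_ocrmypdf_cmd(src, dst, optimize=0))
```

Calling `run_with_fallback("gehrke98algebraic.pdf", "gehrke98algebraic_clean.pdf")` would run OCR twice on a failing file, so it only hides the problem at the cost of time and output size.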
Steps to reproduce
1. Run ocrmypdf -v1 --force-ocr gehrke98algebraic.pdf gehrke98algebraic_clean.pdf
2. Observe crash
Files
How did you download and install the software?
PyPI (pip, poetry, pipx, etc.)
OCRmyPDF version
15.2.0+dfsg1
Relevant log output
xref 79: treating as an optimization candidate optimize.py:274
xref 81: treating as an optimization candidate optimize.py:274
xref 83: treating as an optimization candidate optimize.py:274
xref 85: treating as an optimization candidate optimize.py:274
xref 87: treating as an optimization candidate optimize.py:274
xref 89: treating as an optimization candidate optimize.py:274
xref 91: treating as an optimization candidate optimize.py:274
xref 93: treating as an optimization candidate optimize.py:274
xref 95: treating as an optimization candidate optimize.py:274
xref 97: treating as an optimization candidate optimize.py:274
xref 99: treating as an optimization candidate optimize.py:274
xref 101: treating as an optimization candidate optimize.py:274
xref 103: treating as an optimization candidate optimize.py:274
xref 105: treating as an optimization candidate optimize.py:274
xref 107: treating as an optimization candidate optimize.py:274
xref 109: treating as an optimization candidate optimize.py:274
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Running: ['jbig2', '--version'] __init__.py:134
Optimizable images: JBIG2 groups: 14 optimize.py:355
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000079.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000081.tif')] __init__.py:134
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000083.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000091.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000105.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000099.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000097.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000089.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000087.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000103.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000093.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000107.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000109.tif')] __init__.py:134
Running: ['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000095.tif')] __init__.py:134
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
stderr = Error in findTiffCompression: function not present __init__.py:76
Error in tiffGetCount: function not present
JBIG2 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0/14 -:--:--
An exception occurred while executing the pipeline _sync.py:473
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 409, in run_pipeline
optimize_messages = exec_concurrent(context, executor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 315, in exec_concurrent
pdf, messages = post_process(pdf, context, executor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_sync.py", line 247, in post_process
return optimize_pdf(pdf_out, context, executor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_pipeline.py", line 1009, in optimize_pdf
output_pdf, messages = context.plugin_manager.hook.optimize_pdf(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/pluggy/_hooks.py", line 501, in __call__
return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/pluggy/_manager.py", line 119, in _hookexec
return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/pluggy/_callers.py", line 138, in _multicall
raise exception.with_traceback(exception.__traceback__)
File "/usr/lib/python3/dist-packages/pluggy/_callers.py", line 102, in _multicall
res = hook_impl.function(*args)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/builtin_plugins/optimize.py", line 145, in optimize_pdf
result_path = optimize(input_pdf, output_pdf, context, save_settings, executor)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/optimize.py", line 695, in optimize
convert_to_jbig2(pdf, jbig2_groups, root, options, executor)
File "/usr/lib/python3/dist-packages/ocrmypdf/optimize.py", line 429, in convert_to_jbig2
_produce_jbig2_images(jbig2_groups, root, options, executor)
File "/usr/lib/python3/dist-packages/ocrmypdf/optimize.py", line 394, in _produce_jbig2_images
executor(
File "/usr/lib/python3/dist-packages/ocrmypdf/_concurrent.py", line 86, in __call__
self._execute(
File "/usr/lib/python3/dist-packages/ocrmypdf/builtin_plugins/concurrency.py", line 138, in _execute
result = future.result()
^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/lib/python3.12/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_exec/jbig2enc.py", line 61, in convert_single_mp
return convert_single(
^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/ocrmypdf/_exec/jbig2enc.py", line 56, in convert_single
proc.check_returncode()
File "/usr/lib/python3.12/subprocess.py", line 502, in check_returncode
raise CalledProcessError(self.returncode, self.args, self.stdout,
subprocess.CalledProcessError: Command '['jbig2', '--pdf', '-t', '0.85', PosixPath('/tmp/ocrmypdf.io.cxyi2r43/images/00000079.tif')]' returned non-zero exit status 1.
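The failing step in the log is a plain subprocess call, so it might help to re-run it outside OCRmyPDF and look at the exit status and stderr directly. A minimal sketch, assuming `jbig2` is on `PATH` and that a TIFF is available to feed it (the temporary files named in the log are gone once OCRmyPDF exits); `run_encoder` is a hypothetical helper, and the default arguments are copied from the log above.

```python
import subprocess


def run_encoder(tif_path, argv_prefix=("jbig2", "--pdf", "-t", "0.85")):
    """Run the encoder on one image; return (exit status, stderr text).

    stdout is captured as bytes and ignored, since the encoder is
    expected to write binary data there.
    """
    proc = subprocess.run(
        [*argv_prefix, str(tif_path)],
        capture_output=True,
        check=False,  # report the status instead of raising CalledProcessError
    )
    return proc.returncode, proc.stderr.decode(errors="replace")
```

For example, `code, err = run_encoder("page.tif")` followed by `print(code, err)` should show whether the same `findTiffCompression` and `tiffGetCount` messages appear without OCRmyPDF in the loop, which would point at the `jbig2` binary or its libraries rather than at OCRmyPDF itself.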