Closed
Description
Describe the bug
Running ocrmypdf on a specific PDF file raises an exception at 0% of the "Recompressing JPEGs" step.
An exception occurred while executing the pipeline
[Traceback, posted below ...]
OSError: image file is truncated (1 bytes not processed)
Steps to reproduce
1. Run `ocrmypdf input.pdf output.pdf`
2. Notice that it fails at the "Recompressing JPEGs" stage
Files
The file is found here: https://annas-archive.org/md5/aee9796ac090fdc8a93fc654f32020f3
How did you download and install the software?
Linux package manager (apt, dnf, etc.)
OCRmyPDF version
16.12.0
Relevant log output
[...]
Optimizable images: JPEGs: 774 PNGs: 0 optimize.py:371
Recompressing JPEGs ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 0% 0/774 -:--:--
An exception occurred while executing the pipeline _common.py:296
Traceback (most recent call last):
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipelines/_common.py", line 261, in cli_exception_handler
return fn(options, plugin_manager)
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipelines/ocr.py", line 181, in _run_pipeline
optimize_messages = exec_concurrent(context, executor)
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipelines/ocr.py", line 145, in exec_concurrent
pdf, messages = postprocess(pdf, context, executor)
~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipelines/_common.py", line 460, in postprocess
return optimize_pdf(pdf_out, context, executor)
File "/usr/lib/python3.13/site-packages/ocrmypdf/_pipeline.py", line 992, in optimize_pdf
output_pdf, messages = context.plugin_manager.hook.optimize_pdf(
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
input_pdf=input_file,
^^^^^^^^^^^^^^^^^^^^^
...<3 lines>...
linearize=should_linearize(input_file, context),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/lib/python3.13/site-packages/pluggy/_hooks.py", line 512, in __call__
return self._hookexec(self.name, self._hookimpls.copy(), kwargs, firstresult)
~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/pluggy/_manager.py", line 120, in _hookexec
return self._inner_hookexec(hook_name, methods, kwargs, firstresult)
~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/pluggy/_callers.py", line 167, in _multicall
raise exception
File "/usr/lib/python3.13/site-packages/pluggy/_callers.py", line 121, in _multicall
res = hook_impl.function(*args)
File "/usr/lib/python3.13/site-packages/ocrmypdf/builtin_plugins/optimize.py", line 145, in optimize_pdf
result_path = optimize(input_pdf, output_pdf, context, save_settings, executor)
File "/usr/lib/python3.13/site-packages/ocrmypdf/optimize.py", line 727, in optimize
transcode_jpegs(pdf, jpegs, root, options, executor)
~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/ocrmypdf/optimize.py", line 512, in transcode_jpegs
executor(
~~~~~~~~^
use_threads=True, # Processes are significantly slower at this task
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
...<9 lines>...
task_finished=finish_jpeg,
^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/lib/python3.13/site-packages/ocrmypdf/_concurrent.py", line 78, in __call__
self._execute(
~~~~~~~~~~~~~^
use_threads=use_threads,
^^^^^^^^^^^^^^^^^^^^^^^^
...<5 lines>...
task_finished=task_finished,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "/usr/lib/python3.13/site-packages/ocrmypdf/builtin_plugins/concurrency.py", line 162, in _execute
result = future.result()
File "/usr/lib/python3.13/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
~~~~~~~~~~~~~~~~~^^
File "/usr/lib/python3.13/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/usr/lib/python3.13/concurrent/futures/thread.py", line 59, in run
result = self.fn(*self.args, **self.kwargs)
File "/usr/lib/python3.13/site-packages/ocrmypdf/optimize.py", line 484, in _optimize_jpeg
im.save(opt_jpg, optimize=True, quality=jpeg_quality)
~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.13/site-packages/PIL/Image.py", line 2539, in save
self.load()
~~~~~~~~~^^
File "/usr/lib/python3.13/site-packages/PIL/ImageFile.py", line 391, in load
raise OSError(msg)
OSError: image file is truncated (1 bytes not processed)
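The last frames of the traceback (`im.save(...)` calling `self.load()` in Pillow) can probably be reproduced outside OCRmyPDF. The following is only a minimal sketch, assuming Pillow is installed. The helper names `make_truncated_jpeg` and `recompress` are hypothetical and not part of OCRmyPDF, and the synthetic image is cut in half rather than by a single byte, so it shows the same class of error, not necessarily the exact defect in the linked file. `ImageFile.LOAD_TRUNCATED_IMAGES` is Pillow's global switch for tolerating incomplete image data; whether OCRmyPDF should rely on it is an open question.

```python
import io
import os

from PIL import Image, ImageFile


def make_truncated_jpeg() -> bytes:
    """Build a noisy JPEG in memory and drop its second half (hypothetical helper)."""
    noise = Image.frombytes("RGB", (128, 128), os.urandom(128 * 128 * 3))
    buf = io.BytesIO()
    noise.save(buf, format="JPEG", quality=75)
    data = buf.getvalue()
    # Noise compresses poorly, so the first half still contains the full header.
    return data[: len(data) // 2]


def recompress(data: bytes, tolerate_truncation: bool = False) -> bytes:
    """Re-encode JPEG bytes, roughly like the im.save(...) call in the traceback."""
    previous = ImageFile.LOAD_TRUNCATED_IMAGES
    # Assumption: toggling the global flag is acceptable for this experiment.
    ImageFile.LOAD_TRUNCATED_IMAGES = tolerate_truncation
    try:
        with Image.open(io.BytesIO(data)) as im:
            out = io.BytesIO()
            # save() triggers load(), which is where the OSError is raised.
            im.save(out, format="JPEG", optimize=True, quality=75)
            return out.getvalue()
    finally:
        # Restore the previous setting so other code is not affected.
        ImageFile.LOAD_TRUNCATED_IMAGES = previous


if __name__ == "__main__":
    broken = make_truncated_jpeg()
    try:
        recompress(broken)
    except OSError as exc:
        print("strict mode failed:", exc)
    lenient = recompress(broken, tolerate_truncation=True)
    print("lenient mode produced", len(lenient), "bytes")
```

In strict mode the sketch should fail with an `OSError` mentioning a truncated image file, while the lenient call re-encodes whatever data could be decoded. Until the underlying cause is known, skipping the optimization stage with `--optimize 0` may also avoid the failing code path.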
austinbutler