[BUG] corrupt deflate stream #131

@kargaranamir

Describe the Bug
When running the Ungoliant pipeline, everything proceeds smoothly at first, and the JSONL files for each language are built. After a couple of hours, however, the error below appears in the logs, and from then on the logs contain only this error. Why does this happen, and can it be resolved by skipping the problematic inputs?

[2024-03-27T23:49:00Z ERROR ungoliant::pipelines::oscardoc::pipeline] ReadData(Custom { kind: InvalidInput, error: "corrupt deflate stream" })

To Reproduce
Nothing specific to mention, just the routine: downloading and pipelining.

Expected Behavior
The pipeline should either keep working as it did earlier or skip the corrupt inputs.
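
As a possible workaround until the cause is known, the downloaded inputs could be checked before pipelining. The sketch below is not part of Ungoliant. It assumes (unconfirmed) that the error comes from a truncated or damaged gzip shard and that the shards sit in a single directory as `*.gz` files. The function names `is_readable_gzip` and `split_shards` are hypothetical.

```python
import gzip
import zlib
from pathlib import Path


def is_readable_gzip(path, chunk_size=1 << 20):
    """Return True if the whole gzip file decompresses without error."""
    try:
        with gzip.open(path, "rb") as fh:
            # Read to the end in chunks so large shards are never held in memory.
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile (bad header or CRC mismatch),
        # EOFError covers truncated files, zlib.error covers corrupt deflate data.
        return False


def split_shards(directory, pattern="*.gz"):
    """Partition the shards in `directory` into (readable, unreadable) lists."""
    good, bad = [], []
    for path in sorted(Path(directory).glob(pattern)):
        (good if is_readable_gzip(path) else bad).append(path)
    return good, bad
```

Shards that end up in the unreadable list could then be downloaded again or moved aside before the pipeline is restarted. Whether this actually removes the error depends on the assumption above.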

Screenshots

At first: [image: Screenshot 2024-03-28 at 12 56 51 AM]

Later: [image: Screenshot 2024-03-28 at 12 55 51 AM]

Desktop (Please Complete the Following Information):

uname -a
Linux delta 5.14.21-150500.55.36-default #1 SMP PREEMPT_DYNAMIC Tue Oct 31 08:37:43 UTC 2023 (e7a2e23) x86_64 x86_64 x86_64 GNU/Linux
