RemoteLogDownloader deadlocks when a download fails with an exception.

### Search before asking

- [x] I searched in the [issues](https://github.com/apache/fluss/issues) and found nothing similar.


### Fluss version

0.7.0 (latest release)

### Please describe the bug 🐞

In the original code, the download lock (prefetchSemaphore) is released only in two cases:

1. After a RemoteLogSegment has been successfully read (drained), the lock is released via recycleRemoteLog.

<img width="548" height="211" alt="Image" src="https://github.com/user-attachments/assets/9946e354-be3d-43c2-bbd6-650853319e0b" />

2. When the download of a file fails, the lock is released.

<img width="664" height="384" alt="Image" src="https://github.com/user-attachments/assets/816582cc-7f30-4c45-97f5-7b800b911252" />

Consider a simplified model: a bucket contains three segment files (A, B, and C) and client.scanner.remote-log.prefetch-num = 1.
1. File A fails to download, so the lock is released and file A is added back to the end of the download queue.
2. File B downloads successfully, but its lock is not released because B has not been drained yet.
3. File A has an earlier offset than B, so it must be read first. However, A cannot be retried until the lock is acquired again, and the lock is held by B. B will never be drained because A blocks its processing, so the lock is never released and the downloader deadlocks.
4. As a result, file C is never downloaded and file A is never retried. The entire job is stuck.
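
The scenario above can be reproduced with a small standalone model. This is only a sketch and does not use the Fluss client classes: `PrefetchDeadlockSketch`, `downloadQueue`, `readOrder`, and `downloaded` are hypothetical names, and a plain `java.util.concurrent.Semaphore` with one permit stands in for prefetchSemaphore with prefetch-num = 1.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Semaphore;

// Hypothetical model of the issue; not the real RemoteLogDownloader.
public class PrefetchDeadlockSketch {

    /** Replays steps 1 and 2 and reports whether neither side can make progress. */
    public static boolean isStuckAfterFailedDownload() {
        Semaphore prefetchSemaphore = new Semaphore(1); // prefetch-num = 1
        List<String> readOrder = List.of("A", "B", "C"); // segments must be read by offset
        Deque<String> downloadQueue = new ArrayDeque<>(readOrder);
        Set<String> downloaded = new HashSet<>();

        // Step 1: A takes the permit, its download fails, the permit is
        // released, and A is re-queued at the end of the download queue.
        if (!prefetchSemaphore.tryAcquire()) {
            throw new IllegalStateException("permit should be free for A");
        }
        String failed = downloadQueue.pollFirst();
        prefetchSemaphore.release();
        downloadQueue.addLast(failed); // queue is now [B, C, A]

        // Step 2: B takes the permit and downloads successfully. The permit
        // stays held until B is drained by the reader.
        if (!prefetchSemaphore.tryAcquire()) {
            throw new IllegalStateException("permit should be free for B");
        }
        downloaded.add(downloadQueue.pollFirst());

        // The reader can only drain the segment with the earliest offset (A).
        boolean readerCanProgress = downloaded.contains(readOrder.get(0));
        // The downloader can only retry A (or fetch C) if a permit is free.
        boolean downloaderCanProgress = prefetchSemaphore.tryAcquire();

        return !readerCanProgress && !downloaderCanProgress;
    }

    public static void main(String[] args) {
        System.out.println("stuck = " + isStuckAfterFailedDownload());
    }
}
```

In this model the reader waits for A while the only permit is held by B, so neither the reader nor the downloader can advance.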

### Solution

When RemoteLogDownloader fails to download a file, it should no longer release the semaphore. Instead, it should retry the download several times while still holding the lock. If the download still fails after the retries, the exception should be thrown out of the client so that the Flink job fails and restarts.
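
A minimal sketch of the proposed behavior, under stated assumptions: `RetryingDownloadSketch`, `downloadWithRetry`, and `maxAttempts` are hypothetical names rather than part of the Fluss API, and the download itself is represented by a generic `Callable`. Releasing the permit when giving up is also an assumption of this sketch; the proposal only requires that the exception reaches the caller.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical sketch of "retry while holding the permit"; not the actual fix.
public class RetryingDownloadSketch {

    /**
     * Acquires one permit, then tries the download up to maxAttempts times
     * without releasing the permit between attempts. On success the permit
     * stays held (it would be released later, once the segment is drained).
     */
    public static <T> T downloadWithRetry(
            Semaphore prefetchSemaphore, Callable<T> download, int maxAttempts)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        prefetchSemaphore.acquire();
        Exception lastFailure = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return download.call();
            } catch (Exception e) {
                lastFailure = e; // keep the permit and try again
            }
        }
        // Assumption: give the permit back before failing, then surface the
        // error so the caller (for example the Flink job) can fail and restart.
        prefetchSemaphore.release();
        throw new IOException(
                "download failed after " + maxAttempts + " attempts", lastFailure);
    }

    public static void main(String[] args) throws Exception {
        Semaphore prefetchSemaphore = new Semaphore(1);
        int[] calls = {0};
        String segment = downloadWithRetry(prefetchSemaphore, () -> {
            if (++calls[0] < 3) {
                throw new IOException("transient failure");
            }
            return "A";
        }, 5);
        System.out.println(segment + " downloaded after " + calls[0] + " attempts");
    }
}
```

Because the failed segment never gives up its permit, a later segment such as B cannot take the lock ahead of A, which removes the ordering problem described above.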

### Are you willing to submit a PR?

- [x] I'm willing to submit a PR!
