@shashankhs11 shashankhs11 commented Nov 5, 2025

Handle TimeoutException in StateUpdater code path in TaskManager.java

Reviewers: Lucas Brutschy [email protected]

@github-actions github-actions bot added the triage (PRs from the community), streams, and small (Small PRs) labels Nov 5, 2025
@shashankhs11 shashankhs11 changed the title from "KAFKA-19684: Handle TimeoutException from initializeIfNeeded() in StateUpdater Code" to "KAFKA-19864: Handle TimeoutException from initializeIfNeeded() in StateUpdater Code" Nov 5, 2025
  for (final StreamTask restoredTask : restoredTasks) {
      verify(restoredTask).completeRestoration(noOpResetter);
-     verify(restoredTask).clearTaskTimeout();
+     verify(restoredTask, atLeastOnce()).clearTaskTimeout();
Contributor Author

I added this because one of the tests, shouldAddNewActiveTasks, was failing: clearTaskTimeout() is now called more than once, so the verification has to allow multiple calls.
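To illustrate the verification change, here is a minimal sketch that does not use Mockito itself. The `CountingTask` class and both helper methods are hypothetical stand-ins: `calledExactlyOnce` mirrors Mockito's default `verify(mock)` mode (exactly one invocation), and `calledAtLeastOnce` mirrors `verify(mock, atLeastOnce())`. The two call sites in `main` are assumptions about where the extra call comes from.

```java
// Illustrative sketch only; a hand-rolled call counter stands in for a Mockito mock.
class VerifySemanticsSketch {

    /** Hypothetical task that only counts calls to clearTaskTimeout(). */
    static final class CountingTask {
        int clearTaskTimeoutCalls = 0;

        void clearTaskTimeout() {
            clearTaskTimeoutCalls++;
        }
    }

    /** Mirrors the default verify(mock).method(): exactly one call is expected. */
    static boolean calledExactlyOnce(final CountingTask task) {
        return task.clearTaskTimeoutCalls == 1;
    }

    /** Mirrors verify(mock, atLeastOnce()).method(): one or more calls are accepted. */
    static boolean calledAtLeastOnce(final CountingTask task) {
        return task.clearTaskTimeoutCalls >= 1;
    }

    public static void main(final String[] args) {
        final CountingTask task = new CountingTask();
        task.clearTaskTimeout(); // assumed call site: after successful initialization
        task.clearTaskTimeout(); // assumed call site: after restoration completes
        System.out.println(calledExactlyOnce(task)); // false
        System.out.println(calledAtLeastOnce(task)); // true
    }
}
```

With two calls, the exact-once check fails while the at-least-once check passes, which matches the failure described above.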

@github-actions github-actions bot removed the triage (PRs from the community) label Nov 5, 2025
@lucasbru lucasbru requested a review from Copilot November 5, 2025 11:20
@lucasbru lucasbru self-requested a review November 5, 2025 11:20
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds handling for TimeoutException during task initialization in the state updater, aligning its behavior with existing LockException handling. It also ensures that task timeouts are properly cleared after successful initialization.

Key changes:

  • Added a TimeoutException catch block in addTaskToStateUpdater that retries initialization with backoff
  • Added clearTaskTimeout() call after successful task initialization
  • Updated tests to verify timeout handling behavior and clearTaskTimeout invocation
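The retry flow summarized in these bullets can be sketched roughly as follows. This is a simplified, self-contained model, not the code in TaskManager.java: `FlakyTask`, `BACKOFF_MS`, `TASK_TIMEOUT_MS`, and the three collections are hypothetical, `java.util.concurrent.TimeoutException` stands in for Kafka's own exception class, and the deadline logic inside `maybeInitTaskTimeoutOrThrow` is an assumption based on the method's name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.TimeoutException;

// Illustrative sketch only; none of these types are the real Kafka Streams classes.
class InitRetrySketch {
    static final long BACKOFF_MS = 100L;        // hypothetical retry backoff
    static final long TASK_TIMEOUT_MS = 1_000L; // hypothetical overall task timeout

    /** Hypothetical task whose initialization times out a fixed number of times. */
    static final class FlakyTask {
        final String id;
        int failuresLeft;
        int attempts = 0;
        Long deadlineMs = null; // null means no task timeout is currently running

        FlakyTask(final String id, final int failuresLeft) {
            this.id = id;
            this.failuresLeft = failuresLeft;
        }

        void initializeIfNeeded() throws TimeoutException {
            attempts++;
            if (failuresLeft > 0) {
                failuresLeft--;
                throw new TimeoutException("simulated timeout for task " + id);
            }
        }

        // Assumed semantics: start the clock on the first failure, give up once it expires.
        void maybeInitTaskTimeoutOrThrow(final long nowMs, final TimeoutException cause) {
            if (deadlineMs == null) {
                deadlineMs = nowMs + TASK_TIMEOUT_MS;
            } else if (nowMs > deadlineMs) {
                throw new IllegalStateException("task " + id + " exceeded its timeout", cause);
            }
        }

        void clearTaskTimeout() {
            deadlineMs = null;
        }
    }

    final Queue<FlakyTask> pendingInit = new ArrayDeque<>();
    final Map<String, Long> retryNotBeforeMs = new HashMap<>();
    final List<FlakyTask> handedToStateUpdater = new ArrayList<>();

    /** One pass over the pending tasks at time nowMs. */
    void tryInitialize(final long nowMs) {
        final int pending = pendingInit.size();
        for (int i = 0; i < pending; i++) {
            final FlakyTask task = pendingInit.poll();
            final Long notBefore = retryNotBeforeMs.get(task.id);
            if (notBefore != null && nowMs < notBefore) {
                pendingInit.add(task); // still backing off, skip this round
                continue;
            }
            try {
                task.initializeIfNeeded();
                task.clearTaskTimeout(); // success: reset the timeout clock
                retryNotBeforeMs.remove(task.id);
                handedToStateUpdater.add(task);
            } catch (final TimeoutException timeoutException) {
                task.maybeInitTaskTimeoutOrThrow(nowMs, timeoutException);
                pendingInit.add(task); // keep the task pending for a later retry
                retryNotBeforeMs.put(task.id, nowMs + BACKOFF_MS);
            }
        }
    }

    public static void main(final String[] args) {
        final InitRetrySketch sketch = new InitRetrySketch();
        final FlakyTask task = new FlakyTask("0_0", 2);
        sketch.pendingInit.add(task);
        for (final long nowMs : new long[] {0L, 50L, 100L, 200L}) {
            sketch.tryInitialize(nowMs);
        }
        System.out.println(task.attempts);                      // 3
        System.out.println(sketch.handedToStateUpdater.size()); // 1
    }
}
```

In this model the pass at 50 ms is skipped because the task is still backing off, the third attempt succeeds, and the timeout clock is cleared before the task is handed over.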

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| TaskManager.java | Added TimeoutException handling with retry logic and clearTaskTimeout call after successful initialization |
| TaskManagerTest.java | Added test for TimeoutException during initialization and updated existing tests to verify clearTaskTimeout behavior |
Comments suppressed due to low confidence (1)

streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java:1160

  • The test shouldAddTasksToStateUpdater should verify that clearTaskTimeout() is called on both tasks after successful initialization, consistent with the new behavior added in line 1085 of TaskManager.java and verified in other tests like shouldRetryInitializationWhenLockExceptionInStateUpdater.
        verify(task00).initializeIfNeeded();
        verify(task01).initializeIfNeeded();
        verify(stateUpdater).add(task00);
        verify(stateUpdater).add(task01);

@lucasbru lucasbru self-assigned this Nov 6, 2025
Member

@lucasbru lucasbru left a comment

Thanks! Looks good to me, with minor comments.

task.maybeInitTaskTimeoutOrThrow(nowMs, timeoutException);
tasks.addPendingTasksToInit(Collections.singleton(task));
updateOrCreateBackoffRecord(task.id(), nowMs);
log.debug("Task {} timed out during initialization; will retry", task.id(), timeoutException);
Member

Could be info like above.

Contributor Author

changed in a0e55f0

tasks.addPendingTasksToInit(Collections.singleton(task));
updateOrCreateBackoffRecord(task.id(), nowMs);
} catch (final TimeoutException timeoutException) {
task.maybeInitTaskTimeoutOrThrow(nowMs, timeoutException);
Member

Maybe add a comment about when this can happen: either during producer initialization or while fetching committed offsets.

Contributor Author

added comment in a0e55f0

@lucasbru lucasbru merged commit 9b0c9db into apache:trunk Nov 7, 2025
22 checks passed
@shashankhs11 shashankhs11 deleted the KAFKA-19684 branch November 7, 2025 13:08
eduwercamacaro pushed a commit to littlehorse-enterprises/kafka that referenced this pull request Nov 12, 2025
…teUpdater Code (apache#20829)