Skip to content

Conversation

nmerkulov
Copy link

Fix race condition in query processing

Problem

There was a race condition between closing the stream channel and writing to it from a background goroutine in conn_query.go.

Race condition scenario:

  1. conn_query.go starts a goroutine that calls c.process(ctx, onProcess) (line 45)
  2. c.process() spawns another goroutine internally that calls c.processImpl() (line 106 in conn_process.go)
  3. c.processImpl() reads data from the server and calls on.data(block) for each data block (line 170 in conn_process.go)
  4. When the context is cancelled, c.process() returns with ctx.Err()
  5. The outer goroutine in conn_query.go immediately closes the stream channel (line 54)
  6. However, the inner goroutine from step 2 may still be running and attempting to write to stream via on.data(block)
  7. Writing to a closed channel causes a panic: panic: send on closed channel

Stack trace example:

goroutine 1: close(stream)
goroutine 2: stream <- b  // panic!

Solution

Modified conn_process.go to ensure that the process() function waits for the internal goroutine to complete before returning, even when the context is cancelled.

Changes:

  • When ctx.Done() is triggered, after calling c.cancel(), we now wait for the goroutine to finish by selecting on errCh or doneCh
  • This guarantees that on.data will not be called after process() returns
  • The stream channel can now be safely closed in conn_query.go without race conditions

Impact

  • Fixes panic when cancelling queries that are actively receiving data
  • Ensures proper cleanup and resource management
  • No breaking changes to public API
  • Minimal performance impact (only adds synchronization on context cancellation path)

@CLAassistant
Copy link

CLAassistant commented Oct 6, 2025

CLA assistant check
All committers have signed the CLA.

@mshustov mshustov requested review from Copilot and kavirajk October 12, 2025 09:45
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes a race condition in query processing where the stream channel could be closed while a background goroutine was still attempting to write to it, causing a panic.

  • Adds synchronization to wait for background goroutine completion before returning from process() function when context is cancelled
  • Prevents "send on closed channel" panic during query cancellation
  • Ensures proper cleanup and resource management without breaking API changes

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +119 to +123
// Wait for goroutine to finish before returning
select {
case <-errCh:
case <-doneCh:
}
Copy link

Copilot AI Oct 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected spelling of 'untill' to 'until' in the PR title.

Copilot uses AI. Check for mistakes.

@nmerkulov nmerkulov changed the title wait untill processImpl completes before closing stream chan wait until processImpl completes before closing stream chan Oct 13, 2025
@kavirajk
Copy link
Contributor

@nmerkulov thanks for the PR :) do you mind adding a test case to lock the behavior?

robot-piglet pushed a commit to yandex/perforator that referenced this pull request Oct 20, 2025
<ClickHouse/clickhouse-go#1677>

UPD
исправил фикс на надежный. теперь канал не будет закрыт, пока горутина-писатель не завершится
Семантически так правильней
commit_hash:ef4786541bee52c63054552b0c593ce70fc6c75b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants