Conversation

@hatamiarash7 (Contributor)

Create a dedicated Makefile for the backend project.


- Improve README.md.
- Add a name for the Docker container and mount a volume for the DuckDB database.
- Fix Markdown lint problems (MD029).
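
The Makefile itself isn't shown in this description. A hypothetical sketch of what dedicated backend targets could look like (target names and the lint tool are assumptions, not taken from the PR):

```makefile
.PHONY: build test lint

# Hypothetical targets -- the PR's actual Makefile may differ.
build:
	go build ./...

# Running the suite with the race detector is what surfaced
# the data race described below.
test:
	go test -race ./...

lint:
	golangci-lint run
```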


While adding dedicated make targets for the backend, a data race was detected when running the tests with the race detector (go test -race). The race occurred in the batch log storage mechanism, specifically between:

  • Write operation in StoreLog()
  • Read operation in ProcessBatchStoreLogs()

The race condition was caused by unsafe concurrent access to the shared batchLogs slice. Here's the problematic sequence:

func StoreLog(entry models.LogEntry) error {
    batchLogsMutex.Lock()
    batchLogs = append(batchLogs, entry)

    if len(batchLogs) >= maxBatchStoreLogsSize {
        batchLogsMutex.Unlock()              // ⚠️ Mutex unlocked here
        return ProcessBatchStoreLogs()       // ⚠️ Calls another function
    }

    batchLogsMutex.Unlock()
    return nil
}

// ProcessBatchStoreLogs() tries to lock again
func ProcessBatchStoreLogs() error {
    batchLogsMutex.Lock()                    // ⚠️ Tries to lock again
    if len(batchLogs) == 0 {
        batchLogsMutex.Unlock()
        return nil
    }
    // ... process logs
}

Between StoreLog() unlocking the mutex and ProcessBatchStoreLogs() re-locking it, other goroutines could:

  • Append new entries to batchLogs (write)
  • Read from batchLogs in tests or other operations

This created a classic data race where multiple goroutines accessed shared memory without proper synchronization.

Solution

1. Modified StoreLog()

func StoreLog(entry models.LogEntry) error {
    batchLogsMutex.Lock()
    batchLogs = append(batchLogs, entry)

    if len(batchLogs) >= maxBatchStoreLogsSize {
        // Copy entries and reset while STILL holding the lock
        entries := batchLogs
        batchLogs = make([]models.LogEntry, 0, maxBatchStoreLogsSize)
        batchLogsMutex.Unlock()

        // Process the copied entries outside the lock
        return processBatchStoreLogsWithEntries(entries)
    }

    batchLogsMutex.Unlock()
    return nil
}

Key changes:

  • Hand off batchLogs to a local variable entries while still holding the mutex (a cheap slice-header copy, not a deep copy)
  • Reset batchLogs immediately (still under mutex protection)
  • Process the copied entries after releasing the lock
  • No gap where batchLogs can be accessed unsafely

2. Refactored ProcessBatchStoreLogs()

func ProcessBatchStoreLogs() error {
    batchLogsMutex.Lock()
    if len(batchLogs) == 0 {
        batchLogsMutex.Unlock()
        return nil
    }

    // Copy and reset under lock protection
    entries := batchLogs
    batchLogs = make([]models.LogEntry, 0, maxBatchStoreLogsSize)
    batchLogsMutex.Unlock()

    // Process outside the lock
    return processBatchStoreLogsWithEntries(entries)
}

3. New Helper Function

// processBatchStoreLogsWithEntries processes a batch of log entries
// This function does not touch the global batchLogs slice
func processBatchStoreLogsWithEntries(entries []models.LogEntry) error {
    if len(entries) == 0 {
        return nil
    }

    // Database operations (no shared memory access)
    dbConn, err := db.Conn(context.Background())
    // ... rest of the processing
}

Purpose:

  • Isolates database operations from shared state
  • Accepts entries as a parameter (no global state access)
  • Can be called without holding any locks

Benefits

  • Thread Safety
    • No concurrent access to batchLogs without mutex protection
    • No race conditions between readers and writers
  • Performance
    • Mutex held for minimal time (just copying slice references)
    • Database operations happen outside the critical section
    • Better concurrency as locks aren't held during I/O
  • Correctness
    • No data loss during batch processing
    • Proper separation of concerns (lock management vs. processing)
    • Clear ownership of data (global vs. local)

@nicolasbeauvais nicolasbeauvais merged commit 8ef15ea into phare:main Nov 25, 2025
@nicolasbeauvais (Contributor)

Thank you for the PR @hatamiarash7 🙌

