Skip to content

Conversation

@choseenonee
Copy link
Collaborator

Fix: Safe Teardown Before First WriteRow

Previously, calling Teardown on a dataWriter before the first WriteRow caused a panic.
The issue occurred because the underlying writer (CSV or Parquet) was never initialized, yet Teardown attempted to flush it.

Example of failing flow

dataWriter := csv.NewWriter(
    ctx,
    w.model,
    w.config.CSVParams,
    w.columnsToDiscard,
    outPath,
    w.continueGeneration,
    w.writtenRowsChan,
)

dataWriter.Init()
dataWriter.Teardown() // panic here

Changes in this PR

  • Fixed Teardown implementation for CSV and Parquet writers to handle cases when no rows were written.
  • Added a unit test that reproduces the bug and verifies the fix.

Comment on lines 438 to 452
w.writerMutex.Lock()
if w.csvWriter != nil {
w.writerMutex.Unlock()

if err := w.flush(); err != nil {
return err
}
}

if err := w.fileDescriptor.Close(); err != nil {
return errors.New(err.Error())
w.writerMutex.TryLock()

if w.fileDescriptor != nil {
if err := w.fileDescriptor.Close(); err != nil {
return errors.New(err.Error())
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After we waited for writerWg and flushWg, no one else is using csvWriter. Do we need to take a lock? Please correct me if I've missed something.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! You are absolutely right.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants