Conversation

MetalBlueberry (Contributor)

No description provided.

MetalBlueberry force-pushed the vperez/implement-rewind-buffer branch from a0179e6 to b7f2705 on September 2, 2025.
MetalBlueberry changed the title from "feat: implement rewind buffer" to "feat: Extend error handling to allow conflict resolution" on September 2, 2025.
MetalBlueberry marked this pull request as ready for review on September 25, 2025.
arajkumar (Member) left a comment:

first pass!

copier, err := csvcopy.NewCopier(connStr, "test_metrics",
csvcopy.WithColumns("device_id,label,value"),
csvcopy.WithBatchSize(2),
csvcopy.WithBatchErrorHandler(BatchConflictHandler(WithConflictHandlerNext(csvcopy.BatchHandlerNoop()))),
arajkumar (Member):

IMHO, having a next handler complicates the API. Do you think it would be useful? If so, why not just use an array of batch error handlers?

MetalBlueberry (Contributor, Author):

The current design is a decorator pattern. A next handler is something that is called after your function finishes, which allows chaining options.

Now that you mention it, next is probably not working correctly, since the actual goal here is to fall back when the error is not a conflict error.

Having an array doesn't solve the problem, because then how should the array behave? Execute all the handlers? Just the last one? How do you combine the outputs? It becomes hard to manage.

If we rename it, do you have any ideas for the name? 😅 I can't think of a good one.
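
For illustration, a minimal sketch of the decorator shape being discussed. The handler signature, the conflictError type, and the helper names below are assumptions made for this example; the real csvcopy.BatchErrorHandler, BatchConflictHandler, and BatchHandlerNoop in the PR may look different.

package main

import (
	"context"
	"errors"
	"fmt"
)

// Hypothetical handler signature, for illustration only.
type BatchErrorHandler func(ctx context.Context, batchErr error) error

// Hypothetical conflict error type.
type conflictError struct{ msg string }

func (e *conflictError) Error() string { return e.msg }

// noopHandler swallows any error, mirroring the idea of BatchHandlerNoop.
func noopHandler() BatchErrorHandler {
	return func(ctx context.Context, batchErr error) error { return nil }
}

// conflictHandler is a decorator: it resolves conflict errors itself and
// falls back to the wrapped "next" handler for everything else.
func conflictHandler(next BatchErrorHandler) BatchErrorHandler {
	return func(ctx context.Context, batchErr error) error {
		var conflict *conflictError
		if errors.As(batchErr, &conflict) {
			fmt.Println("resolving conflict:", conflict.msg)
			return nil // conflict handled here
		}
		// Not a conflict: delegate to the wrapped handler.
		return next(ctx, batchErr)
	}
}

func main() {
	h := conflictHandler(noopHandler())
	_ = h(context.Background(), &conflictError{msg: "duplicate key"})
	_ = h(context.Background(), errors.New("some other failure"))
}

With this shape, conflictHandler(noopHandler()) mirrors the BatchConflictHandler(WithConflictHandlerNext(BatchHandlerNoop())) call from the test above: conflicts are resolved in place and anything else falls through to the wrapped handler.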

Comment on lines +133 to +135
// Create a copy of the input data to avoid issues with caller reusing the slice
data := make([]byte, len(p))
copy(data, p)
arajkumar (Member):

There is no copy involved with net.Buffers; should we retain that property to avoid excessive garbage?

}

// Seek sets the position for next Read or Write operation
func (v *Seekable) Seek(offset int64, whence int) (int64, error) {
arajkumar (Member):

Is there a use case for SeekCurrent and SeekEnd? I think we could have just stayed with net.Buffers, and buf[0][0] might be good enough?

MetalBlueberry (Contributor, Author):

net.Buffers discards data on read.

Here is the implementation: on Read it consumes the bytes by re-slicing. That is why this implementation uses moving indexes instead; they can simply be reset to come back to the start (a sketch of that approach follows the stdlib code below).

// Read from the buffers.
//
// Read implements [io.Reader] for [Buffers].
//
// Read modifies the slice v as well as v[i] for 0 <= i < len(v),
// but does not modify v[i][j] for any i, j.
func (v *Buffers) Read(p []byte) (n int, err error) {
	for len(p) > 0 && len(*v) > 0 {
		n0 := copy(p, (*v)[0])
		v.consume(int64(n0))
		p = p[n0:]
		n += n0
	}
	if len(*v) == 0 {
		err = io.EOF
	}
	return
}

func (v *Buffers) consume(n int64) {
	for len(*v) > 0 {
		ln0 := int64(len((*v)[0]))
		if ln0 > n {
			(*v)[0] = (*v)[0][n:]
			return
		}
		n -= ln0
		(*v)[0] = nil
		*v = (*v)[1:]
	}
}
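
To make the moving-index alternative concrete, here is a minimal, self-contained sketch. The field names and exact behaviour are assumptions for the example; the PR's actual Seekable type may differ.

package main

import (
	"errors"
	"fmt"
	"io"
)

// rewindBuffer keeps the written data intact and only moves a read offset,
// so seeking back to 0 replays the same bytes (unlike net.Buffers, which
// consumes its slices on Read).
type rewindBuffer struct {
	data []byte
	off  int64
}

func (b *rewindBuffer) Write(p []byte) (int, error) {
	// append copies the bytes, so the caller may reuse p afterwards.
	b.data = append(b.data, p...)
	return len(p), nil
}

func (b *rewindBuffer) Read(p []byte) (int, error) {
	if b.off >= int64(len(b.data)) {
		return 0, io.EOF
	}
	n := copy(p, b.data[b.off:])
	b.off += int64(n)
	return n, nil
}

func (b *rewindBuffer) Seek(offset int64, whence int) (int64, error) {
	var abs int64
	switch whence {
	case io.SeekStart:
		abs = offset
	case io.SeekCurrent:
		abs = b.off + offset
	case io.SeekEnd:
		abs = int64(len(b.data)) + offset
	default:
		return 0, errors.New("invalid whence")
	}
	if abs < 0 {
		return 0, errors.New("negative position")
	}
	b.off = abs
	return abs, nil
}

func main() {
	var b rewindBuffer
	b.Write([]byte("1,label,42\n"))
	io.Copy(io.Discard, &b)  // first read moves only the offset
	b.Seek(0, io.SeekStart)  // rewind to replay the batch
	out, _ := io.ReadAll(&b)
	fmt.Printf("replayed: %s", out)
}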

defer workerWg.Done()
err := c.processBatches(ctx, batchChan)
// Add worker ID to context for all operations in this worker
workerCtx := WithWorkerID(ctx, i)
arajkumar (Member):

I failed to understand the usefulness of adding the worker ID into the ctx. Also, why would adding the worker ID into the COPY statements be useful?

MetalBlueberry (Contributor, Author):

  1. Why add the worker ID to the context?
    Being in the context, it is automatically propagated across interfaces. This means you can implement the BatchErrorHandler and still receive the worker ID through the context, then use it to log which worker executed the handler. The main advantage is that you can forget it exists if you don't need it (see the sketch after this list).

  2. Why add the worker ID to COPY statements?
    I hit a situation where there was a deadlock and the error messages displayed the entire query. Having the ID in there lets me correlate the error with a specific worker, so it is easy to link application logs with Postgres logs.
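
A minimal sketch of the context pattern described in point 1, assuming a plain int worker ID and a -1 sentinel when unset (which matches the `workerID >= 0` guard in the quoted diff below); the real WithWorkerID and GetWorkerIDFromContext in the PR may differ in detail.

package main

import (
	"context"
	"fmt"
)

// Unexported key type avoids collisions with other context values.
type workerIDKey struct{}

// WithWorkerID stores the worker ID in the context.
func WithWorkerID(ctx context.Context, id int) context.Context {
	return context.WithValue(ctx, workerIDKey{}, id)
}

// GetWorkerIDFromContext returns -1 when no worker ID was set.
func GetWorkerIDFromContext(ctx context.Context) int {
	if id, ok := ctx.Value(workerIDKey{}).(int); ok {
		return id
	}
	return -1
}

func main() {
	ctx := WithWorkerID(context.Background(), 3)
	// A batch error handler (or the COPY command builder) can recover the ID
	// without it being threaded through every function signature.
	if id := GetWorkerIDFromContext(ctx); id >= 0 {
		fmt.Printf("/* Worker-%d */ COPY ...\n", id)
	}
}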

Comment on lines +479 to +482
// Add worker ID comment if available in context
if workerID := GetWorkerIDFromContext(ctx); workerID >= 0 {
baseCmd = fmt.Sprintf("/* Worker-%d */ %s", workerID, baseCmd)
}
arajkumar (Member):

I failed to understand the usefulness of adding the worker ID to the COPY command.

MetalBlueberry (Contributor, Author):

Why add the worker ID to COPY statements?
I hit a situation where there was a deadlock and the error messages displayed the entire query. Having the ID in there lets me correlate the error with a specific worker, so it is easy for me to link application logs with Postgres logs.
