Unwrap Shared Domain and Temporal Tx Params #17402
Where exactly is the logic that causes sharing of the SD between unwind and exec?
In the stage loop, unwind and exec are separate functions and both operate on the shared domain. However, the code here: erigon/execution/stagedsync/stage_execute.go Line 309 in a8dac83
flushes and clears the domain, i.e. it writes the data to the underlying transaction and clears the SD. There are two cases here:
In case 2 the unwind is not visible to the parallel executors until the bounding tx is committed. I think that the SD's local data and the bounding tx need to have their updates coordinated. However, in the code there are various places where local shared domains are created and accessed. I think that many of these are likely to break things.
OK. So this is a preparatory PR to help make debugging easier for tracing the SD usage, correct?
Yes. It also simplifies the inner code, because it can just use the temporal tx rather than checking both. It seems that the tx is now pretty much always a temporal tx, so I have forced this to be true. As you can see from the various check-ins above, I wanted to do the fiddly changes to do with unwrapping here, rather than in the PR where I'm trying to trace the stage code. The only place where the tx is not temporal is in tests, so I've changed them to conform with the working code. Hence the addition of the test temporal db.
This PR changes the way SharedDomains and TemporalTxs are passed to stage components: they are now passed directly rather than wrapped inside a TxContainer.
The reason for doing this is that the container removes type and content safety when we're passing domains and transactions. For the most part the tx passed is now temporal, and all of the underlying code now handles temporal transactions, so I think that for the stage loop this wrapper has become redundant.
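To illustrate the type-safety point, here is a minimal sketch in Go. All names in it (`Tx`, `TemporalTx`, `SharedDomains`, `txContainer`, `execWrapped`, `execUnwrapped`) are simplified stand-ins invented for this example and are not Erigon's actual types or signatures. It only shows the general idea: with a wrapper, the "is this tx temporal?" check happens at run time, while with explicit parameters the compiler enforces it.

```go
package main

import (
	"errors"
	"fmt"
)

// Tx is a hypothetical plain key-value transaction.
type Tx interface {
	Get(key string) (string, bool)
}

// TemporalTx is a hypothetical transaction that can also read historical state.
type TemporalTx interface {
	Tx
	GetAsOf(key string, txNum uint64) (string, bool)
}

// SharedDomains is a hypothetical in-memory overlay of pending writes.
type SharedDomains struct{ pending map[string]string }

// plainTx implements only Tx.
type plainTx struct{ data map[string]string }

func (p plainTx) Get(key string) (string, bool) { v, ok := p.data[key]; return v, ok }

// memTx implements TemporalTx (history is ignored in this toy version).
type memTx struct{ data map[string]string }

func (m memTx) Get(key string) (string, bool) { v, ok := m.data[key]; return v, ok }
func (m memTx) GetAsOf(key string, _ uint64) (string, bool) { return m.Get(key) }

// txContainer mimics the wrapper style: either field may hold the "wrong" thing.
type txContainer struct {
	Tx   Tx
	Doms *SharedDomains
}

var errNotTemporal = errors.New("tx is not temporal")

// execWrapped must check at run time what the container actually holds.
func execWrapped(c txContainer, key string) (string, error) {
	ttx, ok := c.Tx.(TemporalTx)
	if !ok {
		return "", errNotTemporal
	}
	if c.Doms == nil {
		return "", errors.New("no shared domains")
	}
	if v, ok := c.Doms.pending[key]; ok {
		return v, nil
	}
	v, _ := ttx.GetAsOf(key, 0)
	return v, nil
}

// execUnwrapped takes both values explicitly; a non-temporal tx cannot be passed.
// The caller is still responsible for passing a non-nil sd.
func execUnwrapped(sd *SharedDomains, tx TemporalTx, key string) string {
	if v, ok := sd.pending[key]; ok {
		return v
	}
	v, _ := tx.GetAsOf(key, 0)
	return v
}

func main() {
	sd := &SharedDomains{pending: map[string]string{}}
	_, err := execWrapped(txContainer{Tx: plainTx{}, Doms: sd}, "k")
	fmt.Println("wrapped with plain tx:", err)
	// execUnwrapped(sd, plainTx{}, "k") would not compile: plainTx lacks GetAsOf.
	fmt.Println("unwrapped:", execUnwrapped(sd, memTx{data: map[string]string{"k": "v"}}, "k"))
}
```

In the wrapped form the mistake only surfaces as an error when the stage runs; in the unwrapped form it is rejected at compile time.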
Why now? Because I have discovered an issue with unwinds when running parallel processing with an external tx. In that case the shared domain is either not shared or has its memory cleared during unwind, and because the unwind flushes to the external RoTx of the loop, the RoTxs of the parallel workers can't see the update, as it has not been committed yet.
To fix this I need to trace the SD usage and use an SD which is shared between unwind and exec. This will make that easier to do.
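The visibility problem described above can be reproduced with a toy model. The sketch below is written in Go and is not Erigon code: `store`, `rwTx`, `roTx` and `sharedDomain` are hypothetical stand-ins, and it assumes the outer transaction buffers writes until commit while each worker reads from its own snapshot of committed state.

```go
package main

import "fmt"

// store holds committed state only (hypothetical stand-in for the database).
type store struct{ committed map[string]string }

// rwTx buffers writes until commit; it models the external tx of the loop.
type rwTx struct {
	s       *store
	pending map[string]string
}

func (s *store) beginRw() *rwTx { return &rwTx{s: s, pending: map[string]string{}} }
func (t *rwTx) put(k, v string) { t.pending[k] = v }
func (t *rwTx) get(k string) string {
	if v, ok := t.pending[k]; ok {
		return v
	}
	return t.s.committed[k]
}
func (t *rwTx) commit() {
	for k, v := range t.pending {
		t.s.committed[k] = v
	}
	t.pending = map[string]string{}
}

// roTx is a snapshot of committed state; it models a parallel worker's tx.
type roTx struct{ snap map[string]string }

func (s *store) beginRo() *roTx {
	snap := make(map[string]string, len(s.committed))
	for k, v := range s.committed {
		snap[k] = v
	}
	return &roTx{snap: snap}
}
func (r *roTx) get(k string) string { return r.snap[k] }

// sharedDomain is an in-memory overlay that is flushed into a tx and then cleared.
type sharedDomain struct{ mem map[string]string }

func (sd *sharedDomain) put(k, v string) { sd.mem[k] = v }
func (sd *sharedDomain) flush(tx *rwTx) {
	for k, v := range sd.mem {
		tx.put(k, v)
	}
	sd.mem = map[string]string{} // after this the SD no longer holds the update
}

// scenario unwinds "head" from block-10 to block-8 and reports what each reader sees.
func scenario() (workerBefore, outerView, workerAfter string) {
	s := &store{committed: map[string]string{"head": "block-10"}}
	outer := s.beginRw()
	sd := &sharedDomain{mem: map[string]string{}}

	sd.put("head", "block-8") // unwind result lives in the SD...
	sd.flush(outer)           // ...then moves to the uncommitted outer tx

	workerBefore = s.beginRo().get("head") // worker snapshot: unwind not visible
	outerView = outer.get("head")          // outer tx: unwind visible
	outer.commit()
	workerAfter = s.beginRo().get("head") // new snapshot after commit: visible
	return
}

func main() {
	before, outer, after := scenario()
	fmt.Println(before, outer, after) // block-10 block-8 block-8
}
```

In this model, a worker that could read through an SD still holding the unwound value (one that is shared and not cleared) would see `block-8` before the commit, which is the behaviour the fix is aiming for.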