parametrize erasure coding numbers #506
base: main
zdave-parity
left a comment
The changes look correct to me, just a few minor comments.
  \newcommand{\spl}[1]{\text{split}_{#1}}
- The foundation of the data-availability and distribution system of \Jam is a systematic Reed-Solomon erasure coding function in \textsc{gf}($2^{16}$) of rate 342:1023, the same transform as done by the algorithm of \cite{lin2014novel}. We use a little-endian $\blob[2]$ form of the 16-bit \textsc{gf} points with a functional equivalence given by $\fnencode[2]$. From this we may assume the encoding function $\fnerasurecode: \sequence[342]{\blob[2]} \to \sequence[1023]{\blob[2]}$ and the recovery function $\fnecrecover: \protoset{\tuple{\blob[2], \Nmax{1023}}}_{342} \to \sequence[342]{\blob[2]}$. Encoding is done by extrapolating a data blob of size 684 octets (provided in $\fnerasurecode$ here as 342 octet pairs) into 1,023 octet pairs. Recovery is done by collecting together any distinct 342 octet pairs, together with their indices, and transforming this into the original sequence of 342 octet pairs.
+ The foundation of the data-availability and distribution system of \Jam is a systematic Reed-Solomon erasure coding function in \textsc{gf}($2^{16}$) of rate $\nicefrac{\Cecpiecesize}{2}$:$\Cvalcount$, the same transform as done by the algorithm of \cite{lin2014novel}. We use a little-endian $\blob[2]$ form of the 16-bit \textsc{gf} points with a functional equivalence given by $\fnencode[2]$. From this we may assume the encoding function $\fnerasurecode: \sequence[\nicefrac{\Cecpiecesize}{2}]{\blob[2]} \to \sequence[\Cvalcount]{\blob[2]}$ and the recovery function $\fnecrecover: \protoset{\tuple{\blob[2], \Nmax{\Cvalcount}}}_{\nicefrac{\Cecpiecesize}{2}} \to \sequence[\nicefrac{\Cecpiecesize}{2}]{\blob[2]}$. Encoding is done by extrapolating a data blob of size $\Cecpiecesize$ octets (provided in $\fnerasurecode$ here as $\nicefrac{\Cecpiecesize}{2}$ octet pairs) into $\Cvalcount$ octet pairs. Recovery is done by collecting together any distinct $\nicefrac{\Cecpiecesize}{2}$ octet pairs, together with their indices, and transforming this into the original sequence of $\nicefrac{\Cecpiecesize}{2}$ octet pairs.
All the /2s make this a bit unpleasant to read. Maybe sensible to introduce another constant for the number of original shards, or change the meaning of W_E to this and use 2W_E for piece size? This is a style question that is probably best answered by Gav though.
@zdave-parity I totally agree with you. @gavofyork what is your suggestion?
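A minimal sketch of the first option raised above, for discussion only. The macro name `\Cecshardcount` is hypothetical and does not appear in the PR; it is defined here in terms of the existing `\Cecpiecesize`, and the symbol it should ultimately render as is left to the maintainers:

```latex
% Hypothetical constant for the number of original shards (octet pairs).
% The name \Cecshardcount is an assumption made for illustration.
\newcommand{\Cecshardcount}{\nicefrac{\Cecpiecesize}{2}}

% With it, the two signatures from the diff would read:
$\fnerasurecode: \sequence[\Cecshardcount]{\blob[2]} \to \sequence[\Cvalcount]{\blob[2]}$
$\fnecrecover: \protoset{\tuple{\blob[2], \Nmax{\Cvalcount}}}_{\Cecshardcount} \to \sequence[\Cecshardcount]{\blob[2]}$
```

Defining the constant once keeps the choice of rendering (a dedicated symbol, or the explicit fraction) in a single place, so it can be changed later without touching every use.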
- Once done, then imported segments must be reconstructed. This process may in fact be lazy as the Refine function makes no usage of the data until the \emph{fetch} host-call is made. Fetching generally implies that, for each imported segment, erasure-coded chunks are retrieved from enough unique validators (342, including the guarantor) and is described in more depth in appendix \ref{sec:erasurecoding}. (Since we specify systematic erasure-coding, its reconstruction is trivial in the case that the correct 342 validators are responsive.) Chunks must be fetched for both the data itself and for justification metadata which allows us to ensure that the data is correct.
+ Once done, then imported segments must be reconstructed. This process may in fact be lazy as the Refine function makes no usage of the data until the \emph{fetch} host-call is made. Fetching generally implies that, for each imported segment, erasure-coded chunks are retrieved from enough unique validators ($\nicefrac{\Cecpiecesize}{2}$, including the guarantor) and is described in more depth in appendix \ref{sec:erasurecoding}. (Since we specify systematic erasure-coding, its reconstruction is trivial in the case that the correct $\nicefrac{\Cecpiecesize}{2}$ validators are responsive.) Chunks must be fetched for both the data itself and for justification metadata which allows us to ensure that the data is correct.
  Validators, in their role as availability assurers, should index such chunks according to the index of the segments-tree whose reconstruction they facilitate. Since the data for segment chunks is so small at 12 octets, fixed communications costs should be kept to a bare minimum. A good network protocol (out of scope at present) will allow guarantors to specify only the segments-tree root and index together with a Boolean to indicate whether the proof chunk need be supplied. Since we assume at least 341 other validators are online and benevolent, we can assume that the guarantor can compute $\importsegmentdata$ and $\justifysegmentdata$ above with confidence, based on the general availability of data committed to with $\mathbf{s}^\clubsuit$, which is specified below.
341 here should be $\nicefrac{\Cecpiecesize}{2} - 1$ I guess, though that is a bit of a mouthful.
Maybe $\nicefrac{\Cecpiecesize}{2} - 1 = 341$?
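For reference, the correspondence between the symbolic constants and the literal values they replace can be read off the removed lines of the diff. The block below is only a summary of that arithmetic and assumes `amsmath` is loaded for `align*`:

```latex
% Values taken from the removed (numeric) lines of the diff.
\begin{align*}
  \Cecpiecesize                   &= 684  && \text{octets per data blob} \\
  \nicefrac{\Cecpiecesize}{2}     &= 342  && \text{octet pairs (original shards)} \\
  \nicefrac{\Cecpiecesize}{2} - 1 &= 341  && \text{other validators assumed online} \\
  \Cvalcount                      &= 1023 && \text{octet pairs after encoding}
\end{align*}
```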
No description provided.