diff --git a/foundations/whitepapers/tblkch.mdx b/foundations/whitepapers/tblkch.mdx index 911e96454..deb9d53a0 100644 --- a/foundations/whitepapers/tblkch.mdx +++ b/foundations/whitepapers/tblkch.mdx @@ -14,17 +14,18 @@ The aim of this text is to provide a detailed description of the Telegram Open N --- + ## Introduction -This document provides a detailed description of the TON Blockchain, including its precise block format, validity conditions, TON Virtual Machine (TVM) invocation details, smart-contract creation process, and cryptographic signatures. In this respect it is a continuation of the TON whitepaper (cf. [[3](#ref-3)]), so we freely use the terminology introduced in that document. +This document provides a detailed description of the TON Blockchain, including its precise block format, validity conditions, TON Virtual Machine (TVM) invocation details, smart-contract creation process, and cryptographic signatures. In this respect it is a continuation of the [TON whitepaper](/foundations/whitepapers/ton), so we freely use the terminology introduced in that document. [Chapter 1](#1-overview) provides a general overview of the TON Blockchain and its design principles, with particular attention to the introduction of compatibility and validity conditions and the implementation of message delivery guarantees. More detailed information, such as the TL-B schemes that describe the serialization of all required data structures into trees or collections ("bags") of cells, is provided in subsequent chapters, culminating in a complete description of the TON Blockchain (shardchain and masterchain) block layout in [Chapter 5](#5-block-layout). -A detailed description of the elliptic curve cryptography used for signing blocks and messages, also accessible through TVM primitives, is provided in [Appendix A](#a-elliptic-curve-cryptography). TVM itself is described in a separate document (cf. [[4](#ref-4)]). 
+A detailed description of the elliptic curve cryptography used for signing blocks and messages, also accessible through TVM primitives, is provided in [Appendix A](#a-elliptic-curve-cryptography). TVM itself is described in a [separate document](/foundations/whitepapers/tvm). -Some subjects have intentionally been left out of this document. One is the Byzantine Fault Tolerant(BFT) protocol used by the validators to determine the next block of the masterchain or a shardchain; that subject is left for a forthcoming document dedicated to the TON Network. And although this document describes the precise format of TON Blockchain blocks, and discusses the blockchain's validity conditions and serialized invalidity proofs,[1](#fn1) it provides no details about the network protocols used to propagate these blocks, block candidates, collated blocks, and invalidity proofs. +Some subjects have intentionally been left out of this document. One is the Byzantine Fault Tolerant (BFT) protocol used by the validators to determine the next block of the masterchain or a shardchain; that subject is left for a forthcoming document dedicated to the TON Network. And although this document describes the precise format of TON Blockchain blocks, and discusses the blockchain's validity conditions and serialized invalidity proofs,[1](#fn1) it provides no details about the network protocols used to propagate these blocks, block candidates, collated blocks, and invalidity proofs. -Similarly, this document does not provide the complete source code of the masterchain smart contracts used to elect the validators, change the configurable parameters or get their current values, or punish the validators for their misbehavior, even though these smart contracts form an important part of the total blockchain state and of the masterchain block zero. 
Instead, this document describes the location of these smart contracts and their formal interfaces.[2](#fn2) The source code of these smart contracts will be provided separately as downloadable files with comments. +Similarly, this document does not provide the complete source code of the masterchain smart contracts used to elect the validators, change the configurable parameters or get their current values, or punish the validators for their misbehavior, even though these smart contracts form an important part of the total blockchain state and of the masterchain block zero. Instead, this document describes the location of these smart contracts and their formal interfaces.[2](#fn2) The source code of these smart contracts will be provided separately as downloadable files with comments. Please note that the current version of this document describes a preliminary test version of the TON Blockchain; some minor details are likely to change prior to launch during the development, testing, and deployment phases. @@ -34,11 +35,11 @@ This chapter provides an overview of the main features and design principles of ## 1.1 Everything is a bag of cells -All data in the blocks and state of the TON Blockchain is represented as a collection of *cells* (cf. [[3](#ref-3), 2.5]). Therefore, this chapter begins with a general discussion of cells. +All data in the blocks and state of the TON Blockchain is represented as a collection of [*cells*](/foundations/whitepapers/ton#2-5-global-shardchain-state-%E2%80%9Cbag-of-cells%E2%80%9D-philosophy). Therefore, this chapter begins with a general discussion of cells. ### 1.1.1. TVM cells -Recall that the TON Blockchain, as well as the TON Virtual Machine (TVM; cf. [[4](#ref-4)]), represents all permanently stored data as a *collection* or *bag* of so-called *cells*. Each cell consists of up to 1023 data bits and up to four references to other cells. 
Cyclic cell references are not allowed, so the cells are usually organized into *trees of cells*, or rather *directed acyclic graphs (DAGs) of cells*.[3](#fn3) Any value of an abstract algebraic (dependent) data type may be represented (serialized) as a tree of cells. The precise way of representing values of an abstract data type as a tree of cells is expressed by means of a *TL-B scheme*.[4](#fn4) A more thorough discussion of different kinds of cells may be found in [[4](#ref-4), 3.1]. +Recall that the TON Blockchain, as well as the TON Virtual Machine (TVM), represents all permanently stored data as a [*collection* or *bag* of so-called *cells*](/foundations/whitepapers/tvm#3-1-10-“everything-is-a-bag-of-cells”-paradigm). Each cell consists of up to 1023 data bits and up to four references to other cells. Cyclic cell references are not allowed, so the cells are usually organized into *trees of cells*, or rather *directed acyclic graphs (DAGs) of cells*.[3](#fn3) Any value of an abstract algebraic (dependent) data type may be represented (serialized) as a tree of cells. The precise way of representing values of an abstract data type as a tree of cells is expressed by means of a *TL-B scheme*.[4](#fn4) A more thorough discussion of different kinds of cells may be found in [Generalities on cells](/foundations/whitepapers/tvm#3-1-generalities-on-cells). ### 1.1.2. Application to TON Blockchain blocks and state @@ -59,7 +60,7 @@ where $[A]$ equals one when condition $A$ is true, and zero otherwise. - Next, $\lceil l/8\rceil$ data bytes follow. This means that the $l$ data bits of the cell are split into groups of eight, and each group is interpreted as a big-endian 8-bit integer and stored into a byte. If $l$ is not divisible by eight, a single binary one and a suitable number of binary zeroes (up to six) are appended to the data bits, and the completion tag (the least significant bit of the descriptor byte $d_2$) is set. 
-- Finally, $r$ references to other cells follow. Each reference is normally represented by 32 bytes containing the SHA-256 hash of the referenced cell, computed as explained below in [1.1.4](#1-1-4-the-sha256-hash-of-a-cell). +- Finally, $r$ references to other cells follow. Each reference is normally represented by 32 bytes containing the [SHA-256 hash of the referenced cell](#1-1-4-the-sha-256-hash-of-a-cell). In this way, the standard representation $\text{CellRepr}(c)$ of a cell $c$ with $l$ data bits and $r$ references is $2+\lceil l/8\rceil+32r$ bytes long. @@ -77,7 +78,7 @@ Furthermore, because SHA-256 is tacitly assumed to be collision-resistant, we as ### 1.1.5. Exotic cells -Apart from the *ordinary* cells (also called *simple* or *data* cells) considered so far, cells of other types, called *exotic cells*, sometimes appear in the actual representations of TON Blockchain blocks and other data structures. Their representation is somewhat different; they are distinguished by having the first descriptor byte $d_1\geq 5$ (cf. [[4](#ref-4), 3.1]). +Apart from the *ordinary* cells (also called *simple* or *data* cells) considered so far, cells of other types, called *exotic cells*, sometimes appear in the actual representations of TON Blockchain blocks and other data structures. Their representation is somewhat different; they are distinguished by having the first [descriptor byte](/foundations/whitepapers/tvm#3-1-generalities-on-cells) $d_1\geq 5$. ### 1.1.6. External reference cells @@ -102,7 +103,7 @@ Signatures are an excellent example of the application of representation hashes. ### 1.1.10. Higher hashes of a cell -In addition to the transparent and representation hashes of a cell $c$, a sequence of *higher hashes* $\text{Hash}_i(c)$, $i=1,2,\dots$ may be defined, which eventually stabilizes at $\text{Hash}_\infty(c)$. (More detail may be found in [[4](#ref-4), 3.1].) 
+In addition to the transparent and representation hashes of a cell $c$, a sequence of [*higher hashes*](/foundations/whitepapers/tvm#3-1-6-the-higher-hashes-of-a-cell) $\text{Hash}_i(c)$, $i=1,2,\dots$ may be defined, which eventually stabilizes at $\text{Hash}_\infty(c)$. --- @@ -112,7 +113,7 @@ This section briefly describes the principal components of a block and of the bl ### 1.2.1. The Infinite Sharding Paradigm (ISP) applied to blockchain block and state -Recall that according to the Infinite Sharding Paradigm, each account can be considered as lying in its separate "accountchain", and the (virtual) blocks of these accountchains are then grouped into shardchain blocks for efficiency purposes. Specifically, the state of a shardchain consists, roughly speaking, of the states of all its "accountchains" (i.e., of all accounts assigned to it); similarly, a block of a shardchain essentially consists of a collection of virtual "blocks" for some accounts assigned to the shardchain.[5](#fn5) +Recall that according to the Infinite Sharding Paradigm, each account can be considered as lying in its separate "accountchain", and the (virtual) blocks of these accountchains are then grouped into shardchain blocks for efficiency purposes. Specifically, the state of a shardchain consists, roughly speaking, of the states of all its "accountchains" (i.e., of all accounts assigned to it); similarly, a block of a shardchain essentially consists of a collection of virtual "blocks" for some accounts assigned to the shardchain.[5](#fn5) We can summarize this as follows: @@ -126,7 +127,7 @@ $$ where $n$ is the bit length of the $\mathit{account\_id}$, and $\text{Hashmap}(n,X)$ describes a partial map $\mathbf{2}^n\dashrightarrow X$ from bitstrings of length $n$ into values of type $X$. 
-Recall that each shardchain—or, more precisely, each shardchain block[6](#fn6)—corresponds to all accountchains that belong to the same "workchain" (i.e., have the same $\mathit{workchain\_id}=w$) and have an $\mathit{account\_id}$ beginning with the same binary prefix $s$, so that $(w,s)$ completely determines a shard. Therefore, the above hashmaps must contain only keys beginning with prefix $s$. +Recall that each shardchain—or, more precisely, each shardchain block[6](#fn6)—corresponds to all accountchains that belong to the same "workchain" (i.e., have the same $\mathit{workchain\_id}=w$) and have an $\mathit{account\_id}$ beginning with the same binary prefix $s$, so that $(w,s)$ completely determines a shard. Therefore, the above hashmaps must contain only keys beginning with prefix $s$. We will see in a moment that the above description is only an approximation: the state and block of the shardchain need to contain some extra data that are not split according to the $\mathit{account\_id}$ as suggested by ([3](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state)). @@ -136,9 +137,9 @@ A shardchain block and its state may each be classified into two distinct parts. ### 1.2.3. Interaction with other blocks and the outside world. Global and local consistency conditions -The non-split parts of the shardchain block and its state are mostly related to the interaction of this block with some other "neighboring" blocks. The global consistency conditions of the blockchain as a whole are reduced to internal consistency conditions of separate blocks by themselves as well as external local consistency conditions between certain blocks (cf. [1.3](#1-3-consistency-conditions)). +The non-split parts of the shardchain block and its state are mostly related to the interaction of this block with some other "neighboring" blocks. 
The global consistency conditions of the blockchain as a whole are reduced to internal consistency conditions of separate blocks by themselves as well as external local [consistency conditions](#1-3-consistency-conditions) between certain blocks. -Most of these local consistency conditions are related to message forwarding between different shardchains, transactions involving more than one shardchain, and message delivery guarantees. However, another group of local consistency conditions relates a block with its immediate antecessors and successors inside a shardchain; for instance, the initial state of a block usually must coincide with the final state of its immediate antecessor.[7](#fn7) +Most of these local consistency conditions are related to message forwarding between different shardchains, transactions involving more than one shardchain, and message delivery guarantees. However, another group of local consistency conditions relates a block with its immediate antecessors and successors inside a shardchain; for instance, the initial state of a block usually must coincide with the final state of its immediate antecessor.[7](#fn7) ### 1.2.4. Inbound and outbound messages of a block @@ -153,7 +154,7 @@ Another non-split component of a shardchain block is the *block header*, which c ### 1.2.6. Validator signatures, signed and unsigned blocks -The block described so far is an *unsigned block*; it is generated in its entirety and considered as a whole by the validators. When the validators ultimately sign it, the *signed block* is created, consisting of the unsigned block along with a list of validator signatures (of a certain representation hash of the unsigned block, cf. [1.1.9](#1-1-9-use-of-representation-hashes-for-signatures)). This list of signatures is also a non-split component of the (signed) block; however, since it lies outside the unsigned block, it is somewhat different from the other data kept in a block. 
+The block described so far is an *unsigned block*; it is generated in its entirety and considered as a whole by the validators. When the validators ultimately sign it, the *signed block* is created, consisting of the unsigned block along with a list of validator signatures (of a certain [representation hash](#1-1-9-use-of-representation-hashes-for-signatures) of the unsigned block). This list of signatures is also a non-split component of the (signed) block; however, since it lies outside the unsigned block, it is somewhat different from the other data kept in a block. ### 1.2.7. Outbound message queue of a shardchain @@ -163,7 +164,7 @@ Originally, each outbound message is included into *OutMsgQueue*; it is removed ### 1.2.8. Layout of *InMsgDescr*, *OutMsgDescr* and *OutMsgQueue* -All of the most important non-split shardchain data structures related to messages are organized as *hashmaps* or *dictionaries* (implemented by means of Patricia trees serialized into a tree of cells as described in [[4](#ref-4), 3.3]), with the following keys: +All of the most important non-split shardchain data structures related to messages are organized as *hashmaps* or *dictionaries* (implemented by means of [Patricia trees](/foundations/whitepapers/tvm#3-3-2-hashmaps-as-patricia-trees) serialized into a tree of cells), with the following keys: - The inbound message description *InMsgDescr* uses the 256-bit message hash as a key. - The outbound message description *OutMsgDescr* uses the 256-bit message hash as a key. @@ -171,7 +172,7 @@ All of the most important non-split shardchain data structures related to messag ### 1.2.9. The split part of the block: transaction chains -The split part of a shardchain block consists of a hashmap mapping some of the accounts assigned to the shardchain to "virtual accountchain blocks" *AccountBlock*, cf. ([3](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state)). 
Such a virtual accountchain block consists of a sequential list of *transactions* related to that account. +The split part of a shardchain block consists of a hashmap mapping some of the accounts assigned to the shardchain to "virtual accountchain blocks" *AccountBlock* ([3](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state)). Such a virtual accountchain block consists of a sequential list of *transactions* related to that account. ### 1.2.10. Transaction description @@ -180,7 +181,7 @@ Each transaction is described in the block by an instance of the *Transaction* t - A reference to exactly one *inbound message* (which must be present in *InMsgDescr* as well) that has been *processed* by the transaction. - References to several (maybe zero) *outbound messages* (also present in *OutMsgDescr* and most likely included in *OutMsgQueue*) that have been *generated* by the transaction. -The transaction consists of an invocation of TVM (cf. [[4](#ref-4)]) with the code of the smart contract corresponding to the account in question loaded into the virtual machine, and with the data root cell of the smart contract loaded into the virtual machine's register `c4`. The inbound message itself is passed in the stack as an argument to the smart contract's `main()` function, along with some other important data, such as the amount of TON Gram and other defined currencies attached to the message, the sender account address, the current balance of the smart contract, and so on. +The transaction consists of an invocation of the TVM with the code of the smart contract corresponding to the account in question loaded into the virtual machine, and with the data root cell of the smart contract loaded into the virtual machine's [register `c4`](/foundations/whitepapers/tvm#1-3-2-list-of-control-registers). 
The inbound message itself is passed in the stack as an argument to the smart contract's `main()` function, along with some other important data, such as the amount of TON Gram and other defined currencies attached to the message, the sender account address, the current balance of the smart contract, and so on. In addition to the information listed above, a *Transaction* instance also contains the original and final states of the account (i.e., of the smart contract), as well as some of the TVM running statistics (gas consumed, gas price, instructions performed, cells created/destroyed, virtual machine termination code, etc.). @@ -198,7 +199,7 @@ The account state itself approximately consists of the following data: - Its *storage usage statistics*, including the number of cells and bytes kept in the persistent storage of the smart contract (i.e., inside the blockchain state) and the last time a storage usage payment was exacted from this account. - An optional *formal interface description* (intended for smart contracts) and/or *user public information* (intended mostly for human users and organizations). -Notice that there is no distinction between "smart contract" and "account" in the TON Blockchain. Instead, "simple" or "wallet" accounts, typically employed by human users and their cryptocurrency wallet applications for simple cryptocurrency transfers, are just simple smart contracts with standard (shared) code and with persistent data consisting of the public key of the wallet (or several public keys in the case of a multi-signature wallet; cf. [1.7.6](#1-7-6-example%3A-creating-a-cryptocurrency-wallet-smart-contract) for more detail). +Notice that there is no distinction between "smart contract" and "account" in the TON Blockchain. 
Instead, "simple" or "wallet" accounts, typically employed by human users and their [cryptocurrency wallet applications](#1-7-6-example%3A-creating-a-cryptocurrency-wallet-smart-contract) for simple cryptocurrency transfers, are just simple smart contracts with standard (shared) code and with persistent data consisting of the public key of the wallet (or several public keys in the case of a multi-signature wallet). ### 1.2.13. Masterchain blocks @@ -214,7 +215,7 @@ In addition to shardchain blocks and their states, the TON Blockchain contains * ## 1.3 Consistency conditions -In addition to the data structures contained in the block and in the blockchain state, which are serialized into bags of cells according to certain TL-B schemes explained in detail later (cf. Chapters [3](#3-messages%2C-message-descriptors%2C-and-queues)—[5](#5-block-layout)), an important component of the blockchain layout is the *consistency conditions* between data kept inside one or in different blocks (as mentioned in [1.2.3](#1-2-3-interaction-with-other-blocks-and-the-outside-world-global-and-local-consistency-conditions)). This section describes in detail the function of consistency conditions in the blockchain. +In addition to the data structures contained in the block and in the blockchain state, which are serialized into bags of cells according to certain TL-B schemes explained in detail later (Chapters [3](#3-messages%2C-message-descriptors%2C-and-queues)—[5](#5-block-layout)), an important component of the blockchain layout is the [*consistency conditions*](#1-2-3-interaction-with-other-blocks-and-the-outside-world-global-and-local-consistency-conditions) between data kept inside one or in different blocks. This section describes in detail the function of consistency conditions in the blockchain. ### 1.3.1. Expressing consistency conditions @@ -263,7 +264,7 @@ $$ "for any $x$ from $X$, there is a $y$ from $Y$ such that condition $A(x,y)$ holds". 
Even if we know $C$ to be true, we do not have a way of quickly finding a $y:Y$, such that $A(x,y)$, for a given $x:X$. As a consequence, the verification of $C$ may be quite time-consuming. -In order to simplify the verification of local conditions, they are made *constructible* (i.e., verifiable in bounded time) by adding some *witness* data structures. For instance, condition $C$ of ([5](#1-3-7-constructive-elimination-of-existence-quantifiers)) may be transformed by adding a new data structure $f:X\to Y$ (a map $f$ from $X$ to $Y$) and imposing the following condition $C'$ instead: +In order to simplify the verification of local conditions, they are made *constructible* (i.e., verifiable in bounded time) by adding some *witness* data structures. For instance, condition $C$ of (5) may be transformed by adding a new data structure $f:X\to Y$ (a map $f$ from $X$ to $Y$) and imposing the following condition $C'$ instead: $$ C':\equiv\forall_{(x:X)}A\bigl(x,f(x)\bigr)\quad. \tag{6} @@ -273,11 +274,11 @@ Of course, the "witness" value $f(x):Y$ may be included inside the (modified) da ### 1.3.8. Example: consistency condition for *InMsgDescr* -For instance, the consistency condition between $X:=\textit{InMsgDescr}$, the list of all inbound messages processed in a block, and $Y:=\textit{Transactions}$, the list of all transactions present in a block, is of the above sort: "For any input message $x$ present in *InMsgDescr*, a transaction $y$ must be present in the block such that $y$ processes $x$".[8](#fn8) The procedure of $\exists$-elimination described in [1.3.7](#1-3-7-constructive-elimination-of-existence-quantifiers) leads us to introduce an additional field in the inbound message descriptors of *InMsgDescr*, containing a reference to the transaction in which the message is actually processed. 
+For instance, the consistency condition between $X:=\textit{InMsgDescr}$, the list of all inbound messages processed in a block, and $Y:=\textit{Transactions}$, the list of all transactions present in a block, is of the above sort: "For any input message $x$ present in *InMsgDescr*, a transaction $y$ must be present in the block such that $y$ processes $x$".[8](#fn8) The procedure of [$\exists$-elimination](#1-3-7-constructive-elimination-of-existence-quantifiers) leads us to introduce an additional field in the inbound message descriptors of *InMsgDescr*, containing a reference to the transaction in which the message is actually processed. ### 1.3.9. Constructive elimination of logical disjunctions -Similarly to the transformation described in [1.3.7](#1-3-7-constructive-elimination-of-existence-quantifiers), condition +Similarly to the [transformation described above](#1-3-7-constructive-elimination-of-existence-quantifiers), condition $$ D :\equiv \forall_{(x:X)} (A_1(x) \vee A_2(x)) \quad , \tag{7} @@ -291,7 +292,7 @@ $$ This is a special case of the existential quantifier elimination considered before for $Y=\mathbf{2}=\{1,2\}$. It may be useful when $A_1(x)$ and $A_2(x)$ are complicated conditions that cannot be verified quickly, so that it is useful to know in advance which of them is in fact true. -For instance, *InMsgDescr*, as considered in [1.3.8](#1-3-8-example%3A-consistency-condition-for-inmsgdescr), can contain both messages processed in the block and transit messages. We might introduce a field in the inbound message description to indicate whether the message is transit or not, and, in the latter case, include a witness field for the transaction processing the message. +For instance, [*InMsgDescr*](#1-3-8-example%3A-consistency-condition-for-inmsgdescr) can contain both messages processed in the block and transit messages. 
We might introduce a field in the inbound message description to indicate whether the message is transit or not, and, in the latter case, include a witness field for the transaction processing the message. ### 1.3.10. Constructivization of conditions @@ -303,9 +304,9 @@ Ultimately, all of the internal conditions for a block, along with the local ant ### 1.3.12. Witnesses of the invalidity of a block -If a block does not satisfy all of the validity conditions $C_1, \ldots, C_n$ (i.e., the conjunction $V:\equiv\bigwedge_i C_i$ of the validity conditions), it is *invalid*. This means that it satisfies the "invalidity condition" $\neg V=\bigvee_i\neg C_i$. If all of the $C_i$—and hence, also $V$—have been "constructivized" in the sense described in [1.3.10](#1-3-10-constructivization-of-conditions), so that they contain only logical conjunctions and universal quantifiers (and simple atomic propositions), then $\neg V$ contains only logical disjunctions and existential quantifiers. Then a constructivization of $\neg V$ may be defined, which would involve an *invalidity witness*, starting with an index $i$ of the specific validity condition $C_i$ which fails. +If a block does not satisfy all of the validity conditions $C_1, \ldots, C_n$ (i.e., the conjunction $V:\equiv\bigwedge_i C_i$ of the validity conditions), it is *invalid*. This means that it satisfies the "invalidity condition" $\neg V=\bigvee_i\neg C_i$. If all of the $C_i$—and hence, also $V$—have been [constructivized](#1-3-10-constructivization-of-conditions), so that they contain only logical conjunctions and universal quantifiers (and simple atomic propositions), then $\neg V$ contains only logical disjunctions and existential quantifiers. Then a constructivization of $\neg V$ may be defined, which would involve an *invalidity witness*, starting with an index $i$ of the specific validity condition $C_i$ which fails. 
-Such invalidity witnesses may also be serialized and presented to other validators or committed into the masterchain to prove that a specific block or block candidate is in fact invalid. Therefore, the construction and serialization of invalidity witnesses is an important part of a Proof-of-Stake (PoS) blockchain design.[9](#fn9) +Such invalidity witnesses may also be serialized and presented to other validators or committed into the masterchain to prove that a specific block or block candidate is in fact invalid. Therefore, the construction and serialization of invalidity witnesses is an important part of a Proof-of-Stake (PoS) blockchain design.[9](#fn9) ### 1.3.13. Minimizing the size of witnesses @@ -374,7 +375,7 @@ $$ \text{Lt}(e)>\text{Lt}(e')\quad\text{whenever $e\succ e'$ (i.e., $e$ logically depends on $e'$),} \quad (9) $$ -without insisting that $\text{Lt}(e)$ be the smallest non-negative integer with this property. In such cases we can speak about *relaxed* logical time, as opposed to the *strict* logical time defined above (cf. [1.4.1](#1-4-1-logical-time)). Notice, however, that the condition ([9](#1-4-2-a-relaxed-variant-of-logical-time)) is a fundamental property of logical time and cannot be relaxed further. +without insisting that $\text{Lt}(e)$ be the smallest non-negative integer with this property. In such cases we can speak about *relaxed* logical time, as opposed to the *strict* [logical time](#1-4-1-logical-time). Notice, however, that the condition ([9](#1-4-2-a-relaxed-variant-of-logical-time)) is a fundamental property of logical time and cannot be relaxed further. ### 1.4.3. Logical time intervals @@ -410,9 +411,9 @@ In particular, if $C$ consists of atomic events $e_1$, ..., $e_n$, then $\text{L ### 1.4.5. Strict, or minimal, logical time intervals -One can assign to any finite collection of atomic events $E=\{e\}$ related by a causality relation (partial order) $\prec$, and all subsets $C\subset E$, *minimal* logical time intervals. 
That is, among all assignments of logical time intervals satisfying the conditions listed in [1.4.4](#1-4-4-requirements-for-logical-time-intervals), we choose the one having all $\text{Lt}^+(C)-\text{Lt}^-(C)$ as small as possible, and if several assignments with this property exist, we choose the one that has the minimum $\text{Lt}^-(C)$ as well. +One can assign to any finite collection of atomic events $E=\{e\}$ related by a causality relation (partial order) $\prec$, and all subsets $C\subset E$, *minimal* logical time intervals. That is, among all assignments of logical time intervals satisfying the [requirements for logical time intervals](#1-4-4-requirements-for-logical-time-intervals), we choose the one having all $\text{Lt}^+(C)-\text{Lt}^-(C)$ as small as possible, and if several assignments with this property exist, we choose the one that has the minimum $\text{Lt}^-(C)$ as well. -Such an assignment can be achieved by first assigning logical time $\text{Lt}(e)$ to all atomic events $e\in E$ as described in [1.4.1](#1-4-1-logical-time), then setting $\text{Lt}^-(C):=\inf_{e\in C}\text{Lt}(e)$ and $\text{Lt}^+(C):=1+\sup_{e\in C}\text{Lt}(e)$ for any $C\subset E$. +Such an assignment can be achieved by first assigning [logical time](#1-4-1-logical-time) $\text{Lt}(e)$ to all atomic events $e\in E$, then setting $\text{Lt}^-(C):=\inf_{e\in C}\text{Lt}(e)$ and $\text{Lt}^+(C):=1+\sup_{e\in C}\text{Lt}(e)$ for any $C\subset E$. In most cases when we need to assign logical time intervals, we use the minimal logical time intervals just described. @@ -420,7 +421,7 @@ In most cases when we need to assign logical time intervals, we use the minimal The TON Blockchain assigns logical time and logical time intervals to several of its components. 
-For instance, each outbound message created in a transaction is assigned its *logical creation time*; for this purpose, the creation of an outbound message is considered an atomic event, logically dependent on the previous message created by the same transaction, as well as on the previous transaction of the same account, on the inbound message processed by the same transaction, and on all events contained in the blocks referred to by hashes contained in the block with the same transaction. As a consequence, *outbound messages created by the same smart contract have strictly increasing logical creation times*. The transaction itself is considered a collection of atomic events, and is assigned a logical time interval (cf. [4.2.1](#4-2-1-logical-time-of-a-transaction) for a more precise description). +For instance, each outbound message created in a transaction is assigned its *logical creation time*; for this purpose, the creation of an outbound message is considered an atomic event, logically dependent on the previous message created by the same transaction, as well as on the previous transaction of the same account, on the inbound message processed by the same transaction, and on all events contained in the blocks referred to by hashes contained in the block with the same transaction. As a consequence, *outbound messages created by the same smart contract have strictly increasing logical creation times*. The transaction itself is considered a collection of atomic events, and is assigned a [logical time interval](#1-4-3-logical-time-intervals). Each block is a collection of transaction and message creation events, so it is assigned a logical time interval, explicitly mentioned in the header of the block. @@ -428,13 +429,13 @@ Each block is a collection of transaction and message creation events, so it is ## 1.5 Total blockchain state -This section discusses the total state of the TON Blockchain, as well as the states of separate shardchains and the masterchain. 
For example, the precise definition of the state of the neighboring shardchains becomes crucial for correctly formalizing the consistency condition asserting that the validators for a shardchain must import the oldest messages from the union of *OutMsgQueues* taken from the states of all neighboring shardchains (cf. [2.2.5](#2-2-5-logical-time-monotonicity%3A-importing-the-oldest-message-from-the-neighbors)). +This section discusses the total state of the TON Blockchain, as well as the states of separate shardchains and the masterchain. For example, the precise definition of the state of the neighboring shardchains becomes crucial for correctly formalizing the consistency condition asserting that the validators for a shardchain must [import the oldest messages](#2-2-5-logical-time-monotonicity%3A-importing-the-oldest-message-from-the-neighbors) from the union of *OutMsgQueues* taken from the states of all neighboring shardchains. ### 1.5.1. Total state defined by a masterchain block Every masterchain block contains a list of all currently active shards and of the latest blocks for each of them. In this respect, *every masterchain block defines the corresponding total state of the TON Blockchain, since it fixes the state of every shardchain, and of the masterchain as well*. -An important requirement imposed on this list of the latest blocks for all shardchain blocks is that, if a masterchain block $B$ lists $S$ as the latest block of some shardchain, and a newer masterchain block $B'$, with $B$ as one of its antecessors, lists $S'$ as the latest block of the same shardchain, then $S$ must be one of the antecessors of $S'$.[10](#fn10) This condition makes the total state of the TON blockchain defined by a subsequent masterchain block $B'$ compatible with the total state defined by a previous block $B$. 
+An important requirement imposed on this list of the latest blocks for all shardchain blocks is that, if a masterchain block $B$ lists $S$ as the latest block of some shardchain, and a newer masterchain block $B'$, with $B$ as one of its antecessors, lists $S'$ as the latest block of the same shardchain, then $S$ must be one of the antecessors of $S'$.[10](#fn10) This condition makes the total state of the TON blockchain defined by a subsequent masterchain block $B'$ compatible with the total state defined by a previous block $B$. ### 1.5.2. Total state defined by a shardchain block @@ -446,7 +447,7 @@ In particular, when we say that a block *must* import in its *InMsgDescr* the me ## 1.6 Configurable parameters and smart contracts -Recall that the TON Blockchain has several so-called "configurable parameters" (cf. [[3](#ref-3)]), which are either certain values or certain smart contracts residing in the masterchain. This section discusses the storage of and access to these configurable parameters. +Recall that the TON Blockchain has several so-called ["configurable parameters"](/foundations/whitepapers/ton#2-1-21-configurable-parameters), which are either certain values or certain smart contracts residing in the masterchain. This section discusses the storage of and access to these configurable parameters. ### 1.6.1. Examples of configurable parameters @@ -478,7 +479,7 @@ Similarly, the configuration smart contract $\gamma$ may define some "ordinary" ### 1.6.6. Values obtained by get methods may be different from those obtained through the block header -Notice that the state of the configuration smart contract $\gamma$, including the values of configurable parameters, may change several times inside a masterchain block, if there are several transactions processed by $\gamma$ in that block. 
As a consequence, the values obtained by invoking get methods of $\gamma$, or sending get messages to $\gamma$, may be different from those obtained by inspecting the reference in the block header (cf. [1.6.3](#1-6-3-quick-access-through-the-header-of-masterchain-blocks)), which refers to the *final* state of the configurable parameters in the block. +Notice that the state of the configuration smart contract $\gamma$, including the values of configurable parameters, may change several times inside a masterchain block, if there are several transactions processed by $\gamma$ in that block. As a consequence, the values obtained by invoking get methods of $\gamma$, or sending get messages to $\gamma$, may be different from those obtained by inspecting the reference in the [block header](#1-6-3-quick-access-through-the-header-of-masterchain-blocks), which refers to the *final* state of the configurable parameters in the block. ### 1.6.7. Changing the values of configurable parameters @@ -488,7 +489,7 @@ Some parameters, such as the current set of validators, cannot be changed in thi ### 1.6.8. Changing the validator election procedure -If the validator election procedure ever needs to be changed, this can be accomplished by first committing a new validator election smart contract into the masterchain, and then changing the ordinary configurable parameter containing the address $\nu$ of the validator election smart contract. This will require two-thirds of the validators to accept the proposal in a vote as described above in [1.6.7](#1-6-7-changing-the-values-of-configurable-parameters). +If the validator election procedure ever needs to be changed, this can be accomplished by first committing a new validator election smart contract into the masterchain, and then changing the ordinary configurable parameter containing the address $\nu$ of the validator election smart contract. 
This will require two-thirds of the validators to [accept the proposal in a vote](#1-6-7-changing-the-values-of-configurable-parameters). ### 1.6.9. Changing the procedure of changing configurable parameters @@ -510,7 +511,7 @@ The mechanisms for creating new smart contracts and assigning their addresses de ### 1.7.2. Transferring cryptocurrency to uninitialized accounts -First of all, *it is possible to send messages, including value-bearing messages, to previously unmentioned accounts*. If an inbound message arrives at a shardchain with a destination address $\eta$ corresponding to an undefined account, it is processed by a transaction as if the code of the smart contract were empty (i.e., consisting of an implicit $\texttt{RET}$). If the message is value-bearing, this leads to the creation of an "uninitialized account", which may have a non-zero balance (if value-bearing messages have been sent to it),[11](#fn11) but has no code and no data. Because even an uninitialized account occupies some persistent storage (needed to hold its balance), some small persistent-storage payments will be exacted from time to time from the account's balance, until it becomes negative. +First of all, *it is possible to send messages, including value-bearing messages, to previously unmentioned accounts*. If an inbound message arrives at a shardchain with a destination address $\eta$ corresponding to an undefined account, it is processed by a transaction as if the code of the smart contract were empty (i.e., consisting of an implicit $\texttt{RET}$). If the message is value-bearing, this leads to the creation of an "uninitialized account", which may have a non-zero balance (if value-bearing messages have been sent to it),[11](#fn11) but has no code and no data. 
Because even an uninitialized account occupies some persistent storage (needed to hold its balance), some small persistent-storage payments will be exacted from time to time from the account's balance, until it becomes negative. ### 1.7.3. Initializing smart contracts by constructor messages @@ -524,7 +525,7 @@ Notice that the constructor message usually must bear some value, which will be ### 1.7.5. Creating smart contracts by external constructor messages -In some cases, it is necessary to create a smart contract by a constructor message that cannot bear any value—for instance, by a constructor message "from nowhere" (an external inbound message). Then one should first transfer a sufficient amount of funds to the uninitialized smart contract as explained in [1.7.2](#1-7-2-transferring-cryptocurrency-to-uninitialized-accounts), and only then send a constructor message "from nowhere". +In some cases, it is necessary to create a smart contract by a constructor message that cannot bear any value—for instance, by a constructor message "from nowhere" (an external inbound message). Then one should first transfer a sufficient amount of funds to the [uninitialized smart contract](#1-7-2-transferring-cryptocurrency-to-uninitialized-accounts), and only then send a constructor message "from nowhere". ### 1.7.6. Example: creating a cryptocurrency wallet smart contract @@ -536,7 +537,7 @@ An example of the above situation is provided by cryptocurrency wallet applicati - The wallet application can inspect the shardchain containing account $\xi$ (in the case of a basic workchain account) or the masterchain (in the case of a masterchain account), either by itself or using a blockchain explorer, and check the balance of $\xi$. - If the balance is sufficient, the wallet application may create and sign (with the user's private key) the constructor message ("from nowhere"), and submit it for inclusion to the validators or the collators for the corresponding blockchain. 
- Once the constructor message is included into a block of the blockchain and processed by a transaction, the wallet smart contract is finally created. -- When the user wants to transfer some funds to some other user or smart contract $\eta$, or wants to send a value-bearing message to $\eta$, she uses her wallet application to create the message $m$ that she wants her wallet smart contract $\xi$ to send to $\eta$, envelope $m$ into a special "message from nowhere" $m'$ with destination $\xi$, and sign $m'$ with her private key. Some provisions against replay attacks must be made, as explained in [2.2.1](#2-2-1-message-uniqueness). +- When the user wants to transfer some funds to some other user or smart contract $\eta$, or wants to send a value-bearing message to $\eta$, she uses her wallet application to create the message $m$ that she wants her wallet smart contract $\xi$ to send to $\eta$, envelope $m$ into a special "message from nowhere" $m'$ with destination $\xi$, and sign $m'$ with her private key. Some provisions against [replay attacks](#2-2-1-message-uniqueness) must be made. - The wallet smart contract receives message $m'$ and checks the validity of the signature with the aid of the public key stored in its persistent data. If the signature is correct, it extracts embedded message $m$ from $m'$ and sends it to its intended destination $\eta$, with the indicated amount of funds attached to it. - If the user does not need to immediately start transferring funds, but only wants to passively receive some funds, she may keep her account uninitialized as long as she wants (provided the persistent storage payments do not lead to the exhaustion of its balance), thus minimizing the storage profile and persistent storage payments of the account. 
- Notice that the wallet application may create for the human user the illusion that the funds are kept in the application itself, and provide an interface to transfer funds or send arbitrary messages "directly" from the user's account $\xi$. In reality, all these operations will be performed by the user's wallet smart contract, which effectively acts as a proxy for such requests. We see that a cryptocurrency wallet is a simple example of a *mixed* application, having an on-chain part (the wallet smart contract, used as a proxy for outbound messages) and an off-chain part (the external wallet application running on a user's device and keeping the private account key). @@ -561,7 +562,7 @@ This section explains how the code and state of a smart contract may be changed, The persistent data of a smart contract is usually modified as a result of executing the code of the smart contract in TVM while processing a transaction, triggered by an inbound message to the smart contract. More specifically, the code of the smart contract has access to the old persistent storage of the smart contract via TVM control register $\texttt{c4}$, and may modify the persistent storage by storing another value into $\texttt{c4}$ before normal termination. -Normally, there are no other ways to modify the data of an existing smart contract. If the code of the smart contract does not provide any ways to modify the persistent data (e.g., if it is a simple wallet smart contract as described in [1.7.6](#1-7-6-example%3A-creating-a-cryptocurrency-wallet-smart-contract), which initializes the persistent data with the user's public key and does not intend to ever change it), then it will be effectively immutable—unless the code of the smart contract is modified first. +Normally, there are no other ways to modify the data of an existing smart contract. 
If the code of the smart contract does not provide any ways to modify the persistent data (e.g., if it is a simple [wallet smart contract](#1-7-6-example%3A-creating-a-cryptocurrency-wallet-smart-contract), which initializes the persistent data with the user's public key and does not intend to ever change it), then it will be effectively immutable—unless the code of the smart contract is modified first. ### 1.8.2. Modification of the code of a smart contract @@ -571,13 +572,13 @@ Typically, if the developer of a smart contract wants to be able to upgrade its ### 1.8.3. Keeping the code or data of the smart contract outside the blockchain -The code or data of the smart contract may be kept outside the blockchain and be represented only by their hashes. In such cases, only empty inbound messages may be processed, as well as messages carrying a correct copy of the smart-contract code (or its portion relevant for processing the specific message) and its data inside special fields. An example of such a situation is given by the uninitialized smart contracts and constructor messages described in [1.7](#1-7-new-smart-contracts-and-their-addresses). +The code or data of the smart contract may be kept outside the blockchain and be represented only by their hashes. In such cases, only empty inbound messages may be processed, as well as messages carrying a correct copy of the smart-contract code (or its portion relevant for processing the specific message) and its data inside special fields. An example of such a situation is given by the [uninitialized smart contracts and constructor messages](#1-7-new-smart-contracts-and-their-addresses). ### 1.8.4. Using code libraries -Some smart contracts may share the same code, but use different data. One example of this is wallet smart contracts (cf. 
[1.7.6](#1-7-6-example%3A-creating-a-cryptocurrency-wallet-smart-contract)), which are likely to use the same code (throughout all wallets created by the same software), but with different data (because each wallet must use its own pair of cryptographic keys). In this case, the code for all the wallet smart contracts is best committed by the developer into a shared *library*; this library would reside in the masterchain, and be referred to by its hash using a special "external library cell reference" as the root of the code of each wallet smart contract (or as a subtree inside that code). +Some smart contracts may share the same code, but use different data. One example of this is [wallet smart contracts](#1-7-6-example%3A-creating-a-cryptocurrency-wallet-smart-contract), which are likely to use the same code (throughout all wallets created by the same software), but with different data (because each wallet must use its own pair of cryptographic keys). In this case, the code for all the wallet smart contracts is best committed by the developer into a shared *library*; this library would reside in the masterchain, and be referred to by its hash using a special "external library cell reference" as the root of the code of each wallet smart contract (or as a subtree inside that code). -Notice that even if the library code becomes unavailable—for example, because its developer stops paying for its storage in the masterchain—it is still possible to use the smart contracts referring to this library, either by committing the library again into the masterchain, or by including its relevant parts inside a message sent to the smart contract. This external cell reference resolution mechanism is discussed in more detail later in [4.4.3](#4-4-3-smart-contract-library-environment). 
+Notice that even if the library code becomes unavailable—for example, because its developer stops paying for its storage in the masterchain—it is still possible to use the smart contracts referring to this library, either by committing the library again into the masterchain, or by including its relevant parts inside a message sent to the smart contract. This [external cell reference](#4-4-3-smart-contract-library-environment) resolution mechanism is discussed in more detail later. ### 1.8.5. Destroying smart contracts @@ -617,9 +618,9 @@ Any message has both a *source address* and a *destination address*. Its source Some messages can have no source or no destination address (though at least one of them must be present), as indicated by special flags in the message header. Such messages are the *external messages* intended for the interaction of the TON Blockchain with the outside world—human users and their cryptowallet applications, off-chain and mixed applications and services, other blockchains, and so on. -External messages are never routed inside the TON Blockchain. Instead, "messages from nowhere" (i.e., with no source address) are directly included into the *InMsgDescr* of a destination shardchain block (provided some conditions are met) and processed by a transaction in that very block. Similarly, "messages to nowhere" (i.e., with no TON Blockchain destination address), also known as *log messages*, are also present only in the block containing the transaction that generated such a message.[12](#fn12) +External messages are never routed inside the TON Blockchain. Instead, "messages from nowhere" (i.e., with no source address) are directly included into the *InMsgDescr* of a destination shardchain block (provided some conditions are met) and processed by a transaction in that very block. 
Similarly, "messages to nowhere" (i.e., with no TON Blockchain destination address), also known as *log messages*, are also present only in the block containing the transaction that generated such a message.[12](#fn12) -Therefore, external messages are almost irrelevant for the discussion of message routing and message delivery guarantees. In fact, the message delivery guarantees for outbound external messages are trivial (at most, the message must be included into the *LogMsg* part of the block), and for inbound external messages there are none, since the validators of a shardchain block are free to include or ignore suggested inbound external messages at their discretion (e.g., according to the processing fee offered by the message).[13](#fn13) +Therefore, external messages are almost irrelevant for the discussion of message routing and message delivery guarantees. In fact, the message delivery guarantees for outbound external messages are trivial (at most, the message must be included into the *LogMsg* part of the block), and for inbound external messages there are none, since the validators of a shardchain block are free to include or ignore suggested inbound external messages at their discretion (e.g., according to the processing fee offered by the message).[13](#fn13) In what follows, we focus on "usual" or "internal" messages, which have both a source and a destination address. @@ -627,7 +628,7 @@ In what follows, we focus on "usual" or "internal" messages, which have both a s When a message needs to be routed through intermediate shardchains before reaching its intended destination, it is assigned a *transit address* and a *next-hop address* in addition to the (immutable) source and destination addresses. 
When a copy of the message resides inside a transit shardchain awaiting its relay to its next hop, the *transit address* is its intermediate address lying in the transit shardchain, as if belonging to a special message-relay smart contract whose only job is to relay the unchanged message to the next shardchain on the route. The *next-hop address* is the address in a neighboring shardchain (or, on some rare occasions, in the same shardchain) to which the message needs to be relayed. After the message is relayed, the next-hop address usually becomes the transit address of the copy of the message included in the next shardchain. -Immediately after an outbound message is created in a shardchain (or in the masterchain), its transit address is set to its source address.[14](#fn14) +Immediately after an outbound message is created in a shardchain (or in the masterchain), its transit address is set to its source address.[14](#fn14) ### 2.1.5. Computation of the next-hop address for hypercube routing @@ -663,19 +664,19 @@ That said, we tacitly ignore the existence of anycast addresses and the addition ### 2.1.8 Hamming optimality of the next-hop address algorithm -Notice that the specific hypercube routing next-hop computation algorithm explained in [2.1.5](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) may potentially be replaced by another algorithm, provided it satisfies certain properties. One of these properties is the *Hamming optimality*, meaning that the Hamming ($L_1$) distance from $\xi$ to $\eta$ equals the sum of Hamming distances from $\xi$ to $\text{NextHop}(\xi,\eta)$ and from $\text{NextHop}(\xi,\eta)$ to $\eta$: +Notice that the specific [hypercube routing next-hop computation algorithm](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) may potentially be replaced by another algorithm, provided it satisfies certain properties. 
One of these properties is the *Hamming optimality*, meaning that the Hamming ($L_1$) distance from $\xi$ to $\eta$ equals the sum of Hamming distances from $\xi$ to $\text{NextHop}(\xi,\eta)$ and from $\text{NextHop}(\xi,\eta)$ to $\eta$: $$ \|\xi-\eta\|_1=\bigl\|\xi-\text{NextHop}(\xi,\eta)\bigr\|_1+\bigl\|\text{NextHop}(\xi,\eta)-\eta\bigr\|_1 \quad (14) $$ -Here $\|\xi-\eta\|_1$ is the Hamming distance between $\xi$ and $\eta$, equal to the number of bit positions in which $\xi$ and $\eta$ differ:[15](#fn15) +Here $\|\xi-\eta\|_1$ is the Hamming distance between $\xi$ and $\eta$, equal to the number of bit positions in which $\xi$ and $\eta$ differ:[15](#fn15) $$ \|\xi-\eta\|_1=\sum_i|\xi_i-\eta_i| \quad (15) $$ -Notice that in general one should expect only an inequality in ([14](#2-1-8-hamming-optimality-of-the-next-hop-address-algorithm)), following from the triangle inequality for the $L_1$-metric. Hamming optimality essentially means that $\text{NextHop}(\xi,\eta)$ lies on one of the (Hamming) shortest paths from $\xi$ to $\eta$. It can also be expressed by saying that $\nu=\text{NextHop}(\xi,\eta)$ is always obtained from $\xi$ by changing the values of bits at some positions to their values in $\eta$: for any bit position $i$, we have $\nu_i=\xi_i$ or $\nu_i=\eta_i$.[16](#fn16) +Notice that in general one should expect only an inequality in ([14](#2-1-8-hamming-optimality-of-the-next-hop-address-algorithm)), following from the triangle inequality for the $L_1$-metric. Hamming optimality essentially means that $\text{NextHop}(\xi,\eta)$ lies on one of the (Hamming) shortest paths from $\xi$ to $\eta$. It can also be expressed by saying that $\nu=\text{NextHop}(\xi,\eta)$ is always obtained from $\xi$ by changing the values of bits at some positions to their values in $\eta$: for any bit position $i$, we have $\nu_i=\xi_i$ or $\nu_i=\eta_i$.[16](#fn16) ### 2.1.9. 
Non-stopping of NextHop @@ -691,7 +692,7 @@ This convexity property is important for some proofs related to message forwardi ### 2.1.11. Internal routing -Notice that the next-hop address computed according to the rules defined in [2.1.5](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) may belong to the same shardchain as the current one (i.e., the one containing the transit address). In that case, the "internal routing" occurs immediately, the transit address is replaced by the value of the computed next-hop address, and the next-hop address computation step is repeated until a next-hop address lying outside the current shardchain is obtained. The message is then kept in the transit output queue according to its computed next-hop address, with its last computed transit address as the "intermediate owner" of the transit message. If the current shardchain splits into two shardchains before the message is forwarded further, it is the shardchain containing the intermediate owner that inherits this transit message. +Notice that the [next-hop address computed](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) according to the rules may belong to the same shardchain as the current one (i.e., the one containing the transit address). In that case, the "internal routing" occurs immediately, the transit address is replaced by the value of the computed next-hop address, and the next-hop address computation step is repeated until a next-hop address lying outside the current shardchain is obtained. The message is then kept in the transit output queue according to its computed next-hop address, with its last computed transit address as the "intermediate owner" of the transit message. If the current shardchain splits into two shardchains before the message is forwarded further, it is the shardchain containing the intermediate owner that inherits this transit message. 
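The internal routing loop can be sketched as follows. This is a simplified model under stated assumptions, not the algorithm of 2.1.5: addresses are reduced to a 64-bit integer (the `workchain_id` part is omitted), every hop advances the number of destination bits `k` by a fixed 4-bit step (borrowed from the remark that $k'=k+4$ for $k\geq32$), and a shard is identified by a binary prefix. The names `intermediate`, `in_shard`, and `route_internally` are hypothetical and exist only for this illustration.

```python
ADDR_BITS = 64    # assumption: only the 64-bit account prefix, no workchain_id
DIGIT_BITS = 4    # assumption: each hop fixes 4 more bits of the destination
MASK = (1 << ADDR_BITS) - 1


def intermediate(src: int, dest: int, k: int) -> int:
    """Address equal to dest in its first k bits and to src in the remaining bits."""
    low = (1 << (ADDR_BITS - k)) - 1          # mask of the remaining (low) bits
    return (dest & ~low & MASK) | (src & low)


def in_shard(addr: int, prefix: int, prefix_len: int) -> bool:
    """True if the top prefix_len bits of addr equal the shard prefix."""
    return addr >> (ADDR_BITS - prefix_len) == prefix


def route_internally(src: int, dest: int, k: int, prefix: int, prefix_len: int):
    """Repeat the next-hop step while the next hop stays in the current shard.

    Assumes intermediate(src, dest, k) already lies in the current shard.
    Returns (k_transit, k_next_hop). k_next_hop is None when the destination
    itself lies in the current shard, so the message is processed here.
    """
    while k < ADDR_BITS:
        k_next = min(k + DIGIT_BITS, ADDR_BITS)
        if not in_shard(intermediate(src, dest, k_next), prefix, prefix_len):
            # The last address computed inside the shard is the
            # "intermediate owner"; the message is queued towards k_next.
            return k, k_next
        k = k_next                             # internal routing: keep going
    return k, None


# Shard with 8-bit prefix 0xA5; the destination lies in another shard.
src = 0xA511111111111111
dest = 0xA722222222222222
k_tr, k_nh = route_internally(src, dest, 0, 0xA5, 8)
transit = intermediate(src, dest, k_tr)
next_hop = intermediate(src, dest, k_nh)
```

In this sketch the first step (from `k = 0` to `k = 4`) stays inside the shard because the source and destination agree in their first four bits, so it is performed internally; the second step leaves the shard, and the message would be queued with the last in-shard address as its intermediate owner. Storing only `k` instead of full addresses mirrors the compact representation discussed in 2.1.15.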
Alternatively, we might go on computing the next-hop addresses only to find out that the destination address already belongs to the current shardchain. In that case, the message will be processed (by a transaction) inside this shardchain instead of being forwarded further. @@ -703,24 +704,24 @@ The masterchain is also included in this definition, as if it were the only shar ### 2.1.13. Any shard is a neighbor of itself -Notice that a shardchain is always considered a neighbor of itself. This may seem redundant, because we always repeat the next-hop computation described in [2.1.5](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) until we obtain a next-hop address outside the current shardchain (cf. [2.1.11](#2-1-11-internal-routing)). However, there are at least two reasons for such an arrangement: +Notice that a shardchain is always considered a neighbor of itself. This may seem redundant, because we always repeat the [next-hop computation](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) until we obtain a next-hop address [outside the current shardchain](#2-1-11-internal-routing). However, there are at least two reasons for such an arrangement: -- Some messages have the source and the destination address inside the same shardchain, at least when the message is created. However, if such a message is not processed immediately in the same block where it has been created, it must be added to the outbound message queue of its shardchain, and be imported as an inbound message (with an entry in the *InMsgDescr*) in one of the subsequent blocks of the same shardchain.[17](#fn17) +- Some messages have the source and the destination address inside the same shardchain, at least when the message is created. 
However, if such a message is not processed immediately in the same block where it has been created, it must be added to the outbound message queue of its shardchain, and be imported as an inbound message (with an entry in the *InMsgDescr*) in one of the subsequent blocks of the same shardchain.[17](#fn17) - Alternatively, the next-hop address may originally be in some other shardchain that later gets merged with the current shardchain, so that the next hop becomes inside the same shardchain. Then the message will have to be imported from the outbound message queue of the merged shardchain, and forwarded or processed accordingly to its next-hop address, even though they reside now inside the same shardchain. ### 2.1.14. Hypercube Routing and the ISP Ultimately, the Infinite Sharding Paradigm (ISP) applies here: a shardchain should be considered a provisional union of accountchains, grouped together solely to minimize the block generation and transmission overhead. -The forwarding of a message runs through several intermediate accountchains, some of which can happen to lie in the same shard. In this case, once a message reaches an accountchain lying in this shard, it is immediately ("internally") routed inside that shard until the last accountchain lying in the same shard is reached (cf. [2.1.11](#2-1-11-internal-routing)). Then the message is enqueued in the output queue of that last accountchain.[18](#fn18) +The forwarding of a message runs through several intermediate accountchains, some of which can happen to lie in the same shard. In this case, once a message reaches an accountchain lying in this shard, it is immediately ["internally" routed](#2-1-11-internal-routing) inside that shard until the last accountchain lying in the same shard is reached. Then the message is enqueued in the output queue of that last accountchain.[18](#fn18) ### 2.1.15. 
Representation of transit and next-hop addresses Notice that the transit and next-hop addresses differ from the source address only in the $\mathit{workchain\_id}$ and in the first (most significant) 64 bits of the account address. Therefore, they may be represented by 96-bit strings. Furthermore, their $\mathit{workchain\_id}$ usually coincides with the $\mathit{workchain\_id}$ of either the source address or the destination address; a couple of bits may be used to indicate this situation, thus further reducing the space required to represent the transit and next-hop addresses. -In fact, the required storage may be reduced even further by observing that the specific hypercube routing algorithm described in [2.1.5](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) always generates intermediate (i.e., transit and next-hop) addresses that coincide with the destination address in their first $k$ bits, and with the source address in their remaining bits. Therefore, one might use just the values $0\leq k_{\text{tr}},k_{\text{nh}}\leq 96$ to fully specify the transit and next-hop addresses. One might also notice that $k':=k_{\text{nh}}$ turns out to be a fixed function of $k:=k_{\text{tr}}$ (for instance, $k'=k+n_2=k+4$ for $k\geq32$), and therefore include only one 7-bit value of $k$ in the serialization. +In fact, the required storage may be reduced even further by observing that the specific [hypercube routing algorithm](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) always generates intermediate (i.e., transit and next-hop) addresses that coincide with the destination address in their first $k$ bits, and with the source address in their remaining bits. Therefore, one might use just the values $0\leq k_{\text{tr}},k_{\text{nh}}\leq 96$ to fully specify the transit and next-hop addresses. 
One might also notice that $k':=k_{\text{nh}}$ turns out to be a fixed function of $k:=k_{\text{tr}}$ (for instance, $k'=k+n_2=k+4$ for $k\geq32$), and therefore include only one 7-bit value of $k$ in the serialization. -Such optimizations have the obvious disadvantage that they rely too much on the specific routing algorithm used, which can be changed in the future, so they are used in [3.1.15](#3-1-15-enveloped-messages) with a provision to specify more general intermediate addresses if necessary. +Such optimizations have the obvious disadvantage that they rely too much on the specific routing algorithm used, which can be changed in the future, so they are used in [Enveloped messages](#3-1-15-enveloped-messages) with a provision to specify more general intermediate addresses if necessary. ### 2.1.16. Message envelopes @@ -728,25 +729,27 @@ The transit and next-hop addresses of a forwarded message are not included in th In the representation of a block as a tree, or rather a DAG, of cells, the two different envelopes will contain references to a shared cell with the original message. If the message is large, this arrangement avoids the need to keep more than one copy of the message in the block. +--- + ## 2.2 Hypercube Routing protocol This section exposes the details of the hypercube routing protocol employed by the TON Blockchain to achieve guaranteed delivery of messages between smart contracts residing in arbitrary shardchains. For the purposes of this document, we will refer to the variant of hypercube routing employed by the TON Blockchain as Hypercube Routing (HR). ### 2.2.1. Message uniqueness -Before continuing, let us observe that any (internal) message is *unique*. Recall that a message contains its full source address along with its logical creation time, and all outbound messages created by the same smart contract have strictly increasing logical creation times (cf. 
[1.4.6](#1-4-6-logical-time-in-the-ton-blockchain)); therefore, the combination of the full source address and the logical creation time uniquely defines the message. Since we assume the chosen hash function SHA-256 to be collision resistant, *a message is uniquely determined by its hash*, so we can identify two messages if we know that their hashes coincide. +Before continuing, let us observe that any (internal) message is *unique*. Recall that a message contains its full source address along with its logical creation time, and all outbound messages created by the same smart contract have strictly increasing [logical creation times](#1-4-6-logical-time-in-the-ton-blockchain); therefore, the combination of the full source address and the logical creation time uniquely defines the message. Since we assume the chosen hash function SHA-256 to be collision resistant, *a message is uniquely determined by its hash*, so we can identify two messages if we know that their hashes coincide. This does not extend to external messages "from nowhere", which have no source addresses. Special care must be taken to prevent replay attacks related to such messages, especially by designers of user wallet smart contracts. One possible solution is to include a sequence number in the body of such messages, and keep the count of external messages already processed inside the smart-contract persistent data, refusing to process an external message if its sequence number differs from this count. ### 2.2.2. Identifying messages with equal hashes -The TON Blockchain assumes that two messages with the same hashes coincide, and treats either of them as a redundant copy of the other. As explained above in [2.2.1](#2-2-1-message-uniqueness), this does not lead to any unexpected effects for internal messages. However, if one sends two coinciding "messages from nowhere" to a smart contract, it may happen that only one of them will be delivered—or both. 
If their action is not supposed to be idempotent (i.e., if processing the message twice has a different effect from processing it once), some provisions should be made to distinguish the two messages, for instance by including a sequence number in them. +The TON Blockchain assumes that two messages with the same hashes coincide, and treats either of them as a redundant copy of the other. This does not lead to any unexpected effects for [internal messages](#2-2-1-message-uniqueness). However, if one sends two coinciding "messages from nowhere" to a smart contract, it may happen that only one of them will be delivered—or both. If their action is not supposed to be idempotent (i.e., if processing the message twice has a different effect from processing it once), some provisions should be made to distinguish the two messages, for instance by including a sequence number in them. In particular, the *InMsgDescr* and *OutMsgDescr* use the (unenveloped) message hash as a key, tacitly assuming that distinct messages have distinct hashes. In this way, one can trace the path and the fate of a message across different shardchains by looking up the message hash in the *InMsgDescr* and *OutMsgDescr* of different blocks. ### 2.2.3. The structure of *OutMsgQueue* -Recall that the outbound messages—both those created inside the shardchain, and transit messages previously imported from a neighboring shardchain to be relayed to the next-hop shardchain—are accumulated in the *OutMsgQueue*, which is part of the state of the shardchain (cf. [1.2.7](#1-2-7-outbound-message-queue-of-a-shardchain)). In contrast with *InMsgDescr* and *OutMsgDescr*, the key in *OutMsgQueue* is not the message hash, but its next-hop address—or at least its first 96 bits—concatenated with the message hash. 
+Recall that the outbound messages—both those created inside the shardchain, and transit messages previously imported from a neighboring shardchain to be relayed to the next-hop shardchain—are accumulated in the [*OutMsgQueue*](#1-2-7-outbound-message-queue-of-a-shardchain), which is part of the state of the shardchain. In contrast with *InMsgDescr* and *OutMsgDescr*, the key in *OutMsgQueue* is not the message hash, but its next-hop address—or at least its first 96 bits—concatenated with the message hash. Furthermore, the *OutMsgQueue* is not just a dictionary (hashmap), mapping its keys into (enveloped) messages. Rather, it is a *min-augmented dictionary with respect to the logical creation time*, meaning that each node of the Patricia tree representing *OutMsgQueue* has an additional value (in this case, an unsigned 64-bit integer), and that this augmentation value in each fork node is set to be equal to the minimum of the augmentation values of its children. The augmentation value of a leaf equals the logical creation time of the message contained in that leaf; it need not be stored explicitly. @@ -762,7 +765,7 @@ The first fundamental local condition of message forwarding, called *(message im > While importing messages into the *InMsgDescr* of a shardchain block from the *OutMsgQueues* of its neighboring shardchains, the validators must import the messages in the increasing order of their logical time; in the case of a tie, the message with the smaller hash is imported first. -More precisely, each shardchain block contains the hash of a masterchain block (assumed to be "the latest" masterchain block at the time of the shardchain block's creation), which in turn contains the hashes of the most recent shardchain blocks. 
In this way, each shardchain block indirectly "knows" the most recent state of all other shardchains, and especially its neighboring shardchains, including their *OutMsgQueues*.[19](#fn19) +More precisely, each shardchain block contains the hash of a masterchain block (assumed to be "the latest" masterchain block at the time of the shardchain block's creation), which in turn contains the hashes of the most recent shardchain blocks. In this way, each shardchain block indirectly "knows" the most recent state of all other shardchains, and especially its neighboring shardchains, including their *OutMsgQueues*.[19](#fn19) Now an alternative equivalent formulation of the monotonicity condition is as follows: @@ -776,7 +779,7 @@ Notice that if this condition is not fulfilled, a small Merkle proof witnessing - A path in the *OutMsgQueue* of a neighbor from the root to a certain message $m$ with small logical creation time. - A path in the *InMsgDescr* of the block under consideration showing that the key equal to $\text{Hash}(m)$ is absent in *InMsgDescr* (i.e., that $m$ has not been included in the current block). -- A proof that $m$ has not been included in a preceding block of the same shardchain, using the block header information containing the smallest and the largest logical time of all messages imported into the block (cf. [2.3.4](#2-3-4-checking-whether-a-message-has-already-been-delivered-to-its-final-destination)–[2.3.7](#2-3-7-checking-whether-an-hr-message-has-already-been-delivered-via-ihr-to-its-final-destination) for more information). 
+- A proof that $m$ has not been included in a preceding block of the same shardchain, using the block header information containing the smallest and the largest logical time of all messages imported into the block (see [2.3.4](#2-3-4-checking-whether-a-message-has-already-been-delivered-to-its-final-destination)–[2.3.7](#2-3-7-checking-whether-an-hr-message-has-already-been-delivered-via-ihr-to-its-final-destination) for more information). - A path in *InMsgDescr* to another included message $m'$, such that either $\text{Lt}(m')>\text{Lt}(m)$, or $\text{Lt}(m')=\text{Lt}(m)$ and $\text{Hash}(m')>\text{Hash}(m)$. ### 2.2.7. Deleting a message from *OutMsgQueue* @@ -785,7 +788,7 @@ A message must be deleted from *OutMsgQueue* sooner or later; otherwise, the sto ### 2.2.8. Guaranteed message delivery via Hypercube Routing -In this way, a message cannot be deleted from the outbound message queue unless it has been either relayed to its next-hop shardchain or delivered to its final destination (cf. [2.2.7](#2-2-7-deleting-a-message-from-outmsgqueue)). Meanwhile, the message import monotonicity condition (cf. [2.2.5](#2-2-5-logical-time-monotonicity%3A-importing-the-oldest-message-from-the-neighbors)) ensures that any message will sooner or later be relayed into the next shardchain, taking into account other conditions which require the validators to use at least half of the block's space or gas limits for importing inbound internal messages (otherwise the validators might choose to create empty blocks or import only external messages even in the presence of non-empty outbound message queues at their neighbors). +In this way, a message cannot be [deleted from the outbound message queue](#2-2-7-deleting-a-message-from-outmsgqueue) unless it has been either relayed to its next-hop shardchain or delivered to its final destination.
Meanwhile, the [message import monotonicity condition](#2-2-5-logical-time-monotonicity%3A-importing-the-oldest-message-from-the-neighbors) ensures that any message will sooner or later be relayed into the next shardchain, taking into account other conditions which require the validators to use at least half of the block's space or gas limits for importing inbound internal messages (otherwise the validators might choose to create empty blocks or import only external messages even in the presence of non-empty outbound message queues at their neighbors). ### 2.2.9 Message processing order @@ -793,13 +796,13 @@ When several imported messages are processed by transactions inside a block, the ### 2.2.10 FIFO guarantees of Hypercube Routing -The message processing order conditions (cf. [2.2.9](#2-2-9-message-processing-order)), along with the message import monotonicity conditions (cf. [2.2.5](#2-2-5-logical-time-monotonicity%3A-importing-the-oldest-message-from-the-neighbors)), imply the *FIFO guarantees for Hypercube Routing*. Namely, if a smart contract $\xi$ creates two messages $m$ and $m'$ with the same destination $\eta$, and $m'$ is generated later than $m$ (meaning that $m\prec m'$, hence $\text{Lt}(m)<\text{Lt}(m')$), then $m$ will be processed by $\eta$ before $m'$. This is so because both messages will follow the same routing steps on the path from $\xi$ to $\eta$ (the Hypercube Routing algorithm described in [2.1.5](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) is deterministic), and in all outbound queues and inbound message descriptions $m'$ will appear "after" $m$.[20](#fn20) +The [message processing order](#2-2-9-message-processing-order) conditions, along with the [message import monotonicity conditions](#2-2-5-logical-time-monotonicity%3A-importing-the-oldest-message-from-the-neighbors), imply the *FIFO guarantees for Hypercube Routing*. 
Namely, if a smart contract $\xi$ creates two messages $m$ and $m'$ with the same destination $\eta$, and $m'$ is generated later than $m$ (meaning that $m\prec m'$, hence $\text{Lt}(m)<\text{Lt}(m')$), then $m$ will be processed by $\eta$ before $m'$. This is so because both messages will follow the same routing steps on the path from $\xi$ to $\eta$ (the [Hypercube Routing algorithm](#2-1-5-computation-of-the-next-hop-address-for-hypercube-routing) is deterministic), and in all outbound queues and inbound message descriptions $m'$ will appear "after" $m$.[20](#fn20) If message $m'$ can be delivered to $B$ via Instant Hypercube Routing, this is not necessarily true anymore. Therefore, a simple way of ensuring FIFO message delivery discipline between a pair of smart contracts consists in setting a special bit in the message header preventing its delivery via IHR. ### 2.2.11. Delivery uniqueness guarantees of Hypercube Routing -Notice that the message import monotonicity conditions also imply the *uniqueness* of the delivery of any message via Hypercube Routing—i.e., that it cannot be imported and processed by the destination smart contract more than once. We will see later in [2.3](#2-3-instant-hypercube-routing-and-combined-delivery-guarantees) that enforcing delivery uniqueness when both Hypercube Routing and Instant Hypercube Routing are active is more complicated. +Notice that the message import monotonicity conditions also imply the *uniqueness* of the delivery of any message via Hypercube Routing—i.e., that it cannot be imported and processed by the destination smart contract more than once. Enforcing delivery uniqueness when both [Hypercube Routing and Instant Hypercube Routing](#2-3-instant-hypercube-routing-and-combined-delivery-guarantees) are active is more complicated. ### 2.2.12 An overview of Hypercube Routing @@ -809,17 +812,17 @@ Let us summarize all routing steps performed to deliver an internal message $m$ - [ImmediateProcessing?] 
— If the destination $\eta$ resides in the same shardchain $S_0$, the message may be processed in the same block it was generated in. In this case, $m$ is included into *OutMsgDescr* with a flag indicating it has been processed in this very block and need not be forwarded further. Another copy of $m$ is included into *InMsgDescr*, along with the usual data describing the processing of inbound messages. (Notice that $m$ is not included into the *OutMsgQueue* of $S_0$.) -- [InitialInternalRouting] — If $m$ either has a destination outside $S_0$, or is not processed in the same block where it was generated, the internal routing procedure described in [2.1.11](#2-1-11-internal-routing) is applied, until an index $k$ is found such that $\xi_k$ lies in $S_0$, but $\xi_{k+1}=\text{NextHop}(\xi_k,\eta)$ does not (i.e., $S_k=S_0$, but $S_{k+1}\neq S_0$). Alternatively, this process stops if $\xi_k=\eta$ or $\xi_k$ coincides with $\eta$ in its first 96 bits. +- [InitialInternalRouting] — If $m$ either has a destination outside $S_0$, or is not processed in the same block where it was generated, the [internal routing](#2-1-11-internal-routing) procedure is applied, until an index $k$ is found such that $\xi_k$ lies in $S_0$, but $\xi_{k+1}=\text{NextHop}(\xi_k,\eta)$ does not (i.e., $S_k=S_0$, but $S_{k+1}\neq S_0$). Alternatively, this process stops if $\xi_k=\eta$ or $\xi_k$ coincides with $\eta$ in its first 96 bits. -- [OutboundQueuing] — The message $m$ is included into *OutMsgDescr* (with the key equal to its hash), with an envelope containing its transit address $\xi_k$ and next-hop address $\xi_{k+1}$ as explained in [2.1.6](#2-1-16-message-envelopes) and [2.1.15](#2-1-15-representation-of-transit-and-next-hop-addresses). 
The same enveloped message is also included in the *OutMsgQueue* of the state of $S_k$, with the key equal to the concatenation of the first 96 bits of its next-hop address $\xi_{k+1}$ (which may be equal to $\eta$ if $\eta$ belongs to $S_k$) and the message hash $\text{Hash}(m)$. +- [OutboundQueuing] — The message $m$ is included into *OutMsgDescr* (with the key equal to its hash), with an [envelope](#2-1-16-message-envelopes) containing its [transit address](#2-1-15-representation-of-transit-and-next-hop-addresses) $\xi_k$ and next-hop address $\xi_{k+1}$. The same enveloped message is also included in the *OutMsgQueue* of the state of $S_k$, with the key equal to the concatenation of the first 96 bits of its next-hop address $\xi_{k+1}$ (which may be equal to $\eta$ if $\eta$ belongs to $S_k$) and the message hash $\text{Hash}(m)$. - [QueueWait] — Message $m$ waits in the *OutMsgQueue* of shardchain $S_k$ to be forwarded further. In the meantime, shardchain $S_k$ may split or merge with other shardchains; in that case, the new shard $S'_k$ containing the transit address $\xi_k$ inherits $m$ in its *OutMsgQueue*. -- [ImportInbound] — At some point in the future, the validators for the shardchain $S_{k+1}$ containing the next-hop address $\xi_{k+1}$ scan the *OutMsgQueue* in the state of shardchain $S_k$ and decide to import message $m$ in keeping with the monotonicity condition (cf. [2.2.5](#2-2-5-logical-time-monotonicity%3A-importing-the-oldest-message-from-the-neighbors)) and other conditions. A new block for shardchain $S_{k+1}$ is generated, with an enveloped copy of $m$ included in its *InMsgDescr*. The entry in *InMsgDescr* contains also the reason for importing $m$ into this block, with a hash of the most recent block of shardchain $S'_k$, and the previous next-hop and transit addresses $\xi_k$ and $\xi_{k+1}$, so that the corresponding entry in the *OutMsgQueue* of $S'_k$ can be easily located. 
+- [ImportInbound] — At some point in the future, the validators for the shardchain $S_{k+1}$ containing the next-hop address $\xi_{k+1}$ scan the *OutMsgQueue* in the state of shardchain $S_k$ and decide to import message $m$ in keeping with the [monotonicity condition](#2-2-5-logical-time-monotonicity%3A-importing-the-oldest-message-from-the-neighbors) and other conditions. A new block for shardchain $S_{k+1}$ is generated, with an enveloped copy of $m$ included in its *InMsgDescr*. The entry in *InMsgDescr* also contains the reason for importing $m$ into this block, with a hash of the most recent block of shardchain $S'_k$, and the previous transit and next-hop addresses $\xi_k$ and $\xi_{k+1}$, so that the corresponding entry in the *OutMsgQueue* of $S'_k$ can be easily located. - [Confirmation] — This entry in the *InMsgDescr* of $S_{k+1}$ also serves as a confirmation for $S'_k$. In a later block of $S'_k$, message $m$ must be removed from the *OutMsgQueue* of $S'_k$; this modification is reflected in a special entry in the *OutMsgDescr* of the block of $S'_k$ that performs this state modification. -- [Forwarding?] — If the final destination $\eta$ of $m$ does not reside in $S_{k+1}$, the message is *forwarded*. Hypercube Routing is applied until some $\xi_l$, $l>k$, and $\xi_{l+1}=\text{NextHop}(\xi_l,\eta)$ are obtained, such that $\xi_l$ lies in $S_{k+1}$, but $\xi_{l+1}$ does not (cf. [2.1.11](#2-1-11-internal-routing)). After that, a newly-enveloped copy of $m$ with transit address set to $\xi_l$ and next-hop address $\xi_{l+1}$ is included into both the *OutMsgDescr* of the current block of $S_{k+1}$ and the *OutMsgQueue* of the new state of $S_{k+1}$. The entry of $m$ in *InMsgDescr* contains a flag indicating that the message has been forwarded; the entry in *OutMsgDescr* contains the newly-enveloped message and a flag indicating that this is a forwarded message. Then all the steps starting from [OutboundQueueing] are repeated, for $l$ instead of $k$.
+- [Forwarding?] — If the final destination $\eta$ of $m$ does not reside in $S_{k+1}$, the message is *forwarded*. Hypercube Routing is applied until some $\xi_l$, $l>k$, and $\xi_{l+1}=\text{NextHop}(\xi_l,\eta)$ are obtained, such that $\xi_l$ lies in $S_{k+1}$, but $\xi_{l+1}$ [does not](#2-1-11-internal-routing). After that, a newly-enveloped copy of $m$ with transit address set to $\xi_l$ and next-hop address $\xi_{l+1}$ is included into both the *OutMsgDescr* of the current block of $S_{k+1}$ and the *OutMsgQueue* of the new state of $S_{k+1}$. The entry of $m$ in *InMsgDescr* contains a flag indicating that the message has been forwarded; the entry in *OutMsgDescr* contains the newly-enveloped message and a flag indicating that this is a forwarded message. Then all the steps starting from [OutboundQueuing] are repeated, for $l$ instead of $k$. - [Processing?] — If the final destination $\eta$ of $m$ resides in $S_{k+1}$, then the block of $S_{k+1}$ that imported the message must process it by a transaction $t$ included in the same block. In this case, *InMsgDescr* contains a reference to $t$ by its logical time $\text{Lt}(t)$, and a flag indicating that the message has been processed. @@ -834,7 +837,7 @@ This section describes the Instant Hypercube Routing protocol, normally applied Let us explain the major steps applied when the Instant Hypercube Routing (IHR) mechanism is applied to a message. (Notice that normally both the usual HR and IHR work in parallel for the same message; some provisions must be taken to guarantee the uniqueness of delivery of any message.)
-Consider the routing and delivery of the same message $m$ with source $\xi$ and destination $\eta$ as discussed in [2.2.12](#2-2-12-an-overview-of-hypercube-routing): +Consider the [routing and delivery](#2-2-12-an-overview-of-hypercube-routing) of the same message $m$ with source $\xi$ and destination $\eta$: - [NetworkSend] — After the validators of $S_0$ have agreed on and signed the block containing the creating transaction $t$ for $m$, and observed that the destination $\eta$ of $m$ does not reside inside $S_0$, they may send a datagram (encrypted network message), containing the message $m$ along with a Merkle proof of its inclusion into the *OutMsgDescr* of the block just generated, to the validator group of the shardchain $T$ currently owning the destination $\eta$. @@ -870,29 +873,29 @@ The obvious disadvantage of this algorithm is that, if message $m$ is very old ( ### 2.3.5. Checking whether an IHR message has already been delivered to its final destination -To check whether an IHR message $m$ has already been delivered to its destination shardchain, we can apply the general algorithm described above (cf. [2.3.4](#2-3-4-checking-whether-a-message-has-already-been-delivered-to-its-final-destination)), modified to inspect only the last $c$ blocks for some small constant $c$ (say, $c=8$). If no conclusion can be reached after inspecting these blocks, then the validators for the destination shardchain may simply discard the IHR message instead of spending more resources on this check. +To check whether an IHR message $m$ has already been delivered to its destination shardchain, we can apply the [general algorithm](#2-3-4-checking-whether-a-message-has-already-been-delivered-to-its-final-destination), modified to inspect only the last $c$ blocks for some small constant $c$ (say, $c=8$). 
If no conclusion can be reached after inspecting these blocks, then the validators for the destination shardchain may simply discard the IHR message instead of spending more resources on this check. ### 2.3.6. Checking whether an HR message has already been delivered via HR to its final destination or an intermediate shardchain To check whether an HR-received message $m$ (or rather, a message $m$ being considered for import via HR) has already been imported via HR, we can use the following algorithm: Let $\xi_k$ be the transit address of $m$ (belonging to a neighboring shardchain $S_k$) and $\xi_{k+1}$ be its next-hop address (belonging to the shardchain under consideration). Since we are considering the inclusion of $m$, $m$ must be present in the *OutMsgQueue* of the most recent state of shardchain $S_k$, with $\xi_k$ and $\xi_{k+1}$ indicated in its envelope. In particular, (a) the message has been included into *OutMsgQueue*, and we may even know when, because the entry in *OutMsgQueue* sometimes contains the logical time of the block where it has been added, and (b) it has not yet been removed from *OutMsgQueue*. -Now, the validators of the neighboring shardchain are required to remove a message from *OutMsgQueue* as soon as they observe that message (with transit and next-hop addresses $\xi_k$ and $\xi_{k+1}$ in its envelope) has been imported into the *InMsgDescr* of the message's next-hop shardchain. Therefore, (b) implies that the message could have been imported into the *InMsgDescr* of a preceding block only if this preceding block is very new (i.e., not yet known to the most recent neighboring shardchain block). 
Therefore, only a very limited number of preceding blocks (typically one or two, at most) need to be scanned by the algorithm described in [2.3.4](#2-3-4-checking-whether-a-message-has-already-been-delivered-to-its-final-destination) to conclude that the message has not yet been imported.[21](#fn21) In fact, if this check is performed by the validators or collators for the current shardchain themselves, it can be optimized by keeping in memory the *InMsgDescrs* of the several latest blocks. +Now, the validators of the neighboring shardchain are required to remove a message from *OutMsgQueue* as soon as they observe that message (with transit and next-hop addresses $\xi_k$ and $\xi_{k+1}$ in its envelope) has been imported into the *InMsgDescr* of the message's next-hop shardchain. Therefore, (b) implies that the message could have been imported into the *InMsgDescr* of a preceding block only if this preceding block is very new (i.e., not yet known to the most recent neighboring shardchain block). Therefore, only a very limited number of preceding blocks (typically one or two, at most) need to be scanned by the [algorithm](#2-3-4-checking-whether-a-message-has-already-been-delivered-to-its-final-destination) to conclude that the message has not yet been imported.[21](#fn21) In fact, if this check is performed by the validators or collators for the current shardchain themselves, it can be optimized by keeping in memory the *InMsgDescrs* of the several latest blocks. ### 2.3.7 Checking whether an HR message has already been delivered via IHR to its final destination -Finally, to check whether an HR message has already been delivered to its final destination via IHR, one can use the general algorithm described in [2.3.4](#2-3-4-checking-whether-a-message-has-already-been-delivered-to-its-final-destination). 
In contrast with [2.3.5](#2-3-5-checking-whether-an-ihr-message-has-already-been-delivered-to-its-final-destination), we cannot abort the verification process after scanning a fixed number of the latest blocks in the destination shardchain, because HR messages cannot be dropped without a reason. +Finally, to check whether an HR message has already been delivered to its final destination via IHR, one can use the [general algorithm](#2-3-4-checking-whether-a-message-has-already-been-delivered-to-its-final-destination). In contrast, we cannot [abort the verification process](#2-3-5-checking-whether-an-ihr-message-has-already-been-delivered-to-its-final-destination) after scanning a fixed number of the latest blocks in the destination shardchain, because HR messages cannot be dropped without a reason. Instead, we indirectly bound the number of blocks to be inspected by forbidding the inclusion of IHR message $m$ into a block $B$ of its destination shardchain if there are already more than, say, $c=8$ blocks $B'$ in the destination shardchain with $\text{Lt}^+(B')\geq\text{Lt}(m)$. Such a condition effectively restricts the time interval after the creation of message $m$ in which it could have been delivered via IHR, so that only a small number of blocks of the destination shardchain (at most $c$) will need to be inspected. -Notice that this condition nicely aligns with the modified algorithm described in [2.3.5](#2-3-5-checking-whether-an-ihr-message-has-already-been-delivered-to-its-final-destination), effectively forbidding the validators from importing the newly-received IHR message if more than $c=8$ steps are needed to check that it had not been imported already. 
+Notice that this condition nicely aligns with the [modified algorithm](#2-3-5-checking-whether-an-ihr-message-has-already-been-delivered-to-its-final-destination), effectively forbidding the validators from importing the newly-received IHR message if more than $c=8$ steps are needed to check that it had not been imported already. --- # 3 Messages, message descriptors, and queues -This chapter presents the internal layout of individual messages, message descriptors (such as *InMsgDescr* or *OutMsgDescr*), and message queues (such as *OutMsgQueue*). Enveloped messages (cf. [2.1.16](#2-1-16-message-envelopes)) are also discussed here. +This chapter presents the internal layout of individual messages, message descriptors (such as *InMsgDescr* or *OutMsgDescr*), and message queues (such as *OutMsgQueue*). [Enveloped messages](#2-1-16-message-envelopes) are also discussed here. Notice that most general conventions related to messages must be obeyed by all shardchains, even if they do not belong to the basic shardchain; otherwise, messaging and interaction between different workchains would not be possible. It is the *interpretation* of the message contents and the *processing* of messages, usually by some transactions, that differs between workchains. @@ -902,7 +905,7 @@ This chapter begins with some general definitions, followed by the precise layou ### 3.1.1. Some standard definitions -For the reader's convenience, we reproduce here several general TL-B definitions.[22](#fn22) These definitions are used below in the discussion of address and message layout, but otherwise are not related to the TON Blockchain. +For the reader's convenience, we reproduce here several general TL-B definitions.[22](#fn22) These definitions are used below in the discussion of address and message layout, but otherwise are not related to the TON Blockchain. 
``` unit$_ = Unit; @@ -936,7 +939,7 @@ _ MsgAddressInt = MsgAddress; _ MsgAddressExt = MsgAddress; ``` -The two last lines define type `MsgAddress` to be the internal union of types `MsgAddressInt` and `MsgAddressExt` (not to be confused with their external union `Either MsgAddressInt MsgAddressExt` as defined in [3.1.1](#3-1-1-some-standard-definitions)), as if the preceding four lines had been repeated with the right-hand side replaced by `MsgAddress`. In this way, type `MsgAddress` has four constructors, and types `MsgAddressInt` and `MsgAddressExt` are both subtypes of `MsgAddress`. +The two last lines define type `MsgAddress` to be the internal union of types `MsgAddressInt` and `MsgAddressExt` (not to be confused with their [external union](#3-1-1-some-standard-definitions) `Either MsgAddressInt MsgAddressExt`), as if the preceding four lines had been repeated with the right-hand side replaced by `MsgAddress`. In this way, type `MsgAddress` has four constructors, and types `MsgAddressInt` and `MsgAddressExt` are both subtypes of `MsgAddress`. ### 3.1.3. External addresses @@ -944,7 +947,7 @@ The first two constructors, `addr_none` and `addr_extern`, are used for source a ### 3.1.4 Internal addresses -The two remaining constructors, `addr_std` and `addr_var`, represent internal addresses. The first of them, `addr_std`, represents a signed 8-bit $\mathit{workchain\_id}$ (sufficient for the masterchain and for the basic workchain) and a 256-bit internal address in the selected workchain. The second of them, `addr_var`, represents addresses in workchains with a "large" $\mathit{workchain\_id}$, or internal addresses of length not equal to 256. Both of these constructors have an optional `anycast` value, absent by default, which enables "address rewriting" when present.[23](#fn23) +The two remaining constructors, `addr_std` and `addr_var`, represent internal addresses.
The first of them, `addr_std`, represents a signed 8-bit $\mathit{workchain\_id}$ (sufficient for the masterchain and for the basic workchain) and a 256-bit internal address in the selected workchain. The second of them, `addr_var`, represents addresses in workchains with a "large" $\mathit{workchain\_id}$, or internal addresses of length not equal to 256. Both of these constructors have an optional `anycast` value, absent by default, which enables "address rewriting" when present.[23](#fn23) The validators must use `addr_std` instead of `addr_var` whenever possible, but must be ready to accept `addr_var` in inbound messages. The `addr_var` constructor is intended for future extensions. @@ -964,13 +967,13 @@ nanograms$_ amount:(VarUInteger 16) = Grams; If one wants to represent $x$ nanograms, one selects an integer $\ell<16$ such that $x<2^{8\ell}$, and serializes first $\ell$ as an unsigned 4-bit integer, then $x$ itself as an unsigned $8\ell$-bit integer. Notice that four zero bits represent a zero amount of Grams. -Recall (cf. [[3, A](#ref-3)]) that the original total supply of Grams is fixed at five billion (i.e., $5\cdot10^{18}<2^{63}$ nanograms), and is expected to grow very slowly. Therefore, all the amounts of Grams encountered in practice will fit in unsigned or even signed 64-bit integers. The validators may use the 64-bit integer representation of Grams in their internal computations; however, the serialization of these values the blockchain is another matter. +Recall that the original total supply of [Grams](/foundations/whitepapers/ton#a-the-ton-coin,-or-the-gram) is fixed at five billion (i.e., $5\cdot10^{18}<2^{63}$ nanograms), and is expected to grow very slowly. Therefore, all the amounts of Grams encountered in practice will fit in unsigned or even signed 64-bit integers. The validators may use the 64-bit integer representation of Grams in their internal computations; however, the serialization of these values in the blockchain is another matter.
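
The `Grams` encoding described above can be illustrated with a short Python sketch. It is not part of the whitepaper: the function names `serialize_grams` and `deserialize_grams` are hypothetical, bits are modeled as a string of `0`/`1` characters for readability, and the sketch assumes that the smallest admissible $\ell$ is always chosen, which the text above does not state explicitly.

```python
def serialize_grams(x: int) -> str:
    """Encode x nanograms as `nanograms$_ amount:(VarUInteger 16)`.

    Hypothetical helper: returns the bits as a '0'/'1' string.
    Assumption: the smallest l with x < 2^(8l) is used.
    """
    if x < 0:
        raise ValueError("amount must be non-negative")
    length = (x.bit_length() + 7) // 8  # smallest l such that x < 2^(8l)
    if length >= 16:
        raise ValueError("amount does not fit: l must be < 16")
    bits = format(length, "04b")  # l as an unsigned 4-bit integer
    if length:
        bits += format(x, f"0{8 * length}b")  # x as an unsigned 8l-bit integer
    return bits


def deserialize_grams(bits: str) -> int:
    """Decode a bit string produced by serialize_grams."""
    length = int(bits[:4], 2)
    body = bits[4:4 + 8 * length]
    if len(body) != 8 * length:
        raise ValueError("truncated amount")
    return int(body, 2) if length else 0
```

Under these assumptions a zero amount occupies exactly four zero bits, and the original total supply of $5\cdot10^{18}$ nanograms needs $\ell=8$, that is, $4+64=68$ bits.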
### 3.1.6. Representing collections of arbitrary currencies Recall that the TON Blockchain allows its users to define arbitrary cryptocurrencies or tokens apart from the Gram, provided some conditions are met. Such additional cryptocurrencies are identified by 32-bit $\textit{currency\_id}$s. The list of defined additional cryptocurrencies is a part of the blockchain configuration, stored in the masterchain. -When some amounts of one or several such cryptocurrencies need to be represented, a dictionary (cf. [[4](#ref-4), 3.3]) with 32-bit $\textit{currency\_id}$s as keys and `VarUInteger 32` values is used: +When some amounts of one or several such cryptocurrencies need to be represented, a [dictionary](/foundations/whitepapers/tvm#3-3-hashmaps,-or-dictionaries) with 32-bit $\textit{currency\_id}$s as keys and `VarUInteger 32` values is used: ``` extra_currencies$_ dict:(HashmapE 32 (VarUInteger 32)) @@ -1008,7 +1011,7 @@ message$_ {X:Type} info:CommonMsgInfo The meaning of this scheme is as follows. -Type `Message X` describes a message with the body (or payload) of type $X$. Its serialization starts with `info` of type `CommonMsgInfo`, which comes in three flavors: for internal messages, inbound external messages, and outbound external messages, respectively. All of them have a source address `src` and destination address `dest`, which are external or internal according to the chosen constructor. Apart from that, an internal message may bear some `value` in Grams and other defined currencies (cf. [3.1.6](#3-1-6-representing-collections-of-arbitrary-currencies)), and all messages generated inside the TON Blockchain have a logical creation time `created_lt` (cf. [1.4.6](#1-4-6-logical-time-in-the-ton-blockchain)) and creation unixtime `created_at`, both automatically set by the generating transaction. The creation unixtime equals the creation unixtime of the block containing the generating transaction. 
+Type `Message X` describes a message with the body (or payload) of type $X$. Its serialization starts with `info` of type `CommonMsgInfo`, which comes in three flavors: for internal messages, inbound external messages, and outbound external messages, respectively. All of them have a source address `src` and destination address `dest`, which are external or internal according to the chosen constructor. Apart from that, an internal message may bear some `value` in Grams and [other defined currencies](#3-1-6-representing-collections-of-arbitrary-currencies), and all messages generated inside the TON Blockchain have a [logical creation time](#1-4-6-logical-time-in-the-ton-blockchain) `created_lt` and creation unixtime `created_at`, both automatically set by the generating transaction. The creation unixtime equals the creation unixtime of the block containing the generating transaction. ### 3.1.8. Forwarding and IHR fees. Total value of an internal message @@ -1018,7 +1021,7 @@ Notice that the total value carried by a newly-created internal outbound message ### 3.1.9. Code and data portions contained in a message -Apart from the common message information stored in `info`, a message can contain portions of the destination smart contract's code and data. This feature is used, for instance, in the so-called *constructor messages* (cf. [1.7.3](#1-7-3-initializing-smart-contracts-by-constructor-messages)), which are simply internal or inbound external messages with `code` and possibly `data` fields defined in their `init` portions. If the hash of these fields is correct, and the destination smart contract has no code or data, the values from the message are used instead.[24](#fn24) +Apart from the common message information stored in `info`, a message can contain portions of the destination smart contract's code and data. 
This feature is used, for instance, in the so-called [*constructor messages*](#1-7-3-initializing-smart-contracts-by-constructor-messages), which are simply internal or inbound external messages with `code` and possibly `data` fields defined in their `init` portions. If the hash of these fields is correct, and the destination smart contract has no code or data, the values from the message are used instead.[24](#fn24) ### 3.1.10 Using `code` and `data` for other purposes @@ -1048,7 +1051,7 @@ Notice that *the source address and the logical creation time of an internal or ### 3.1.15. Enveloped messages -*Message envelopes* are used for attaching routing information, such as the current (transit) address and the next-hop address, to inbound, transit, and outbound messages (cf. [2.1.16](#2-1-16-message-envelopes)). The message itself is kept in a separate cell and referred to from the message envelope by a cell reference. +[*Message envelopes*](#2-1-16-message-envelopes) are used for attaching routing information, such as the current (transit) address and the next-hop address, to inbound, transit, and outbound messages. The message itself is kept in a separate cell and referred to from the message envelope by a cell reference. ``` interm_addr_regular$0 use_src_bits:(#<= 96) @@ -1062,12 +1065,13 @@ msg_envelope cur_addr:IntermediateAddress msg:^(Message Any) = MsgEnvelope; ``` -The `IntermediateAddress` type is used to describe the intermediate addresses of a message—that is, its current (or transit) address `cur_addr`, and its next-hop address `next_addr`. 
The first constructor `interm_addr_regular` represents the intermediate address using the optimization described in [2.1.15](#2-1-15-representation-of-transit-and-next-hop-addresses), by storing the number of the first bits of the intermediate address that are the same as in the source address; the two other explicitly store the workchain identifier and the first 64 bits of the address inside that workchain (the remaining bits can be taken from the source address). The `fwd_fee_remaining` field is used to explicitly represent the maximum amount of message forwarding fees that can be deducted from the message value during the remaining HR steps; it cannot exceed the value of `fwd_fee` indicated in the message itself. +The `IntermediateAddress` type is used to describe the intermediate addresses of a message, namely its current (or transit) address `cur_addr` and its next-hop address `next_addr`. The first constructor `interm_addr_regular` represents the intermediate address using the [optimization for transit and next-hop addresses](#2-1-15-representation-of-transit-and-next-hop-addresses), by storing the number of the first bits of the intermediate address that are the same as in the source address; the other two constructors explicitly store the workchain identifier and the first 64 bits of the address inside that workchain (the remaining bits can be taken from the source address). The `fwd_fee_remaining` field is used to explicitly represent the maximum amount of message forwarding fees that can be deducted from the message value during the remaining HR steps; it cannot exceed the value of `fwd_fee` indicated in the message itself. +--- ## 3.2 Inbound message descriptors -This section discusses *InMsgDescr*, the structure containing a description of all inbound messages imported into a block.[25](#fn25) +This section discusses *InMsgDescr*, the structure containing a description of all inbound messages imported into a block.[25](#fn25) ### 3.2.1.
Types and sources of inbound messages @@ -1077,7 +1081,7 @@ Inbound messages may be classified as follows: - *Inbound external messages* — Need no additional reason for being imported into the block, but must be immediately processed by a transaction in the same block. - *Internal IHR messages with destination addresses in this block* — The reason for their being imported into the block includes a Merkle proof of their generation (i.e., their inclusion in *OutMsgDescr* of their original block). Such a message must be immediately delivered to its final destination and processed by a transaction. -- *Internal messages with destinations in this block* — The reason for their inclusion is their presence in *OutMsgQueue* of the most recent state of a neighboring shardchain,[26](#fn26) or their presence in *OutMsgDescr* of this very block. This neighboring shardchain is completely determined by the transit address indicated in the forwarded message envelope, which is replicated in *InMsg* as well. The "fate" of this message is again described by a reference to the processing transaction inside the current block. +- *Internal messages with destinations in this block* — The reason for their inclusion is their presence in *OutMsgQueue* of the most recent state of a neighboring shardchain,[26](#fn26) or their presence in *OutMsgDescr* of this very block. This neighboring shardchain is completely determined by the transit address indicated in the forwarded message envelope, which is replicated in *InMsg* as well. The "fate" of this message is again described by a reference to the processing transaction inside the current block. - *Immediately routed internal messages* — Essentially a subclass of the previous class of messages. In this case, the imported message is one of the outbound messages generated in this very block. - *Transit internal messages* — Have the same reason for inclusion as the previous class of messages. 
However, they are not processed inside the block, but internally forwarded into *OutMsgDescr* and *OutMsgQueue*. This fact, along with a reference to the new envelope of the transit message, must be registered in *InMsg*. - *Discarded internal messages with destinations in this block* — An internal message with a destination in this block may be imported and immediately discarded instead of being processed by a transaction if it has already been received and processed via IHR in a preceding block of this shardchain. In this case, a reference to the previous processing transaction must be provided. @@ -1086,7 +1090,7 @@ Inbound messages may be classified as follows: ### 3.2.2. Descriptor of an inbound message -Each inbound message is described by an instance of the `InMsg` type, which has six constructors corresponding to the cases listed above in [3.2.1](#3-2-1-types-and-sources-of-inbound-messages): +Each inbound message is described by an instance of the `InMsg` type, which has six constructors corresponding to the [cases listed above](#3-2-1-types-and-sources-of-inbound-messages): ``` msg_import_ext$000 msg:^(Message Any) transaction:^Transaction @@ -1134,13 +1138,13 @@ Notice that the forwarding and transit fees collected from an imported message d Before continuing, let us discuss the serialization of *augmented* hashmaps, or dictionaries. -Augmented hashmaps are key-value storage structures with $n$-bit keys and values of some type $X$, similar to the ordinary hashmaps described in [[4](#ref-4) 3.3]. However, each intermediate node of the Patricia tree representing an *augmented* hashmap is augmented by a value of type $Y$. +Augmented hashmaps are key-value storage structures with $n$-bit keys and values of some type $X$, similar to the [ordinary hashmaps](/foundations/whitepapers/tvm#3-3-hashmaps,-or-dictionaries). However, each intermediate node of the Patricia tree representing an *augmented* hashmap is augmented by a value of type $Y$.
These augmentation values must satisfy certain *aggregation conditions*. Typically, $Y$ is an integer type, and the aggregation condition is that the augmentation value of a fork must equal the sum of the augmentation values of its two children. In general, a *fork evaluation function* $S:Y\times Y\to Y$ or $S:Y\to Y\to Y$ is used instead of the sum. The augmentation value of a leaf is usually computed from the value stored in that leaf by means of a *leaf evaluation function* $L:X\to Y$. The augmentation value of a leaf may be stored explicitly in the leaf along with the value; however, in most cases there is no need for this, because the leaf evaluation function $L$ is very simple. ### 3.2.6. Serialization of augmented hashmaps -The serialization of augmented hashmaps with $n$-bit keys, values of type $X$, and augmentation values of type $Y$ is given by the following TL-B scheme, which is an extension of the one provided in [[4](#ref-4), 3.3.3]: +The serialization of augmented hashmaps with $n$-bit keys, values of type $X$, and augmentation values of type $Y$ is given by the following TL-B scheme, which is an extension of the one used for [ordinary hashmaps](/foundations/whitepapers/tvm#3-3-3-serialization-of-hashmaps): ``` ahm_edge#_ {n:#} {X:Type} {Y:Type} {l:#} {m:#} @@ -1169,7 +1173,7 @@ import_fees$_ fees_collected:Grams ### 3.2.8. Structure of *InMsgDescr* -Now the *InMsgDescr* itself is defined as an augmented hashmap, with 256-bit keys (equal to the representation hashes of imported messages), values of type `InMsg` (cf. [3.2.2](#3-2-2-descriptor-of-an-inbound-message)), and augmentation values of type `ImportFees` (cf.
[3.2.7](#3-2-7-augmentation-of-inmsgdescr)): +Now the *InMsgDescr* itself is defined as an augmented hashmap, with 256-bit keys (equal to the representation hashes of imported messages), values of type [`InMsg`](#3-2-2-descriptor-of-an-inbound-message), and [augmentation](#3-2-7-augmentation-of-inmsgdescr) values of type `ImportFees`: ``` _ (HashmapAugE 256 InMsg ImportFees) = InMsgDescr; @@ -1179,7 +1183,9 @@ This TL-B notation uses an anonymous constructor `_` to define `InMsgDescr` as a ### 3.2.9. Aggregation rules for *InMsgDescr* -The fork evaluation and leaf evaluation functions (cf. [3.2.5](#3-2-5-augmented-hashmaps%2C-or-dictionaries)) are not included explicitly in the above notation, because the dependent types of TL-B are not expressive enough for this purpose. In words, the fork evaluation function is just the componentwise addition of two `ImportFees` instances, and the leaf evaluation function is defined by the rules listed in [3.2.3](#3-2-3-collecting-forwarding-and-transit-fees-from-imported-messages) and [3.2.4](#3-2-4-imported-value-of-an-inbound-message). In this way, the root of the Patricia tree representing an instance of *InMsgDescr* contains an *ImportFees* instance with the total value imported by all inbound messages, and with the total forwarding fees collected from them. +The [fork evaluation and leaf evaluation functions](#3-2-5-augmented-hashmaps%2C-or-dictionaries) are not included explicitly in the above notation, because the dependent types of TL-B are not expressive enough for this purpose. In words, the fork evaluation function is just the componentwise addition of two `ImportFees` instances, and the leaf evaluation function is defined by the rules listed in [3.2.3](#3-2-3-collecting-forwarding-and-transit-fees-from-imported-messages) and [3.2.4](#3-2-4-imported-value-of-an-inbound-message). 
In this way, the root of the Patricia tree representing an instance of *InMsgDescr* contains an *ImportFees* instance with the total value imported by all inbound messages, and with the total forwarding fees collected from them. + +--- ## 3.3 Outbound message queue and descriptors @@ -1198,7 +1204,7 @@ Outbound messages may be classified as follows: Apart from the above types of outbound messages, *OutMsgDescr* can contain special "message dequeueing records", which indicate that a message has been removed from the *OutMsgQueue* in this block. The reason for this removal is indicated in the message deletion record; it consists of a reference to the enveloped message being deleted, and of the logical time of the neighboring shardchain block that has this enveloped message in its *InMsgDescr*. -Notice that on some occasions a message may be imported from the *OutMsgQueue* of the current shardchain, internally routed, and then included into *OutMsgDescr* and *OutMsgQueue* again with a different envelope.[27](#fn27) In this case, a variant of the transit outbound message description is used, which doubles as a message dequeueing record. +Notice that on some occasions a message may be imported from the *OutMsgQueue* of the current shardchain, internally routed, and then included into *OutMsgDescr* and *OutMsgQueue* again with a different envelope.[27](#fn27) In this case, a variant of the transit outbound message description is used, which doubles as a message dequeueing record. ### 3.3.3. Descriptor of an outbound message @@ -1219,7 +1225,7 @@ msg_export_tr_req$111 out_msg:^MsgEnvelope imported:^InMsg = OutMsg; ``` -The last two descriptions have the effect of removing (dequeueing) the message from *OutMsgQueue* instead of inserting it. The last one re-inserts the message into *OutMsgQueue* with a new envelope after performing the internal routing (cf. [2.1.11](#2-1-11-internal-routing)). 
+The last two descriptions have the effect of removing (dequeueing) the message from *OutMsgQueue* instead of inserting it. The last one re-inserts the message into *OutMsgQueue* with a new envelope after performing the [internal routing](#2-1-11-internal-routing). ### 3.3.4. Exported value of an outbound message @@ -1229,23 +1235,23 @@ Each outbound message described by an *OutMsg* exports some value—a certain am - An internal message, generated in this block, exports its `value` plus its `ihr_fee` plus its `fwd_fee`. Notice that `fwd_fee` must be equal to the `fwd_fee_remaining` indicated in the `out_msg` envelope. - A transit message exports its `value` plus its `ihr_fee` plus the value of `fwd_fee_remaining` of its `out_msg` envelope. - The same holds for `msg_export_tr_req`, the constructor of *OutMsg* used for re-inserted dequeued messages. -- A message dequeueing record (`msg_export_deq`; cf. [3.3.2](#3-3-2-message-dequeueing-records)) exports no value. +- A [message dequeueing record](#3-3-2-message-dequeueing-records) (`msg_export_deq`) exports no value. ### 3.3.5. Structure of *OutMsgDescr* -The *OutMsgDescr* itself is simply an augmented hashmap (cf. [3.2.5](#3-2-5-augmented-hashmaps%2C-or-dictionaries)), with 256-bit keys (equal to the representation hash of the message), values of type *OutMsg*, and augmentation values of type *CurrencyCollection*: +The *OutMsgDescr* itself is simply an [augmented hashmap](#3-2-5-augmented-hashmaps%2C-or-dictionaries), with 256-bit keys (equal to the representation hash of the message), values of type *OutMsg*, and augmentation values of type *CurrencyCollection*: ``` _ (HashmapAugE 256 OutMsg CurrencyCollection) = OutMsgDescr; ``` -The augmentation is the *exported value* of the corresponding message, aggregated by means of the sum, and computed at the leaves as explained in [3.3.4](#3-3-4-exported-value-of-an-outbound-message). 
In this way, the total exported value appears near the root of the Patricia tree representing *OutMsgDescr*. +The augmentation is the [*exported value*](#3-3-4-exported-value-of-an-outbound-message) of the corresponding message, aggregated by means of the sum, and computed at the leaves. In this way, the total exported value appears near the root of the Patricia tree representing *OutMsgDescr*. The most important consistency condition for *OutMsgDescr* is that its entry with key $k$ must be an *OutMsg* describing a message $m$ with representation hash $\text{Hash}^\flat(m)=k$. ### 3.3.6. Structure of *OutMsgQueue* -Recall (cf. [1.2.7](#1-2-7-outbound-message-queue-of-a-shardchain)) that *OutMsgQueue* is a part of the blockchain state, not of a block. Therefore, a block contains only hash references to its initial and final state, and its newly-created cells. +Recall that [*OutMsgQueue*](#1-2-7-outbound-message-queue-of-a-shardchain) is a part of the blockchain state, not of a block. Therefore, a block contains only hash references to its initial and final state, and its newly-created cells. The structure of *OutMsgQueue* is simple: it is just an augmented hashmap with 352-bit keys and values of type *OutMsg*: ``` @@ -1268,7 +1274,7 @@ Several internal consistency conditions are imposed on *OutMsg* instances presen --- -## 4 Accounts and transactions +# 4 Accounts and transactions This chapter discusses the layout of *accounts* (or *smart contracts*) and their state in the TON Blockchain. It also considers *transactions*, which are the only way to modify the state of an account, and to process inbound messages and generate new outbound messages. @@ -1280,7 +1286,7 @@ An account is *identified* by its full address, and is *completely described* by ### 4.1.1. 
Account addresses -In general, an account is completely identified by its *full address*, consisting of a 32-bit $\mathit{workchain\_id}$, and the (usually 256-bit) *internal address* or *account identifier* $\mathit{account\_id}$ inside the chosen workchain. In the basic workchain ($\mathit{workchain\_id}=0$) and in the masterchain ($\mathit{workchain\_id}=-1$) the internal address is always 256-bit. In these workchains,[28](#fn28) $\mathit{account\_id}$ cannot be chosen arbitrarily, but must be equal to the hash of the initial code and data of the smart contract; otherwise, it will be impossible to initialize the account with the intended code and data (cf. [1.7.3](#1-7-3-initializing-smart-contracts-by-constructor-messages)), and to do anything with the accumulated funds in the account balance. +In general, an account is completely identified by its *full address*, consisting of a 32-bit $\mathit{workchain\_id}$, and the (usually 256-bit) *internal address* or *account identifier* $\mathit{account\_id}$ inside the chosen workchain. In the basic workchain ($\mathit{workchain\_id}=0$) and in the masterchain ($\mathit{workchain\_id}=-1$) the internal address is always 256-bit. In these workchains,[28](#fn28) $\mathit{account\_id}$ cannot be chosen arbitrarily, but must be equal to the hash of the initial code and data of the smart contract; otherwise, it will be impossible to [initialize the account](#1-7-3-initializing-smart-contracts-by-constructor-messages) with the intended code and data, and to do anything with the accumulated funds in the account balance. ### 4.1.2. Zero account @@ -1288,7 +1294,7 @@ By convention, the *zero account* or *account with zero address* accumulates the ### 4.1.3. Small and large smart contracts -By default, smart contracts are "small", meaning that they have one account address belonging to exactly one shardchain at any given moment of time. 
However, one can create a "large smart contract of splitting depth $d$", meaning that up to $2^d$ instances of the smart contract may be created, with the first $d$ bits of the original address of the smart contract replaced by arbitrary bit sequences.[29](#fn29) One can send messages to such smart contracts using internal anycast addresses with `anycast` set to $d$ (cf. [3.1.2](#3-1-2-tl-b-scheme-for-addresses)). Furthermore, the instances of the large smart contract are allowed to use this anycast address as the source address of their generated messages. +By default, smart contracts are "small", meaning that they have one account address belonging to exactly one shardchain at any given moment of time. However, one can create a "large smart contract of splitting depth $d$", meaning that up to $2^d$ instances of the smart contract may be created, with the first $d$ bits of the original address of the smart contract replaced by arbitrary bit sequences.[29](#fn29) One can send messages to such smart contracts using [internal anycast addresses](#3-1-2-tl-b-scheme-for-addresses) with `anycast` set to $d$. Furthermore, the instances of the large smart contract are allowed to use this anycast address as the source address of their generated messages. An instance of a large smart contract is an account with non-zero *maximal splitting depth* $d$. @@ -1320,9 +1326,9 @@ storage_info$_ used:StorageUsed last_paid:uint32 The `last_paid` field contains either the unixtime of the most recent storage payment collected (usually this is the unixtime of the most recent transaction), or the unixtime when the account was created (again, by a transaction). 
The `due_payment` field, if present, accumulates the storage payments that could not be exacted from the balance of the account, represented by a strictly positive amount of nanograms; it can be present only for uninitialized or frozen accounts that have a balance of zero Grams (but may have non-zero balances in other cryptocurrencies). When `due_payment` becomes larger than the value of a configurable parameter of the blockchain, the account is destroyed altogether, and its balance, if any, is transferred to the zero account. -### 4.1.6. Account description +### 4.1.6. Account balance -The state of an account is represented by an instance of type *Account*, described by the following TL-B scheme:[30](#fn30) +The state of an account is represented by an instance of type *Account*, described by the following TL-B scheme:[30](#fn30) ``` account_none$0 = Account; account$1 addr:MsgAddressInt storage_stat:StorageInfo @@ -1362,7 +1368,7 @@ Notice that the account state is very similar to a message sent from an account - When a transaction is processed, its inputs are an inbound message and the previous account state; its outputs are outbound messages generated and the next account state. If we treat the state as a special kind of message, we see that every transaction has exactly two inputs (the account state and an inbound message) and at least one output. - Both a message and the account state can carry code and data in an instance of *StateInit*, and some value in their `balance`. - An account is initialized by a *constructor message*, which essentially carries the future state and balance of the account. -- On some occasions messages are converted into account states, and vice versa. For instance, when a shardchain merge event occurs, and two accounts that are instances of the same large contract need to be merged, one of them is converted into a message sent to the other one (cf. [4.2.11](#4-2-11-merge-transactions)). 
Similarly, when a shardchain split event occurs, and an instance of a large smart contract needs to be split into two, this is achieved by a special transaction that creates the new instance by means of a constructor message sent from the previously existing instance to the new one (cf. [4.2.10](#4-2-10-split-transactions)). +- On some occasions messages are converted into account states, and vice versa. For instance, when a shardchain [merge event](#4-2-11-merge-transactions) occurs, and two accounts that are instances of the same large contract need to be merged, one of them is converted into a message sent to the other one. Similarly, when a shardchain [split event](#4-2-10-split-transactions) occurs, and an instance of a large smart contract needs to be split into two, this is achieved by a special transaction that creates the new instance by means of a constructor message sent from the previously existing instance to the new one. - One may say that a message is involved in transferring some information *across space* (between different shardchains, or at least accountchains), while an account state transfers information *across time* (from the past to the future of the same account). ### 4.1.8. Differences between messages and account states @@ -1375,7 +1381,7 @@ Of course, there are important differences, too. For example: ### 4.1.9. The combined state of all accounts in a shard -The split part of the shardchain state (cf. 
[1.2.1](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state) and [1.2.2](#1-2-2-split-and-non-split-part-of-the-shardchain-block-and-state)) is given by +The split part of the shardchain state ([1.2.1](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state) and [1.2.2](#1-2-2-split-and-non-split-part-of-the-shardchain-block-and-state)) is given by ``` _ (HashmapAugE 256 Account CurrencyCollection) = ShardAccounts; @@ -1411,7 +1417,7 @@ In contrast with messages, which have essentially the same headers throughout al ### 4.2.1. Logical time of a transaction -Each transaction $t$ has a logical time interval $\text{Lt}^\bullet(t)=[\text{Lt}^-(t),\text{Lt}^+(t))$ assigned to it (cf. [1.4.6](#1-4-6-logical-time-in-the-ton-blockchain) and [1.4.3](#1-4-3-logical-time-intervals)). By convention, a transaction $t$ generating $n$ outbound messages $m_1$, $\ldots$, $m_n$ is assigned a logical time interval of length $n+1$, so that +Each transaction $t$ has a logical time interval $\text{Lt}^\bullet(t)=[\text{Lt}^-(t),\text{Lt}^+(t))$ assigned to it ([1.4.6](#1-4-6-logical-time-in-the-ton-blockchain) and [1.4.3](#1-4-3-logical-time-intervals)). By convention, a transaction $t$ generating $n$ outbound messages $m_1$, $\ldots$, $m_n$ is assigned a logical time interval of length $n+1$, so that $$ \mathrm{L}{\small\mathrm{t}}^+(t) = \mathrm{L}{\small\mathrm{t}}^-(t) + n + 1 \quad . \tag{16} @@ -1443,7 +1449,7 @@ Each transaction $t$ contains or indirectly refers to the following data: - The initial state of account $\xi$ (including its balance). - The final state of account $\xi$ (including its balance). - The total fees collected by the validators. -- A detailed description of the transaction containing all or some data needed to validate it, including the kind of the transaction (cf. [4.2.4](#4-2-4-kinds-of-transactions)) and some of the intermediate steps performed. 
+- A detailed description of the transaction containing all or some data needed to validate it, including the [kind of the transaction](#4-2-4-kinds-of-transactions) and some of the intermediate steps performed. Of these components, all but the very last one are quite general and might appear in other workchains as well. @@ -1474,7 +1480,7 @@ An ordinary transaction is performed in several *phases*, which may be thought o ### 4.2.6. Bouncing inbound messages to non-existent accounts -Notice that if an inbound message with its `bounce` flag set is sent to a previously non-existent account, and the transaction is aborted (for instance, because there is no code and data with the correct hash in the inbound message, so the virtual machine could not be invoked at all), then the account is not created even as an uninitialized account, since it would have zero balance and no code and data anyways.[31](#fn31) +Notice that if an inbound message with its `bounce` flag set is sent to a previously non-existent account, and the transaction is aborted (for instance, because there is no code and data with the correct hash in the inbound message, so the virtual machine could not be invoked at all), then the account is not created even as an uninitialized account, since it would have zero balance and no code and data anyways.[31](#fn31) ### 4.2.7. Processing of an inbound message is split between computing and action phases @@ -1487,7 +1493,7 @@ Some reasons for such an arrangement are: - It is simpler to abort the transaction if the smart contract eventually terminates with an exit code other than 0 or 1. - The rules for processing output actions may be changed without modifying the virtual machine. (For instance, new output actions may be introduced.) - The virtual machine itself may be modified or even replaced by another one (for instance, in a new workchain) without changing the rules for processing output actions. 
-- The execution of the smart contract inside the virtual machine is completely isolated from the blockchain and is a *pure computation*. As a consequence, this execution may be *virtualized* inside the virtual machine itself by means of TVM's $\texttt{RUNVM}$ primitive, a useful feature for validator smart contracts and for smart contracts controlling payment channels and other sidechains. Additionally, the virtual machine may be *emulated* inside itself or a stripped-down version of itself, a useful feature for validating the execution of smart contracts inside TVM.[32](#fn32) +- The execution of the smart contract inside the virtual machine is completely isolated from the blockchain and is a *pure computation*. As a consequence, this execution may be *virtualized* inside the virtual machine itself by means of TVM's $\texttt{RUNVM}$ primitive, a useful feature for validator smart contracts and for smart contracts controlling payment channels and other sidechains. Additionally, the virtual machine may be *emulated* inside itself or a stripped-down version of itself, a useful feature for validating the execution of smart contracts inside TVM.[32](#fn32) ### 4.2.9. Storage, tick, and tock transactions @@ -1497,19 +1503,19 @@ Storage transactions are very similar to a stand-alone storage phase of an ordin Split transactions in fact consist of two transactions. If an account $\xi$ needs to be split into two accounts $\xi$ and $\xi'$: -- First a *split prepare transaction*, similar to a tock transaction (but in a shardchain instead of the masterchain), is issued for account $\xi$. It must be the last transaction for $\xi$ in a shardchain block. The output of the processing stage of a split prepare transaction consists not only of the new state of account $\xi$, but also of the new state of account $\xi'$, represented by a constructor message to $\xi'$ (cf. [4.1.7](#4-1-7-account-state-as-a-message-from-an-account-to-its-future-self)). 
+- First a *split prepare transaction*, similar to a tock transaction (but in a shardchain instead of the masterchain), is issued for account $\xi$. It must be the last transaction for $\xi$ in a shardchain block. The output of the processing stage of a split prepare transaction consists not only of the new state of account $\xi$, but also of the new state of account $\xi'$, represented by a [constructor message](#4-1-7-account-state-as-a-message-from-an-account-to-its-future-self) to $\xi'$. - Then a *split install transaction* is added for account $\xi'$, with a reference to the corresponding split prepare transaction. The split install transaction must be the only transaction for a previously non-existent account $\xi'$ in the block. It effectively sets the state of $\xi'$ as defined by the split prepare transaction. ### 4.2.11. Merge transactions Merge transactions also consist of two transactions each. If an account $\xi'$ needs to be merged into account $\xi$: -- First a *merge prepare transaction* is issued for $\xi'$, which converts all of its persistent state and balance into a special constructor message with destination $\xi$ (cf. [4.1.7](#4-1-7-account-state-as-a-message-from-an-account-to-its-future-self)). +- First a *merge prepare transaction* is issued for $\xi'$, which converts all of its persistent state and balance into a [special constructor message](#4-1-7-account-state-as-a-message-from-an-account-to-its-future-self) with destination $\xi$. - Then a *merge install transaction* for $\xi$, referring to the corresponding merge prepare transaction, processes that constructor message. The merge install transaction is similar to a tick transaction in that it must be the first transaction for $\xi$ in a block, but it is located in a shardchain block, not in the masterchain, and it has a special inbound message. ### 4.2.12. 
Serialization of a general transaction -Any transaction contains the fields listed in [4.2.3](#4-2-3-generic-components-of-a-transaction). As a consequence, there are some common components in all transactions: +Any transaction contains the [fields](#4-2-3-generic-components-of-a-transaction). As a consequence, there are some common components in all transactions: ``` transaction$_ account_addr:uint256 lt:uint64 outmsg_cnt:uint15 @@ -1523,9 +1529,9 @@ transaction$_ account_addr:uint256 lt:uint64 outmsg_cnt:uint15 old:^X new:^X = MERKLE_UPDATE X; ``` -The exclamation mark in the TL-B declaration of a `merkle_update` indicates special processing required for such values. In particular, they must be kept in a separate cell, which must be marked as *exotic* by a bit in its header (cf. [[4](#ref-3) 3.1]). +The exclamation mark in the TL-B declaration of a `merkle_update` indicates special processing required for such values. In particular, they must be kept in a separate cell, which must be marked as [*exotic*](/foundations/whitepapers/tvm#3-1-7-types-of-exotic-cells) by a bit in its header. -A full explanation of the serialization of *TransactionDescr*, which describes one transaction according to its kind listed in [4.2.4](#4-2-4-kinds-of-transactions), can be found in [4.3](#4-3-transaction-descriptions). +A full explanation of the serialization of *TransactionDescr*, which describes one transaction according to its [kind](#4-2-4-kinds-of-transactions), can be found in [Transaction descriptions](#4-3-transaction-descriptions). ### 4.2.13. Representation of outbound messages generated by a transaction @@ -1550,7 +1556,7 @@ acc_trans$_ account_addr:uint256 The `transactions` dictionary is sum-augmented by a *Grams* value, which aggregates the total fees collected from these transactions. -In addition to this dictionary, an *AccountBlock* contains a Merkle update (cf. [[4](#ref-3) 3.1]) of the total state of the account. 
If an account did not exist before the block, its state is represented by an `account_none`. +In addition to this dictionary, an *AccountBlock* contains a [Merkle update](/foundations/whitepapers/tvm#3-1-7-types-of-exotic-cells) of the total state of the account. If an account did not exist before the block, its state is represented by an `account_none`. ### 4.2.16. Consistency conditions for *AccountBlocks* @@ -1567,7 +1573,7 @@ These conditions simply express the fact that the state of an account may change ### 4.2.17. Collection of all transactions in a block -All transactions in a block are represented by (cf. [1.2.1](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state)): +All transactions in a block are represented by ([1.2.1](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state)): ``` _ (HashmapAugE 256 AccountBlock Grams) = ShardAccountBlocks; @@ -1578,13 +1584,15 @@ _ (HashmapAugE 256 AccountBlock Grams) = ShardAccountBlocks; Again, consistency conditions are imposed on this structure, requiring that the value at key $\xi$ be an *AccountBlock* with address equal to $\xi$. Further consistency conditions relate this structure with the initial and final states of the shardchain indicated in the block, requiring that: - If *ShardAccountBlock* has no key $\xi$, then the state of account $\xi$ in the initial and in the final state of the block must coincide (or it must be absent from both). -- If $\xi$ is present in *ShardAccountBlock*, its initial and final states as indicated in AccountBlock must match those indicated in the initial and final states of the shardchain block, expressed by instances of *ShardAccounts* (cf. [4.1.9](#4-1-9-the-combined-state-of-all-accounts-in-a-shard)). 
+- If $\xi$ is present in *ShardAccountBlock*, its initial and final states as indicated in AccountBlock must match those indicated in the initial and final states of the shardchain block, expressed by instances of [*ShardAccounts*](#4-1-9-the-combined-state-of-all-accounts-in-a-shard). These conditions express that the shardchain state is indeed composed out of the states of separate accountchains. +--- + ## 4.3 Transaction descriptions -This section presents the specific TL-B schemes for transaction descriptions according to the classification provided in [4.2.4](#4-2-4-kinds-of-transactions). +This section presents the specific TL-B schemes for transaction descriptions according to the [classification](#4-2-4-kinds-of-transactions). ### 4.3.1. Reasons for omitting data from a transaction description @@ -1596,7 +1604,7 @@ If we compress the sequence of all intermediate steps of the virtual machine int The above considerations notwithstanding, there are still several reasons to introduce some details in the transaction description: -- We want to impose external consistency conditions on the transaction, so that at least the validity of the value flow inside the transaction and the validity of inbound and outbound messages can be quickly checked without invoking the virtual machine (cf. [4.2.14](#4-2-14-consistency-conditions-for-transactions)). This at least guarantees the invariance of the total amount of each cryptocurrency in the blockchain, even if it does not guarantee the correctness of its distribution. +- We want to impose external [consistency conditions on the transaction](#4-2-14-consistency-conditions-for-transactions), so that at least the validity of the value flow inside the transaction and the validity of inbound and outbound messages can be quickly checked without invoking the virtual machine. 
This at least guarantees the invariance of the total amount of each cryptocurrency in the blockchain, even if it does not guarantee the correctness of its distribution. - We want to be able to trace principal state changes of an account (such as its being created, activated, or frozen) by inspecting the data stored in the transaction description, without figuring out the missing details of the transaction. This simplifies the verification of the consistency conditions between the accountchain and shardchain states in a block. - Finally, certain information—such as the total steps of the virtual machine, the hashes of its initial and final states, the total gas consumed, and the exit code—might considerably simplify the debugging and implementation of the TON Blockchain software. (This information would help a human programmer understand what has happened in a particular blockchain block.) @@ -1668,7 +1676,7 @@ If there is no reason to skip the computing phase, TVM is invoked and the result - The `gas_limit` parameter reflects the gas limit for this instance of TVM. It equals the lesser of either the Grams credited in the credit phase from the value of the inbound message divided by the current gas price, or the global per-transaction gas limit. - The `gas_credit` parameter may be non-zero only for external inbound messages. It is the lesser of either the amount of gas that can be paid from the account balance or the maximum gas credit. - The `exit_code` and `exit_args` parameters represent the status values returned by TVM. 
-- The `vm_init_state_hash` and `vm_final_state_hash` parameters are the representation hashes of the original and resulting states of TVM, and `vm_steps` is the total number of steps performed by TVM (usually equal to two plus the number of instructions executed, including implicit RETs).[33](#fn33) +- The `vm_init_state_hash` and `vm_final_state_hash` parameters are the representation hashes of the original and resulting states of TVM, and `vm_steps` is the total number of steps performed by TVM (usually equal to two plus the number of instructions executed, including implicit RETs).[33](#fn33) ### 4.3.8. Description of the action phase @@ -1765,17 +1773,19 @@ trans_merge_install$0111 split_info:SplitMergeInfo = TransactionDescr; ``` +--- + ## 4.4 Invoking smart contracts in TVM This section describes the exact parameters with which TVM is invoked during the computing phase of ordinary and other transactions. ### 4.4.1 Smart-contract code -The *code* of a smart contract is normally a part of the account's persistent state, at least if the account is *active* (cf. [4.1.6](#4-1-6-account-description)). However, a frozen or uninitialized (or non-existent) account has no persistent state, with the possible exception of the account's balance and the hash of its intended state (equal to the account address for uninitialized accounts). In this case, the code must be supplied in the `init` field of the inbound message being processed by the transaction (cf. [3.1.7](#3-1-7-message-layout)). +The *code* of a smart contract is normally a part of the account's persistent state, at least if the account is [*active*](#4-1-6-account-balance). However, a frozen or uninitialized (or non-existent) account has no persistent state, with the possible exception of the account's balance and the hash of its intended state (equal to the account address for uninitialized accounts). 
In this case, the code must be supplied in the [`init`](#3-1-7-message-layout) field of the inbound message being processed by the transaction. ### 4.4.2. Smart-contract persistent data -The *persistent data* of a smart contract is kept alongside its code, and remarks similar to those made above in [4.4.1](#4-4-1-smart-contract-code) apply. In this respect, the code and persistent data of a smart contract are just two parts of its persistent state, which differ only in the way they are treated by TVM during smart-contract execution. +The *persistent data* of a smart contract is kept alongside its code, and remarks similar to those made [above](#4-4-1-smart-contract-code) apply. In this respect, the code and persistent data of a smart contract are just two parts of its persistent state, which differ only in the way they are treated by TVM during smart-contract execution. ### 4.4.3. Smart-contract library environment @@ -1783,7 +1793,7 @@ The *library environment* of a smart contract is a hashmap mapping 256-bit cell The library environment for an invocation of a smart contract is computed as follows: -1. The global library environment for the workchain in question is taken from the current state of the masterchain.[34](#fn34) +1. The global library environment for the workchain in question is taken from the current state of the masterchain.[34](#fn34) 2. Next, it is augmented by the local library environment of the smart contract, stored in the `library` field of the smart contract's state. Only 256-bit keys equal to the hashes of the corresponding value cells are taken into account. If a key is present in both the global and local library environments, the local environment takes precedence while merging the two library environments. 3. Finally, the message library stored in the `library` field of the init field of the inbound message is similarly taken into account. 
Notice, however, that if the account is frozen or uninitialized, the `library` field of the message is part of the suggested state of the account, and is used instead of the local library environment in the previous step. The message library has lower precedence than both the local and the global library environments. @@ -1791,24 +1801,24 @@ The library environment for an invocation of a smart contract is computed as fol A new instance of TVM is initialized prior to the execution of a smart contract as follows: -- The original $\texttt{cc}$ (current continuation) is initialized using the cell slice created from the cell $\texttt{code}$, containing the code of the smart contract computed as described in [4.4.1](#4-4-1-smart-contract-code). +- The original $\texttt{cc}$ (current continuation) is initialized using the cell slice created from the cell $\texttt{code}$, containing the code of the [smart contract](#4-4-1-smart-contract-code). - The $\texttt{cp}$ (TVM codepage) is set to zero. If the smart contract wants to use another TVM codepage $x$, it must switch to it by using $\texttt{SETCODEPAGE}$ $x$ as the first instruction of its code. - Control register $\texttt{c0}$ (return continuation) is initialized by extraordinary continuation `ec_quit` with parameter 0. When executed, this continuation leads to a termination of TVM with exit code 0. - Control register $\texttt{c1}$ (alternative return continuation) is initialized by extraordinary continuation `ec_quit` with parameter 1. When invoked, it leads to a termination of TVM with exit code 1. (Notice that terminating with exit code 0 or 1 is considered a successful termination.) - Control register $\texttt{c2}$ (exception handler) is initialized by extraordinary continuation `ec_quit_exc`. When invoked, it takes the top integer from the stack (equal to the exception number) and terminates TVM with exit code equal to that integer. 
In this way, by default all exceptions terminate the smart-contract execution with exit code equal to the exception number. - Control register $\texttt{c3}$ (code dictionary) is initialized by the cell with the smart-contract code, similarly to the initial current continuation ($\texttt{cc}$). -- Control register $\texttt{c4}$ (root of persistent data) is initialized by the persistent data of the smart contract.[35](#fn35) -- Control register $\texttt{c5}$ (root of actions) is initialized by an empty cell. The "output action" primitives of TVM, such as $\texttt{SENDMSG}$, use $\texttt{c5}$ to accumulate the list of actions (e.g., outbound messages) to be performed upon successful termination of the smart contract (cf. [4.2.7](#4-2-7-processing-of-an-inbound-message-is-split-between-computing-and-action-phases) and [4.2.8](#4-2-8-reasons-for-splitting-the-processing-into-computation-and-action-phases)). -- Control register $\texttt{c7}$ (root of temporary data) is initialized by a singleton *Tuple*, the only component of which is a *Tuple* containing an instance of *SmartContractInfo* with smart contract balance and other useful information (cf. [4.4.10](#4-4-10-smart-contract-information)). The smart contract may replace the temporary data, especially all components of the *Tuple* at $\texttt{c7}$ but the first one, with whatever other temporary data it may require. However, the original content of the *SmartContractInfo* at the first component of the *Tuple* held in $\texttt{c7}$ is inspected and sometimes modified by $\texttt{SENDMSG}$ TVM primitives and other "output action" primitives of TVM. +- Control register $\texttt{c4}$ (root of persistent data) is initialized by the persistent data of the smart contract.[35](#fn35) +- Control register $\texttt{c5}$ (root of actions) is initialized by an empty cell. 
The "output action" primitives of TVM, such as $\texttt{SENDMSG}$, use $\texttt{c5}$ to accumulate the list of actions (e.g., outbound messages) to be performed upon successful termination of the smart contract ([4.2.7](#4-2-7-processing-of-an-inbound-message-is-split-between-computing-and-action-phases) and [4.2.8](#4-2-8-reasons-for-splitting-the-processing-into-computation-and-action-phases)). +- Control register $\texttt{c7}$ (root of temporary data) is initialized by a singleton *Tuple*, the only component of which is a *Tuple* containing an instance of *SmartContractInfo* with smart contract balance and other [useful information](#4-4-10-smart-contract-information). The smart contract may replace the temporary data, especially all components of the *Tuple* at $\texttt{c7}$ but the first one, with whatever other temporary data it may require. However, the original content of the *SmartContractInfo* at the first component of the *Tuple* held in $\texttt{c7}$ is inspected and sometimes modified by $\texttt{SENDMSG}$ TVM primitives and other "output action" primitives of TVM. - The *gas limits* $\texttt{gas}=(g_m,g_l,g_c,g_r)$ are initialized as follows: - - The *maximal gas limit* $g_m$ is set to the lesser of either the total Gram balance of the smart contract (after the the credit phase—i.e., combined with the value of the inbound message) divided by the current gas price, or the per-execution global gas limit.[36](#fn36) + - The *maximal gas limit* $g_m$ is set to the lesser of either the total Gram balance of the smart contract (after the credit phase, i.e., combined with the value of the inbound message) divided by the current gas price, or the per-execution global gas limit.[36](#fn36) - The *current gas limit* $g_l$ is set to the lesser of either the Gram value of the inbound message divided by the gas price, or the global per-execution gas limit. In this way, $g_l\leq g_m$ always holds. 
For inbound external messages $g_l=0$, since they cannot carry any value. - The *gas credit* $g_c$ is set to zero for inbound internal messages, and to the lesser of either $g_m$ or a fixed small value (the default external message gas credit, a configurable parameter) for inbound external messages. - Finally, the *remaining gas* limit $g_r$ is automatically initialized by $g_l+g_c$. ### 4.4.5. The initial stack of TVM for processing an internal message -After TVM is initialized as described in [4.4.4](#4-4-4-the-initial-state-of-tvm), its stack is initialized by pushing the arguments to the main() function of the smart contract as follows: +After [TVM is initialized](#4-4-4-the-initial-state-of-tvm), its stack is initialized by pushing the arguments to the main() function of the smart contract as follows: - The Gram balance $b$ of the smart contract (after crediting the value of the inbound message) is passed as an *Integer* amount of nanograms. - The Gram balance $b_m$ of inbound message $m$ is passed as an *Integer* amount of nanograms. @@ -1830,7 +1840,7 @@ The smart contract must terminate with $g_c=0$ or $g_r\geq g_c$; otherwise, the ### 4.4.7. Processing tick and tock transactions -The TVM stack for processing tick and tock transactions (cf. [4.2.4](#4-2-4-kinds-of-transactions)) is initialized by pushing the following values: +The TVM stack for processing [tick and tock transactions](#4-2-4-kinds-of-transactions) is initialized by pushing the following values: - The Gram balance $b$ of the current account in nanograms (an *Integer*). - The 256-bit address $\xi$ of the current account inside the masterchain, represented by an unsigned *Integer*. @@ -1839,10 +1849,10 @@ The TVM stack for processing tick and tock transactions (cf. [4.2.4](#4-2-4-kind ### 4.4.8. Processing split prepare transactions -For processing split prepare transactions (cf. 
[4.3.13](#4-3-13-split-prepare-and-install-transactions)), the TVM stack is initialized by pushing the following values: +For processing [split prepare transactions](#4-3-13-split-prepare-and-install-transactions), the TVM stack is initialized by pushing the following values: - The Gram balance $b$ of the current account. -- A Slice containing *SplitMergeInfo* (cf. [4.3.13](#4-3-13-split-prepare-and-install-transactions)). +- A Slice containing [*SplitMergeInfo*](#4-3-13-split-prepare-and-install-transactions). - The 256-bit address $\xi$ of the current account. - The 256-bit address $\tilde\xi$ of the sibling account. - An integer $0\leq d\leq 63$, equal to the position of the only bit in which $\xi$ and $\tilde\xi$ differ. @@ -1850,13 +1860,13 @@ For processing split prepare transactions (cf. [4.3.13](#4-3-13-split-prepare-an ### 4.4.9. Processing merge install transactions -For processing merge install transactions (cf. [4.3.14](#4-3-14-merge-prepare-and-install-transactions)), the TVM stack is initialized by pushing the following values: +For processing [merge install transactions](#4-3-14-merge-prepare-and-install-transactions), the TVM stack is initialized by pushing the following values: - The Gram balance $b$ of the current account (already combined with the Gram balance of the sibling account). - The Gram balance $b'$ of the sibling account, taken from the inbound message $m$. - The message $m$ from the sibling account, automatically generated by a merge prepare transaction. Its `init` field contains the final state $\tilde S$ of the sibling account. -- The state $\tilde S$ of the sibling account, represented by a *StateInit* (cf. [3.1.7](#3-1-7-message-layout)). -- A *Slice* containing *SplitMergeInfo* (cf. [4.3.13](#4-3-13-split-prepare-and-install-transactions)). +- The state $\tilde S$ of the sibling account, represented by a [*StateInit*](#3-1-7-message-layout). 
+- A *Slice* containing [*SplitMergeInfo*](#4-3-13-split-prepare-and-install-transactions). - The 256-bit address $\xi$ of the current account. - The 256-bit address $\tilde\xi$ of the sibling account. - An integer $0\leq d\leq 63$, equal to the position of the only bit in which $\xi$ and $\tilde\xi$ differ. @@ -1874,7 +1884,7 @@ The smart-contract information structure *SmartContractInfo*, passed in the firs ] = SmartContractInfo; ``` -In other words, the first component of this tuple is an *Integer* `magic` always equal to `0x076ef1ea`, the second component is an *Integer* `actions`, originally initialized by zero, but incremented by one whenever an output action is installed by a non-`RAW` output action primitive of the TVM, and so on. The remaining balance is represented by a pair, i.e., a two-component *Tuple*: the first component is the nanogram balance, and the second component is a dictionary with 32-bit keys representing all other currencies, if any (cf. [3.1.6](#3-1-6-representing-collections-of-arbitrary-currencies)). +In other words, the first component of this tuple is an *Integer* `magic` always equal to `0x076ef1ea`, the second component is an *Integer* `actions`, originally initialized by zero, but incremented by one whenever an output action is installed by a non-`RAW` output action primitive of the TVM, and so on. The remaining balance is represented by a pair, i.e., a two-component *Tuple*: the first component is the nanogram balance, and the second component is a dictionary with 32-bit keys representing [all other currencies](#3-1-6-representing-collections-of-arbitrary-currencies), if any. The `rand_seed` field (an unsigned 256-bit integer) here is initialized deterministically starting from the `rand_seed` of the block, the account address, the hash of the inbound message being processed (if any), and the transaction logical time `trans_lt`. 
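
The gas-limit initialization rules stated in 4.4.4 can be illustrated by a minimal Python sketch. The function name `init_gas_limits`, its parameter names, and the use of floor division on nanogram amounts are assumptions made for illustration only; the text above specifies only that each limit is "the lesser of" the listed quantities.

```python
def init_gas_limits(balance_after_credit, msg_value, gas_price,
                    global_gas_limit, max_gas_credit, is_external):
    """Return (g_m, g_l, g_c, g_r) following the rules of 4.4.4.

    All names are hypothetical. Amounts are in nanograms, gas_price is in
    nanograms per gas unit, and floor division is an assumed rounding rule.
    """
    # Maximal gas limit: what the whole balance (after the credit phase)
    # can pay for, capped by the per-execution global gas limit.
    g_m = min(balance_after_credit // gas_price, global_gas_limit)

    # Current gas limit: what the inbound message value alone can pay for.
    # External messages carry no value, so g_l is zero for them.
    value = 0 if is_external else msg_value
    g_l = min(value // gas_price, global_gas_limit)

    # Gas credit: zero for internal messages; for external messages,
    # the lesser of g_m and the configurable default gas credit.
    g_c = min(g_m, max_gas_credit) if is_external else 0

    # Remaining gas starts as the sum of the current limit and the credit.
    g_r = g_l + g_c
    return g_m, g_l, g_c, g_r


internal = init_gas_limits(10_000, 4_000, 10, 1_000_000, 500, False)
external = init_gas_limits(10_000, 0, 10, 1_000_000, 500, True)
print(internal)  # (1000, 400, 0, 400)
print(external)  # (1000, 0, 500, 500)
```

In both cases $g_l\leq g_m$, since the value of the inbound message is already included in the balance after the credit phase.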
@@ -1889,14 +1899,13 @@ out_list$_ {n:#} prev:^(OutList n) action:OutAction action_send_msg#0ec3c86d out_msg:^(Message Any) = OutAction; action_set_code#ad4de08e new_code:^Cell = OutAction; ``` - --- # 5 Block layout This chapter presents the block layout used by the TON Blockchain, combining the data structures described separately in previous chapters to produce a complete description of a shardchain block. In addition to the TL-B schemes that define the representation of a shardchain block by a tree of cells, this chapter describes exact serialization formats for the resulting bags (collections) of cells, which are necessary to represent a shardchain block as a file. -Masterchain blocks are similar to shardchain blocks, but have some additional fields. The necessary modifications are discussed separately in [5.2](#5-2-masterchain-block-layout). +Masterchain blocks are similar to shardchain blocks, but have some additional fields. The necessary modifications are discussed separately in [Masterchain block layout](#5-2-masterchain-block-layout). ## 5.1 Shardchain block layout @@ -1906,24 +1915,24 @@ This section lists the data structures that must be contained in a shardchain bl The shardchain state consists of: -- *ShardAccounts*, the split part of the shardchain state (cf. [1.2.2](#1-2-2-split-and-non-split-part-of-the-shardchain-block-and-state)) containing the state of all accounts assigned to this shard (cf. [4.1.9](#4-1-9-the-combined-state-of-all-accounts-in-a-shard)). -- *OutMsgQueue*, the output message queue of the shardchain (cf. [3.3.6](#3-3-6-structure-of-outmsgqueue)). +- *ShardAccounts*, the [split part of the shardchain state](#1-2-2-split-and-non-split-part-of-the-shardchain-block-and-state) containing the state of [all accounts assigned to this shard](#4-1-9-the-combined-state-of-all-accounts-in-a-shard). +- [*OutMsgQueue*](#3-3-6-structure-of-outmsgqueue), the output message queue of the shardchain. 
- *SharedLibraries*, the description of all shared libraries of the shardchain (for now, non-empty only in the masterchain). - The logical time and the unixtime of the last modification of the state. - The total balance of the shard. -- A hash reference to the most recent masterchain block, indirectly describing the state of the masterchain and, through it, the state of all other shardchains of the TON Blockchain (cf. [1.5.2](#1-5-2-total-state-defined-by-a-shardchain-block )). +- A hash reference to the most recent masterchain block, indirectly describing the state of the masterchain and, through it, the [state of all other shardchains](#1-5-2-total-state-defined-by-a-shardchain-block) of the TON Blockchain. ### 5.1.2. Components of a shardchain block A shardchain block must contain: -- A list of *validator signatures* (cf. [1.2.6](#1-2-6-validator-signatures%2C-signed-and-unsigned-blocks)), which is external with respect to all other contents of the block. -- *BlockHeader*, containing general information about the block (cf. [1.2.5](#1-2-5-block-header)) +- A list of [*validator signatures*](#1-2-6-validator-signatures%2C-signed-and-unsigned-blocks), which is external with respect to all other contents of the block. +- [*BlockHeader*](#1-2-5-block-header), containing general information about the block. - Hash references to the immediately preceding block or blocks of the same shardchain, and to the most recent masterchain block. -- *InMsgDescr* and *OutMsgDescr*, the inbound and outbound message descriptors (cf. [3.2.8](#3-2-8-structure-of-inmsgdescr) and [3.3.5](#3-3-5-structure-of-outmsgdescr)). -- *ShardAccountBlocks*, the collection of all transactions processed in the block (cf. [4.2.17](#4-2-17-collection-of-all-transactions-in-a-block)) along with all updates of the states of the accounts assigned to the shard. This is the *split* part of the shardchain block (cf. [1.2.2](#1-2-2-split-and-non-split-part-of-the-shardchain-block-and-state)). 
+- [*InMsgDescr*](#3-2-8-structure-of-inmsgdescr) and [*OutMsgDescr*](#3-3-5-structure-of-outmsgdescr), the inbound and outbound message descriptors. +- *ShardAccountBlocks*, the [collection of all transactions processed in the block](#4-2-17-collection-of-all-transactions-in-a-block) along with all updates of the states of the accounts assigned to the shard. This is the *split* part of the [shardchain block](#1-2-2-split-and-non-split-part-of-the-shardchain-block-and-state). - The *value flow*, describing the total value imported from the preceding blocks of the same shardchain and from inbound messages, the total value exported by outbound messages, the total fees collected by validators, and the total value remaining in the shard. -- A *Merkle update* (cf. [[4](#ref-4), 3.1]) of the shardchain state. Such a Merkle update contains the hashes of the initial and final shardchain states with respect to the block, along with all new cells of the final state that have been created while processing the block.[37](#fn37) +- A [*Merkle update*](/foundations/whitepapers/tvm#3-1-7-types-of-exotic-cells) of the shardchain state. Such a Merkle update contains the hashes of the initial and final shardchain states with respect to the block, along with all new cells of the final state that have been created while processing the block.[37](#fn37) ### 5.1.3. Common parts of the block layout for all workchains @@ -1936,7 +1945,7 @@ Recall that different workchains may define their own rules for processing messa ### 5.1.4. TL-B scheme for the shardchain state -The shardchain state (cf. 
[1.2.1](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state) and [5.1.1](#5-1-1-components-of-the-shardchain-state)) is serialized according to the following TL-B scheme: +The shardchain state ([1.2.1](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state) and [5.1.1](#5-1-1-components-of-the-shardchain-state)) is serialized according to the following TL-B scheme: ``` ext_blk_ref$_ start_lt:uint64 end_lt:uint64 @@ -1973,7 +1982,7 @@ Here `publishers` is a hashmap with keys equal to the addresses of all accounts ### 5.1.6. TL-B scheme for an unsigned shardchain block -The precise format of an *unsigned* (cf. [1.2.6](#1-2-6-validator-signatures%2C-signed-and-unsigned-blocks)) shardchain block is given by the following TL-B scheme: +The precise format of an [*unsigned*](#1-2-6-validator-signatures%2C-signed-and-unsigned-blocks) shardchain block is given by the following TL-B scheme: ``` block_info version:uint32 @@ -2034,13 +2043,13 @@ signed_block block:^Block blk_serialize_hash:uint256 = SignedBlock; ``` -The *serialization hash* `blk_serialize_hash` of the unsigned block `block` is essentially a hash of a specific serialization of the block into an octet string (cf. [5.3.12](#5-3-12-the-serialization-hash-of-a-block) for a more detailed explanation). The signatures collected in `signatures` are Ed25519-signatures (cf. [A.3](#a-3-ed25519-cryptography)) made with a validator's private keys of the SHA-256 of the concatenation of the 256-bit representation hash of the block `block` and of its 256-bit serialization hash `blk_serialize_hash`. The 64-bit keys in dictionary `signatures` represent the first 64 bits of the public keys of the corresponding validators. +The *serialization hash* `blk_serialize_hash` of the unsigned block `block` is essentially a hash of a specific [serialization](#5-3-12-the-serialization-hash-of-a-block) of the block into an octet string. 
The signatures collected in `signatures` are [Ed25519-signatures](#a-3-ed25519-cryptography) made with a validator's private keys of the SHA-256 of the concatenation of the 256-bit representation hash of the block `block` and of its 256-bit serialization hash `blk_serialize_hash`. The 64-bit keys in dictionary `signatures` represent the first 64 bits of the public keys of the corresponding validators. ### 5.1.9. Serialization of a signed block The overall procedure of serializing and signing a block may be described as follows: -1. An unsigned block $B$ is generated, transformed into a complete bag of cells (cf. [5.3.2](#5-3-2-complete-bags-of-cells)), and serialized into an octet string $S_B$. +1. An unsigned block $B$ is generated, transformed into a [complete bag of cells](#5-3-2-complete-bags-of-cells), and serialized into an octet string $S_B$. 2. Validators sign the 256-bit combined hash @@ -2050,7 +2059,7 @@ The overall procedure of serializing and signing a block may be described as fol of the representation hash of $B$ and of the Merkle hash of its serialization $S_B$. -3. A signed shardchain block $\tilde B$ is generated from $B$ and these validator signatures as described above (cf. [5.1.8](#5-1-8-signed-shardchain-block)). +3. A [signed shardchain block](#5-1-8-signed-shardchain-block) $\tilde B$ is generated from $B$ and these validator signatures. 4. This signed block $\tilde B$ is transformed into an incomplete bag of cells, which contains only the validator signatures, but the unsigned block itself is absent from this bag of cells, being its only absent cell. @@ -2058,13 +2067,15 @@ The overall procedure of serializing and signing a block may be described as fol The result is the serialization of the signed block into an octet string. It may be propagated by network or stored into a disk file. +--- + ## 5.2 Masterchain block layout -Masterchain blocks are very similar to shardchain blocks of the basic workchain. 
This section lists some of the modifications needed to obtain the description of a masterchain block from the description of a shardchain block given in [5.1](#5-1-shardchain-block-layout). +Masterchain blocks are very similar to shardchain blocks of the basic workchain. This section lists some of the modifications needed to obtain the description of a masterchain block from the description of a [shardchain block](#5-1-shardchain-block-layout). ### 5.2.1 Additional components present in the masterchain state -In addition to the components listed in [5.1.1](#5-1-1-components-of-the-shardchain-state), the masterchain state must contain: +In addition to the [components of a shardchain state](#5-1-1-components-of-the-shardchain-state), the masterchain state must contain: - *ShardHashes* — Describes the current shard configuration, and contains the hashes of the latest blocks of the corresponding shardchains. - *ShardFees* — Describes the total fees collected by the validators of each shardchain. @@ -2073,7 +2084,7 @@ In addition to the components listed in [5.1.1](#5-1-1-components-of-the-shardch ### 5.2.2. Additional components present in masterchain blocks -In addition to the components listed in [5.1.2](#5-1-2-components-of-a-shardchain-block), each masterchain block must contain: +In addition to the [components of a shardchain block](#5-1-2-components-of-a-shardchain-block), each masterchain block must contain: - *ShardHashes* — Describes the current shard configuration, and contains the hashes of the latest blocks of the corresponding shardchains. (Notice that this component is also present in the masterchain state.) @@ -2112,11 +2123,11 @@ _ (HashmapAugE 32 ^(BinTreeAug True CurrencyCollection) CurrencyCollection) = ShardFees; ``` -The structure of *ShardFees* is similar to that of *ShardHashes* (cf. 
[5.2.3](#5-2-3-description-of-shardhashes)), but the dictionary and binary trees involved are augmented by currency values, equal to the `total_validator_fees` values of the final states of the corresponding shardchain blocks. The value aggregated at the root of *ShardFees* is added together with the `total_validator_fees` of the masterchain state, yielding the total TON Blockchain validator fees. The increase of the value aggregated at the root of *ShardFees* from the initial to the final state of a masterchain block is reflected in the `fees_imported` in the value flow of that masterchain block. +The structure of *ShardFees* is similar to that of [*ShardHashes*](#5-2-3-description-of-shardhashes), but the dictionary and binary trees involved are augmented by currency values, equal to the `total_validator_fees` values of the final states of the corresponding shardchain blocks. The value aggregated at the root of *ShardFees* is added together with the `total_validator_fees` of the masterchain state, yielding the total TON Blockchain validator fees. The increase of the value aggregated at the root of *ShardFees* from the initial to the final state of a masterchain block is reflected in the `fees_imported` in the value flow of that masterchain block. ### 5.2.5. Description of *ConfigParams* -Recall that the *configurable parameters* or the *configuration dictionary* is a dictionary `config` with 32-bit keys kept inside the first cell reference of the persistent data of the configuration smart contract $\gamma$ (cf. [1.6](#1-6-configurable-parameters-and-smart-contracts)). The address $\gamma$ of the configuration smart contract and a copy of the configuration dictionary are duplicated in fields `config_addr` and `config` of a *ConfigParams* structure, explicitly included into masterchain state to facilitate access to the current values of the configurable parameters (cf. 
[1.6.3](#1-6-3-quick-access-through-the-header-of-masterchain-blocks)): +Recall that the [*configurable parameters*](#1-6-configurable-parameters-and-smart-contracts) or the *configuration dictionary* is a dictionary `config` with 32-bit keys kept inside the first cell reference of the persistent data of the configuration smart contract $\gamma$. The address $\gamma$ of the configuration smart contract and a copy of the configuration dictionary are duplicated in fields `config_addr` and `config` of a *ConfigParams* structure, explicitly included into masterchain state to [facilitate access](#1-6-3-quick-access-through-the-header-of-masterchain-blocks) to the current values of the configurable parameters: ``` _ config_addr:uint256 config:^(Hashmap 32 ^Cell) @@ -2125,7 +2136,7 @@ _ config_addr:uint256 config:^(Hashmap 32 ^Cell) ### 5.2.6. Masterchain state data -The data specific to the masterchain state is collected into *McStateExtra*, already mentioned in [5.1.4](#5-1-4-tl-b-scheme-for-the-shardchain-state): +The [data specific to the masterchain state](#5-1-4-tl-b-scheme-for-the-shardchain-state) is collected into *McStateExtra*: ``` masterchain_state_extra#cc1f @@ -2145,6 +2156,8 @@ masterchain_block_extra#cc9f = McBlockExtra; ``` +--- + ## 5.3 Serialization of a bag of cells The description provided in the previous section defines the way a shardchain block is represented as a tree of cells. However, this tree of cells needs to be serialized into a file, suitable for disk storage or network transfer. This section discusses the standard ways of serializing a tree, a DAG, or a bag of cells into an octet string. @@ -2179,7 +2192,7 @@ If cells are listed in a topological order, then the verification that there are The serialization process of a bag of cells $B$ consisting of $n$ cells can be outlined as follows: 1. List the cells from $B$ in a topological order: $c_0$, $c_1$, ..., $c_{n-1}$. Then $c_0$ is the root cell of $B$. -2. 
Choose an integer $s$, such that $n\leq 2^s$. Represent each cell $c_i$ by an integral number of octets in the standard way (cf. [1.1.3](#1-1-3-the-layout-of-a-single-cell) or [[4, 3.1.4](#ref-4)]), but using unsigned big-endian $s$-bit integer $j$ instead of hash $\text{Hash}(c_j)$ to represent internal references to cell $c_j$ (cf. [5.3.6](#5-3-6-serialization-of-one-cell-from-a-bag-of-cells) below). +2. Choose an integer $s$, such that $n\leq 2^s$. Represent each cell $c_i$ by an integral number of octets in the standard way ([1.1.3](#1-1-3-the-layout-of-a-single-cell) or [4, 3.1.4](/foundations/whitepapers/tvm#3-1-4-standard-cell-representation)), but using unsigned big-endian $s$-bit integer $j$ instead of hash $\text{Hash}(c_j)$ to represent [internal references](#5-3-6-serialization-of-one-cell-from-a-bag-of-cells) to cell $c_j$. 3. Concatenate the representations of cells $c_i$ thus obtained in the increasing order of $i$. 4. Optionally, an index can be constructed that consists of $n+1$ $t$-bit integer entries $L_0$, ..., $L_n$, where $L_i$ is the total length (in octets) of the representations of cells $c_j$ with $j\leq i$, and integer $t\geq0$ is chosen so that $L_n\leq 2^t$. 5. The serialization of the bag of cells now consists of a magic number indicating the precise format of the serialization, followed by integers $s\geq 0$, $t\geq0$, $n\leq 2^s$, an optional index consisting of $\lceil(n+1)t/8\rceil$ octets, and $L_n$ octets with the cell representations. @@ -2191,19 +2204,19 @@ If an index is included, any cell $c_i$ in the serialized bag of cells may be ea More precisely, each individual cell $c=c_i$ is serialized as follows, provided $s$ is a multiple of eight (usually $s=8$, $16$, $24$, or $32$): -1. 
Two descriptor bytes $d_1$ and $d_2$ are computed similarly to [[4, 3.1.4](#ref-4)] by setting $d_1=r+8s+16h+32l$ and $d_2=\lfloor b/8\rfloor+\lceil b/8\rceil$, where: - - $0\leq r\leq 4$ is the number of cell references present in cell $c$; if $c$ is absent from the bag of cells being serialized and is represented by its hashes only, then $r=7$.[38](#fn38) +1. Two descriptor bytes $d_1$ and $d_2$ are computed similarly to [Standard cell representation](/foundations/whitepapers/tvm#3-1-4-standard-cell-representation) by setting $d_1=r+8s+16h+32l$ and $d_2=\lfloor b/8\rfloor+\lceil b/8\rceil$, where: + - $0\leq r\leq 4$ is the number of cell references present in cell $c$; if $c$ is absent from the bag of cells being serialized and is represented by its hashes only, then $r=7$.[38](#fn38) - $0\leq b\leq 1023$ is the number of data bits in cell $c$. - - $0\leq l\leq 3$ is the level of cell $c$ (cf. [[4, 3.1.3](#ref-4)]). + - $0\leq l\leq 3$ is the [level of cell](/foundations/whitepapers/tvm#3-1-3-the-level-of-a-cell) $c$. - $s=1$ for exotic cells and $s=0$ for ordinary cells. - $h=1$ if the cell's hashes are explicitly included into the serialization; otherwise, $h=0$. (When $r=7$, we must always have $h=1$.) For absent cells (i.e., external references), only $d_1$ is present, always equal to $23+32l$. 2. Two bytes $d_1$ and $d_2$ (if $r<7$) or one byte $d_1$ (if $r=7$) begin the serialization of cell $c$. -3. If $h=1$, the serialization is continued by $l+1$ 32-byte higher hashes of $c$ (cf. [[4, 3.1.6](#ref-4)]): $\text{Hash}_1(c)$, ..., $\text{Hash}_{l+1}(c)=\text{Hash}_\infty(c)$. -4. After that, $\lceil b/8\rceil$ data bytes are serialized, by splitting $b$ data bits into 8-bit groups and interpreting each group as a big-endian integer in the range $0\ldots255$. If $b$ is not divisible by $8$, then the data bits are first augmented by one binary $1$ and up to six binary $0$, so as to make the number of data bits divisible by eight.[39](#fn39) -5. 
Finally, $r$ cell references to cells $c_{j_1}$, ..., $c_{j_r}$ are encoded by means of $r$ $s$-bit big-endian integers $j_1$, ..., $j_r$.[40](#fn40) +3. If $h=1$, the serialization is continued by $l+1$ 32-byte [higher hashes](/foundations/whitepapers/tvm#3-1-6-the-higher-hashes-of-a-cell) of $c$: $\text{Hash}_1(c)$, ..., $\text{Hash}_{l+1}(c)=\text{Hash}_\infty(c)$. +4. After that, $\lceil b/8\rceil$ data bytes are serialized, by splitting $b$ data bits into 8-bit groups and interpreting each group as a big-endian integer in the range $0\ldots255$. If $b$ is not divisible by $8$, then the data bits are first augmented by one binary $1$ and up to six binary $0$, so as to make the number of data bits divisible by eight.[39](#fn39) +5. Finally, $r$ cell references to cells $c_{j_1}$, ..., $c_{j_r}$ are encoded by means of $r$ $s$-bit big-endian integers $j_1$, ..., $j_r$.[40](#fn40) ### 5.3.7. A classification of serialization schemes for bags of cells @@ -2211,13 +2224,13 @@ A serialization scheme for a bag of cells must specify the following parameters: - The 4-byte magic number prepended to the serialization. - The number of bits $s$ used to represent cell indices. Usually $s$ is a multiple of eight (e.g., $8$, $16$, $24$, or $32$). -- The number of bits $t$ used to represent offsets of cell serializations (cf. [5.3.5](#5-3-5-outline-of-serialization-process)). Usually $t$ is also a multiple of eight. +- The number of bits $t$ used to represent offsets of [cell serializations](#5-3-5-outline-of-serialization-process). Usually $t$ is also a multiple of eight. - A flag indicating whether an index with offsets $L_0$, ..., $L_n$ of cell serializations is present. This flag may be combined with $t$ by setting $t=0$ when the index is absent. - A flag indicating whether the CRC32-C of the whole serialization is appended to it for integrity verification purposes. ### 5.3.8. 
Fields present in the serialization of a bag of cells -In addition to the values listed in [5.3.7](#5-3-7-a-classification-of-serialization-schemes-for-bags-of-cells), fixed by the choice of a serialization scheme for bags of cells, the serialization of a specific bag of cells must specify the following parameters: +In addition to the [values](#5-3-7-a-classification-of-serialization-schemes-for-bags-of-cells) fixed by the choice of a serialization scheme for bags of cells, the serialization of a specific bag of cells must specify the following parameters: - The total number of cells $n$ present in the serialization. - The number of "root cells" $k\leq n$ present in the serialization. The root cells themselves are $c_0$, ..., $c_{k-1}$. All other cells present in the bag of cells are expected to be reachable by chains of references starting from the root cells. @@ -2285,7 +2298,7 @@ compiled_smart_contract tiny_string#_ len:(#<= 126) str:(len * [ uint8 ]) = TinyString; ``` -Then a compiled smart contract may be represented by a value of type *CompiledSmartContract*, transformed into a tree of cells and then into a bag of cells, and then serialized using one of the constructors listed in [5.3.9](#5-3-9-tl-b-scheme-for-serializing-bags-of-cells). The resulting octet string may be then written into a file with suffix `.tvc` ("TVM smart contract"), and this file may be used to distribute the compiled smart contract, download it into a wallet application for deploying into the TON Blockchain, and so on. +Then a compiled smart contract may be represented by a value of type *CompiledSmartContract*, transformed into a tree of cells and then into a bag of cells, and then serialized using one of the [constructors](#5-3-9-tl-b-scheme-for-serializing-bags-of-cells). 
The resulting octet string may be then written into a file with suffix `.tvc` ("TVM smart contract"), and this file may be used to distribute the compiled smart contract, download it into a wallet application for deploying into the TON Blockchain, and so on. ### 5.3.11. Merkle hashes for an octet string @@ -2309,7 +2322,7 @@ One can check that $\text{Hash}_M(s)=\text{Hash}_M(t)$ for octet strings $s$ and ### 5.3.12. The serialization hash of a block -The construction of [5.3.11](#5-3-11-merkle-hashes-for-an-octet-string) is applied in particular to the serialization of the bag of cells representing an unsigned shardchain or masterchain block. The validators sign not only the representation hash of the unsigned block, but also the "serialization hash" of the unsigned block, defined as $\text{Hash}_M$ of the serialization of the unsigned block. In this way, the validators certify that this octet string is indeed a serialization of the corresponding block. +The construction of [Merkle hashes for an octet string](#5-3-11-merkle-hashes-for-an-octet-string) is applied in particular to the serialization of the bag of cells representing an unsigned shardchain or masterchain block. The validators sign not only the representation hash of the unsigned block, but also the "serialization hash" of the unsigned block, defined as $\text{Hash}_M$ of the serialization of the unsigned block. In this way, the validators certify that this octet string is indeed a serialization of the corresponding block. --- @@ -2317,7 +2330,7 @@ The construction of [5.3.11](#5-3-11-merkle-hashes-for-an-octet-string) is appli This appendix contains a formal description of the elliptic curve cryptography currently used in TON, particularly in the TON Blockchain and the TON Network. -TON uses two forms of elliptic curve cryptography: Ed25519 is used for cryptographic Schnorr signatures, while Curve25519 is used for asymmetric cryptography. 
These curves are used in the standard way (as defined in the original articles [[1](#ref-1)] and [[2](#ref-2)] by D. Bernstein and RFCs 7748 and 8032); however, some serialization details specific to TON must be explained. One unique adaptation of these curves for TON is that TON supports automatic conversion of Ed25519 keys into Curve25519 keys, so that the same keys can be used for signatures and for asymmetric cryptography. +TON uses two forms of elliptic curve cryptography: Ed25519 is used for cryptographic Schnorr signatures, while Curve25519 is used for asymmetric cryptography. These curves are used in the standard way (as defined in the original articles [[1](#references)] and [[2](#references)] by D. Bernstein and RFCs 7748 and 8032); however, some serialization details specific to TON must be explained. One unique adaptation of these curves for TON is that TON supports automatic conversion of Ed25519 keys into Curve25519 keys, so that the same keys can be used for signatures and for asymmetric cryptography. ## A.1 Elliptic curves @@ -2415,7 +2428,7 @@ Elliptic curve cryptography usually deals with a fixed cyclic subgroup $C$ of a ### A.1.11. Private and public keys for elliptic curve cryptography -Usually a private key for elliptic curve cryptography described by the data listed in [A.1.9](#a-1-9-data-for-elliptic-curve-cryptography) is a "random" integer $0[[1]](#references)] and its usage in TON. ### A.2.1. Curve25519 @@ -2486,15 +2501,17 @@ TON uses another form for public and private keys of Curve25519 cryptography, bo A private key for TON Curve25519 cryptography is just a random 256-bit string $k$. It is used by computing $\text{Sha512}(k)$, taking the first 256 bits of the result, interpreting them as a little-endian 256-bit integer $a$, clearing bits $0$, $1$, $2$, and $255$ of $a$, and setting bit $254$ so as to obtain a value $2^{254}\leq a<2^{255}$, divisible by eight. 
The value $a$ thus obtained is the *secret exponent* corresponding to $k$; meanwhile, the remaining 256 bits of $\text{Sha512}(k)$ constitute the *secret salt* $k''$. -The public key corresponding to $k$—or to the secret exponent $a$—is just the $x$-coordinate $x_A$ of the point $A:=[a]G$. Once $a$ and $x_A$ are computed, they are used in exactly the same way as in [A.2.3](#a-2-3-private-and-public-keys-for-standard-curve25519-cryptography). In particular, if $x_A$ needs to be serialized, it is serialized into 32 octets as an unsigned little-endian 256-bit integer. +The public key corresponding to $k$—or to the secret exponent $a$—is just the $x$-coordinate $x_A$ of the point $A:=[a]G$. Once $a$ and $x_A$ are computed, they are used in exactly the [same way](#a-2-3-private-and-public-keys-for-standard-curve25519-cryptography). In particular, if $x_A$ needs to be serialized, it is serialized into 32 octets as an unsigned little-endian 256-bit integer. ### A.2.5. Curve25519 is used in the TON Network -Notice that the asymmetric Curve25519 cryptography described in [A.2.4](#a-2-4-public-and-private-keys-for-ton-curve25519-cryptography) is extensively used by the TON Network, especially the ADNL (Abstract Datagram Network Layer) protocol. However, TON Blockchain needs elliptic curve cryptography mostly for signatures. For this purpose, Ed25519 signatures described in the next section are used. +Notice that the asymmetric [Curve25519 cryptography](#a-2-4-public-and-private-keys-for-ton-curve25519-cryptography) is extensively used by the TON Network, especially the ADNL (Abstract Datagram Network Layer) protocol. However, TON Blockchain needs elliptic curve cryptography mostly for signatures. For this purpose, Ed25519 signatures described in the next section are used. + +--- ## A.3 Ed25519 cryptography -Ed25519 cryptography is extensively used for fast cryptographic signatures by both the TON Blockchain and the TON Network. 
This section describes the variant of Ed25519 cryptography used by TON. An important difference from the standard approaches (as defined by D. Bernstein et al. in [[2](#ref-2)]) is that TON provides automatic conversion of private and public Ed25519 keys into Curve25519 keys, so that the same keys could be used both for encrypting/decrypting and for signing messages. +Ed25519 cryptography is extensively used for fast cryptographic signatures by both the TON Blockchain and the TON Network. This section describes the variant of Ed25519 cryptography used by TON. An important difference from the standard approaches (as defined by D. Bernstein et al. in [[2]](#references)) is that TON provides automatic conversion of private and public Ed25519 keys into Curve25519 keys, so that the same keys could be used both for encrypting/decrypting and for signing messages. ### A.3.1. Twisted Edwards curves @@ -2570,7 +2587,7 @@ In this way, Ed25519-curve $E_{-1,d}$ is birationally equivalent to Curve25519 ( ### A.3.6. Generator of Ed25519 -The generator of Ed25519 is the point $G'$ with $y(G')=4/5$ and $0\leq x(G')**[1]** Daniel J. Bernstein, *Curve25519: New Diffie--Hellman Speed Records* (2006), in: M. Yung, Ye. Dodis, A. Kiayas et al, *Public Key Cryptography*, Lecture Notes in Computer Science 3958, pp. 207--228. Available at https://cr.yp.to/ecdh/curve25519-20060209.pdf. +[1] Daniel J. Bernstein, *Curve25519: New Diffie--Hellman Speed Records*... -**[2]** Daniel J. Bernstein, Niels Duif, Tanja Lange et al., *High-speed high-security signatures* (2012), *Journal of Cryptographic Engineering* 2 (2), pp. 77--89. Available at https://ed25519.cr.yp.to/ed25519-20110926.pdf. +[2] Daniel J. Bernstein, Niels Duif, Tanja Lange et al., *High-speed high-security signatures*... -**[3]** N. Durov, *Telegram Open Network*, 2017. +[3] N. Durov, *Telegram Open Network*, 2017. -**[4]** N. Durov, *Telegram Open Network Virtual Machine*, 2018. +[4] N. 
Durov, *Telegram Open Network Virtual Machine*, 2018. ## Footnotes -**1** As of August 2018, this document does not include a detailed description of serialized invalidity proofs, because they are likely to change significantly during the development of the validator software. Only the general design principles for consistency conditions and serialized invalidity proofs are discussed. [Back ↑](#introduction) +**1** As of August 2018, this document does not include a detailed description of serialized invalidity proofs, because they are likely to change significantly during the development of the validator software. Only the general design principles for consistency conditions and serialized invalidity proofs are discussed. [Back ↑](#ref-fn1) -**2** This is not included in the present version of this document, but will be provided in a separate appendix to a future revision. [Back ↑](#introduction) +**2** This is not included in the present version of this document, but will be provided in a separate appendix to a future revision. [Back ↑](#ref-fn2) -**3** Completely identical cells are often identified in memory and in disk storage; this is the reason why trees of cells are transparently transformed into DAGs of cells. From this perspective, a DAG is just a storage optimization of the underlying tree of cells, irrelevant for most considerations. [Back ↑](#1-1-1-tvm-cells) +**3** Completely identical cells are often identified in memory and in disk storage; this is the reason why trees of cells are transparently transformed into DAGs of cells. From this perspective, a DAG is just a storage optimization of the underlying tree of cells, irrelevant for most considerations. [Back ↑](#ref-fn3) -**4** Cf. [[4](#ref-4)], 3.3.3-4], where an example is given and explained, pending a more complete reference. [Back ↑](#1-1-1-tvm-cells) +**4** [[4](#ref-4), 3.3.3–4], where an example is given and explained, pending a more complete reference. 
[Back ↑](#ref-fn4) -**5** If there are no transactions related to an account, the corresponding virtual block is empty and is omitted in the shardchain block. [Back ↑](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state) +**5** If there are no transactions related to an account, the corresponding virtual block is empty and is omitted in the shardchain block. [Back ↑](#ref-fn5) -**6** Recall that TON Blockchain supports *dynamic* sharding, so the shard configuration may change from block to block because of shard merge and split events. Therefore, we cannot simply say that each shardchain corresponds to a fixed set of accountchains. [Back ↑](#1-2-1-the-infinite-sharding-paradigm-isp-applied-to-blockchain-block-and-state) +**6** Recall that TON Blockchain supports *dynamic* sharding, so the shard configuration may change from block to block because of shard merge and split events. Therefore, we cannot simply say that each shardchain corresponds to a fixed set of accountchains. [Back ↑](#ref-fn6) -**7** This condition applies if there is exactly one immediate antecessor (i.e., if a shardchain merge event did not occur immediately before the block in question); otherwise, this condition becomes more convoluted. [Back ↑](#1-2-3-interaction-with-other-blocks-and-the-outside-world-global-and-local-consistency-conditions) +**7** This condition applies if there is exactly one immediate antecessor (i.e., if a shardchain merge event did not occur immediately before the block in question); otherwise, this condition becomes more convoluted. [Back ↑](#ref-fn7) -**8** This example is a bit simplified since it does not take into account the presence of transit messages in InMsgDescr, which are not processed by any explicit transaction. 
[Back ↑](#1-3-8-example%3A-consistency-condition-for-inmsgdescr) +**8** This example is a bit simplified since it does not take into account the presence of transit messages in InMsgDescr, which are not processed by any explicit transaction. [Back ↑](#ref-fn8) -**9** It is interesting to note that this part of the work can be done almost automatically. [Back ↑](#1-3-12-witnesses-of-the-invalidity-of-a-block) +**9** It is interesting to note that this part of the work can be done almost automatically. [Back ↑](#ref-fn9) -**10** In order to express this condition correctly in the presence of dynamic sharding, one should fix some account $\xi$, and consider the latest blocks $S$ and $S'$ of the shardchains containing $\xi$ in the shard configurations of both $B$ and $B'$, since the shards containing $\xi$ might be different in $B$ and $B'$. [Back ↑](#1-5-1-total-state-defined-by-a-masterchain-block) +**10** In order to express this condition correctly in the presence of dynamic sharding, one should fix some account $\xi$, and consider the latest blocks $S$ and $S'$ of the shardchains containing $\xi$ in the shard configurations of both $B$ and $B'$, since the shards containing $\xi$ might be different in $B$ and $B'$. [Back ↑](#ref-fn10) -**11** Value-bearing messages with the `bounce` flag set will not be accepted by an uninitialized account, but will be "bounced" back. [Back ↑](#1-7-2-transferring-cryptocurrency-to-uninitialized-accounts) +**11** Value-bearing messages with the `bounce` flag set will not be accepted by an uninitialized account, but will be "bounced" back. [Back ↑](#ref-fn11) -**12** "Messages to nowhere" may have some special fields in their body indicating their destination outside the TON Blockchain—for instance, an account in some other blockchain, or an IP address and port—which may be interpreted by the third-party software appropriately. Such fields are ignored by the TON Blockchain. 
[Back ↑](#2-1-3-external-messages-with-no-source-or-destination-address)
+**12** "Messages to nowhere" may have some special fields in their body indicating their destination outside the TON Blockchain—for instance, an account in some other blockchain, or an IP address and port—which may be interpreted by the third-party software appropriately. Such fields are ignored by the TON Blockchain. [Back ↑](#ref-fn12)
-**13** The problem of bypassing possible validator censorship—which could happen, for instance, if all validators conspire not to include external messages sent to accounts belonging to some set of blacklisted accounts—is dealt with separately elsewhere. The main idea is that the validators may be forced to promise to include a message with a known hash in a future block, without knowing anything about the identity of the sender or the receiver; they will have to keep this promise afterwards when the message itself with pre-agreed hash is presented. [Back ↑](#2-1-3-external-messages-with-no-source-or-destination-address)
+**13** The problem of bypassing possible validator censorship—which could happen, for instance, if all validators conspire not to include external messages sent to accounts belonging to some set of blacklisted accounts—is dealt with separately elsewhere. The main idea is that the validators may be forced to promise to include a message with a known hash in a future block, without knowing anything about the identity of the sender or the receiver; they will have to keep this promise afterwards when the message itself with pre-agreed hash is presented. [Back ↑](#ref-fn13)
-**14** However, the internal routing process described in [2.1.11](#sp-hr-int-route) is applied immediately after that, which may further modify the transit address. [Back ↑](#2-1-4-transit-and-next-hop-addresses)
+**14** However, the internal routing process described in [Internal routing](#2-1-11-internal-routing) is applied immediately after that, which may further modify the transit address. [Back ↑](#ref-fn14)
-**15** When the addresses involved are of different lengths (e.g., because they belong to different workchains), one should consider only the first 96 bits of the addresses in the above formula. [Back ↑](#2-1-8-hamming-optimality-of-the-next-hop-address-algorithm)
+**15** When the addresses involved are of different lengths (e.g., because they belong to different workchains), one should consider only the first 96 bits of the addresses in the above formula. [Back ↑](#ref-fn15)
-**16** Instead of Hamming optimality, we might have considered the equivalent property of Kademlia optimality, written for the Kademlia (or weighted $L_1$) distance as given by $\|\xi-\eta\|_K:=\sum_i2^{-i}|\xi_i-\eta_i|$ instead of the Hamming distance. [Back ↑](#2-1-8-hamming-optimality-of-the-next-hop-address-algorithm)
+**16** Instead of Hamming optimality, we might have considered the equivalent property of Kademlia optimality, written for the Kademlia (or weighted $L_1$) distance as given by $\|\xi-\eta\|_K:=\sum_i2^{-i}|\xi_i-\eta_i|$ instead of the Hamming distance. [Back ↑](#ref-fn16)
-**17** Notice that the next-hop and internal-routing computations are still applied to such messages, since the current shardchain may be split before the message is processed. In this case, the new sub-shardchain containing the destination address will inherit the message. [Back ↑](#2-1-13-any-shard-is-a-neighbor-of-itself)
+**17** Notice that the next-hop and internal-routing computations are still applied to such messages, since the current shardchain may be split before the message is processed. In this case, the new sub-shardchain containing the destination address will inherit the message. [Back ↑](#ref-fn17)
-**18** We may define the (virtual) output queue of an account(chain) as the subset of the OutMsgQueue of the shard currently containing that account that consists of messages with transit addresses equal to the address of the account. [Back ↑](#2-1-14-hypercube-routing-and-the-isp)
+**18** We may define the (virtual) output queue of an account(chain) as the subset of the OutMsgQueue of the shard currently containing that account that consists of messages with transit addresses equal to the address of the account. [Back ↑](#ref-fn18)
-**19** In particular, if the hash of a recent block of a neighboring shardchain is not yet reflected in the latest masterchain block, its modifications to OutMsgQueue must not be taken into account. [Back ↑](#2-2-5-logical-time-monotonicity-importing-the-oldest-message-from-the-neighbors)
+**19** In particular, if the hash of a recent block of a neighboring shardchain is not yet reflected in the latest masterchain block, its modifications to OutMsgQueue must not be taken into account. [Back ↑](#ref-fn19)
-**20** This statement is not as trivial as it seems at first, because some of the shardchains involved may split or merge during the routing. A correct proof may be obtained by adopting the ISP perspective to HR as explained in [2.1.14](#2-1-14-hypercube-routing-and-the-isp) and observing that $m'$ will always be behind $m$, either in terms of the intermediate accountchain reached or, if they happen to be in the same accountchain, in terms of logical creation time.
+**20** This statement is not as trivial as it seems at first, because some of the shardchains involved may split or merge during the routing. A correct proof may be obtained by adopting the ISP perspective to HR as explained in [Hypercube Routing and the ISP](#2-1-14-hypercube-routing-and-the-isp) and observing that $m'$ will always be behind $m$, either in terms of the intermediate accountchain reached or, if they happen to be in the same accountchain, in terms of logical creation time. [Back ↑](#ref-fn20)
-**21** One must not only look up the key $\text{Hash}(m)$ in the InMsgDescr of these blocks, but also check the intermediate addresses in the envelope of the corresponding entry, if found. [Back ↑](#2-3-6-checking-whether-an-hr-message-has-already-been-delivered-via-hr-to-its-final-destination-or-an-intermediate-shardchain)
+**21** One must not only look up the key $\text{Hash}(m)$ in the InMsgDescr of these blocks, but also check the intermediate addresses in the envelope of the corresponding entry, if found. [Back ↑](#ref-fn21)
-**22** A description of an older version of TL may be found at https://core.telegram.org/mtproto/TL. Alternatively, an informal introduction to TL-B schemes may be found in [[4](#ref-4)]. [Back ↑](#3-1-1-some-standard-definitions)
+**22** A description of an older version of TL may be found at https://core.telegram.org/mtproto/TL. Alternatively, an informal introduction to TL-B schemes may be found in [[4]](#references). [Back ↑](#ref-fn22)
-**23** Address rewriting is a feature used to implement "anycast addresses" employed by the so-called large or global smart contracts (cf. [[3](#ref-3)]), which can have instances in several shardchains. When address rewriting is enabled, a message may be routed to and processed by a smart contract with an address coinciding with the destination address up to the first $d$ bits, where $d\leq 32$ is the "splitting depth" of the smart contract indicated in the `anycast.depth` field (cf. [2.1.6](#2-1-6-support-for-anycast-addresses)). Otherwise, the addresses must match exactly. [Back ↑](#3-1-4-internal-addresses)
+**23** Address rewriting is a feature used to implement "anycast addresses" employed by the so-called large or global smart contracts (see [[3]](#references)), which can have instances in several shardchains. When address rewriting is enabled, a message may be routed to and processed by a smart contract with an address coinciding with the destination address up to the first $d$ bits, where $d\leq 32$ is the "splitting depth" of the smart contract indicated in the `anycast.depth` field ([Support for anycast addresses](#2-1-6-support-for-anycast-addresses)). Otherwise, the addresses must match exactly. [Back ↑](#ref-fn23)
-**24** More precisely, the information from the `init` field of an inbound message is used either when the receiving account is uninitialized or frozen with the hash of StateInit equal to the one expected by the account, or when the receiving account is active, and its code or data is an external hash reference matching the hash of the code or data received in the StateInit of the message. [Back ↑](#3-1-9-code-and-data-portions-contained-in-a-message)
+**24** More precisely, the information from the `init` field of an inbound message is used either when the receiving account is uninitialized or frozen with the hash of StateInit equal to the one expected by the account, or when the receiving account is active, and its code or data is an external hash reference matching the hash of the code or data received in the StateInit of the message. [Back ↑](#ref-fn24)
-**25** Strictly speaking, InMsgDescr is the type of this structure; we deliberately use the same notation to describe the only instance of this type in a block. [Back ↑](#3-2-inbound-message-descriptors)
+**25** Strictly speaking, InMsgDescr is the type of this structure; we deliberately use the same notation to describe the only instance of this type in a block. [Back ↑](#ref-fn25)
-**26** Recall that a shardchain is considered a neighbor of itself. [Back ↑](#3-2-1-types-and-sources-of-inbound-messages)
+**26** Recall that a shardchain is considered a neighbor of itself. [Back ↑](#ref-fn26)
-**27** This situation is rare and occurs only after shardchain merge events. Normally the messages imported from the OutMsgQueue of the same shardchain have destinations inside this shardchain, and are processed accordingly instead of being re-queued. [Back ↑](#3-3-2-message-dequeueing-records)
+**27** This situation is rare and occurs only after shardchain merge events. Normally the messages imported from the OutMsgQueue of the same shardchain have destinations inside this shardchain, and are processed accordingly instead of being re-queued. [Back ↑](#ref-fn27)
-**28** For simplicity, we sometimes treat the masterchain as just another workchain with $\mathit{workchain\_id}=-1$. [Back ↑](#4-1-1-account-addresses)
+**28** For simplicity, we sometimes treat the masterchain as just another workchain with $\mathit{workchain\_id}=-1$. [Back ↑](#ref-fn28)
-**29** In fact, up to the first $d$ bits are replaced in such a way that each shard contains at most one instance of the large smart contract, and that shards $(w,s)$ with prefix $s$ of length $|s|\leq d$ contain exactly one instance. [Back ↑](#4-1-3-small-and-large-smart-contracts)
+**29** In fact, up to the first $d$ bits are replaced in such a way that each shard contains at most one instance of the large smart contract, and that shards $(w,s)$ with prefix $s$ of length $|s|\leq d$ contain exactly one instance. [Back ↑](#ref-fn29)
-**30** This scheme uses anonymous constructors and anonymous fields, both represented by an underscore `_`. [Back ↑](#4-1-6-account-description)
+**30** This scheme uses anonymous constructors and anonymous fields, both represented by an underscore `_`. [Back ↑](#ref-fn30)
-**31** In particular, if a user mistakenly sends some funds to a non-existent address in a bounceable message, the funds will not be wasted, but rather will be returned (bounced) back. Therefore, a user wallet application should set the bounce flag in all generated messages by default unless explicitly instructed otherwise. However, non-bounceable messages are indispensable in some situations (cf. 1.7.6). [Back ↑](#4-2-6-bouncing-inbound-messages-to-non-existent-accounts)
+**31** In particular, if a user mistakenly sends some funds to a non-existent address in a bounceable message, the funds will not be wasted, but rather will be returned (bounced) back. Therefore, a user wallet application should set the bounce flag in all generated messages by default unless explicitly instructed otherwise. However, non-bounceable messages are indispensable in some situations ([Using non-bounceable messages](#1-7-6-using-non-bounceable-messages)). [Back ↑](#ref-fn31)
-**32** A reference implementation of a TVM emulator running in a stripped-down version of TVM may be committed into the masterchain to be used when a disagreement between the validators on a specific run of TVM arises. In this way, flawed implementations of TVM may be detected. The reference implementation then serves as an authoritative source on the operational semantics of TVM. (Cf. [[2](#ref-2)] B.2) [Back ↑](#4-2-8-reasons-for-splitting-the-processing-into-computation-and-action-phases)
+**32** A reference implementation of a TVM emulator running in a stripped-down version of TVM may be committed into the masterchain to be used when a disagreement between the validators on a specific run of TVM arises. In this way, flawed implementations of TVM may be detected. The reference implementation then serves as an authoritative source on the operational semantics of TVM. ([[2]](#references) B.2) [Back ↑](#ref-fn32)
-**33** Notice that this record does not represent a change in the state of the account, because the transaction may still be aborted during the action phase. In that case, the new persistent data indirectly referenced by vm_final_state_hash will be discarded. [Back ↑](#4-3-7-valid-computing-phase)
+**33** Notice that this record does not represent a change in the state of the account, because the transaction may still be aborted during the action phase. In that case, the new persistent data indirectly referenced by vm_final_state_hash will be discarded. [Back ↑](#ref-fn33)
-**34** The most common way of creating shared libraries for TVM is to publish a reference to the root cell of the library in the masterchain. [Back ↑](#4-4-3-smart-contract-library-environment)
+**34** The most common way of creating shared libraries for TVM is to publish a reference to the root cell of the library in the masterchain. [Back ↑](#ref-fn34)
-**35** The persistent data of the smart contract need not be loaded in its entirety for this to occur. Instead the root is loaded, and TVM may load other cells by their references from the root only when they are accessed, thus providing a form of virtual memory. [Back ↑](#4-4-4-the-initial-state-of-tvm)
+**35** The persistent data of the smart contract need not be loaded in its entirety for this to occur. Instead the root is loaded, and TVM may load other cells by their references from the root only when they are accessed, thus providing a form of virtual memory. [Back ↑](#ref-fn35)
-**36** Both the global gas limit and the gas price are configurable parameters determined by the current state of the masterchain. [Back ↑](#4-4-4-the-initial-state-of-tvm)
+**36** Both the global gas limit and the gas price are configurable parameters determined by the current state of the masterchain. [Back ↑](#ref-fn36)
-**37** In principle, an experimental version of TON Blockchain might choose to keep only the hashes of the initial and final states of the shardchain. The Merkle update increases the block size, but it is handy for full nodes that want to keep and update their copy of the shardchain state. Otherwise, the full nodes would have to repeat all the computations contained in a block to compute the updated state of the shardchain by themselves. [Back ↑](#5-1-2-components-of-a-shardchain-block)
+**37** In principle, an experimental version of TON Blockchain might choose to keep only the hashes of the initial and final states of the shardchain. The Merkle update increases the block size, but it is handy for full nodes that want to keep and update their copy of the shardchain state. Otherwise, the full nodes would have to repeat all the computations contained in a block to compute the updated state of the shardchain by themselves. [Back ↑](#ref-fn37)
-**38** Notice that these "absent cells" are different from the library reference and external reference cells, which are kinds of exotic cells (cf. [[4, 3.1.7](#ref-4)]). Absent cells, by contrast, are introduced only for the purpose of serializing incomplete bags of cells, and can never be processed by TVM. [Back ↑](#5-3-6-serialization-of-one-cell-from-a-bag-of-cells)
+**38** Notice that these "absent cells" are different from the library reference and external reference cells, which are kinds of exotic cells ([[4]](#references), 3.1.7). Absent cells, by contrast, are introduced only for the purpose of serializing incomplete bags of cells, and can never be processed by TVM. [Back ↑](#ref-fn38)
-**39** Notice that exotic cells (with $s=1$) always have $b\geq8$, with the cell type encoded in the first eight data bits (cf. [[4, 3.1.7](#ref-4)]). [Back ↑](#5-3-6-serialization-of-one-cell-from-a-bag-of-cells)
+**39** Notice that exotic cells (with $s=1$) always have $b\geq8$, with the cell type encoded in the first eight data bits ([[4]](#references), 3.1.7). [Back ↑](#ref-fn39)
-**40** If the bag of cells is not complete, some of these cell references may refer to cells $c'$ absent from the bag of cells. In that case, special "absent cells" with $r=7$ are included into the bag of cells and are assigned some indices $j$. These indices are then used to represent references to absent cells. [Back ↑](#5-3-6-serialization-of-one-cell-from-a-bag-of-cells)
+**40** If the bag of cells is not complete, some of these cell references may refer to cells $c'$ absent from the bag of cells. In that case, special "absent cells" with $r=7$ are included into the bag of cells and are assigned some indices $j$. These indices are then used to represent references to absent cells. [Back ↑](#ref-fn40)
-**41** Arithmetic modulo $p$ for a modulus $p$ near a power of two can be implemented very efficiently. On the other hand, residues modulo $2^{255}-19$ can be represented by 255-bit integers. This is the reason this particular value of $p$ has been chosen by D. Bernstein. [Back ↑](#a-1-1-finite-fields)
+**41** Arithmetic modulo $p$ for a modulus $p$ near a power of two can be implemented very efficiently. On the other hand, residues modulo $2^{255}-19$ can be represented by 255-bit integers. This is the reason this particular value of $p$ has been chosen by D. Bernstein. [Back ↑](#ref-fn41)
-**42** Actually, D. Bernstein chose $A=486662$ because it is the smallest positive integer $A\equiv 2\pmod 4$ such that both the corresponding Montgomery curve ([31](#a-2-1-curve25519)) over $\mathbb{F}_p$ for $p=2^{255}-19$ and the quadratic twist of this curve have small cofactors. Such an arrangement avoids the necessity to check whether an $x$-coordinate $x_P\in\mathbb{F}_p$ of a point $P$ defines a point $(x_P,y_P)\in\mathbb{F}_p^2$ lying on the Montgomery curve itself or on its quadratic twist. [Back ↑](#a-2-1-curve25519)
\ No newline at end of file
+**42** Actually, D. Bernstein chose $A=486662$ because it is the smallest positive integer $A\equiv 2\pmod 4$ such that both the corresponding Montgomery curve ([31](#a-2-1-curve25519)) over $\mathbb{F}_p$ for $p=2^{255}-19$ and the quadratic twist of this curve have small cofactors. Such an arrangement avoids the necessity to check whether an $x$-coordinate $x_P\in\mathbb{F}_p$ of a point $P$ defines a point $(x_P,y_P)\in\mathbb{F}_p^2$ lying on the Montgomery curve itself or on its quadratic twist. [Back ↑](#ref-fn42)
\ No newline at end of file