diff --git a/foundations/whitepapers/tvm.mdx b/foundations/whitepapers/tvm.mdx
index 43fa1de74..2faf1379c 100644
--- a/foundations/whitepapers/tvm.mdx
+++ b/foundations/whitepapers/tvm.mdx
@@ -8,6 +8,7 @@ description: "Whitepaper by Dr. Nikolai Durov"
**Date**: March 23, 2020
: [Original whitepaper, PDF](/resources/pdfs/tvm.pdf)
+
## Abstract
The aim of this text is to provide a description of the Telegram Open Network Virtual Machine (TON VM or TVM), used to execute smart contracts in the TON Blockchain.
@@ -20,9 +21,9 @@ Additionally, TVM must meet the following requirements:
- It must provide for possible future extensions and improvements while retaining backward compatibility and interoperability, because the code of a smart contract, once committed into the blockchain, must continue working in a predictable manner regardless of any future modifications to the VM.
- It must strive to attain high "(virtual) machine code" density, so that the code of a typical smart contract occupies as little persistent blockchain storage as possible.
-- It must be completely deterministic. In other words, each run of the same code with the same input data must produce the same result, regardless of specific software and hardware used.[1](#fn1)
+- It must be completely deterministic. In other words, each run of the same code with the same input data must produce the same result, regardless of specific software and hardware used.[1](#fn1)
-The design of TVM is guided by these requirements. While this document describes a preliminary and experimental version of TVM,[2](#fn2) the backward compatibility mechanisms built into the system allow us to be relatively unconcerned with the efficiency of the operation encoding used for TVM code in this preliminary version.
+The design of TVM is guided by these requirements. While this document describes a preliminary and experimental version of TVM,[2](#fn2) the backward compatibility mechanisms built into the system allow us to be relatively unconcerned with the efficiency of the operation encoding used for TVM code in this preliminary version.
TVM is not intended to be implemented in hardware (e.g., in a specialized microprocessor chip); rather, it should be implemented in software running on conventional hardware. This consideration lets us incorporate some high-level concepts and operations in TVM that would require convoluted microcode in a hardware implementation but pose no significant problems for a software implementation. Such operations are useful for achieving high code density and minimizing the byte (or storage cell) profile of smart-contract code when deployed in the TON Blockchain.
@@ -60,7 +61,7 @@ In some cases, it is more convenient to assume the completion is enabled by defa
## 1.1 TVM is a stack machine
-First of all, *TVM is a stack machine*. This means that, instead of keeping values in some "variables" or "general-purpose registers", they are kept in a (LIFO) *stack*, at least from the "low-level" (TVM) perspective.[3](#fn3)
+First of all, *TVM is a stack machine*. This means that, instead of keeping values in some "variables" or "general-purpose registers", they are kept in a (LIFO) *stack*, at least from the "low-level" (TVM) perspective.[3](#fn3)
Most operations and user-defined functions take their arguments from the top of the stack, and replace them with their result. For example, the integer addition primitive (built-in operation) $\texttt{ADD}$ does not take any arguments describing which registers or immediate values should be added together and where the result should be stored. Instead, the two top values are taken from the stack, they are added together, and their sum is pushed into the stack in their place.
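+
+For instance, a minimal Python sketch of this stack discipline (the concrete values and the list standing in for the stack are purely illustrative):
+
+```python
+# A plain Python list stands in for the TVM stack (top = end of list).
+stack = [7, 35]
+
+def ADD(stack):
+    # ADD names no registers or immediates: it pops the two topmost
+    # values and pushes their sum in their place.
+    y = stack.pop()
+    x = stack.pop()
+    stack.append(x + y)
+
+ADD(stack)
+assert stack == [42]
+```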
@@ -83,18 +84,18 @@ One should bear in mind that one always can implement compilers from statically
A preliminary list of value types supported by TVM is as follows:
- *Integer* — Signed 257-bit integers, representing integer numbers in the range $-2^{256}\ldots2^{256}-1$, as well as a special "not-a-number" value $\texttt{NaN}$.
-- *Cell* — A *TVM cell* consists of at most 1023 bits of data, and of at most four references to other cells. All persistent data (including TVM code) in the TON Blockchain is represented as a collection of TVM cells (cf. [[1](#ref-1)], 2.5.14).
+- *Cell* — A *TVM cell* consists of at most 1023 bits of data, and of at most four references to other cells. All persistent data (including TVM code) in the TON Blockchain is represented as a [collection of TVM cells](/foundations/whitepapers/ton#2-5-14-“everything-is-a-bag-of-cells”-philosophy).
- *Tuple* — An ordered collection of up to 255 components, having arbitrary value types, possibly distinct. May be used to represent non-persistent values of arbitrary algebraic data types.
- *Null* — A type with exactly one value $\bot$, used for representing empty lists, empty branches of binary trees, absence of return value in some situations, and so on.
- *Slice* — A *TVM cell slice*, or *slice* for short, is a contiguous "sub-cell" of an existing cell, containing some of its bits of data and some of its references. Essentially, a slice is a read-only view for a subcell of a cell. Slices are used for unpacking data previously stored (or serialized) in a cell or a tree of cells.
- *Builder* — A *TVM cell builder*, or *builder* for short, is an "incomplete" cell that supports fast operations of appending bitstrings and cell references at its end. Builders are used for packing (or serializing) data from the top of the stack into new cells (e.g., before transferring them to persistent storage).
- *Continuation* — Represents an "execution token" for TVM, which may be invoked (executed) later. As such, it generalizes function addresses (i.e., function pointers and references), subroutine return addresses, instruction pointer addresses, exception handler addresses, closures, partial applications, anonymous functions, and so on.
-This list of value types is incomplete and may be extended in future revisions of TVM without breaking the old TVM code, due mostly to the fact that all originally defined primitives accept only values of types known to them and will fail (generate a type-checking exception) if invoked on values of new types. Furthermore, existing value types themselves can also be extended in the future: for example, 257-bit *Integer* might become 513-bit *LongInteger*, with originally defined arithmetic primitives failing if either of the arguments or the result does not fit into the original subtype *Integer*. Backward compatibility with respect to the introduction of new value types and extension of existing value types will be discussed in more detail later (cf. [5.1.4](#5-1-4-changing-the-behavior-of-old-operations)).
+This list of value types is incomplete and may be extended in future revisions of TVM without breaking the old TVM code, due mostly to the fact that all originally defined primitives accept only values of types known to them and will fail (generate a type-checking exception) if invoked on values of new types. Furthermore, existing value types themselves can also be extended in the future: for example, 257-bit *Integer* might become 513-bit *LongInteger*, with originally defined arithmetic primitives failing if either of the arguments or the result does not fit into the original subtype *Integer*. Backward compatibility with respect to the introduction of new value types and extension of existing value types will be discussed [later](#5-1-4-changing-the-behavior-of-old-operations).
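+
+As a rough illustration of the *Cell* constraints listed above, the following Python sketch (hypothetical, not part of any TVM implementation) encodes the two invariants of a cell: at most 1023 data bits and at most four references.
+
+```python
+from dataclasses import dataclass
+
+@dataclass(frozen=True)
+class Cell:
+    data: str = ''     # up to 1023 bits, kept here as a '0'/'1' string
+    refs: tuple = ()   # up to four references to other cells
+
+    def __post_init__(self):
+        assert len(self.data) <= 1023 and set(self.data) <= {'0', '1'}
+        assert len(self.refs) <= 4
+```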
## 1.2 Categories of TVM instructions
-TVM *instructions*, also called *primitives* and sometimes *(built-in) operations*, are the smallest operations atomically performed by TVM that can be present in the TVM code. They fall into several categories, depending on the types of values (cf. [1.1.3](#1-1-3-preliminary-list-of-value-types)) they work on. The most important of these categories are:
+TVM *instructions*, also called *primitives* and sometimes *(built-in) operations*, are the smallest operations atomically performed by TVM that can be present in the TVM code. They fall into several categories, depending on the [types of values](#1-1-3-preliminary-list-of-value-types) they work on. The most important of these categories are:
- *Stack (manipulation) primitives* — Rearrange data in the TVM stack, so that the other primitives and user-defined functions can later be called with correct arguments. Unlike most other primitives, they are polymorphic, i.e., work with values of arbitrary types.
- *Tuple (manipulation) primitives* — Construct, modify, and decompose *Tuple*s. Similarly to the stack primitives, they are polymorphic.
@@ -121,10 +122,10 @@ The original version of TVM defines and uses the following control registers:
- $\texttt{c0}$ — Contains the *next continuation* or *return continuation* (similar to the subroutine return address in conventional designs). This value must be a *Continuation*.
- $\texttt{c1}$ — Contains the *alternative (return) continuation*; this value must be a *Continuation*. It is used in some (experimental) control flow primitives, allowing TVM to define and call "subroutines with two exit points".
- $\texttt{c2}$ — Contains the *exception handler*. This value is a *Continuation*, invoked whenever an exception is triggered.
-- $\texttt{c3}$ — Contains the *current dictionary*, essentially a hashmap containing the code of all functions used in the program. For reasons explained later in [4.6](#4-6-functions%2C-recursion%2C-and-dictionaries), this value is also a *Continuation*, not a *Cell* as one might expect.
+- $\texttt{c3}$ — Contains the *current dictionary*, essentially a hashmap containing the code of all functions used in the program. This value is also a [*Continuation*](#4-6-functions%2C-recursion%2C-and-dictionaries), not a *Cell* as one might expect.
- $\texttt{c4}$ — Contains the *root of persistent data*, or simply the *data*. This value is a *Cell*. When the code of a smart contract is invoked, $\texttt{c4}$ points to the root cell of its persistent data kept in the blockchain state. If the smart contract needs to modify this data, it changes $\texttt{c4}$ before returning.
- $\texttt{c5}$ — Contains the *output actions*. It is also a *Cell* initialized by a reference to an empty cell, but its final value is considered one of the smart contract outputs. For instance, the $\texttt{SENDMSG}$ primitive, specific for the TON Blockchain, simply inserts the message into a list stored in the output actions.
-- $\texttt{c7}$ — Contains the *root of temporary data*. It is a *Tuple*, initialized by a reference to an empty *Tuple* before invoking the smart contract and discarded after its termination.[4](#fn4)
+- $\texttt{c7}$ — Contains the *root of temporary data*. It is a *Tuple*, initialized by a reference to an empty *Tuple* before invoking the smart contract and discarded after its termination.[4](#fn4)
More control registers may be defined in the future for specific TON Blockchain or high-level programming language purposes, if necessary.
@@ -132,15 +133,15 @@ More control registers may be defined in the future for specific TON Blockchain
The total state of TVM consists of the following components:
-- *Stack* (cf. [1.1](#1-1-tvm-is-a-stack-machine)) — Contains zero or more *values* (cf. [1.1.1](#1-1-1-tvm-values)), each belonging to one of *value types* listed in [1.1.3](#1-1-3-preliminary-list-of-value-types).
-- *Control registers* $\mathit{c0}$–$\mathit{c15}$ — Contain some specific values as described in [1.3.2](#1-3-2-list-of-control-registers). (Only seven control registers are used in the current version.)
+- *[Stack](#1-1-tvm-is-a-stack-machine)* — Contains zero or more [values](#1-1-1-tvm-values), each belonging to one of the [value types](#1-1-3-preliminary-list-of-value-types) listed above.
+- *Control registers* $\mathit{c0}$–$\mathit{c15}$ — Contain some [specific values](#1-3-2-list-of-control-registers). (Only seven control registers are used in the current version.)
- *Current continuation* $\mathit{cc}$ — Contains the current continuation (i.e., the code that would be normally executed after the current primitive is completed). This component is similar to the instruction pointer register ($\texttt{ip}$) in other architectures.
- *Current codepage* $\mathit{cp}$ — A special signed 16-bit integer value that selects the way the next TVM opcode will be decoded. For example, future versions of TVM might use different codepages to add new opcodes while preserving backward compatibility.
- *Gas limits* $\mathit{gas}$ — Contains four signed 64-bit integers: the current gas limit $g_l$, the maximal gas limit $g_m$, the remaining gas $g_r$, and the gas credit $g_c$. Always $0\leq g_l\leq g_m$, $g_c\geq0$, and $g_r\leq g_l+g_c$; $g_c$ is usually initialized by zero, $g_r$ is initialized by $g_l+g_c$ and gradually decreases as the TVM runs. When $g_r$ becomes negative or if the final value of $g_r$ is less than $g_c$, an *out of gas* exception is triggered.
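+
+The gas invariants above can be sketched in Python as follows (a hypothetical helper, not part of any TVM implementation):
+
+```python
+class Gas:
+    """Gas component of the TVM state: g_l, g_m, g_r, g_c."""
+    def __init__(self, g_l: int, g_m: int, g_c: int = 0):
+        assert 0 <= g_l <= g_m and g_c >= 0
+        self.g_l, self.g_m, self.g_c = g_l, g_m, g_c
+        self.g_r = g_l + g_c          # remaining gas starts at g_l + g_c
+
+    def consume(self, amount: int):
+        self.g_r -= amount            # g_r gradually decreases as TVM runs
+        if self.g_r < 0:
+            raise RuntimeError('out of gas')
+
+    def finalize(self):
+        # an out-of-gas exception is also triggered if the final
+        # value of g_r is less than the gas credit g_c
+        if self.g_r < self.g_c:
+            raise RuntimeError('out of gas')
+```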
-Notice that there is no "return stack" containing the return addresses of all previously called but unfinished functions. Instead, only control register $\texttt{c0}$ is used. The reason for this will be explained later in [4.1.9](#4-1-9-subroutine-calls%3A-callx-or-execute-primitives).
+Notice that there is no "return stack" containing the return addresses of all previously called but unfinished functions. Instead, only [control register $\texttt{c0}$](#4-1-9-subroutine-calls%3A-or-primitives) is used.
-Also notice that there are no general-purpose registers, because TVM is a stack machine (cf. [1.1](#1-1-tvm-is-a-stack-machine)). So the above list, which can be summarized as "stack, control, continuation, codepage, and gas" (SCCCG), similarly to the classical SECD machine state ("stack, environment, control, dump"), is indeed the *total* state of TVM.[5](#fn5)
+Also notice that there are no general-purpose registers, because [TVM is a stack machine](#1-1-tvm-is-a-stack-machine). So the above list, which can be summarized as "stack, control, continuation, codepage, and gas" (SCCCG), similarly to the classical SECD machine state ("stack, environment, control, dump"), is indeed the *total* state of TVM.[5](#fn5)
## 1.5 Integer arithmetic
@@ -166,7 +167,7 @@ TVM also has a primitive $\texttt{MODPOW2}$ $n$, which reduces the integer at th
### 1.5.5. *Integer* is 257-bit, not 256-bit
-One can understand now why TVM's *Integer* is (signed) 257-bit, not 256-bit. The reason is that it is the smallest integer type containing both signed 256-bit integers and unsigned 256-bit integers, which does not require automatic reinterpreting of the same 256-bit string depending on the operation used (cf. [1.5.1](#1-5-1-absence-of-automatic-conversion-of-integers)).
+One can understand now why TVM's *Integer* is (signed) 257-bit, not 256-bit. The reason is that it is the smallest integer type containing both signed 256-bit integers and unsigned 256-bit integers, which does not require [automatic reinterpreting](#1-5-1-absence-of-automatic-conversion-of-integers) of the same 256-bit string depending on the operation used.
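+
+The containment claim can be checked directly; the Python ranges below merely restate it:
+
+```python
+INT257 = range(-2**256, 2**256)       # TVM's Integer
+SIGNED_256 = range(-2**255, 2**255)
+UNSIGNED_256 = range(0, 2**256)
+
+for r in (SIGNED_256, UNSIGNED_256):  # both 256-bit ranges fit in Integer
+    assert r.start >= INT257.start and r.stop <= INT257.stop
+```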
### 1.5.6. Division and rounding
@@ -178,9 +179,11 @@ The implementation of division in TVM somewhat differs from most other implement
To simplify implementation of fixed-point arithmetic, TVM supports combined multiply-divide, multiply-shift, and shift-divide operations with double-length (i.e., 514-bit) intermediate product. For example, $\texttt{MULDIVMODR}$ takes three integer arguments from the stack, $a$, $b$, and $c$, first computes $ab$ using a 514-bit intermediate result, and then divides $ab$ by $c$ using rounding to the nearest integer. If $c$ is zero or if the quotient does not fit into *Integer*, either two $\texttt{NaN}$s are returned, or an integer overflow exception is generated, depending on whether a quiet version of the operation has been used. Otherwise, both the quotient and the remainder are pushed into the stack.
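+
+A Python sketch of the round-to-nearest behavior for $c>0$, with the language's arbitrary-precision integers standing in for the 514-bit intermediate product (the tie-breaking rule for exact halves, rounding toward $+\infty$, is an assumption of this sketch):
+
+```python
+def muldivmodr(a: int, b: int, c: int):
+    """MULDIVMODR sketch for c > 0: q = round(a*b/c), r = a*b - q*c."""
+    if c == 0:
+        raise ZeroDivisionError       # the quiet version returns two NaNs
+    ab = a * b                        # exact double-length intermediate product
+    q = (2 * ab + c) // (2 * c)       # floor(ab/c + 1/2) in pure integer arithmetic
+    if not -2**256 <= q < 2**256:
+        raise OverflowError           # quotient does not fit into Integer
+    return q, ab - q * c              # quotient and remainder
+```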
+---
+
# 2 The stack
-This chapter contains a general discussion and comparison of register and stack machines, expanded further in Appendix [C](#appendix-c-code-density-of-stack-and-register-machines), and describes the two main classes of stack manipulation primitives employed by TVM: the *basic* and the *compound stack manipulation primitives*. An informal explanation of their sufficiency for all stack reordering required for correctly invoking other primitives and user-defined functions is also provided. Finally, the problem of efficiently implementing TVM stack manipulation primitives is discussed in [2.3](#2-3-efficiency-of-stack-manipulation-primitives).
+This chapter contains a general discussion and comparison of register and stack machines, expanded further in [Appendix C](#c-code-density-of-stack-and-register-machines), and describes the two main classes of stack manipulation primitives employed by TVM: the *basic* and the *compound stack manipulation primitives*. An informal explanation of their sufficiency for all stack reordering required for correctly invoking other primitives and user-defined functions is also provided. Finally, the problem of efficiently implementing TVM [stack manipulation primitives](#2-3-efficiency-of-stack-manipulation-primitives) will be discussed.
## 2.1 Stack calling conventions
@@ -214,7 +217,7 @@ Some register machine architectures require one of the arguments for most arithm
When compiled for a register machine, high-level language functions usually receive their arguments in certain registers in a predefined order. If there are too many arguments, these functions take the remainder from the stack (yes, a register machine usually has a stack, too!). Some register calling conventions pass no arguments in registers at all, however, and only use the stack (for example, the original calling conventions used in implementations of Pascal and C, although modern implementations of C use some registers as well).
-For simplicity, we will assume that up to $m\leq n$ function arguments are passed in registers, and that these registers are $\texttt{r0}$, $\texttt{r1}$, $\ldots$, $\texttt{r}(m-1)$, in that order (if some other registers are used, we can simply renumber them).[6](#fn6)
+For simplicity, we will assume that up to $m\leq n$ function arguments are passed in registers, and that these registers are $\texttt{r0}$, $\texttt{r1}$, $\ldots$, $\texttt{r}(m-1)$, in that order (if some other registers are used, we can simply renumber them).[6](#fn6)
### 2.1.6. Order of function arguments
@@ -224,7 +227,6 @@ In this respect the TVM stack calling conventions—obeyed, at least, by TVM pri
Of course, an implementation of a high-level language for TVM might choose some other calling conventions for its functions, different from the default ones. This might be useful for certain functions—for instance, if the total number of arguments depends on the value of the first argument, as happens for "variadic functions" such as $\texttt{scanf}$ and $\texttt{printf}$. In such cases, the first one or several arguments are better passed near the top of the stack, not somewhere at some unknown location deep in the stack.
-
### 2.1.7. Arguments to arithmetic primitives on register machines
On a stack machine, built-in arithmetic primitives (such as $\texttt{ADD}$ or $\texttt{DIVMOD}$) follow the same calling conventions as user-defined functions. In this respect, user-defined functions (for example, a function computing the square root of a number) might be considered as "extensions" or "custom upgrades" of the stack machine. This is one of the clearest advantages of stack machines (and of stack programming languages such as Forth) compared to register machines.
@@ -237,7 +239,7 @@ In contrast, arithmetic instructions (built-in operations) on register machines
- One-address form — Always takes one of the arguments from the accumulator $\texttt{r0}$, and stores the result in $\texttt{r0}$ as well; then $i=k=0$, and only $j$ needs to be specified by the instruction. This form is used by some simpler microprocessors (such as Intel 8080).
-Note that this flexibility is available only for built-in operations, but not for user-defined functions. In this respect, register machines are not as easily "upgradable" as stack machines.[7](#fn7)
+Note that this flexibility is available only for built-in operations, but not for user-defined functions. In this respect, register machines are not as easily "upgradable" as stack machines.[7](#fn7)
### 2.1.8. Return values of functions
@@ -251,7 +253,7 @@ Some functions might want to return several values $y_1$, $\ldots$, $y_k$, with
For example, the "divide with remainder" primitive $\texttt{DIVMOD}$ needs to return two values, the quotient $q$ and the remainder $r$. Therefore, $\texttt{DIVMOD}$ pushes $q$ and $r$ into the stack, in that order, so that the quotient is available thereafter at $\texttt{s1}$ and the remainder at $\texttt{s0}$. The net effect of $\texttt{DIVMOD}$ is to divide the original value of $\texttt{s1}$ by the original value of $\texttt{s0}$, and return the quotient in $\texttt{s1}$ and the remainder in $\texttt{s0}$. In this particular case the depth of the stack and the values of all other "stack registers" remain unchanged, because $\texttt{DIVMOD}$ takes two arguments and returns two results. In general, the values of other "stack registers" that lie in the stack below the arguments passed and the values returned are shifted according to the change of the depth of the stack.
-In principle, some primitives and user-defined functions might return a variable number of result values. In this respect, the remarks above about variadic functions (cf. [2.1.6](#2-1-6-order-of-function-arguments)) apply: the total number of result values and their types should be determined by the values near the top of the stack. (For example, one might push the return values $y_1$, $\ldots$, $y_k$, and then push their total number $k$ as an integer. The caller would then determine the total number of returned values by inspecting $\texttt{s0}$.)
+In principle, some primitives and user-defined functions might return a variable number of result values. In this respect, the remarks above about [variadic functions](#2-1-6-order-of-function-arguments) apply: the total number of result values and their types should be determined by the values near the top of the stack. (For example, one might push the return values $y_1$, $\ldots$, $y_k$, and then push their total number $k$ as an integer. The caller would then determine the total number of returned values by inspecting $\texttt{s0}$.)
In this respect TVM, again, faithfully observes Forth calling conventions.
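+
+A short Python sketch of this convention, with a list standing in for the stack:
+
+```python
+def return_values(stack, values):
+    # Push y_1, ..., y_k, then the count k itself, so that the caller
+    # can determine the number of returned values by inspecting s0.
+    stack.extend(values)
+    stack.append(len(values))
+
+def take_returned(stack):
+    k = stack.pop()                    # s0 held the count
+    returned = stack[len(stack) - k:]
+    del stack[len(stack) - k:]
+    return returned
+
+s = ['caller data']
+return_values(s, [10, 20, 30])
+assert take_returned(s) == [10, 20, 30] and s == ['caller data']
+```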
@@ -261,19 +263,19 @@ When a stack of depth $n$ contains values $z_1$, $\ldots$, $z_n$, in that order,
Alternatively, one can describe $\texttt{DIV}$ as a primitive that runs on a stack $S'$ of depth $n\geq2$, divides $\texttt{s1}$ by $\texttt{s0}$, and returns the floor-rounded quotient as $\texttt{s0}$ of the new stack $S''$ of depth $n-1$. The new value of $\texttt{s}(i)$ equals the old value of $\texttt{s}(i+1)$ for $1\leq i<n-1$.
-Suppose that the function or primitive to be invoked is to be passed, say, three arguments $x$, $y$, and $z$, currently located in stack registers $\texttt{s}(i)$, $\texttt{s}(j)$, and $\texttt{s}(k)$. In this circumstance, the compiler (or programmer) might issue operation $\texttt{PUSH s}(i)$ (if a copy of $x$ is needed after the call to this primitive) or $\texttt{XCHG s}(i)$ (if it will not be needed afterwards) to put the first argument $x$ into the top of the stack. Then, the compiler (or programmer) could use either $\texttt{PUSH s}(j')$ or $\texttt{XCHG s}(j')$, where $j'=j$ or $j+1$, to put $y$ into the new top of the stack.[8](#fn8)
+Suppose that the function or primitive to be invoked is to be passed, say, three arguments $x$, $y$, and $z$, currently located in stack registers $\texttt{s}(i)$, $\texttt{s}(j)$, and $\texttt{s}(k)$. In this circumstance, the compiler (or programmer) might issue operation $\texttt{PUSH s}(i)$ (if a copy of $x$ is needed after the call to this primitive) or $\texttt{XCHG s}(i)$ (if it will not be needed afterwards) to put the first argument $x$ into the top of the stack. Then, the compiler (or programmer) could use either $\texttt{PUSH s}(j')$ or $\texttt{XCHG s}(j')$, where $j'=j$ or $j+1$, to put $y$ into the new top of the stack.[8](#fn8)
-Proceeding in this manner, we see that we can put the original values of $x$, $y$, and $z$—or their copies, if needed—into locations $\texttt{s2}$, $\texttt{s1}$, and $\texttt{s0}$, using a sequence of push and exchange operations (cf. [2.2.4](#2-2-4-mnemonics-of-compound-stack-operations) and [2.2.5](#2-2-5-semantics-of-compound-stack-operations) for a more detailed explanation). In order to generate this sequence, the compiler will need to know only the three values $i$, $j$ and $k$, describing the old locations of variables or temporary values in question, and some flags describing whether each value will be needed thereafter or is needed only for this primitive or function call. The locations of other variables and temporary values will be affected in the process, but a compiler (or a human programmer) can easily track their new locations.
+Proceeding in this manner, we see that we can put the original values of $x$, $y$, and $z$—or their copies, if needed—into locations $\texttt{s2}$, $\texttt{s1}$, and $\texttt{s0}$, using a sequence of push and exchange operations (see [2.2.4](#2-2-4-mnemonics-of-compound-stack-operations) and [2.2.5](#2-2-5-semantics-of-compound-stack-operations) for a more detailed explanation). In order to generate this sequence, the compiler will need to know only the three values $i$, $j$ and $k$, describing the old locations of variables or temporary values in question, and some flags describing whether each value will be needed thereafter or is needed only for this primitive or function call. The locations of other variables and temporary values will be affected in the process, but a compiler (or a human programmer) can easily track their new locations.
-Similarly, if the results returned from a function need to be discarded or moved to other stack registers, a suitable sequence of exchange and pop operations will do the job. In the typical case of one return value in $\texttt{s0}$, this is achieved either by an $\texttt{XCHG s}(i)$ or a $\texttt{POP s}(i)$ (in most cases, a $\texttt{DROP}$) operation.[9](#fn9)
+Similarly, if the results returned from a function need to be discarded or moved to other stack registers, a suitable sequence of exchange and pop operations will do the job. In the typical case of one return value in $\texttt{s0}$, this is achieved either by an $\texttt{XCHG s}(i)$ or a $\texttt{POP s}(i)$ (in most cases, a $\texttt{DROP}$) operation.[9](#fn9)
Rearranging the result value or values before returning from a function is essentially the same problem as arranging arguments for a function call, and is achieved similarly.
@@ -314,11 +316,11 @@ In order to improve the density of the TVM code and simplify development of comp
Of course, such operations make sense only if they admit a more compact encoding than the equivalent sequence of basic operations. For example, if all top-of-stack exchanges, $\texttt{XCHG s1,s}(i)$ exchanges, and push and pop operations admit one-byte encodings, the only compound stack operations suggested above that might merit inclusion in the set of stack manipulation primitives are $\texttt{PUXC}$, $\texttt{XCHG3}$, and $\texttt{PUSH3}$.
-These compound stack operations essentially augment other primitives (instructions) in the code with the "true" locations of their operands, somewhat similarly to what happens with two-address or three-address register machine code. However, instead of encoding these locations inside the opcode of the arithmetic or another instruction, as is customary for register machines, we indicate these locations in a preceding compound stack manipulation operation. As already described in [2.1.7](#2-1-7-arguments-to-arithmetic-primitives-on-register-machines), the advantage of such an approach is that user-defined functions (or rarely used specific primitives added in a future version of TVM) can benefit from it as well (cf. [C.3](#c-3-sample-non-leaf-function) for a more detailed discussion with examples).
+These compound stack operations essentially augment other primitives (instructions) in the code with the "true" locations of their operands, somewhat similarly to what happens with two-address or three-address register machine code. However, instead of encoding these locations inside the opcode of the arithmetic or another instruction, as is customary for register machines, we indicate these locations in a preceding compound stack manipulation operation. The advantage of such an approach is that [user-defined functions](#2-1-7-arguments-to-arithmetic-primitives-on-register-machines) (or rarely used specific primitives added in a future version of TVM) can benefit from it as well (see [C.3](#c-3-sample-non-leaf-function) for a more detailed discussion with examples).
### 2.2.4. Mnemonics of compound stack operations
-The mnemonics of compound stack operations, some examples of which have been provided in [2.2.3](#2-2-3-compound-stack-manipulation-primitives), are created as follows.
+The mnemonics of [compound stack operations](#2-2-3-compound-stack-manipulation-primitives), some examples of which have been provided above, are created as follows.
The $\gamma\geq2$ formal arguments $\texttt{s}(i_1)$, $\ldots$, $\texttt{s}(i_\gamma)$ to such an operation $O$ represent the values in the original stack that will end up in $\texttt{s}(\gamma-1)$, $\ldots$, $\texttt{s0}$ after the execution of this compound operation, at least if all $i_\nu$, $1\leq\nu\leq\gamma$, are distinct and at least $\gamma$. The mnemonic itself of the operation $O$ is a sequence of $\gamma$ two-letter strings $\texttt{PU}$ and $\texttt{XC}$, with $\texttt{PU}$ meaning that the corresponding argument is to be PUshed (i.e., a copy is to be created), and $\texttt{XC}$ meaning that the value is to be eXChanged (i.e., no other copy of the original value is created). Sequences of several $\texttt{PU}$ or $\texttt{XC}$ strings may be abbreviated to one $\texttt{PU}$ or $\texttt{XC}$ followed by the number of copies. (For instance, we write $\texttt{PUXC2PU}$ instead of $\texttt{PUXCXCPU}$.)
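+
+The abbreviation rule is mechanical; a small (hypothetical) Python helper expands an abbreviated mnemonic back into its $\texttt{PU}$/$\texttt{XC}$ strings:
+
+```python
+import re
+
+def expand_mnemonic(mnemonic: str) -> list:
+    """Expand repeat counts: 'PUXC2PU' -> ['PU', 'XC', 'XC', 'PU']."""
+    parts = []
+    for tag, count in re.findall(r'(PU|XC)(\d*)', mnemonic):
+        parts += [tag] * (int(count) if count else 1)
+    return parts
+
+assert expand_mnemonic('PUXC2PU') == ['PU', 'XC', 'XC', 'PU']
+```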
@@ -332,11 +334,11 @@ Each compound $\gamma$-ary operation $O$ $\texttt{s}(i_1)$,$\ldots$,$\texttt{s}(
- Equivalently, we might begin the induction from $\gamma=1$. Then $\texttt{PU s}(i)$ corresponds to the sequence consisting of one basic operation $\texttt{PUSH s}(i)$, and $\texttt{XC s}(i)$ corresponds to the one-element sequence consisting of $\texttt{XCHG s}(i)$.
-- For $\gamma\geq1$ (or for $\gamma\geq2$, if we use $\gamma=1$ as induction base), there are two subcases:
+- Otherwise, if $\gamma\geq2$, compound operation $O$ has one of two forms:
1. $O$ $\texttt{s}(i_1)$,$\ldots$,$\texttt{s}(i_\gamma)$, with $O=\texttt{XC}O'$, where $O'$ is a compound operation of arity $\gamma-1$ (i.e., the mnemonic of $O'$ consists of $\gamma-1$ strings $\texttt{XC}$ and $\texttt{PU}$). Let $\alpha$ be the total quantity of $\texttt{PU}$shes in $O$, and $\beta$ be that of e$\texttt{XC}$hanges, so that $\alpha+\beta=\gamma$. Then the original operation is translated into $\texttt{XCHG s}(\beta-1)\texttt{,s}(i_1)$, followed by the translation of $O'$ $\texttt{s}(i_2)$,$\ldots$,$\texttt{s}(i_\gamma)$, defined by the induction hypothesis.
- 2. $O$ $\texttt{s}(i_1)$,$\ldots$,$\texttt{s}(i_\gamma)$, with $O=\texttt{PU}O'$, where $O'$ is a compound operation of arity $\gamma-1$. Then the original operation is translated into $\texttt{PUSH s}(i_1)$; $\texttt{XCHG s}(\beta)$, followed by the translation of $O'$ $\texttt{s}(i_2+1)$,$\ldots$,$\texttt{s}(i_\gamma+1)$, defined by the induction hypothesis.[10](#fn10)
+ 2. $O$ $\texttt{s}(i_1)$,$\ldots$,$\texttt{s}(i_\gamma)$, with $O=\texttt{PU}O'$, where $O'$ is a compound operation of arity $\gamma-1$. Then the original operation is translated into $\texttt{PUSH s}(i_1)$; $\texttt{XCHG s}(\beta)$, followed by the translation of $O'$ $\texttt{s}(i_2+1)$,$\ldots$,$\texttt{s}(i_\gamma+1)$, defined by the induction hypothesis.[10](#fn10)
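+
+This induction translates directly into code. The following Python sketch (assuming, as above, that $\texttt{XCHG s}(i)$ abbreviates $\texttt{XCHG s0,s}(i)$) expands a compound operation into basic pushes and exchanges:
+
+```python
+def translate(tags, args):
+    """Expand O s(i_1),...,s(i_gamma), given as parallel lists of
+    'PU'/'XC' tags and indices, into basic PUSH/XCHG operations."""
+    assert len(tags) == len(args) >= 1
+    if len(tags) == 1:                 # induction base, gamma = 1
+        op = 'PUSH' if tags[0] == 'PU' else 'XCHG'
+        return ['%s s%d' % (op, args[0])]
+    beta = tags.count('XC')            # number of eXChanges in O
+    if tags[0] == 'XC':                # case 1: O = XC O'
+        return (['XCHG s%d,s%d' % (beta - 1, args[0])]
+                + translate(tags[1:], args[1:]))
+    else:                              # case 2: O = PU O'
+        return (['PUSH s%d' % args[0], 'XCHG s%d' % beta]
+                + translate(tags[1:], [i + 1 for i in args[1:]]))
+
+# PUXC s5,s3: PUSH s5; XCHG s1 (i.e., SWAP); XCHG s4
+assert translate(['PU', 'XC'], [5, 3]) == ['PUSH s5', 'XCHG s1', 'XCHG s4']
+```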
### 2.2.6. Stack manipulation instructions are polymorphic
@@ -382,7 +384,7 @@ This section presents a classification and general descriptions of cell types.
### 3.1.1. TVM memory and persistent storage consist of cells
-Recall that the TVM memory and persistent storage consist of *(TVM) cells*. Each cell contains up to 1023 bits of data and up to four references to other cells.[11](#fn11) Circular references are forbidden and cannot be created by means of TVM (cf. [2.3.5](#2-3-5-absence-of-circular-references)). In this way, all cells kept in TVM memory and persistent storage constitute a directed acyclic graph (DAG).
+Recall that the TVM memory and persistent storage consist of *(TVM) cells*. Each cell contains up to 1023 bits of data and up to four references to other cells.[11](#fn11) [Circular references](#2-3-5-absence-of-circular-references) are forbidden and cannot be created by means of TVM. In this way, all cells kept in TVM memory and persistent storage constitute a directed acyclic graph (DAG).
### 3.1.2. Ordinary and exotic cells
@@ -412,7 +414,7 @@ When a cell needs to be transferred by a network protocol or stored in a disk fi
2. Then the data bits are serialized as $\lceil b/8\rceil$ 8-bit octets (bytes). If $b$ is not a multiple of eight, a binary $\texttt{1}$ and up to six binary $\texttt{0}$s are appended to the data bits. After that, the data is split into $\lceil b/8\rceil$ eight-bit groups, and each group is interpreted as an unsigned big-endian integer $0\ldots 255$ and stored into an octet.
-3. Finally, each of the $r$ cell references is represented by 32 bytes containing the 256-bit *representation hash* $\text{Hash}(c_i)$, explained below in [3.1.5](#3-1-5-the-representation-hash-of-a-cell), of the cell $c_i$ referred to.
+3. Finally, each of the $r$ cell references is represented by 32 bytes containing the 256-bit [representation hash](#3-1-5-the-representation-hash-of-a-cell) $\text{Hash}(c_i)$ of the cell $c_i$ referred to.
In this way, $2+\lceil b/8\rceil+32r$ bytes of $\text{CellRepr}(c)$ are obtained.
@@ -424,11 +426,11 @@ $$
\text{Hash}(c):=\text{Sha256}\bigl(\text{CellRepr}(c)\bigr) \tag{2}
$$
-Notice that cyclic cell references are not allowed and cannot be created by means of the TVM (cf. [2.3.5](#2-3-5-absence-of-circular-references)), so this recursion always ends, and the representation hash of any cell is well-defined.
+Notice that [cyclic cell references](#2-3-5-absence-of-circular-references) are not allowed and cannot be created by means of the TVM, so this recursion always ends, and the representation hash of any cell is well-defined.
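+
+A Python sketch of steps 2–3 above and of formula (2), restricted to ordinary cells; the descriptor bytes of step 1, $d_1=r+8s+32l$ and $d_2=\lfloor b/8\rfloor+\lceil b/8\rceil$, are recalled from the full text and not shown in this excerpt:
+
+```python
+import hashlib
+
+def cell_repr(data_bits, child_hashes=(), s=0, l=0):
+    """CellRepr(c) of an ordinary cell; data_bits is a '0'/'1' string,
+    child_hashes holds Hash(c_i) of each referenced cell."""
+    b, r = len(data_bits), len(child_hashes)
+    d1, d2 = r + 8 * s + 32 * l, b // 8 + (b + 7) // 8   # descriptor bytes
+    if b % 8:                          # completion tag: one '1', then '0's
+        data_bits += '1' + '0' * (7 - b % 8)
+    octets = bytes(int(data_bits[i:i + 8], 2) for i in range(0, len(data_bits), 8))
+    return bytes([d1, d2]) + octets + b''.join(child_hashes)
+
+def cell_hash(data_bits, child_hashes=()):
+    # Hash(c) := Sha256(CellRepr(c)); the recursion over children always
+    # terminates because circular references cannot exist.
+    return hashlib.sha256(cell_repr(data_bits, child_hashes)).digest()
+
+leaf = cell_hash('10100001' * 2)       # a 16-bit leaf cell
+root = cell_hash('00', (leaf,))        # 2 data bits plus one reference
+```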
### 3.1.6. The higher hashes of a cell
-Recall that a cell $c$ of level $l$ has $l$ higher hashes $\text{Hash}_i(c)$, $1\leq i\leq l$, as well. Exotic cells have their own rules for computing their higher hashes. Higher hashes $\text{Hash}_i(c)$ of an ordinary cell $c$ are computed similarly to its representation hash, but using the higher hashes $\text{Hash}_i(c_j)$ of its children $c_j$ instead of their representation hashes $\text{Hash}(c_j)$. By convention, we set $\text{Hash}_\infty(c):=\text{Hash}(c)$, and $\text{Hash}_i(c):=\text{Hash}_\infty(c)=\text{Hash}(c)$ for all $i>l$.[12](#fn12)
+Recall that a cell $c$ of level $l$ has $l$ higher hashes $\text{Hash}_i(c)$, $1\leq i\leq l$, as well. Exotic cells have their own rules for computing their higher hashes. Higher hashes $\text{Hash}_i(c)$ of an ordinary cell $c$ are computed similarly to its representation hash, but using the higher hashes $\text{Hash}_i(c_j)$ of its children $c_j$ instead of their representation hashes $\text{Hash}(c_j)$. By convention, we set $\text{Hash}_\infty(c):=\text{Hash}(c)$, and $\text{Hash}_i(c):=\text{Hash}_\infty(c)=\text{Hash}(c)$ for all $i>l$.[12](#fn12)
### 3.1.7. Types of exotic cells
@@ -454,21 +456,21 @@ TVM currently supports the following cell types:
\text{Lvl}(c)=\max(\text{Lvl}(c_1)-1,\text{Lvl}(c_2)-1,0) \tag{4}
$$
- A Merkle update behaves like a Merkle proof for both $c_1$ and $c_2$, and contains $8+256+256$ data bits with $\text{Hash}_1(c_1)$ and $\text{Hash}_1(c_2)$. However, an extra requirement is that *all pruned branch cells $c'$ that are descendants of $c_2$ and are bound by $c$ must also be descendants of $c_1$.*[13](#fn13) When a Merkle update cell is loaded, it is replaced by $c_2$.
+ A Merkle update behaves like a Merkle proof for both $c_1$ and $c_2$, and contains $8+256+256$ data bits with $\text{Hash}_1(c_1)$ and $\text{Hash}_1(c_2)$. However, an extra requirement is that *all pruned branch cells $c'$ that are descendants of $c_2$ and are bound by $c$ must also be descendants of $c_1$.*[13](#fn13) When a Merkle update cell is loaded, it is replaced by $c_2$.
### 3.1.8. All values of algebraic data types are trees of cells
-Arbitrary values of arbitrary algebraic data types (e.g., all types used in functional programming languages) can be serialized into trees of cells (of level 0), and such representations are used for representing such values within TVM. The copy-on-write mechanism (cf. [2.3.2](#2-3-2-efficient-implementation-of-dup-and-push-instructions-using-copy-on-write)) allows TVM to identify cells containing the same data and references, and to keep only one copy of such cells. This actually transforms a tree of cells into a directed acyclic graph (with the additional property that all its vertices be accessible from a marked vertex called the "root"). However, this is a storage optimization rather than an essential property of TVM. From the perspective of a TVM code programmer, one should think of TVM data structures as trees of cells.
+Arbitrary values of arbitrary algebraic data types (e.g., all types used in functional programming languages) can be serialized into trees of cells (of level 0), and such representations are used for representing such values within TVM. The [copy-on-write mechanism](#2-3-2-efficient-implementation-of-and-instructions-using-copy-on-write) allows TVM to identify cells containing the same data and references, and to keep only one copy of such cells. This actually transforms a tree of cells into a directed acyclic graph (with the additional property that all its vertices be accessible from a marked vertex called the "root"). However, this is a storage optimization rather than an essential property of TVM. From the perspective of a TVM code programmer, one should think of TVM data structures as trees of cells.
### 3.1.9. TVM code is a tree of cells
The TVM code itself is also represented by a tree of cells. Indeed, TVM code is simply a value of some complex algebraic data type, and as such, it can be serialized into a tree of cells.
-The exact way in which the TVM code (e.g., TVM assembly code) is transformed into a tree of cells is explained later (cf. [4.1.4](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop) and [5.2](#5-2-instruction-encoding)), in sections discussing control flow instructions, continuations, and TVM instruction encoding.
+The exact way in which the TVM code (e.g., TVM assembly code) is transformed into a tree of cells is explained later ([4.1.4](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop) and [5.2](#5-2-instruction-encoding)), in sections discussing control flow instructions, continuations, and TVM instruction encoding.
### 3.1.10. "Everything is a bag of cells" paradigm
-As described in [[1](#ref-1), 2.5.14], all the data used by the TON Blockchain, including the blocks themselves and the blockchain state, can be represented—and are represented—as collections, or "bags", of cells. We see that TVM's structure of data (cf. [3.1.8](#3-1-8-all-values-of-algebraic-data-types-are-trees-of-cells)) and code (cf. [3.1.9](#3-1-9-tvm-code-is-a-tree-of-cells)) nicely fits into this "everything is a bag of cells" paradigm. In this way, TVM can naturally be used to execute smart contracts in the TON Blockchain, and the TON Blockchain can be used to store the code and persistent data of these smart contracts between invocations of TVM. (Of course, both TVM and the TON Blockchain have been designed so that this would become possible.)
+All the data used by the TON Blockchain, including the blocks themselves and the blockchain state, can be represented—and are represented—as collections, or [bags of cells](/foundations/whitepapers/ton#2-5-14-“everything-is-a-bag-of-cells”-philosophy). We see that TVM's [structure of data](#3-1-8-all-values-of-algebraic-data-types-are-trees-of-cells) and [code](#3-1-9-tvm-code-is-a-tree-of-cells) nicely fits into this "everything is a bag of cells" paradigm. In this way, TVM can naturally be used to execute smart contracts in the TON Blockchain, and the TON Blockchain can be used to store the code and persistent data of these smart contracts between invocations of TVM. (Of course, both TVM and the TON Blockchain have been designed so that this would become possible.)
## 3.2 Data manipulation instructions and cells
@@ -481,11 +483,11 @@ The TVM cell instructions are naturally subdivided into two principal classes:
- *Cell creation instructions* or *serialization instructions*, used to construct new cells from values previously kept in the stack and previously constructed cells.
- *Cell parsing instructions* or *deserialization instructions*, used to extract data previously stored into cells by cell creation instructions.
-Additionally, there are *exotic cell instructions* used to create and inspect exotic cells (cf. [3.1.2](#3-1-2-ordinary-and-exotic-cells)), which in particular are used to represent pruned branches of Merkle proofs and Merkle proofs themselves.
+Additionally, there are *exotic cell instructions* used to create and inspect [exotic cells](#3-1-2-ordinary-and-exotic-cells), which in particular are used to represent pruned branches of Merkle proofs and Merkle proofs themselves.
### 3.2.2. *Builder* and *Slice* values
-Cell creation instructions usually work with *Builder* values, which can be kept only in the stack (cf. [1.1.3](#1-1-3-preliminary-list-of-value-types)). Such values represent partially constructed cells, for which fast operations for appending bitstrings, integers, other cells, and references to other cells can be defined. Similarly, cell parsing instructions make heavy use of *Slice* values, which represent either the remainder of a partially parsed cell, or a value (subcell) residing inside such a cell and extracted from it by a parsing instruction.
+Cell creation instructions usually work with [*Builder* values](#1-1-3-preliminary-list-of-value-types), which can be kept only in the stack. Such values represent partially constructed cells, for which fast operations for appending bitstrings, integers, other cells, and references to other cells can be defined. Similarly, cell parsing instructions make heavy use of *Slice* values, which represent either the remainder of a partially parsed cell, or a value (subcell) residing inside such a cell and extracted from it by a parsing instruction.
### 3.2.3. *Builder* and *Slice* values exist only as stack values
@@ -516,7 +518,7 @@ The mnemonics of cell serialization primitives usually begin with $\texttt{ST}$.
- The source of the field width in bits to be used (e.g., $\texttt{X}$ for integer serialization instructions means that the bit width $n$ is supplied in the stack; otherwise it has to be embedded into the instruction as an immediate value).
- The action to be performed if the operation cannot be completed (by default, an exception is generated; "quiet" versions of serialization instructions are marked by a $\texttt{Q}$ letter in their mnemonics).
-This classification scheme is used to create a more complete taxonomy of cell serialization primitives, which can be found in [A.7.1](#a-7-1-cell-serialization-primitives).
+This classification scheme is used to create a more complete taxonomy of [cell serialization primitives](#a-7-1-cell-serialization-primitives).
### 3.2.7. Integer serialization primitives
@@ -527,11 +529,11 @@ Integer serialization primitives can be classified according to the above taxono
- If the integer $x$ to be serialized is not in the range $-2^{n-1}\leq x<2^{n-1}$ (for signed integer serialization) or $0\leq x<2^n$ (for unsigned integer serialization), a range check exception is usually generated, and if $n$ bits cannot be stored into the provided *Builder*, a cell overflow exception is usually generated.
- Quiet versions of serialization instructions do not throw exceptions; instead, they push $-1$ on top of the resulting *Builder* upon success, or return the original *Builder* with $0$ on top of it to indicate failure.
-Integer serialization instructions have mnemonics like $\texttt{STU 20}$ ("store an unsigned 20-bit integer value") or $\texttt{STIXQ}$ ("quietly store an integer value of variable length provided in the stack"). The full list of these instructions—including their mnemonics, descriptions, and opcodes—is provided in [A.7.1](#a-7-1-cell-serialization-primitives).
+Integer serialization instructions have mnemonics like $\texttt{STU 20}$ ("store an unsigned 20-bit integer value") or $\texttt{STIXQ}$ ("quietly store an integer value of variable length provided in the stack"). The [full list of these instructions](#a-7-1-cell-serialization-primitives) includes their mnemonics, descriptions, and opcodes.
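+
+For illustration, a Python sketch of the quiet unsigned store described above, with a *Builder* modeled as a bit string (a hypothetical helper):
+
+```python
+def stuq(builder: str, x: int, n: int):
+    """Quiet STU-style store: (new_builder, -1) on success,
+    (original_builder, 0) on failure, instead of an exception."""
+    if not 0 <= x < 2 ** n:            # range check
+        return builder, 0
+    if len(builder) + n > 1023:        # cell overflow check
+        return builder, 0
+    return builder + format(x, '0%db' % n), -1
+
+b, ok = stuq('', 1000, 20)             # quiet analogue of STU 20
+assert ok == -1 and len(b) == 20
+```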
### 3.2.8. Integers in cells are big-endian by default
-Notice that the default order of bits in *Integers* serialized into *Cells* is *big-endian*, not little-endian.[14](#fn14) In this respect *TVM is a big-endian machine*. However, this affects only the serialization of integers inside cells. The internal representation of the *Integer* value type is implementation-dependent and irrelevant for the operation of TVM. Besides, there are some special primitives such as $\texttt{STULE}$ for (de)serializing little-endian integers, which must be stored into an integral number of bytes (otherwise "little-endianness" does not make sense, unless one is also willing to revert the order of bits inside octets). Such primitives are useful for interfacing with the little-endian world—for instance, for parsing custom-format messages arriving to a TON Blockchain smart contract from the outside world.
+Notice that the default order of bits in *Integers* serialized into *Cells* is *big-endian*, not little-endian.[14](#fn14) In this respect *TVM is a big-endian machine*. However, this affects only the serialization of integers inside cells. The internal representation of the *Integer* value type is implementation-dependent and irrelevant for the operation of TVM. Besides, there are some special primitives such as $\texttt{STULE}$ for (de)serializing little-endian integers, which must be stored into an integral number of bytes (otherwise "little-endianness" does not make sense, unless one is also willing to revert the order of bits inside octets). Such primitives are useful for interfacing with the little-endian world—for instance, for parsing custom-format messages arriving at a TON Blockchain smart contract from the outside world.
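+
+The difference is easy to see in a short sketch (hypothetical helpers, bits shown as strings):
+
+```python
+def store_be(x: int, n: int) -> str:
+    # default TVM behavior: most significant bit first
+    return format(x, '0%db' % n)
+
+def store_le(x: int, nbytes: int) -> str:
+    # STULE-style: little-endian over whole bytes only, so the bit
+    # order inside each octet is left untouched
+    return ''.join(format(byte, '08b') for byte in x.to_bytes(nbytes, 'little'))
+
+assert store_be(0x0102, 16) == '0000000100000010'
+assert store_le(0x0102, 2) == '0000001000000001'
+```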
### 3.2.9. Other serialization primitives
@@ -543,7 +545,7 @@ In addition to the cell serialization primitives for certain built-in value type
### 3.2.11. Taxonomy of cell deserialisation primitives
-Cell parsing, or deserialization, primitives can be classified as described in [3.2.6](#3-2-6-taxonomy-of-cell-creation-serialization-primitives), with the following modifications:
+Cell parsing, or deserialization, primitives can be classified [similarly to serialization primitives](#3-2-6-taxonomy-of-cell-creation-serialization-primitives), with the following modifications:
- They work with *Slices* (representing the remainder of the cell being parsed) instead of *Builders*.
- They return deserialized values instead of accepting them as arguments.
@@ -562,7 +564,7 @@ In addition to the cell deserialisation primitives outlined above, TVM provides
The reader might wonder how the values serialized inside a cell may be modified. Suppose a cell contains three serialized 29-bit integers, $(x,y,z)$, representing the coordinates of a point in space, and we want to replace $y$ with $y'=y+1$, leaving the other coordinates intact. How would we achieve this?
-TVM does not offer any ways to modify existing values (cf. [2.3.4](#2-3-4-transparency-of-the-implementation-stack-values-are-values-not-references) and [2.3.5](#2-3-5-absence-of-circular-references)), so our example can only be accomplished with a series of operations as follows:
+TVM does not offer any ways to modify existing values (see [2.3.4](#2-3-4-transparency-of-the-implementation%3A-stack-values-are-“values”%2C-not-“references”) and [2.3.5](#2-3-5-absence-of-circular-references)), so our example can only be accomplished with a series of operations as follows:
1. Deserialize the original cell into three *Integer*s $x$, $y$, $z$ in the stack (e.g., by $\texttt{CTOS}$; $\texttt{LDI 29}$; $\texttt{LDI 29}$; $\texttt{LDI 29}$; $\texttt{ENDS}$).
2. Increase $y$ by one (e.g., by $\texttt{SWAP}$; $\texttt{INC}$; $\texttt{SWAP}$).
@@ -570,7 +572,9 @@ TVM does not offer any ways to modify existing values (cf. [2.3.4](#2-3-4-transp
### 3.2.14. Modifying the persistent storage of a smart contract
-If the TVM code wants to modify its persistent storage, represented by the tree of cells rooted at $\texttt{c4}$, it simply needs to rewrite control register $\texttt{c4}$ by the root of the tree of cells containing the new value of its persistent storage. (If only part of the persistent storage needs to be modified, cf. [3.2.13](#3-2-13-modifying-a-serialized-value-in-a-cell).)
+If the TVM code wants to modify its persistent storage, represented by the tree of cells rooted at $\texttt{c4}$, it simply needs to rewrite control register $\texttt{c4}$ by the root of the tree of cells containing the new value of its persistent storage. (If only part of the persistent storage needs to be modified, the [read-modify-write approach described above](#3-2-13-modifying-a-serialized-value-in-a-cell) applies.)
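+
+A Python sketch of this read-modify-write pattern for the $(x,y,z)$ example above, with hypothetical helpers and the cell body modeled as a bit string; the resulting bits would then be committed by storing the new root cell into $\texttt{c4}$:
+
+```python
+def ldi(bits: str, n: int):
+    """LDI n-style load: read a signed big-endian n-bit integer
+    and return it together with the remaining slice."""
+    v = int(bits[:n], 2)
+    if bits[0] == '1':
+        v -= 1 << n                    # sign extension
+    return v, bits[n:]
+
+def sti(v: int, n: int) -> str:
+    # STI n-style store: two's-complement, big-endian
+    return format(v & ((1 << n) - 1), '0%db' % n)
+
+def bump_y(cell_bits: str) -> str:
+    x, rest = ldi(cell_bits, 29)       # CTOS; LDI 29
+    y, rest = ldi(rest, 29)            # LDI 29
+    z, rest = ldi(rest, 29)            # LDI 29
+    assert rest == ''                  # ENDS
+    return sti(x, 29) + sti(y + 1, 29) + sti(z, 29)   # NEWC; STI 29 (x3); ENDC
+
+cell = sti(3, 29) + sti(7, 29) + sti(-1, 29)
+assert bump_y(cell) == sti(3, 29) + sti(8, 29) + sti(-1, 29)
+```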
+
+---
## 3.3 Hashmaps, or dictionaries
@@ -590,7 +594,7 @@ It is easy to see that any collection of key-value pairs (with distinct keys) is
### 3.3.3. Serialization of hashmaps
-The serialization of a hashmap into a tree of cells (or, more generally, into a *Slice*) is defined by the following TL-B scheme:[15](#fn15)
+The serialization of a hashmap into a tree of cells (or, more generally, into a *Slice*) is defined by the following TL-B scheme:[15](#fn15)
```
bit#_ _:(## 1) = Bit;
@@ -623,13 +627,13 @@ A TL-B scheme, like the one above, includes the following components.
The right-hand side of each "equation" is a *type*, either simple (such as $\texttt{Bit}$ or $\texttt{True}$) or parametrized (such as $\texttt{Hashmap}$ $n$ $X$). The parameters of a type must be either natural numbers (i.e., non-negative integers, which are required to fit into 32 bits in practice), such as $n$ in $\texttt{Hashmap}$ $n$ $X$, or other types, such as $X$ in $\texttt{Hashmap}$ $n$ $X$.
-The left-hand side of each equation describes a way to define, or even to serialize, a value of the type indicated in the right-hand side. Such a description begins with the name of a *constructor*, such as $\texttt{hm\_edge}$ or $\texttt{hml\_long}$, immediately followed by an optional *constructor tag*, such as #\_ or \$10, which describes the bitstring used to encode (serialize) the constructor in question. Such tags may be given in either binary (after a dollar sign) or hexadecimal notation (after a hash sign), using the conventions described in [1.0](#1-0-notation-for-bitstrings). If a tag is not explicitly provided, TL-B computes a default 32-bit constructor tag by hashing the text of the "equation" defining this constructor in a certain fashion. Therefore, empty tags must be explicitly provided by #\_ or \$\_. All constructor names must be distinct, and constructor tags for the same type must constitute a prefix code (otherwise the deserialization would not be unique).
+The left-hand side of each equation describes a way to define, or even to serialize, a value of the type indicated in the right-hand side. Such a description begins with the name of a *constructor*, such as $\texttt{hm\_edge}$ or $\texttt{hml\_long}$, immediately followed by an optional *constructor tag*, such as #\_ or \$10, which describes the bitstring used to encode (serialize) the constructor in question. Such tags may be given in either binary (after a dollar sign) or hexadecimal notation (after a hash sign), using the [conventions for bitstrings](#1-0-notation-for-bitstrings). If a tag is not explicitly provided, TL-B computes a default 32-bit constructor tag by hashing the text of the "equation" defining this constructor in a certain fashion. Therefore, empty tags must be explicitly provided by #\_ or \$\_. All constructor names must be distinct, and constructor tags for the same type must constitute a prefix code (otherwise the deserialization would not be unique).
-The constructor and its optional tag are followed by *field definitions*. Each field definition is of the form $\textit{ident}:\textit{type-expr}$, where $\textit{ident}$ is an identifier with the name of the field[16](#fn16) (replaced by an underscore for anonymous fields), and $\textit{type-expr}$ is the field's type. The type provided here is a *type expression*, which may include simple types or parametrized types with suitable parameters. *Variables*—i.e., the (identifiers of the) previously defined fields of types $\#$ (natural numbers) or $\textit{Type}$ (type of types)—may be used as parameters for the parametrized types. The serialization process recursively serializes each field according to its type, and the serialization of a value ultimately consists of the concatenation of bitstrings representing the constructor (i.e., the constructor tag) and the field values.
+The constructor and its optional tag are followed by *field definitions*. Each field definition is of the form $\textit{ident}:\textit{type-expr}$, where $\textit{ident}$ is an identifier with the name of the field[16](#fn16) (replaced by an underscore for anonymous fields), and $\textit{type-expr}$ is the field's type. The type provided here is a *type expression*, which may include simple types or parametrized types with suitable parameters. *Variables*—i.e., the (identifiers of the) previously defined fields of types $\#$ (natural numbers) or $\textit{Type}$ (type of types)—may be used as parameters for the parametrized types. The serialization process recursively serializes each field according to its type, and the serialization of a value ultimately consists of the concatenation of bitstrings representing the constructor (i.e., the constructor tag) and the field values.
Some fields may be *implicit*. Their definitions are surrounded by curly braces, which indicate that the field is not actually present in the serialization, but that its value must be deduced from other data (usually the parameters of the type being serialized).
-Some occurrences of "variables" (i.e., already-defined fields) are prefixed by a tilde. This indicates that the variable's occurrence is used in the opposite way of the default behavior: in the left-hand side of the equation, it means that the variable will be deduced (computed) based on this occurrence, instead of substituting its previously computed value; in the right-hand side, conversely, it means that the variable will not be deduced from the type being serialized, but rather that it will be computed during the deserialization process. In other words, a tilde transforms an "input argument" into an "output argument", and vice versa.[17](#fn17)
+Some occurrences of "variables" (i.e., already-defined fields) are prefixed by a tilde. This indicates that the variable's occurrence is used in the opposite way of the default behavior: in the left-hand side of the equation, it means that the variable will be deduced (computed) based on this occurrence, instead of substituting its previously computed value; in the right-hand side, conversely, it means that the variable will not be deduced from the type being serialized, but rather that it will be computed during the deserialization process. In other words, a tilde transforms an "input argument" into an "output argument", and vice versa.[17](#fn17)
Finally, some equalities may be included in curly brackets as well. These are certain "equations", which must be satisfied by the "variables" included in them. If one of the variables is prefixed by a tilde, its value will be uniquely determined by the values of all other variables participating in the equation (which must be known at this point) when the definition is processed from the left to the right.
@@ -639,13 +643,13 @@ Parametrized type $\texttt{\#<= }p$ with $p:\texttt{\#}$ (this notation means "$
### 3.3.5. Application to the serialization of hashmaps
-Let us explain the net result of applying the general rules described in [3.3.4](#3-3-4-brief-explanation-of-tl-b-schemes) to the TL-B scheme presented in [3.3.3](#3-3-3-serialization-of-hashmaps).
+Let us explain the net result of applying the [general rules](#3-3-4-brief-explanation-of-tl-b-schemes) to the [TL-B scheme](#3-3-3-serialization-of-hashmaps).
-Suppose we wish to serialize a value of type *HashmapE* $n$ $X$ for some integer $0\leq n\leq 1023$ and some type $X$ (i.e., a dictionary with $n$-bit keys and values of type $X$, admitting an abstract representation as a Patricia tree (cf. [3.3.2](#3-3-2-hashmaps-as-patricia-trees))).
+Suppose we wish to serialize a value of type *HashmapE* $n$ $X$ for some integer $0\leq n\leq 1023$ and some type $X$ (i.e., a dictionary with $n$-bit keys and values of type $X$, admitting an abstract representation as a [Patricia tree](#3-3-2-hashmaps-as-patricia-trees)).
First of all, if our dictionary is empty, it is serialized into a single binary $\texttt{0}$, which is the tag of nullary constructor $\texttt{hme\_empty}$. Otherwise, its serialization consists of a binary $\texttt{1}$ (the tag of $\texttt{hme\_root}$), along with a reference to a cell containing the serialization of a value of type *Hashmap* $n$ $X$ (i.e., a necessarily non-empty dictionary).
-The only way to serialize a value of type *Hashmap* $n$ $X$ is given by the $\texttt{hm\_edge}$ constructor, which instructs us to serialize first the label $\texttt{label}$ of the edge leading to the root of the subtree under consideration (i.e., the common prefix of all keys in our (sub)dictionary). This label is of type $\texttt{HmLabel}$ $l^\perp$ $n$, which means that it is a bitstring of length at most $n$, serialized in such a way that the true length $l$ of the label, $0\leq l\leq n$, becomes known from the serialization of the label. (This special serialization method is described separately in [3.3.6](#3-3-6-serialization-of-labels).)
+The only way to serialize a value of type *Hashmap* $n$ $X$ is given by the $\texttt{hm\_edge}$ constructor, which instructs us to serialize first the label $\texttt{label}$ of the edge leading to the root of the subtree under consideration (i.e., the common prefix of all keys in our (sub)dictionary). This label is of type $\texttt{HmLabel}$ $l^\perp$ $n$, which means that it is a bitstring of length at most $n$, serialized in such a way that the true length $l$ of the label, $0\leq l\leq n$, becomes known from the [serialization](#3-3-6-serialization-of-labels) of the label.
The label must be followed by the serialization of a $\texttt{node}$ of type $\texttt{HashmapNode}$ $m$ $X$, where $m=n-l$. It corresponds to a vertex of the Patricia tree, representing a non-empty subdictionary of the original dictionary with $m$-bit keys, obtained by removing from all the keys of the original subdictionary their common prefix of length $l$.
@@ -707,7 +711,7 @@ A.0.0.1 := 10 100 0001 0000000100100001
A.0.1 := 10 111 1101111 1101111100100001
```
-Here $A$ is the root cell, $A.0$ is the cell at the first reference of $A$, $A.1$ is the cell at the second reference of $A$, and so on. This tree of cells can be represented more compactly using the hexadecimal notation described in [1.0](#1-0-notation-for-bitstrings), using indentation to reflect the tree-of-cells structure:
+Here $A$ is the root cell, $A.0$ is the cell at the first reference of $A$, $A.1$ is the cell at the second reference of $A$, and so on. This tree of cells can be represented more compactly using the [hexadecimal notation](#1-0-notation-for-bitstrings), using indentation to reflect the tree-of-cells structure:
```
C_
@@ -720,6 +724,7 @@ C_
A total of 93 data bits and 5 references in 6 cells have been used to serialize this dictionary. Notice that a straightforward representation of three 16-bit keys and their corresponding 16-bit values would already require 96 bits (albeit without any references), so this particular serialization turns out to be quite efficient.
+
### 3.3.8. Ways to describe the serialization of type $X$
Notice that the built-in TVM primitives for dictionary manipulation need to know something about the serialization of type $X$; otherwise, they would not be able to work correctly with $\mathit{Hashmap}$ $n$ $X$, because values of type $X$ are immediately contained in the Patricia tree leaf cells. There are several options available to describe the serialization of type $X$:
@@ -739,7 +744,7 @@ Let us present a classification of basic operations with dictionaries (i.e., val
- $\text{Get}(D,k)$ — Given $D$:*HashmapE*$(n,X)$ and a key $k:n\cdot\texttt{bit}$, returns the corresponding value $D[k]:X^?$ kept in $D$.
-- $\text{Set}(D,k,x)$ — Given $D$:*HashmapE*$(n,X)$, a key $k:n\cdot\texttt{bit}$, and a value $x:X$, sets $D'[k]$ to $x$ in a copy $D'$ of $D$, and returns the resulting dictionary $D'$ (cf. [2.3.4](#2-3-4-transparency-of-the-implementation-stack-values-are-values-not-references)).
+- $\text{Set}(D,k,x)$ — Given $D$:*HashmapE*$(n,X)$, a key $k:n\cdot\texttt{bit}$, and a value $x:X$, sets $D'[k]$ to $x$ in a [copy](#2-3-4-transparency-of-the-implementation%3A-stack-values-are-“values”%2C-not-“references”) $D'$ of $D$, and returns the resulting dictionary $D'$.
- $\text{Add}(D,k,x)$ — Similar to $\text{Set}$, but adds the key-value pair $(k,x)$ to $D$ only if key $k$ is absent in $D$.
@@ -773,17 +778,17 @@ Let us present a classification of basic operations with dictionaries (i.e., val
- $\text{Merge}(D_0,D_1)$ — Given $D_0$ and $D_1$:*HashmapE*$(n-1,X)$, computes $D$:*HashmapE*$(n,X)$, such that $D/0=D_0$ and $D/1=D_1$.
-- $\text{ForEach}(D,f)$ — Executes a function $f$ with two arguments $k$ and $x$, with $(k,x)$ running over all key-value pairs of a dictionary $D$ in lexicographical order.[18](#fn18)
+- $\text{ForEach}(D,f)$ — Executes a function $f$ with two arguments $k$ and $x$, with $(k,x)$ running over all key-value pairs of a dictionary $D$ in lexicographical order.[18](#fn18)
- $\text{ForEachRev}(D,f)$ — Similar to $\text{ForEach}$, but processes all key-value pairs in reverse order.
-- $\text{TreeReduce}(D,o,f,g)$ — Given $D$:*HashmapE*$(n,X)$, a value $o:X$, and two functions $f:X\to Y$ and $g:Y\times Y\to Y$, performs a "tree reduction" of $D$ by first applying $f$ to all the leaves, and then using $g$ to compute the value corresponding to a fork starting from the values assigned to its children.[19](#fn19)
+- $\text{TreeReduce}(D,o,f,g)$ — Given $D$:*HashmapE*$(n,X)$, a value $o:X$, and two functions $f:X\to Y$ and $g:Y\times Y\to Y$, performs a "tree reduction" of $D$ by first applying $f$ to all the leaves, and then using $g$ to compute the value corresponding to a fork starting from the values assigned to its children.[19](#fn19)
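+
+The copy-on-write behavior of $\text{Set}$ and the ordering guarantee of $\text{ForEach}$ can be modeled in ordinary code. The following Python sketch treats *HashmapE*$(n,X)$ abstractly as a partial map from $n$-bit integer keys to values; it is a reference model only, not the Patricia-tree implementation, and all names are illustrative:
+
+```python
+from typing import Any, Callable, Optional
+
+def hm_get(d: dict, k: int) -> Optional[Any]:
+    """Get(D, k): the value D[k], or None if k is absent."""
+    return d.get(k)
+
+def hm_set(d: dict, k: int, x: Any) -> dict:
+    """Set(D, k, x): returns a modified *copy* D' of D; D itself is unchanged,
+    mirroring the fact that TVM stack values are values, not references."""
+    d2 = dict(d)
+    d2[k] = x
+    return d2
+
+def hm_add(d: dict, k: int, x: Any) -> dict:
+    """Add(D, k, x): like Set, but only if the key k is absent in D."""
+    return d if k in d else hm_set(d, k, x)
+
+def hm_foreach(d: dict, f: Callable[[int, Any], None]) -> None:
+    """ForEach(D, f): visits key-value pairs in lexicographic order of the keys
+    (for fixed-width n-bit keys this coincides with numeric order)."""
+    for k in sorted(d):
+        f(k, d[k])
+
+d0: dict = {}
+d1 = hm_set(d0, 0b1001, "x")
+d2 = hm_add(d1, 0b0111, "y")
+hm_foreach(d2, lambda k, x: print(f"{k:04b} -> {x}"))  # 0111 -> y, then 1001 -> x
+assert d0 == {} and d1 == {0b1001: "x"}  # the original dictionaries are untouched
+```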
### 3.3.11. Taxonomy of dictionary primitives
-The dictionary primitives, described in detail in Appendix [A.10](#a-10-dictionary-manipulation-primitives), can be classified according to the following categories:
+The [dictionary primitives](#a-10-dictionary-manipulation-primitives) can be classified according to the following categories:
-- Which dictionary operation (cf. [3.3.10](#3-3-10-basic-dictionary-operations)) do they perform?
+- Which [dictionary operation](#3-3-10-basic-dictionary-operations) do they perform?
- Are they specialized for the case $X = \text{\textasciicircum}Y$? If so, do they represent values of type $Y$ by *Cell*s or by *Slice*s? (Generic versions always represent values of type $X$ as *Slice*s.)
- Are the dictionaries themselves passed and returned as *Cell*s or as *Slice*s? (Most primitives represent dictionaries as *Slice*s.)
- Is the key length $n$ fixed inside the primitive, or is it passed in the stack?
@@ -795,11 +800,11 @@ In addition, TVM includes special serialization/deserialization primitives, such
## 3.4 Hashmaps with variable-length keys
-TVM provides some support for dictionaries, or hashmaps, with variable-length keys, in addition to its support for dictionaries with fixed-length keys (as described in [3.3](#3-3-hashmaps%2C-or-dictionaries) above).
+TVM provides some support for [dictionaries, or hashmaps](#3-3-hashmaps%2C-or-dictionaries), with variable-length keys, in addition to its support for dictionaries with fixed-length keys.
### 3.4.1. Serialization of dictionaries with variable-length keys
-The serialization of a *VarHashmap* into a tree of cells (or, more generally, into a *Slice*) is defined by a TL-B scheme, similar to that described in [3.3.3](#3-3-3-serialization-of-hashmaps):
+The serialization of a *VarHashmap* into a tree of cells (or, more generally, into a *Slice*) is defined by a TL-B scheme, [similar](#3-3-3-serialization-of-hashmaps) to the one used for ordinary hashmaps:
```
vhm_edge#_ {n:#} {X:Type} {l:#} {m:#} label:(HmLabel ~l n)
@@ -842,7 +847,6 @@ phme_root$1 {n:#} {X:Type} root:^(PfxHashmap n X)
---
-
# 4 Control flow, continuations, and exceptions
This chapter describes *continuations*, which may represent execution tokens and exception handlers in TVM. Continuations are deeply involved with the control flow of a TVM program; in particular, subroutine calls and conditional and iterated execution are implemented in TVM using special primitives that accept one or more continuations as their arguments.
@@ -851,13 +855,13 @@ We conclude this chapter with a discussion of the problem of recursion and of fa
## 4.1 Continuations and subroutines
-Recall (cf. [1.1.3](#1-1-3-preliminary-list-of-value-types)) that *Continuation* values represent "execution tokens" that can be executed later—for example, by $\texttt{EXECUTE}$=$\texttt{CALLX}$ ("execute" or "call indirect") or $\texttt{JMPX}$ ("jump indirect") primitives. As such, the continuations are heavily used by control flow primitives, enabling subroutine calls, conditional expressions, loops, and so on.
+Recall that [Continuation values](#1-1-3-preliminary-list-of-value-types) represent "execution tokens" that can be executed later—for example, by $\texttt{EXECUTE}$=$\texttt{CALLX}$ ("execute" or "call indirect") or $\texttt{JMPX}$ ("jump indirect") primitives. As such, the continuations are heavily used by control flow primitives, enabling subroutine calls, conditional expressions, loops, and so on.
### 4.1.1. Ordinary continuations
The most common kind of continuations are the *ordinary continuations*, containing the following data:
-- A *Slice* $\texttt{code}$ (cf. [1.1.3](#1-1-3-preliminary-list-of-value-types) and [3.2.2](#3-2-2-builder-and-slice-values)), containing (the remainder of) the TVM code to be executed.
+- A *Slice* $\texttt{code}$ ([1.1.3](#1-1-3-preliminary-list-of-value-types) and [3.2.2](#3-2-2-builder-and-slice-values)), containing (the remainder of) the TVM code to be executed.
- A (possibly empty) *Stack* $\texttt{stack}$, containing the original contents of the stack for the code to be executed.
- A (possibly empty) list $\texttt{save}$ of pairs $(\texttt{c}(i),v_i)$ (also called "savelist"), containing the values of control registers to be restored before the execution of the code.
- A 16-bit integer value $\texttt{cp}$, selecting the TVM codepage used to interpret the TVM code from $\texttt{code}$.
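+
+In code, an ordinary continuation is simply a record with these four components. A minimal Python sketch (the *Slice* and stack-value types are collapsed to plain Python objects; field names follow the text):
+
+```python
+from dataclasses import dataclass, field
+from typing import Any, Dict, List
+
+@dataclass
+class Continuation:
+    code: str                                           # Slice: the remaining TVM code (here, a bitstring)
+    stack: List[Any] = field(default_factory=list)      # original stack contents for the code
+    save: Dict[int, Any] = field(default_factory=dict)  # savelist: i -> value to restore into c(i)
+    cp: int = 0                                         # codepage used to interpret `code`
+```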
@@ -869,24 +873,24 @@ In most cases, the ordinary continuations are the simplest ones, having empty $\
### 4.1.3. Current continuation cc
-The "current continuation" $\texttt{cc}$ is an important part of the total state of TVM, representing the code being executed right now (cf. [1.1](#1-1-tvm-is-a-stack-machine)). In particular, what we call "the current stack" (or simply "the stack") when discussing all other primitives is in fact the stack of the current continuation. All other components of the total state of TVM may be also thought of as parts of the current continuation $\texttt{cc}$; however, they may be extracted from the current continuation and kept separately as part of the total state for performance reasons. This is why we describe the stack, the control registers, and the codepage as separate parts of the TVM state in [1.4](#1-4-total-state-of-tvm-scccg).
+The "current continuation" $\texttt{cc}$ is an important part of the total state of TVM, representing the code being executed right now. In particular, what we call [the current stack](#1-1-tvm-is-a-stack-machine) (or simply "the stack") when discussing all other primitives is in fact the stack of the current continuation. All other components of the total state of TVM may be also thought of as parts of the current continuation $\texttt{cc}$; however, they may be extracted from the current continuation and kept separately as part of the total state for performance reasons. This is why we describe the stack, the control registers, and the codepage as separate parts of the [TVM state](#1-4-total-state-of-tvm-scccg).
### 4.1.4. Normal work of TVM, or the main loop
TVM usually performs the following operations:
-If the current continuation $\texttt{cc}$ is an ordinary one, it decodes the first instruction from the *Slice* $\texttt{code}$, similarly to the way other cells are deserialized by TVM $\texttt{LD*}$ primitives (cf. [3.2](#3-2-data-manipulation-instructions-and-cells) and [3.2.11](#3-2-11-taxonomy-of-cell-deserialization-primitives)): it decodes the opcode first, and then the parameters of the instruction (e.g., 4-bit fields indicating "stack registers" involved for stack manipulation primitives, or constant values for "push constant" or "literal" primitives). The remainder of the *Slice* is then put into the $\texttt{code}$ of the new $\texttt{cc}$, and the decoded operation is executed on the current stack. This entire process is repeated until there are no operations left in $\texttt{cc.code}$.
+If the current continuation $\texttt{cc}$ is an ordinary one, it decodes the first instruction from the *Slice* $\texttt{code}$, similarly to the way other cells are deserialized by TVM $\texttt{LD*}$ primitives ([3.2](#3-2-data-manipulation-instructions-and-cells) and [3.2.11](#3-2-11-taxonomy-of-cell-deserialization-primitives)): it decodes the opcode first, and then the parameters of the instruction (e.g., 4-bit fields indicating "stack registers" involved for stack manipulation primitives, or constant values for "push constant" or "literal" primitives). The remainder of the *Slice* is then put into the $\texttt{code}$ of the new $\texttt{cc}$, and the decoded operation is executed on the current stack. This entire process is repeated until there are no operations left in $\texttt{cc.code}$.
-If the $\texttt{code}$ is empty (i.e., contains no bits of data and no references), or if a (rarely needed) explicit subroutine return ($\texttt{RET}$) instruction is encountered, the current continuation is discarded, and the "return continuation" from control register $\texttt{c0}$ is loaded into $\texttt{cc}$ instead (this process is discussed in more detail starting in [4.1.6](#4-1-6-switching-to-another-continuation-jmp-and-ret)).[20](#fn20) Then the execution continues by parsing operations from the new current continuation.
+If the $\texttt{code}$ is empty (i.e., contains no bits of data and no references), or if a (rarely needed) explicit subroutine return ($\texttt{RET}$) instruction is encountered, the current continuation is discarded, and the "return continuation" from control register $\texttt{c0}$ is [loaded](#4-1-6-switching-to-another-continuation:-and) into $\texttt{cc}$ instead.[20](#fn20) Then the execution continues by parsing operations from the new current continuation.
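+
+Taken together, this "normal work" is a fetch-decode-execute loop that falls back to the return continuation in $\texttt{c0}$ when the code runs out. The sketch below models the loop under heavy simplifications: $\texttt{code}$ is a list of pre-decoded operations instead of a bitstring, only ordinary continuations are supported, and an empty $\texttt{c0}$ plays the role of $\texttt{ec\_quit}$; all names are illustrative:
+
+```python
+from dataclasses import dataclass, field
+from typing import Any, Callable, Dict, List, Optional
+
+@dataclass
+class Cont:                                   # ordinary continuation (cf. the sketch above)
+    code: List[Callable]                      # pre-decoded operations acting on the stack
+    save: Dict[int, Any] = field(default_factory=dict)
+
+def run(cc: Cont, stack: List[Any]) -> List[Any]:
+    regs: Dict[int, Optional[Cont]] = {0: None}      # c0 = None stands in for ec_quit
+    while True:
+        if not cc.code:                       # empty code (an explicit RET behaves the same):
+            ret = regs[0]                     # load the return continuation from c0
+            if ret is None:
+                return stack                  # ec_quit: end of the work of TVM
+            regs.update(ret.save)             # restore control registers from its savelist
+            cc = ret
+        else:
+            op, cc = cc.code[0], Cont(cc.code[1:], cc.save)  # "decode" one operation
+            op(stack)                         # execute it on the current stack
+
+push5 = lambda st: st.append(5)
+double = lambda st: st.append(st.pop() * 2)
+print(run(Cont([push5, double]), []))         # [10]
+```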
### 4.1.5. Extraordinary continuations
-In addition to the ordinary continuations considered so far (cf. [4.1.1](#4-1-1-ordinary-continuations)), TVM includes some *extraordinary continuations*, representing certain less common states. Examples of extraordinary continuations include:
+In addition to the [ordinary continuations](#4-1-1-ordinary-continuations), TVM includes some *extraordinary continuations*, representing certain less common states. Examples of extraordinary continuations include:
- The continuation $\texttt{ec\_quit}$ with its parameter set to zero, which represents the end of the work of TVM. This continuation is the original value of $\texttt{c0}$ when TVM begins executing the code of a smart contract.
- The continuation $\texttt{ec\_until}$, which contains references to two other continuations (ordinary or not) representing the body of the loop being executed and the code to be executed after the loop.
-Execution of an extraordinary continuation by TVM depends on its specific class, and differs from the operations for ordinary continuations described in [4.1.4](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop).[21](#fn21)
+Execution of an extraordinary continuation by TVM depends on its specific class, and [differs](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop) from the operations for ordinary continuations.[21](#fn21)
### 4.1.6. Switching to another continuation: $\texttt{JMP}$ and $\texttt{RET}$
@@ -904,7 +908,7 @@ One could also imagine that the default value of $n''$ equals the depth of the o
### 4.1.8. Restoring control registers from the new continuation $c$
-After the new stack is computed, the values of control registers present in $c$.$\texttt{save}$ are restored accordingly, and the current codepage $\texttt{cp}$ is also set to $c$.$\texttt{cp}$. Only then does TVM set $\texttt{cc}$ equal to the new $c$ and begin its execution.[22](#fn22)
+After the new stack is computed, the values of control registers present in $c$.$\texttt{save}$ are restored accordingly, and the current codepage $\texttt{cp}$ is also set to $c$.$\texttt{cp}$. Only then does TVM set $\texttt{cc}$ equal to the new $c$ and begin its execution.[22](#fn22)
### 4.1.9. Subroutine calls: $\texttt{CALLX}$ or $\texttt{EXECUTE}$ primitives
@@ -912,9 +916,9 @@ The execution of continuations as subroutines is slightly more complicated than
Consider the $\texttt{CALLX}$ or $\texttt{EXECUTE}$ primitive, which takes a continuation $c$ from the (current) stack and executes it as a subroutine.
-Apart from doing the stack manipulations described in [4.1.6](#4-1-6-switching-to-another-continuation-jmp-and-ret) and [4.1.7](#4-1-7-determining-the-number-n-of-arguments-passed-to-the-next-continuation-c) and setting the new control registers and codepage as described in [4.1.8](#4-1-8-restoring-control-registers-from-the-new-continuation-c), these primitives perform several additional steps:
+Apart from doing the stack manipulations described in [4.1.6](#4-1-6-switching-to-another-continuation:-and) and [4.1.7](#4-1-7-determining-the-number-of-arguments-passed-to-the-next-continuation) and setting the new [control registers](#4-1-8-restoring-control-registers-from-the-new-continuation), these primitives perform several additional steps:
-1. After the top $n''$ values are removed from the current stack (cf. [4.1.7](#4-1-7-determining-the-number-n-of-arguments-passed-to-the-next-continuation-c)), the (usually empty) remainder is not discarded, but instead is stored in the (old) current continuation $\texttt{cc}$.
+1. After the top $n''$ values are [removed](#4-1-7-determining-the-number-of-arguments-passed-to-the-next-continuation) from the current stack, the (usually empty) remainder is not discarded, but instead is stored in the (old) current continuation $\texttt{cc}$.
2. The old value of the special register $\texttt{c0}$ is saved into the (previously empty) savelist $\texttt{cc.save}$.
@@ -928,7 +932,7 @@ In this way, the called subroutine can return control to the caller by switching
Similarly to $\texttt{JMPX}$ and $\texttt{RET}$, $\texttt{CALLX}$ also has special (rarely used) forms, which allow us to explicitly specify the number $n''$ of arguments passed from the current stack to the called subroutine (by default, $n''$ equals the depth of the current stack, i.e., it is passed in its entirety). Furthermore, a second number $n'''$ can be specified, used to set $\texttt{nargs}$ of the modified $\texttt{cc}$ continuation before storing it into the new $\texttt{c0}$; the new $\texttt{nargs}$ equals the depth of the old stack minus $n''$ plus $n'''$. This means that the caller is willing to pass exactly $n''$ arguments to the called subroutine, and is willing to accept exactly $n'''$ results in their stead.
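+
+For example, with illustrative numbers: if the current stack has depth 5 and the caller specifies $n''=2$ and $n'''=1$, the callee starts with only the two top values, the other three stay in the old $\texttt{cc}$, and $\texttt{cc.nargs}$ is set to $5-2+1=4$; the subroutine must then return exactly one value, so that the caller's continuation resumes with $3+1=4$ stack values, as promised by $\texttt{nargs}$.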
-Such forms of $\texttt{CALLX}$ and $\texttt{RET}$ are mostly intended for library functions that accept functional arguments and want to invoke them safely. Another application is related to the "virtualization support" of TVM, which enables TVM code to run other TVM code inside a "virtual TVM machine". Such virtualization techniques might be useful for implementing sophisticated payment channels in the TON Blockchain (cf. [[1](#ref-1), 5]).
+Such forms of $\texttt{CALLX}$ and $\texttt{RET}$ are mostly intended for library functions that accept functional arguments and want to invoke them safely. Another application is related to the "virtualization support" of TVM, which enables TVM code to run other TVM code inside a "virtual TVM machine". Such virtualization techniques might be useful for implementing sophisticated [payment channels](/foundations/whitepapers/ton#5-ton-payments) in the TON Blockchain.
### 4.1.11. $\texttt{CALLCC}$: call with current continuation
@@ -944,7 +948,7 @@ An important modification of $\texttt{EXECUTE}$ (or $\texttt{CALLX}$) consists i
More sophisticated modifications of $\texttt{EXECUTE}$ include:
-- $\texttt{REPEAT}$ — Takes an integer $n$ and a continuation $c$, and executes $c$ $n$ times.[23](#fn23)
+- $\texttt{REPEAT}$ — Takes an integer $n$ and a continuation $c$, and executes $c$ $n$ times.[23](#fn23)
- $\texttt{WHILE}$ — Takes $c'$ and $c''$, executes $c'$, and then takes the top value $x$ from the stack. If $x$ is non-zero, it executes $c''$ and then begins a new loop by executing $c'$ again; if $x$ is zero, it stops.
- $\texttt{UNTIL}$ — Takes $c$, executes it, and then takes the top integer $x$ from the stack. If $x$ is zero, a new iteration begins; if $x$ is non-zero, the previously executed code is resumed.
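+
+These semantics are easy to paraphrase in ordinary code. A rough Python rendering, with continuations simplified to callables acting on a shared stack (no savelists or gas accounting; names illustrative):
+
+```python
+from typing import Any, Callable, List
+
+Stack = List[Any]
+Body = Callable[[Stack], None]   # stand-in for a continuation
+
+def repeat(n: int, c: Body, stack: Stack) -> None:
+    """REPEAT: execute c exactly n times."""
+    for _ in range(n):
+        c(stack)
+
+def while_loop(cond: Body, body: Body, stack: Stack) -> None:
+    """WHILE: run c' (cond), pop x; if x is non-zero, run c'' (body) and loop."""
+    while True:
+        cond(stack)
+        if stack.pop() == 0:
+            return
+        body(stack)
+
+def until_loop(body: Body, stack: Stack) -> None:
+    """UNTIL: run c, pop x; a new iteration begins while x is zero."""
+    while True:
+        body(stack)
+        if stack.pop() != 0:
+            return
+```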
@@ -973,9 +977,9 @@ However, some operations with opaque continuations are still possible, mostly be
- Push one or several values into the stack of a continuation $c$ (thus creating a partial application of a function, or a closure).
- Set the saved value of a control register $\texttt{c}(i)$ inside the savelist $c$.$\texttt{save}$ of a continuation $c$. If there is already a value for the control register in question, this operation silently does nothing.
-#### 4.3.3. Example: operations with control registers
+### 4.3.3. Example: operations with control registers
-TVM has some primitives to set and inspect the values of control registers. The most important of them are $\texttt{PUSH c}(i)$ (pushes the current value of $\texttt{c}(i)$ into the stack) and $\texttt{POP c}(i)$ (sets the value of $\texttt{c}(i)$ from the stack, if the supplied value is of the correct type). However, there is also a modified version of the latter instruction, called $\texttt{POPSAVE c}(i)$, which saves the old value of $\texttt{c}(i)$ (for $i>0$) into the continuation at $\texttt{c0}$ as described in [4.3.2](#4-3-2-allowed-operations-with-continuations) before setting the new value.
+TVM has some primitives to set and inspect the values of control registers. The most important of them are $\texttt{PUSH c}(i)$ (pushes the current value of $\texttt{c}(i)$ into the stack) and $\texttt{POP c}(i)$ (sets the value of $\texttt{c}(i)$ from the stack, if the supplied value is of the correct type). However, there is also a modified version of the latter instruction, called $\texttt{POPSAVE c}(i)$, which saves the old value of $\texttt{c}(i)$ (for $i>0$) into the [continuation](#4-3-2-allowed-operations-with-continuations) at $\texttt{c0}$ before setting the new value.
### 4.3.4. Example: setting the number of arguments to a function in its code
@@ -987,7 +991,7 @@ A continuation $c$ may be thought of as a piece of code with two optional exit p
### 4.3.6. Composition of continuations
-One can *compose* two continuations $c$ and $c'$ simply by setting $c$.$\texttt{c0}$ or $c$.$\texttt{c1}$ to $c'$. This creates a new continuation denoted by $c\circ_0c'$ or $c\circ_1c'$, which differs from $c$ in its savelist. (Recall that if the savelist of $c$ already has an entry corresponding to the control register in question, such an operation silently does nothing as explained in [4.3.2](#4-3-2-allowed-operations-with-continuations)).
+One can *compose* two continuations $c$ and $c'$ simply by setting $c$.$\texttt{c0}$ or $c$.$\texttt{c1}$ to $c'$. This creates a new continuation denoted by $c\circ_0c'$ or $c\circ_1c'$, which differs from $c$ in its savelist. (Recall that if the savelist of $c$ already has an entry corresponding to the control register in question, such an operation silently [does nothing](#4-3-2-allowed-operations-with-continuations)).
By composing continuations, one can build chains or other graphs, possibly with loops, representing the control flow. In fact, the resulting graph resembles a flow chart, with the boolean circuits corresponding to the "condition nodes" (containing code that will transfer control either to $\texttt{c0}$ or to $\texttt{c1}$ depending on some condition), and the one-exit continuations corresponding to the "action nodes".
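+
+A sketch of this composition in the same Python terms, with a continuation reduced to a plain record and the "existing entry wins" rule of [4.3.2](#4-3-2-allowed-operations-with-continuations) made explicit (names illustrative):
+
+```python
+def compose(c: dict, c_prime: dict, i: int = 0) -> dict:
+    """c ∘_i c': a copy of c whose saved c(i) is set to c', unless the savelist
+    of c already has an entry for c(i); then the operation silently does nothing."""
+    save = dict(c.get("save", {}))
+    save.setdefault(i, c_prime)   # no-op if c(i) is already present
+    return {**c, "save": save}
+
+step2 = {"code": "<code of c'>", "save": {}}
+step1 = compose({"code": "<code of c>", "save": {}}, step2)  # c ∘_0 c': run c, then c'
+```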
@@ -1017,11 +1021,11 @@ Finally, some "experimental" primitives also involve $\texttt{c1}$ and $\circ_1$
Object-oriented programming in Smalltalk (or Objective C) style may be implemented with the aid of continuations. For this, an object is represented by a special continuation $o$. If it has any data fields, they can be kept in the stack of $o$, making $o$ a partial application (i.e., a continuation with a non-empty stack).
-When somebody wants to invoke a method $m$ of $o$ with arguments $x_1$, $x_2$, $\ldots$, $x_n$, she pushes the arguments into the stack, then pushes a magic number corresponding to the method $m$, and then executes $o$ passing $n+1$ arguments (cf. [4.1.10](#4-1-10-determining-the-number-of-arguments-passed-to-and-or-return-values-accepted-from-a-subroutine)). Then $o$ uses the top-of-stack integer $m$ to select the branch with the required method, and executes it. If $o$ needs to modify its state, it simply computes a new continuation $o'$ of the same sort (perhaps with the same code as $o$, but with a different initial stack). The new continuation $o'$ is returned to the caller along with whatever other return values need to be returned.
+When somebody wants to invoke a method $m$ of $o$ with arguments $x_1$, $x_2$, $\ldots$, $x_n$, she pushes the arguments into the stack, then pushes a magic number corresponding to the method $m$, and then executes $o$ passing $n+1$ [arguments](#4-1-10-determining-the-number-of-arguments-passed-to-and%2For-return-values-accepted-from-a-subroutine). Then $o$ uses the top-of-stack integer $m$ to select the branch with the required method, and executes it. If $o$ needs to modify its state, it simply computes a new continuation $o'$ of the same sort (perhaps with the same code as $o$, but with a different initial stack). The new continuation $o'$ is returned to the caller along with whatever other return values need to be returned.
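+
+This dispatch pattern can be transcribed into ordinary code. Below is a loose Python analogue in which an object is a closure, its captured state plays the role of the continuation's stack, and every method call returns the new object $o'$ along with the results (names illustrative):
+
+```python
+from typing import Any, Callable, Tuple
+
+GET, ADD = 0, 1   # "magic numbers" selecting the methods
+
+def make_counter(total: int) -> Callable:
+    """An object o; the captured `total` plays the role of o's stack."""
+    def o(m: int, *args: Any) -> Tuple[Callable, Any]:
+        if m == GET:
+            return make_counter(total), total           # state unchanged, value returned
+        if m == ADD:
+            return make_counter(total + args[0]), None  # "modified" state = new object o'
+        raise ValueError("unknown method")
+    return o
+
+o = make_counter(0)
+o, _ = o(ADD, 5)      # invoke method ADD with one argument
+o, value = o(GET)     # invoke method GET
+print(value)          # 5
+```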
### 4.4.2. Serializable objects
-Another way of representing Smalltalk-style objects as continuations, or even as trees of cells, consists in using the $\texttt{JMPREFDATA}$ primitive (a variant of $\texttt{JMPXDATA}$, cf. [4.1.11](#4-1-11-callcc%3A-call-with-current-continuation)), which takes the first cell reference from the code of the current continuation, transforms the cell referred to into a simple ordinary continuation, and transfers control to it, first pushing the remainder of the current continuation as a *Slice* into the stack. In this way, an object might be represented by a cell $\tilde o$ that contains $\texttt{JMPREFDATA}$ at the beginning of its data, and the actual code of the object in the first reference (one might say that the first reference of cell $\tilde o$ is the *class* of object $\tilde o$). Remaining data and references of this cell will be used for storing the fields of the object.
+Another way of representing Smalltalk-style objects as continuations, or even as trees of cells, consists in using the $\texttt{JMPREFDATA}$ primitive (a [variant](#4-1-11-%3A-call-with-current-continuation) of $\texttt{JMPXDATA}$), which takes the first cell reference from the code of the current continuation, transforms the cell referred to into a simple ordinary continuation, and transfers control to it, first pushing the remainder of the current continuation as a *Slice* into the stack. In this way, an object might be represented by a cell $\tilde o$ that contains $\texttt{JMPREFDATA}$ at the beginning of its data, and the actual code of the object in the first reference (one might say that the first reference of cell $\tilde o$ is the *class* of object $\tilde o$). Remaining data and references of this cell will be used for storing the fields of the object.
Such objects have the advantage of being trees of cells, and not just continuations, meaning that they can be stored into the persistent storage of a TON smart contract.
@@ -1029,7 +1033,7 @@ Such objects have the advantage of being trees of cells, and not just continuati
It might make sense (in a future revision of TVM) to mark some continuations as *unique*, meaning that they cannot be copied, even in a delayed manner, by increasing their reference counter to a value greater than one. If an opaque continuation is unique, it essentially becomes a *capability*, which can either be used by its owner exactly once or be transferred to somebody else.
-For example, imagine a continuation that represents the output stream to a printer (this is an example of a continuation used as an object, cf. [4.4.1](#4-4-1-representing-objects-using-continuations)). When invoked with one integer argument $n$, this continuation outputs the character with code $n$ to the printer, and returns a new continuation of the same kind reflecting the new state of the stream. Obviously, copying such a continuation and using the two copies in parallel would lead to some unintended side effects; marking it as unique would prohibit such adverse usage.
+For example, imagine a continuation that [represents](#4-4-1-representing-objects-using-continuations) the output stream to a printer (this is an example of a continuation used as an object). When invoked with one integer argument $n$, this continuation outputs the character with code $n$ to the printer, and returns a new continuation of the same kind reflecting the new state of the stream. Obviously, copying such a continuation and using the two copies in parallel would lead to some unintended side effects; marking it as unique would prohibit such adverse usage.
## 4.5 Exception handling
@@ -1049,13 +1053,13 @@ Of course, some exceptions are generated by normal primitives. For example, an a
### 4.5.4. Exception handling
-The exception handling itself consists in a control transfer to the exception handler—i.e., the continuation specified in control register $\texttt{c2}$, with $v$ and $n$ supplied as the two arguments to this continuation, as if a $\texttt{JMP}$ to $\texttt{c2}$ had been requested with $n''=2$ arguments (cf. [4.1.7](#4-1-7-determining-the-number-n-of-arguments-passed-to-the-next-continuation-c) and [4.1.6](#4-1-6-switching-to-another-continuation-jmp-and-ret)). As a consequence, $v$ and $n$ end up in the top of the stack of the exception handler. The remainder of the old stack is discarded.
+The exception handling itself consists in a control transfer to the exception handler—i.e., the continuation specified in control register $\texttt{c2}$, with $v$ and $n$ supplied as the two arguments to this continuation, as if a $\texttt{JMP}$ to $\texttt{c2}$ had been requested with $n''=2$ arguments ([4.1.7](#4-1-7-determining-the-number-of-arguments-passed-to-the-next-continuation) and [4.1.6](#4-1-6-switching-to-another-continuation:-and)). As a consequence, $v$ and $n$ end up in the top of the stack of the exception handler. The remainder of the old stack is discarded.
Notice that if the continuation in $\texttt{c2}$ has a value for $\texttt{c2}$ in its savelist, it will be used to set up the new value of $\texttt{c2}$ before executing the exception handler. In particular, if the exception handler invokes $\texttt{THROWANY}$, it will re-throw the original exception with the restored value of $\texttt{c2}$. This trick enables the exception handler to handle only some exceptions, and pass the rest to an outer exception handler.
### 4.5.5. Default exception handler
-When an instance of TVM is created, $\texttt{c2}$ contains a reference to the "default exception handler continuation", which is an $\texttt{ec\_fatal}$ extraordinary continuation (cf. [4.1.5](#4-1-5-extraordinary-continuations)). Its execution leads to the termination of the execution of TVM, with the arguments $v$ and $n$ of the exception returned to the outside caller. In the context of the TON Blockchain, $n$ will be stored as a part of the transaction's result.
+When an instance of TVM is created, $\texttt{c2}$ contains a reference to the "default exception handler continuation", which is an $\texttt{ec\_fatal}$ [extraordinary continuation](#4-1-5-extraordinary-continuations). Its execution leads to the termination of the execution of TVM, with the arguments $v$ and $n$ of the exception returned to the outside caller. In the context of the TON Blockchain, $n$ will be stored as a part of the transaction's result.
### 4.5.6. $\texttt{TRY}$ primitive
@@ -1082,7 +1086,7 @@ Predefined exceptions of TVM correspond to exception numbers $n$ in the range 0
- *Fatal error* ($n=12$) — Thrown by TVM in situations deemed impossible.
- *Out of gas* ($n=13$) — Thrown by TVM when the remaining gas ($g_r$) becomes negative. This exception usually cannot be caught and leads to an immediate termination of TVM.
-Most of these exceptions have no parameter (i.e., use a zero integer instead). The order in which these exceptions are checked is outlined below in [4.5.8](#4-5-8-order-of-stack-underflow-type-check-and-range-check-exceptions).
+Most of these exceptions have no parameter (i.e., use a zero integer instead). The order in which these exceptions are checked is outlined [below](#4-5-8-order-of-stack-underflow%2C-type-check%2C-and-range-check-exceptions).
### 4.5.8. Order of stack underflow, type check, and range check exceptions
@@ -1094,7 +1098,7 @@ Some primitives accept a variable number of arguments, depending on the values o
### 4.6.1. The problem of recursion
-The conditional and iterated execution primitives described in [4.2](#4-2-control-flow-primitives-conditional-and-iterated-execution)—along with the unconditional branch, call, and return primitives described in [4.1](#4-1-continuations-and-subroutines)—enable one to implement more or less arbitrary code with nested loops and conditional expressions, with one notable exception: one can only create new constant continuations from parts of the current continuation. (In particular, one cannot invoke a subroutine from itself in this way.) Therefore, the code being executed—i.e., the current continuation—gradually becomes smaller and smaller.[24](#fn24)
+The [conditional and iterated](#4-2-control-flow-primitives:-conditional-and-iterated-execution) execution primitives—along with the unconditional branch, call, and return [primitives](#4-1-continuations-and-subroutines)—enable one to implement more or less arbitrary code with nested loops and conditional expressions, with one notable exception: one can only create new constant continuations from parts of the current continuation. (In particular, one cannot invoke a subroutine from itself in this way.) Therefore, the code being executed—i.e., the current continuation—gradually becomes smaller and smaller.[24](#fn24)
### 4.6.2. Y-combinator solution: pass a continuation as an argument to itself
@@ -1191,7 +1195,7 @@ However, even if we use one of the two previous approaches to combine all functi
### 4.6.9. Special register $c3$ for the selector function
-In fact, TVM uses a dedicated register $\texttt{c3}$ to keep the continuation representing the current or global "selector function", which can be used to invoke any of a family of mutually recursive functions. Special primitives $\texttt{CALL}$ $nn$ or $\texttt{CALLDICT}$ $nn$ (cf. [A.8.7](#a-8-7-dictionary-subroutine-calls-and-jumps)) are equivalent to $\texttt{PUSHINT}$ $nn$; $\texttt{PUSH c3}$; $\texttt{EXECUTE}$, and similarly $\texttt{JMP}$ $nn$ or $\texttt{JMPDICT}$ $nn$ are equivalent to $\texttt{PUSHINT}$ $nn$; $\texttt{PUSH c3}$; $\texttt{JMPX}$. In this way a TVM program, which ultimately is a large collection of mutually recursive functions, may initialize $\texttt{c3}$ with the correct selector function representing the family of all the functions in the program, and then use $\texttt{CALL}$ $nn$ to invoke any of these functions by its index (sometimes also called the *selector* of a function).
+In fact, TVM uses a dedicated register $\texttt{c3}$ to keep the continuation representing the current or global "selector function", which can be used to invoke any of a family of mutually recursive functions. [Special primitives](#a-8-7-dictionary-subroutine-calls-and-jumps) $\texttt{CALL}$ $nn$ or $\texttt{CALLDICT}$ $nn$ are equivalent to $\texttt{PUSHINT}$ $nn$; $\texttt{PUSH c3}$; $\texttt{EXECUTE}$, and similarly $\texttt{JMP}$ $nn$ or $\texttt{JMPDICT}$ $nn$ are equivalent to $\texttt{PUSHINT}$ $nn$; $\texttt{PUSH c3}$; $\texttt{JMPX}$. In this way a TVM program, which ultimately is a large collection of mutually recursive functions, may initialize $\texttt{c3}$ with the correct selector function representing the family of all the functions in the program, and then use $\texttt{CALL}$ $nn$ to invoke any of these functions by its index (sometimes also called the *selector* of a function).
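+
+The expansion of $\texttt{CALL}$ $nn$ can be mimicked directly. In the Python sketch below, the selector kept in $\texttt{c3}$ pops the function index from the stack and dispatches on it (names illustrative; no real TVM encoding involved):
+
+```python
+from typing import Any, Callable, Dict, List
+
+def make_selector(funcs: Dict[int, Callable]) -> Callable:
+    """The selector function kept in c3: pop the index nn, invoke function nn."""
+    def selector(stack: List[Any]) -> None:
+        nn = stack.pop()          # the index pushed by PUSHINT nn
+        funcs[nn](stack)
+    return selector
+
+def call(nn: int, c3: Callable, stack: List[Any]) -> None:
+    """CALL nn = PUSHINT nn; PUSH c3; EXECUTE, modeled directly."""
+    stack.append(nn)              # PUSHINT nn
+    c3(stack)                     # PUSH c3; EXECUTE
+
+def inc(stack: List[Any]) -> None: stack.append(stack.pop() + 1)   # selector 0
+def dbl(stack: List[Any]) -> None: stack.append(stack.pop() * 2)   # selector 1
+
+c3 = make_selector({0: inc, 1: dbl})
+stack: List[Any] = [5]
+call(0, c3, stack)    # CALL 0
+call(1, c3, stack)    # CALL 1
+print(stack)          # [12]
+```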
### 4.6.10. Initialization of $c3$
@@ -1199,13 +1203,13 @@ A TVM program might initialize $\texttt{c3}$ by means of a $\texttt{POP c3}$ ins
### 4.6.11. Creating selector functions and $\texttt{switch}$ statements
-TVM makes special provisions for simple and concise implementation of selector functions (which usually constitute the top level of a TVM program) or, more generally, arbitrary `switch` or `case` statements (which are also useful in TVM programs). The most important primitives included for this purpose are $\texttt{IFBITJMP}$, $\texttt{IFNBITJMP}$, $\texttt{IFBITJMPREF}$, and $\texttt{IFNBITJMPREF}$ (cf. [A.8.2](#a-8-2-conditional-control-flow-primitives)). They effectively enable one to combine subroutines, kept either in separate cells or as subslices of certain cells, into a binary decision tree with decisions made according to the indicated bits of the integer passed in the top of the stack.
+TVM makes special provisions for simple and concise implementation of selector functions (which usually constitute the top level of a TVM program) or, more generally, arbitrary `switch` or `case` statements (which are also useful in TVM programs). The most important [primitives](#a-8-2-conditional-control-flow-primitives) included for this purpose are $\texttt{IFBITJMP}$, $\texttt{IFNBITJMP}$, $\texttt{IFBITJMPREF}$, and $\texttt{IFNBITJMPREF}$. They effectively enable one to combine subroutines, kept either in separate cells or as subslices of certain cells, into a binary decision tree with decisions made according to the indicated bits of the integer passed in the top of the stack.
-Another instruction, useful for the implementation of sum-product types, is $\texttt{PLDUZ}$ (cf. [A.7.2](#a-7-2-cell-deserialization-primitives)). This instruction preloads the first several bits of a *Slice* into an *Integer*, which can later be inspected by $\texttt{IFBITJMP}$ and other similar instructions.
+Another instruction, useful for the implementation of sum-product types, is $\texttt{PLDUZ}$ ([A.7.2](#a-7-2-cell-deserialization-primitives)). This instruction preloads the first several bits of a *Slice* into an *Integer*, which can later be inspected by $\texttt{IFBITJMP}$ and other similar instructions.
### 4.6.12. Alternative: using a hashmap to select the correct function
-Yet another alternative is to use a *Hashmap* (cf. [3.3](#3-3-hashmaps-or-dictionaries)) to hold the "collection" or "dictionary" of the code of all functions in a program, and use the hashmap lookup primitives (cf. [A.10](#a-10-dictionary-manipulation-primitives)) to select the code of the required function, which can then be $\texttt{BLESS}$ed into a continuation (cf. [A.8.5](#a-8-5-creating-simple-continuations-and-closures)) and executed. Special combined "lookup, bless, and execute" primitives, such as $\texttt{DICTIGETJMP}$ and $\texttt{DICTIGETEXEC}$, are also available (cf. [A.10.11](#a-10-11-special-get-dictionary-and-prefix-code-dictionary-operations-and-constant-dictionaries)). This approach may be more efficient for larger programs and `switch` statements.
+Yet another alternative is to use a [Hashmap](#3-3-hashmaps%2C-or-dictionaries) to hold the "collection" or "dictionary" of the code of all functions in a program, and use the [hashmap lookup primitives](#a-10-dictionary-manipulation-primitives) to select the code of the required function, which can then be $\texttt{BLESS}$ed into a [continuation](#a-8-5-creating-simple-continuations-and-closures) and executed. Special combined "lookup, bless, and execute" primitives, such as $\texttt{DICTIGETJMP}$ and $\texttt{DICTIGETEXEC}$, are also [available](#a-10-11-special-dictionary-and-prefix-code-dictionary-operations%2C-and-constant-dictionaries). This approach may be more efficient for larger programs and `switch` statements.
---
@@ -1213,7 +1217,7 @@ Yet another alternative is to use a *Hashmap* (cf. [3.3](#3-3-hashmaps-or-dictio
This chapter describes the codepage mechanism, which allows TVM to be flexible and extendable while preserving backward compatibility with respect to previously generated code.
-We also discuss some general considerations about instruction encodings (applicable to arbitrary machine code, not just TVM), as well as the implications of these considerations for TVM and the choices made while designing TVM's (experimental) codepage zero. The instruction encodings themselves are presented later in Appendix [A](#a-instructions-and-opcodes).
+We also discuss some general considerations about instruction encodings (applicable to arbitrary machine code, not just TVM), as well as the implications of these considerations for TVM and the choices made while designing TVM's (experimental) codepage zero. The instruction encodings themselves are presented later in [Appendix A](#a-instructions-and-opcodes).
## 5.1 Codepages and interoperability of different TVM versions
@@ -1221,11 +1225,11 @@ The *codepages* are an essential mechanism of backward compatibility and of futu
### 5.1.1. Codepages in continuations
-Every ordinary continuation contains a 16-bit *codepage* field $\texttt{cp}$ (cf. [4.1.1](#4-1-1-ordinary-continuations)), which determines the codepage that will be used to execute its code. If a continuation is created by a $\texttt{PUSHCONT}$ (cf. [4.2.3](#4-2-3-constant%2C-or-literal%2C-continuations)) or similar primitive, it usually inherits the current codepage (i.e., the codepage of $\texttt{cc}$).[25](#fn25)
+Every [ordinary continuation](#4-1-1-ordinary-continuations) contains a 16-bit *codepage* field $\texttt{cp}$, which determines the codepage that will be used to execute its code. If a [continuation](#4-2-3-constant%2C-or-literal%2C-continuations) is created by a $\texttt{PUSHCONT}$ or similar primitive, it usually inherits the current codepage (i.e., the codepage of $\texttt{cc}$).[25](#fn25)
### 5.1.2. Current codepage
-The current codepage $\texttt{cp}$ (cf. [1.4](#1-4-total-state-of-tvm-scccg)) is the codepage of the current continuation $\texttt{cc}$. It determines the way the next instruction will be decoded from $\texttt{cc.code}$, the remainder of the current continuation's code. Once the instruction has been decoded and executed, it determines the next value of the current codepage. In most cases, the current codepage is left unchanged.
+The [current codepage](#1-4-total-state-of-tvm-scccg) $\texttt{cp}$ is the codepage of the current continuation $\texttt{cc}$. It determines the way the next instruction will be decoded from $\texttt{cc.code}$, the remainder of the current continuation's code. Once the instruction has been decoded and executed, it determines the next value of the current codepage. In most cases, the current codepage is left unchanged.
On the other hand, all primitives that switch the current continuation load the new value of $\texttt{cp}$ from the new current continuation. In this way, all code in continuations is always interpreted exactly as it was intended to be.
@@ -1239,7 +1243,7 @@ However, a newer version of TVM will execute old code for codepage zero exactly
New codepages can also change the effects of some operations present in the old codepages while preserving their opcodes and mnemonics.
-For example, imagine a future 513-bit upgrade of TVM (replacing the current 257-bit design). It might use a 513-bit *Integer* type within the same arithmetic primitives as before. However, while the opcodes and instructions in the new codepage would look exactly like the old ones, they would work differently, accepting 513-bit integer arguments and results. On the other hand, during the execution of the same code in codepage zero, the new machine would generate exceptions whenever the integers used in arithmetic and other primitives do not fit into 257 bits.[26](#fn26) In this way, the upgrade would not change the behavior of the old code.
+For example, imagine a future 513-bit upgrade of TVM (replacing the current 257-bit design). It might use a 513-bit *Integer* type within the same arithmetic primitives as before. However, while the opcodes and instructions in the new codepage would look exactly like the old ones, they would work differently, accepting 513-bit integer arguments and results. On the other hand, during the execution of the same code in codepage zero, the new machine would generate exceptions whenever the integers used in arithmetic and other primitives do not fit into 257 bits.[26](#fn26) In this way, the upgrade would not change the behavior of the old code.
### 5.1.5. Improving instruction encoding
@@ -1261,7 +1265,7 @@ Alternatively, one might create a couple of codepages—say, 4 and 5—which dif
### 5.1.8. Setting the codepage in the code itself
-For convenience, we reserve some opcode in all codepages—say, $\texttt{FF}$ $n$—for the instruction $\texttt{SETCP}$ $n$, with $n$ from 0 to 255 (cf. [A.13](#a-13-codepage-primitives)). Then by inserting such an instruction into the very beginning of (the main function of) a program (e.g., a TON Blockchain smart contract) or a library function, we can ensure that the code will always be executed in the intended codepage.
+For convenience, we reserve some opcode in all codepages—say, $\texttt{FF}$ $n$—for the instruction $\texttt{SETCP}$ $n$, with $n$ from 0 to 255 ([A.13](#a-13-codepage-primitives)). Then by inserting such an instruction into the very beginning of (the main function of) a program (e.g., a TON Blockchain smart contract) or a library function, we can ensure that the code will always be executed in the intended codepage.
## 5.2 Instruction encoding
@@ -1277,21 +1281,21 @@ As a consequence of this encoding method, any binary string admits at most one p
### 5.2.3. Invalid opcode
-If no prefix of $\texttt{cc.code}$ encodes a valid instruction in the current codepage, an *invalid opcode exception* is generated (cf. [4.5.7](#4-5-7-list-of-predefined-exceptions)). However, the case of an empty $\texttt{cc.code}$ is treated separately as explained in [4.1.4](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop) (the exact behavior may depend on the current codepage).
+If no prefix of $\texttt{cc.code}$ encodes a valid instruction in the current codepage, an invalid opcode [exception](#4-5-7-list-of-predefined-exceptions) is generated. However, the case of an empty $\texttt{cc.code}$ is treated [separately](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop) (the exact behavior may depend on the current codepage).
### 5.2.4. Special case: end-of-code padding
As an exception to the above rule, some codepages may accept some values of $\texttt{cc.code}$ that are too short to be valid instruction encodings as additional variants of $\texttt{NOP}$, thus effectively using the same procedure for them as for an empty $\texttt{cc.code}$. Such bitstrings may be used for padding the code near its end.
-For example, if binary string $\texttt{00000000}$ (i.e., $\texttt{x00}$, cf. [1.0.3](#1-0-3-emphasizing-that-a-string-is-a-hexadecimal-representation-of-a-bitstring)) is used in a codepage to encode $\texttt{NOP}$, its proper prefixes cannot encode any instructions. So this codepage may accept $\texttt{0}$, $\texttt{00}$, $\texttt{000}$, $\ldots$, $\texttt{0000000}$ as variants of $\texttt{NOP}$ if this is all that is left in $\texttt{cc.code}$, instead of generating an invalid opcode exception.
+For example, if [binary string](#1-0-3-emphasizing-that-a-string-is-a-hexadecimal-representation-of-a-bitstring) $\texttt{00000000}$ (i.e., $\texttt{x00}$) is used in a codepage to encode $\texttt{NOP}$, its proper prefixes cannot encode any instructions. So this codepage may accept $\texttt{0}$, $\texttt{00}$, $\texttt{000}$, $\ldots$, $\texttt{0000000}$ as variants of $\texttt{NOP}$ if this is all that is left in $\texttt{cc.code}$, instead of generating an invalid opcode exception.
-Such a padding may be useful, for example, if the $\texttt{PUSHCONT}$ primitive (cf. [4.2.3](#4-2-3-constant-or-literal-continuations)) creates only continuations with code consisting of an integral number of bytes, but not all instructions are encoded by an integral number of bytes.
+Such a padding may be useful, for example, if the $\texttt{PUSHCONT}$ primitive creates only [continuations](#4-2-3-constant%2C-or-literal%2C-continuations) with code consisting of an integral number of bytes, but not all instructions are encoded by an integral number of bytes.
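+
+A decoder honoring this padding rule might look as follows; the sketch assumes the hypothetical codepage above, in which the eight zero bits $\texttt{x00}$ encode $\texttt{NOP}$ (bitstrings are modeled as Python strings):
+
+```python
+from typing import Optional, Tuple
+
+def decode_next(code: str) -> Optional[Tuple[str, str]]:
+    """Decode one instruction from cc.code, returning (mnemonic, rest)."""
+    if code == "":
+        return None                      # empty code: implicit RET to c0
+    if code.startswith("00000000"):
+        return ("NOP", code[8:])         # x00 encodes NOP in this codepage
+    if len(code) < 8 and set(code) == {"0"}:
+        return ("NOP", "")               # end-of-code padding, accepted as NOP
+    raise ValueError("invalid opcode")   # no prefix encodes a valid instruction
+```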
### 5.2.5. TVM code is a bitcode, not a bytecode
-Recall that TVM is a bit-oriented machine in the sense that its *Cell*s (and *Slice*s) are naturally considered as sequences of bits, not just of octets (bytes), cf. [3.2.5](#3-2-5-cells-and-cell-primitives-are-bit-oriented%2C-not-byte-oriented). Because the TVM code is also kept in cells (cf. [3.1.9](#3-1-9-tvm-code-is-a-tree-of-cells) and [4.1.4](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop)), there is no reason to use only bitstrings of length divisible by eight as encodings of complete instructions. In other words, generally speaking, *the TVM code is a bitcode, not a bytecode*.
+Recall that TVM is a bit-oriented machine in the sense that its *Cell*s (and *Slice*s) are naturally considered as [sequences of bits](#3-2-5-cells-and-cell-primitives-are-bit-oriented%2C-not-byte-oriented), not just of octets (bytes). Because the TVM code is also kept in cells ([3.1.9](#3-1-9-tvm-code-is-a-tree-of-cells) and [4.1.4](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop)), there is no reason to use only bitstrings of length divisible by eight as encodings of complete instructions. In other words, generally speaking, *the TVM code is a bitcode, not a bytecode*.
-That said, some codepages (such as our experimental codepage zero) may opt to use a bytecode (i.e., to use only encodings consisting of an integral number of bytes)—either for simplicity, or for the ease of debugging and of studying memory (i.e., cell) dumps.[27](#fn27)
+That said, some codepages (such as our experimental codepage zero) may opt to use a bytecode (i.e., to use only encodings consisting of an integral number of bytes)—either for simplicity, or for the ease of debugging and of studying memory (i.e., cell) dumps.[27](#fn27)
### 5.2.6. Opcode space used by a complete instruction
@@ -1309,7 +1313,7 @@ This approximation shows why all instructions cannot occupy together more than t
### 5.2.9. Almost optimal encodings
-Coding theory tells us that in an optimally dense encoding, the portion of the opcode space used by a complete instruction ($2^{-l}$, if the complete instruction is encoded in $l$ bits) should be approximately equal to the probability or frequency of its occurrence in real programs.[28](#fn28) The same should hold for (incomplete) instructions, or primitives (i.e., generic instructions without specified values of parameters), and for classes of instructions.
+Coding theory tells us that in an optimally dense encoding, the portion of the opcode space used by a complete instruction ($2^{-l}$, if the complete instruction is encoded in $l$ bits) should be approximately equal to the probability or frequency of its occurrence in real programs.[28](#fn28) The same should hold for (incomplete) instructions, or primitives (i.e., generic instructions without specified values of parameters), and for classes of instructions.
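+
+For instance, an instruction that accounts for roughly $1/64$ of the instructions in typical code should, by this heuristic, receive an encoding of about $l=6$ bits, since $2^{-6}=1/64$: a 16-bit encoding would waste code density on it, while a 4-bit encoding would crowd more frequent instructions out of the short opcodes.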
### 5.2.10. Example: stack manipulation primitives
@@ -1319,7 +1323,7 @@ For instance, if stack manipulation instructions constitute approximately half o
In most cases, *simple* encodings of complete instructions are used. Simple encodings begin with a fixed bitstring called the *opcode* of the instruction, followed by, say, 4-bit fields containing the indices $i$ of stack registers $\texttt{s}(i)$ specified in the instruction, followed by all other constant (literal, immediate) parameters included in the complete instruction. While simple encodings may not be exactly optimal, they admit short descriptions, and their decoding and encoding can be easily implemented.
-If a (generic) instruction uses a simple encoding with an $l$-bit opcode, then the instruction will utilize $2^{-l}$ portion of the opcode space. This observation might be useful for considerations described in [5.2.9](#5-2-9-almost-optimal-encodings) and [5.2.10](#5-2-10-example-stack-manipulation-primitives).
+If a (generic) instruction uses a simple encoding with an $l$-bit opcode, then the instruction will utilize $2^{-l}$ portion of the opcode space. This observation might be useful for considerations described in [5.2.9](#5-2-9-almost-optimal-encodings) and [5.2.10](#5-2-10-example:-stack-manipulation-primitives).
### 5.2.12. Optimizing code density further: Huffman codes
@@ -1327,53 +1331,56 @@ One might construct optimally dense binary code for the set of all complete inst
### 5.2.13. Practical instruction encodings
-In practice, instruction encodings used in TVM and other virtual machines offer a compromise between code density and ease of encoding and decoding. Such a compromise may be achieved by selecting simple encodings (cf. [5.2.11](#5-2-11-simple-encodings-of-instructions)) for all instructions (maybe with separate simple encodings for some often used variants, such as $\texttt{XCHG s0,s}(i)$ among all $\texttt{XCHG s}(i)\texttt{,s}(j)$), and allocating opcode space for such simple encodings using the heuristics outlined in [5.2.9](#5-2-9-almost-optimal-encodings) and [5.2.10](#5-2-10-example-stack-manipulation-primitives); this is the approach currently used in TVM.
+In practice, instruction encodings used in TVM and other virtual machines offer a compromise between code density and ease of encoding and decoding. Such a compromise may be achieved by selecting [simple encodings](#5-2-11-simple-encodings-of-instructions) for all instructions (maybe with separate simple encodings for some often used variants, such as $\texttt{XCHG s0,s}(i)$ among all $\texttt{XCHG s}(i)\texttt{,s}(j)$), and allocating opcode space for such simple encodings using the heuristics outlined in [5.2.9](#5-2-9-almost-optimal-encodings) and [5.2.10](#5-2-10-example:-stack-manipulation-primitives); this is the approach currently used in TVM.
+
+---
## 5.3 Instruction encoding in codepage zero
-This section provides details about the experimental instruction encoding for codepage zero, as described elsewhere in this document (cf. Appendix [A](#a-instructions-and-opcodes)) and used in the preliminary test version of TVM.
+This section provides details about the experimental instruction encoding for codepage zero, as described in [Appendix A](#a-instructions-and-opcodes) and used in the preliminary test version of TVM.
### 5.3.1. Upgradability
-First of all, even if this preliminary version somehow gets into the production version of the TON Blockchain, the codepage mechanism (cf. [5.1](#5-1-codepages-and-interoperability-of-different-tvm-versions)) enables us to introduce better versions later without compromising backward compatibility.[29](#fn29) So in the meantime, we are free to experiment.
+First of all, even if this preliminary version somehow gets into the production version of the TON Blockchain, the [codepage mechanism](#5-1-codepages-and-interoperability-of-different-tvm-versions) enables us to introduce better versions later without compromising backward compatibility.[29](#fn29) So in the meantime, we are free to experiment.
-#### 5.3.2. Choice of instructions
+### 5.3.2. Choice of instructions
-We opted to include many "experimental" and not strictly necessary instructions in codepage zero just to see how they might be used in real code. For example, we have both the basic (cf. [2.2.1](#2-2-1-basic-stack-manipulation-primitives)) and the compound (cf. [2.2.3](#2-2-3-compound-stack-manipulation-primitives)) stack manipulation primitives, as well as some "unsystematic" ones such as $\texttt{ROT}$ (mostly borrowed from Forth). If such primitives are rarely used, their inclusion just wastes some part of the opcode space and makes the encodings of other instructions slightly less effective, something we can afford at this stage of TVM's development.
+We opted to include many "experimental" and not strictly necessary instructions in codepage zero just to see how they might be used in real code. For example, we have both the [basic](#2-2-1-basic-stack-manipulation-primitives) and the [compound](#2-2-3-compound-stack-manipulation-primitives) stack manipulation primitives, as well as some "unsystematic" ones such as $\texttt{ROT}$ (mostly borrowed from Forth). If such primitives are rarely used, their inclusion just wastes some part of the opcode space and makes the encodings of other instructions slightly less effective, something we can afford at this stage of TVM's development.
-#### 5.3.3. Using experimental instructions
+### 5.3.3. Using experimental instructions
Some of these experimental instructions have been assigned quite long opcodes, just to fit more of them into the opcode space. One should not be afraid to use them just because they are long; if these instructions turn out to be useful, they will receive shorter opcodes in future revisions. Codepage zero is not meant to be fine-tuned in this respect.
-#### 5.3.4. Choice of bytecode
+### 5.3.4. Choice of bytecode
-We opted to use a bytecode (i.e., to use encodings of complete instructions of lengths divisible by eight). While this may not produce optimal code density, because such a length restriction makes it more difficult to match portions of opcode space used for the encoding of instructions with estimated frequencies of these instructions in TVM code (cf. [5.2.11](#5-2-11-simple-encodings-of-instructions) and [5.2.9](#5-2-9-almost-optimal-encodings)), such an approach has its advantages: it admits a simpler instruction decoder and simplifies debugging (cf. [5.2.5](#5-2-5-tvm-code-is-a-bitcode-not-a-bytecode)).
+We opted to use a bytecode (i.e., to use encodings of complete instructions of lengths divisible by eight). While this may not produce optimal code density, because such a length restriction makes it more difficult to match portions of opcode space used for the encoding of instructions with estimated frequencies of these instructions in TVM code ([5.2.11](#5-2-11-simple-encodings-of-instructions) and [5.2.9](#5-2-9-almost-optimal-encodings)), such an approach has its advantages: it admits a simpler instruction decoder and simplifies [debugging](#5-2-5-tvm-code-is-a-bitcode%2C-not-a-bytecode).
After all, we do not have enough data on the relative frequencies of different instructions right now, so our code density optimizations are likely to be very approximate at this stage. The ease of debugging and experimenting and the simplicity of implementation are more important at this point.
-#### 5.3.5. Simple encodings for all instructions
+### 5.3.5. Simple encodings for all instructions
-For similar reasons, we opted to use simple encodings for all instructions (cf. [5.2.11](#5-2-11-simple-encodings-of-instructions) and [5.2.13](#5-2-13-practical-instruction-encodings)), with separate simple encodings for some very frequently used subcases as outlined in [5.2.13](#5-2-13-practical-instruction-encodings). That said, we tried to distribute opcode space using the heuristics described in [5.2.9](#5-2-9-almost-optimal-encodings) and [5.2.10](#5-2-10-example-stack-manipulation-primitives).
+For similar reasons, we opted to use simple encodings for all instructions ([5.2.11](#5-2-11-simple-encodings-of-instructions) and [5.2.13](#5-2-13-practical-instruction-encodings)), with separate [simple encodings](#5-2-13-practical-instruction-encodings) for some very frequently used subcases. That said, we tried to distribute opcode space using the heuristics described in [5.2.9](#5-2-9-almost-optimal-encodings) and [5.2.10](#5-2-10-example:-stack-manipulation-primitives).
-#### 5.3.6. Lack of context-dependent encodings
+### 5.3.6. Lack of context-dependent encodings
-This version of TVM also does not use context-dependent encodings (cf. [5.1.6](#5-1-6-making-instruction-encoding-context-dependent)). They may be added at a later stage, if deemed useful.
+This version of TVM also does not use [context-dependent encodings](#5-1-6-making-instruction-encoding-context-dependent). They may be added at a later stage, if deemed useful.
-#### 5.3.7. The list of all instructions
+### 5.3.7. The list of all instructions
-The list of all instructions available in codepage zero, along with their encodings and (in some cases) short descriptions, may be found in Appendix [A](#a-instructions-and-opcodes).
+The list of all instructions available in codepage zero, along with their encodings and (in some cases) short descriptions, may be found in [Appendix A](#a-instructions-and-opcodes).
---
+
# A Instructions and opcodes
-This appendix lists all instructions available in the (experimental) codepage zero of TVM, as explained in [5.3](#5-3-instruction-encoding-in-codepage-zero).
+This appendix lists all [instructions](#5-3-instruction-encoding-in-codepage-zero) available in the (experimental) codepage zero of TVM.
We list the instructions in lexicographical opcode order. However, the opcode space is distributed in such a way as to make all instructions in each category (e.g., arithmetic primitives) have neighboring opcodes. So we first list a number of stack manipulation primitives, then constant primitives, arithmetic primitives, comparison primitives, cell primitives, continuation primitives, dictionary primitives, and finally application-specific primitives.
-We use hexadecimal notation (cf. [1.0](#1-0-notation-for-bitstrings)) for bitstrings. Stack registers $\texttt{s}(i)$ usually have $0\leq i\leq 15$, and $i$ is encoded in a 4-bit field (or, on a few rare occasions, in an 8-bit field). Other immediate parameters are usually 4-bit, 8-bit, or variable length.
+We use [hexadecimal notation](#1-0-notation-for-bitstrings) for bitstrings. Stack registers $\texttt{s}(i)$ usually have $0\leq i\leq 15$, and $i$ is encoded in a 4-bit field (or, on a few rare occasions, in an 8-bit field). Other immediate parameters are usually 4-bit, 8-bit, or variable length.
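As an illustration of these conventions, the following sketch decodes a few one-byte stack instructions whose low nibble is the 4-bit index $i$ of $\texttt{s}(i)$. The opcode nibbles follow the cp0 table of A.2.1, but treat the exact values as assumptions of this sketch:

```
# Sketch: decoding one-byte stack instructions whose low nibble is a
# 4-bit stack register index i of s(i). The opcode nibbles used here
# (0x0i XCHG s0,s(i); 0x2i PUSH s(i); 0x3i POP s(i)) follow the cp0
# table of A.2.1, but treat them as assumptions of this sketch.
TEMPLATES = {0x0: "XCHG s0,s({i})", 0x2: "PUSH s({i})", 0x3: "POP s({i})"}

def decode(byte: int) -> str:
    hi, lo = byte >> 4, byte & 0x0F          # opcode nibble, index field
    if hi not in TEMPLATES:
        raise ValueError(f"not a one-byte stack op: {byte:#04x}")
    return TEMPLATES[hi].format(i=lo)

assert decode(0x24) == "PUSH s(4)"
assert decode(0x31) == "POP s(1)"
```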
-The stack notation described in [2.1.10](#2-1-10-stack-notation) is extensively used throughout this appendix.
+The [stack notation](#2-1-10-stack-notation) is extensively used throughout this appendix.
## A.1 Gas prices
@@ -1388,11 +1395,10 @@ By default, the gas price of an instruction equals $P:=P_b+C_r+L+B_w+C_w$.
## A.2 Stack manipulation primitives
-This section includes both the basic (cf. [2.2.1](#2-2-1-basic-stack-manipulation-primitives)) and the compound (cf. [2.2.3](#2-2-3-compound-stack-manipulation-primitives)) stack manipulation primitives, as well as some "unsystematic" ones. Some compound stack manipulation primitives, such as $\texttt{XCPU}$ or $\texttt{XCHG2}$, turn out to have the same length as an equivalent sequence of simpler operations. We have included these primitives regardless, so that they can easily be allocated shorter opcodes in a future revision of TVM—or removed for good.
+This section includes both the [basic](#2-2-1-basic-stack-manipulation-primitives) and the [compound](#2-2-3-compound-stack-manipulation-primitives) stack manipulation primitives, as well as some "unsystematic" ones. Some compound stack manipulation primitives, such as $\texttt{XCPU}$ or $\texttt{XCHG2}$, turn out to have the same length as an equivalent sequence of simpler operations. We have included these primitives regardless, so that they can easily be allocated shorter opcodes in a future revision of TVM—or removed for good.
Some stack manipulation instructions have two mnemonics: one Forth-style (e.g., $\texttt{-ROT}$), the other conforming to the usual rules for identifiers (e.g., $\texttt{ROTREV}$). Whenever a stack manipulation primitive (e.g., $\texttt{PICK}$) accepts an integer parameter $n$ from the stack, it must be within the range $0\ldots255$; otherwise a range check exception happens before any further checks.
-## A.2 Stack manipulation primitives
### A.2.1. Basic stack manipulation primitives
@@ -1593,7 +1599,7 @@ The general encoding of a $\texttt{DIV}$, $\texttt{DIVMOD}$, or $\texttt{MOD}$ o
- $0\leq s\leq2$ — Indicates whether either the multiplication or the division has been replaced by shifts: $s=0$—no replacement, $s=1$—division replaced by a right shift, $s=2$—multiplication replaced by a left shift (possible only for $m=1$).
- $0\leq c\leq1$ — Indicates whether there is a constant one-byte argument $tt$ for the shift operator (if $s\neq0$). For $s=0$, $c=0$. If $c=1$, then $0\leq tt\leq 255$, and the shift is performed by $tt+1$ bits. If $s\neq0$ and $c=0$, then the shift amount is provided to the instruction as a top-of-stack *Integer* in range $0\ldots256$.
- $1\leq d\leq3$ — Indicates which results of division are required: $1$—only the quotient, $2$—only the remainder, $3$—both.
-- $0\leq f\leq2$ — Rounding mode: $0$—floor, $1$—nearest integer, $2$—ceiling (cf. [1.5.6](#1-5-6-division-and-rounding)).
+- $0\leq f\leq2$ — Rounding mode: $0$—floor, $1$—nearest integer, $2$—ceiling ([1.5.6](#1-5-6-division-and-rounding)); a sketch of the three modes follows this list.
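All three modes return a quotient and a remainder satisfying $x=qy+r$; they differ only in how the quotient is rounded. A minimal sketch, assuming that "nearest" rounds half-integer quotients up, i.e. $q=\lfloor x/y+1/2\rfloor$:

```
# Sketch: the three rounding modes f = 0 (floor), 1 (nearest),
# 2 (ceiling). All modes keep the invariant x == q*y + r. "Nearest"
# is assumed here to round half-way cases up: q = floor(x/y + 1/2).
def divmod_tvm(x: int, y: int, f: int) -> tuple[int, int]:
    if f == 0:
        q = x // y                   # Python's // already rounds down
    elif f == 1:
        q = (2 * x + y) // (2 * y)   # floor(x/y + 1/2) without floats
    elif f == 2:
        q = -((-x) // y)             # ceiling via negated floor
    else:
        raise ValueError("f must be 0, 1 or 2")
    return q, x - q * y

assert divmod_tvm(7, 2, 0) == (3, 1)     # floor
assert divmod_tvm(-7, 2, 0) == (-4, 1)
assert divmod_tvm(7, 2, 2) == (4, -1)    # ceiling
assert divmod_tvm(7, 2, 1) == (4, -1)    # nearest, tie rounded up
```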
Examples:
@@ -1761,7 +1767,7 @@ All these primitives first check whether there is enough space in the Builder, a
- $\texttt{CF1F}$ — $\texttt{STBRQ}$ ($b$ $b'$ -- $b$ $b'$ $-1$ or $b''$ $0$).
- $\texttt{CF20}$ — $\texttt{STREFCONST}$, equivalent to $\texttt{PUSHREF}$; $\texttt{STREFR}$.
- $\texttt{CF21}$ — $\texttt{STREF2CONST}$, equivalent to $\texttt{STREFCONST}$; $\texttt{STREFCONST}$.
-- $\texttt{CF23}$ — $\texttt{ENDXC}$ ($b$ $x$ -- $c$), if $x\neq0$, creates a *special* or *exotic* cell (cf. [3.1.2](#3-1-2-ordinary-and-exotic-cells)) from *Builder* $b$. The type of the exotic cell must be stored in the first 8 bits of $b$. If $x=0$, it is equivalent to $\texttt{ENDC}$. Otherwise some validity checks on the data and references of $b$ are performed before creating the exotic cell.
+- $\texttt{CF23}$ — $\texttt{ENDXC}$ ($b$ $x$ -- $c$), if $x\neq0$, creates a *special* or [exotic cell](#3-1-2-ordinary-and-exotic-cells) from *Builder* $b$. The type of the exotic cell must be stored in the first 8 bits of $b$. If $x=0$, it is equivalent to $\texttt{ENDC}$. Otherwise some validity checks on the data and references of $b$ are performed before creating the exotic cell.
- $\texttt{CF28}$ — $\texttt{STILE4}$ ($x$ $b$ -- $b'$), stores a little-endian signed 32-bit integer.
- $\texttt{CF29}$ — $\texttt{STULE4}$ ($x$ $b$ -- $b'$), stores a little-endian unsigned 32-bit integer.
- $\texttt{CF2A}$ — $\texttt{STILE8}$ ($x$ $b$ -- $b'$), stores a little-endian signed 64-bit integer.
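The little-endian stores are straightforward to model if a *Builder* is reduced to a byte array (a simplification: real *Builder*s hold up to 1023 data bits plus cell references, and perform the space checks mentioned above). A minimal sketch of the 32-bit variants; the 64-bit $\texttt{STILE8}$/$\texttt{STULE8}$ are analogous:

```
# Sketch: the little-endian stores STILE4/STULE4, with a Builder
# modeled as a bytearray of data bytes; the 1023-bit capacity check
# of a real Builder is omitted.
def stile4(x: int, b: bytearray) -> bytearray:
    b2 = bytearray(b)
    b2 += x.to_bytes(4, "little", signed=True)   # signed 32-bit LE
    return b2

def stule4(x: int, b: bytearray) -> bytearray:
    b2 = bytearray(b)
    b2 += x.to_bytes(4, "little", signed=False)  # unsigned 32-bit LE
    return b2

assert stile4(-2, bytearray()) == bytearray(b"\xfe\xff\xff\xff")
assert stule4(1, bytearray()) == bytearray(b"\x01\x00\x00\x00")
```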
@@ -1794,7 +1800,7 @@ All these primitives first check whether there is enough space in the Builder, a
### A.7.2. Cell deserialization primitives
-- $\texttt{D0}$ — $\texttt{CTOS}$ ($c$ -- $s$), converts a *Cell* into a *Slice*. Notice that $c$ must be either an ordinary cell, or an exotic cell (cf. [3.1.2](#3-1-2-ordinary-and-exotic-cells)) which is automatically *loaded* to yield an ordinary cell $c'$, converted into a *Slice* afterwards.
+- $\texttt{D0}$ — $\texttt{CTOS}$ ($c$ -- $s$), converts a *Cell* into a *Slice*. Notice that $c$ must be either an ordinary cell, or an [exotic cell](#3-1-2-ordinary-and-exotic-cells) which is automatically *loaded* to yield an ordinary cell $c'$, converted into a *Slice* afterwards.
- $\texttt{D1}$ — $\texttt{ENDS}$ ($s$ -- ), removes a *Slice* $s$ from the stack, and throws an exception if it is not empty.
- $\texttt{D2}$ $cc$ — $\texttt{LDI}$ $cc+1$ ($s$ -- $x$ $s'$), loads (i.e., parses) a signed $cc+1$-bit integer $x$ from *Slice* $s$, and returns the remainder of $s$ as $s'$.
- $\texttt{D3}$ $cc$ — $\texttt{LDU}$ $cc+1$ ($s$ -- $x$ $s'$), loads an unsigned $cc+1$-bit integer $x$ from *Slice* $s$.
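The big-endian loads performed by $\texttt{LDI}$ and $\texttt{LDU}$ are easy to model with a *Slice* represented as a string of data bits (references ignored); a minimal sketch:

```
# Sketch: the big-endian loads LDI cc+1 / LDU cc+1, with a Slice
# modeled as a string of '0'/'1' data bits.
def ldu(s: str, width: int) -> tuple[int, str]:
    if len(s) < width:
        raise ValueError("cell underflow")   # not enough data bits
    return int(s[:width], 2), s[width:]      # value x, remainder s'

def ldi(s: str, width: int) -> tuple[int, str]:
    x, rest = ldu(s, width)
    if x >= 1 << (width - 1):                # sign bit set: the value
        x -= 1 << width                      # is negative in two's complement
    return x, rest

assert ldu("10110", 3) == (5, "10")          # LDU 3
assert ldi("10110", 3) == (-3, "10")         # LDI 3
```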
@@ -1926,7 +1932,7 @@ All these primitives first check whether there is enough space in the Builder, a
- $\texttt{E30D}$ — $\texttt{IFREFELSE}$ ($f$ $c$ -), equivalent to $\texttt{PUSHREFCONT}$; $\texttt{SWAP}$; $\texttt{IFELSE}$, with the optimization that the cell reference is not actually loaded into a *Slice* and then converted into an ordinary *Continuation* if $f=0$. Similar remarks apply to the next two primitives: *Cell*s are converted into *Continuation*s only when necessary.
- $\texttt{E30E}$ — $\texttt{IFELSEREF}$ ($f$ $c$ -), equivalent to $\texttt{PUSHREFCONT}$; $\texttt{IFELSE}$.
- $\texttt{E30F}$ — $\texttt{IFREFELSEREF}$ ($f$ -), equivalent to $\texttt{PUSHREFCONT}$; $\texttt{PUSHREFCONT}$; $\texttt{IFELSE}$.
-- $\texttt{E310}$--$\texttt{E31F}$ — reserved for loops with break operators, cf. [A.8.4](#a-8-4-control-flow-primitives-loops) below.
+- $\texttt{E310}$--$\texttt{E31F}$ — reserved for loops with break operators, [A.8.3](#a-8-3-control-flow-primitives:-loops) below.
- $\texttt{E39\_}n$ — $\texttt{IFBITJMP}$ $n$ ($x$ $c$ - $x$), checks whether bit $0\leq n\leq 31$ is set in integer $x$, and if so, performs $\texttt{JMPX}$ to continuation $c$. Value $x$ is left in the stack.
- $\texttt{E3B\_}n$ — $\texttt{IFNBITJMP}$ $n$ ($x$ $c$ - $x$), jumps to $c$ if bit $0\leq n\leq 31$ is not set in integer $x$.
- $\texttt{E3D\_}n$ — $\texttt{IFBITJMPREF}$ $n$ ($x$ - $x$), performs a $\texttt{JMPREF}$ if bit $0\leq n\leq 31$ is set in integer $x$.
@@ -1934,11 +1940,11 @@ All these primitives first check whether there is enough space in the Builder, a
### A.8.3. Control flow primitives: loops
-Most of the loop primitives listed below are implemented with the aid of extraordinary continuations, such as $\texttt{ec\_until}$ (cf. [4.1.5](#4-1-5-extraordinary-continuations)), with the loop body and the original current continuation $\texttt{cc}$ stored as the arguments to this extraordinary continuation. Typically a suitable extraordinary continuation is constructed, and then saved into the loop body continuation savelist as $\texttt{c0}$; after that, the modified loop body continuation is loaded into $\texttt{cc}$ and executed in the usual fashion. All of these loop primitives have $\texttt{*BRK}$ versions, adapted for breaking out of a loop; they additionally set $\texttt{c1}$ to the original current continuation (or original $\texttt{c0}$ for $\texttt{*ENDBRK}$ versions), and save the old $\texttt{c1}$ into the savelist of the original current continuation (or of the original $\texttt{c0}$ for $\texttt{*ENDBRK}$ versions).
+Most of the loop primitives listed below are implemented with the aid of [extraordinary continuations](#4-1-5-extraordinary-continuations), such as $\texttt{ec\_until}$, with the loop body and the original current continuation $\texttt{cc}$ stored as the arguments to this extraordinary continuation. Typically a suitable extraordinary continuation is constructed, and then saved into the loop body continuation savelist as $\texttt{c0}$; after that, the modified loop body continuation is loaded into $\texttt{cc}$ and executed in the usual fashion. All of these loop primitives have $\texttt{*BRK}$ versions, adapted for breaking out of a loop; they additionally set $\texttt{c1}$ to the original current continuation (or original $\texttt{c0}$ for $\texttt{*ENDBRK}$ versions), and save the old $\texttt{c1}$ into the savelist of the original current continuation (or of the original $\texttt{c0}$ for $\texttt{*ENDBRK}$ versions).
- $\texttt{E4}$ — $\texttt{REPEAT}$ ($n$ $c$ - ), executes continuation $c$ $n$ times, if integer $n$ is non-negative. If $n\geq2^{31}$ or $n<-2^{31}$, generates a range check exception. Notice that a $\texttt{RET}$ inside the code of $c$ works as a $\texttt{continue}$, not as a $\texttt{break}$. One should use either alternative (experimental) loops or alternative $\texttt{RETALT}$ (along with a $\texttt{SETEXITALT}$ before the loop) to $\texttt{break}$ out of a loop.
- $\texttt{E5}$ — $\texttt{REPEATEND}$ ($n$ - ), similar to $\texttt{REPEAT}$, but it is applied to the current continuation $\texttt{cc}$.
-- $\texttt{E6}$ — $\texttt{UNTIL}$ ($c$ - ), executes continuation $c$, then pops an integer $x$ from the resulting stack. If $x$ is zero, performs another iteration of this loop. The actual implementation of this primitive involves an extraordinary continuation $\texttt{ec\_until}$ (cf. [4.1.5](#4-1-5-extraordinary-continuations)) with its arguments set to the body of the loop (continuation $c$) and the original current continuation $\texttt{cc}$. This extraordinary continuation is then saved into the savelist of $c$ as $c$.$\texttt{c0}$ and the modified $c$ is then executed. The other loop primitives are implemented similarly with the aid of suitable extraordinary continuations.
+- $\texttt{E6}$ — $\texttt{UNTIL}$ ($c$ - ), executes continuation $c$, then pops an integer $x$ from the resulting stack. If $x$ is zero, performs another iteration of this loop. The actual implementation of this primitive involves an [extraordinary continuation](#4-1-5-extraordinary-continuations) $\texttt{ec\_until}$ with its arguments set to the body of the loop (continuation $c$) and the original current continuation $\texttt{cc}$. This extraordinary continuation is then saved into the savelist of $c$ as $c$.$\texttt{c0}$ and the modified $c$ is then executed. The other loop primitives are implemented similarly with the aid of suitable extraordinary continuations.
- $\texttt{E7}$ — $\texttt{UNTILEND}$ ( - ), similar to $\texttt{UNTIL}$, but executes the current continuation $\texttt{cc}$ in a loop. When the loop exit condition is satisfied, performs a $\texttt{RET}$.
- $\texttt{E8}$ — $\texttt{WHILE}$ ($c'$ $c$ - ), executes $c'$ and pops an integer $x$ from the resulting stack. If $x$ is zero, exits the loop and transfers control to the original $\texttt{cc}$. If $x$ is non-zero, executes $c$, and then begins a new iteration.
- $\texttt{E9}$ — $\texttt{WHILEEND}$ ($c'$ - ), similar to $\texttt{WHILE}$, but uses the current continuation $\texttt{cc}$ as the loop body.
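Abstracting away the savelist mechanics, the control flow that $\texttt{ec\_until}$ realizes can be sketched with continuations modeled as plain Python callables; the body returns the integer that $\texttt{UNTIL}$ pops after each iteration:

```
# Sketch: the control flow realized by ec_until, with continuations
# modeled as Python callables. body() returns the integer that UNTIL
# pops after each iteration; zero means "run another iteration".
def until(body, cc):
    while True:
        x = body()           # execute the loop body continuation c
        if x != 0:
            return cc()      # exit: return to the original cc

state = {"n": 0}
def body():
    state["n"] += 1
    return -1 if state["n"] == 3 else 0   # exit after three iterations

until(body, lambda: print("left the loop after", state["n"], "iterations"))
```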
@@ -2000,6 +2006,7 @@ Most of the loop primitives listed below are implemented with the aid of extraor
- $\texttt{EDFB}$ — $\texttt{SAMEALTSAVE}$ ( - ), sets $c_1:=c_0$, but first saves the old value of $c_1$ into the savelist of $c_0$. Equivalent to $\texttt{SAVE c1}$; $\texttt{SAMEALT}$.
- $\texttt{EE}rn$ — $\texttt{BLESSARGS}$ $r,n$ ($x_1\ldots x_r$ $s$ -- $c$), described in [A.8.4](#a-8-4-manipulating-the-stack-of-continuations).
+
### A.8.7. Dictionary subroutine calls and jumps
- $\texttt{F0}n$ — $\texttt{CALL}$ $n$ or $\texttt{CALLDICT}$ $n$ ( - $n$), calls the continuation in $\texttt{c3}$, pushing integer $0\leq n\leq 255$ into its stack as an argument. Approximately equivalent to $\texttt{PUSHINT}$ $n$; $\texttt{PUSH c3}$; $\texttt{EXECUTE}$.
@@ -2032,11 +2039,9 @@ Most of the loop primitives listed below are implemented with the aid of extraor
- $\texttt{F2FF}$ — $\texttt{TRY}$ ($c$ $c'$ - ), sets $\texttt{c2}$ to $c'$, first saving the old value of $\texttt{c2}$ both into the savelist of $c'$ and into the savelist of the current continuation, which is stored into $c$.$\texttt{c0}$ and $c'$.$\texttt{c0}$. Then runs $c$ similarly to $\texttt{EXECUTE}$. If $c$ does not throw any exceptions, the original value of $\texttt{c2}$ is automatically restored on return from $c$. If an exception occurs, the execution is transferred to $c'$, but the original value of $\texttt{c2}$ is restored in the process, so that $c'$ can re-throw the exception by $\texttt{THROWANY}$ if it cannot handle it by itself.
- $\texttt{F3}pr$ — $\texttt{TRYARGS}$ $p$,$r$ ($c$ $c'$ - ), similar to $\texttt{TRY}$, but with $\texttt{CALLARGS}$ $p$,$r$ internally used instead of $\texttt{EXECUTE}$. In this way, all but the top $0\leq p\leq 15$ stack elements will be saved into current continuation's stack, and then restored upon return from either $c$ or $c'$, with the top $0\leq r\leq 15$ values of the resulting stack of $c$ or $c'$ copied as return values.
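Stripping away savelists and stack bookkeeping, the save/restore discipline of $\texttt{TRY}$ can be sketched with the handler register $\texttt{c2}$ modeled as a slot in a state dictionary; note that the old $\texttt{c2}$ is restored before the handler runs, which is what lets the handler re-throw with $\texttt{THROWANY}$:

```
# Sketch: the c2 save/restore discipline of TRY. The old handler is
# restored both on normal return from c and before the handler c'
# runs, so c' may re-throw to the outer handler (THROWANY).
def tvm_try(state: dict, body, handler) -> None:
    old_c2 = state["c2"]
    state["c2"] = handler         # c2 := c'
    try:
        body(state)               # run c as with EXECUTE
    except Exception as exc:
        state["c2"] = old_c2      # original c2 restored before c' runs
        handler(state, exc)
    else:
        state["c2"] = old_c2      # original c2 restored on normal return

state = {"c2": lambda st, exc: print("outer handler:", exc)}
def body(st):
    raise RuntimeError("throw 42")

tvm_try(state, body, lambda st, exc: st["c2"](st, exc))  # c' re-throws
```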
----
-
## A.10 Dictionary manipulation primitives
-TVM's dictionary support is discussed at length in [3.3](#3-3-hashmaps-or-dictionaries). The basic operations with dictionaries are listed in [3.3.10](#3-3-10-basic-dictionary-operations), while the taxonomy of dictionary manipulation primitives is provided in [3.3.11](#3-3-11-taxonomy-of-dictionary-primitives). Here we use the concepts and notation introduced in those sections.
+TVM's dictionary support is discussed at length in [3.3](#3-3-hashmaps%2C-or-dictionaries). The basic operations with dictionaries are listed in [3.3.10](#3-3-10-basic-dictionary-operations), while the taxonomy of dictionary manipulation primitives is provided in [3.3.11](#3-3-11-taxonomy-of-dictionary-primitives). Here we use the concepts and notation introduced in those sections.
Dictionaries admit two different representations as TVM stack values:
@@ -2050,8 +2055,8 @@ Opcodes starting with $\texttt{F4}$ and $\texttt{F5}$ are reserved for dictionar
### A.10.1. Dictionary creation
-- $\texttt{6D}$ — $\texttt{NEWDICT}$ ( - $D$), returns a new empty dictionary. It is an alternative mnemonics for $\texttt{PUSHNULL}$, cf. [A.3.1](#a-3-1-null-primitives).
-- $\texttt{6E}$ — $\texttt{DICTEMPTY}$ ($D$ - $?$), checks whether dictionary $D$ is empty, and returns $-1$ or $0$ accordingly. It is an alternative mnemonics for $\texttt{ISNULL}$, cf. [A.3.1](#a-3-1-null-primitives).
+- $\texttt{6D}$ — $\texttt{NEWDICT}$ ( - $D$), returns a new empty dictionary. It is an alternative mnemonic for $\texttt{PUSHNULL}$ ([A.3.1](#a-3-1-null-primitives)).
+- $\texttt{6E}$ — $\texttt{DICTEMPTY}$ ($D$ - $?$), checks whether dictionary $D$ is empty, and returns $-1$ or $0$ accordingly. It is an alternative mnemonic for $\texttt{ISNULL}$ ([A.3.1](#a-3-1-null-primitives)).
### A.10.2. Dictionary serialization and deserialization
@@ -2117,7 +2122,7 @@ The mnemonics of the following dictionary primitives are constructed in a system
### A.10.5. Builder-accepting variants of $\texttt{Set}$ dictionary operations
-The following primitives accept the new value as a *Builder* $b$ instead of a *Slice* $x$, which often is more convenient if the value needs to be serialized from several components computed in the stack. (This is reflected by appending a $\texttt{B}$ to the mnemonics of the corresponding $\texttt{Set}$ primitives that work with *Slice*s.) The net effect is roughly equivalent to converting $b$ into a *Slice* by $\texttt{ENDC}$; $\texttt{CTOS}$ and executing the corresponding primitive listed in [A.10.4](#a-10-4-set-replace-add-dictionary-operations).
+The following primitives accept the new value as a *Builder* $b$ instead of a *Slice* $x$, which often is more convenient if the value needs to be serialized from several components computed in the stack. (This is reflected by appending a $\texttt{B}$ to the mnemonics of the corresponding $\texttt{Set}$ primitives that work with *Slice*s.) The net effect is roughly equivalent to converting $b$ into a *Slice* by $\texttt{ENDC}$; $\texttt{CTOS}$ and executing the [corresponding primitive](#a-10-4-%2F%2F-dictionary-operations).
- $\texttt{F441}$ — $\texttt{DICTSETB}$ ($b$ $k$ $D$ $n$ - $D'$).
- $\texttt{F442}$ — $\texttt{DICTISETB}$ ($b$ $i$ $D$ $n$ - $D'$).
@@ -2163,16 +2168,16 @@ The following operations assume that a dictionary is used to store values $c^?$
### A.10.8. Prefix code dictionary operations
-These are some basic operations for constructing prefix code dictionaries (cf [3.4.2](#3-4-2-serialization-of-prefix-codes)). The primary application for prefix code dictionaries is deserializing TL-B serialized data structures, or, more generally, parsing prefix codes. Therefore, most prefix code dictionaries will be constant and created at compile time, not by the following primitives.
+These are some basic operations for constructing [prefix code](#3-4-2-serialization-of-prefix-codes) dictionaries. The primary application for prefix code dictionaries is deserializing TL-B serialized data structures, or, more generally, parsing prefix codes. Therefore, most prefix code dictionaries will be constant and created at compile time, not by the following primitives.
-Some $\texttt{Get}$ operations for prefix code dictionaries may be found in [A.10.11](#a-10-11-special-get-dictionary-and-prefix-code-dictionary-operations-and-constant-dictionaries). Other prefix code dictionary operations include:
+Some $\texttt{Get}$ operations for prefix code dictionaries may be found in [A.10.11](#a-10-11-special-dictionary-and-prefix-code-dictionary-operations%2C-and-constant-dictionaries). Other prefix code dictionary operations include:
- $\texttt{F470}$ — $\texttt{PFXDICTSET}$ ($x$ $k$ $D$ $n$ - $D'$ $-1$ or $D$ $0$).
- $\texttt{F471}$ — $\texttt{PFXDICTREPLACE}$ ($x$ $k$ $D$ $n$ - $D'$ $-1$ or $D$ $0$).
- $\texttt{F472}$ — $\texttt{PFXDICTADD}$ ($x$ $k$ $D$ $n$ - $D'$ $-1$ or $D$ $0$).
- $\texttt{F473}$ — $\texttt{PFXDICTDEL}$ ($k$ $D$ $n$ - $D'$ $-1$ or $D$ $0$).
-These primitives are completely similar to their non-prefix code counterparts $\texttt{DICTSET}$ etc (cf. [A.10.4](#a-10-4-set-replace-add-dictionary-operations)), with the obvious difference that even a $\texttt{Set}$ may fail in a prefix code dictionary, so a success flag must be returned by $\texttt{PFXDICTSET}$ as well.
+These primitives are completely similar to their [non-prefix code](#a-10-4-%2F%2F-dictionary-operations) counterparts $\texttt{DICTSET}$, etc., with the obvious difference that even a $\texttt{Set}$ may fail in a prefix code dictionary, so a success flag must be returned by $\texttt{PFXDICTSET}$ as well.
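The failure case exists because the keys of a prefix code dictionary must form a prefix code: no key may be a prefix of another. A minimal sketch over keys modeled as bit-strings:

```
# Sketch: why PFXDICTSET must return a success flag. The keys of a
# prefix code dictionary form a prefix code, so an insertion that
# would make one key a prefix of another has to be rejected. Keys
# are modeled as '0'/'1' strings.
def pfx_dict_set(d: dict[str, str], k: str, x: str) -> tuple[dict, bool]:
    for key in d:
        if key != k and (key.startswith(k) or k.startswith(key)):
            return d, False          # failure: D 0 in stack notation
    d2 = dict(d)
    d2[k] = x
    return d2, True                  # success: D' -1 in stack notation

d, ok = pfx_dict_set({}, "00", "a")
d, ok2 = pfx_dict_set(d, "01", "b")      # fine: neither key is a prefix
_, ok3 = pfx_dict_set(d, "001", "c")     # rejected: "00" prefixes "001"
assert ok and ok2 and not ok3
```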
### A.10.9. Variants of $\texttt{GetNext}$ and $\texttt{GetPrev}$ operations
@@ -2218,11 +2223,11 @@ These primitives are completely similar to their non-prefix code counterparts $\
### A.10.11. Special $\texttt{Get}$ dictionary and prefix code dictionary operations, and constant dictionaries
-- $\texttt{F4A0}$ — $\texttt{DICTIGETJMP}$ ($i$ $D$ $n$ - ), similar to $\texttt{DICTIGET}$ (cf. [A.10.12](#a-10-12-subdict-dictionary-operations)), but with $x$ $\texttt{BLESS}$ed into a continuation with a subsequent $\texttt{JMPX}$ to it on success. On failure, does nothing. This is useful for implementing $\texttt{switch}$/$\texttt{case}$ constructions.
+- $\texttt{F4A0}$ — $\texttt{DICTIGETJMP}$ ($i$ $D$ $n$ - ), similar to $\texttt{DICTIGET}$ ([A.10.12](#a-10-12-dictionary-operations)), but with $x$ $\texttt{BLESS}$ed into a continuation with a subsequent $\texttt{JMPX}$ to it on success. On failure, does nothing. This is useful for implementing $\texttt{switch}$/$\texttt{case}$ constructions.
- $\texttt{F4A1}$ — $\texttt{DICTUGETJMP}$ ($i$ $D$ $n$ - ), similar to $\texttt{DICTIGETJMP}$, but performs $\texttt{DICTUGET}$ instead of $\texttt{DICTIGET}$.
- $\texttt{F4A2}$ — $\texttt{DICTIGETEXEC}$ ($i$ $D$ $n$ - ), similar to $\texttt{DICTIGETJMP}$, but with $\texttt{EXECUTE}$ instead of $\texttt{JMPX}$.
- $\texttt{F4A3}$ — $\texttt{DICTUGETEXEC}$ ($i$ $D$ $n$ - ), similar to $\texttt{DICTUGETJMP}$, but with $\texttt{EXECUTE}$ instead of $\texttt{JMPX}$.
-- $\texttt{F4A6\_}n$ — $\texttt{DICTPUSHCONST}$ $n$ ( - $D$ $n$), pushes a non-empty constant dictionary $D$ (as a $\mathit{Cell}^?$) along with its key length $0\leq n\leq 1023$, stored as a part of the instruction. The dictionary itself is created from the first of remaining references of the current continuation. In this way, the complete $\texttt{DICTPUSHCONST}$ instruction can be obtained by first serializing $\texttt{xF4A8\_}$, then the non-empty dictionary itself (one $\texttt{1}$ bit and a cell reference), and then the unsigned 10-bit integer $n$ (as if by a $\texttt{STU 10}$ instruction). An empty dictionary can be pushed by a $\texttt{NEWDICT}$ primitive (cf. [A.10.1](#a-10-1-dictionary-creation)) instead.
+- $\texttt{F4A6\_}n$ — $\texttt{DICTPUSHCONST}$ $n$ ( - $D$ $n$), pushes a non-empty constant dictionary $D$ (as a $\mathit{Cell}^?$) along with its key length $0\leq n\leq 1023$, stored as a part of the instruction. The dictionary itself is created from the first of remaining references of the current continuation. In this way, the complete $\texttt{DICTPUSHCONST}$ instruction can be obtained by first serializing $\texttt{xF4A8\_}$, then the non-empty dictionary itself (one $\texttt{1}$ bit and a cell reference), and then the unsigned 10-bit integer $n$ (as if by a $\texttt{STU 10}$ instruction). An empty dictionary can be pushed by a $\texttt{NEWDICT}$ primitive ([A.10.1](#a-10-1-dictionary-creation)) instead.
- $\texttt{F4A8}$ — $\texttt{PFXDICTGETQ}$ ($s$ $D$ $n$ - $s'$ $x$ $s''$ $-1$ or $s$ $0$), looks up the unique prefix of *Slice* $s$ present in the prefix code dictionary represented by $\mathit{Cell}^?$ $D$ and $0\leq n\leq 1023$. If found, the prefix of $s$ is returned as $s'$, and the corresponding value (also a *Slice*) as $x$. The remainder of $s$ is returned as a *Slice* $s''$. If no prefix of $s$ is a key in prefix code dictionary $D$, returns the unchanged $s$ and a zero flag to indicate failure.
- $\texttt{F4A9}$ — $\texttt{PFXDICTGET}$ ($s$ $D$ $n$ - $s'$ $x$ $s''$), similar to $\texttt{PFXDICTGETQ}$, but throws a cell deserialization failure exception on failure.
- $\texttt{F4AA}$ — $\texttt{PFXDICTGETJMP}$ ($s$ $D$ $n$ - $s'$ $s''$ or $s$), similar to $\texttt{PFXDICTGETQ}$, but on success $\texttt{BLESS}$es the value $x$ into a *Continuation* and transfers control to it as if by a $\texttt{JMPX}$. On failure, returns $s$ unchanged and continues execution.
@@ -2241,7 +2246,7 @@ These primitives are completely similar to their non-prefix code counterparts $\
- $\texttt{F4B5}$ — $\texttt{SUBDICTRPGET}$ ($k$ $l$ $D$ $n$ - $D'$), similar to $\texttt{SUBDICTGET}$, but removes the common prefix $k$ from all keys of the new dictionary $D'$, which becomes of type $\mathit{HashmapE}(n-l,X)$.
- $\texttt{F4B6}$ — $\texttt{SUBDICTIRPGET}$ ($x$ $l$ $D$ $n$ - $D'$), variant of $\texttt{SUBDICTRPGET}$ with the prefix represented by a signed big-endian $l$-bit *Integer* $x$, where necessarily $l\leq257$.
- $\texttt{F4B7}$ — $\texttt{SUBDICTURPGET}$ ($x$ $l$ $D$ $n$ - $D'$), variant of $\texttt{SUBDICTRPGET}$ with the prefix represented by an unsigned big-endian $l$-bit *Integer* $x$, where necessarily $l\leq256$.
-- $\texttt{F4BC}$–$\texttt{F4BF}$ — used by $\texttt{DICT\ldots Z}$ primitives in [A.10.11](#a-10-11-special-get-dictionary-and-prefix-code-dictionary-operations-and-constant-dictionaries).
+- $\texttt{F4BC}$–$\texttt{F4BF}$ — used by $\texttt{DICT\ldots Z}$ primitives in [A.10.11](#a-10-11-special-dictionary-and-prefix-code-dictionary-operations%2C-and-constant-dictionaries).
## A.11 Application-specific primitives
@@ -2249,7 +2254,7 @@ Opcode range $\texttt{F8}$...$\texttt{FB}$ is reserved for the *application-spec
### A.11.1. External actions and access to blockchain configuration data
-Some of the primitives listed below pretend to produce some externally visible actions, such as sending a message to another smart contract. In fact, the execution of a smart contract in TVM never has any effect apart from a modification of the TVM state. All external actions are collected into a linked list stored in special register $\texttt{c5}$ ("output actions"). Additionally, some primitives use the data kept in the first component of the *Tuple* stored in $\texttt{c7}$ ("root of temporary data" cf [1.3.2](#1-3-2-list-of-control-registers)). Smart contracts are free to modify any other data kept in the cell $\texttt{c7}$, provided the first reference remains intact (otherwise some application-specific primitives would be likely to throw exceptions when invoked).
+Some of the primitives listed below pretend to produce some externally visible actions, such as sending a message to another smart contract. In fact, the execution of a smart contract in TVM never has any effect apart from a modification of the TVM state. All external actions are collected into a linked list stored in special register $\texttt{c5}$ ("output actions"). Additionally, some primitives use the data kept in the first component of the [Tuple](#1-3-2-list-of-control-registers) stored in $\texttt{c7}$ ("root of temporary data"). Smart contracts are free to modify any other data kept in the cell $\texttt{c7}$, provided the first reference remains intact (otherwise some application-specific primitives would be likely to throw exceptions when invoked).
Most of the primitives listed below use 16-bit opcodes.
@@ -2257,7 +2262,7 @@ Most of the primitives listed below use 16-bit opcodes.
Of the following primitives, only the first two are "pure" in the sense that they do not use $\texttt{c5}$ or $\texttt{c7}$.
-- $\texttt{F800}$ — $\texttt{ACCEPT}$, sets current gas limit $g_l$ to its maximal allowed value $g_m$, and resets the gas credit $g_c$ to zero (cf. [1.4](#1-4-total-state-of-tvm-scccg)), decreasing the value of $g_r$ by $g_c$ in the process. In other words, the current smart contract agrees to buy some gas to finish the current transaction. This action is required to process external messages, which bring no value (hence no gas) with themselves.
+- $\texttt{F800}$ — $\texttt{ACCEPT}$, sets current gas limit $g_l$ to its maximal allowed value $g_m$, and resets the gas credit $g_c$ to zero ([1.4](#1-4-total-state-of-tvm-scccg)), decreasing the value of $g_r$ by $g_c$ in the process. In other words, the current smart contract agrees to buy some gas to finish the current transaction. This action is required to process external messages, which bring no value (hence no gas) with themselves.
- $\texttt{F801}$ — $\texttt{SETGASLIMIT}$ ($g$ - ), sets current gas limit $g_l$ to the minimum of $g$ and $g_m$, and resets the gas credit $g_c$ to zero. If the gas consumed so far (including the present instruction) exceeds the resulting value of $g_l$, an (unhandled) out of gas exception is thrown before setting new gas limits. Notice that $\texttt{SETGASLIMIT}$ with an argument $g\geq 2^{63}-1$ is equivalent to $\texttt{ACCEPT}$.
- $\texttt{F802}$ — $\texttt{BUYGAS}$ ($x$ - ), computes the amount of gas that can be bought for $x$ nanograms, and sets $g_l$ accordingly in the same way as $\texttt{SETGASLIMIT}$.
- $\texttt{F804}$ — $\texttt{GRAMTOGAS}$ ($x$ - $g$), computes the amount of gas that can be bought for $x$ nanograms. If $x$ is negative, returns 0. If $g$ exceeds $2^{63}-1$, it is replaced with this value.
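A sketch of the register updates performed by $\texttt{ACCEPT}$ and $\texttt{SETGASLIMIT}$, following the descriptions above ($\texttt{BUYGAS}$ and $\texttt{GRAMTOGAS}$ are omitted); the `gas_used` field is assumed bookkeeping for the gas consumed so far:

```
# Sketch: gas-register updates of ACCEPT and SETGASLIMIT, following
# the descriptions above. Field names mirror 1.4: gl (current limit),
# gm (maximal limit), gc (credit), gr (remaining); gas_used is an
# assumed counter of the gas consumed so far.
from dataclasses import dataclass

@dataclass
class GasState:
    gl: int        # current gas limit g_l
    gm: int        # maximal gas limit g_m
    gc: int        # gas credit g_c
    gr: int        # remaining gas g_r
    gas_used: int

    def accept(self) -> None:
        self.gl = self.gm      # buy gas up to the maximal limit
        self.gr -= self.gc     # decrease g_r by g_c ...
        self.gc = 0            # ... and reset the credit

    def set_gas_limit(self, g: int) -> None:
        self.gl = min(g, self.gm)
        self.gc = 0
        if self.gas_used > self.gl:    # over the new limit:
            raise RuntimeError("out of gas")

st = GasState(gl=1_000, gm=1_000_000, gc=10_000, gr=1_000_000, gas_used=500)
st.accept()
assert st.gl == 1_000_000 and st.gc == 0 and st.gr == 990_000
```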
@@ -2267,7 +2272,7 @@ Of the following primitives, only the first two are "pure" in the sense that the
### A.11.3. Pseudo-random number generator primitives
-The pseudo-random number generator uses the random seed (parameter #6, cf. [A.11.4](#a-11-4-configuration-primitives)), an unsigned 256-bit *Integer*, and (sometimes) other data kept in $\texttt{c7}$. The initial value of the random seed before a smart contract is executed in TON Blockchain is a hash of the smart contract address and the global block random seed. If there are several runs of the same smart contract inside a block, then all of these runs will have the same random seed. This can be fixed, for example, by running $\texttt{LTIME}$; $\texttt{ADDRAND}$ before using the pseudo-random number generator for the first time.
+The pseudo-random number generator uses the [random seed](#a-11-4-configuration-primitives) (parameter #6), an unsigned 256-bit *Integer*, and (sometimes) other data kept in $\texttt{c7}$. The initial value of the random seed before a smart contract is executed in TON Blockchain is a hash of the smart contract address and the global block random seed. If there are several runs of the same smart contract inside a block, then all of these runs will have the same random seed. This can be fixed, for example, by running $\texttt{LTIME}$; $\texttt{ADDRAND}$ before using the pseudo-random number generator for the first time.
- $\texttt{F810}$ — $\texttt{RANDU256}$ ( - $x$), generates a new pseudo-random unsigned 256-bit Integer $x$. The algorithm is as follows: if $r$ is the old value of the random seed, considered as a 32-byte array (by constructing the big-endian representation of an unsigned 256-bit integer), then $\text{Sha512}(r)$ is computed; the first 32 bytes of this hash are stored as the new value $r'$ of the random seed, and the remaining 32 bytes are returned as the next random value $x$.
- $\texttt{F811}$ — $\texttt{RAND}$ ($y$ - $z$), generates a new pseudo-random integer $z$ in the range $0\ldots y-1$ (or $y\ldots-1$, if $y<0$). More precisely, an unsigned random value $x$ is generated as in $\texttt{RANDU256}$; then $z:=\lfloor xy/2^{256}\rfloor$ is computed. Equivalent to $\texttt{RANDU256}$; $\texttt{MULRSHIFT 256}$.
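Both generators are fully specified above, so they can be sketched directly; `randu256` returns the new seed along with the generated value, and `rand` reproduces $z:=\lfloor xy/2^{256}\rfloor$:

```
# Sketch: RANDU256 and RAND exactly as specified above. The seed r
# is an unsigned 256-bit integer (kept in c7 by TVM); each call
# returns the new seed together with the generated value.
import hashlib

def randu256(r: int) -> tuple[int, int]:
    h = hashlib.sha512(r.to_bytes(32, "big")).digest()
    r_new = int.from_bytes(h[:32], "big")    # first 32 bytes: new seed r'
    x = int.from_bytes(h[32:], "big")        # last 32 bytes: value x
    return r_new, x

def rand(r: int, y: int) -> tuple[int, int]:
    r_new, x = randu256(r)
    return r_new, (x * y) >> 256             # z := floor(x*y / 2**256)

seed = 6                   # in practice: hash of address and block seed
seed, z = rand(seed, 100)  # pseudo-random z in 0..99
assert 0 <= z < 100
```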
@@ -2294,16 +2299,16 @@ The following primitives read configuration data provided in the *Tuple* stored
### A.11.5. Global variable primitives
-The "global variables" may be helpful in implementing some high-level smart-contract languages. They are in fact stored as components of the *Tuple* at $\texttt{c7}$: the $k$-th global variable simply is the $k$-th component of this *Tuple*, for $1\leq k\leq 254$. By convention, the $0$-th component is used for the "configuration parameters" of [A.11.4](#a-11-4-configuration-primitives), so it is not available as a global variable.
+The "global variables" may be helpful in implementing some high-level smart-contract languages. They are in fact stored as components of the *Tuple* at $\texttt{c7}$: the $k$-th global variable simply is the $k$-th component of this *Tuple*, for $1\leq k\leq 254$. By convention, the $0$-th component is used for the [configuration parameters](#a-11-4-configuration-primitives), so it is not available as a global variable.
-- $\texttt{F840}$ — $\texttt{GETGLOBVAR}$ ($k$ - $x$), returns the $k$-th global variable for $0\leq k<255$. Equivalent to $\texttt{PUSH c7}$; $\texttt{SWAP}$; $\texttt{INDEXVARQ}$ (cf. [A.3.2](#a-3-2-tuple-primitives)).
+- $\texttt{F840}$ — $\texttt{GETGLOBVAR}$ ($k$ - $x$), returns the $k$-th global variable for $0\leq k<255$. Equivalent to $\texttt{PUSH c7}$; $\texttt{SWAP}$; $\texttt{INDEXVARQ}$ ([A.3.2](#a-3-2-tuple-primitives)).
- $\texttt{F85\_}k$ — $\texttt{GETGLOB}$ $k$ ( - $x$), returns the $k$-th global variable for $1\leq k\leq 31$. Equivalent to $\texttt{PUSH c7}$; $\texttt{INDEXQ}$ $k$.
- $\texttt{F860}$ — $\texttt{SETGLOBVAR}$ ($x$ $k$ - ), assigns $x$ to the $k$-th global variable for $0\leq k<255$. Equivalent to $\texttt{PUSH c7}$; $\texttt{ROTREV}$; $\texttt{SETINDEXVARQ}$; $\texttt{POP c7}$.
- $\texttt{F87\_}k$ — $\texttt{SETGLOB}$ $k$ ($x$ - ), assigns $x$ to the $k$-th global variable for $1\leq k\leq 31$. Equivalent to $\texttt{PUSH c7}$; $\texttt{SWAP}$; $\texttt{SETINDEXQ}$ $k$; $\texttt{POP c7}$.
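A sketch of this convention with the *Tuple* in $\texttt{c7}$ modeled as a Python list; quietly extending the tuple with null components on out-of-range writes is an assumption here, mirroring the quiet $\texttt{SETINDEXVARQ}$/$\texttt{INDEXVARQ}$ primitives used above:

```
# Sketch: global variables as components of the Tuple in c7.
# Component 0 holds the configuration parameters (A.11.4); global k
# is component k for 1 <= k <= 254. Extending the tuple with nulls
# on out-of-range writes is an assumption of this sketch.
c7: list = [("config parameters",)]      # component 0 is reserved

def set_glob(k: int, x) -> None:
    if not 1 <= k <= 254:
        raise ValueError("range check exception")
    while len(c7) <= k:
        c7.append(None)                  # None plays the role of Null
    c7[k] = x

def get_glob(k: int):
    return c7[k] if 0 <= k < len(c7) else None

set_glob(3, 42)
assert get_glob(3) == 42 and get_glob(7) is None
```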
### A.11.6. Hashing and cryptography primitives
-- $\texttt{F900}$ — $\texttt{HASHCU}$ ($c$ - $x$), computes the representation hash (cf. [3.1.5](#3-1-5-the-representation-hash-of-a-cell)) of a *Cell* $c$ and returns it as a 256-bit unsigned integer $x$. Useful for signing and checking signatures of arbitrary entities represented by a tree of cells.
+- $\texttt{F900}$ — $\texttt{HASHCU}$ ($c$ - $x$), computes the [representation hash](#3-1-5-the-representation-hash-of-a-cell) of a *Cell* $c$ and returns it as a 256-bit unsigned integer $x$. Useful for signing and checking signatures of arbitrary entities represented by a tree of cells.
- $\texttt{F901}$ — $\texttt{HASHSU}$ ($s$ - $x$), computes the hash of a *Slice* $s$ and returns it as a 256-bit unsigned integer $x$. The result is the same as if an ordinary cell containing only data and references from $s$ had been created and its hash computed by $\texttt{HASHCU}$.
- $\texttt{F902}$ — $\texttt{SHA256U}$ ($s$ - $x$), computes SHA-256 of the data bits of *Slice* $s$. If the bit length of $s$ is not divisible by eight, throws a cell underflow exception. The hash value is returned as a 256-bit unsigned integer $x$.
- $\texttt{F910}$ — $\texttt{CHKSIGNU}$ ($h$ $s$ $k$ - $?$), checks the Ed25519-signature $s$ of a hash $h$ (a 256-bit unsigned integer, usually computed as the hash of some data) using public key $k$ (also represented by a 256-bit unsigned integer). The signature $s$ must be a *Slice* containing at least 512 data bits; only the first 512 bits are used. The result is $-1$ if the signature is valid, $0$ otherwise. Notice that $\texttt{CHKSIGNU}$ is equivalent to $\texttt{ROT}$; $\texttt{NEWB}$; $\texttt{STU 256}$; $\texttt{ENDB}$; $\texttt{NEWC}$; $\texttt{ROTREV}$; $\texttt{CHKSIGNS}$, i.e., to $\texttt{CHKSIGNS}$ with the first argument $d$ set to 256-bit *Slice* containing $h$. Therefore, if $h$ is computed as the hash of some data, these data are hashed *twice*, the second hashing occurring inside $\texttt{CHKSIGNS}$.
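A sketch of the $\texttt{CHKSIGNU}$ check using the third-party `cryptography` package (an assumed dependency; any Ed25519 implementation would do). The hash and key travel as 256-bit unsigned integers, serialized big-endian here, and the result is $-1$/$0$ as in TVM:

```
# Sketch: CHKSIGNU (h s k -- ?) with the "cryptography" package as an
# assumed dependency. h and k are 256-bit unsigned integers; sig is
# the first 512 bits (64 bytes) of the signature Slice.
import hashlib
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import (
    Ed25519PrivateKey, Ed25519PublicKey)
from cryptography.hazmat.primitives.serialization import (
    Encoding, PublicFormat)

def chksignu(h: int, sig: bytes, k: int) -> int:
    pub = Ed25519PublicKey.from_public_bytes(k.to_bytes(32, "big"))
    try:
        pub.verify(sig[:64], h.to_bytes(32, "big"))  # signs the 32 hash bytes
        return -1          # valid
    except InvalidSignature:
        return 0           # invalid

# Self-test with a fresh key pair.
priv = Ed25519PrivateKey.generate()
k = int.from_bytes(
    priv.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw), "big")
h = int.from_bytes(hashlib.sha256(b"some data").digest(), "big")
assert chksignu(h, priv.sign(h.to_bytes(32, "big")), k) == -1
```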
@@ -2332,7 +2337,7 @@ The "global variables" may be helpful in implementing some high-level smart-cont
### A.11.9. Message and address manipulation primitives
-The message and address manipulation primitives listed below serialize and deserialize values according to the following TL-B scheme (cf. [3.3.4](#3-3-4-brief-explanation-of-tl-b-schemes)):
+The message and address manipulation primitives listed below serialize and deserialize values according to the following [TL-B scheme](#3-3-4-brief-explanation-of-tl-b-schemes):
```
addr_none$00 = MsgAddressExt;
addr_extern$01 len:(## 9) external_address:(bits len)
@@ -2416,11 +2421,11 @@ Next we describe the debug primitives that might (and actually are) implemented
## A.13. Codepage primitives
-The following primitives, which begin with byte $\texttt{FF}$, typically are used at the very beginning of a smart contract's code or a library subroutine to select another TVM codepage. Notice that we expect all codepages to contain these primitives with the same codes, otherwise switching back to another codepage might be impossible (cf. [5.1.8](#5-1-8-setting-the-codepage-in-the-code-itself)).
+The following primitives, which begin with byte $\texttt{FF}$, typically are used at the very beginning of a smart contract's code or a library subroutine to select another TVM codepage. Notice that we expect all codepages to contain these primitives with the same codes, otherwise switching back to another [codepage](#5-1-8-setting-the-codepage-in-the-code-itself) might be impossible.
- $\texttt{FF}nn$ — $\texttt{SETCP}$ $nn$, selects TVM codepage $0\leq nn<240$. If the codepage is not supported, throws an invalid opcode exception.
- $\texttt{FF00}$ — $\texttt{SETCP0}$, selects TVM (test) codepage zero as described in this document.
-- $\texttt{FFF}z$ — $\texttt{SETCP}$ $z-16$, selects TVM codepage $z-16$ for $1\leq z\leq 15$. Negative codepages $-13\ldots-1$ are reserved for restricted versions of TVM needed to validate runs of TVM in other codepages as explained in [B.2.6](#b-2-6-codepage-1). Negative codepage $-14$ is reserved for experimental codepages, not necessarily compatible between different TVM implementations, and should be disabled in the production versions of TVM.
+- $\texttt{FFF}z$ — $\texttt{SETCP}$ $z-16$, selects TVM codepage $z-16$ for $1\leq z\leq 15$. Negative codepages $-13\ldots-1$ are reserved for restricted versions of TVM needed to validate runs of TVM in other [codepages](#b-2-6-codepage). Negative codepage $-14$ is reserved for experimental codepages, not necessarily compatible between different TVM implementations, and should be disabled in the production versions of TVM.
- $\texttt{FFF0}$ — $\texttt{SETCPX}$ ($c$ - ), selects codepage $c$ with $-2^{15}\leq c<2^{15}$ passed in the top of the stack.
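A sketch of decoding the second opcode byte of these $\texttt{FF}$-prefixed primitives, combining the three cases above:

```
# Sketch: decoding the second byte of the FF-prefixed codepage
# selection opcodes described above.
def decode_setcp(nn: int) -> str:
    if not 0 <= nn <= 255:
        raise ValueError("second opcode byte out of range")
    if nn < 240:
        return f"SETCP {nn}"        # FF nn: codepage 0..239
    if nn == 240:
        return "SETCPX"             # FF F0: codepage taken from the stack
    return f"SETCP {nn - 256}"      # FF Fz: codepage z - 16, i.e. -15..-1

assert decode_setcp(0x00) == "SETCP 0"      # SETCP0
assert decode_setcp(0xF1) == "SETCP -15"
assert decode_setcp(0xFF) == "SETCP -1"
```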
# B Formal properties and specifications of TVM
@@ -2563,20 +2568,20 @@ The TON Blockchain adopts this approach to validate the runs of TVM (e.g., those
### B.2.7. Codepage $-2$
-This bootstrapping process could be iterated even further, by providing an emulator of the stripped-down version of TVM written for an even simpler version of TVM that supports only boolean values (or integers 0 and 1)—a "codepage $-2$". All 64-bit arithmetic used in codepage $-1$ would then need to be defined by means of boolean operations, thus providing a reference implementation for the stripped-down version of TVM used in codepage $-1$. In this way, if some of the TON Blockchain validators did not agree on the results of their 64-bit arithmetic, they could regress to this reference implementation to find the correct answer.[30](#fn30)
-
+This bootstrapping process could be iterated even further, by providing an emulator of the stripped-down version of TVM written for an even simpler version of TVM that supports only boolean values (or integers 0 and 1)—a "codepage $-2$". All 64-bit arithmetic used in codepage $-1$ would then need to be defined by means of boolean operations, thus providing a reference implementation for the stripped-down version of TVM used in codepage $-1$. In this way, if some of the TON Blockchain validators did not agree on the results of their 64-bit arithmetic, they could regress to this reference implementation to find the correct answer.[30](#fn30)
+
---
# C Code density of stack and register machines
-This appendix extends the general consideration of stack manipulation primitives provided in [2.2](#2-2-stack-manipulation-primitives), explaining the choice of such primitives for TVM, with a comparison of stack machines and register machines in terms of the quantity of primitives used and the code density. We do this by comparing the machine code that might be generated by an optimizing compiler for the same source files, for different (abstract) stack and register machines.
+This appendix extends the general consideration of [stack manipulation](#2-2-stack-manipulation-primitives) primitives, explaining the choice of such primitives for TVM, with a comparison of stack machines and register machines in terms of the quantity of primitives used and the code density. We do this by comparing the machine code that might be generated by an optimizing compiler for the same source files, for different (abstract) stack and register machines.
-It turns out that the stack machines (at least those equipped with the basic stack manipulation primitives described in [2.2.1](#2-2-1-basic-stack-manipulation-primitives)) have far superior code density. Furthermore, the stack machines have excellent extendability with respect to additional arithmetic and arbitrary data processing operations, especially if one considers machine code automatically generated by optimizing compilers.
+It turns out that the stack machines (at least those equipped with the [basic stack manipulation](#2-2-1-basic-stack-manipulation-primitives) primitives) have far superior code density. Furthermore, the stack machines have excellent extendability with respect to additional arithmetic and arbitrary data processing operations, especially if one considers machine code automatically generated by optimizing compilers.
## C.1 Sample leaf function
-We start with a comparison of machine code generated by an (imaginary) optimizing compiler for several abstract register and stack machines, corresponding to the same high-level language source code that contains the definition of a leaf function (i.e., a function that does not call any other functions). For both the register machines and stack machines, we observe the notation and conventions introduced in [2.1](#2-1-stack-calling-conventions).
+We start with a comparison of machine code generated by an (imaginary) optimizing compiler for several abstract register and stack machines, corresponding to the same high-level language source code that contains the definition of a leaf function (i.e., a function that does not call any other functions). For both the register machines and stack machines, we observe the notation and conventions introduced in [2.1](#2-1-stack-calling-conventions).
### C.1.1. Sample source file for a leaf function
@@ -2601,11 +2606,11 @@ The source code of the function, in a programming language similar to C, might l
}
```
-We assume (cf. [2.1](#2-1-stack-calling-conventions)) that the register machines we consider accept the six parameters $a$ $\ldots$ $f$ in registers $\texttt{r0}$ $\ldots$ $\texttt{r5}$, and return the two values $x$ and $y$ in $\texttt{r0}$ and $\texttt{r1}$. We also assume that the register machines have 16 registers, and that the stack machine can directly access $\texttt{s0}$ to $\texttt{s15}$ by its stack manipulation primitives; the stack machine will accept the parameters in $\texttt{s5}$ to $\texttt{s0}$, and return the two values in $\texttt{s0}$ and $\texttt{s1}$, somewhat similarly to the register machine. Finally, we assume at first that the register machine is allowed to destroy values in all registers (which is slightly unfair towards the stack machine); this assumption will be revisited later.
+We assume that the [register machines](#2-1-stack-calling-conventions) we consider accept the six parameters $a$ $\ldots$ $f$ in registers $\texttt{r0}$ $\ldots$ $\texttt{r5}$, and return the two values $x$ and $y$ in $\texttt{r0}$ and $\texttt{r1}$. We also assume that the register machines have 16 registers, and that the stack machine can directly access $\texttt{s0}$ to $\texttt{s15}$ by its stack manipulation primitives; the stack machine will accept the parameters in $\texttt{s5}$ to $\texttt{s0}$, and return the two values in $\texttt{s0}$ and $\texttt{s1}$, somewhat similarly to the register machine. Finally, we assume at first that the register machine is allowed to destroy values in all registers (which is slightly unfair towards the stack machine); this assumption will be revisited later.
### C.1.2. Three-address register machine
-The machine code (or rather the corresponding assembly code) for a three-address register machine (cf. [2.1.7](#2-1-7-arguments-to-arithmetic-primitives-on-register-machines)) might look as follows:
+The machine code (or rather the corresponding assembly code) for a three-address [register machine](#2-1-7-arguments-to-arithmetic-primitives-on-register-machines) might look as follows:
```
IMUL r6,r0,r3 // r6 := r0 * r3 = ad
@@ -2647,7 +2652,7 @@ IDIV r1,r6 // r1 := Dy/D
RET
```
-We have used 16 operations; optimistically assuming each of them (with the exception of $\texttt{RET}$) can be encoded by two bytes, this code would require 31 bytes.[31](#fn31)
+We have used 16 operations; optimistically assuming each of them (with the exception of $\texttt{RET}$) can be encoded by two bytes, this code would require 31 bytes.[31](#fn31)
### C.1.4. One-address register machine
@@ -2683,7 +2688,8 @@ We have used 23 operations; if we assume one-byte encoding for all arithmetic op
### C.1.5. Stack machine with basic stack primitives
-The machine code for a stack machine equipped with basic stack manipulation primitives described in [2.2.1](#2-2-1-basic-stack-manipulation-primitives) might look as follows:
+The machine code for a stack machine equipped with [basic stack manipulation](#2-2-1-basic-stack-manipulation-primitives) primitives might look as follows:
+
```
PUSH s5 // a b c d e f a
@@ -2721,11 +2727,11 @@ We have used 29 operations; assuming one-byte encodings for all stack operations
Notice as well that we have implicitly used the commutativity of multiplication in this code, computing $de-bf$ instead of $ed-bf$ as specified in high-level language source code. If we were not allowed to do so, an extra $\texttt{XCHG s1}$ would need to be inserted before the third $\texttt{IMUL}$, increasing the total size of the code by one operation and one byte.
-The code presented above might have been produced by a rather unsophisticated compiler that simply computed all expressions and subexpressions in the order they appear, then rearranged the arguments near the top of the stack before each operation as outlined in [2.2.2](#2-2-2-basic-stack-manipulation-primitives-suffice). The only "manual" optimization done here involves computing $ec$ before $af$; one can check that the other order would lead to slightly shorter code of 28 operations and bytes (or 29, if we are not allowed to use the commutativity of multiplication), but the $\texttt{ROT}$ optimization would not be applicable.
-
+The code presented above might have been produced by a rather unsophisticated compiler that simply computed all expressions and subexpressions in the order they appear, then rearranged the arguments near the [top of the stack](#2-2-2-basic-stack-manipulation-primitives-suffice) before each operation. The only "manual" optimization done here involves computing $ec$ before $af$; one can check that the other order would lead to slightly shorter code of 28 operations and bytes (or 29, if we are not allowed to use the commutativity of multiplication), but the $\texttt{ROT}$ optimization would not be applicable.
+
### C.1.6. Stack machine with compound stack primitives
-A stack machine with compound stack primitives (cf. [2.2.3](#2-2-3-compound-stack-manipulation-primitives)) would not significantly improve code density of the code presented above, at least in terms of bytes used. The only difference is that, if we were not allowed to use commutativity of multiplication, the extra $\texttt{XCHG s1}$ inserted before the third $\texttt{IMUL}$ might be combined with two previous operations $\texttt{XCHG s3}$, $\texttt{PUSH s2}$ into one compound operation $\texttt{PUXC s2,s3}$; we provide the resulting code below. To make this less redundant, we show a version of the code that computes subexpression $af$ before $ec$ as specified in the original source file. We see that this replaces six operations (starting from line 15) with five other operations, and disables the $\texttt{ROT}$ optimization:
+A stack machine with [compound stack](#2-2-3-compound-stack-manipulation-primitives) primitives would not significantly improve code density of the code presented above, at least in terms of bytes used. The only difference is that, if we were not allowed to use commutativity of multiplication, the extra $\texttt{XCHG s1}$ inserted before the third $\texttt{IMUL}$ might be combined with two previous operations $\texttt{XCHG s3}$, $\texttt{PUSH s2}$ into one compound operation $\texttt{PUXC s2,s3}$; we provide the resulting code below. To make this less redundant, we show a version of the code that computes subexpression $af$ before $ec$ as specified in the original source file. We see that this replaces six operations (starting from line 15) with five other operations, and disables the $\texttt{ROT}$ optimization:
```
PUSH s5 // a b c d e f a
@@ -2796,14 +2802,16 @@ It is interesting to note that this version of stack machine code contains only
## C.2 Comparison of machine code for sample leaf function
-[Table 1](#c-2-comparison-of-machine-code-for-sample-leaf-function) summarizes the properties of machine code corresponding to the same source file described in [C.1.1](#c-1-1-sample-source-file-for-a-leaf-function), generated for a hypothetical three-address register machine (cf. [C.1.2](#c-1-2-three-address-register-machine)), with both "optimistic" and "realistic" instruction encodings; a two-address machine (cf. [C.1.3](#c-1-3-two-address-register-machine)); a one-address machine (cf. [C.1.4](#c-1-4-one-address-register-machine)); and a stack machine, similar to TVM, using either only the basic stack manipulation primitives (cf. [C.1.5](#c-1-5-stack-machine-with-basic-stack-primitives)) or both the basic and the composite stack primitives (cf. [C.1.7](#c-1-7-stack-machine-with-compound-stack-primitives-and-manually-optimized-code)).
+[Table 1](#table-1) summarizes the properties of machine code corresponding to the [same source file](#c-1-1-sample-source-file-for-a-leaf-function), generated for a hypothetical [three-address register machine](#c-1-2-three-address-register-machine), with both "optimistic" and "realistic" instruction encodings; a [two-address machine](#c-1-3-two-address-register-machine); a [one-address machine](#c-1-4-one-address-register-machine); and a stack machine, similar to TVM, using either only the [basic stack manipulation](#c-1-5-stack-machine-with-basic-stack-primitives) primitives or both the basic and the [composite stack](#c-1-7-stack-machine-with-compound-stack-primitives-and-manually-optimized-code) primitives.
-The meaning of the columns in [Table 1](#c-2-comparison-of-machine-code-for-sample-leaf-function) is as follows:
+The meaning of the columns in [Table 1](#table-1) is as follows:
- "Operations" — The quantity of instructions used, split into "data" (i.e., register move and exchange instructions for register machines, and stack manipulation instructions for stack machines) and "arithmetic" (instructions for adding, subtracting, multiplying and dividing integer numbers). The "total" is one more than the sum of these two, because there is also a one-byte $\texttt{RET}$ instruction at the end of machine code.
- "Code bytes" — The total amount of code bytes used.
- "Opcode space" — The portion of "opcode space" (i.e., of possible choices for the first byte of the encoding of an instruction) used by data and arithmetic instructions in the assumed instruction encoding. For example, the "optimistic" encoding for the three-address machine assumes two-byte encodings for all arithmetic instructions *op* $\texttt{r}(i)\texttt{, r}(j)\texttt{, r}(k)$. Each arithmetic instruction would then consume portion $16/256=1/16$ of the opcode space. Notice that for the stack machine we have assumed one-byte encodings for $\texttt{XCHG s}(i)$, $\texttt{PUSH s}(i)$ and $\texttt{POP s}(i)$ in all cases, augmented by $\texttt{XCHG s1,s}(i)$ for the basic stack instructions case only. As for the compound stack operations, we have assumed two-byte encodings for $\texttt{PUSH3}$, $\texttt{XCHG3}$, $\texttt{XCHG2}$, $\texttt{XCPU}$, $\texttt{PUXC}$, $\texttt{PUSH2}$, but not for $\texttt{XCHG s1,s}(i)$.
+
+
| Machine | Operations | | | Code bytes | | | Opcode space | | |
|---------|------------|-------|-------|------------|-------|-------|--------------|-------|-------|
| | data | arith | total | data | arith | **total** | data | **arith** | total |
@@ -2814,7 +2822,8 @@ The meaning of the columns in [Table 1](#c-2-comparison-of-machine-code-for-samp
| stack (basic) | 16 | 11 | 28 | 16 | 11 | **28** | 64/256 | **4/256** | 69/256 |
| stack (comp.) | 9 | 11 | 21 | 15 | 11 | **27** | 84/256 | **4/256** | 89/256 |
-**Table 1.** A summary of machine code properties for hypothetical 3-address, 2-address, 1-address, and stack machines, generated for a sample leaf function (cf. [C.1.1](#c-1-1-sample-source-file-for-a-leaf-function)). The two most important columns, reflecting code density and extendability to other operations, are marked by bold font. Smaller values are better in both of these columns.
+**Table 1.** A summary of machine code properties for hypothetical 3-address, 2-address, 1-address, and stack machines, generated for a [sample leaf function](#c-1-1-sample-source-file-for-a-leaf-function). The two most important columns, reflecting code density and extendability to other operations, are marked by bold font. Smaller values are better in both of these columns.
+
The "code bytes" column reflects the density of the code for the specific sample source. However, "opcode space" is also important, because it reflects the extendability of the achieved density to other classes of operations (e.g., if one were to complement arithmetic operations with string manipulation operations and so on). Here the "arithmetic" subcolumn is more important than the "data" subcolumn, because no further data manipulation operations would be required for such extensions.
@@ -2828,7 +2837,7 @@ Finally, the stack machine wins the competition in terms of code density (27 or
To summarize: the two-address machine and stack machine achieve the best extendability with respect to additional arithmetic or data processing instructions (using only 1/256 of code space for each such instruction), while the stack machine additionally achieves the best code density by a small margin. The stack machine utilizes a significant part of its code space (more than a quarter) for data (i.e., stack) manipulation instructions; however, this does not seriously hamper extendability, because the stack manipulation instructions occupy a constant part of the opcode space, regardless of all other instructions and extensions.
-While one might still be tempted to use a two-address register machine, we will explain shortly (cf. [C.3](#c-3-sample-non-leaf-function)) why the two-address register machine offers worse code density and extendability in practice than it appears based on this table.
+While one might still be tempted to use a two-address register machine, we will explain [shortly](#c-3-sample-non-leaf-function) why the two-address register machine offers worse code density and extendability in practice than it appears based on this table.
As for the choice between a stack machine with only basic stack manipulation primitives or one supporting compound stack primitives as well, the case for the more sophisticated stack machine appears to be weaker: it offers only one or two fewer bytes of code at the expense of using considerably more opcode space for stack manipulation, and the optimized code using these additional instructions is hard for programmers to write and for compilers to automatically generate.
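Readers who wish to experiment with the basic primitives may find the following toy Python simulator useful (a sketch under the semantics defined earlier: $\texttt{XCHG s}(i)$ interchanges s0 and s(i), $\texttt{PUSH s}(i)$ pushes a copy of s(i), and $\texttt{POP s}(i)$ moves s0 into s(i)); the compound primitives are, by definition, abbreviations of such sequences:

```
# Toy simulator for the basic stack primitives (illustrative sketch).
# The stack is a Python list with s0 at the end, so s(i) == st[-1 - i].

def xchg(st, i):     # XCHG s(i): interchange s0 and s(i)
    st[-1], st[-1 - i] = st[-1 - i], st[-1]

def push(st, i):     # PUSH s(i): push a copy of s(i)
    st.append(st[-1 - i])

def pop(st, i):      # POP s(i): move s0 into s(i), i.e. XCHG s(i) then drop
    xchg(st, i)
    st.pop()

st = list("abcdef")  # s5 s4 s3 s2 s1 s0 = a b c d e f
push(st, 5)          # PUSH s5
assert "".join(st) == "abcdefa"   # matches the listing comment above
```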
@@ -2844,13 +2853,15 @@ The following sections consider cases $m=0$, $m=8$, and $m=16$ for our register
### C.2.2. Case $m=0$: no registers to preserve
-This case has been considered and summarized in [C.2](#c-2-comparison-of-machine-code-for-sample-leaf-function) and [Table 1](#c-2-comparison-of-machine-code-for-sample-leaf-function) above.
+This case has been considered and summarized in [C.2](#c-2-comparison-of-machine-code-for-sample-leaf-function) and [Table 1](#table-1) above.
### C.2.3. Case $m=n=16$: all registers must be preserved
This case is the most painful one for the called function. It is especially difficult for leaf functions like the one we have been considering, which do not benefit at all from the fact that other functions preserve some registers when called—they do not call any functions, but instead must preserve all registers themselves.
-In order to estimate the consequences of assuming $m=n=16$, we will assume that all our register machines are equipped with a stack, and with one-byte instructions $\texttt{PUSH r}(i)$ and $\texttt{POP r}(i)$, which push or pop a register into/from the stack. For example, the three-address machine code provided in [C.1.2](#c-1-2-three-address-register-machine) destroys the values in registers $\texttt{r2}$, $\texttt{r3}$, $\texttt{r6}$, and $\texttt{r7}$; this means that the code of this function must be augmented by four instructions $\texttt{PUSH r2}$; $\texttt{PUSH r3}$; $\texttt{PUSH r6}$; $\texttt{PUSH r7}$ at the beginning, and by four instructions $\texttt{POP r7}$; $\texttt{POP r6}$; $\texttt{POP r3}$; $\texttt{POP r2}$ right before the $\texttt{RET}$ instruction, in order to restore the original values of these registers from the stack. These four additional $\texttt{PUSH}$/$\texttt{POP}$ pairs would increase the operation count and code size in bytes by $4\times 2=8$. A similar analysis can be done for other register machines as well, leading to [Table 2](#c-2-3-case-m-n-16).
+In order to estimate the consequences of assuming $m=n=16$, we will assume that all our register machines are equipped with a stack, and with one-byte instructions $\texttt{PUSH r}(i)$ and $\texttt{POP r}(i)$, which push or pop a register into/from the stack. For example, the [three-address machine](#c-1-2-three-address-register-machine) code destroys the values in registers $\texttt{r2}$, $\texttt{r3}$, $\texttt{r6}$, and $\texttt{r7}$; this means that the code of this function must be augmented by four instructions $\texttt{PUSH r2}$; $\texttt{PUSH r3}$; $\texttt{PUSH r6}$; $\texttt{PUSH r7}$ at the beginning, and by four instructions $\texttt{POP r7}$; $\texttt{POP r6}$; $\texttt{POP r3}$; $\texttt{POP r2}$ right before the $\texttt{RET}$ instruction, in order to restore the original values of these registers from the stack. These four additional $\texttt{PUSH}$/$\texttt{POP}$ pairs would increase the operation count and code size in bytes by $4\times 2=8$. A similar analysis can be done for other register machines as well, leading to [Table 2](#table-2).
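Before the table, this overhead is mechanical enough to compute directly. A short Python sketch (assuming, as above, one-byte $\texttt{PUSH r}(i)$ and $\texttt{POP r}(i)$) for the set of clobbered registers just listed:

```
# Save/restore overhead for preserved registers (illustrative sketch):
# 2r extra instructions and, at one byte each, 2r extra code bytes.

def preservation_overhead(clobbered):
    prologue = [f"PUSH {r}" for r in clobbered]
    epilogue = [f"POP {r}" for r in reversed(clobbered)]
    return prologue, epilogue, len(prologue) + len(epilogue)

pro, epi, extra = preservation_overhead(["r2", "r3", "r6", "r7"])
assert pro == ["PUSH r2", "PUSH r3", "PUSH r6", "PUSH r7"]
assert epi == ["POP r7", "POP r6", "POP r3", "POP r2"]
assert extra == 8    # the 4 x 2 = 8 extra operations and bytes
```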
+
+
| Machine | $\mathit{r}$ | Operations | | | Code bytes | | | Opcode space | | |
|---------|--------------|------------|-------|-------|------------|-------|-------|--------------|-------|-------|
@@ -2862,13 +2873,15 @@ In order to estimate the consequences of assuming $m=n=16$, we will assume that
| stack (basic) | *0* | 16 | 11 | 28 | 16 | 11 | **28** | 64/256 | **4/256** | 69/256 |
| stack (comp.) | *0* | 9 | 11 | 21 | 15 | 11 | **27** | 84/256 | **4/256** | 89/256 |
-**Table 2.** A summary of machine code properties for hypothetical 3-address, 2-address, 1-address, and stack machines, generated for a sample leaf function (cf. [C.1.1](#c-1-1-sample-source-file-for-a-leaf-function)), assuming all of the 16 registers must be preserved by called functions ($m=n=16$). The new column labeled $r$ denotes the number of registers to be saved and restored, leading to $2r$ more operations and code bytes compared to [Table 1](#c-2-comparison-of-machine-code-for-sample-leaf-function). Newly-added $\texttt{PUSH}$ and $\texttt{POP}$ instructions for register machines also utilize 32/256 of the opcode space. The two rows corresponding to stack machines remain unchanged.
+**Table 2.** A summary of machine code properties for hypothetical 3-address, 2-address, 1-address, and stack machines, generated for a [sample leaf function](#c-1-1-sample-source-file-for-a-leaf-function), assuming all of the 16 registers must be preserved by called functions ($m=n=16$). The new column labeled $r$ denotes the number of registers to be saved and restored, leading to $2r$ more operations and code bytes compared to [Table 1](#table-1). Newly-added $\texttt{PUSH}$ and $\texttt{POP}$ instructions for register machines also utilize 32/256 of the opcode space. The two rows corresponding to stack machines remain unchanged.
We see that under these assumptions the stack machines are the obvious winners in terms of code density, and are in the winning group with respect to extendability.
### C.2.4. Case $m = 8$, $n = 16$: registers $\texttt{r8}$ $\ldots$ $\texttt{r15}$ must be preserved
-The analysis of this case is similar to the previous one. The results are summarized in [Table 3](#c-2-4-case-m%3D8%2C-n%3D16%3A-registers-r8…r15-must-be-preserved).
+The analysis of this case is similar to the previous one. The results are summarized in [Table 3](#table-3).
+
+
| Machine | $\mathit{r}$ | Operations | | | Code bytes | | | Opcode space | | |
|---------|--------------|------------|-------|-------|------------|-------|-------|--------------|-------|-------|
@@ -2880,9 +2893,9 @@ The analysis of this case is similar to the previous one. The results are summar
| stack (basic) | *0* | 16 | 11 | 28 | 16 | 11 | **28** | 64/256 | **4/256** | 69/256 |
| stack (comp.) | *0* | 9 | 11 | 21 | 15 | 11 | **27** | 84/256 | **4/256** | 89/256 |
-**Table 3.** A summary of machine code properties for hypothetical 3-address, 2-address, 1-address and stack machines, generated for a sample leaf function (cf. [C.1.1](#c-1-1-sample-source-file-for-a-leaf-function)), assuming that only the last 8 of the 16 registers must be preserved by called functions ($m=8$, $n=16$). This table is similar to [Table 2](#c-2-3-case-m%3Dn%3D16%3A-all-registers-must-be-preserved), but has smaller values of $r$.
+**Table 3.** A summary of machine code properties for hypothetical 3-address, 2-address, 1-address and stack machines, generated for a [sample leaf function](#c-1-1-sample-source-file-for-a-leaf-function), assuming that only the last 8 of the 16 registers must be preserved by called functions ($m=8$, $n=16$). This table is similar to [Table 2](#table-2), but has smaller values of $r$.
-Notice that the resulting table is very similar to [Table 1](#c-2-comparison-of-machine-code-for-sample-leaf-function), apart from the "Opcode space" columns and the row for the one-address machine. Therefore, the conclusions of [C.2](#c-2-comparison-of-machine-code-for-sample-leaf-function) still apply in this case, with some minor modifications. We must emphasize, however, that *these conclusions are valid only for leaf functions, i.e., functions that do not call other functions*. Any program aside from the very simplest will have many non-leaf functions, especially if we are minimizing resulting machine code size (which prevents inlining of functions in most cases).
+Notice that the resulting table is very similar to [Table 1](#table-1), apart from the "Opcode space" columns and the row for the one-address machine. Therefore, the conclusions of [C.2](#c-2-comparison-of-machine-code-for-sample-leaf-function) still apply in this case, with some minor modifications. We must emphasize, however, that *these conclusions are valid only for leaf functions, i.e., functions that do not call other functions*. Any program aside from the very simplest will have many non-leaf functions, especially if we are minimizing resulting machine code size (which prevents inlining of functions in most cases).
### C.2.5. A fairer comparison using a binary code instead of a byte code
@@ -2890,10 +2903,12 @@ The reader may have noticed that our preceding discussion of $k$-address registe
Therefore, let us get rid of this restriction.
-Now that we can use any number of bits to encode an instruction, we can choose all opcodes of the same length for all the machines considered. For instance, all arithmetic instructions can have 8-bit opcodes, as the stack machine does, using $1/256$ of the opcode space each; then the three-address register machine will use 20 bits to encode each complete arithmetic instruction. All $\texttt{MOV}$s, $\texttt{XCHG}$s, $\texttt{PUSH}$es, and $\texttt{POP}$s on register machines can be assumed to have 4-bit opcodes, because this is what we do for the most common stack manipulation primitives on a stack machine. The results of these changes are shown in [Table 4](#c-2-5-a-fairer-comparison-using-a-binary-code-instead-of-a-byte-code).
+Now that we can use any number of bits to encode an instruction, we can choose all opcodes of the same length for all the machines considered. For instance, all arithmetic instructions can have 8-bit opcodes, as the stack machine does, using $1/256$ of the opcode space each; then the three-address register machine will use 20 bits to encode each complete arithmetic instruction. All $\texttt{MOV}$s, $\texttt{XCHG}$s, $\texttt{PUSH}$es, and $\texttt{POP}$s on register machines can be assumed to have 4-bit opcodes, because this is what we do for the most common stack manipulation primitives on a stack machine. The results of these changes are shown in [Table 4](#table-4).
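Under these assumptions, the bit length of each instruction is simply its opcode width plus four bits per operand. A small sketch of the widths used for Table 4 (the helper is hypothetical, the widths are those stated above):

```
# Instruction widths under the equalized binary encoding (sketch):
# 8-bit opcodes for arithmetic, 4-bit opcodes for data/stack moves,
# and 4 bits per register or stack operand.

def insn_bits(opcode_bits: int, operands: int) -> int:
    return opcode_bits + 4 * operands

assert insn_bits(8, 3) == 20   # three-address arithmetic, as stated
assert insn_bits(8, 2) == 16   # two-address arithmetic
assert insn_bits(4, 2) == 12   # MOV/XCHG between registers
assert insn_bits(4, 1) == 8    # PUSH r(i), POP r(i), or XCHG s(i)
```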
We can see that the performance of the various machines is much more balanced, with the stack machine still the winner in terms of code density, but with the three-address machine enjoying the second place it really merits. If we were to consider decoding speed and the possibility of parallel execution of instructions, we would have to choose the three-address machine, because it uses only 12 instructions instead of 21.
+
+
| Machine | $\mathit{r}$ | Operations | | | Code bytes | | | Opcode space | | |
|---------|--------------|------------|-------|-------|------------|-------|-------|--------------|-------|-------|
| | | data | arith | total | data | arith | **total** | data | **arith** | total |
@@ -2903,7 +2918,7 @@ We can see that the performance of the various machines is much more balanced, w
| stack (basic) | *0* | 16 | 11 | 28 | 16 | 11 | **28** | 64/256 | **4/256** | 69/256 |
| stack (comp.) | *0* | 9 | 11 | 21 | 15 | 11 | **27** | 84/256 | **4/256** | 89/256 |
-**Table 4.** A summary of machine code properties for hypothetical 3-address, 2-address, 1-address and stack machines, generated for a sample leaf function (cf. [C.1.1](#c-1-1-sample-source-file-for-a-leaf-function)), assuming that only 8 of the 16 registers must be preserved by functions ($m=8$, $n=16$). This time we can use fractions of bytes to encode instructions, so as to match opcode space used by different machines. All arithmetic instructions have 8-bit opcodes, all data/stack manipulation instructions have 4-bit opcodes. In other respects this table is similar to [Table 3](#c-2-4-case-m%3D8%2C-n%3D16%3A-registers-r8…r15-must-be-preserved).
+**Table 4.** A summary of machine code properties for hypothetical 3-address, 2-address, 1-address and stack machines, generated for a [sample leaf function](#c-1-1-sample-source-file-for-a-leaf-function), assuming that only 8 of the 16 registers must be preserved by functions ($m=8$, $n=16$). This time we can use fractions of bytes to encode instructions, so as to match opcode space used by different machines. All arithmetic instructions have 8-bit opcodes, all data/stack manipulation instructions have 4-bit opcodes. In other respects this table is similar to [Table 3](#table-3).
---
@@ -2913,7 +2928,7 @@ This section compares the machine code for different register machines for a sam
### C.3.1. Sample source code for a non-leaf function
-A sample source file may be obtained by replacing the built-in integer type with a custom *Rational* type, represented by a pointer to an object in memory, in our function for solving systems of two linear equations (cf. [C.1.1](#c-1-1-sample-source-file-for-a-leaf-function)):
+A sample source file may be obtained by replacing the built-in integer type with a custom *Rational* type, represented by a pointer to an object in memory, in our function for solving systems of [two linear equations](#c-1-1-sample-source-file-for-a-leaf-function):
```
struct Rational;
@@ -2987,11 +3002,11 @@ POP r0 // x ; ..
RET
```
-We have used 41 instructions: 17 one-byte (eight $\texttt{PUSH}$/$\texttt{POP}$ pairs and one $\texttt{RET}$), 13 two-byte ($\texttt{MOV}$ and $\texttt{XCHG}$; out of them 11 "new" ones, involving the stack), and 11 three-byte ($\texttt{CALL}$), for a total of $17\cdot1+13\cdot2+11\cdot3=76$ bytes.[32](#fn32)
+We have used 41 instructions: 17 one-byte (eight $\texttt{PUSH}$/$\texttt{POP}$ pairs and one $\texttt{RET}$), 13 two-byte ($\texttt{MOV}$ and $\texttt{XCHG}$; out of them 11 "new" ones, involving the stack), and 11 three-byte ($\texttt{CALL}$), for a total of $17\cdot1+13\cdot2+11\cdot3=76$ bytes.[32](#fn32)
### C.3.3. Three-address and two-address register machines, $m=8$ preserved registers
-Now we have eight registers, $\texttt{r8}$ to $\texttt{r15}$, that are preserved by subroutine calls. We might keep some intermediate values there instead of pushing them into the stack. However, the penalty for doing so consists in a $\texttt{PUSH}$/$\texttt{POP}$ pair for every such register that we choose to use, because our function is also required to preserve its original value. It seems that using these registers under such a penalty does not improve the density of the code, so the optimal code for three- and two-address machines for $m=8$ preserved registers is the same as that provided in [C.3.2](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers-2), with a total of 42 instructions and 74 code bytes.
+Now we have eight registers, $\texttt{r8}$ to $\texttt{r15}$, that are preserved by subroutine calls. We might keep some intermediate values there instead of pushing them into the stack. However, the penalty for doing so consists in a $\texttt{PUSH}$/$\texttt{POP}$ pair for every such register that we choose to use, because our function is also required to preserve its original value. It seems that using these registers under such a penalty does not improve the density of the code, so the optimal code for three- and two-address machines with $m=8$ preserved registers is the same as [before](#c-3-2-three-address-and-two-address-register-machines,-preserved-registers), with a total of 42 instructions and 74 code bytes.
### C.3.4. Three-address and two-address register machines, $m=16$ preserved registers
@@ -3039,13 +3054,13 @@ POP r0 // x
RET
```
-We have used 39 instructions: 11 one-byte, 17 two-byte (among them 5 "new" instructions), and 11 three-byte, for a total of $11\cdot1+17\cdot2+11\cdot3=78$ bytes. Somewhat paradoxically, the code size in bytes is slightly longer than in the previous case (cf. [C.3.2](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers-2)), contrary to what one might have expected. This is partially due to the fact that we have assumed two-byte encodings for "new" $\texttt{MOV}$ and $\texttt{XCHG}$ instructions involving the stack, similarly to the "old" instructions. Most existing architectures (such as x86-64) use longer encodings (maybe even twice as long) for their counterparts of our "new" move and exchange instructions compared to the "usual" register-register ones. Taking this into account, we see that we would have obtained here 83 bytes (versus 87 for the code in [C.3.2](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers-2)) assuming three-byte encodings of new operations, and 88 bytes (versus 98) assuming four-byte encodings. This shows that, for two-address architectures without optimized encodings for register-stack move and exchange operations, $m=16$ preserved registers might result in slightly shorter code for some non-leaf functions, at the expense of leaf functions (cf. [C.2.3](#c-2-3-case-m%3Dn%3D16%3A-all-registers-must-be-preserved) and [C.2.4](#c-2-4-case-m%3D8%2C-n%3D16%3A-registers-r8…r15-must-be-preserved)), which would become considerably longer.
+We have used 39 instructions: 11 one-byte, 17 two-byte (among them 5 "new" instructions), and 11 three-byte, for a total of $11\cdot1+17\cdot2+11\cdot3=78$ bytes. Somewhat paradoxically, the code is slightly longer in bytes than in the [previous case](#c-3-2-three-address-and-two-address-register-machines,-preserved-registers), contrary to what one might have expected. This is partially due to the fact that we have assumed two-byte encodings for "new" $\texttt{MOV}$ and $\texttt{XCHG}$ instructions involving the stack, similarly to the "old" instructions. Most existing architectures (such as x86-64) use longer encodings (maybe even twice as long) for their counterparts of our "new" move and exchange instructions compared to the "usual" register-register ones. Taking this into account, we see that we would have obtained here 83 bytes (versus 87 for the [earlier code](#c-3-2-three-address-and-two-address-register-machines,-preserved-registers)) assuming three-byte encodings of new operations, and 88 bytes (versus 98) assuming four-byte encodings. This shows that, for two-address architectures without optimized encodings for register-stack move and exchange operations, $m=16$ preserved registers might result in slightly shorter code for some non-leaf functions, at the expense of leaf functions (cf. [C.2.3](#c-2-3-case-%3A-all-registers-must-be-preserved) and [C.2.4](#c-2-4-case-%2C-%3A-registers-must-be-preserved)), which would become considerably longer.
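These byte counts can be rechecked by varying only the assumed width of the "new" register-stack instructions. A Python sketch, with the instruction mixes taken from the two listings discussed here:

```
# Code size under different widths of the "new" register-stack MOV/XCHG
# instructions (illustrative sketch; all other widths stay fixed).

def code_bytes(one_byte, old_two_byte, new_insns, three_byte, new_width):
    return one_byte + 2 * old_two_byte + new_width * new_insns + 3 * three_byte

# m=16 listing: 11 one-byte, 17 two-byte (5 of them "new"), 11 three-byte.
# m=0 listing (C.3.2): 17 one-byte, 13 two-byte (11 "new"), 11 three-byte.
for new_width, m16_total, m0_total in [(2, 78, 76), (3, 83, 87), (4, 88, 98)]:
    assert code_bytes(11, 17 - 5, 5, 11, new_width) == m16_total
    assert code_bytes(17, 13 - 11, 11, 11, new_width) == m0_total
```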
### C.3.5. One-address register machine, $m=0$ preserved registers
For our one-address register machine, we assume that new register-stack instructions work through the accumulator only. Therefore, we have three new instructions, $\texttt{LD s}(j)$ (equivalent to $\texttt{MOV r0,s}(j)$ of two-address machines), $\texttt{ST s}(j)$ (equivalent to $\texttt{MOV s}(j)\texttt{,r0}$), and $\texttt{XCHG s}(j)$ (equivalent to $\texttt{XCHG r0,s}(j)$). To make the comparison with two-address machines more interesting, we assume one-byte encodings for these new instructions, even though this would consume $48/256=3/16$ of the opcode space.
-By adapting the code provided in [C.3.2](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers-2) to the one-address machine, we obtain the following:
+By adapting the [code](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers) to the one-address machine, we obtain the following:
```
PUSH r4 // STACK: e
@@ -3095,15 +3110,16 @@ POP r0 // r0:=x ; ..
RET
```
-We have used 45 instructions: 34 one-byte and 11 three-byte, for a total of 67 bytes. Compared to the 76 bytes used by two- and three-address machines in [C.3.2](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers-2), we see that, again, the one-address register machine code may be denser than that of two-register machines, at the expense of utilizing more opcode space (just as shown in [C.2](#c-2-comparison-of-machine-code-for-sample-leaf-function)). However, this time the extra 3/16 of the opcode space was used for data manipulation instructions, which do not depend on specific arithmetic operations or user functions invoked.
+We have used 45 instructions: 34 one-byte and 11 three-byte, for a total of 67 bytes. Compared to the 76 bytes used by [two- and three-address machines](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers), we see that, again, the one-address register machine code may be denser than that of two-address machines, at the expense of utilizing more [opcode space](#c-2-comparison-of-machine-code-for-sample-leaf-function). However, this time the extra 3/16 of the opcode space was used for data manipulation instructions, which do not depend on specific arithmetic operations or user functions invoked.
+
### C.3.6. One-address register machine, $m=8$ preserved registers
-As explained in [C.3.3](#c-3-3-three-address-and-two-address-register-machines-preserved-registers-2), the preservation of $\texttt{r8}$—$\texttt{r15}$ between subroutine calls does not improve the size of our previously written code, so the one-address machine will use for $m=8$ the same code provided in [C.3.5](#c-3-5-one-address-register-machine-preserved-registers).
+As explained [above](#c-3-3-three-address-and-two-address-register-machines%2C-preserved-registers), the preservation of $\texttt{r8}$ $\ldots$ $\texttt{r15}$ between subroutine calls does not improve the size of our previously written code, so for $m=8$ the one-address machine will use the same [code as before](#c-3-5-one-address-register-machine%2C-preserved-registers).
### C.3.7. One-address register machine, $m=16$ preserved registers
-We simply adapt the code provided in [C.3.4](#c-3-4-three-address-and-two-address-register-machines%2C-preserved-registers) to the one-address register machine:
+We simply adapt the [code](#c-3-4-three-address-and-two-address-register-machines%2C-preserved-registers) to the one-address register machine:
```
PUSH r0 // STACK: a
@@ -3152,7 +3168,7 @@ We have used 40 instructions: 18 one-byte, 11 two-byte, and 11 three-byte, for a
### C.3.8. Stack machine with basic stack primitives
-We reuse the code provided in [C.1.5](#c-1-5-stack-machine-with-basic-stack-primitives), simply replacing arithmetic primitives (VM instructions) with subroutine calls. The only substantive modification is the insertion of the previously optional $\texttt{XCHG s1}$ before the third multiplication, because even an optimizing compiler cannot now know whether $\texttt{CALL r\_mul}$ is a commutative operation. We have also used the "tail recursion optimization" by replacing the final $\texttt{CALL r\_div}$ followed by $\texttt{RET}$ with $\texttt{JMP r\_div}$.
+We reuse the [code](#c-1-5-stack-machine-with-basic-stack-primitives), simply replacing arithmetic primitives (VM instructions) with subroutine calls. The only substantive modification is the insertion of the previously optional $\texttt{XCHG s1}$ before the third multiplication, because even an optimizing compiler cannot now know whether $\texttt{CALL r\_mul}$ is a commutative operation. We have also used the "tail recursion optimization" by replacing the final $\texttt{CALL r\_div}$ followed by $\texttt{RET}$ with $\texttt{JMP r\_div}$.
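Viewed as a compiler transformation, this tail-call rewrite is a purely local peephole pass. A minimal Python sketch (illustrative only; the list-of-strings instruction representation is hypothetical):

```
# Tail-call peephole (illustrative sketch, hypothetical representation):
# a trailing CALL immediately followed by RET becomes a single JMP.

def tail_call_optimize(insns):
    if len(insns) >= 2 and insns[-1] == "RET" and insns[-2].startswith("CALL "):
        return insns[:-2] + ["JMP " + insns[-2][len("CALL "):]]
    return insns

assert tail_call_optimize(["CALL r_mul", "CALL r_div", "RET"]) == \
       ["CALL r_mul", "JMP r_div"]
```

The adapted stack-machine code follows: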
```
PUSH s5 // a b c d e f a
@@ -3190,7 +3206,8 @@ We have used 29 instructions; assuming one-byte encodings for all stack operatio
### C.3.9. Stack machine with compound stack primitives
-We again reuse the code provided in [C.1.7](#c-1-7-stack-machine-with-compound-stack-primitives-and-manually-optimized-code), replacing arithmetic primitives with subroutine calls and making the tail recursion optimization:
+We again reuse the [code](#c-1-7-stack-machine-with-compound-stack-primitives-and-manually-optimized-code), replacing arithmetic primitives with subroutine calls and making the tail recursion optimization:
+
```
PUSH2 s5,s2 // a b c d e f a d
@@ -3221,7 +3238,9 @@ This code uses only 20 instructions, 9 stack-related and 11 control flow-related
## C.4 Comparison of machine code for sample non-leaf function
-[Table 5](#c-4-comparison-of-machine-code-for-sample-non-leaf-function) summarizes the properties of machine code corresponding to the same source file provided in [C.3.1](#c-3-1-sample-source-code-for-a-non-leaf-function). We consider only the "realistically" encoded three-address machines. Three-address and two-address machines have the same code density properties, but differ in the utilization of opcode space. The one-address machine, somewhat surprisingly, managed to produced shorter code than the two-address and three-address machines, at the expense of using up more than half of all opcode space. The stack machine is the obvious winner in this code density contest, without compromizing its excellent extendability (measured in opcode space used for arithmetic and other data transformation instructions).
+[Table 5](#table-5) summarizes the properties of machine code corresponding to the [same source file](#c-3-1-sample-source-code-for-a-non-leaf-function). We consider only the "realistically" encoded three-address machines. Three-address and two-address machines have the same code density properties, but differ in the utilization of opcode space. The one-address machine, somewhat surprisingly, managed to produce shorter code than the two-address and three-address machines, at the expense of using up more than half of all opcode space. The stack machine is the obvious winner in this code density contest, without compromising its excellent extendability (measured in opcode space used for arithmetic and other data transformation instructions).
+
+
| Machine | $m$ | **Operations** | | | **Code bytes** | | | **Opcode space** | | |
|------------------|--------|---------------|-----|-----|----------------|-----|-----|------------------|-----|-----|
@@ -3235,17 +3254,19 @@ This code uses only 20 instructions, 9 stack-related and 11 control flow-related
| stack (basic) | $-$ | 18 | 11 | 29 | 18 | 33 | **51** | 64/256 | **4/256** | 71/256 |
| stack (comp.) | $-$ | 9 | 11 | 20 | 15 | 33 | **48** | 84/256 | **4/256** | 91/256 |
-**Table 5**: A summary of machine code properties for hypothetical 3-address, 2-address, 1-address, and stack machines, generated for a sample non-leaf function (cf. [C.3.1](#c-3-1-sample-source-code-for-a-non-leaf-function)), assuming $m$ of the 16 registers must be preserved by called subroutines.
+**Table 5**: A summary of machine code properties for hypothetical 3-address, 2-address, 1-address, and stack machines, generated for a [sample non-leaf function](#c-3-1-sample-source-code-for-a-non-leaf-function), assuming $m$ of the 16 registers must be preserved by called subroutines.
### C.4.1. Combining with results for leaf functions
-It is instructive to compare this table with the results in [C.2](#c-2-comparison-of-machine-code-for-sample-leaf-function) for a sample leaf function, summarized in [Table 1](#c-2-comparison-of-machine-code-for-sample-leaf-function) (for $m=0$ preserved registers) and the very similar [Table 3](#c-2-4-case-m%3D8%2C-n%3D16%3A-registers-r8…r15-must-be-preserved) (for $m=8$ preserved registers), and, if one is still interested in case $m=16$ (which turned out to be worse than $m=8$ in almost all situations), also to [Table 2](#c-2-3-case-m%3Dn%3D16%3A-all-registers-must-be-preserved).
+It is instructive to compare this table with the results in [C.2](#c-2-comparison-of-machine-code-for-sample-leaf-function) for a sample leaf function, summarized in [Table 1](#table-1) (for $m=0$ preserved registers) and the very similar [Table 3](#table-3) (for $m=8$ preserved registers), and, if one is still interested in case $m=16$ (which turned out to be worse than $m=8$ in almost all situations), also to [Table 2](#table-2).
We see that the stack machine beats all register machines on non-leaf functions. As for the leaf functions, only the three-address machine with the "optimistic" encoding of arithmetic instructions was able to beat the stack machine, winning by 15%, by compromising its extendability. However, the same three-address machine produces 25% longer code for non-leaf functions. If a typical program consists of a mixture of leaf and non-leaf functions in approximately equal proportion, then the stack machine will still win.
### C.4.2. A fairer comparison using a binary code instead of a byte code
-Similarly to [C.2.5](#c-2-5-a-fairer-comparison-using-a-binary-code-instead-of-a-byte-code), we may offer a fairer comparison of different register machines and the stack machine by using arbitrary binary codes instead of byte codes to encode instructions, and matching the opcode space used for data manipulation and arithmetic instructions by different machines. The results of this modified comparison are summarized in [Table 6](#c-4-2-a-fairer-comparison-using-a-binary-code-instead-of-a-byte-code). We see that the stack machines still win by a large margin, while using less opcode space for stack/data manipulation.
+Similarly to [C.2.5](#c-2-5-a-fairer-comparison-using-a-binary-code-instead-of-a-byte-code), we may offer a fairer comparison of different register machines and the stack machine by using arbitrary binary codes instead of byte codes to encode instructions, and matching the opcode space used for data manipulation and arithmetic instructions by different machines. The results of this modified comparison are summarized in [Table 6](#table-6). We see that the stack machines still win by a large margin, while using less opcode space for stack/data manipulation.
+
+
| Machine | $m$ | **Operations** | | | **Code bytes** | | | **Opcode space** | | |
|------------------|--------|---------------|-----|-----|----------------|-----|-----|------------------|-----|-----|
@@ -3260,89 +3281,88 @@ Similarly to [C.2.5](#c-2-5-a-fairer-comparison-using-a-binary-code-instead-of-a
| stack (comp.) | $-$ | 9 | 11 | 20 | 15 | 33 | **48** | 84/256 | **4/256** | 91/256 |
-**Table 6**: A summary of machine code properties for hypothetical 3-address, 2-address, 1-address, and stack machines, generated for a sample non-leaf function (cf. [C.3.1](#c-3-1-sample-source-code-for-a-non-leaf-function)), assuming $m$ of the 16 registers must be preserved by called subroutines. This time we use fractions of bytes to encode instructions, enabling a fairer comparison. Otherwise, this table is similar to [Table 5](#c-4-comparison-of-machine-code-for-sample-non-leaf-function).
+**Table 6**: A summary of machine code properties for hypothetical 3-address, 2-address, 1-address, and stack machines, generated for a [sample non-leaf function](#c-3-1-sample-source-code-for-a-non-leaf-function), assuming $m$ of the 16 registers must be preserved by called subroutines. This time we use fractions of bytes to encode instructions, enabling a fairer comparison. Otherwise, this table is similar to [Table 5](#table-5).
+
### C.4.3. Comparison with real machines
Note that our hypothetical register machines have been considerably optimized to produce shorter code than actually existing register machines; the latter are subject to other design considerations apart from code density and extendability, such as backward compatibility, faster instruction decoding, parallel execution of neighboring instructions, ease of automatically producing optimized code by compilers, and so on.
-For example, the very popular two-address register architecture x86-64 produces code that is approximately twice as long as our "ideal" results for the two-address machines. On the other hand, our results for the stack machines are directly applicable to TVM, which has been explicitly designed with the considerations presented in this appendix in mind. Furthermore, the actual TVM code is even *shorter* (in bytes) than shown in [Table 5](#c-4-comparison-of-machine-code-for-sample-non-leaf-function) because of the presence of the two-byte $\texttt{CALL}$ instruction, allowing TVM to call up to 256 user-defined functions from the dictionary at $\texttt{c3}$. This means that one should subtract 10 bytes from the results for stack machines in [Table 5](#c-4-comparison-of-machine-code-for-sample-non-leaf-function) if one wants to specifically consider TVM, rather than an abstract stack machine; this produces a code size of approximately 40 bytes (or shorter), almost half that of an abstract two-address or three-address machine.
+For example, the very popular two-address register architecture x86-64 produces code that is approximately twice as long as our "ideal" results for the two-address machines. On the other hand, our results for the stack machines are directly applicable to TVM, which has been explicitly designed with the considerations presented in this appendix in mind. Furthermore, the actual TVM code is even *shorter* (in bytes) than shown in [Table 5](#table-5) because of the presence of the two-byte $\texttt{CALL}$ instruction, allowing TVM to call up to 256 user-defined functions from the dictionary at $\texttt{c3}$. This means that one should subtract 10 bytes from the results for stack machines in [Table 5](#table-5) if one wants to specifically consider TVM, rather than an abstract stack machine; this produces a code size of approximately 40 bytes (or shorter), almost half that of an abstract two-address or three-address machine.
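For concreteness, a trivial sketch of this adjustment (the byte counts are those quoted from Table 5 above):

```
# Adjusting Table 5 for TVM's two-byte CALL encoding (sketch): subtract
# the 10 bytes mentioned above from each stack-machine total.

table5_stack_bytes = {"stack (basic)": 51, "stack (comp.)": 48}
tvm_bytes = {k: v - 10 for k, v in table5_stack_bytes.items()}
assert tvm_bytes == {"stack (basic)": 41, "stack (comp.)": 38}   # ~40 bytes
```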
### C.4.4. Automatic generation of optimized code
-An interesting point is that the stack machine code in our samples might have been generated automatically by a very simple optimizing compiler, which rearranges values near the top of the stack appropriately before invoking each primitive or calling a function as explained in [2.2.2](#2-2-2-basic-stack-manipulation-primitives-suffice) and [2.2.5](#2-2-5-semantics-of-compound-stack-operations). The only exception is the unimportant "manual" $\texttt{XCHG3}$ optimization described in [C.1.7](#c-1-7-stack-machine-with-compound-stack-primitives-and-manually-optimized-code), which enabled us to shorten the code by one more byte.
+An interesting point is that the stack machine code in our samples might have been generated automatically by a very simple optimizing compiler, which rearranges values near the top of the stack appropriately before invoking each primitive or calling a function as explained in [2.2.2](#2-2-2-basic-stack-manipulation-primitives-suffice) and [2.2.5](#2-2-5-semantics-of-compound-stack-operations). The only exception is the unimportant [manual](#c-1-7-stack-machine-with-compound-stack-primitives-and-manually-optimized-code) $\texttt{XCHG3}$ optimization, which enabled us to shorten the code by one more byte.
-By contrast, the heavily optimized (with respect to size) code for register machines shown in [C.3.2](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers-2) and [C.3.3](#c-3-3-three-address-and-two-address-register-machines%2C-preserved-registers-2) is unlikely to be produced automatically by an optimizing compiler. Therefore, if we had compared compiler-generated code instead of manually-generated code, the advantages of stack machines with respect to code density would have been even more striking.
+By contrast, the heavily optimized (with respect to size) code for register machines shown in [C.3.2](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers) and [C.3.3](#c-3-3-three-address-and-two-address-register-machines%2C-preserved-registers) is unlikely to be produced automatically by an optimizing compiler. Therefore, if we had compared compiler-generated code instead of manually-generated code, the advantages of stack machines with respect to code density would have been even more striking.
## References
-**1** N. Durov, *Telegram Open Network*, 2017.
+[1] N. Durov, *Telegram Open Network*, 2017.
## Footnotes
-[1] For example, there are no floating-point arithmetic operations (which could be efficiently implemented using hardware-supported *double* type on most modern CPUs) present in TVM, because the result of performing such operations is dependent on the specific underlying hardware implementation and rounding mode settings. Instead, TVM supports special integer arithmetic operations, which can be used to simulate fixed-point arithmetic if needed. [Back ↑](#introduction)
-
-**2** The production version will likely require some tweaks and modifications prior to launch, which will become apparent only after using the experimental version in the test environment for some time. [Back ↑](#introduction)
-
-**3** A high-level smart-contract language might create a visibility of variables for the ease of programming; however, the high-level source code working with variables will be translated into TVM machine code keeping all the values of these variables in the TVM stack. [Back ↑](#1-1-tvm-is-a-stack-machine)
+**1** For example, there are no floating-point arithmetic operations (which could be efficiently implemented using hardware-supported *double* type on most modern CPUs) present in TVM, because the result of performing such operations is dependent on the specific underlying hardware implementation and rounding mode settings. Instead, TVM supports special integer arithmetic operations, which can be used to simulate fixed-point arithmetic if needed. [Back ↑](#ref-fn1)
-**4** In the TON Blockchain context, `c7` is initialized with a singleton *Tuple*, the only component of which is a *Tuple* containing blockchain-specific data. The smart contract is free to modify `c7` to store its temporary data provided the first component of this *Tuple* remains intact. [Back ↑](#1-3-2-list-of-control-registers)
+**2** The production version will likely require some tweaks and modifications prior to launch, which will become apparent only after using the experimental version in the test environment for some time. [Back ↑](#ref-fn2)
-**5** Strictly speaking, there is also the current *library context*, which consists of a dictionary with 256-bit keys and cell values, used to load library reference cells of [3.1.7](#3-1-7-types-of-exotic-cells). [Back ↑](#1-4-total-state-of-tvm-scccg)
+**3** A high-level smart-contract language might create the appearance of variables for ease of programming; however, the high-level source code working with variables will be translated into TVM machine code that keeps all the values of these variables in the TVM stack. [Back ↑](#ref-fn3)
-**6** Our inclusion of `r0` here creates a minor conflict with our assumption that the accumulator register, if present, is also `r0`; for simplicity, we will resolve this problem by assuming that the first argument to a function is passed in the accumulator. [Back ↑](#2-1-5-register-calling-conventions)
+**4** In the TON Blockchain context, `c7` is initialized with a singleton *Tuple*, the only component of which is a *Tuple* containing blockchain-specific data. The smart contract is free to modify `c7` to store its temporary data provided the first component of this *Tuple* remains intact. [Back ↑](#ref-fn4)
-**7** For instance, if one writes a function for extracting square roots, this function will always accept its argument and return its result in the same registers, in contrast with a hypothetical built-in square root instruction, which could allow the programmer to arbitrarily choose the source and destination registers. Therefore, a user-defined function is tremendously less flexible than a built-in instruction on a register machine. [Back ↑](#2-1-7-arguments-to-arithmetic-primitives-on-register-machines)
+**5** Strictly speaking, there is also the current *library context*, which consists of a dictionary with 256-bit keys and cell values, used to load the library reference cells described in [Types of exotic cells](#3-1-7-types-of-exotic-cells). [Back ↑](#ref-fn5)
-**8** Of course, if the second option is used, this will destroy the original arrangement of $x$ in the top of the stack. In this case, one should either issue a `SWAP` before `XCHG s(j')`, or replace the previous operation `XCHG s(i)` with `XCHG s1, s(i)`, so that $x$ is exchanged with `s1` from the beginning. [Back ↑](#2-2-2-basic-stack-manipulation-primitives-suffice)
+**6** Our inclusion of `r0` here creates a minor conflict with our assumption that the accumulator register, if present, is also `r0`; for simplicity, we will resolve this problem by assuming that the first argument to a function is passed in the accumulator. [Back ↑](#ref-fn6)
-**9** Notice that the most common `XCHG s(i)` operation is not really required here if we do not insist on keeping the same temporary value or variable always in the same stack location, but rather keep track of its subsequent locations. We will move it to some other location while preparing the arguments to the next primitive or function call. [Back ↑](#2-2-2-basic-stack-manipulation-primitives-suffice)
+**7** For instance, if one writes a function for extracting square roots, this function will always accept its argument and return its result in the same registers, in contrast with a hypothetical built-in square root instruction, which could allow the programmer to arbitrarily choose the source and destination registers. Therefore, a user-defined function is tremendously less flexible than a built-in instruction on a register machine. [Back ↑](#ref-fn7)
-**10** An alternative, arguably better, translation of `PU`$O'$ `s(i_1)`,...,`s(i_γ)` consists of the translation of $O'$ `s(i_2)`,...,`s(i_γ)`, followed by `PUSH s(i_1+α-1)`; `XCHG s(γ-1)`. [Back ↑](#2-2-5-semantics-of-compound-stack-operations)
+**8** Of course, if the second option is used, this will destroy the original arrangement of $x$ in the top of the stack. In this case, one should either issue a `SWAP` before `XCHG s(j')`, or replace the previous operation `XCHG s(i)` with `XCHG s1, s(i)`, so that $x$ is exchanged with `s1` from the beginning. [Back ↑](#ref-fn8)
-**11** From the perspective of low-level cell operations, these data bits and cell references are not intermixed. In other words, an (ordinary) cell essentially is a couple consisting of a list of up to 1023 bits and of a list of up to four cell references, without prescribing an order in which the references and the data bits should be deserialized, even though TL-B schemes appear to suggest such an order. [Back ↑](#3-1-1-tvm-memory-and-persistent-storage-consist-of-cells)
+**9** Notice that the most common `XCHG s(i)` operation is not really required here if we do not insist on keeping the same temporary value or variable always in the same stack location, but rather keep track of its subsequent locations. We will move it to some other location while preparing the arguments to the next primitive or function call. [Back ↑](#ref-fn9)
-**12** From a theoretical perspective, we might say that a cell $c$ has an infinite sequence of hashes $(\text{Hash}_i(c))_{i\geq1}$, which eventually stabilizes: $\text{Hash}_i(c)\to\text{Hash}_\infty(c)$. Then the level $l$ is simply the largest index $i$, such that $\text{Hash}_i(c)\neq\text{Hash}_\infty(c)$. [Back ↑](#3-1-6-the-higher-hashes-of-a-cell)
+**10** An alternative, arguably better, translation of `PU`$O'$ `s(i_1)`,...,`s(i_γ)` consists of the translation of $O'$ `s(i_2)`,...,`s(i_γ)`, followed by `PUSH s(i_1+α-1)`; `XCHG s(γ-1)`. [Back ↑](#ref-fn10)
-**13** A pruned branch cell $c'$ of level $l$ is *bound* by a Merkle (proof or update) cell $c$ if there are exactly $l$ Merkle cells on the path from $c$ to its descendant $c'$, including $c$. [Back ↑](#3-1-7-types-of-exotic-cells)
+**11** From the perspective of low-level cell operations, these data bits and cell references are not intermixed. In other words, an (ordinary) cell essentially is a couple consisting of a list of up to 1023 bits and of a list of up to four cell references, without prescribing an order in which the references and the data bits should be deserialized, even though TL-B schemes appear to suggest such an order. [Back ↑](#ref-fn11)
-**14** Negative numbers are represented using two's complement. For instance, integer $-17$ is serialized by instruction `STI 8` into bitstring `xEF`. [Back ↑](#3-2-8-integers-in-cells-are-big-endian-by-default)
+**12** From a theoretical perspective, we might say that a cell $c$ has an infinite sequence of hashes $(\text{Hash}_i(c))_{i\geq1}$, which eventually stabilizes: $\text{Hash}_i(c)\to\text{Hash}_\infty(c)$. Then the level $l$ is simply the largest index $i$, such that $\text{Hash}_i(c)\neq\text{Hash}_\infty(c)$. [Back ↑](#ref-fn12)
-**15** A description of an older version of TL may be found at https://core.telegram.org/mtproto/TL. [Back ↑](#3-3-3-serialization-of-hashmaps)
+**13** A pruned branch cell $c'$ of level $l$ is *bound* by a Merkle (proof or update) cell $c$ if there are exactly $l$ Merkle cells on the path from $c$ to its descendant $c'$, including $c$. [Back ↑](#ref-fn13)
-**16** The field's name is useful for representing values of the type being defined in human-readable form, but it does not affect the binary serialization. [Back ↑](#3-3-4-brief-explanation-of-tl-b-schemes)
+**14** Negative numbers are represented using two's complement. For instance, integer $-17$ is serialized by instruction `STI 8` into bitstring `xEF`. [Back ↑](#ref-fn14)
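A quick check of this encoding, as a Python sketch:

```
# Two's-complement serialization of -17 into 8 bits (sketch).
value, width = -17, 8
encoded = value & ((1 << width) - 1)
assert encoded == 0xEF                       # bitstring xEF, as stated
assert format(encoded, "08b") == "11101111"
```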
-**17** This is the "linear negation" operation $(-)^\perp$ of linear logic, hence our notation `~`. [Back ↑](#3-3-4-brief-explanation-of-tl-b-schemes)
+**15** A description of an older version of TL may be found at https://core.telegram.org/mtproto/TL. [Back ↑](#ref-fn15)
-**18** In fact, $f$ may receive $m$ extra arguments and return $m$ modified values, which are passed to the next invocation of $f$. This may be used to implement "map" and "reduce" operations with dictionaries. [Back ↑](#3-3-10-basic-dictionary-operations)
+**16** The field's name is useful for representing values of the type being defined in human-readable form, but it does not affect the binary serialization. [Back ↑](#ref-fn16)
-**19** Versions of this operation may be introduced where $f$ and $g$ receive an additional bitstring argument, equal to the key (for leaves) or to the common prefix of all keys (for forks) in the corresponding subtree. [Back ↑](#3-3-10-basic-dictionary-operations)
+**17** This is the "linear negation" operation $(-)^\perp$ of linear logic, hence our notation `~`. [Back ↑](#ref-fn17)
-**20** If there are no bits of data left in `code`, but there is still exactly one reference, an implicit `JMP` to the cell at that reference is performed instead of an implicit `RET`. [Back ↑](#4-1-4-normal-work-of-tvm%2C-or-the-main-loop)
+**18** In fact, $f$ may receive $m$ extra arguments and return $m$ modified values, which are passed to the next invocation of $f$. This may be used to implement "map" and "reduce" operations with dictionaries. [Back ↑](#ref-fn18)
-**21** Technically, TVM may simply invoke a virtual method `run()` of the continuation currently in `cc`. [Back ↑](#4-1-5-extraordinary-continuations)
+**19** Versions of this operation may be introduced where $f$ and $g$ receive an additional bitstring argument, equal to the key (for leaves) or to the common prefix of all keys (for forks) in the corresponding subtree. [Back ↑](#ref-fn19)
-**22** The already used savelist `cc.save` of the new `cc` is emptied before the execution starts. [Back ↑](#4-1-8-restoring-control-registers-from-the-new-continuation-c)
+**20** If there are no bits of data left in `code`, but there is still exactly one reference, an implicit `JMP` to the cell at that reference is performed instead of an implicit `RET`. [Back ↑](#ref-fn20)
-**23** The implementation of `REPEAT` involves an extraordinary continuation that remembers the remaining number of iterations, the body of the loop $c$, and the return continuation $c'$. (The latter term represents the remainder of the body of the function that invoked `REPEAT`, which would be normally stored in `c0` of the new `cc`.) [Back ↑](#4-2-2-iterated-execution-and-loops)
+**21** Technically, TVM may simply invoke a virtual method `run()` of the continuation currently in `cc`. [Back ↑](#ref-fn21)
-**24** An important point here is that the tree of cells representing a TVM program cannot have cyclic references, so using `CALLREF` along with a reference to a cell higher up the tree would not work. [Back ↑](#4-6-1-the-problem-of-recursion)
+**22** The already used savelist `cc.save` of the new `cc` is emptied before the execution starts. [Back ↑](#ref-fn22)
-**25** This is not exactly true. A more precise statement is that usually the codepage of the newly-created continuation is a known function of the current codepage. [Back ↑](#5-1-1-codepages-in-continuations)
+**23** The implementation of `REPEAT` involves an extraordinary continuation that remembers the remaining number of iterations, the body of the loop $c$, and the return continuation $c'$. (The latter term represents the remainder of the body of the function that invoked `REPEAT`, which would be normally stored in `c0` of the new `cc`.) [Back ↑](#ref-fn23)
-**26** This is another important mechanism of backward compatibility. All values of newly-added types, as well as values belonging to extended original types that do not belong to the original types (e.g., 513-bit integers that do not fit into 257 bits in the example above), are treated by all instructions (except stack manipulation instructions, which are naturally polymorphic, cf. [2.3.3](#2-3-3-polymorphism-of-stack-manipulation-primitives)) in the old codepages as "values of incorrect type", and generate type-checking exceptions accordingly. [Back ↑](#5-1-4-changing-the-behavior-of-old-operations)
+**24** An important point here is that the tree of cells representing a TVM program cannot have cyclic references, so using `CALLREF` along with a reference to a cell higher up the tree would not work. [Back ↑](#ref-fn24)
-**27** If the cell dumps are hexadecimal, encodings consisting of an integral number of hexadecimal digits (i.e., having length divisible by four bits) might be equally convenient. [Back ↑](#5-2-5-tvm-code-is-a-bitcode-not-a-bytecode)
+**25** This is not exactly true. A more precise statement is that usually the codepage of the newly-created continuation is a known function of the current codepage. [Back ↑](#ref-fn25)
-**28** Notice that it is the probability of occurrence in the code that counts, not the probability of being executed. An instruction occurring in the body of a loop executed a million times is still counted only once. [Back ↑](#5-2-9-almost-optimal-encodings)
+**26** This is another important mechanism of backward compatibility. All values of newly-added types, as well as values belonging to extended original types that do not belong to the original types (e.g., 513-bit integers that do not fit into 257 bits in the example above), are treated by all instructions (except stack manipulation instructions, which are naturally polymorphic; see [Polymorphism of stack manipulation primitives](#2-3-3-polymorphism-of-stack-manipulation-primitives)) in the old codepages as "values of incorrect type", and generate type-checking exceptions accordingly. [Back ↑](#ref-fn26)
-**29** Notice that any modifications after launch cannot be done unilaterally; rather they would require the support of at least two-thirds of validators. [Back ↑](#5-3-1-upgradability)
+**27** If the cell dumps are hexadecimal, encodings consisting of an integral number of hexadecimal digits (i.e., having length divisible by four bits) might be equally convenient. [Back ↑](#ref-fn27)
-**30** The preliminary version of TVM does not use codepage -2 for this purpose. This may change in the future. [Back ↑](#b-2-7-codepage-2)
+**28** Notice that it is the probability of occurrence in the code that counts, not the probability of being executed. An instruction occurring in the body of a loop executed a million times is still counted only once. [Back ↑](#ref-fn28)
-**31** It is interesting to compare this code with that generated by optimizing C compilers for the x86-64 architecture. First of all, the integer division operation for x86-64 uses the one-address form, with the (double-length) dividend to be supplied in accumulator pair `r2:r0`. The quotient is also returned in `r0`. As a consequence, two single-to-double extension operations (`CDQ` or `CQO`) and at least one move operation need to be added. Secondly, the encoding used for arithmetic and move operations is less optimistic than in our example above, requiring about three bytes per operation on average. As a result, we obtain a total of 43 bytes for 32-bit integers, and 68 bytes for 64-bit integers. [Back ↑](#c-1-3-two-address-register-machine)
+**29** Notice that any modifications after launch cannot be done unilaterally; rather they would require the support of at least two-thirds of validators. [Back ↑](#ref-fn29)
-**32** Code produced for this function by an optimizing compiler for x86-64 architecture with size-optimization enabled actually occupied 150 bytes, due mostly to the fact that actual instruction encodings are about twice as long as we had optimistically assumed. [Back ↑](#c-3-2-three-address-and-two-address-register-machines%2C-preserved-registers-2)
+**30** The preliminary version of TVM does not use codepage -2 for this purpose. This may change in the future. [Back ↑](#ref-fn30)
+**31** It is interesting to compare this code with that generated by optimizing C compilers for the x86-64 architecture. First of all, the integer division operation for x86-64 uses the one-address form, with the (double-length) dividend to be supplied in accumulator pair `r2:r0`. The quotient is also returned in `r0`. As a consequence, two single-to-double extension operations (`CDQ` or `CQO`) and at least one move operation need to be added. Secondly, the encoding used for arithmetic and move operations is less optimistic than in our example above, requiring about three bytes per operation on average. As a result, we obtain a total of 43 bytes for 32-bit integers, and 68 bytes for 64-bit integers. [Back ↑](#ref-fn31)
+**32** Code produced for this function by an optimizing compiler for x86-64 architecture with size-optimization enabled actually occupied 150 bytes, due mostly to the fact that actual instruction encodings are about twice as long as we had optimistically assumed. [Back ↑](#ref-fn32)
\ No newline at end of file