---
title: "Assembler functions"
---

import { Aside } from '/snippets/aside.jsx';

Functions in Tolk may be defined using assembler code.
This is a low-level feature that requires a deep understanding of stack layout, [Fift](/languages/fift/overview), and [TVM](/tvm/overview).

## Standard functions are actually `asm` wrappers

Many functions from [stdlib](/languages/tolk/features/standard-library) are translated to Fift assembler directly.

For example, TVM has a `HASHCU` instruction: "calculate hash of a cell".
It pops a cell from the stack and pushes an integer in the range 0 to 2^256-1.
Therefore, the method `cell.hash` is defined this way:

```tolk
@pure
fun cell.hash(self): uint256
    asm "HASHCU"
```

The type system guarantees that when this method is invoked, a TVM `CELL` will be the topmost element (`self`).

## Custom functions are declared in the same way

```tolk
@pure
fun incThenNegate(v: int): int
    asm "INC" "NEGATE"
```

A call to `incThenNegate(10)` is translated into these two instructions.

A good practice is to specify `@pure` if the body does not modify TVM state or throw exceptions.

The return type for `asm` functions is mandatory (for regular functions, it's auto-inferred from `return` statements).

<Aside type="note">
  The list of assembler commands can be found here: [TVM instructions](/tvm/instructions).
</Aside>

## Multi-line asm

To embed a multi-line command, use triple quotes:

```tolk
fun hashStateInit(code: cell, data: cell): uint256 asm """
    DUP2
    HASHCU
    ...
    ONE HASHEXT_SHA256
"""
```

The block is treated as a single string and inserted as-is into the Fift output.
In particular, it may contain `//` comments, which are valid Fift comments.

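As a minimal sketch of the point about comments, the earlier `incThenNegate` function could be rewritten in the triple-quoted form, with Fift `//` comments tracking the stack. The stack annotations are illustrative only:

```tolk
@pure
fun incThenNegate(v: int): int asm """
    INC       // v+1
    NEGATE    // -(v+1)
"""
```

With this definition, `incThenNegate(10)` still evaluates to `-11`: `INC` turns 10 into 11, and `NEGATE` flips the sign.
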
## Stack order for multiple slots

When calling a function, arguments are pushed in the declared order.
The last parameter becomes the topmost stack element.

If an instruction produces several slots, the return type should be a tensor or a struct.

For example, here is a function `abs2` that calculates `abs()` for two values at once: `abs2(-5, -10)` = `(5, 10)`.
The stack layout (top on the right) is written in comments.

```tolk
fun abs2(v1: int, v2: int): (int, int)
    asm             // v1 v2
        "ABS"       // v1 v2_abs
        "SWAP"      // v2_abs v1
        "ABS"       // v2_abs v1_abs
        "SWAP"      // v1_abs v2_abs
```

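A short usage sketch: because `abs2` returns a tensor, the call site can destructure both slots at once. The `demo` function is hypothetical and only illustrates the call; `abs2` is repeated in compact form so that the snippet is self-contained.

```tolk
fun abs2(v1: int, v2: int): (int, int)
    asm "ABS" "SWAP" "ABS" "SWAP"

// hypothetical caller, not part of stdlib
fun demo(): int {
    val (a, b) = abs2(-5, -10);   // a = 5, b = 10
    return a + b;
}
```
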
## Rearranging arguments on the stack

Sometimes a function accepts parameters in an order different from what a TVM instruction expects.
For example, `GETSTORAGEFEE` expects the order "cells bits seconds workchain".
But for a clearer API, workchain should be passed first.
Stack positions can be reordered via the `asm(...)` syntax:

```tolk
fun calculateStorageFee(workchain: int8, seconds: int, bits: int, cells: int): coins
    asm(cells bits seconds workchain) "GETSTORAGEFEE"
```

The same applies to return values. If multiple slots are returned and they must be reordered to match the declared type,
use the `asm(-> ...)` syntax:

```tolk
fun asmLoadCoins(s: slice): (slice, int)
    asm(-> 1 0) "LDVARUINT16"
```

The input and output sides may be combined: `asm(... -> ...)`.
Reordering is mostly used with `mutate` variables.

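A sketch of the combined form, assuming the output indices count result slots from the deepest one, as in the `LDVARUINT16` example. The function name `dictGetNextUnsigned` is hypothetical; the instructions are standard TVM dictionary primitives.

```tolk
// DICTUGETNEXT expects (pivot dict keyLen) and pushes (value key -1) or (0);
// NULLSWAPIFNOT2 pads the "not found" case to (null null 0).
// Input side:  parameters are rearranged to (pivot dict keyLen).
// Output side: (value key found) is rearranged to (key value found).
@pure
fun dictGetNextUnsigned(dict: cell?, keyLen: int, pivot: int): (int?, slice?, bool)
    asm(pivot dict keyLen -> 1 0 2) "DICTUGETNEXT" "NULLSWAPIFNOT2"
```

The parameter list keeps a natural reading order (dictionary first), while the `asm(...)` header adapts both directions to what the instruction consumes and produces.
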
## `mutate` and `self` in assembler functions

The `mutate` keyword (see [mutability](/languages/tolk/syntax/mutability)) works
by implicitly returning new values via the stack, both for regular and `asm` functions.

For better understanding, let's look at regular functions first.
The compiler does all transformations automatically:

```tolk
// transformed to: "returns (int, void)"
fun increment(mutate x: int): void {
    x += 1;
    // a hidden "return x" is inserted
}

fun demo() {
    var x = 5;
    // transformed to: (newX, _) = increment(x); x = newX
    increment(mutate x);
}
```

How can `increment()` be implemented via `asm`?

```tolk
fun increment(mutate x: int): void
    asm "INC"
```

The function still returns `void` (from the type system's perspective it does not return a value),
but `INC` leaves a number on the stack: this is the hidden "return x" that the compiler inserts in the regular variant.

`mutate self` works the same way.
An `asm` function should leave `newSelf` on the stack beneath the actual result (that is, pushed before it):

```tolk
// "TPUSH" pops (tuple, value) and pushes (newTuple);
// so, newSelf = newTuple, and return `void` (syn. "unit")
fun tuple.push<X>(mutate self, value: X): void
    asm "TPUSH"

// "LDU" pops (slice) and pushes (int, newSlice);
// with `asm(-> 1 0)`, we make it (newSlice, int);
// so, newSelf = newSlice, and return `int`
fun slice.loadMessageFlags(mutate self): int
    asm(-> 1 0) "4 LDU"
```

To return `self` for chaining, just specify `self` as the return type:

```tolk
// "STU" pops (int, builder) and pushes (newBuilder);
// with `asm(op self)`, we put arguments in the correct order;
// so, newSelf = newBuilder, and return `void`;
// but to make it chainable, `self` instead of `void`
fun builder.storeMessageOp(mutate self, op: int): self
    asm(op self) "32 STU"
```

## `asm` is compatible with structures

Methods for structures may also be declared as assembler functions, provided the stack layout is known: fields are placed sequentially.
For instance, a struct with a single field is identical to that field on the stack.

```tolk
struct MyCell {
    private c: cell
}

@pure
fun MyCell.hash(self): uint256
    asm "HASHCU"
```

Similarly, a structure may be used instead of a tensor for return values.
This pattern is widely used in `map<K, V>` methods over TVM dictionaries:

```tolk
struct MapLookupResult<TValue> {
    private readonly rawSlice: slice?
    isFound: bool
}

@pure
fun map<K, V>.get(self, key: K): MapLookupResult<V>
    builtin
// it produces `DICTGET` and similar, which push
// (slice -1) or (null 0), matching the shape of MapLookupResult
```

## Generics in `asm` should be single-slot

Take `tuple.push` as an example. The `TPUSH` instruction pops `(tuple, someVal)` and pushes `(newTuple)`.
It should work with any `T`: `int`, `int8`, `slice`, etc.

```tolk
fun tuple.push<T>(mutate self, value: T): void
    asm "TPUSH"
```

A reasonable question: how should `t.push(somePoint)` work?
The stack would be misaligned, because `Point { x, y }` is not a single slot.
The answer: this would not compile.

```ansi
dev.tolk:6:5: error: can not call `tuple.push<T>` with T=Point, because it occupies 2 stack slots in TVM, not 1

    // in function `main`
   6 |     t.push(somePoint);
     |     ^^^^^^
```

Only regular and built-in generic functions may be instantiated with variadic (multi-slot) type arguments; `asm` functions cannot.

## Do not use `asm` for micro-optimizations

Introduce assembler functions only for rarely-used TVM instructions that are not covered by stdlib.
For example, when manually parsing merkle proofs or calculating extended hashes.

However, attempting to micro-optimize with `asm` instead of writing straightforward code is discouraged.
The compiler is smart enough to generate optimal bytecode from consistent logic.
For instance, it automatically inlines simple functions, so one-liner methods can be created without worrying about gas:

```tolk
fun builder.storeFlags(mutate self, flags: int): self {
    // storeUint takes the value first, then the bit length
    return self.storeUint(flags, 32);
}
```

The function above is better than a "manually optimized" `asm` version using `32 STU`, because:

- it is inlined automatically
- for constant `flags`, it is merged with subsequent stores into `STSLICECONST`

See [compiler optimizations](/languages/tolk/features/compiler-optimizations).