Commit c2e9f7a

feat: tolk asm-functions
1 parent 420eede commit c2e9f7a

File tree

1 file changed

+235
-0
lines changed

@@ -0,0 +1,235 @@
---
title: "Assembler functions"
---

import { Aside } from '/snippets/aside.jsx';

Functions in Tolk may be defined using assembler code.
It's a low-level feature that requires a deep understanding of stack layout, [Fift](/languages/fift/overview), and [TVM](/tvm/overview).

## Standard functions are actually `asm` wrappers

Many functions from [stdlib](/languages/tolk/features/standard-library) are translated to Fift assembler directly.

For example, TVM has a `HASHCU` instruction: "calculate hash of a cell".
It pops a cell from the stack and pushes an integer in the range 0 to 2^256-1.
Therefore, the method `cell.hash` is defined this way:

```tolk
@pure
fun cell.hash(self): uint256
    asm "HASHCU"
```

The type system guarantees that when this method is invoked, a TVM `CELL` will be the topmost element (`self`).

## Custom functions are declared in the same way

```tolk
@pure
fun incThenNegate(v: int): int
    asm "INC" "NEGATE"
```

A call `incThenNegate(10)` is translated into exactly these instructions.

A good practice is to specify `@pure` if the body does not modify TVM state or throw exceptions.

The return type for `asm` functions is mandatory (for regular functions, it's auto-inferred from `return` statements).

<Aside type="note">
  The list of assembler commands can be found here: [TVM instructions](/tvm/instructions).
</Aside>

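As a quick sanity check, the call above can be traced by hand. The sketch below assumes a hypothetical `main` entrypoint and an arbitrary error code `100`; only `incThenNegate` comes from the text.

```tolk
@pure
fun incThenNegate(v: int): int
    asm "INC" "NEGATE"

// `main` and the error code 100 are illustrative assumptions
fun main() {
    // stack trace: 10 -> INC -> 11 -> NEGATE -> -11
    val r = incThenNegate(10);
    assert (r == -11) throw 100;
}
```
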
## Multi-line asm

To embed a multi-line command, use triple quotes:

```tolk
fun hashStateInit(code: cell, data: cell): uint256 asm """
    DUP2
    HASHCU
    ...
    ONE HASHEXT_SHA256
"""
```

The whole block is treated as a single string and inserted as-is into the Fift output.
In particular, it may contain `//` comments, which are valid comments in Fift.

## Stack order for multiple slots

When a function is called, its arguments are pushed in the declared order.
The last parameter becomes the topmost stack element.

If an instruction results in several slots, the return type should be a tensor or a struct.

For example, consider a function `abs2` that calculates `abs()` for two values at once: `abs2(-5, -10)` = `(5, 10)`.
The stack layout (the rightmost element is the top) is written in comments.

```tolk
fun abs2(v1: int, v2: int): (int, int)
    asm             // v1 v2
        "ABS"       // v1 v2_abs
        "SWAP"      // v2_abs v1
        "ABS"       // v2_abs v1_abs
        "SWAP"      // v1_abs v2_abs
```

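A caller receives the two result slots as a tensor, which can be destructured. The following is a minimal sketch; the `main` entrypoint and the error codes are assumptions made for illustration.

```tolk
fun abs2(v1: int, v2: int): (int, int)
    asm "ABS" "SWAP" "ABS" "SWAP"

// `main` and the error codes 101/102 are illustrative assumptions
fun main() {
    // arguments are pushed in declared order: -5 first, then -10 on top
    val (a, b) = abs2(-5, -10);
    assert (a == 5) throw 101;
    assert (b == 10) throw 102;
}
```
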
## Rearranging arguments on the stack

Sometimes a function accepts parameters in an order different from what a TVM instruction expects.
For example, `GETSTORAGEFEE` expects the order "cells bits seconds workchain".
For a clearer API, however, the workchain should be passed first.
Stack positions can be reordered via the `asm(...)` syntax:

```tolk
fun calculateStorageFee(workchain: int8, seconds: int, bits: int, cells: int): coins
    asm(cells bits seconds workchain) "GETSTORAGEFEE"
```

Return values can be reordered in the same way. If an instruction returns multiple slots in an order that does not match the declared return type,
use the `asm(-> ...)` syntax:

```tolk
// "LDVARUINT16" pops (slice) and pushes (int, newSlice);
// with `asm(-> 1 0)`, the result becomes (newSlice, int)
fun asmLoadCoins(s: slice): (slice, int)
    asm(-> 1 0) "LDVARUINT16"
```

Both the input and output sides may be combined: `asm(... -> ...)`.
Reordering is mostly used with `mutate` variables.

## `mutate` and `self` in assembler functions

The `mutate` keyword (see [mutability](/languages/tolk/syntax/mutability)) works
by implicitly returning new values via the stack, for both regular and `asm` functions.

To see how this works, look at regular functions first.
The compiler does all transformations automatically:

```tolk
// transformed to: "returns (int, void)"
fun increment(mutate x: int): void {
    x += 1;
    // a hidden "return x" is inserted
}

fun demo() {
    var x = 5;
    // transformed to: (newX, _) = increment(x); x = newX
    increment(mutate x);
}
```

The same `increment()` can be implemented via `asm`:

```tolk
fun increment(mutate x: int): void
    asm "INC"
```

The function still returns `void` (from the type system's perspective it does not return a value),
but `INC` leaves a number on the stack, which plays the role of the hidden "return x" from the regular variant.

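From the caller's side, the `asm` variant behaves like the regular one. A minimal sketch, assuming a hypothetical `main` entrypoint and an arbitrary error code:

```tolk
fun increment(mutate x: int): void
    asm "INC"

// `main` and the error code 100 are illustrative assumptions
fun main() {
    var x = 5;
    // INC leaves 6 on the stack; the compiler assigns it back to `x`
    increment(mutate x);
    assert (x == 6) throw 100;
}
```
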
The same applies to `mutate self`.
An `asm` function should place `newSelf` onto the stack before the actual result:

```tolk
// "TPUSH" pops (tuple, value) and pushes (newTuple);
// so, newSelf = newTuple, and return `void` (syn. "unit")
fun tuple.push<X>(mutate self, value: X): void
    asm "TPUSH"

// "LDU" pops (slice) and pushes (int, newSlice);
// with `asm(-> 1 0)`, we make it (newSlice, int);
// so, newSelf = newSlice, and return `int`
fun slice.loadMessageFlags(mutate self): int
    asm(-> 1 0) "4 LDU"
```

To return `self` for chaining, just specify `self` as the return type:

```tolk
// "STU" pops (int, builder) and pushes (newBuilder);
// with `asm(op self)`, we put the arguments into the correct order;
// so, newSelf = newBuilder, and return `void`;
// but to make it chainable, `self` instead of `void`
fun builder.storeMessageOp(mutate self, op: int): self
    asm(op self) "32 STU"
```

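Returning `self` is what allows calls to be chained. The sketch below assumes the stdlib function `beginCell()` and a hypothetical `main` entrypoint; the opcode values are arbitrary placeholders.

```tolk
fun builder.storeMessageOp(mutate self, op: int): self
    asm(op self) "32 STU"

// `main` and the opcode values are illustrative assumptions
fun main() {
    // each call returns the updated builder, so calls can be chained
    var b = beginCell()
        .storeMessageOp(0x01)
        .storeMessageOp(0x02);
    // a non-chained call also works: `b` is updated through `mutate self`
    b.storeMessageOp(0x03);
}
```
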
## `asm` is compatible with structures

Methods for structures may also be declared as assembler functions, given the stack layout: fields are placed sequentially.
For instance, a struct with one field is identical to this field on the stack.

```tolk
struct MyCell {
    private c: cell
}

@pure
fun MyCell.hash(self): uint256
    asm "HASHCU"
```

Similarly, a structure may be used instead of a tensor as the return type.
This is widely practiced in `map<K, V>` methods over TVM dictionaries:

```tolk
struct MapLookupResult<TValue> {
    private readonly rawSlice: slice?
    isFound: bool
}

@pure
fun map<K, V>.get(self, key: K): MapLookupResult<V>
    builtin
// it produces `DICTGET` and similar, which push
// (slice -1) or (null 0), matching the shape of MapLookupResult
```

## Generics in `asm` should be single-slot

Take `tuple.push` as an example. The `TPUSH` instruction pops `(tuple, someVal)` and pushes `(newTuple)`.
It should work with any `T`: int, int8, slice, etc.

```tolk
fun tuple.push<T>(mutate self, value: T): void
    asm "TPUSH"
```

A reasonable question: how should `t.push(somePoint)` work?
The stack would be misaligned, because `Point { x, y }` is not a single slot.
The answer: this would not compile.

```ansi
dev.tolk:6:5: error: can not call `tuple.push<T>` with T=Point, because it occupies 2 stack slots in TVM, not 1

    // in function `main`
   6 |     t.push(somePoint);
     |     ^^^^^^
```

Only regular and built-in generic functions may be instantiated with type arguments that occupy more than one stack slot; `asm` functions cannot.

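One possible way around this restriction, not prescribed by the text above, is to push the fields one at a time so that each instantiation uses a single-slot type. The sketch assumes a `Point` struct with two `int` fields (as the error message implies), the stdlib functions `createEmptyTuple()` and `tuple.size()`, and a hypothetical `main` entrypoint.

```tolk
// `Point`, `main`, and the error code are illustrative assumptions
struct Point {
    x: int
    y: int
}

fun main() {
    val p = Point { x: 10, y: 20 };
    var t = createEmptyTuple();
    t.push(p.x);    // T = int: one stack slot, accepted
    t.push(p.y);    // T = int: one stack slot, accepted
    assert (t.size() == 2) throw 100;
}
```
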
## Do not use `asm` for micro-optimizations

Introduce assembler functions only for rarely-used TVM instructions that are not covered by stdlib.
For example, when manually parsing Merkle proofs or calculating extended hashes.

However, micro-optimizing with `asm` instead of writing straightforward code is discouraged.
The compiler is smart enough to generate optimal bytecode from consistent logic.
For instance, it automatically inlines simple functions, so one-liner methods can be created without worrying about gas:

```tolk
fun builder.storeFlags(mutate self, flags: int): self {
    // storeUint takes the value first, then the bit length
    return self.storeUint(flags, 32);
}
```

The function above is better than a "manually optimized" version written as `32 STU`, because:

- it is inlined automatically
- for constant `flags`, it's merged with subsequent stores into `STSLICECONST`

See [compiler optimizations](/languages/tolk/features/compiler-optimizations).
0 commit comments

Comments
 (0)