feat: tolk asm-functions #1510
base: main
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,235 @@ | ||
| --- | ||
| title: "Assembler functions" | ||
| --- | ||
|
|
||
| import { Aside } from '/snippets/aside.jsx'; | ||
|
|
||
| Functions in Tolk may be defined using assembler code. | ||
| It's a low-level feature that requires a deep understanding of stack layout, [Fift](/languages/fift/overview), and [TVM](/tvm/overview). | ||
|
|
||
| ## Standard functions are actually `asm` wrappers | ||
|
|
||
| Many functions from [stdlib](/languages/tolk/features/standard-library) are translated to Fift assembler directly. | ||
|
|
||
| For example, TVM has a `HASHCU` instruction: "calculate hash of a cell". | ||
| It pops a cell from the stack and pushes an integer in the range 0 to 2^256-1. | ||
| Therefore, the method `cell.hash` is defined this way: | ||
|
|
||
| ```tolk | ||
| @pure | ||
| fun cell.hash(self): uint256 | ||
| asm "HASHCU" | ||
| ``` | ||
|
|
||
| The type system guarantees that when this method is invoked, a TVM `CELL` will be the topmost element (`self`). | ||
|
|
||
| ## Custom functions are declared in the same way | ||
|
|
||
| ```tolk | ||
| @pure | ||
| fun incThenNegate(v: int): int | ||
| asm "INC" "NEGATE" | ||
| ``` | ||
|
|
||
| A call `incThenNegate(10)` will be translated into those commands. | ||
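As a quick check of the stack effect, the sketch below repeats the declaration and calls it from a `main` entrypoint. The entrypoint and the literal `10` are assumptions chosen for illustration only.

```tolk
@pure
fun incThenNegate(v: int): int
    asm "INC" "NEGATE"

// hypothetical entrypoint, only for illustration
fun main(): int {
    // stack: 10, then INC gives 11, then NEGATE gives -11
    return incThenNegate(10);
}
```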
|
|
||
| A good practice is to specify `@pure` if the body does not modify TVM state or throw exceptions. | ||
|
|
||
| The return type for `asm` functions is mandatory (for regular functions, it's auto-inferred from `return` statements). | ||
|
|
||
| <Aside type="note"> | ||
| The list of assembler commands can be found here: [TVM instructions](/tvm/instructions). | ||
| </Aside> | ||
|
|
||
| ## Multi-line asm | ||
|
|
||
| To embed a multi-line command, use triple quotes: | ||
|
|
||
| ```tolk | ||
| fun hashStateInit(code: cell, data: cell): uint256 asm """ | ||
| DUP2 | ||
| HASHCU | ||
| ... | ||
| ONE HASHEXT_SHA256 | ||
| """ | ||
| ``` | ||
|
|
||
| It is treated as a single string and inserted as-is into Fift output. | ||
| In particular, it may contain `//` comments inside (valid comments for Fift). | ||
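A minimal sketch of a complete multi-line body with Fift comments, assuming the behavior described above. The function name `incThenNegateMultiline` is hypothetical, and the comments inside the string are passed to Fift unchanged.

```tolk
@pure
fun incThenNegateMultiline(v: int): int asm """
    // stack: v
    INC
    // stack: v+1
    NEGATE
    // stack: -(v+1)
"""
```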
|
|
||
| ## Stack order for multiple slots | ||
|
|
||
| When calling a function, arguments are pushed in the declared order. | ||
| The last parameter becomes the topmost stack element. | ||
|
|
||
| If an instruction results in several slots, the resulting type should be a tensor or a struct. | ||
|
|
||
| For example, write a function `abs2` that calculates `abs()` for two values at once: `abs2(-5, -10)` = `(5, 10)`. | ||
| The stack layout (the rightmost element is the top) is written in comments. | ||
|
|
||
| ```tolk | ||
| fun abs2(v1: int, v2: int): (int, int) | ||
| asm // v1 v2 | ||
| "ABS" // v1 v2_abs | ||
| "SWAP" // v2_abs v1 | ||
| "ABS" // v2_abs v1_abs | ||
| "SWAP" // v1_abs v2_abs | ||
|
Comment on lines +49 to +76
[HIGH] Ellipsis token and trailing inline comments in code examples
Please leave a reaction 👍/👎 to this suggestion to improve future reviews for everyone! |
||
| ``` | ||
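A short usage sketch for `abs2`, assuming the declaration above (repeated here without the layout comments). The `main` entrypoint and the variable names `a` and `b` are hypothetical. The returned tensor is destructured in the declared order.

```tolk
fun abs2(v1: int, v2: int): (int, int)
    asm "ABS" "SWAP" "ABS" "SWAP"

// hypothetical entrypoint, only for illustration
fun main(): int {
    // a = 5, b = 10, as stated in the text: abs2(-5, -10) = (5, 10)
    val (a, b) = abs2(-5, -10);
    return a + b;
}
```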
|
|
||
| ## Rearranging arguments on the stack | ||
|
|
||
| Sometimes a function accepts parameters in an order different from what a TVM instruction expects. | ||
| For example, `GETSTORAGEFEE` expects the order "cells bits seconds workchain". | ||
| For a clearer API, however, the workchain should be passed first. | ||
| Stack positions can be reordered via the `asm(...)` syntax: | ||
|
|
||
| ```tolk | ||
| fun calculateStorageFee(workchain: int8, seconds: int, bits: int, cells: int): coins | ||
| asm(cells bits seconds workchain) "GETSTORAGEFEE" | ||
| ``` | ||
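At the call site, arguments are written in the declared order, and the reordering stays hidden inside the `asm(...)` declaration. The sketch below assumes the declaration above is the only one in scope. The helper `dailyFeeForOneCell` and its literal values are hypothetical.

```tolk
fun calculateStorageFee(workchain: int8, seconds: int, bits: int, cells: int): coins
    asm(cells bits seconds workchain) "GETSTORAGEFEE"

// hypothetical helper: fee for one cell of 1023 bits kept for 86400 seconds in workchain 0
fun dailyFeeForOneCell(): coins {
    // written as (workchain, seconds, bits, cells);
    // placed on the stack as "cells bits seconds workchain" before GETSTORAGEFEE
    return calculateStorageFee(0, 86400, 1023, 1);
}
```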
|
|
||
| The same applies to return values. If multiple slots are returned and must be reordered to match the declared type, | ||
| use the `asm(-> ...)` syntax: | ||
|
|
||
| ```tolk | ||
| fun asmLoadCoins(s: slice): (slice, int) | ||
| asm(-> 1 0) "LDVARUINT16" | ||
| ``` | ||
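Because the return value is reordered to `(slice, int)`, the remaining slice comes first when destructuring. A minimal sketch, assuming the declaration above; the helper `loadTwoAmounts` and its variable names are hypothetical.

```tolk
fun asmLoadCoins(s: slice): (slice, int)
    asm(-> 1 0) "LDVARUINT16"

// hypothetical helper: reads two consecutive values from a slice
fun loadTwoAmounts(s: slice): (int, int) {
    // the first element is the rest of the slice, the second is the loaded value
    val (rest, first) = asmLoadCoins(s);
    val (_, second) = asmLoadCoins(rest);
    return (first, second);
}
```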
|
|
||
| Both the input and output sides may be combined: `asm(... -> ...)`. | ||
| Reordering is mostly used with `mutate` variables. | ||
|
|
||
| ## `mutate` and `self` in assembler functions | ||
|
|
||
| The `mutate` keyword (see [mutability](/languages/tolk/syntax/mutability)) works | ||
| by implicitly returning new values via the stack — both for regular and `asm` functions. | ||
|
|
||
| For better understanding, let's look at regular functions first. | ||
| The compiler does all transformations automatically: | ||
|
|
||
| ```tolk | ||
| // transformed to: "returns (int, void)" | ||
| fun increment(mutate x: int): void { | ||
| x += 1; | ||
| // a hidden "return x" is inserted | ||
| } | ||
|
|
||
| fun demo() { | ||
| var x = 5; | ||
| // transformed to: (newX, _) = increment(x); x = newX | ||
| increment(mutate x); | ||
| } | ||
| ``` | ||
|
|
||
| How to implement `increment()` via asm? | ||
|
|
||
| ```tolk | ||
| fun increment(mutate x: int): void | ||
| asm "INC" | ||
| ``` | ||
|
|
||
| The function still returns `void` (from the type system's perspective it does not return a value), | ||
| but `INC` leaves a number on the stack, which plays the role of the hidden `return x` in the regular variant. | ||
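The call site is the same for the `asm` variant as for the regular one. A minimal sketch, assuming the `asm` declaration above; the `main` entrypoint and the initial value are hypothetical.

```tolk
fun increment(mutate x: int): void
    asm "INC"

// hypothetical entrypoint, only for illustration
fun main(): int {
    var x = 5;
    // INC leaves the new value on the stack, and the compiler assigns it back to x
    increment(mutate x);
    // x is now 6
    return x;
}
```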
|
|
||
| Similarly, it works for `mutate self`. | ||
| An `asm` function should place `newSelf` onto the stack before the actual result: | ||
|
|
||
| ```tolk | ||
| // "TPUSH" pops (tuple, value) and pushes (newTuple); | ||
| // so, newSelf = newTuple, and return `void` (syn. "unit") | ||
| fun tuple.push<X>(mutate self, value: X): void | ||
| asm "TPUSH" | ||
|
|
||
| // "LDU" pops (slice) and pushes (int, newSlice); | ||
| // with `asm(-> 1 0)`, we make it (newSlice, int); | ||
| // so, newSelf = newSlice, and return `int` | ||
| fun slice.loadMessageFlags(mutate self): int | ||
| asm(-> 1 0) "4 LDU" | ||
| ``` | ||
|
|
||
| To return `self` for chaining, just specify a return type: | ||
|
|
||
| ```tolk | ||
| // "STU" pops (int, builder) and pushes (newBuilder); | ||
| // with `asm(op self)`, we put arguments to correct order; | ||
|
Comment on lines +107 to +153
[HIGH] First-person pronouns in prose and comments
The explanatory prose uses “let's” (“For better understanding, let's look at regular functions first.”) and code comments use “we” (“we make it (newSlice, int)”, “we put arguments to correct order”) to refer to authors and the reader. The style guide’s “Don't get personal” rule explicitly forbids first‑person plural pronouns for authors and inclusive phrasing that addresses the reader, marking such cases as HIGH severity. This wording shifts the tone from objective reference documentation to conversational narration, which the project aims to avoid. |
||
| // so, newSelf = newBuilder, and return `void`; | ||
| // but to make it chainable, `self` instead of `void` | ||
| fun builder.storeMessageOp(mutate self, op: int): self | ||
| asm(op self) "32 STU" | ||
| ``` | ||
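Returning `self` allows calls to be chained. The sketch below uses a hypothetical method name, `storeOpcode32`, with the same body as above to avoid clashing with any existing declaration. `beginCell()` and `endCell()` are assumed to be available from the standard library, and the opcode values are arbitrary.

```tolk
// hypothetical name; same declaration shape as the method above
fun builder.storeOpcode32(mutate self, op: int): self
    asm(op self) "32 STU"

// hypothetical helper: two chained stores
fun buildTwoOps(): cell {
    return beginCell()
        .storeOpcode32(0x10)
        .storeOpcode32(0x20)
        .endCell();
}
```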
|
|
||
| ## `asm` is compatible with structures | ||
|
|
||
| Methods for structures may also be declared as assembler functions, provided the stack layout is known: fields are placed sequentially. | ||
| For instance, a struct with one field is identical to this field. | ||
|
|
||
| ```tolk | ||
| struct MyCell { | ||
| private c: cell | ||
| } | ||
|
|
||
| @pure | ||
| fun MyCell.hash(self): uint256 | ||
| asm "HASHCU" | ||
| ``` | ||
|
|
||
| Similarly, a structure may be used instead of tensors for returns. | ||
| This is widely practiced in `map<K, V>` methods over TVM dictionaries: | ||
|
|
||
| ```tolk | ||
| struct MapLookupResult<TValue> { | ||
| private readonly rawSlice: slice? | ||
| isFound: bool | ||
| } | ||
|
|
||
| @pure | ||
| fun map<K, V>.get(self, key: K): MapLookupResult<V> | ||
| builtin | ||
| // it produces `DICTGET` and similar, which push | ||
| // (slice -1) or (null 0) — the shape of MapLookupResult | ||
| ``` | ||
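The same idea can be sketched for a custom `asm` function. Assuming fields occupy consecutive stack slots in declaration order, as described above, a two-field struct can replace a `(int, int)` tensor. The names `AbsPair` and `abs2AsStruct` are hypothetical.

```tolk
// hypothetical struct: two single-slot fields, placed on the stack as (first, second)
struct AbsPair {
    first: int
    second: int
}

// ABS SWAP ABS SWAP leaves (v1_abs, v2_abs), which matches the field order of AbsPair
fun abs2AsStruct(v1: int, v2: int): AbsPair
    asm "ABS" "SWAP" "ABS" "SWAP"
```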
|
|
||
| ## Generics in `asm` should be single-slot | ||
|
|
||
| Take `tuple.push` as an example. The `TPUSH` instruction pops `(tuple, someVal)` and pushes `(newTuple)`. | ||
| It should work with any `T`: int, int8, slice, etc. | ||
|
|
||
| ```tolk | ||
| fun tuple.push<T>(mutate self, value: T): void | ||
| asm "TPUSH" | ||
| ``` | ||
|
|
||
| A reasonable question: how should `t.push(somePoint)` work? | ||
| The stack would be misaligned, because `Point { x, y }` is not a single slot. | ||
| The answer: this would not compile. | ||
|
|
||
| ```ansi | ||
| dev.tolk:6:5: error: can not call `tuple.push<T>` with T=Point, because it occupies 2 stack slots in TVM, not 1 | ||
|
|
||
| // in function `main` | ||
| 6 | t.push(somePoint); | ||
| | ^^^^^^ | ||
| ``` | ||
|
|
||
| Only regular and built-in generics may be instantiated with variadic type arguments; `asm` functions cannot. | ||
|
|
||
| ## Do not use `asm` for micro-optimizations | ||
|
|
||
| Introduce assembler functions only for rarely used TVM instructions that are not covered by stdlib, | ||
| for example, when manually parsing Merkle proofs or calculating extended hashes. | ||
|
|
||
| Micro-optimizing with `asm` instead of writing straightforward code is discouraged. | ||
| The compiler is smart enough to generate optimal bytecode from consistent logic. | ||
| For instance, it automatically inlines simple functions, so one-liner methods can be created without worrying about gas: | ||
|
|
||
| ```tolk | ||
| fun builder.storeFlags(mutate self, flags: int): self { | ||
| return self.storeUint(flags, 32); | ||
| } | ||
| ``` | ||
|
|
||
| The function above is preferable to a "manually optimized" `asm` version with `32 STU` because: | ||
|
|
||
| - it is inlined automatically | ||
| - for constant `flags`, it's merged with subsequent stores into `STSLICECONST` | ||
|
|
||
| See [compiler optimizations](/languages/tolk/features/compiler-optimizations). | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| --- | ||
| title: "Compiler optimizations" | ||
| --- | ||
|
|
||
| import { Stub } from '/snippets/stub.jsx'; | ||
|
|
||
| <Stub | ||
| issue="1128" | ||
| /> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| --- | ||
| title: "Standard library of Tolk" | ||
| sidebarTitle: "Standard library" | ||
| --- | ||
|
|
||
| import { Stub } from '/snippets/stub.jsx'; | ||
|
|
||
| <Stub | ||
| issue="1128" | ||
| /> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| --- | ||
| title: "Mutability" | ||
| --- | ||
|
|
||
| import { Stub } from '/snippets/stub.jsx'; | ||
|
|
||
| <Stub | ||
| issue="1128" | ||
| /> |
[HIGH] Banned intensifier “actually” in heading
The H2 heading “Standard functions are actually `asm` wrappers” uses the intensifier “actually”, which the style guide lists as a banned hedge/intensifier. This weakens the neutral, factual tone required for technical documentation. The “Hedges and intensifiers” rule marks these terms as HIGH severity because they add emotional emphasis instead of information.