-
Notifications
You must be signed in to change notification settings - Fork 153
Description
Hi!
I noticed that in the Cargo.toml file Link-Time Optimization (LTO) for the project is not enabled. I suggest switching it on since it will reduce the binary size (always a good thing to have) and will likely improve the application's performance a bit due to more aggressive optimizations. If you want to read more about LTO and its possible modes, I recommend starting from this Rustc documentation. Additionally, codegen-units = 1 (CU1) option can help too in a similar to LTO way, so I recommend to enable it as well.
I recommend enabling LTO only for Release builds so developers experience won't be affected by the increased build time. Actually, I can propose to use flags directly from this ripgrep profile.
Basically, it can be enabled with the following lines to the root Cargo.toml file:
[profile.release]
codegen-units = 1
lto = true # FatLTO - the most aggressive LTO version
<possible other options like strip = true>
I have made quick tests (AMD Ryzen 9 5900x, Fedora 43, Rust 1.94, the latest version of the project at the moment, vargo build --release command, without stripping) - here are the results:
| Build configuration / Binary name | verus |
cargo-verus |
rust_verify |
libverus_builtin_macros.so |
libverus_state_machines_macros.so |
|---|---|---|---|---|---|
| Current Release | 4 Mib | 1.9 Mib | 20 Mib | 13 Mib | 7.9 Mib |
| Release + FatLTO + CU1 | 3.1 Mib | 1.4 Mib | 14 Mib | 11 Mib | 6.3 Mib |
Regarding build times - I haven't saved build times for each component, but the average increase was something about x2 (twice) from enabling FatLTO + CU1 compared to current Release profile. However, build time increase shouldn't be a problem since we enable it only for the Release profile (and these binaries will be automatically released on GitHub Releases page). From performance perspective, I expect that FatLTO + CU1 is also the most performant configuration (due to the most aggressive compiler optimizations) - however, I haven't made such benches yet.
This change does not affect the z3 binary since it's downloaded from the Z3 upstream prebuilt binaries. I've checked their scripts and see some LTO mentions here and there, but I am not sure if the used by Verus Z3 binary is compiled with LTO. Anyway, it's a different discussion topic :)
I decided to create a dedicated issue about LTO with some actual numbers since the corresponding discussion is a bit outdated and didn't get much attention.
Thank you.