Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
187 commits
Select commit Hold shift + click to select a range
f02f16e
Squashed commit of the following:
ailrst Jun 17, 2025
c63068f
delete tv related
ailrst Jun 16, 2025
45807b8
fixups, remove unwanted
ailrst Jun 16, 2025
7c7464c
remove spurious axiom gen
ailrst Jun 16, 2025
39a4f59
scalafmt
ailrst Jun 17, 2025
b17344d
move decl to new file
ailrst Jun 17, 2025
884b419
revert spurious change to main param list
ailrst Jun 17, 2025
c084744
update toscala test
ailrst Jun 17, 2025
268d193
fix GlobalVar old visitor
ailrst Jun 17, 2025
1b88cd5
add symulassign to sva
ailrst Jun 17, 2025
c7eaf22
Merge remote-tracking branch 'origin/main' into tv-ir-changes2
ailrst Jun 17, 2025
bdc10ee
wip add in tv again
ailrst Jun 18, 2025
7485e14
small fix
ailrst Jun 18, 2025
20cdfd8
remove goffs from simp pass
ailrst Jun 19, 2025
9b5b30c
wip, ackermann
ailrst Jun 23, 2025
96f104a
wip
ailrst Jun 25, 2025
0f6f5b7
fix bug
ailrst Jun 27, 2025
28a4cf4
include modified globals in live vars for inv inference
ailrst Jun 27, 2025
2b5f986
fmt
ailrst Jun 27, 2025
9daeab4
helper script
ailrst Jun 27, 2025
d27e4b2
awful hack for ssa
ailrst Jun 27, 2025
0c7d431
Merge remote-tracking branch 'origin/main' into tv-redo
ailrst Jul 1, 2025
7cb4833
more sensible cutpoint handling and ssa
ailrst Jul 1, 2025
919db3a
del comment
ailrst Jul 1, 2025
5231cbb
make tv less wrong
ailrst Jul 2, 2025
a635a4e
separate files
ailrst Jul 2, 2025
304b21e
doc
ailrst Jul 2, 2025
f4b2625
not compiling, wip
ailrst Jul 2, 2025
45c066a
compiling
ailrst Jul 8, 2025
41ed8d7
new file
ailrst Jul 8, 2025
d5a2769
tx setup; not validating
ailrst Jul 8, 2025
6a36c55
wipbroken
ailrst Jul 10, 2025
c50ba03
idk
ailrst Jul 11, 2025
9c72285
Merge remote-tracking branch 'origin/main' into tv-redo
ailrst Jul 14, 2025
8827bbc
call smt solver directly
ailrst Jul 14, 2025
0181228
basic model interpret
ailrst Jul 14, 2025
808eab8
inject model in cfg
ailrst Jul 14, 2025
01abaa0
cleanup
ailrst Jul 14, 2025
e2e73e0
collapse phi wrt trace
ailrst Jul 14, 2025
34ae5ec
fix wrap function
ailrst Jul 15, 2025
e933452
fix bugs in dsa&param inv
ailrst Jul 16, 2025
3bbfc72
log point to failure reason
ailrst Jul 16, 2025
6074e35
move inv to param
ailrst Jul 17, 2025
128b5bb
idk
ailrst Jul 18, 2025
0dbafda
fmt
ailrst Jul 21, 2025
ec4b8bb
nothing
ailrst Jul 21, 2025
00452e3
param mapping for invariant
ailrst Jul 22, 2025
0d8e0d9
use paramlist in invariant
ailrst Jul 22, 2025
5537b7e
dont immediately call solver
ailrst Jul 22, 2025
61a97aa
rm ocaml script
ailrst Jul 22, 2025
c5b4859
Install cvc5
b-paul Jul 22, 2025
faca29c
use cvc5
ailrst Jul 22, 2025
a98cda8
more sensible ssa input var set
ailrst Jul 22, 2025
1a3a8d6
fmt
ailrst Jul 22, 2025
113fc06
Merge remote-tracking branch 'origin/main' into tv-redo
ailrst Jul 22, 2025
8482b9e
fix
ailrst Jul 22, 2025
b0fa34f
nothing
ailrst Jul 23, 2025
2f06b6c
fixmaybe
ailrst Jul 25, 2025
5518fa3
fmt
ailrst Jul 25, 2025
73916f6
add leak statement on guard
ailrst Jul 25, 2025
cb582ce
assertion introduciton escape hatch
ailrst Jul 25, 2025
1bd4ab1
fix
ailrst Jul 28, 2025
f1f1f21
Merge branch 'main' of github.com:UQ-PAC/BASIL into tv-redo
ailrst Jul 28, 2025
3cbbe1b
apt update in doc build
ailrst Jul 28, 2025
5e58f59
fix doc link
ailrst Jul 28, 2025
ec51cb2
tv config
ailrst Jul 28, 2025
6a1f04d
simp mode connect to tests
ailrst Jul 28, 2025
5ebd785
docs
ailrst Jul 28, 2025
3202271
fix test
ailrst Jul 28, 2025
0c69f56
test (broken)
ailrst Jul 28, 2025
6acbdb6
ssa optim broke
ailrst Jul 28, 2025
7b2c89a
transform spec in validated pipeline
ailrst Jul 28, 2025
5d14857
fix
ailrst Jul 28, 2025
14a73be
hacky static stack spec inference
ailrst Jul 29, 2025
88ed972
add new tag and job
ailrst Jul 29, 2025
e8d6eba
Java
ailrst Jul 29, 2025
95ced31
Java
ailrst Jul 29, 2025
8336266
use -T to show shorter stack traces
katrinafyi Jul 29, 2025
4943171
add back ssa optimisation, support floating point uninterp funcs
ailrst Jul 29, 2025
00e5a19
fmt
ailrst Jul 29, 2025
d85aff2
relax timeout tv suite
ailrst Jul 29, 2025
93a4259
fmt
ailrst Jul 29, 2025
0b1a3a0
increase timeout
ailrst Jul 29, 2025
c703522
fix soundness issues
ailrst Jul 29, 2025
783a6c5
add doc
ailrst Jul 29, 2025
bf6214e
strengthen invariant
ailrst Jul 29, 2025
3a30a7f
disable slow test cases
ailrst Jul 29, 2025
cc56058
increase timeout
ailrst Jul 29, 2025
c032cd7
disable slow test
ailrst Jul 29, 2025
c81389f
Merge branch 'main' of github.com:UQ-PAC/BASIL into tv-redo
ailrst Jul 29, 2025
81827c7
fix ssa bugs
ailrst Jul 31, 2025
e3e52a3
ssa perf thing
ailrst Jul 31, 2025
d9289fa
disable leak stmt
ailrst Jul 31, 2025
6073977
mess fixing ssa/ack correctness and perf issues
ailrst Aug 1, 2025
8920861
memoiser
ailrst Aug 1, 2025
23850be
dont use mkstring to construct smt query
ailrst Aug 1, 2025
1f637b9
bgorken [no ci]
ailrst Aug 4, 2025
393baf6
liveness [no ci]
ailrst Aug 5, 2025
b22bec8
bgorken [no ci]
ailrst Aug 5, 2025
65a2ddf
fixes [no ci]
ailrst Aug 6, 2025
93ab3d6
fixes [no ci]
ailrst Aug 6, 2025
b736364
fix [no ci]
ailrst Aug 6, 2025
b6a7ca7
merge main [no ci]
ailrst Aug 6, 2025
740fa95
trying to fix heap exhaustion [no ci]
ailrst Aug 6, 2025
a9bf7e8
docs note
ailrst Aug 6, 2025
d4466ab
docs note
ailrst Aug 6, 2025
4756d95
cleanup and invgen fixes
ailrst Aug 8, 2025
1c1ac32
Merge branch 'main' of github.com:UQ-PAC/BASIL into tv-redo
ailrst Aug 8, 2025
bfa7a46
tv soundness litmus test script
ailrst Aug 8, 2025
8b9626d
cleanup and performance fixes
ailrst Aug 8, 2025
0e048e1
tv code cleanup
ailrst Aug 11, 2025
ac2d35d
refactor broke
ailrst Aug 11, 2025
5726201
Merge branch 'main' of github.com:UQ-PAC/BASIL into tvredobrokerefactor
ailrst Aug 12, 2025
ab907e1
stuff
ailrst Aug 12, 2025
cfa8796
Merge branch 'main' of github.com:UQ-PAC/BASIL into tvredobrokerefactor
ailrst Aug 12, 2025
569fb44
fix
ailrst Aug 12, 2025
d2a1355
commenting
ailrst Aug 12, 2025
9030d17
fix runutils
ailrst Aug 12, 2025
a7b0d91
improve tv flags
ailrst Aug 12, 2025
e381473
reintroduce leak statments
ailrst Aug 12, 2025
70cff47
put back separaate tv systemtest
ailrst Aug 12, 2025
80c54e2
emable case split on entry block
ailrst Aug 14, 2025
cc41138
wip composing invariants
ailrst Aug 14, 2025
ac0e45d
doc
ailrst Aug 14, 2025
cee90dd
fix large phi join of assertfails
ailrst Aug 15, 2025
51192d3
docs tweak
ailrst Aug 15, 2025
da13147
dont use composed dsa and copyprop;broken
ailrst Aug 15, 2025
1a74880
fixes
ailrst Aug 20, 2025
cb32683
fix sourceassumefail
agle Sep 10, 2025
c6a8728
fix offsetprop clob
agle Sep 10, 2025
dbcfc46
reinstate branch leak
agle Sep 10, 2025
ff6284c
fix flow sensitivity of offsetprop clobber
agle Sep 10, 2025
6cd8094
fix loop header lineup
agle Sep 11, 2025
c494334
trace vars for calls
agle Sep 15, 2025
91a9638
Merge branch 'main' of github.com:UQ-PAC/BASIL into tv-redo
agle Sep 15, 2025
4991456
rm debugprints
agle Sep 15, 2025
72fe2ad
restrict range of synth pc
agle Sep 15, 2025
aaf7062
add dryrun
agle Sep 16, 2025
9edd773
read-uninit
agle Sep 16, 2025
ec5c964
read-uninit check, run liveness on all procedures
agle Sep 17, 2025
71099c9
format
agle Sep 17, 2025
d1c7cb2
cleanup
agle Sep 17, 2025
e144aa1
add invariant
agle Sep 17, 2025
0a29073
ensure r30 set when used
agle Sep 17, 2025
d2fc2cc
fix inlining
agle Sep 17, 2025
c1a5131
fix-simp
agle Sep 17, 2025
cfda25d
identify stack collapses
agle Sep 17, 2025
73c1f5c
ensure not re-analysing liveness
agle Sep 18, 2025
b9ce4bf
fix flow sensitivity of offsetprop clobber
agle Sep 10, 2025
bf85730
Merge remote-tracking branch 'origin/main' into tv-redo
agle Sep 18, 2025
1474236
fixup
agle Sep 18, 2025
1e95df0
print
agle Sep 18, 2025
00fc9c3
fmt
agle Sep 18, 2025
0c8eaab
run liveness even when main is external
agle Sep 18, 2025
a674c4a
ite phi node structure
agle Sep 18, 2025
ab3260f
check read-uninit in frontend
agle Sep 18, 2025
0fb38b6
zero init variables in offline lifter to handle cases like 0x8b031041
agle Sep 19, 2025
a6c881f
read-uninit check during simplify
agle Sep 19, 2025
89a52f7
basic statement counting
katrinafyi Sep 22, 2025
79af955
blah guard complexity counter
katrinafyi Sep 22, 2025
40ad890
scalafmt
katrinafyi Sep 22, 2025
d41cae2
fix memory transform prog literal
agle Sep 24, 2025
37b8618
disable garbage
agle Sep 24, 2025
49edaab
Merge branch 'read-uninit' into tv-stats
agle Sep 24, 2025
a3ef17e
tailcall to block after entry
agle Sep 24, 2025
cb5502d
make new ReadUninitialised per procedure and run read-uninit in tv
katrinafyi Sep 29, 2025
f1a9206
add RemoveUnreachableBlocks to simplifyCFGValidated
katrinafyi Sep 29, 2025
4a72a89
add back readUninitialised assertion
katrinafyi Sep 29, 2025
2be9fe2
scripts
agle Sep 29, 2025
2cf3570
write logsimp
agle Oct 8, 2025
a50432e
guardinv
agle Oct 9, 2025
3a7ec42
disable ite phi
agle Oct 9, 2025
18e9381
add ite handling in JavaSMT backend
katrinafyi Oct 9, 2025
f59ecb4
fix guards on jump table
agle Oct 10, 2025
ab09b9c
add asserts for range of ddisasm resolved indirect jumps
agle Oct 10, 2025
34a8542
re-enable proc splitting
agle Oct 10, 2025
01983b5
remove guardcleanup invariant
agle Oct 10, 2025
a7ec434
antlr
agle Oct 10, 2025
19f9e19
yet again fix smt fapply
agle Oct 10, 2025
4173480
remove leak
agle Oct 10, 2025
6a133d9
comment broken rewrite rules
agle Oct 14, 2025
9210362
add-counterexample
agle Oct 14, 2025
c2de392
screaming
agle Oct 16, 2025
c22b958
assign pc after asserting value
agle Oct 20, 2025
41e42fd
smt filenames separators, fmt
agle Oct 22, 2025
154808e
increase fuel to handle very large blocks in cntlm crypto functions
katrinafyi Oct 22, 2025
06eda97
separate logger for tv-eval-marker. NOTE! scripts need `--vl log.tv-e…
katrinafyi Oct 22, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .github/workflows/run-examples.yml
Original file line number Diff line number Diff line change
Expand Up @@ -249,6 +249,7 @@ jobs:
AnalysisSystemTest2,
AnalysisSystemTest3,
AnalysisSystemTest4,
TVSystemTest,
]
fail-fast: false

Expand Down
13 changes: 11 additions & 2 deletions build.mill
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,7 @@ import basilmill.BasilDocs
import basilmill.BasilVersion
import basilmill.ProfileModule
import basilmill.Z3Module
//import basilmill.CVC5Module

import os.Path

Expand Down Expand Up @@ -49,8 +50,12 @@ object `package` extends ScalaModule with BasilDocs with BasilVersion with Scala
val aslpOffline = mvn"io.github.uq-pac::lifter:0.1.0"
val javaSmt = mvn"org.sosy-lab:java-smt:5.0.0"
val javaSmtZ3 = mvn"org.sosy-lab:javasmt-solver-z3:4.14.0"
val javaSmtCVC5 = mvn"org.sosy-lab:javasmt-solver-cvc5:1.2.1-g8594a8e4dc"
val cats_collections = mvn"org.typelevel::cats-collections-core:0.9.10"
val cats_core = mvn"org.typelevel::cats-core:2.13.0"
val cats_kernel = mvn"org.typelevel::cats-kernel:2.13.0"

override def mvnDeps = Seq(scalactic, sourceCode, mainArgs, upickle, aslpOffline, javaSmt, javaSmtZ3)
override def mvnDeps = Seq(scalactic, sourceCode, mainArgs, upickle, aslpOffline, javaSmt, javaSmtZ3, javaSmtCVC5, cats_kernel, cats_core, cats_collections)

override def repositoriesTask = Task.Anon {
super.repositoriesTask() :+ MavenRepository(
Expand All @@ -67,6 +72,7 @@ object `package` extends ScalaModule with BasilDocs with BasilVersion with Scala

override def moduleDir = BuildCtx.workspaceRoot / "src"


override def sources = Task.Sources("main/scala")

override def forkArgs = Task {
Expand Down Expand Up @@ -223,6 +229,7 @@ object `package` extends ScalaModule with BasilDocs with BasilVersion with Scala
}

object z3 extends Z3Module
// object cvc5 extends CVC5Module

def ctagsConfig = Task.Source {
BuildCtx.workspaceRoot / "basilmill" / "scala.ctags"
Expand Down Expand Up @@ -255,7 +262,9 @@ object `package` extends ScalaModule with BasilDocs with BasilVersion with Scala
def runProfile(profileDest: String, args: String*) = Task.Command {
println(s"Profiling: you may want to set\n sudo sysctl kernel.perf_event_paranoid=1\n sudo sysctl kernel.kptr_restrict=0\n")
val prof = asyncProf.path()
os.call(("java", s"-agentpath:${prof}=start,event=cpu,file=${profileDest}", "-jar", assembly().path.toString, args), stdout = os.Inherit, cwd = BuildCtx.workspaceRoot)
val oargs : Seq[String] = forkArgs()
val realArgs : Seq[String] = oargs ++ Seq(s"-agentpath:${prof}=start,event=cpu,file=${profileDest}")
os.call(Seq("java") ++ realArgs ++ Seq( "-jar", assembly().path.toString) ++ args, stdout = os.Inherit, cwd = BuildCtx.workspaceRoot)
}

}
4 changes: 4 additions & 0 deletions docs/src/SUMMARY.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,10 @@
# Analyses

- [Data Structure Analysis](development/dsa.md) Basil Memory Analysis
- [Translation Validation](development/tv/index.md)
- [Public API](development/tv/tv-api.md)
- [Implementation](development/tv/implementation.md)
- [Assumptions](development/tv/assumptions.md)

# Development

Expand Down
16 changes: 16 additions & 0 deletions docs/src/development/profiling.md
Original file line number Diff line number Diff line change
Expand Up @@ -34,4 +34,20 @@ sudo sysctl kernel.perf_event_paranoid=1
sudo sysctl kernel.kptr_restrict=0
```

## Identifying memory leaks

Memory leaks can be identified using async profiler.

E.g. build an assembly `./mill assembly` to get `out/assembly.dest/out.jar`

Then use asprof with flags `-e alloc --total --live`.

```
./async-profiler-4.0-linux-x64/bin/asprof -e alloc -o flamegraph --live -f alloc.html --total -d 60 out.jar
```

This will show the total allocation count only for references that are still live, allocations wiht a reference still retained
somewhere.

[related blog article](https://web.archive.org/web/20240228031308/https://krzysztofslusarski.github.io/2022/11/27/async-live.html).

113 changes: 113 additions & 0 deletions docs/src/development/tv/assumptions.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,113 @@
# Lifter Assumptions

A transform pass may want drive a code-transform with a reasonable assumption
about the behaviour, that it cannot technically prove with analysis.

One such example is the `AssumeCallPreserved` pass which modifies
procedure parameter lists on the assumption that the program
respects the arm64 calling convention.

To allow this, we introduce an assertion to the program, encoding the
assumption that was made by the transform. When the program
is eventually verified in Boogie, we discharge this assumption,
proving that our assumption was sound.
Comment on lines +10 to +13
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to be clear, are the assertions introduced by the TV or by the AssumeCallPreserved pass itself? i /think/ it's the second, so the assertions will be in the boogie program which gets verified.


The translation validation pipeline proves that as long as
the assumption holds, the programs have the same behaviour.
Because the introduction of an assert statement changes the
program behaviour, we need to add the specification that
traces on which the assertion fails can be ignored.

Procedure procedure `_start` is an interesting example.

We can use the script in `scripts/soundnesslitmuscvc5wrapper.sh` to get the number of
program statements contributing to the unsat core of the program:
Comment on lines +23 to +24
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hyperlink to script? link checking makes this reliable now



```

tvsmt/simplifyCFG-_start_4213248.smt2
66% contributed 171/259 asserts

tvsmt/Parameters-_start_4213248.smt2
66% contributed 145/219 asserts

tvsmt/AssumeCallPreserved-_start_4213248.smt2
68% contributed 177/260 asserts

tvsmt/DSA-_start_4213248.smt2
53% contributed 169/316 asserts

tvsmt/CopyProp-_start_4213248.smt2
4% contributed 11/257 asserts

tvsmt/GuardCleanup-_start_4213248.smt2
7% contributed 10/138 asserts

```

The procedure `_start` contains an unresolved indirect tail call,
so the pass AssumeCallPreserved injects assertions that don't hold (but TV passes as its not TV's responsibility to prove them),
but then subsequent analyses that leverage this assumption are not sound and we get tv passing trivially due to
`assert eq(0x404a34:bv64, 0x404b64:bv64) { .comment = "R30 = R30_in" }; ~> assert false`. This is the expected behavior, a
program with an assert false is obviously a program that doesn't verify.

This unsat core is showing roughly that the translation validation is passing vacuously because
the copyprop transform has derived false from the assertion we introduced.

```
cvc5 --dump-unsat-cores tvsmt/CopyProp-_start_4213248.smt2
unsat
(
source5
source47
source54
source56
source51
source57
source53
source58
source69
source70
source48
)
```

<details>
<summary>full _start IL after copyprop pass</summary>

```

(R0_in:bv64, R10_in:bv64, R11_in:bv64, R12_in:bv64, R13_in:bv64, R14_in:bv64, R15_in:bv64, R16_in:bv64, R17_in:bv64, R18_in:bv64, R1_in:bv64, R29_in:bv64, R2_in:bv64, R30_in:bv64, R31_in:bv64, R3_in:bv64, R4_in:bv64, R5_in:bv64, R6_in:bv64, R7_in:bv64, R8_in:bv64, R9_in:bv64, _PC_in:bv64)
-> (R0_out:bv64, R1_out:bv64, R2_out:bv64, R3_out:bv64, R4_out:bv64, R5_out:bv64, R6_out:bv64, R7_out:bv64, _PC_out:bv64)
{ .name = "_start"; .address = 0x404a00 }
[
block %_start_entry {.address = 0x404a00; .originalLabel = "ufjL9zmpTde18uF80OwPVQ=="} [
var R5_2: bv64 := bvor(0x0:bv64, bvshl(R0_in:bv64, 0x0:bv64));
var var1_4213376_bv64_1: bv64 := load le $mem R31_in:bv64 64;
var var2_4202816_bv64_1: bv64 := load le $mem bvadd(0x430000:bv64, 0x40:bv64) 64;
assert eq(0x404b64:bv64, 0x404b64:bv64) { .comment = "R30 = R30_in" };
var (R0_4:bv64, R10_2:bv64, R11_2:bv64, R12_2:bv64, R13_2:bv64, R14_2:bv64, R15_2:bv64, R16_4:bv64, R17_3:bv64, R18_2:bv64, R1_3:bv64, R29_3:bv64, R2_3:bv64, R3_3:bv64, R4_3:bv64, R5_3:bv64, R6_3:bv64, R7_2:bv64, R8_2:bv64, R9_2:bv64)
:= call @__libc_start_main (0x404a34:bv64, R10_in:bv64, R11_in:bv64, R12_in:bv64, R13_in:bv64, R14_in:bv64, R15_in:bv64, 0x430040:bv64, var2_4202816_bv64_1:bv64, R18_in:bv64, var1_4213376_bv64_1:bv64, 0x0:bv64, bvadd(R31_in:bv64, 0x8:bv64), 0x404b64:bv64, 0x0:bv64, 0x0:bv64, R5_2:bv64, R31_in:bv64, R7_in:bv64, R8_in:bv64, R9_in:bv64);
goto(%phi_5);
];
block %_start_10 {.address = 0x404a30; .originalLabel = "eH7LoljnQS6XhgPVv4Qipg=="} [
assert eq(0x404a34:bv64, 0x404b64:bv64) { .comment = "R30 = R30_in" };
var var2_4203488_bv64_1: bv64 := load le $mem bvadd(0x430000:bv64, 0x190:bv64) 64;
assert eq(0x404a34:bv64, 0x404a34:bv64) { .comment = "R30 = R30_in" };
var (R0_6:bv64, R10_4:bv64, R11_4:bv64, R12_4:bv64, R13_4:bv64, R14_4:bv64, R15_4:bv64, R16_7:bv64, R17_5:bv64, R18_4:bv64, R1_5:bv64, R29_5:bv64, R2_5:bv64, R3_5:bv64, R4_5:bv64, R5_5:bv64, R6_5:bv64, R7_4:bv64, R8_4:bv64, R9_4:bv64)
:= call @abort (R0_4:bv64, R10_2:bv64, R11_2:bv64, R12_2:bv64, R13_2:bv64, R14_2:bv64, R15_2:bv64, 0x430190:bv64, var2_4203488_bv64_1:bv64, R18_2:bv64, R1_3:bv64, R29_3:bv64, R2_3:bv64, 0x404a34:bv64, R3_3:bv64, R4_3:bv64, R5_3:bv64, R6_3:bv64, R7_2:bv64, R8_2:bv64, R9_2:bv64);
goto(%phi_6);
];
block %phi_5 {.originalLabel = "eH7LoljnQS6XhgPVv4Qipg==, ufjL9zmpTde18uF80OwPVQ=="} [
goto(%_start_10);
];
block %phi_6 {.originalLabel = "eH7LoljnQS6XhgPVv4Qipg=="} [
goto(%_start_return);
];
block %_start_return [
return (R0_6:bv64, R1_5:bv64, R2_5:bv64, R3_5:bv64, R4_5:bv64, R5_5:bv64, R6_5:bv64, R7_4:bv64, _PC_in:bv64);
]
];
```
</details>
170 changes: 170 additions & 0 deletions docs/src/development/tv/implementation.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,170 @@
# Translation Validation Implementation

At the highest level the translation validation takes two programs, an invariant linking them
and produces, constructs product program describing the simultaneous execution of the two programs,
and verifies this program satisfies the invariant.
Comment on lines +3 to +5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
At the highest level the translation validation takes two programs, an invariant linking them
and produces, constructs product program describing the simultaneous execution of the two programs,
and verifies this program satisfies the invariant.
At the highest level, the translation validation takes two programs and an invariant linking them
and produces a product program describing the simultaneous execution of the two programs.
Then, it verifies the product program satisfies the invariant.


We initially describe the translation pipeline and structures used throughout this process.

## Phases

### Cut-Transition System

- `TransitionSystem.scala`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link?

- Transforms a Basil IR program to an equivalent acyclic Basil IR program

A transition system describes one acyclic aggregate program step. This step breaks a single
Basil IR program into a program which represents a single acyclic step at a time.
This effectively fans out every loop in the program into a single loop which is equivalent
to the original program.
Comment on lines +16 to +19
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think my understanding is that the transition system conversion gives a program which runs one "step". it is the infinite repetition of this "step" program which is equivalent to the original program. is this right?


#### Cut transform:

1. Create a new entry and exit
2. Use program entry as a cut link it to the new entry and guard with a specific PC value `ENTRY`
3. Use program exit as a cut link it to the new exit and set with a specific PC value `RETURN`
2. Identify each loop header as a cut, set `PC := Loop$i` and redirect through exit, add edge
from entry to header guarded by a pc value `Loop$i`

### Monadic Local Side-Effect Form

- `SSADAG.scala` and `Ackermann.scala`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link? and so on for similar such mentions in the doc


This translates the Basil IR program to a program containing three statement types:

1. (Simultaneous) assignment `(a, b, c) := (d, e, f)` (Scala type `SimulAssign`)
2. Side effect calls : `(TRACE, a, b, c) := EffName (TRACE, d, e, f)` (Scala type `SideEffectStatement`).
This is an uninterpreted function call with multiple/tuple return.
3. Assumes / Guards

Note the `TRACE` boolean-typed variable here which represents the state passed through the program.
This is where the "monadic" terminology comes from. A boolean type is sufficient here as it only
needs to represent the truth value of the equivalence between the source and target trace.

Think of this `TRACE` as an oracle represnting the entire universe, i.e. we assume
the precondition `TRACE_source == TRACE_target`. This is assuming the programs execute at the
same time in the same universe state; thus it captures external IO, assuming
both programs will always receive identical external inputs if they are invariantly
in the same state.

A frame analysis is used to identify the interprocedural effects of calls. This transform
pulls these side effects (memory access, global variable access) into the parameter list
of the side-effect statement.

### SSA Form

This performs a naive SSA transform (not considering loops) on the Monadic program.

It introduces reachability predicates (`${blockName}_done`) for the end of every block.
This predicate is the conjunction of

1. the disjunction of the reachability of its predcessors and
2. conjunction of all assume statements in the block.

Note the phi nodes have a slightly odd structure so they fit in the existing Basil IR.
In the below code, the assume at the start of block `l3` represents the phi node
joining `l1` and `l2`.


```c
block l1 [
r0_0 := 1; // was r0 := 1
goto l3;
];
block l2 [
r0_1 := 2; // was r0 := 2
goto l3;
];
block l3 [
assume (l1_done ==> r0_3 == r0_0 && l2_done ==> r0_3 == r0_2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should r0_2 be r0_1? to match that assigned in block l2

ret;
];
```

This transform returns a function which renames an un-indexed expression
to one in terms of the ssa-indexed variables defined at a given block.

### Ackermannisation

- This is an invariant inference pass perfomed on the SSA-form program

This is a transform which soundly performs the reasoning about the correspondence of
side effects in the product program ahead of verification-time.

At a high level, assume we have side-effect statements in the source and target program below:

```
// source program:
(source_TRACE_1, source_R0_1) := source_Load_mem_64 (source_TRACE, source_R1);
// target program:
(target_TRACE_1, target_R0_1) := target_Load_mem_64 (target_TRACE, target_R1);
```

Analagous to the congruence rule of uninterpreted functions we have the axiom:

```
\forall ts, tt, r0s, r0t :: tt == ts && r0s == r0t
==> source_Load_mem_64(ts, t0s) == target_Load_mem_64(tt, t0t)
```

I.e. these loads have the same effect as long as they are the same address and occur
in the same state.

We would want to instantiate this axiom whenever we have two corresponding
source and target loads, but really we only care about those that
are already likely to line up. Instead of letting the SMT solver
decide when to instantiate this axiom we use the control-flow graph,
and the requirement that transforms must preserve the order and number
of side-effects to instantiate exactly only the instances of the axiom that
the verification will need.
Comment on lines +115 to +119
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
are already likely to line up. Instead of letting the SMT solver
decide when to instantiate this axiom we use the control-flow graph,
and the requirement that transforms must preserve the order and number
of side-effects to instantiate exactly only the instances of the axiom that
the verification will need.
are already likely to line up. Instead of letting the SMT solver
decide when to instantiate this axiom, we use the control-flow graph,
along with the requirement that transforms must preserve the order and number
of side-effects, to instantiate only the instances of the axiom that
the verification will need.

also, is this precise or an overapproximation? if approximate, it "might need" would be better than "will need".


This is done by walking the source and target CFGs in lockstep,
identifying matching side-effects and adding the
body of the axiom as an assertion to the verification condition.

- After this is performed all `SideEffectStatement` are removed from the program.

### Passified Form

Since we have SSA form the semantics of assignment are unneccessary, we replace
every assignment with an `Assume` stating the equality of assignees.

We now have a program consisting only of `Assume` statements.

### SMT

- `TranslationValidate.scala`

- Infer invariant component at each cut and rename for the SSA renaming at the corresponding cut
- Rename free variables for ssa indexes for synth entry precondition and emit assertion
- Rename free variables for ssa indexes for synth exit and emit negated assertion
- Add every assume from the passified program to the SMT query
- Add the initial invariant to the SMT query, add the negated exit-invariant to the SMT query.

This is built with `JavaSMT` and Basil's internal SMT builder.


## Debugging

### Validation Failure

When immediate verification is enabled, and `sat` is returned, the validator emits an `.il` file and a
CFG for a containing a fake representation of the passified product
program. It attempts to annotate the CFG with the model, however note that it often
incorrectly relates source variables to target variables (due to mis-alignment of blocks, assigns, SSA-indexing),
so this cannot be taken as given.

### Unsoundness

A litmus-test for the soundness of the verification is to generate the unsat core for the dumped SMT query.
If the verification is substantive, the unsat core should contain the entire transition system:
assertions named `source$number` and `tgt$number`.


# Split Optimisation

For large procedures we break down the proof based on the entry cut. Because we always ahve the precondition
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
For large procedures we break down the proof based on the entry cut. Because we always ahve the precondition
For large procedures, we break down the proof based on the entry cut. Because we always have the precondition

that we start in the same entry cut, for each possible entry we select the edge corresponding to that
entry and remove all other outgoing edges from the entry point. This assumption is then fully propogated
through by a dead code elimination. (In fact we remove the edge before the SSA pass so the flow is removed from the program).

Loading