FVRuleLearner is the rule-learning pipeline for NL2SVA. The current release mainline is the prompt-only path under src/, with JasperGold used for syntax/functionality evaluation.
Paper: https://arxiv.org/abs/2604.03245
BibTeX:

```bibtex
@article{wan2026fvrulelearner,
  title={FVRuleLearner: Operator-Level Reasoning Tree (OP-Tree)-Based Rules Learning for Formal Verification},
  author={Wan, Lily Jiaxin and Ho, Chia-Tung and Bai, Yunsheng and Yu, Cunxi and Chen, Deming and Ren, Haoxing},
  journal={arXiv preprint arXiv:2604.03245},
  year={2026}
}
```

The supported tasks in this release are:

- `nl2sva_human`
- `nl2sva_machine`
- `nl2sva_opencore`
- Create and activate an environment.

  ```bash
  conda create -n fvrulelearner python=3.10
  conda activate fvrulelearner
  ```

- Install dependencies.

  ```bash
  pip install -r requirements.txt
  ```
FVRuleLearner expects:

- either `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`, depending on `LLM_gateaway`
- `jg` available on `PATH`
- `LM_LICENSE_FILE` and `CDS_LIC_FILE` for JasperGold
- optional `FVRULELEARNER_TRAIN_LOGDIR` to override the training log used by inference
- optional `FVRULELEARNER_EVAL_LOGDIR` to override the inference log used by eval-only runs
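As a quick sanity check before running, the requirements above can be verified with a short script (illustrative only; it is not part of the release, and only the variable and binary names from the list above are assumed):

```python
import os
import shutil

# One of the two API keys must be present, depending on LLM_gateaway.
has_llm_key = bool(os.environ.get("OPENAI_API_KEY") or os.environ.get("ANTHROPIC_API_KEY"))

# JasperGold: the `jg` binary must be on PATH, plus the two license variables.
has_jg = shutil.which("jg") is not None
has_license = all(os.environ.get(v) for v in ("LM_LICENSE_FILE", "CDS_LIC_FILE"))

for name, ok in [("LLM API key", has_llm_key),
                 ("jg on PATH", has_jg),
                 ("JasperGold licenses", has_license)]:
    print(f"{name}: {'OK' if ok else 'MISSING'}")
```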
Example:

```bash
export OPENAI_API_KEY=...
# or: export ANTHROPIC_API_KEY=...
export PATH=/path/to/jasper/bin:$PATH
export LM_LICENSE_FILE=5280@cadence.webstore.illinois.edu
export CDS_LIC_FILE=5280@cadence.webstore.illinois.edu
```

Run from the repo root:

```bash
cd FVRuleLearner
```

The main entrypoint is:

```bash
python3 src/main.py
```

Configuration lives in `src/config.py`.
To switch benchmarks, update:

```python
task = "nl2sva_human"
# task = "nl2sva_machine"
# task = "nl2sva_opencore"
```

Dataset mapping in the current release:

- `nl2sva_human` -> `FVEval/data_nl2sva/data/nl2sva_human.csv`
- `nl2sva_machine` -> `FVEval/data_nl2sva/data/nl2sva_machine_updated.csv`
- `nl2sva_opencore` -> `FVEval/data_1k/module_sva_nl_manual_editing.csv`
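The mapping above can be expressed as a small helper (a sketch only; the `DATASETS` dict and `dataset_path` function are hypothetical names, but the task names and CSV paths are taken verbatim from the table):

```python
# Task -> dataset CSV, mirroring the release mapping above (paths relative to repo root).
DATASETS = {
    "nl2sva_human": "FVEval/data_nl2sva/data/nl2sva_human.csv",
    "nl2sva_machine": "FVEval/data_nl2sva/data/nl2sva_machine_updated.csv",
    "nl2sva_opencore": "FVEval/data_1k/module_sva_nl_manual_editing.csv",
}

def dataset_path(task: str) -> str:
    """Return the CSV path for a supported task, or raise listing the valid options."""
    try:
        return DATASETS[task]
    except KeyError:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(DATASETS)}") from None
```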
Training learns suggestions and Q-Trees from the selected NL2SVA training split and writes them into a `train_*` log folder.
Use these core settings in `src/config.py`:

```python
global_task = "train"
task = "nl2sva_opencore"
LLM_gateaway = "openai"
llm_model = "gpt-4o"
# or:
# LLM_gateaway = "claude"
# llm_model = "claude-sonnet-4-5-20250929"
```

Then run:

```bash
python3 src/main.py
```

For `nl2sva_human` or `nl2sva_machine`, change only the `task` value and keep the rest of the flow the same.
Training outputs are written to:

```
src/logs/train_<timestamp>_<hostname>_<user>/
```

Important files in that folder:

- `suggestions.pkl`
- `qtrees.pkl`
- `qtrees.json`
- `log.txt`
- `FLAGS.klepto`
Inference uses a previously trained log folder as the rule source.
Use these core settings in `src/config.py`:

```python
global_task = "inference"
task = "nl2sva_opencore"
LLM_gateaway = "openai"
llm_model = "gpt-4o"
# or:
# LLM_gateaway = "claude"
# llm_model = "claude-sonnet-4-5-20250929"

use_RAG = True
RAG_content = ["Examples", "Suggestions"]
load_suggestions_path = "src/logs/train_<...>"
```

Recommended inference settings for the current mainline are:
```python
similarity_str = ["Question"]
retrieve_str = ["Suggestions"]
Examples_top_k = 1
prompting_instruction = 0
Suggestions_top_k = 3
Suggestions_Reasoning = True
use_qtree_inference = True
retrieval_on_ranking = True
qtree_similarity_top_k = 5
qtree_ranking_mode = "prompt"
rule_source = "generate"
deduplication = True
operator_explanation = True
filter_functionality = True
```

Then run:

```bash
python3 src/main.py
```

The debug settings below are only for short smoke tests or targeted debugging. They are not required for normal release runs.
```python
debug = True

# Only meaningful when debug=True:
# restrict inference/eval to selected dataset row indices.
only_test_ids = [4]

# Optional: restrict the training pool used by retrieval/Q-Tree building.
training_cases = [0, 1, 2, 3]
```

Notes:

- `debug` is optional
- `only_test_ids` is only used when `debug=True`
- `test_mode` is not part of the current release path and is no longer needed
Again, `task` can be switched to `nl2sva_human` or `nl2sva_machine` with the same inference flow.
Inference writes to:

```
src/logs/inference_<timestamp>_<hostname>_<user>/
```

and automatically runs evaluation at the end.
If you already have an inference output folder and only want to re-run postprocessing/evaluation:

```python
global_task = "eval"
task = "nl2sva_opencore"
folder_to_eval = "src/logs/inference_<...>"
```

Then run:

```bash
python3 src/main.py
```

Training saves learned artifacts into the training log folder for the selected task:

- `suggestions.pkl`: learned suggestion traces/rules
- `qtrees.pkl`: serialized Q-Tree corrections
- `qtrees.json`: JSON view of the Q-Trees for inspection/debugging
During inference:

- `load_suggestions_path` should point to the training log directory, not directly to a single `.pkl` file
- the code first looks for `suggestions.pkl`
- if that file is absent, it falls back to `all_suggestions.pkl`
- Q-Tree retrieval also loads `qtrees.json` from the same folder
- `embeddings.npy` may also be created there for retrieval caching
So the flow is:

- Run `train`
- Note the generated `src/logs/train_<...>` folder
- Set `load_suggestions_path` in `inference` to that folder
- Run `inference`
The training and inference `task` should match. For example:

- train with `task = "nl2sva_opencore"` -> infer with `task = "nl2sva_opencore"`
- train with `task = "nl2sva_human"` -> infer with `task = "nl2sva_human"`
- train with `task = "nl2sva_machine"` -> infer with `task = "nl2sva_machine"`
```
src/logs/train_<...>/
    suggestions.pkl
    qtrees.pkl
    qtrees.json
    log.txt
    FLAGS.klepto

src/logs/inference_<...>/
    log.txt
    eval/
        *_sim.csv
        *_jg.csv
        *.csv
        suggestion_usage_analysis.json
```
Typical evaluation files:

- `*_sim.csv`: BLEU / ROUGE / exact match
- `*_jg.csv`: JasperGold syntax / functionality
- `*.csv`: merged summary
- `inference` already triggers evaluation automatically from `src/main.py`
- `eval` is mainly for re-running evaluation on an existing inference folder