The ViPlan benchmark is designed to be run on SLURM clusters, and the scripts in this directory are tailored for that purpose. If you are using a different cluster manager, you may need to modify the scripts accordingly, or directly run the Python scripts in the viplan/experiments directory (see the main README for direct Python usage).
Important: All scripts in sh_scripts are designed to be run from the root directory of the repository (e.g. cd ViPlan && ./sh_scripts/slurm_cluster/run_blocksworld.sh).
- local/: scripts for running experiments locally (sequential execution)
- slurm_cluster/: SLURM array job scripts for cluster execution
Within each directory, the "big" scripts run larger VLMs that require two GPUs, and the "cpu" scripts run API models that do not need a GPU for inference (a GPU is still requested for the renderer).
The two main entry points are run_blocksworld.sh and run_igibson.sh, which run the Blocksworld and Household environments, respectively. The SLURM versions are located at sh_scripts/slurm_cluster and the local versions at sh_scripts/local.
Parameters for run_blocksworld.sh / run_igibson.sh:
| Parameter | Type | Description |
|---|---|---|
| --experiment_name | string | Name used to organize result files |
| --run_predicates | boolean | Run VLM-as-Grounder experiments (default: true for BW, false for iGibson) |
| --run_vila | boolean | Run VLM-as-Planner experiments (default: true) |
| --run_closed_source | flag | Also run closed-source model variants |
Any additional arguments are forwarded to the individual experiment scripts.
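The forwarding behaviour can be pictured with a minimal bash sketch. This is not the repository's code: the function name parse_args, the variable names, and the parsing style are assumptions made for illustration, and the defaults shown are the Blocksworld ones listed in the parameter table.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: separate wrapper-level options from the arguments
# that get forwarded to the individual experiment scripts.
# Names and structure are assumptions, not taken from the ViPlan scripts.

parse_args() {
    EXPERIMENT_NAME=""
    RUN_PREDICATES="true"      # Blocksworld default per the table
    RUN_VILA="true"
    RUN_CLOSED_SOURCE="false"
    FORWARDED=()               # everything the wrapper does not recognise

    while [ "$#" -gt 0 ]; do
        case "$1" in
            --experiment_name|--run_predicates|--run_vila)
                # These options take a value; fail early if it is missing.
                if [ "$#" -lt 2 ]; then
                    echo "Missing value for $1" >&2
                    return 1
                fi
                case "$1" in
                    --experiment_name) EXPERIMENT_NAME="$2" ;;
                    --run_predicates)  RUN_PREDICATES="$2" ;;
                    --run_vila)        RUN_VILA="$2" ;;
                esac
                shift 2
                ;;
            --run_closed_source)
                # Plain flag: its presence switches the option on.
                RUN_CLOSED_SOURCE="true"
                shift
                ;;
            *)
                # Unknown argument: keep it for the experiment scripts.
                FORWARDED+=("$1")
                shift
                ;;
        esac
    done
}
```

Calling parse_args with --run_vila false --use_cot_prompt --experiment_name cot would, under these assumptions, leave --use_cot_prompt as the only element of FORWARDED, ready to be passed on as "${FORWARDED[@]}".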
The individual scripts under scripts/ accept additional flags to select experiment variants:
VLM-as-Grounder (benchmark_{env}_planning_array.sh):
| Variant | Flag(s) | Description |
|---|---|---|
| Default | (none) | Standard Yes/No QA prompt (prompt.md) |
| Chain-of-Thought (CoT) | --use_cot_prompt | Uses the CoT prompt variant (prompt_cot.md) |
| With memory (Mem) | --include_prompt_history | Injects previous-step failure context into the VLM prompt |
| Mem + CoT | --include_prompt_history --use_cot_prompt | Combines CoT prompting with memory |
VLM-as-Planner (benchmark_{env}_vila_array.sh):
| Variant | Flag(s) | Prompt selected |
|---|---|---|
| Default | (none) | vila_{env}_json.md |
| Chain-of-Thought (CoT) | --use_cot_prompt | vila_{env}_json_cot.md |
| Act | --act_prompt | act_{env}_json.md |
| Act + CoT | --use_cot_prompt --act_prompt | react_{env}_json.md |
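The flag-to-prompt mapping for the VLM-as-Planner variants can be written out as a small shell function. This is only a sketch of the selection logic implied by the table: the helper name select_planner_prompt is hypothetical, and the value substituted for {env} (blocksworld in the usage note) is an assumption about how the scripts name their prompt files.

```bash
#!/usr/bin/env bash
# Hypothetical helper mirroring the VLM-as-Planner variant table.
# Arguments: <env> <use_cot: true|false> <act: true|false>
# Not taken from the repository; the real scripts may select prompts differently.

select_planner_prompt() {
    local env="$1" use_cot="$2" act="$3"

    if [ "$act" = "true" ] && [ "$use_cot" = "true" ]; then
        echo "react_${env}_json.md"       # Act + CoT
    elif [ "$act" = "true" ]; then
        echo "act_${env}_json.md"         # Act
    elif [ "$use_cot" = "true" ]; then
        echo "vila_${env}_json_cot.md"    # Chain-of-Thought
    else
        echo "vila_${env}_json.md"        # Default
    fi
}
```

For example, select_planner_prompt blocksworld true false prints vila_blocksworld_json_cot.md, assuming blocksworld is the string used for {env}.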
Examples:
# Run VLM-as-Grounder with Mem + CoT prompting
./sh_scripts/local/run_blocksworld.sh --run_vila false --use_cot_prompt --include_prompt_history --experiment_name cot_mem
# Run VLM-as-Planner with Act + CoT prompting
./sh_scripts/local/run_blocksworld.sh --run_predicates false --use_cot_prompt --act_prompt --experiment_name act_cot

Check the individual scripts for more details.
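To cover all four VLM-as-Grounder variants in one go, the commands can be generated in a loop. The sketch below only prints the commands (a dry run) and does not execute anything; the function name build_grounder_commands and the experiment names default, cot, mem, and cot_mem are hypothetical labels chosen for this example.

```bash
#!/usr/bin/env bash
# Dry-run sketch: print one local Blocksworld command per VLM-as-Grounder variant.
# The experiment names are illustrative labels, not names used by the repository.

RUNNER="./sh_scripts/local/run_blocksworld.sh"

build_grounder_commands() {
    local name flags
    # Each input line is "<experiment name>|<variant flags>".
    while IFS='|' read -r name flags; do
        # $flags is deliberately unquoted so that several flags split into words.
        echo "$RUNNER" --run_vila false $flags --experiment_name "$name"
    done <<'EOF'
default|
cot|--use_cot_prompt
mem|--include_prompt_history
cot_mem|--use_cot_prompt --include_prompt_history
EOF
}
```

Running build_grounder_commands from the repository root lists the four commands for inspection; piping its output to bash would then execute them one after another.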
Back to Main Documentation.