Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 5 additions & 3 deletions scripts/eval/oe-eval.sh
Original file line number Diff line number Diff line change
Expand Up @@ -90,9 +90,10 @@ fi

# cluster/weka mount logic: default true (to use non-augusta)
# if model starts with gs://, set evaluate_on_weka to false.
# If specified CLUSTER contains augusta, set evaluate_on_weka to false
# All the logic is now handled internally, the flag is useless but keeping for backwards compatibility since people have scripts with it
EVALUATE_ON_WEKA="true"
if [[ "$MODEL_LOCATION" == gs://* ]]; then
if [[ "$MODEL_LOCATION" == gs://* || "$CLUSTER" == *augusta* ]]; then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The glob pattern *augusta* is a bit too broad and could lead to false positives. For example, a cluster named my-augusta-test would match, but a hypothetical cluster named notaugusta would also match, which is likely not the desired behavior. Using a regular expression with word boundaries (\b) would make this check more robust by matching augusta as a whole word.

Suggested change
if [[ "$MODEL_LOCATION" == gs://* || "$CLUSTER" == *augusta* ]]; then
if [[ "$MODEL_LOCATION" == gs://* || "$CLUSTER" =~ \baugusta\b ]]; then

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its ok gemini dw abt it

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Overly Broad Cluster Match Overrides User Choice

The "$CLUSTER" == *augusta* condition is overly broad, matching any cluster name containing "augusta" as a substring. This incorrectly triggers the EVALUATE_ON_WEKA="false" path, which then hardcodes --cluster ai2/augusta. Consequently, user-specified augusta-related clusters are silently overridden, potentially causing jobs to run on an unintended cluster.

Fix in Cursor Fix in Web

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

eh its fine

EVALUATE_ON_WEKA="false"
fi

Expand Down Expand Up @@ -351,10 +352,11 @@ for TASK in "${TASKS[@]}"; do

# NOTE: For gantry args here and below, random numbers like #42 are added to the env variables because they need to be unique names. The numbers are ignored.
# Build gantry args
whoami=$(beaker account whoami --format text | awk 'NR==2 {print $2}')
if [ "$EVALUATE_ON_WEKA" == "true" ]; then
GANTRY_ARGS="{\"env-secret\": \"OPENAI_API_KEY=openai_api_key\", \"weka\": \"oe-adapt-default:/weka/oe-adapt-default\", \"weka#44\": \"oe-training-default:/weka/oe-training-default\", \"env-secret#2\":\"HF_TOKEN=HF_TOKEN\", \"env#132\":\"VLLM_ALLOW_LONG_MAX_MODEL_LEN=1\", \"env-secret#42\": \"AZURE_EVAL_API_KEY=azure_eval_api_key\"${MAX_TOKENS_ARG}}"
GANTRY_ARGS="{\"env-secret\": \"OPENAI_API_KEY=openai_api_key\", \"weka\": \"oe-adapt-default:/weka/oe-adapt-default\", \"weka#44\": \"oe-training-default:/weka/oe-training-default\", \"env-secret#2\":\"HF_TOKEN=${whoami}_HF_TOKEN\", \"env#132\":\"VLLM_ALLOW_LONG_MAX_MODEL_LEN=1\", \"env-secret#42\": \"AZURE_EVAL_API_KEY=azure_eval_api_key\"${MAX_TOKENS_ARG}}"
else
GANTRY_ARGS="{\"env-secret\": \"OPENAI_API_KEY=openai_api_key\", \"env-secret#43\": \"AZURE_EVAL_API_KEY=azure_eval_api_key\", \"env\":\"VLLM_ALLOW_LONG_MAX_MODEL_LEN=1\", \"env-secret#2\":\"HF_TOKEN=HF_TOKEN\", \"mount\": \"/mnt/filestore_1:/filestore\", \"env#111\": \"HF_HOME=/filestore/.cache/huggingface\", \"env#112\": \"HF_DATASETS_CACHE=/filestore/.cache/huggingface\", \"env#113\": \"HF_HUB_CACHE=/filestore/.cache/hub\"${MAX_TOKENS_ARG}}"
GANTRY_ARGS="{\"env-secret\": \"OPENAI_API_KEY=openai_api_key\", \"env-secret#43\": \"AZURE_EVAL_API_KEY=azure_eval_api_key\", \"env\":\"VLLM_ALLOW_LONG_MAX_MODEL_LEN=1\", \"env-secret#2\":\"HF_TOKEN=${whoami}_HF_TOKEN\", \"mount\": \"/mnt/filestore_1:/filestore\", \"env#111\": \"HF_HOME=/filestore/.cache/huggingface\", \"env#112\": \"HF_DATASETS_CACHE=/filestore/.cache/huggingface\", \"env#113\": \"HF_HUB_CACHE=/filestore/.cache/hub\"${MAX_TOKENS_ARG}}"
fi

if [ "$EVALUATE_ON_WEKA" == "true" ]; then
Expand Down