Hi all, sorry for the interruption.
I’m currently experimenting with multi-frame visual input training (max_frames=180) using Megatron + GRPO, with the Qwen3-VL-8B model.
Hardware setup
- 8 × 80GB GPUs
- bf16 training
Issue
- GPU memory runs out very quickly
- Training is stable only up to ~32 frames
- Increasing max_frames beyond that leads to immediate OOM (often before backward)
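For context on why memory scales so badly with frame count, here is my rough token-count estimate. This is only a back-of-envelope sketch, assuming Qwen-VL-style visual tokenization (28x28-pixel patches with a 2x2 spatial merge before the LLM); the exact Qwen3-VL pipeline, including any temporal merging, may differ:

```python
# Back-of-envelope visual token count per request.
# ASSUMPTIONS (not verified against Qwen3-VL internals):
#   - each frame is resized to at most MAX_PIXELS pixels
#   - 28x28-pixel patches, merged 2x2 spatially before entering the LLM
PATCH = 28
MERGE = 2

def visual_tokens(max_pixels: int, frames: int) -> int:
    patches_per_frame = max_pixels // (PATCH * PATCH)        # 501760 // 784 = 640
    tokens_per_frame = patches_per_frame // (MERGE * MERGE)  # 640 // 4 = 160
    return tokens_per_frame * frames

print(visual_tokens(501760, 32))   # 5120  -> fits comfortably in max_length 16384
print(visual_tokens(501760, 180))  # 28800 -> well beyond max_length 16384
```

So at 180 frames a single sample's visual tokens alone would exceed my max_length, which matches OOM happening before backward even starts.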
Questions
In multi-frame VL (video / multi-image) training, how are CP / SP / TP / PP typically combined in practice to scale to high frame counts?
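For example, would a layout like the following be a reasonable starting point on 8 GPUs? This is only a sketch using the same flags as my script below; the specific CP/TP/PP values are my own guesses, on the assumption that context parallelism is what splits the long multi-frame token sequence across GPUs:

```shell
# Hypothetical layout for 8 x 80GB GPUs (my guesses, not a tested config):
# CP=4 splits each long vision-token sequence across 4 GPUs,
#   so per-GPU activation memory shrinks roughly by 4x,
# TP=2 shards the model weights within each CP group,
# PP=1 keeps all layers on a single pipeline stage (avoids pipeline bubbles).
--context_parallel_size 4 \
--tensor_model_parallel_size 2 \
--pipeline_model_parallel_size 1 \
--sequence_parallel true \
```

Is raising CP the right lever here, or is PP preferred for VL models at high frame counts?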
Training script
LOG_FILE=log_megatron_grpo.txt
# CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 \
MAX_PIXELS=501760 \
NPROC_PER_NODE=${GPUS_PER_NODE:-4} \
PYTORCH_CUDA_ALLOC_CONF='expandable_segments:True' \
megatron rlhf \
--rlhf_type grpo \
--model Qwen3-VL-8B-instruct \
--load_safetensors true \
--save_safetensors true \
--save_interval 100 \
--context_parallel_size 1 \
--tensor_model_parallel_size 2 \
--pipeline_model_parallel_size 2 \
--dataset Alert_Dataset \
--max_epochs 1 \
--global_batch_size 32 \
--micro_batch_size 1 \
--steps_per_generation 4 \
--num_generations 16 \
--reward_funcs Alert_Reward \
--external_plugins plugin.py \
--use_vllm true \
--vllm_mode colocate \
--vllm_gpu_memory_utilization 0.3 \
--vllm_tensor_parallel_size 4 \
--vllm_max_model_len 18384 \
--max_length 16384 \
--max_completion_length 124 \
--train_type lora \
--lora_rank 32 \
--lora_alpha 128 \
--lr 5e-5 \
--bf16 true \
--beta 0.00 \
--dynamic_sample false \
--overlong_filter true \
--loss_type grpo \
--sleep_level 2 \
--offload_model true \
--offload_optimizer true \
--log_interval 1 \
--recompute_granularity selective \
--finetune \
--num_workers 8 \
--dataset_num_proc 8 \
--no_save_optim \
--no_save_rng \
--attention_backend flash \
--temperature 1.0 \
--padding_free true \
--sequence_parallel true \
--log_completions true \
2>&1 | tee ${LOG_FILE}

Any insights, configs, or references would be greatly appreciated. Thanks!