A tuning-free inference acceleration framework for VLMs/MLLMs that searches for operations to prune, rather than pruning tokens.
conda create -n gsop python=3.10 -y
conda activate gsop
cd lmms-eval
pip install -e .
cd ../LLaVA
pip install -e .
pip install easydict

For additional setup instructions, please refer to:
bash scripts/gsop_inference.sh
bash scripts/gsop_search.sh

Some benchmarks (e.g., TextVQA) may produce results that differ from commonly reported metrics when run on lmms-eval. Please follow the evaluation setup detailed in Evaluation.md for those benchmarks.
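
Before running the scripts, it may help to confirm that the installation steps above succeeded in the active `gsop` environment. The following is a minimal sketch, not part of the repository: `check_import` is a hypothetical helper, and the module names (`lmms_eval`, `llava`, `easydict`) are assumptions about how the three packages are exposed after installation.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): report whether a Python
# module can be imported in the currently active environment.
check_import() {
  if python -c "import $1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

# Module names are assumptions: lmms-eval is assumed to install as lmms_eval,
# LLaVA as llava, and easydict as easydict.
for module in lmms_eval llava easydict; do
  check_import "$module"
done
```

Any line reported as `missing` suggests that the corresponding `pip install` step should be repeated inside the `gsop` environment.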