I am Xuyang Liu (刘旭洋), a third-year Master's student at Sichuan University. I am also a research intern at OPPO Research Institute, supervised by Prof. Lei Zhang (PolyU, IEEE Fellow). Previously, I interned at Ant Group, focusing on GUI agents, and at Taobao & Tmall Group, working on efficient VLMs. I also spent half a year visiting MiLAB at Westlake University, supervised by Prof. Donglin Wang. I am fortunate to work closely with Dr. Siteng Huang from DAMO Academy and Prof. Linfeng Zhang from SJTU.
My research centers on efficient Large Vision-Language Models (LVLMs), including:
- Image-Text LVLMs: high-resolution understanding via context compression and fast decoding, including GlobalCom2[AAAI'26], V2Drop[CVPR'26], FiCoCo[AAAI'26], and MixKV[ICLR'26].
- Video Understanding: long-video, audio-video, and streaming reasoning via efficient encoding and compression, including VidCom2[EMNLP'25], STC[CVPR'26], and OmniSIFT.
- Efficiency Toolbox: efficient transfer/fine-tuning and benchmarking for downstream task adaptation, including M2IST[TCSVT'25], V-PETL[NeurIPS'24], and AutoGnothi[ICLR'25].
If you find these directions interesting, feel free to reach out via email: liuxuyang@stu.scu.edu.cn.



