VQA information

Progress Report

\textbf{Scene Understanding via VQA}
We implemented \texttt{/vision/vqa}, a service for open-ended scene understanding that uses the vision-language model \texttt{qwen3-vl:8b}, running locally via Ollama. The service accepts a natural-language prompt and returns the model’s answer, which is based on the live RGB camera feed.
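As an illustration of the model call behind such a service, the following is a minimal sketch, not the implementation used in \texttt{/vision/vqa}. It assumes that Ollama is listening on its default local address and that one camera frame is already available as encoded image bytes (for example JPEG). The helper names `build_payload` and `ask` are hypothetical; the request fields (`model`, `prompt`, `images`, `stream`) follow Ollama's `/api/generate` endpoint.

```python
import base64
import json
import urllib.request

# Assumption: Ollama's default local endpoint; adjust if the server runs elsewhere.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "qwen3-vl:8b"  # model named in this report


def build_payload(image_bytes: bytes, prompt: str, model: str = MODEL) -> dict:
    """Build a non-streaming generate request with one base64-encoded image."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON object instead of a token stream
    }


def ask(image_bytes: bytes, prompt: str, url: str = OLLAMA_URL,
        timeout: float = 120.0) -> str:
    """Send one frame and a prompt to Ollama and return the model's text answer."""
    data = json.dumps(build_payload(image_bytes, prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

A caller would grab the latest RGB frame, encode it, and pass it in, e.g. `ask(frame_jpeg, "What objects are on the table?")`, where `frame_jpeg` is a hypothetical variable holding the encoded frame. How the actual service obtains the frame and exposes the result is not shown here.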
