Qwen3 VL · Ollama Blog
Qwen3-VL October 14, 2025 Qwen3-VL , the most powerful vision language model in the Qwen series is now available on Ollama’s cloud. The models will be made available locally soon.
Voice: OpenAI TTS
Transcript
Host A Welcome back to Exploring Next! Today we're looking at Qwen3-VL · Ollama Blog.
Host B Yeah, this one caught our eye because Qwen3-VL October 14, 2025 Qwen3-VL , the most powerful vision language model in the Qwen series is now available on Ollama’s cloud.
Host A So the big idea is Model capabilities Visual Agent : Operates PC/mobile GUIs—recognizes elements, understands functions, invokes tools, completes tasks Visual Coding Boost : Generates Draw.io/HTML/CSS/JS from images/videos Advanced Spatial Perception : Judges object positions, viewpoints, and occlusions; provides stronger 2D grounding and enables 3D grounding for spatial reasoning and embodied AI Long Context & Video Understanding : Native 256K context, expandable to 1M; hand
Host B What stood out to me is It is possible to use multiple images and drag and drop in images to make it easier to automatically type the file path.
Host A If you're curious, give the original a read: https://ollama.com/blog/qwen3-vl.
Host B And let us know what you try next!