Ep 29 Blog November 21, 2025 0:59 w/ Justy & Cody

Qwen3 VL · Ollama Blog

Qwen3-VL October 14, 2025 Qwen3-VL , the most powerful vision language model in the Qwen series is now available on Ollama’s cloud. The models will be made available locally soon.

Read the source → Plain-text transcript →

Embed this episode

Paste this on any site — the player is a self-contained iframe with no cookies or trackers.

<iframe src="https://sandrise.io/exploring-next/embed/29"
  width="100%" height="180" style="max-width:640px;border:0;border-radius:12px;overflow:hidden"
  title="Exploring Next — Episode 29 audio player"
  loading="lazy" allow="autoplay" referrerpolicy="strict-origin-when-cross-origin"></iframe>

Embed & API docs →

Voice OpenAI TTS

Transcript

Host A Welcome back to Exploring Next! Today we're looking at Qwen3-VL · Ollama Blog.

Host B Yeah, this one caught our eye because Qwen3-VL October 14, 2025 Qwen3-VL , the most powerful vision language model in the Qwen series is now available on Ollama’s cloud.

Host A So the big idea is Model capabilities Visual Agent : Operates PC/mobile GUIs—recognizes elements, understands functions, invokes tools, completes tasks Visual Coding Boost : Generates Draw.io/HTML/CSS/JS from images/videos Advanced Spatial Perception : Judges object positions, viewpoints, and occlusions; provides stronger 2D grounding and enables 3D grounding for spatial reasoning and embodied AI Long Context & Video Understanding : Native 256K context, expandable to 1M; hand

Host B What stood out to me is It is possible to use multiple images and drag and drop in images to make it easier to automatically type the file path.

Host A If you're curious, give the original a read: https://ollama.com/blog/qwen3-vl.

Host B And let us know what you try next!