Ep 16 | Blog | November 21, 2025 | 0:52 | w/ Justy & Cody

Building a Multimodal RAG That Responds with Text, Images, and Tables from Sources | Towards Data Science

Why do few chatbots return figures from source documents in their responses?

By Partha Sarkar, Nov 3, 2025 (11 min read)

Retrieval-Augmented Generation (RAG) has been one of the earliest and most successful applications of Generative AI.

Voice: OpenAI TTS

Transcript

Host A Welcome back to Exploring Next! Today we're looking at "Building a Multimodal RAG That Responds with Text, Images, and Tables from Sources," published on Towards Data Science.

Host B Yeah, this one caught our eye because it asks a pointed question: why do few chatbots return figures from source documents in their responses?

Host A So the starting point, from author Partha Sarkar, is that Retrieval-Augmented Generation (RAG) has been one of the earliest and most successful applications of Generative AI.

Host B What stood out to me is that, even so, few chatbots return images, tables, and figures from source documents alongside textual answers.

Host A If you're curious, give the original a read: https://towardsdatascience.com/building-a-multimodal-rag-with-text-images-tables-from-sources-in-response/.

Host B And let us know what you try next!