Building a Multimodal RAG That Responds with Text, Images, and Tables from Sources | Towards Data Science
Why do few chatbots return figures from source documents in their responses?
Partha Sarkar, Nov 3, 2025
Retrieval-Augmented Generation (RAG) has been one of the earliest and most successful applications of Generative AI.
Voice: OpenAI TTS
Transcript
Host A Welcome back to Exploring Next! Today we're looking at "Building a Multimodal RAG That Responds with Text, Images, and Tables from Sources," an article from Towards Data Science.
Host B Yeah, this one caught our eye because it asks a simple question: why do few chatbots return figures from source documents in their responses?
Host A So the starting point, in Partha Sarkar's piece from Nov 3, 2025, is that Retrieval-Augmented Generation (RAG) has been one of the earliest and most successful applications of Generative AI.
Host B What stood out to me is that, even so, few chatbots return images, tables, and figures from source documents alongside textual answers.
Host A If you're curious, give the original a read: https://towardsdatascience.com/building-a-multimodal-rag-with-text-images-tables-from-sources-in-response/.
Host B And let us know what you try next!