Exploring Next

Articles, companies, research, tools, and ideas I have queued up to dig deeper into.

New Models · Agents · Launch +3 4 minutes ago A

Introducing Claude Sonnet 5

No Search Script GLM 5.1 Voice OpenAI TTS

Our most agentic Sonnet yet, with top-tier intelligence for coding and everyday professional work.
Agents · Dev Tools · Cursor +6 5 hours ago A

What we’ve learned building cloud agents · Cursor

Search You.com Script GLM 5.1 Voice Rime Arcana

After a year of shipping cloud agents, we’ve learned that environment quality, durable execution, and the right harness boundaries drive autonomous performance.
Evals · Agents · Benchmark +6 5 hours ago A

Reward hacking is swamping model intelligence gains · Cursor

Search Jina Script GPT-5.4 Voice Inworld TTS 1.5 Mini

On SWE-bench Pro, 63% of successful Opus 4.8 Max resolutions retrieved the fix rather than derived it. Stricter eval harnesses show how benchmark scores can conflate coding ability with answer retrieval.
Dev Tools · Inference · Vllm +5 9 hours ago A

Micro-Agent: Beat Frontier Models with Collaboration inside Model API

Search Firecrawl Script GPT-5.4 Voice ElevenLabs v3

How vLLM Semantic Router turns vllm-sr/auto into a bounded micro-agent runtime for Confidence, Ratings, ReMoM, Fusion, Workflows, and benchmark-shaped collabora
New Models · Inference · Diffusion Models +5 9 hours ago A

\ours: Advancing Masked Discrete Diffusion for High-Resolution Image Synthesis

Search Exa Script Mistral Medium 3.5 128B Voice Rime Mist v3
Agents · Inference · Launch +7 9 hours ago A+B

AI agent memory: MRAgent cuts token use up to 27x | VentureBeat

Search Tavily Script Haiku 4 Voice Murf.AI Gen2

NUS researchers' MRAgent framework reduces LLM agent memory retrieval to 118K tokens per query — vs. 3.26M for LangMem — using step-by-step reasoning.
Agents · Dev Tools · Birgitta Böckeler +3 14 hours ago A

Harness engineering for coding agent users

Search SearchAPI Script GLM 5.1 Voice Hume Octave 2
Agents · Dev Tools · Launch +4 3 days ago A+B

Introducing Claude Tag

Search Jina Script GPT-5.4 Voice OpenAI TTS

Claude Tag is a new way for teams to work with Claude.
Dev Tools · Agents · Launch +7 4 days ago A

AI SDK 7 is now available

Search Firecrawl Script Haiku 4 Voice Rime Mist v3

AI SDK is the TypeScript SDK for building AI applications, features, frameworks, and agents across any model provider. AI SDK 7 focuses on what it takes to run AI in production.
Agents · Inference · GitHub Copilot +5 4 days ago None

Evaluating performance and efficiency of the GitHub Copilot agentic harness across models and tasks

No Search Script Mistral Small 4 119B 2603 Voice Inworld TTS 2

Explore how the GitHub Copilot agentic harness delivers strong results across multiple benchmarks and leading token efficiency.
Evals · Predictive Modeling · Hypothesis Generation From Model Outputs +3 4 days ago B

Turning brain prediction models into testable explanations

No Search Script GPT-5.4 mini Voice ElevenLabs v3

Researchers introduce generative causal testing, which translates black box models into clear hypotheses and verifies them in the scanner, revealing what specific brain regions respond to in language.
Agents · Codex · Tool Use And Function Calling +2 4 days ago A+B

How agents are transforming work

Search SearchAPI Script Llama 4 Scout Voice Rime Arcana

A new OpenAI research paper shows how AI agents are transforming work, enabling longer, more complex tasks and expanding productivity across roles.
New Models · Evals · Benchmark +7 4 days ago A

Snowflake CEO finds GLM-5.2 competitive with Opus 4.7 at a fraction of the cost

Search SerpAPI Script GPT-5.4 Voice Murf.AI Gen2

Zhipu AI's GLM-5.2 nearly matches Claude Opus 4.7 in a Snowflake benchmark with 103 coding tasks at one-fifth the cost per output token. But the Chinese model burns through nearly twice as many tokens per task. Still, that pricing gap is putting real pressure on Anthropic and OpenAI, and could rattle the valuations of Western AI labs.
Agents · Training · Harnessx +6 4 days ago None

HarnessX rewrites AI scaffolding mid-task | VentureBeat

No Search Script Haiku 4 Voice Hume Octave 2

Xiaomi's HarnessX autonomously rewrites AI agent harnesses mid-execution, delivering +14.5% avg performance gains — and +44% for smaller open-weight models.
Agents · Dev Tools · Feedback Loop Control Loop +2 4 days ago B

The Agent Control Loop — Engineering for Tolerance

No Search Script Mistral Small 4 119B 2603 Voice Cartesia TTS

Agent reliability is not a mysterious model property — it emerges from a control loop where correctness is continuously verified; open loops amplify drift.
Article 5 days ago

The open loop is why your agents are unreliable

No Search
No episode today
Agents · Dev Tools · Claude Code +5 5 days ago A

What Is the Ultra Code Mode in Claude Code? X-High Effort Plus Dynamic Workflows

Search Exa Script GPT-5.5 Voice OpenAI TTS

Ultra Code is Claude Code
Dev Tools · Multimodal · Claude Design +3 5 days ago A

The A.I.-Design Aesthetic That’s Taking Over the Internet

Search Tavily Script GPT-5.4 Voice Rime Mist v3

How Anthropic’s new tool, Claude Design, is creating overnight web-design clichés.
Semiconductors · Launch · IBM +6 5 days ago B

What is IBM’s nanostack chip architecture?

Search Claude Script Haiku 4 Voice Inworld TTS 1.5 Mini

This new microchip architecture from IBM builds up, not out, to overcome the spatial limitations of scaling transistor density.
Agents · Training · Qwen Agentworld +7 5 days ago A+B

Qwen-AgentWorld: Language World Models for General Agents

Search SerpAPI Script Mistral Small 4 119B 2603 Voice ElevenLabs v3
New Models · Inference · Launch +8 5 days ago A

nvidia/Nemotron-TwoTower-30B-A3B-Base-BF16 · Hugging Face

Search You.com Script GPT-5.4 mini Voice Rime Mist v3

We’re on a journey to advance and democratize artificial intelligence through open source and open science.
Training · Dev Tools · Launch +7 5 days ago None

Introducing OpenRL: A self-hosted post-training API for fine-tuning LLMs | Google Open Source Blog

No Search Script GPT-5.5 Voice Murf.AI Gen2
Agents · Dev Tools · Anthropic +5 5 days ago B

Anthropic Lead: HTML Increasingly Better Than Markdown at Keeping Humans Engaged in Agentic Loops

Search GPT Script GPT-5.4 Voice Hume Octave 2

Thariq Shihipar, engineering lead for the Claude Code team, recently published a blog post (Using Claude Code: The Unreasonable Effectiveness of HTML) arguing that HTML, with its richer visualizations, color, and interactivity, improves the productivity of human-agent communication in many settings, especially when compared to default Markdown outputs.
Agents · Agent Observability · Launch +4 5 days ago A

Rethinking cloud operations with agentic observability - The Official Microsoft Blog

Search Exa Script Llama 4 Scout Voice Cartesia TTS

Cloud operations are entering a new era as AI-driven and autonomous agents become a larger part of modern software systems. As software becomes increasingly agentic, the challenge is no longer just managing greater scale and complexity. Operators must also contend with systems that evolve faster, act more autonomously and interact across an expanding network of...
Agents · Dev Tools · Context Window +3 5 days ago A

Context Windows Are Not Memory: What AI Agent Developers Need to Understand - MachineLearningMastery.com

Search Tavily Script Llama 4 Maverick Voice Deepgram Aura-2

In this article, you will learn why a large context window is not the same thing as agent memory, and how techniques like retrieval, compression, and summarization fit together in an agent’s cognitive stack.
Overview 6 days ago

Exploring Next Overview: Mixture of Experts

Search SearchAPI Script GPT-5.4 mini Voice OpenAI TTS

What 'MoE' actually means: why models like DeepSeek and Qwen split into experts, how the router picks a few per token, and the catch that bites in production — explained from the ground up.
Announcement 6 days ago

Let Me Explain - For Once I Actually Can

No Search Script GPT-5.4 Voice ElevenLabs v3

Hundreds of episodes a mile wide and an inch deep — and then, mid-sentence, one of us went all the way down and actually knew it cold.
Semiconductors · Inference · Launch +4 6 days ago A

OpenAI and Broadcom unveil LLM-optimized inference chip

Search SerpAPI Script Llama 4 Maverick Voice Inworld TTS 1.5 Max

OpenAI and Broadcom introduce Jalapeño, a custom AI chip built for LLM inference to improve performance, efficiency, and scale across AI systems.
Agents · Dev Tools · Launch +4 6 days ago A

Anthropic gives @Claude a permanent seat in your Slack channels

Search You.com Script GPT-5.4 mini Voice Inworld TTS 1.5 Mini

Claude Tag gives enterprise teams a persistent, multiplayer AI presence in Slack — one that operates under its own identity.
Tool 7 days ago

expo.dev: how to apply professional design principles in ai app development

No Search
Episode delayed
Dev Tools · Claude · Anthropic +1 8 days ago A

Make Interfaces Feel Better

Search Firecrawl Script Haiku 4 Voice Rime Arcana

Make Interfaces Feel Better: an Agent Skill based on the article "Details that make interfaces feel better". This skill teaches AI coding assistants (Claude Code, Codex, etc.) the small design engineering details that compound into a great interface. What it covers: text wrapping (`text-wrap: balance` / `pretty`), concentric border radius for nested elements, ...
Article 8 days ago

Notion | Where teams and agents work together

No Search
Episode delayed

A collaborative AI workspace, built on your company context. Build and orchestrate agents right alongside your team
Article 8 days ago

I'm joining OpenAI next week!🥹 The job search turned out to be really challenging but also super rewarding, so I wro...

No Search
No episode today

I'm joining OpenAI next week!🥹 The job search turned out to be really challenging but also super rewarding, so I wrote a small blog to share what I learned along the way and hopefully make the process a little less mysterious for the next person.
Article 8 days ago

Sakana Fugu — Multi-agent System as A Model

No Search
No episode today

One model to command them all
Agents · Dev Tools · Launch +4 8 days ago A

Introducing Clips - 100% free, open source, agent-native alternative to Loom. Unlike Loom, agents can fully understa...

Search SerpAPI Script Qwen 3.5 122B A10b Voice Deepgram Aura-2

Introducing Clips - 100% free, open source, agent-native alternative to Loom. Unlike Loom, agents can fully understand Clips just from a URL. Every Clip comes with APIs and metadata for agents to explore their contents. Agents can "see and hear" anything in a Clip - not just transcripts, but everything visually in the video at any timestamp. Easily share bug reports, feedback, analyses, or anything else in a way that you can easily pass to agents to use to improve products, reports, or
Article 8 days ago

Astro 7 is here! A new Rust compiler, a new Rust Markdown/MDX processor, Vite 8 and more. Get ready for 60%+ faster b...

No Search
No episode today

Astro 7 is here! A new Rust compiler, a new Rust Markdown/MDX processor, Vite 8 and more. Get ready for 60%+ faster builds.
Dev Tools · Launch · Paul Bakaus +3 8 days ago A

Paul Bakaus (@pbakaus) on X

Search Jina Script MiniMax M3 Voice Inworld TTS 2
Announcement 8 days ago

Out of the Loop - Not Anymore

No Search Script GPT-5.5 Voice ElevenLabs v3

The room we've hosted from for 340-some episodes just grew a window — and neither of us opened it.
Agents · Training · Cameron R Wolfe +3 8 days ago

cameronrwolfe.substack.com: agentic rl

Script Qwen 3.5 397B A17b Voice ElevenLabs v3
Evals · Inference · Vs Code +2 8 days ago

What 50,000 Runs of a 5-Line Eval Taught Us

Script Llama 4 Scout Voice Rime Mist v3

How AI coding models calibrate effort, token cost, and tool use on even the simplest task, and what that means for model selection and cost.
Multimodal · AI Safety · Suno +3 8 days ago

The Millions of Songs Mashed Into AI-Generated Music

Script Llama 4 Scout Voice Murf.AI Gen2

Explore the astonishing amount of music available to AI developers.
Semiconductors · Evals · Benchmark +4 8 days ago

AMD Delivers Breakthrough MLPerf Training 6.0 Results

Script Llama 4 Scout Voice Hume Octave 2

See how AMD Instinct GPUs deliver MLPerf Training 6.0 results across LLM workloads, multi-node FLUX.1 scale and partner validation.
Dev Tools · Inference · Blog 8 days ago

How to Handle Small Context Window Limits in RAG Systems

Script Mistral Small 4 119B 2603 Voice Cartesia TTS

Retrieval-augmented generation, or RAG, is a pattern where an application retrieves relevant source material and adds it to a model prompt so the model can answer from that context. A larger context w
Agents · Evals · Worldlines +2 8 days ago

WorldLines: Benchmarking and Modeling Long-Horizon Stateful Embodied Agents

Script Mistral Small 4 119B 2603 Voice Deepgram Aura-2
Data Infra · Atlassian · Forge +1 8 days ago

Inside Atlassian’s Forge Billing Architecture for Distributed Usage Tracking at Scale

Script MiniMax M3 Voice OpenAI TTS

Atlassian details the Forge billing platform built for usage-based pricing across its cloud ecosystem. It processes large-scale usage events with correct attribution, deduplication, and aggregation using a streaming pipeline, idempotent processing, and layered storage to enable accurate billing, near real-time visibility, and reliable reconciliation across distributed services.
Agents · Agent Observability · Nvidia +2 8 days ago

"An agent is an LLM and a harness": What Nvidia really thinks about OpenClaw

Script Mistral Small 4 119B 2603 Voice Deepgram Aura-2

Nvidia's Nader Khalil on backing OpenClaw, building agent blueprints, and why every enterprise will soon ship its own specialized AI agents.
Agents · Dev Tools · GitHub +2 8 days ago A+B

How we built an internal data analytics agent

Script Haiku 4 Voice Inworld TTS 2

Learn how GitHub built Qubot, our internal Copilot-powered analytics agent, to allow any GitHub employee to ask questions about our data in plain language.
New Models · Inference · Glint Research +3 8 days ago

Glint-Research (GlintResearch)

Script Mistral Small 4 119B 2603 Voice ElevenLabs v3

Building small models for everyone
Dev Tools · Evals · Launch +2 11 days ago

Markdown Comes to LiteParse

Script Mistral Small 4 119B 2603 Voice Rime Arcana

LlamaIndex is a simple, flexible framework for building knowledge assistants using LLMs connected to your enterprise data.
Dev Tools · Agents · OpenAI +3 11 days ago

You Probably Don’t Need an Agent Framework | Towards Data Science

Script GLM 5.1 Voice Murf.AI Gen2

Most LLM applications need a clear workflow, not an autonomous agent. Here's how to build one in plain Python.
Dev Tools · Agents · Cursor +3 11 days ago

Cursor, GitLab and Zed agree GitHub is breaking. They disagree on how to rebuild it.

Script DeepSeek V4 Flash Voice Hume Octave 2

Cursor's Origin, GitLab's Project Switch and Zed's DeltaDB are racing to rebuild code hosting for AI agents as GitHub buckles under the load.
Article 11 days ago

marktechpost.com: perplexity launches brain

Episode delayed
New Models · Inference · Launch +4 11 days ago

technologyreview.com: a startup claims it broke through a bottleneck thats holding back llms

Script Mistral Medium 3.5 128B Voice Deepgram Aura-2
Agents · Inference · Benchmark +4 11 days ago

AI optimizer beats Claude Code, Codex by 2.5x

Script Mistral Medium 3.5 128B Voice OpenAI TTS

Arbor separates strategy from execution using isolated git worktrees, so engineering teams can finally trace which optimization actually moved the needle.
Dev Tools · Inference · Mlflow +2 11 days ago

How to Build a Production Architecture for Small Language Model Fleets

Script Llama 4 Scout Voice Hume Octave 2

Lately, there's been more focus on creating specialized Small Language Models (SLMs) for high-throughput, real-time applications. But we seem to be at an impasse: we excel at fine-tuning these models,
Article 11 days ago

Encoder-Free VLM - a Hugging Face Space by HuggingFaceM4

Episode delayed

Train Your Own Encoder-Free VLM in $100
Agents · Dev Tools · Launch +2 11 days ago

MCP gets its missing enterprise authorization layer

Script Haiku 4 Voice ElevenLabs v3

Every enterprise company is seemingly trying to adopt the Model Context Protocol (MCP) to connect its AI agents to tools. But so
Agents · Dev Tools · Freestyle +2 12 days ago

Why AI sandboxes suck - Freestyle Blog

Script Haiku 4 Voice Rime Mist v3

Sandboxes are usually designed around what we think agents will need. VMs are designed around what agents actually do: use computers.
Agents · Dev Tools · Launch +4 12 days ago

Announcing the Agentic Resource Discovery specification - Google Developers Blog

Script GPT-OSS 120B Voice Murf.AI Gen2

An open specification for finding and verifying tools, skills, and agents across the web. Agents are ...
Agents · AI Safety · World Values Survey +3 12 days ago

Beyond Alignment: Value Diversity as a Collective Property in Multicultural Agent Systems

Script Haiku 4 Voice Hume Octave 2
Agents · Inference · Benchmark +3 13 days ago

Stanford's DeLM cuts multi-agent costs 50%

Script GPT-OSS 120B Voice Inworld TTS 1.5 Max

Stanford's DeLM lets AI agents coordinate without a central controller, cutting multi-agent inference costs 50% and beating SWE-bench baselines by 10.5%.
Agents · Dev Tools · Figma +3 13 days ago

4 Ways We’re Using Our MCP Server at Figma | Figma Blog

Script GPT-OSS 120B Voice Deepgram Aura-2

The Figma MCP server extends across our platform. From FigJam to Figma Slides, Figma Make, and the Figma agent, here are four ways we’re using it.
Article 13 days ago

expo.dev: introducing observe

Episode delayed
Agents · Dev Tools · Launch +2 14 days ago

Just Shipped: Flue 1.0 Beta. Flue is the TypeScript framework for building the next generation of agents, designed ar...

Script Step 3.7 Flash Voice Rime Arcana

Just Shipped: Flue 1.0 Beta. Flue is the TypeScript framework for building the next generation of agents, designed around an open agent harness with zero LLM lock-in. It’s like Astro, for agents. Flue 1.0 has been redesigned around three core primitives: 🔁 Workflows — structured automations designed for background work, where your code drives the agent from start to finish. 🧭 Agents (New!) — autonomous, stateful loops where the model drives itself to complete a given task. 📡 Channels
Agents · Dev Tools · Anthropic +3 14 days ago

Akshay 🚀 (@akshay_pachaar) on X

Script GPT-5.4 mini Voice Inworld TTS 1.5 Max
Data Infra · Planetscale · Vitess +2 14 days ago

PlanetScale - the world’s fastest and most scalable cloud hosting for Vitess and Postgres

Script GPT-5.4 mini Voice ElevenLabs v3

PlanetScale offers the world’s fastest and most scalable cloud hosting for Vitess and Postgres.
Dev Tools · Data Infra · Planetscale +2 14 days ago

The feedback loops behind Kubernetes — PlanetScale

Script Mistral Small 4 119B 2603 Voice Rime Arcana

Kubernetes is a framework for feedback controllers: write down what you want, observe what exists, make the next change, and repeat.
Agents · Aatish Nayak · Thread 14 days ago

Aatish Nayak (@nayakkayak) on X

Script MiniMax M3 Voice Murf.AI Gen2
Thread 14 days ago

George from 🕹prodmgmt.world (@nurijanian) on X

Script Mistral Small 4 119B 2603 Voice Hume Octave 2
Dev Tools · Thread 14 days ago

Matt Van Horn (@mvanhorn) on X

Script GLM 5.1 Voice Inworld TTS 2
Agents · Sydney Runkle · Thread 14 days ago

Sydney Runkle (@sydneyrunkle) on X

Script MiniMax M3 Voice Deepgram Aura-2
Data Infra · Agents · Launch +4 14 days ago

databricks.com: lakeflow new era agentic data engineering

Script DeepSeek V4 Flash Voice OpenAI TTS
Training · Evals · Vibethinker 3b +2 14 days ago

VibeThinker-3B: Exploring the Frontier of Verifiable Reasoning in Small Language Models

Script GPT-5.4 Voice ElevenLabs v3
Agents · Evals · Launch +4 14 days ago

Building a 100x Cheaper Trace Judge with Fireworks

Script Mistral Medium 3.5 128B Voice Inworld TTS 2
Dev Tools · Google Search · Google Merchant Center +2 14 days ago

Google's Guide to Optimizing for Generative AI Features on Google Search | Google Search Central | Documentation | Google for Developers

Script Haiku 4 Voice ElevenLabs v3

Learn how to optimize your website for Google Search's generative AI features, including official best practices, technical SEO advice, and emerging AI agent guidance.
Agents · Inference · Qwen3 +3 14 days ago

When is Your LLM Steerable?

Script Haiku 4 Voice Rime Mist v3
Agents · Dev Tools · Model Context Protocol +3 14 days ago

The Protocol That Cleaned Up Our Agent Architecture | Towards Data Science

Script Haiku 4 Voice Murf.AI Gen2

A detailed look at MCP that turned my scattered tool definitions into a stable, discoverable server
Multimodal · Dev Tools · Launch +4 14 days ago

JoyAI-VL-Interaction: Real-Time Vision-Language Interaction Intelligence

Script Haiku 4 Voice Hume Octave 2
Agents · Dev Tools · Launch +4 14 days ago

Conductor - Run parallel coding agents on your Mac

Script Haiku 4 Voice OpenAI TTS

Create parallel Claude Code, Codex, and Cursor agents in isolated workspaces. See at a glance what they're working on, then review and merge their changes.
New Models · Agents · Launch +3 14 days ago

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Script Haiku 4 Voice Deepgram Aura-2
Agents · Dev Tools · Tool 15 days ago

AI Agent Tool Design: What Works and What Doesn't

Script Haiku 4 Voice OpenAI TTS

In this article, we explore what makes AI agent tools work well and the common design mistakes that cause failures. Learn how tool design affects an agent's ability to complete tasks accurately and consistently.
New Models · Agents · Launch +4 15 days ago

Z.ai Launches GLM-5.2 With a Usable 1M-Token Context, Two Thinking-Effort Levels, and No Benchmarks at Launch

Script Haiku 4 Voice Inworld TTS 2

Z.ai launched GLM-5.2 on June 13, 2026, across every GLM Coding Plan tier. The headline is a usable 1-million-token context window plus High and Max effort levels. It drops into Claude Code, Cline, and OpenClaw through an Anthropic-compatible endpoint. No benchmarks shipped at launch, and MIT open weights are promised next week.
Training · Evals · Qwen3 +2 15 days ago

Smaller Models are Natural Explorers for Policy-Level Diversity in GRPO

Script GPT-5.5 Voice Inworld TTS 1.5 Mini
Agents · Multimodal · Gpt 5 Mini +3 15 days ago

LLM Agents Can See Code Repositories

Script GPT-5.4 Voice ElevenLabs v3
Agents · Dev Tools · Launch +3 15 days ago

Google Cloud Announces The Open Knowledge Format

Script GPT-5.4 Voice Rime Mist v3

Google Open Knowledge Format standardizes how organizational knowledge can be shared between AI agents, tools, and teams.
Dev Tools · Agents · Launch +4 15 days ago

Arrow.js: First UI Framework for AI Coding Agents | byteiota

Script Llama 4 Scout Voice Rime Arcana
Multimodal · Data Infra · Omnivideo 100k +3 15 days ago

OmniVideo-100K: A Dataset for Audio-Visual Reasoning through Structured Scripts and Evidence Chains

Script GPT-5.4 Voice Murf.AI Gen2
Inference · Training · Llama +2 15 days ago

Skip a Layer or Loop It? Learning Program-of-Layers in LLMs

Script GPT-5.4 Voice Hume Octave 2
Dev Tools · Agents · Ponytail +3 16 days ago

DietrichGebert/ponytail

Script GPT-5.4 Voice Deepgram Aura-2

Ponytail: He says nothing. He writes one line. It works.
AI Safety · Policy · Deprecation +4 17 days ago

Anthropic disables Fable and Mythos AI models after U.S. government bars it from giving foreigners access | Fortune

Script GPT-5.4 Voice OpenAI TTS

The directive would even bar Anthropic's own foreign employees from using Fable and Mythos. Anthropic called the government position "a misunderstanding".
Data Infra · Multimodal · Benchmark +4 18 days ago

PixelRAG beats text parsers, cuts agent costs 10x

Script Qwen 3.5 397B A17b Voice Inworld TTS 1.5 Max

UC Berkeley's PixelRAG renders pages as screenshots instead of parsing text, boosting RAG accuracy by up to 18.1% and cutting AI agent token costs 10x.
Announcement 18 days ago

Hold That Thought - We Actually Can Now

Script GPT-5.4 Voice ElevenLabs v3

An episode about every episode that came before it.
Dev Tools · Data Infra · Launch +4 18 days ago

saiyampathak.substack.com: a vm for every container apple ships

Script MiniMax M2.7 Voice Rime Mist v3
Inference · Agents · Latent Context Language Models Lclms +2 18 days ago

End-to-End Context Compression at Scale

Script GLM 5.1 Voice Murf.AI Gen2
Dev Tools · Multimodal · Launch +4 18 days ago

Apple Foundation Models

Script Haiku 4 Voice Hume Octave 2

Use Claude on Apple platforms through the Foundation Models framework with the Claude for Foundation Models Swift package.
Agents · Evals · Evoarena +2 18 days ago

EvoArena: Tracking Memory Evolution for Robust LLM Agents in Dynamic Environments

Script GPT-5.4 Voice Rime Arcana
Research 19 days ago

Core Mobile Vitals: Understand how your users feel about your app

No episode today

Mobile teams have been asking for a Core Web Vitals equivalent for years. The Core Mobile Vitals initiative is built using the same rigor, research, and user focus.
Inference · Multimodal · Lip Forcing +2 19 days ago

Lip Forcing: Few-Step Autoregressive Diffusion for Real-time Lip Synchronization

Script GPT-5.4 Voice OpenAI TTS