Exploring Next

Exploring Next — Ep 412 w/ Justy & Cody — MetaAgent-X: Breaking the Ceiling of Automatic Multi-Agent Systems via End-to-End Reinforcement Learning

Justy and Cody discuss MetaAgent-X, a new paper proposing end-to-end reinforcement learning for multi-agent systems. They break down how it solves the 'frozen-executor ceiling' by jointly optimizing both the agent that designs the workflow and the agents that execute it. Cody explains the hierarchical rollout mechanism and stagewise co-evolution, while Justy explores what this means for production pipelines that currently rely on static prompts. They touch on the 21.7% performance gains, the reality of training stability, and whether this moves us from 'prompt engineering' to actual 'system engineering.'

