Exploring Next

Exploring Next — Ep 371 w/ Justy & Cody — How NetEase Games cut LLM cold starts from 42 minutes to 30 seconds

Justy and Cody unpack how NetEase Games used Kubernetes-native data orchestration with Fluid to shrink LLM inference cold starts from 42 minutes to about 30 seconds, and what that means for teams running their own models.
