What’s inside

A position paper for platform and infrastructure teams.

Three linked hypotheses, a survey of what Kubernetes already automates, and a head-to-head framework for evaluating an MCP control plane on accelerated infrastructure.

The heterogeneity tax

The performance gap between a portable baseline and a deployment tuned for each accelerator class and job purpose — well documented across GPU fleets.

H1 · A real performance gap

Mixed fleets leave throughput and tail latency on the table when configuration stays static while GPU work keeps changing.

H2 · A closed adaptation loop

A policy-guarded monitor, decide, act loop replaces declarative reconcile as the primary control law for accelerated compute.

H3 · The sustain burden

Sustaining per-workload tuning through Kubernetes-shaped operations exceeds human scale as hardware, frameworks, and traffic drift.