Google Cloud Donates llm-d, TPU Drivers, and More to CNCF

Apr 2, 2026

Guest:

  • Abdel Sghiouar

Choosing between GKE Standard and Autopilot has been an ongoing trade-off for teams — full control versus managed simplicity. And running AI workloads on Kubernetes often means dealing with proprietary hardware drivers and fragmented tooling.

Abdel Sghiouar, Senior Developer Advocate at Google Cloud, walks through the announcements made at KubeCon Europe Amsterdam 2026. GKE Autopilot is now available inside Standard mode, so you no longer have to choose: you can enable it within an existing Standard cluster and use compute classes to move workloads between modes. On the open-source front, Google, Red Hat, and NVIDIA donated llm-d (architectural patterns for LLM inference on hardware accelerators in Kubernetes) to the CNCF, alongside open-sourcing TPU and GPU drivers for the Dynamic Resource Allocation (DRA) API. There's also an open source MCP server for GKE aimed at agentic AI tooling, an Agent Sandbox project that wraps AI code execution in gVisor or Kata Containers, Ray on TPUs, and GKE becoming certified under CNCF's AI conformance program.
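The compute-class selection described above builds on a mechanism GKE Autopilot already exposes: workloads opt into a compute class via a node selector. As a hedged sketch only (the Deployment name and image are placeholders, `Balanced` is an existing built-in Autopilot compute class, and the exact selector used by the new combined mode may differ), it looks roughly like this:

```yaml
# Hypothetical sketch: a Deployment opting into a GKE compute class.
# "Balanced" is an existing built-in Autopilot compute class; the
# selector for moving workloads between Standard and Autopilot modes
# in the new combined setup may differ from what is shown here.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: inference-api        # placeholder name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: inference-api
  template:
    metadata:
      labels:
        app: inference-api
    spec:
      nodeSelector:
        cloud.google.com/compute-class: "Balanced"  # opt this workload into a compute class
      containers:
        - name: server
          image: us-docker.pkg.dev/example/inference:latest  # placeholder image
          resources:
            requests:
              cpu: "500m"
              memory: "1Gi"
```

Changing the `cloud.google.com/compute-class` value (or removing the selector) is the kind of per-workload switch that lets the same cluster mix Autopilot-managed and Standard-managed workloads.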

Most of these tools are vendor-neutral — llm-d and Agent Sandbox work on any Kubernetes platform — and contributions go upstream into the community so they benefit everyone, not just GKE users.

Read the full announcement

Subscribe to KubeFM Weekly

Get the latest Kubernetes videos delivered to your inbox every week.


Transcription

Bart Farrell: So, first things first, who are you, what's your role, and where do you work?

Abdel Sghiouar: Hi everyone, my name is Abdel, I'm a developer advocate with Google Cloud.

Bart Farrell: What news do you want to share with our audience today?

Abdel Sghiouar: News, plural. We have a lot of it. We announced a bunch of things today at KubeCon Europe Amsterdam 2026. First and foremost, the most exciting one is that Autopilot is now available inside GKE Standard mode, so you don't have to choose between the two modes anymore. You can create your GKE Standard cluster, enable Autopilot mode inside it, and then use the provided compute classes to move your workloads back and forth depending on their compatibility.

Then we announced that we are donating the TPU drivers for the DRA, the Dynamic Resource Allocation API, to the CNCF, which means making them open source. At the same time, NVIDIA announced that they are donating the GPU ones. This should help drive DRA adoption further, because now that the drivers are open source you don't have to worry about vendor lock-in anymore.

The llm-d project has also been donated to the CNCF; the announcement was made jointly by us, Red Hat, and NVIDIA today. llm-d, for those who don't know, is essentially a set of architectural patterns for optimizing large language model inference on Kubernetes on specific hardware accelerators. Now that llm-d is part of the CNCF, that should drive adoption for the project as well.

Then, everybody's talking about MCP, so we have an MCP server too, because why not? We have an open source MCP server for GKE that allows you to control GKE remotely using an agent. You can integrate it into your favorite agentic AI tool, like Antigravity, Gemini CLI, Cursor, whatever, and then use it to drive operations within GKE.

Speaking of agents, the next one is Agent Sandbox, a project we actually open sourced last year. It allows you to sandbox code execution for code generated by agents. If you don't know this already, agents are untrusted because they run on non-deterministic language models, so you shouldn't trust the code generated by an agent. With Agent Sandbox, you can wrap that code inside a pod protected by either gVisor or Kata Containers, so that the agent's code execution is safe.

We have also made Ray run on TPUs, which is super exciting for those of you who use Ray. And the last thing is the AI conformance program from the CNCF, which was also announced last year. There was a demo of it during the keynote, so I highly recommend you check out the keynote. And if you are already on GKE, it is already AI conformant, so you don't have to do anything.
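The gVisor isolation Abdel describes maps onto the standard Kubernetes RuntimeClass mechanism, which GKE Sandbox exposes as a `gvisor` runtime class. As a hedged sketch only (the pod name and script path are made up, and Agent Sandbox automates this wiring rather than requiring you to write it by hand), a sandboxed pod for agent-generated code looks roughly like:

```yaml
# Hypothetical sketch: running untrusted agent-generated code in a
# gVisor-isolated pod via RuntimeClass. On GKE, enabling GKE Sandbox
# provides a "gvisor" RuntimeClass; Agent Sandbox wraps this pattern
# so you don't have to assemble it manually.
apiVersion: v1
kind: Pod
metadata:
  name: agent-code-runner     # placeholder name
spec:
  runtimeClassName: gvisor    # execute this pod under the gVisor sandbox
  restartPolicy: Never
  containers:
    - name: runner
      image: python:3.12-slim
      command: ["python", "/workspace/agent_output.py"]  # the untrusted, agent-generated script
      resources:
        limits:
          cpu: "1"
          memory: "512Mi"
```

Swapping `runtimeClassName: gvisor` for a Kata Containers runtime class is how the same pattern works on platforms that use Kata instead, which is what keeps the project vendor neutral.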

Bart Farrell: How do these announcements change the landscape compared to what existed before?

Abdel Sghiouar: Massively. Basically, all of these announcements are going upstream. And when things make it upstream, they become part of Kubernetes and available to the community, so people can use them. Eventually they benefit everyone. I know we live in a world where everyone is talking about AI, but a lot of these features will eventually benefit even people who don't run AI.

Bart Farrell: In terms of the stuff that you've talked about, are we talking about things that are open source and if so how do they fit into the CNCF landscape?

Abdel Sghiouar: Most of these projects are open source. Some of them are part of the CNCF, like llm-d and the AI conformance work; the rest is open source, which means the code is available on GitHub and you can use it. Most of it you can even adjust to use with other platforms. So it benefits the CNCF ecosystem and the landscape, and it also benefits everyone who uses Kubernetes as an open source project.

Bart Farrell: And can you break down Google's business model and pricing structure when it comes to the announcements that you've just mentioned for teams that are out there evaluating these solutions?

Abdel Sghiouar: What we're trying to do is drive the Kubernetes project and its ecosystem forward in such a way that everybody benefits. And alongside that, we're trying to make GKE the best place to run these systems, so that people use our platform and benefit from them. I'm not going to go into a detailed price breakdown, but ultimately our business benefits when the community benefits, and that's what we're trying to achieve.

Bart Farrell: And when people are exploring this space, which alternative solutions might they be considering alongside yours?

Abdel Sghiouar: Kubernetes is open source, so you can technically run Kubernetes anywhere. llm-d itself, for example, is vendor neutral. Agent Sandbox is also vendor neutral, and it works with Kata Containers, so you don't even have to use gVisor. Most of these things can run on other platforms, because they are available in the open-source space, or you can swap in alternatives. One of the things I really like about the CNCF ecosystem is that components are almost plug-and-play: they are modular and API-driven, so you can swap components in and out as you see fit for your business.

Bart Farrell: Looking ahead, what developments can our audience anticipate from Google around the things you shared today?

Abdel Sghiouar: If only I were allowed to share some of it. There are actually a lot more exciting announcements coming at Google Cloud Next in Vegas in a few weeks, all in the AI and agentic space. What I would say is: keep an eye on our blog and our social media, and you will learn more. That's all I can say.

Bart Farrell: How can interested listeners connect with you to learn more, get started?

Abdel Sghiouar: Follow me on social media: I'm Abdel Sghiouar, or BoredAbdel on Twitter. And follow the official Google Cloud Twitter account, Google Cloud Tech, and you will learn more.
