AI, Trust, and the Future of GKE

May 8, 2026

Guest:

  • Glen Messenger

Running AI workloads on Kubernetes creates a very different set of problems than running ordinary applications. Security boundaries, GPU-heavy infrastructure, and the pace of change all force platform teams to rethink what a cluster should do and who should trust it.

Glen Messenger, Group Product Manager for GKE at Google Cloud, explains how Google is approaching frontier AI security.

In this interview:

  • What still feels incomplete in AI-native infrastructure and GPU runtime support

  • Which Kubernetes operations are safest to hand over to AI agents first

  • What GKE Hypercluster is trying to solve for planetary-scale AI workloads

Subscribe to KubeFM Weekly

Get the latest Kubernetes videos delivered to your inbox every week.

Transcription

Bart Farrell: First things first, can you just say who you are, what's your role, and where do you work?

Glen Messenger: My name is Glen Messenger. I work as a group product manager at GKE, and I've been here for about eight years.

Bart Farrell: What are three emerging Kubernetes tools that you're keeping an eye on?

Glen Messenger: Most of the tools I look at are security tools, and with agentic AI coming on, there are no longer three. There are about 300. So when I think about security in this context, there's probably a lot from the CNCF or AI Illuminate that come through. A lot of these are very emergent, though; they're still not mature.

Bart Farrell: What are the biggest security concerns for people that are working with Frontier AI?

Glen Messenger: On GKE, the biggest concern is probably securing the model, not just from hackers, but also from the platform admin and from Google itself. For example, we've got a lot of customers who have spent billions of dollars training these models. They want to be able to deploy these models and model weights on GKE and protect them from their own platform admins, protect them from us, and protect them from hackers, but in a way that can be verified. So it's trust, but proven trust. The biggest issue we've had so far with accelerators at this scale of work is the ability to deploy workloads, verify the workload, and use attestation, and only once it's been cryptographically sealed do they actually pull down the model weights and put them into memory. That actually required us to change the architecture of GKE and Kubernetes quite substantially, because the Kubernetes of yesterday was not built for that much isolation. So we really had to create this new trusted computing base, or TCB, that was incredibly small. It's been a big challenge. The speed we're seeing from these frontier AI companies has really driven us to do that.
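The attestation-gated flow Glen describes can be sketched roughly as follows. This is a minimal, hypothetical illustration, not a real GKE or Confidential Computing API: the function names (`verify_attestation`, `load_model_weights`), the XOR "decryption," and the digest check all stand in for a real hardware quote, KMS key release, and signed measurement chain.

```python
import hashlib

# Illustrative only: the workload must pass attestation before the model
# weights are ever decrypted into memory.
EXPECTED_WORKLOAD_DIGEST = hashlib.sha256(b"model-server:v1").hexdigest()

def verify_attestation(quote: dict) -> bool:
    """Check the attestation quote against the expected workload measurement.
    A real verifier would validate a signed hardware quote, not a dict."""
    return quote.get("workload_digest") == EXPECTED_WORKLOAD_DIGEST

def load_model_weights(quote: dict, encrypted_weights: bytes) -> bytes:
    """Release (decrypt) the weights only after attestation succeeds."""
    if not verify_attestation(quote):
        raise PermissionError("attestation failed: weights not released")
    # Stand-in for decryption with a key released by a KMS upon attestation.
    return bytes(b ^ 0x42 for b in encrypted_weights)

# A workload presenting the correct measurement gets the weights...
good_quote = {"workload_digest": EXPECTED_WORKLOAD_DIGEST}
weights = load_model_weights(good_quote, bytes(b ^ 0x42 for b in b"WEIGHTS"))
assert weights == b"WEIGHTS"

# ...while a tampered workload is refused before any weights enter memory.
try:
    load_model_weights({"workload_digest": "tampered"}, b"\x00")
except PermissionError:
    pass
```

The key design point is the ordering: the encrypted weights are inert until attestation succeeds, so neither the platform admin nor the cloud provider can read them from a non-attested environment.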

Bart Farrell: With that in mind, for people that are approaching AI and Kubernetes, what are basic security concerns that they simply cannot forget?

Glen Messenger: Kubernetes was not built to be secure by default. It was built to orchestrate at scale. It was built to be usable. So one should not assume that Kubernetes itself has default security. GKE does. GKE had to build in a lot of least privilege and secure-by-default behavior; we had to ship a lot of opinions out of the box. Go in with the expectation that GKE is your partner and we do a lot of the security, but it's also self-driven as well.

Bart Farrell: Kubernetes turned 10 years old about two years ago. What can we expect in the next 10 years?

Glen Messenger: It'd be dangerous to give an estimation of what the next 10 years will be like, because I think we've seen in the past two years with AI how quickly things have changed and how quickly we've been wrong. We thought agentic was still going to be a few years off and it's just been accelerated. I think the thing that we're going to see is Kubernetes as a holistic thing being fragmented down. We'll probably see one control plane replaced by many agents. We'll probably see a swarm effect from agents when it comes to orchestration, for example.

Bart Farrell: Everyone talks about AI-native infrastructure. When you actually try to run GPU-heavy workloads on Kubernetes, what still feels hacked together?

Glen Messenger: It's a really good question. A lot of it still feels hacked together because Kubernetes isn't one thing; it's hundreds of things brought together. While some of them might be AI-native, some of them might not be. Look at the runtime support for GPUs if you wanted to use a trusted execution environment: TEEs were built to be enclaves, for isolation. They weren't built for AI workloads. We're slowly bringing those in. When we think of AI-native, it's having awareness of what the workload is trying to do, awareness of the runtime it's trying to run in, and awareness of how that operates through its lifecycle.

Bart Farrell: We're here at Google Cloud Next. What are the things that are most important for you in this conference?

Glen Messenger: Two things. Really understanding what customers are trying to achieve, and it's happening so fast that what they needed yesterday is not the same as what they'll need tomorrow. And secondly, really liaising with customers on how we're going to deliver things. In the past, there was a bit of a waterfall approach to delivering software: you would plan it, you would build it for six months, then you'd spend six months shipping it. These days, it's co-design: rapid iterations with customers, turning products or features around in a couple of weeks rather than six months or a year.

Bart Farrell: There is growing interest in using AI systems to operate Kubernetes rather than just generate YAML. Which cluster operations would you hand over first and which ones are too risky to automate at this point?

Glen Messenger: Agentifying the platform admin experience is pretty much what you're talking about. It's gaining a lot of popularity, and so it should. You shouldn't need to know YAML; you should be able to use human language to describe what you want to do. A lot of customers start with read-only operations: seeing an event happen, making corrections that don't terminate the workload, because that trust needs to be built over time. As confidence grows, you can allow the agents to do more, just not everything at once. You probably wouldn't let an agent scale workloads down to zero or kill a cluster, but you might let it do new node provisioning, things that aren't destructive in nature. It's not going to be a binary thing; it'll be a trust level that grows over time.
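The graduated-trust idea above can be sketched as a tiered allowlist: each trust level unlocks a wider set of cluster operations, starting read-only, with destructive actions never delegated at any level. The operation names and tiers here are hypothetical, chosen only to illustrate the pattern.

```python
# Illustrative trust tiers for an AI agent operating a cluster.
# Level 0: observe only; level 1: harmless corrections; level 2: additive changes.
ALLOWED_OPS_BY_TRUST = {
    0: {"get", "list", "watch"},
    1: {"get", "list", "watch", "annotate"},
    2: {"get", "list", "watch", "annotate", "provision_node"},
}

# Never delegated to an agent, regardless of accumulated trust.
DESTRUCTIVE_OPS = {"scale_to_zero", "delete_cluster"}

def agent_may(op: str, trust_level: int) -> bool:
    """Return True only if the operation is permitted at this trust level."""
    if op in DESTRUCTIVE_OPS:
        return False
    return op in ALLOWED_OPS_BY_TRUST.get(trust_level, set())

assert agent_may("list", 0)                  # read-only is the starting point
assert not agent_may("provision_node", 0)    # additive ops need earned trust
assert agent_may("provision_node", 2)
assert not agent_may("delete_cluster", 2)    # destructive ops stay denied
```

In practice the trust level would be adjusted over time as the agent's track record grows, which is the "not binary" point Glen makes.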

Bart Farrell: Looking towards the future, what are you going to be working on next?

Glen Messenger: I'm working on this thing called GKE Hypercluster, where we figured out a way of changing our architecture so we can deploy a cluster that has a million accelerator chips, and we can do this across regions. Right now there is a way to spin up GKE where you can have compute in us-central1, us-west1, and europe-east1, for example, and attach those to one cluster. So if you're a frontier AI company and you cannot get enough accelerators in one geographical location, we've solved that problem. At the same time, with the same infrastructure, the same scale, and the same control plane, you can enable zero-trust security that isolates those workloads. These two things were incredibly difficult to do by themselves and usually couldn't be done together, and now they're delivered as one. That's probably going to be my focus for the next six to 12 months.
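A toy model of the idea described above: one logical cluster whose accelerator capacity is stitched together from node pools in several regions under a single control plane. The region names and pool sizes are illustrative only; this is not the GKE Hypercluster API, which the interview does not detail.

```python
from dataclasses import dataclass, field

@dataclass
class NodePool:
    region: str
    accelerators: int

@dataclass
class Cluster:
    """One logical cluster aggregating capacity across regions."""
    name: str
    pools: list = field(default_factory=list)

    def attach(self, pool: NodePool) -> None:
        self.pools.append(pool)

    def total_accelerators(self) -> int:
        return sum(p.accelerators for p in self.pools)

# Hypothetical regional pools attached to a single control plane.
cluster = Cluster("frontier-training")
for region, count in [("us-central1", 400_000),
                      ("us-west1", 350_000),
                      ("europe-east1", 250_000)]:
    cluster.attach(NodePool(region, count))

assert cluster.total_accelerators() == 1_000_000  # one million chips, one cluster
```

The point the sketch captures is that no single region needs to hold the full accelerator budget; the cluster abstraction sums capacity across geographies.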

Bart Farrell: If people want to get in touch with you, what's the best way to do that?

Glen Messenger: Email me at gmessenger@google.com, or find me on LinkedIn: Glen Messenger. I'm happy to take questions. I love to ask dumb questions as well, so whoever contacts me should be prepared for me to come back with quite a lot of questions.

Bart Farrell: If there's one thing that AI could do for you on Kubernetes to make your life easier what would it be?

Glen Messenger: Give me a crystal ball for what we should be doing next. One of the struggles we have is that we want to know where we need to be in six months, and the way to do that is to ask customers, but we also want to respect privacy. We're trying to understand how things are emerging in the market without going past those boundaries. I think AI at the moment is doing exactly what it's meant to do. We've just got to catch up and make sure we're using AI right to understand what we need to do.
