Custom Metrics for Better Autoscaling

Feb 2, 2026

Guest:

  • Nicholas Eberts

Still relying on CPU and memory metrics for your Kubernetes autoscaling? These aren't always the best proxies for actual demand.

Nicholas Eberts, Product Manager at Google working on GKE, explains why custom metrics deliver better efficiency and how tools like KEDA reduce the complexity of getting there.

In this interview:

  • Why CPU and memory are "not real metrics" for saturation

  • How kubectl AI helps analyze clusters with local inferencing

  • The lift required for custom metrics adapters vs. KEDA's out-of-the-box approach

  • How AI/ML workloads are pushing Kubernetes to evolve (DRA, scheduler changes)

Transcription

Bart Farrell: Who are you, what's your role, and where do you work?

Nicholas Eberts: I'm Nick Eberts. I'm a product manager. I work on GKE over at Google.

Bart Farrell: And what are three emerging Kubernetes tools that you're keeping an eye on?

Nicholas Eberts: For one, if you haven't checked it out, you should check out kubectl-ai. It's all the fancy agent stuff that you want, basically an SRE in a CLI tool, which we open sourced last spring. It's kinda neat. You can run it with local inferencing too, so you can run it against a local model, or against Sonnet; it doesn't need to be Gemini or ChatGPT. It's a pretty great tool if you wanna analyze and inspect any number of clusters in your fleet.

Bart Farrell: Any others? Or we're good with that?

Nicholas Eberts: I don't have any off the top of my head.

Bart Farrell: One of our podcast guests, Brian, identified two problems with traditional HPA. First, these metrics aren't always the best proxies for demand. And second, these metrics in HPA react more slowly than event-based triggers. What are your thoughts on using CPU and memory metrics for scaling decisions?

Nicholas Eberts: Yeah, so those are really tough metrics to rely on for accurate saturation, right? The idea is that when you use an HPA, you wanna actually utilize the computers that you're paying for. It turns out that CPU and memory are not really "real" metrics for saturation. You want to use something that's a little closer to what it means to saturate your pod or your container. In a lot of cases the easy button is requests per second, but you could do all kinds of fancy things with specific metrics that you implement in your application and export. You can get way more efficiency out of custom metrics than you'll ever get out of CPU and memory.
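The "easy button" Nick mentions, scaling on requests per second rather than CPU, looks roughly like this as an HPA manifest. This is a minimal sketch: the metric name `http_requests_per_second`, the deployment name, and the target value are all illustrative, and it assumes a metrics adapter (for example the Prometheus adapter) is already serving the `custom.metrics.k8s.io` API.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa                 # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web                   # hypothetical deployment
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second   # exported by the app, surfaced by a metrics adapter
      target:
        type: AverageValue
        averageValue: "100"              # aim for ~100 RPS per pod
```

The HPA divides total observed RPS by the per-pod target to pick a replica count, which tracks actual demand far more directly than a CPU percentage.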

Bart Farrell: Now, one of our other guests, Zain, mentioned that KEDA is pretty underrated.

Nicholas Eberts: Yeah.

Bart Farrell: It has great potential if used rightfully, especially for custom metrics autoscaling. What has your experience been with custom metrics autoscaling?

Nicholas Eberts: Listen, custom metrics adapters are great. The Prometheus metrics adapter is great. But there's a significant lift to get it running, so it's not easy to go from zero to autoscaling on a custom metric. No one's really solved that problem or made it super easy. What I like about KEDA and these other frameworks that build on top of Kubernetes and give you more of a platform feel is that you get all of that stuff just by installing KEDA. You don't have to worry about installing a metrics adapter and wiring up the exports; it all comes with the tool itself, and you can get much more intelligent scaling out of the box based on whatever the proxies you're spinning up for your application are exporting. Super handy. I think the superpower is getting there faster with less complexity.
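The "it comes with the tool" point can be made concrete with a KEDA ScaledObject. This is a sketch, assuming KEDA is installed in the cluster and a Prometheus server is reachable at the address shown; the deployment name, query, and threshold are hypothetical:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: web-scaler               # hypothetical name
spec:
  scaleTargetRef:
    name: web                    # hypothetical deployment
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus.monitoring:9090   # assumed in-cluster address
      query: sum(rate(http_requests_total[1m]))          # hypothetical metric
      threshold: "100"
```

KEDA ships its own metrics adapter and creates the underlying HPA for you, which is the "less complexity" part: you declare the trigger, and the zero-to-autoscaling plumbing is handled.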

Bart Farrell: Kubernetes turned 10 years old last year. What can we expect in the next 10 years to come?

Nicholas Eberts: Kubernetes 2.0. Let's go. (laughs) No, in all seriousness, the AI/ML space is pushing Kubernetes to adapt in ways that it hasn't had to historically, so I think you're gonna see a lot of movement back into the scheduler. If you look at DRA (Dynamic Resource Allocation) and how it deals with representing these disparate hardware classes within a deployment, all these things are gonna shape Kubernetes 2.0. It's gonna be interesting to see how far it can keep going and evolving with the industry norms and all the new buzzwords that seem to pop up every month.

Bart Farrell: What's next for you, Nick?

Nicholas Eberts: So what's next for me is to hang out with this guy at KubeJam on Wednesday, which is really the only reason I come to KubeCon. But no, in reality, in my PM job I'm looking hard at AI/ML training and inferencing, figuring out how to help customers do it efficiently, and deciding what needs to get moved into upstream versus just run natively on GKE. So that's kinda my job, and that's what I'm getting paid for, so let's go.

Bart Farrell: And how can people get in touch with you?

Nicholas Eberts: You can get me on the LinkedIns. You can get me on the Blueskies. That's it. Yeah. And then I have a YouTube channel where I just play guitar. You can get me there too.

Bart Farrell: Can you do, uh, a shout-out or, you know, a plug for the Dadbeats?

Nicholas Eberts: For the Dadbeats? Y'all, if you wanna see mediocre music played by a musician with even less talent, come see my band, the Dadbeats, when you're in Atlanta.

Bart Farrell: Okay, good.

Nicholas Eberts: We've got a hot event coming up. We're playing at my kid's school faculty party. Let's go.
