Smarter Kubernetes Autoscaling on GKE

Apr 17, 2026

Guest:

  • Roman Arcea

Are your Kubernetes workloads still over-provisioned, hard to autoscale, or too manual to tune with confidence?

Roman Arcea, Group Product Manager for Google Kubernetes Engine, explains how newer Kubernetes capabilities can reduce waste, improve responsiveness, and make platform operations less hands-on.

In this interview:

  • Why in-place pod resize and Vertical Pod Autoscaler change resource management

  • How custom metrics, HPA, and KEDA affect autoscaling behavior

  • Where teams still waste CPU and GPU capacity in Kubernetes clusters

Subscribe to KubeFM Weekly

Get the latest Kubernetes videos delivered to your inbox every week.

Transcription

Bart Farrell: Who are you, what's your role, and where do you work?

Roman Arcea: I'm with Google. My name is Roman Arcea. I'm a group product manager, and I cover Google Kubernetes Engine.

Bart Farrell: What are three emerging Kubernetes tools that you're keeping an eye on?

Roman Arcea: Wow, there is so much happening in Kubernetes. It starts at the most basic level, with what we do for infrastructure provisioning now: the likes of custom compute classes, and the capabilities you're seeing emerge in the market for falling back between various VM types. I'm also very impressed by the tools emerging in the area of automated resource management, like in-place pod resize. And completely outside Kubernetes, I think large language models and what's happening now with vibe coding are very interesting for Kubernetes, mostly because they remove the barriers to learning and understanding how to use Kubernetes best.
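For readers who haven't met custom compute classes: in GKE they are expressed as a ComputeClass resource that lists machine-family priorities, so the cluster can fall back to the next option when the preferred capacity is unavailable. A minimal sketch; the class name and machine families are illustrative, and field names should be checked against the current GKE documentation:

```yaml
# Illustrative GKE ComputeClass: prefer Spot n2 nodes,
# fall back to on-demand n2, then e2, when capacity is short.
apiVersion: cloud.google.com/v1
kind: ComputeClass
metadata:
  name: cost-optimized-fallback   # hypothetical name
spec:
  priorities:
    - machineFamily: n2
      spot: true
    - machineFamily: n2
    - machineFamily: e2
```

A workload would then opt in by selecting the class, e.g. with a `cloud.google.com/compute-class: cost-optimized-fallback` nodeSelector.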

Bart Farrell: One of our podcast guests, Thibault, said that before we had the Vertical Pod Autoscaler, resource management was a nightmare. What is your experience with resource management challenges before implementing automation?

Roman Arcea: Glad that you're asking that, because for the entire last year our team has worked relentlessly with the community to bring in-place pod resize to the Vertical Pod Autoscaler. We launched in-place pod resize in the summer, then brought VPA support for it to the market, and we've just launched it inside Google Kubernetes Engine as well. For the first time ever, it allows you to resize the resources of your workloads without having to restart them. It's a complete game changer for resource management and workload autoscaling. If you haven't touched that space before, if you haven't researched IPPR with VPA, I really encourage you to do so. It's a life-changing technology.
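As a concrete sketch of what he describes: with in-place pod resize, VPA gains an update mode that tries to apply new resource recommendations without evicting the pod. The Deployment name below is hypothetical, and the mode name should be verified against the VPA version you run:

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-vpa            # hypothetical
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical workload
  updatePolicy:
    # Attempt an in-place resize first; fall back to
    # recreating the pod only if the resize cannot apply.
    updateMode: "InPlaceOrRecreate"
```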

Bart Farrell: Our podcast guest Zain mentioned that KEDA is pretty underrated. Used rightly, it has great potential, especially for custom metrics autoscaling. What has your experience been with custom metrics autoscaling?

Roman Arcea: A spot-on question, right in my garden. So here's the thing with custom metrics: they were always an afterthought in Kubernetes. Kubernetes was designed with simplicity in mind, so we started the journey by scaling workloads on CPU and memory, and that has been the main way people approach this. CPU and memory are proxy metrics that are generally acceptable for usable autoscaling, so custom metrics remained very complicated. I can tell you that the market is developing very actively in this space. Just last month, as part of Google Kubernetes Engine, we released a completely reworked stack for horizontal pod autoscaling with custom metrics that is now native and embedded. With the old stack, it would take around 90 seconds for custom metrics to propagate, react, and scale your application. The new stack is significantly faster than that. I encourage you to go and explore the new HPA custom metrics native stack and what it will do for your application.
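For context, a custom-metric HPA in the standard autoscaling/v2 API looks like the sketch below; KEDA's ScaledObject ultimately generates an HPA of this shape. The metric name, target value, and workload name are hypothetical:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa            # hypothetical
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical workload
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Pods
      pods:
        metric:
          name: requests_per_second   # hypothetical custom metric
        target:
          # Scale so each pod averages 100 requests per second.
          type: AverageValue
          averageValue: "100"
```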

Bart Farrell: Zain also observed there's a lot of waste happening on CPU and GPU and there's a significant opportunity to optimize this. What approaches do you take to reduce resource waste in your clusters?

Roman Arcea: A lot of this has to do with the fact that users have had to make all those resource-sizing choices manually. Historically, if you were getting started with Kubernetes, you would have to choose your VM size and your GPU size, very often completely independently of what your workload needs. And very often in large organizations, the teams who provision infrastructure and make those resource choices are not the same people who deploy and size the workloads. So a lot of the waste comes from the mismatch between the workload layer and the infrastructure layer. Kubernetes has evolved tremendously over the last couple of years to address that. For example, the modern way to provision infrastructure resources, which not everyone has adopted yet, is node auto-provisioning, or node auto-creation, where nodes are created and sized dynamically based on what your workloads need. If you've not explored what's new in Kubernetes over the last one or two years, I very much encourage you to look at the tools that are there for automated resource management, because Kubernetes has evolved a lot. If you started your journey with Kubernetes two, three, or four years ago, the current landscape of capabilities is drastically different. We believe you could cut your waste meaningfully just by adopting the latest and greatest in Kubernetes and GKE.
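The fix he alludes to starts with the workload declaring what it needs; node auto-provisioning can then size nodes from those requests instead of from a pre-chosen VM shape. A minimal, hypothetical example of such a declaration (names and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api                # hypothetical
spec:
  replicas: 3
  selector:
    matchLabels: { app: api }
  template:
    metadata:
      labels: { app: api }
    spec:
      containers:
        - name: api
          image: example.com/api:1.0   # hypothetical image
          resources:
            # Node auto-provisioning sizes nodes from these requests,
            # instead of you picking a VM shape up front.
            requests:
              cpu: "500m"
              memory: 512Mi
            limits:
              memory: 512Mi
```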

Bart Farrell: Kubernetes turned 10 years old a couple of years ago; it'll be twelve in June. What can we expect in the next 10 years to come?

Roman Arcea: My hope is that we can make Kubernetes completely transparent. That's not a great thing to say for a product manager, but I really believe that if you're in the business of writing SaaS, or maybe you're a chocolate maker, or maybe you're producing engines, infrastructure has to be completely transparent. You should be able to run your business on Kubernetes without ever getting a PhD in managing Kubernetes and the applications on it. So my hope is that we can become that universal substrate, the open source standard for distributed systems. The fact that Kubernetes is an open source project allows us to push it to become the standardized API for distributed infrastructure consumption going forward.

Bart Farrell: And what's next for you?

Roman Arcea: Well, next for me, I'm definitely sticking around at Google. We have an amazing lineup of things coming out, from a big team that is completely changing the face of Kubernetes. In my opinion, over the next one or two years, it's going to transform how we think about and work with Kubernetes in any business.

Bart Farrell: And how can people get in touch with you?

Roman Arcea: Well, hit me up on LinkedIn. You can always find me. I'm always happy to reply and have a conversation, exchange thoughts. And of course, if you are one of the users of Google Kubernetes Engine, always get in touch with us. We'd love to see how we can be more helpful.
