Kubernetes Autoscaling and FinOps Best Practices

Feb 20, 2026

Guest:

  • Amin Astaneh

Kubernetes resource optimization requires more than just monitoring: you need the right tools and strategies to reduce waste, manage GPU costs, and balance performance with flexibility.

Amin shares his experience with VPA, Node Autoscaler, and emerging tools for AI-enabled cluster sizing and FinOps practices.

In this interview, you'll learn:

  • Cost optimization strategies for cloud and on-prem clusters

  • GPU optimization approaches using DCGM metrics

  • Reducing resource waste through proper sizing and FinOps

Amin offers a contrarian view: instead of a "no ops" future, he expects specialized tools for specific use cases, with the base scheduler solving 80% of use cases.

Transcription

Bart Farrell: First things first, who are you, what's your role, and where do you work?

Amin Astaneh: Hey there, I'm Amin Astaneh. I'm an instructor for LearnKube. I'm also a consultant for Certo Modo.

Bart Farrell: What are three emerging Kubernetes tools that you're keeping an eye on?

Amin Astaneh: Really interesting question. I think we're going to see more AI-enabled sizing of clusters, making it easier for people to run their infrastructure without thinking about it too much. I'm also concerned about AI SRE and how it affects the way we handle incidents and production issues when running Kubernetes clusters.

Bart Farrell: Our podcast guest Thibault Jamet said that before we had the Vertical Pod Autoscaler, resource management was a nightmare. What's your experience with resource management challenges before implementing automation?

Amin Astaneh: Yeah, it was a nightmare for me, too, because you have the manual effort to monitor and then the manual effort to respond and actually resize your workloads. Doing that in response to an incident is really tedious, and it's really stressful.

Bart Farrell: Another one of our podcast guests, Mark Campora, spoke on the topic of Kubernetes cost optimization. Mark thinks with Kubernetes, it's quite easy to pay more than necessary because you pay for allocated or provisioned infrastructure, machines you start that are often underused. What strategies do you use to optimize Kubernetes costs?

Amin Astaneh: Right. So let's talk about the basics, which is the Node Autoscaler. It's available on most clouds, and it lets you dynamically size your cluster based on utilization. But you also have to make sure that your workloads actually fit inside the resource requests and limits that you specify. You can do that yourself, but if you have a really large cluster, it might make sense to talk to some of the vendors that offer an automated mechanism to do that for you.
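As a rough illustration of the sizing check Amin describes, here is a hedged Python sketch. The pod names and metric values are hypothetical; in practice you would pull usage from metrics-server or Prometheus. It simply compares each pod's observed CPU usage to its declared request:

```python
def sizing_report(pods):
    """Classify pods by how their observed usage compares to their request.

    `pods` maps pod name -> dict with 'cpu_request' and 'cpu_usage',
    both in millicores. Pods using more than they request risk CPU
    throttling; pods using far less are reserving capacity the Node
    Autoscaler still has to pay for in node count.
    """
    report = {}
    for name, m in pods.items():
        ratio = m["cpu_usage"] / m["cpu_request"]
        if ratio > 1.0:
            report[name] = "under-provisioned"   # doesn't fit its request
        elif ratio < 0.5:
            report[name] = "over-provisioned"    # candidate for right-sizing
        else:
            report[name] = "ok"
    return report

# Hypothetical snapshot of two workloads
pods = {
    "api":    {"cpu_request": 500,  "cpu_usage": 620},
    "worker": {"cpu_request": 2000, "cpu_usage": 300},
}
print(sizing_report(pods))
# {'api': 'under-provisioned', 'worker': 'over-provisioned'}
```

The 0.5 cutoff is an arbitrary illustration; real right-sizing tools look at usage percentiles over a window rather than a single snapshot.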

Bart Farrell: Another podcast guest of ours, Dave Masselink, spoke about carbon- and price-aware scheduling. Dave observed that GPUs are such energy-intensive devices that energy was considered from the ground up, and that NVIDIA DCGM gives pretty good estimates of full-card power. What GPU metrics do teams overlook that matter for optimization? Is utilization even the right primary metric anymore?

Amin Astaneh: Yeah, well, here's the thing: what is old is new again. Twenty years ago, we had grid compute, high-performance computing, supercompute clusters, and they had the exact same problems. So we have to look at this from the perspective of the application side, not just the hardware side. We think about the golden signals: throughput through the system, latency. So unfortunately, what those performance engineers were talking about becomes really important again for expensive hardware like GPUs.
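Amin's point that coarse utilization alone can mislead can be sketched in Python. The field names referenced in the comments (DCGM_FI_DEV_GPU_UTIL, DCGM_FI_PROF_SM_ACTIVE) are metrics exposed by NVIDIA's DCGM exporter, but the sample values and thresholds here are hypothetical:

```python
def flag_busy_but_starved_gpus(samples, util_threshold=0.8, sm_threshold=0.3):
    """Flag GPUs that look busy by the coarse utilization counter but
    whose streaming multiprocessors are mostly idle.

    `samples` maps GPU id -> dict with:
      'gpu_util'  - fraction of time any kernel was resident on the card
                    (roughly DCGM's DCGM_FI_DEV_GPU_UTIL / 100)
      'sm_active' - fraction of SMs actually doing work
                    (roughly DCGM_FI_PROF_SM_ACTIVE)
    A large gap between the two usually means the card is blocked on
    data loading or host-side work rather than compute.
    """
    return [
        gpu for gpu, m in samples.items()
        if m["gpu_util"] >= util_threshold and m["sm_active"] < sm_threshold
    ]

samples = {
    "gpu-0": {"gpu_util": 0.95, "sm_active": 0.12},  # "busy" but starved
    "gpu-1": {"gpu_util": 0.90, "sm_active": 0.85},  # genuinely working
}
print(flag_busy_but_starved_gpus(samples))  # ['gpu-0']
```

This is the application-side framing Amin describes: pair device counters with throughput and latency from the workload itself before concluding a GPU is well used.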

Bart Farrell: Another podcast guest of ours, Zane Malik, spoke about resource waste. Zane observed that there's a lot of waste happening on CPU and GPU, and a significant opportunity to optimize it. What approaches do you take to reduce resource waste in your clusters?

Amin Astaneh: Right. So we have to answer two questions. What is the utilization of your cluster? I think that's a very common one. But also, once again, what is the utilization of your pods within the resources that you allocate? You have to be able to calculate both. Another thing we have to think about is whether you're even running on the correct instance type. And as part of that, if you're running a certain set of infrastructure all the time, are you doing some FinOps? Are you using reserved instances? Again, there are vendors in this space that can help you figure out the correct instance type based on historical figures.
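The two questions Amin poses, cluster-level utilization and pod-level utilization, can be computed together. A minimal Python sketch, using hypothetical millicore figures for one node:

```python
def utilization_levels(node_allocatable, pods):
    """Answer both waste questions at once.

    Cluster level: how much of the node's allocatable CPU is reserved
    by pod requests, and how much is actually used? Pod level: how
    much of each pod's reservation is used? All values in millicores.
    """
    total_requests = sum(p["request"] for p in pods.values())
    total_usage = sum(p["usage"] for p in pods.values())
    return {
        "cluster_reserved": total_requests / node_allocatable,
        "cluster_used": total_usage / node_allocatable,
        "pod_efficiency": {
            name: p["usage"] / p["request"] for name, p in pods.items()
        },
    }

pods = {
    "web": {"request": 1000, "usage": 250},
    "db":  {"request": 2000, "usage": 1500},
}
levels = utilization_levels(4000, pods)
print(levels["cluster_reserved"])  # 0.75: the node looks nearly full...
print(levels["cluster_used"])      # 0.4375: ...but much of that is reserved air
```

The gap between the two cluster numbers is exactly the waste Zane describes: capacity that is paid for and reserved, but never used.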

Bart Farrell: Our guest Kensei Nakada, on the subject of Tortoise autoscaling, said he'd been looking forward to in-place pod resizing for years. Karpenter has turned 1.0. It's an exciting time for scaling and resource optimization in Kubernetes. Where do you see this field evolving in the future?

Amin Astaneh: That's very interesting. I actually have a contrarian view on this. If you remember NoOps or serverless from 10 years ago, I'm expecting a world where application developers and even cluster operators are just writing up their workloads, submitting them to an API endpoint, and the pods land wherever they land. You're not managing the hardware anymore.

Bart Farrell: Our guest Yue Yin spoke about Gödel and Katalyst achieving 60% utilization. Yue shared: "We think the kube-scheduler could approach higher performance levels with further optimizations. However, there will be inherent trade-offs, because the kube-scheduler prioritizes flexibility, extensibility, and portability." What's your perspective on this performance-versus-flexibility trade-off in Kubernetes?

Amin Astaneh: Again, I have a contrarian view. I expect a future that follows the Unix philosophy: do one thing, and do it well. So I'm expecting a larger and larger ecosystem of alternative schedulers for Kubernetes that power users can choose for their given workloads, while the base scheduler solves 80% of most people's problems. Hopefully there will be a more golden path, a happy path, to building a Kubernetes cluster using a recommended set of tools, to make it easier for people to get started. There's a lot of complexity and a lot of options, and it might be easier for people to just have recommended ones to use.

Bart Farrell: What's next for you?

Amin Astaneh: What's next for me? Well, hopefully doing a heck of a lot more trainings for LearnKube, as well as finding more clients that need help with operations, SRE, and reliability.

Bart Farrell: How can people get in touch with you?

Amin Astaneh: Yeah, you can reach me at Certo Modo. That's C-E-R-T-O-M-O-D-O.io. I'm also Amin Astaneh on LinkedIn.
