Kubernetes resource optimization: from autoscaling to sustainability

Guest:

  • Josh Cypher

Learn how Kubernetes is evolving towards efficient resource management and sustainable operations.

Josh Cypher, Senior DevOps Engineer at Sonos, discusses:

  • How Karpenter is revolutionizing autoscaling with workload-based constraints, delivering 30-50% better efficiency in resource scheduling and offering extensible APIs for predictive analysis

  • The growing focus on Kubernetes sustainability, moving away from over-provisioning towards data-driven resource allocation through proper testing and monitoring

  • The future of Kubernetes, including the need for standardization across cloud providers and the integration of AI technologies for improved efficiency

Transcription

Bart: The host is Bart Farrell. The speaker is Josh Cypher, who works for Sonos, Inc.

Josh: I'm Josh Cypher. I work at Sonos as a senior DevOps engineer.

Bart: Now, you mentioned that in terms of Kubernetes tools, you're into Karpenter. Why is that, and why should people be paying attention to Karpenter?

Josh: Karpenter's really changed the game with how autoscaling works. By using workload-based constraints to actually schedule things, it's led to way more efficiency. We're talking 30, 40, 50% better efficiency in how we bin pack, schedule, and bring on resources. So it's been a real game changer. And Karpenter's APIs are really extensible. I really like that you can essentially train an AI model and feed that into the Karpenter APIs, and then suddenly start to do predictive analysis on scheduling nodes for seasonality or things like that. It's really cool.
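For readers unfamiliar with the workload-based constraints Josh mentions, a minimal Karpenter NodePool sketch might look like the following (all names and values here are illustrative, not from the episode):

```yaml
# Hypothetical Karpenter NodePool -- names and values are illustrative.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
spec:
  template:
    spec:
      requirements:
        # Karpenter picks instance types that fit pending pods'
        # actual CPU/memory requests, which is what drives bin packing.
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
  disruption:
    # Consolidate under-utilized nodes to keep packing tight.
    consolidationPolicy: WhenEmptyOrUnderutilized
  limits:
    cpu: "1000"
```

Rather than picking node sizes up front, the constraints describe what workloads are allowed to run on, and the scheduler provisions whatever best fits the pending pods.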

Bart: In terms of looking at the landscape and other options or alternatives that might not be there for Karpenter, what are the ones that you most commonly come into contact with, and what differentiates Karpenter from those other options?

Josh: One thing I like about Karpenter is that it's now in the open source world. A lot of different cloud providers had their own implementation, which made everything very niche and specific to a platform. Now, Karpenter is platform agnostic. Of course, it works really well with AWS, and I'm heavily invested in AWS. I'm excited that we can start to standardize this across all cloud providers.

Bart: In terms of that need for standardization, Kubernetes turned 10 years old this year; what do you think we can expect in the next 10 years to come? More standardization, more introduction of AI technologies, what else?

Josh: I think we're going to see a lot better efficiency. Sustainability is going to play a big part in the future. With AI, we've been throwing more GPUs at problems and burning more resources. If we can get more efficient about using our compute, I think we're going to see a lot of that as time goes on. We're going to start caring about sustainability and energy efficiency, which we haven't done much so far. It's been about throwing spaghetti at the wall with node provisioning.

Bart: For people that want to take their first step towards better Kubernetes Sustainability, what's your recommendation?

Josh: My recommendation is to look at the data of how your workloads run. A lot of clusters are completely over-provisioned. I think you could start by doing something as simple as load testing or performance testing your applications. Make sure your container requests are appropriate. Look at the data, put in the work to test it, and then create some realistic Resource Limits and Resource Quotas based on real data. A lot of times, it just feels like a guess, and I'm guilty of that too. But that's really where I would start.
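As a concrete sketch of what "realistic limits based on real data" can look like, the fragment below sizes a container from load-test observations and caps a namespace with a quota (all numbers and names are hypothetical, not from the episode):

```yaml
# Hypothetical: requests sized from observed p95 usage under load testing
# (say, p95 CPU ~180m and p95 memory ~220Mi at peak).
resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    memory: 256Mi   # memory limit == request avoids surprise OOM kills
---
# Namespace-level ResourceQuota to cap aggregate over-provisioning.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    limits.memory: 60Gi
```

Comparing `kubectl top pods` output against the declared requests is one quick way to see how far a cluster's guesses are from reality.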

Bart: Kubernetes has many features. What's your least favorite feature or the feature that you would like to see improve the most?

Josh: I would love to see the Vertical Pod Autoscaler (VPA) become smarter. It currently feels like painting with a broad brush. I have a hard time trusting it and handing over the scaling of resource requests and limits for an individual container to the VPA. It doesn't feel right yet. I would like to see it become more intelligent, provide better data, and offer more adjustable settings. I don't think it's quite there yet.
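One way to use the VPA without fully handing over control, given the hesitation Josh describes, is recommendation-only mode: the VPA reports what it would set without ever resizing pods. A sketch (names are illustrative):

```yaml
# Hypothetical VPA in recommendation-only mode.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # recommend only; never evict or resize pods
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:
          cpu: 50m
          memory: 64Mi
        maxAllowed:
          cpu: "2"
          memory: 2Gi
```

The recommendations show up in the VPA object's status, where they can be reviewed against load-test data before anyone commits to them.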

Bart: Since you mentioned over-provisioning, one of our guests wrote an article about over-provisioning. What strategies have you found effective in controlling over-provisioning in large Kubernetes clusters, considering aspects like Resource Quotas, Resource Limits, and Cluster Autoscaling, as well as tools like Karpenter, and the broader context of Kubernetes Sustainability?

Josh: Well, I talked about performance testing, and I think knowing your actual needs is super important. Understanding your priorities is also crucial. What are your most important workloads, and do you have them in the right Priority Classes? Are they in the right tiers? Once you understand the basis of what you're doing, you can structure your clusters around that. A lot of times, we create clusters with a certain number of nodes for high availability, or we focus on having a lot of I/O or CPU. However, we often don't spend enough time understanding the work. We've created self-service platforms and handed them over to developers, telling them to specify their needs. By building out a tiered approach to priority and importance, you can get a better picture of your needs.
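The tiered approach to priority that Josh describes maps directly onto Kubernetes PriorityClasses; a sketch (tier names and values are illustrative):

```yaml
# Hypothetical priority tiers -- names and values are illustrative.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tier-critical
value: 1000000
globalDefault: false
description: "Revenue-critical workloads: scheduled first, evicted last."
---
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: tier-batch
value: 1000
globalDefault: false
description: "Preemptible batch work: first to be evicted under pressure."
```

Workloads opt in via `spec.priorityClassName` on the pod template, which gives the scheduler the "importance tiers" to work with when the cluster is under pressure.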
