The future of Kubernetes optimization: from cost to sustainability

Guest:

  • James Wilson

This interview explores the evolution of Kubernetes tooling and the future of cloud infrastructure optimization.

In this interview, James Wilson, VP of Engineering at nOps, discusses:

  • The impact of open source tools in Kubernetes, particularly how Karpenter, Kepler, and VPA are shaping the future of multi-dimensional autoscaling.

  • Practical approaches to Kubernetes cost optimization, from setting organizational priorities to implementing container-level improvements and leveraging spot instances.

  • How artificial intelligence will drive the next decade of Kubernetes, from workload optimization to intelligent orchestration across global data centers.

Transcription

Bart: Who are you? What's your role? And who do you work for?

James: I'm James Wilson. I'm the Vice President of Engineering for nOps, a fully automated FinOps platform that focuses on Kubernetes optimization.

Bart: What are three emerging Kubernetes tools that you're keeping an eye on?

James: Definitely Karpenter, number one. We're actually announcing our fully hosted Karpenter education series here, a service we're providing to CNCF. Another one we're keeping an eye on is Kepler. We're really interested in sustainability and how it integrates with other frameworks. The third one I'm looking at, which is not new but has a lot of interesting new developments, is the Vertical Pod Autoscaler (VPA). I think the integrations between Kepler and VPA, Kepler and Karpenter, and Karpenter and VPA will bring us into a world where we'll be talking about autoscaling at a multi-dimensional level.
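
To ground the VPA discussion, here is a minimal sketch of a VerticalPodAutoscaler running in recommendation-only mode, so it surfaces right-sizing data without evicting pods. The target Deployment name api-server is a hypothetical example, not something from the interview.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server        # hypothetical target workload
  updatePolicy:
    updateMode: "Off"       # emit recommendations only; no automatic restarts
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources: ["cpu", "memory"]
```

Running in "Off" mode first is a common way to build confidence in the recommendations before letting VPA apply them automatically.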

Bart: Autoscaling. One of our guests, Gazal, prefers Karpenter over Cluster Autoscaler for cluster autoscaling, highlighting its benefits and reliability. In particular, Gazal and his team use Karpenter to consolidate workloads and save around 40% of their cloud bill. Karpenter was donated by AWS to the Kubernetes project. How do you think this will shape the future of autoscaling in Kubernetes?

James: I think the Cloud Native Computing Foundation and the Kubernetes project taking stewardship of Karpenter is a significant advancement, because we're going to see the momentum of Karpenter adoption increase across different public clouds. Additionally, many organizations are looking at new ways to integrate different frameworks with Karpenter, which will open up a lot of innovation in autoscaling at both the container and infrastructure level, as the two are interrelated. I believe that's the future of autoscaling, along with the topic of AI. There's a lot of work being done on AI-based modeling of workloads, and integrating that work with autoscaling frameworks is only possible if the open source community is driving the effort.
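
The consolidation behavior Gazal's team relies on is configured on a Karpenter NodePool. Below is a hedged sketch, assuming the karpenter.sh/v1 API on AWS; the EC2NodeClass named default is an assumption for illustration.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                    # hypothetical node class
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]     # allow ARM nodes where workloads support them
  disruption:
    consolidationPolicy: WhenEmptyOrUnderutilized  # repack pods and remove idle nodes
    consolidateAfter: 1m                           # settle time before consolidating
```

With consolidation enabled, Karpenter continuously looks for cheaper node arrangements and drains underutilized nodes, which is where savings like the 40% figure mentioned above typically come from.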

Bart: On the subject of platform engineering and people, one of our guests, Ori, shared that rushing into solutions without understanding the root cause can lead to fixing symptoms instead of the actual problem. He mentioned the case of Network Policies and how sometimes the root cause of a problem is a people problem, and the solution lies in addressing that. What is your experience with providing tooling and platforms on Kubernetes to other engineers? What are some of the soft challenges that you faced?

James: We deal with this a lot at nOps. The first challenge we think about is organizational priorities. For instance, cost management is often a punitive exercise: engineers don't deal with optimizing for cost or efficiency until somebody on the finance team is mad at them. If we set priorities at the organizational level, it gives engineers the ability to manage towards KPIs, which is something engineers are really good at. If we're only focused on security and reliability, then operational efficiency and actual costs are never going to be a priority.

Bart: On the subject of cost optimization and ARM instances, Miguel shared that Adevinta migrated their Kubernetes cluster to support ARM instances and workloads because they are more cost-effective. Do you have any practical advice on reducing Kubernetes costs?

James: I have lots of practical advice on reducing Kubernetes costs. Our area of subject-matter expertise at nOps is cost optimization, and in a greenfield environment we start from the bottom up. We begin by optimizing at the container level, the lowest level, then move up to tune infrastructure utilization so we get the most out of our capacity. Finally, we optimize for price. However, most organizations don't have the flexibility of starting from a greenfield situation.
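
Container-level optimization mostly means right-sizing CPU and memory requests. Here is a hedged sketch, with a hypothetical checkout Deployment and illustrative values sized from observed usage:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout                      # hypothetical workload
spec:
  replicas: 3
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      containers:
        - name: checkout
          image: registry.example.com/checkout:1.4.2   # illustrative image
          resources:
            requests:
              cpu: "250m"             # sized from observed usage, not guessed
              memory: "256Mi"
            limits:
              memory: "512Mi"         # guard against leaks; CPU left uncapped
```

Accurate requests matter because the scheduler packs nodes based on requests; inflated requests waste capacity at the infrastructure level, which is why James starts here before moving up the stack.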

My advice is to prioritize based on impact. Price optimization, such as shifting some workloads to Spot instances while following Spot best practices, may be a quicker win than trying to re-architect workloads at a lower level. I recommend going for the biggest impact with the lowest amount of effort and making the process continuous: come back around, figure out what the next biggest priority is, and attack that. Always start with visibility; you can't fix a problem if you don't know what it is.
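
As one example of the Spot best practices James mentions, here is a hedged Karpenter NodePool sketch that diversifies across instance families and keeps on-demand as a fallback. It again assumes karpenter.sh/v1 on AWS and a hypothetical EC2NodeClass named default.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-flexible
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default                        # hypothetical node class
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]      # favor cheaper spot, fall back to on-demand
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]            # diversify to reduce interruption risk
  limits:
    cpu: "200"                               # cap total CPU this pool may provision
```

Diversifying across instance types is the core Spot best practice: the more capacity pools Karpenter can draw from, the less a single interruption wave hurts.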

Bart: Kubernetes turned 10 years old this year. What should we expect in the next 10 years to come?

James: I think we should expect a lot. We have a lot of data, and one thing I can tell you is that Kubernetes, at least on AWS, which is a pretty huge slice of the pie, is on the cusp of becoming the biggest compute provisioner and the biggest cost center for the entire cloud. The cloud native community has been doing this for years, but there are still many organizations on that journey. We're going to start seeing AI take the driver's seat in Kubernetes optimization. We'll be discussing sustainability and cost optimization in the same conversation, making how we tune our clusters very multidimensional. On the horizon, we'll be talking more about geo-shaping and time-shaping workloads, allowing us to run our workloads in any data center, any availability zone in the world, at any time, based on AI-driven decisions that help us orchestrate things across clusters.

Bart: What's next for you?

James: I'm staying in this space. Kubernetes optimization is my area of focus. I think multi-dimensional autoscaling is the next frontier: deep integration between the Horizontal Pod Autoscaler (HPA) and the Vertical Pod Autoscaler (VPA), and using AI to make those decisions in real time.
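
One way teams already combine the two controllers without them fighting over the same signal is to let HPA scale replicas on CPU while VPA manages only memory. A hedged sketch, again with a hypothetical api-server Deployment:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: api-server-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server               # hypothetical target workload
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add replicas when average CPU passes 70%
---
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: api-server-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api-server
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        controlledResources: ["memory"]  # leave CPU entirely to the HPA
```

Splitting the resources this way avoids the classic HPA/VPA conflict on CPU; the tighter, AI-driven integration James describes would remove the need for this manual division of labor.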

Bart: How can people get in touch with you?

James: You can find me easily. Go to nOps, our product website, or email me at [email protected].
