Kubernetes scaling, upgrades and the future of container orchestration

Guest:

  • Nathan Taber

A practical discussion about optimizing Kubernetes infrastructure and managing clusters at scale.

In this interview, Nathan Taber, Head of Product for AWS Kubernetes and Container Registries, discusses:

  • How to implement effective over-provisioning strategies by balancing availability and efficiency based on workload types and scaling requirements

  • The advantages of Karpenter over traditional auto-scaling tools, including its ability to optimize compute resources and support heterogeneous node fleets

  • Why Kubernetes upgrades require a structured approach with dedicated resources, testing procedures, and proper planning to manage breaking changes

Transcription

Bart: Who are you, what's your role, and where do you work?

Nathan: I'm Nathan Taber. I lead product management for AWS Kubernetes and Container Registries at Amazon Web Services.

Bart: What are three Kubernetes emerging tools that you're keeping an eye on?

Nathan: The first tool I'm keeping an eye on is Ray. Ray is becoming an extremely popular tool for machine learning, both for training and inference. It doesn't have to run on Kubernetes, but running Ray on Kubernetes is growing really fast. The second tool is Karpenter, an AWS project built by our Kubernetes team several years ago. It seamlessly orchestrates, scales, and optimizes the compute in your cluster. It's a really great tool. The third is AWS Controllers for Kubernetes (ACK), another tool built by the AWS Kubernetes team that lets you manage AWS resources natively from your Kubernetes application. For example, if your application needs a native AWS resource such as an S3 bucket, then without ACK you would have to use Terraform or CloudFormation to stand up that resource and wire everything together. With ACK, you can simply declare that your application deployment needs an S3 bucket launched every time it scales, and that happens seamlessly, with the resources wired together from inside the Kubernetes cluster.
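As a rough illustration of the ACK workflow Nathan describes, a manifest along these lines declares an S3 bucket as a Kubernetes resource so it is created alongside the application instead of through Terraform or CloudFormation. The bucket name is made up, and the exact API group/version depends on the ACK S3 controller release you install:

```yaml
# Sketch of an ACK-managed S3 bucket (illustrative names; API version may
# differ depending on the installed ACK S3 controller).
apiVersion: s3.services.k8s.aws/v1alpha1
kind: Bucket
metadata:
  name: my-app-assets          # Kubernetes object name
spec:
  name: my-app-assets-bucket   # S3 bucket ACK creates in your AWS account
```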

Bart: Now, the following questions are based on podcast episodes. On the topic of over-provisioning, one of our guests, Alexandra, wrote about over-provisioning strategies. What have you found effective in controlling over-provisioning in large Kubernetes clusters?

Nathan: Over-provisioning is part of a classic balance between availability and efficiency. When thinking about over-provisioning, it's essential to consider the kind of workload being run. Is it stateless or stateful? How fast does it need to scale? The over-provisioning strategy is a mechanism to balance availability against efficiency. If an application is static, doesn't need to scale super fast, and is stateless, it probably doesn't require a lot of over-provisioning. On the other hand, if an application is extremely stateful, hard to scale down, and has huge spikes, the over-provisioning level will be set accordingly. Some customers over-provision by 80 or 90%, while others are extremely efficient with their usage. Over-provisioning is an application-by-application or team-by-team decision.

The Karpenter project is exciting because many people are over-provisioned beyond what they actually need. A project like Karpenter, which reacts to the requests and limits of the pods and dynamically provisions and optimizes the amount of compute being run, can help reduce over-provisioning at the infrastructure layer.

The second thing to consider is the CPU and memory requests and limits coming from the pods themselves. Many people don't think deeply about their requests and limits. In fact, many customers who move to a dynamic provisioning system like Karpenter realize that their developers aren't even setting requests and limits on their pods. This is a journey that can be done manually through experimentation, and there are also third-party tools and companies working on optimization, such as StormForge, which does continuous optimization and tries to determine how to set those requests and limits so that the correct amount of resources is allocated to the application on the node. Then a dynamic system like Karpenter can bring the compute down to just what the applications need, rather than over-provisioning above and beyond that. Understanding pod requests and limits is crucial here.
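To make the requests-and-limits point concrete, here is a minimal sketch of a Deployment with explicit CPU and memory requests and limits; the names and values are illustrative. Without these signals, a dynamic provisioner like Karpenter has nothing to size nodes against:

```yaml
# Illustrative Deployment snippet: explicit requests and limits give the
# scheduler and Karpenter a sizing signal (values are examples only).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels: { app: web }
  template:
    metadata:
      labels: { app: web }
    spec:
      containers:
        - name: web
          image: nginx:1.25
          resources:
            requests:
              cpu: "250m"      # what the scheduler reserves for this pod
              memory: "256Mi"
            limits:
              cpu: "500m"      # ceiling before throttling
              memory: "512Mi"  # ceiling before OOM-kill
```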

Bart: Related to auto scaling and Karpenter, one of our guests, Gazal, prefers Karpenter over Cluster Autoscaler for its reliability, as it helps save 40% on cloud costs. How do you see Karpenter shaping the future of Kubernetes auto scaling?

Nathan: Seeing Karpenter go as far as it has is one of the most amazing things in my entire career, especially my career at AWS. We originally had a lot of people using Cluster Autoscaler on AWS. However, we found that Cluster Autoscaler, by nature of having to work everywhere, worked okay but not extremely well in many environments. For example, on AWS, many customers were unable to take full advantage of EC2 Spot Instances. We were just discussing availability and efficiency; EC2 Spot Instances give you really good price efficiency because, if you have an interruptible workload, you can get discounts of up to 90% on the EC2 instances you're running. If you have a workload like that, or you're a small company, you can save a lot of money by adopting Spot. However, because Spot capacity can be taken back at any time, the best practice is to run a very heterogeneous fleet of nodes, with a mix of instance types and sizes, so that if any particular instance size or class is reclaimed, you still have capacity you can scale and schedule onto.

Cluster Autoscaler, at least back when we were building Karpenter, was optimized around a model that locks in a node size and class and then uses that as the reference for scaling the cluster. With a heterogeneous fleet of nodes, that model falls apart: you get over-provisioning, under-provisioning, and a cluster that zigzags, spiking up and down unpredictably. One of the big benefits of Karpenter is that its provisioning decisions don't depend on what came before. Karpenter is always looking ahead and asking what the next best thing to launch for the cluster is, instead of looking at what has already been provisioned and adding or removing more of it. That makes Karpenter a really good fit for EC2 Spot Instances, where you want many different instance types, because it handles that heterogeneity really well.

Customers like this model of automatic provisioning and continuous optimization. We worked closely with teams outside of AWS, particularly the AKS team at Azure, to support Karpenter, and based on that momentum we donated the core Karpenter project to the CNCF last year, placing it under SIG Autoscaling. This has been absolutely incredible, because our vision for Karpenter is an open provider model: wherever customers want to use it, they should be able to write a provider and run it there. My prediction for the future is that as more and more customers discover how seamless and automated this provisioning and management model is for Kubernetes compute infrastructure, we'll see more providers written, more activity around Karpenter in SIG Autoscaling, and more adoption across Kubernetes, wherever people are running it. For us, this is bigger than any product we could ever launch at AWS, because it lets customers positively impact the Kubernetes community and the project itself, which gives us deep satisfaction.
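As a sketch of what that heterogeneous, Spot-friendly model can look like, a Karpenter NodePool can express broad constraints instead of a fixed node group size and class. Field names below follow the v1beta1-era API and may differ in your Karpenter release, so treat this as illustrative rather than a definitive configuration:

```yaml
# Illustrative Karpenter NodePool: broad requirements let Karpenter choose
# among many instance types, including Spot, instead of one fixed node shape.
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        name: default                       # EC2NodeClass defined elsewhere
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]     # allow Spot with On-Demand fallback
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]           # keep the fleet heterogeneous
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64", "arm64"]
  disruption:
    consolidationPolicy: WhenUnderutilized  # continuously right-size the cluster
```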

Bart: On the subject of upgrades, one of our guests, Matthew Duggan, states that Kubernetes adopted a very aggressive lifecycle and teams are forced to keep up with the updates or they're left behind. What's your advice when it comes to upgrading clusters?

Nathan: That's a great question. Kubernetes has adopted a very aggressive upgrade lifecycle and has accepted breaking changes during upgrades. Having regular patches and bug fixes is fantastic, and every software project should strive to be as advanced as Kubernetes in releasing updates that improve functionality, fix bugs, and make the project better to use. However, the flip side is that Kubernetes is also very comfortable introducing breaking changes, although less so now than a few years ago and likely less so in the future. These breaking changes are where customers struggle.

There is a big gap between the people pushing the project forward and those who have adopted Kubernetes at scale to run large applications, such as the world's largest banks, government institutions, and fighter jets, which typically have software timelines with a high degree of testing and a low degree of changes. This balance is something the Kubernetes project struggles to achieve. My prediction is that in the future, we will see different release cadences and variances, maybe something that's long-term and very static for a long period, with upgrades and bug fixes but no breaking changes, and a faster development branch.

For best practices on upgrades, the number one thing customers should do when adopting Kubernetes is to have a plan for upgrades. They should know what they're getting into and understand that part of running Kubernetes is upgrading it. Where people get into trouble is treating Kubernetes as something you just run and put applications on. Life is good for about six months, then the upgrade hits, and they have no plan, no resources, and haven't prioritized that work.

The most important thing is to have an upgrade strategy, and there are many different strategies that depend on the organization, risk tolerance, and the kind of applications being run. It's essential to have a plan for how to upgrade, put resources in place, and plan for it in dev cycles, just like a feature project or another type of release. The SRE team or DevOps team doing this work needs to have it on their roadmap and say, "We're going to upgrade once a year, twice a year," and have a plan around it.

On Amazon EKS, we introduced EKS Upgrade Insights, which gives a built-in report card for every cluster. If you're running an EKS cluster, you can open Upgrade Insights and get a full report card for future Kubernetes versions, showing where there are breaking changes and which API calls will fail if you upgrade. So the second thing is to use the tools available that help predict where issues could happen and remediate them, and to have a strong structure around code ownership and maintainability so you can push that work down to the developers and teams writing the applications. That could mean recommending which APIs to use, or providing an interface into the Kubernetes cluster that abstracts away complexity and removes the opportunity to use an API that might be deprecated in the future. The most important thing is to have a plan, use the resources available, and build abstractions and tooling that simplify life when it comes time to upgrade.
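A concrete example of the kind of breaking change this tooling flags: the `batch/v1beta1` CronJob API was removed in Kubernetes 1.25, so manifests pinned to the old group/version stop applying after an upgrade until they move to `batch/v1`. The workload below is made up for illustration:

```yaml
# Before: fails to apply on Kubernetes 1.25+, where batch/v1beta1 was removed.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: nightly-report
# ...rest of spec unchanged...
---
# After: the supported group/version for CronJob.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: nightly-report:1.0   # illustrative image name
```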

Bart: Kubernetes turned 10 years old this year. What should we expect in the next 10 years to come?

Nathan: The thing about Kubernetes is that over the next 10 years, the focus is going to move away from Kubernetes itself. It's not that Kubernetes is going to go away, far from it. To me, Kubernetes is like Linux. Before Linux, there was a patchwork of different operating systems, including proprietary systems, Unix variants, and early open-source attempts. Linux brought a whole community together and established a reference for how operating systems should work, and it is now the most used operating system in the world. You'd be hard-pressed to find an enterprise that doesn't use Linux as part of its core computing. I think that's where Kubernetes is headed.

The last 10 years have seen really aggressive growth, with the community coming together and finding different ways to use Kubernetes, apply it, and extend it. We're starting to see a lot of consolidation. Over the next 10 years, I think we'll see more consolidation of tools and techniques, along with a lot of invention and innovation, and I'm really excited to see some of that this week at KubeCon and to discuss with folks where it's going. Ten years from now, I think we'll see Kubernetes at almost every enterprise organization; you'll be hard-pressed to find one that isn't using it. Kubernetes will become part of that cloud operating layer, the systems OS: if Linux manages the operations of a single system, Kubernetes manages the operations of many systems. You'll see it in every large, complex software project or installation. Kubernetes will be there, but it will be far more standardized and accepted across industries. Maybe KubeCon won't be quite as exciting or new every single year 10 years from now, but Kubernetes will absolutely be there, and it will be everywhere.

Bart: What's next for you?

Nathan: What's next for us? We're building one of the best places to run Kubernetes in the world. We run tens of millions of Kubernetes clusters at Amazon Web Services (AWS) every single year. We don't know if we're the largest, but we think we're up there. Our team's mission is to make AWS the best place to run Kubernetes and to make it really easy for everybody, from small startups and individual developers up to the world's largest enterprises, to bring their applications to AWS and scale them across both the cloud and other environments, like data centers and the edge, using a set of both managed Kubernetes components and platform and cloud tooling built by AWS, including Amazon EKS.

Bart: How can people get in touch with you?

Nathan: Hit me up on LinkedIn, come and find me. I'll be wandering the halls at KubeCon. Look forward to having a discussion.
