Kubernetes upgrades, ecosystem maturity, and cutting through the AI hype
In this interview, Andy Suderman, CTO at Fairwinds, discusses:
Emerging Kubernetes features worth watching: mutating admission policy becoming stable, dynamic resource allocation finally arriving, and promising AI-native tools like kgateway and kagent
The upgrade philosophy and LTS debate: Why staying current with Kubernetes releases is crucial, and how cloud providers charging premiums for older versions helps force necessary upgrades
The evolution toward abstraction layers: How Kubernetes at 10 years old is spawning a new generation of abstraction tools that hide its complexity
Transcription
Bart: So, first things first: Who are you? What's your role? And where do you work?
Andy: I'm Andy Suderman, CTO at Fairwinds. I do a little bit of everything. I started at Fairwinds about seven years ago as an SRE running Kubernetes for our customers and have held basically every technical role at the company since then. Now I'm responsible for the team that develops our software as well as maintains our customers' Kubernetes environments.
Bart: And what are three emerging Kubernetes tools that you are keeping an eye on?
Andy: Specifically, there are a few features of Kubernetes that I find super interesting to keep an eye on. One of them is mutating admission policy. Validating admission policy has finally made it to stable, or is coming very soon. Native policy validation, with the added ability to do mutations, will be super powerful for folks. It's great to see policy as a blessed native mechanism.
The second feature everyone's talking about is dynamic resource allocation. We've all been begging for this forever, and we're super excited about it. It'll be interesting to see how different people use it and what the implications are for scheduling in clusters.
Those are two native mechanisms I'm excited about. Lastly, I'm thinking about AI tools. Obviously, they're taking the world by storm. Specifically, some Kubernetes-native projects like kgateway and kagent, recently open-sourced by Solo, are very exciting. I know Ray is also doing interesting things in that space. Those are the ones I have my eye on at the moment.
Bart: One of our podcast guests, Tanat, acknowledged that having Long-Term Support (LTS) is a good option because different companies have different needs and priorities. How do you determine your upgrade cadence for Kubernetes?
Andy: That's a good question. First, the whole Long-Term Support (LTS) question is a significant one. Many people want LTS and are tired of the frequent upgrade cycle, especially depending on how they're running their clusters. But there's a huge downside to that.
If you've sat in on discussions about LTS possibilities, you'll hear some of these downsides. The biggest one I see, which ties directly into your question, is that with an LTS release where we don't have to upgrade often, we'll just keep kicking the can down the road. We've already postponed upgrades too far in most cases.
If we have LTS and aren't forced to upgrade, we simply won't do it. As a company (Fairwinds) managing clusters for our customers, our plan to upgrade is baked into everything we do. We anticipate upgrading three times a year because that's how often Kubernetes ships a release. We don't use the latest version but stay one or two versions behind to let things stabilize. We don't necessarily need all the newest features.
We build everything around our Kubernetes environments so that upgrades are possible, making it not painful but just the way we do things. That's super important. To be honest, I'm not really pro-LTS.
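As an illustrative aside, the "stay one or two versions behind" policy Andy describes can be sketched in a few lines of Python. This is a minimal sketch, not Fairwinds' tooling: the function names (`pick_target`, `upgrade_path`), the `lag` parameter, and the version numbers are all hypothetical. It also assumes upgrades step through one minor version at a time, which is the usual guidance for Kubernetes control planes.

```python
from typing import List, Tuple


def parse_minor(version: str) -> Tuple[int, int]:
    """Parse a string such as 'v1.31' or '1.31.2' into (major, minor)."""
    parts = version.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def pick_target(available: List[str], lag: int = 1) -> str:
    """Return the minor release that is `lag` versions behind the newest one.

    `lag` is a hypothetical knob: 1 or 2 matches the policy in the interview.
    If fewer releases are available than `lag` requires, the oldest is used.
    """
    if lag < 0:
        raise ValueError("lag must be zero or positive")
    if not available:
        raise ValueError("no versions available")
    # De-duplicate patch releases and sort newest first.
    minors = sorted({parse_minor(v) for v in available}, reverse=True)
    major, minor = minors[min(lag, len(minors) - 1)]
    return f"{major}.{minor}"


def upgrade_path(current: str, target: str) -> List[str]:
    """List the minor versions to step through, one at a time."""
    cur_major, cur_minor = parse_minor(current)
    tgt_major, tgt_minor = parse_minor(target)
    if cur_major != tgt_major:
        raise ValueError("major version changes are not handled in this sketch")
    return [f"{cur_major}.{m}" for m in range(cur_minor + 1, tgt_minor + 1)]


if __name__ == "__main__":
    # Hypothetical list of versions offered by a provider.
    offered = ["1.29", "1.30", "1.31", "1.32"]
    target = pick_target(offered, lag=1)
    print(target, upgrade_path("1.29", target))
```

Baking a calculation like this into routine automation is one way to make upgrades "just the way we do things" rather than a special project.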
Bart: Staying on the topic of upgrades and dealing with older versions, AWS charges a premium to keep older versions of EKS around. One of our guests, Mathew Duggan, argues that backporting fixes isn't free, and someone has to implement, test, and apply it. He argues that these charges aren't completely fair. Should cloud providers charge a premium to keep older versions of Kubernetes running?
Andy: Absolutely. The first half of the argument is totally valid. There's a ton of work involved in backporting. There's a reason a Kubernetes version stops receiving security patches after a year or a year and a half: it is far too expensive to try to backport those changes across different versions. If a cloud provider wants to offer that service, I think that's great. It's an added benefit for their customers. I would argue that maybe in some cases they're not charging enough. The premium is what forces folks to move and upgrade.
Going back to my previous statement, we just have to get used to upgrading. The Kubernetes team does such a good job of making releases compatible so that we can go through the upgrade process. Let's just normalize upgrades. It's okay. Let's keep doing it. And if you really need to stay behind, maybe you have to pay a little extra. You can factor that into your own ROI decisions about how you want to run your operations.
Bart: Dan Garfield shared that your cluster only feels real once you set up Ingress and DNS. Before that, it was just a playground where it didn't matter when stuff broke. What are your thoughts on this?
Andy: I think it's super interesting. There's a lot to dig into in that statement. For a good majority of folks, your environment is not real until you're serving traffic for your customers. Ingress may not be the only way you provide access to your customers. Your customers might be internal, running things like jobs or machine learning workflows.
What we're trying to say is that a Kubernetes cluster out of the box is not 100% functional. You need a whole slew of other tools: add-ons, ingress controllers, DNS controllers, cert managers, and other operators. These are crucial to the functionality of your Kubernetes cluster. As soon as those are in place and you're using them to serve production traffic, it's no longer a playground.
The key focus is the ecosystem of add-ons, which is crucial to our platforms and environments.
Bart: Kubernetes turned 10 years old last year. What do you expect to happen in the next few years when it comes to Kubernetes?
Andy: I get asked this question a lot. What we're going to see, and we've already started to see it, is more abstraction layers built on top of Kubernetes. Kubernetes itself is an abstraction layer on top of the Linux servers we've run for so long. It is a great way to orchestrate containers, but it's not easy, straightforward, or secure by default. It's a complex product to run.
We'll continue to see more abstraction layers that simplify management. These will include paid and open-source options, such as application bundling projects and larger systems that use Kubernetes for orchestration while abstracting away its complexity. Commercial products like Mogenius and Northflank are already providing further abstraction, essentially saying, "Just give me a container, and I'll run it."
The fascinating question is: Which of these approaches will win, or will we see multiple solutions coexist?
Bart: And Andy, what's next for you?
Andy: Most of the time, I'm thinking about what's happening right in front of me and what I'm working on—keeping Fairwinds running well and serving our clients as best we can. For the foreseeable future, that's what I'll be doing. I think something in the AI space will eventually shake loose that I find interesting, and I will probably end up doing something there, but I can't say what yet.
Bart: What is it about the things that you've seen up until now with AI that you haven't found interesting or practical enough?
Andy: The hype has been really high. I'm not saying that's a bad thing necessarily, but the amount of hype is large, which means every single company is jumping to say they're doing something in AI. The way that inevitably shook out was people using AI for anything and everything. Some of it makes sense, and some of it doesn't. There's a ton of noise.
We did the same thing with Kubernetes back in the day. When it first came out, operators started to think they had to run everything on Kubernetes. Some people succeeded, and some did not. Now, we've reached a point where it's more mature, and we've realized these are the places where Kubernetes makes sense.
AI hasn't hit that maturity yet. What's different about the AI trend is that it's happening much faster and expanding well beyond software development and infrastructure. Kubernetes is a very infrastructure-focused tool, while AI has applications well outside of that, causing it to move much faster. If I have to simplify my answer: we're just cutting through the noise.
Bart: What's the best way for people to get in touch with Andy Suderman?
Andy: I'm available on the CNCF Slack and the Kubernetes Slack. I'm always on LinkedIn as well, and occasionally on Bluesky. I'm not a huge social media user, but the two Slacks are probably the best places to find me.
Bart: Fantastic, Andy. Great talking to you. We'll speak soon. Take care.