Production Kubernetes With Claude

Jul 3, 2026

Guest: Alex Burnett

Making changes to Kubernetes settings in production can be stressful, especially when requests, limits, autoscaling, or readiness probes are already affecting live traffic.

Alex Burnett, Enterprise Solutions Architect at Dash0, shares how teams can feel more confident by testing changes in mirrored environments, using observability to spot problems early, and using AI tools carefully for read-only checks.

Transcription

Bart Farrell: Who are you? What's your role? And where do you work?

Alex Burnett: My name is Alex Burnett. I'm an Enterprise Solutions Architect and I work at the lovely Dash0.

Bart Farrell: A Kubernetes setting can look wrong but still feel risky to change once it's already in production. Requests, limits, autoscaling, or probes. What would you tell a team that sees the problem but is nervous the fix could cause an outage?

Alex Burnett: There are a few ways you could do it. The one we always tend to recommend is that you wouldn't have just one Kubernetes cluster for production. You'd normally have a test cluster or something that mirrors it. So anything that you feel uncomfortable with, try it in that other cluster first. Especially if the two are a mirrored setup, you get a bit of confidence from that. Then you're able to comfortably make a change, test it, and roll it into production. You can also use something like Claude, which we use quite heavily here at Dash0. It gives you some insight, especially if you use something like an MCP container that can give you some advice. It can then make the recommendations that you need and give you the confidence to make changes in your production environment.
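To illustrate the mirrored-cluster idea, here is a minimal Python sketch that checks whether a test cluster and production actually agree on container requests and limits before a change is trialled. It is an assumption-laden illustration, not something described in the interview: the function names and the tiny `deployment` helper are hypothetical, and the manifests are plain dictionaries such as you might parse from `kubectl get deployment <name> -o json`.

```python
from typing import Any


def container_resources(manifest: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Map each container name to its resources block (requests and limits)."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    return {c["name"]: c.get("resources", {}) for c in pod_spec.get("containers", [])}


def resource_drift(test: dict[str, Any], prod: dict[str, Any]) -> dict[str, tuple]:
    """Return containers whose resources differ, as name -> (test, prod)."""
    t, p = container_resources(test), container_resources(prod)
    return {
        name: (t.get(name), p.get(name))
        for name in sorted(t.keys() | p.keys())
        if t.get(name) != p.get(name)
    }


def deployment(cpu_limit: str) -> dict[str, Any]:
    """Build a tiny hypothetical Deployment manifest for the example."""
    return {"spec": {"template": {"spec": {"containers": [
        {"name": "api", "resources": {"limits": {"cpu": cpu_limit}}},
    ]}}}}


# An empty result means the two environments match for these settings.
print(resource_drift(deployment("500m"), deployment("1")))
```

If the result is non-empty, the test cluster is not a faithful mirror for that workload, so a successful trial there says less about how production will behave.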

Bart Farrell: Missing readiness checks usually show up through something concrete: traffic reaches a pod too early, autoscaling behaves strangely, or users report errors. If a team wanted to catch this before users do, where would you have them look first?

Alex Burnett: kubectl is obviously the main place to look when something odd is happening, especially when it comes to readiness probes. Apart from that, we offer a great observability solution here at Dash0, where you can get various insights into your Kubernetes operations. We receive anything that's like a cluster event. So if you do a deployment and something doesn't look quite right, we can surface that and alert you very quickly. Likewise, take a Kubernetes MCP server, hook it up to Claude, and ask it questions. It's able to query your Kubernetes setup and give you a bit more insight, and it speaks, as I like to call it, in plain language rather than technical jargon, which can sometimes cause confusion within engineering teams.
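One way to catch missing readiness probes before users do is to scan manifests for containers that do not define one. The Python sketch below is an illustration under stated assumptions, not a tool mentioned in the interview: the function name and the example manifest are hypothetical, and the input is a Deployment-style dictionary such as the JSON returned by `kubectl get deployment <name> -o json`.

```python
from typing import Any


def containers_missing_readiness(manifest: dict[str, Any]) -> list[str]:
    """Return names of containers in the pod template with no readinessProbe."""
    pod_spec = manifest.get("spec", {}).get("template", {}).get("spec", {})
    return [
        c.get("name", "<unnamed>")
        for c in pod_spec.get("containers", [])
        if "readinessProbe" not in c
    ]


# Hypothetical manifest: one container has a probe, the other does not.
example = {
    "kind": "Deployment",
    "metadata": {"name": "checkout"},
    "spec": {"template": {"spec": {"containers": [
        {"name": "api",
         "readinessProbe": {"httpGet": {"path": "/healthz", "port": 8080}}},
        {"name": "worker"},
    ]}}},
}

print(containers_missing_readiness(example))  # ['worker']
```

A check like this could run during release review, so the gap is flagged at the moment the manifest changes rather than after traffic reaches a pod too early.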

Bart Farrell: Production readiness reviews can happen before launch, after incidents, during audits, or not formally at all. What would you put in place so Kubernetes readiness gets reviewed before it becomes urgent?

Alex Burnett: This kind of forms maybe a little bit of a DevOps action. We're constantly reviewing our code when we release it, and we always bring observability to the forefront of that. It should be something that you're constantly reviewing, maybe every time that you do a release. And if there's something missing, bring it in there and then and make those changes. It's very much a case of not leaving it in the background, because you'll never do it. You forget it, something big happens, and you're left saying hindsight is a great thing.

Bart Farrell: You mentioned Claude earlier. We're here at KCD Helsinki. Plenty of people are talking about AI. What are things that you're comfortable using AI for on Kubernetes? And what are things you'd say it's not quite ready for?

Alex Burnett: For Claude, the Anthropic models, nine times out of ten they're pretty good and don't really hallucinate for me. I'm very happy for it to do read requests, so I'm happy for it to fetch and look at configuration. I use that heavily in my job to pull conversational data and identify whether customers are asking for certain things. At the moment, I'm not comfortable with it making any sort of change live. Not because I don't trust it, but because I think it still hallucinates a little bit, and the last thing I'd want is for it to hallucinate in a production environment and cause chaos. But to read data and understand things, I'm more than happy and comfortable to use it like that.
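The read-only boundary described above can be sketched as a simple allow-list placed in front of any kubectl command an assistant proposes. This is a hypothetical illustration, not how Dash0 or any MCP server implements it: the `READ_ONLY_VERBS` set and `is_read_only` function are assumptions made for the example.

```python
# Assumed allow-list of kubectl subcommands that only read cluster state.
READ_ONLY_VERBS = {"get", "describe", "logs", "top", "explain"}


def is_read_only(args: list[str]) -> bool:
    """True only if the first argument is an allow-listed read-only verb.

    Deliberately conservative: anything unexpected in first position,
    including a leading flag, is rejected.
    """
    return bool(args) and args[0] in READ_ONLY_VERBS


print(is_read_only(["get", "pods", "-n", "prod"]))     # True
print(is_read_only(["apply", "-f", "deploy.yaml"]))    # False
print(is_read_only(["--context", "prod", "delete"]))   # False
```

A client-side filter like this is only a convenience. For real enforcement, a reasonable assumption is to give the assistant credentials bound to a Kubernetes RBAC role limited to the `get`, `list`, and `watch` verbs, so the API server itself refuses writes.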
