Bart Farrell: Who are you? What's your role? And where do you work?
Yasmin Rajabi: Hi, I'm Yasmin. I am the COO at CloudBolt. Technically, I work from my house, I guess, because we're all remote.
Bart Farrell: Yasmin, what news would you like to share with our community today?
Yasmin Rajabi: Last week, we announced the results of a new survey we ran; over 300 people took it. It's really about trusting automation inside your environment: what people are comfortable trusting, what they're not, and, with the rise of new AI tools, what that does to trust in automation. The results are really interesting.
Bart Farrell: For folks who want some background: you mentioned the aspect of trust, and digging into this topic, CloudBolt decided to go all in, surveying over 300 people. Why do this survey and not another one?
Yasmin Rajabi: We have a right-sizing solution. Every time someone goes to right-size something, whether that's a server, a workload, or anything else in your infrastructure, there's this inherent question: what happens when it goes wrong? Automation is just a given these days; in the survey, I think 89% said automation is mission critical, and I imagine the remaining 11% still thought it was pretty important. We've all accepted that automation matters. But the trust you have in deploying a new workload or pushing a change to production is very different from the trust it takes to right-size, where you could be reducing something. We've worked with a lot of companies on building that trust, and we wanted to go a little deeper and understand what holds people back and what the confidence levers are.
Bart Farrell: You shared some of the findings, but at a high level, what should people keep in mind if they want to look at the survey in more detail? What were some of the conclusions?
Yasmin Rajabi: I was surprised that over 70% of people still require human review for resource optimization. Think about most environments: even a small one might be a couple of clusters, but you're running hundreds or a thousand workloads on those clusters, and we work with some very large customers that have hundreds of clusters with tens of thousands of workloads. The interesting thing is that in the survey, almost 70% of people said any sort of manual optimization breaks before you get to 250 changes a day. If you have 250 workloads, you're talking about one cluster; maybe you can do that manually, but beyond that you can't. So it's eye-opening that over 70% of people still require human review for this.
Bart Farrell: In addition to that, you got some direct quotes from survey respondents. I know there was one about outages from a platform engineer. Could you shed some light on that?
Yasmin Rajabi: There was one that really stood out to me. A platform engineer said: if an application team makes a change to resources and then sees their service go down, or maybe they didn't change anything but traffic spiked and they saw an impact to their service, they're not touching that workload or that namespace for months. I think the quote specifically was: we're not touching that for six months. That sounds extreme, but it's real, because all automation comes down to humans and how comfortable they are. You can only move as fast as those humans are comfortable. We see that every day when we talk to people. That quote says: something bad happened to me, and it's going to take me a while to believe I can make these changes without something bad happening again.
Bart Farrell: Thinking about the data you've collected and the insights you're extracting from it, what are the next steps we can expect from CloudBolt?
Yasmin Rajabi: There are quite a few learnings we're taking from the survey into the product: giving users more data points on what's going to happen, and more insight into what our machine learning is looking at, because humans like to look at those things to build confidence. This is something we've done from the very beginning. We've been doing AI/ML since 2022 in this version of the product, and before that in others, and we always knew we had to build in that level of comfort, whether it's guardrails around what the machine learning does or guardrails in the automation: how it runs, who can run it, and when it doesn't run. Especially now, with all the different AI tools, on one hand it's great because people are becoming more comfortable with these things; on the other hand it opens a lot more questions, like: what would happen here? Can I trust it to do that? So we're taking those pieces and building them into the platform. The last thing: in the survey results, I don't remember the statistics off the top of my head, but people are way more comfortable deploying a CI/CD change to production than constraining a resource, right-sizing it down, for example. I was really wondering why we're okay deploying new code but less comfortable with right-sizing. This is not scientific, this is my opinion, but I think it's because when it comes to deploying new code, we know what happens, both when it goes well and when it doesn't. We have fallback mechanisms. We deploy things in a blue-green way. It's done so often that it's not that people trust it will go right; they know what will happen when it goes wrong. I think that's on us, too, as a right-sizing platform. Yes, we're going to continue to build confidence in how we get it right, but what happens when it goes wrong? Everything's going to go wrong.
This is infrastructure; we'd be naive to say it doesn't. So we're building in the mechanisms to detect failures, respond quickly, roll back quickly, and run all the health checks. Just last week, we added a whole new set of health checks for conditions in the cluster; if we detect them, we can either roll back or go out and proactively address them. You build confidence by knowing what happens to the system when something breaks.
Bart Farrell: On the subject of the Vertical Pod Autoscaler (VPA) and its adoption, is there anything you'd like to share?
Yasmin Rajabi: I think the same trust problem applies to the VPA, because at the end of the day, the VPA is right-sizing resources. There's a whole other category when you talk about using the VPA and the HPA together, but for this example, think of a static workload: even if you're just going to deploy a VPA recommendation, you want to know what happens when it goes wrong. It's the same quote from earlier: you can only move as fast as you trust the system. People need to build a lot of guardrails into the VPA. Any right-sizing solution, or any automation solution that's making a change and constricting a resource rather than making a net-new deployment, needs to think through how to build trust with the humans using the system, and into the system itself, in order to scale it out.
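[Editor's note: the guardrails described above map onto concrete VPA settings. A minimal sketch, assuming the upstream Kubernetes VPA; the workload name and resource bounds are illustrative, not from the interview. Running in recommendation-only mode with hard min/max bounds lets a human review suggestions before any pod is resized.]

```yaml
# Hypothetical VPA manifest: recommendation-only mode plus hard bounds,
# so nothing is resized automatically and a human reviews first.
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: web-api-vpa            # illustrative name
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-api              # illustrative workload
  updatePolicy:
    updateMode: "Off"          # compute recommendations, never evict or resize pods
  resourcePolicy:
    containerPolicies:
      - containerName: "*"
        minAllowed:            # floor: never recommend below this
          cpu: 100m
          memory: 128Mi
        maxAllowed:            # ceiling: cap runaway recommendations
          cpu: "2"
          memory: 2Gi
```

With `updateMode: "Off"`, recommendations appear under the object's `status.recommendation` (e.g. via `kubectl describe vpa web-api-vpa`) without any automated change; switching to `"Initial"` or `"Auto"` is the step that demands the trust the survey is about.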
Bart Farrell: As much as we talk about technical challenges, the human problems that come with them, like trust in this case, have a significant impact and continue to do so. Yasmin, what are you going to be working on next?
Yasmin Rajabi: I have been talking to our UX designer about doing research beyond this survey, digging in with those platform engineers. There was another quote: if I'm going to get paged at 2am, I'm going to do whatever is in my control to not get paged. I'm less worried about the resource cost; I'm more worried about the cost to me personally. So we'll be digging in and having follow-up conversations with those folks to drive the next evolution of the product. That's what I'm most excited about.
Bart Farrell: If people want to get in touch with you, what's the best way to do that?
Yasmin Rajabi: My email is yasmin at cloudbolt.io or yasmin at stormforge.io, and I'm on LinkedIn as Yasmin Rajabi. My dream is to go by one name, but I'm not there yet. Reach out; I'd love to chat.
Bart Farrell: Fantastic. Thanks so much for your time today. We'll speak soon. Take care.
Yasmin Rajabi: Thank you.