Migrating Kubernetes Off Big Cloud
Mar 10, 2026
This episode is sponsored by LearnKube — get started on your Kubernetes journey through comprehensive online, in-person or remote training.
Managed Kubernetes on a major cloud provider can cost hundreds or even thousands of dollars a month — and much of that spending hides behind defaults, minimum resource ratios, and auxiliary services you didn't ask for.
Fernando Duran, founder of SadServers, shares how his GKE Autopilot proof of concept ran close to $1,000/month for a workload using only a fraction of a CPU, and how he cut that to roughly $30/month by moving to Hetzner with Edka as a managed control plane.
In this interview:
Why Kubernetes hasn't delivered on its original promise of cost savings through bin packing — and what it actually provides instead
A real cost comparison: $1,000/month on GKE vs. $30/month on Hetzner with Edka for the same nominal capacity
What you need to bring with you (observability, logging, dashboards) when leaving a fully managed cloud provider
The decision comes down to how tightly coupled you are to cloud-specific services and whether your team can spare the cycles to manage the gaps.
Transcription
Bart Farrell: In this episode of KubeFM, I'm joined by Fernando Duran, the founder of SadServers and a former DevOps manager responsible for operating Kubernetes clusters under real cost and reliability constraints. This conversation focuses on the practical realities of running Kubernetes at scale, especially around cost, control, and operational trade-offs. We dig into Fernando's experience with GKE Autopilot, including how the billing model actually behaves in production, the impact of system overhead like balloon pods, and why expected efficiency gains don't always materialize. From there, we compare managed cloud Kubernetes with a more hands-on approach using Hetzner, discussing workload placement, node sizing, operational responsibility, and where the break-even points really are. If you're making decisions about managed Kubernetes versus self-managed clusters, or trying to understand where your Kubernetes costs are really coming from, this episode is a grounded, experience-based discussion you'll want to hear. This episode of KubeFM is sponsored by LearnKube. Since 2017, LearnKube has helped Kubernetes engineers from all over the world level up through training courses that are instructor-led and given in person and online. They are 60% practical and 40% theoretical, they're given to individuals as well as to groups, and students have access to the course materials for the rest of their lives. For more information, go to learnkube.com. Now, let's get into the episode with Fernando. You're tuned in to KubeFM. Fernando, welcome to KubeFM. What are three emerging Kubernetes tools that you are keeping an eye on?
Fernando Duran: Mostly because I'm working on a project with an AI chatbot, I'm looking at Kubernetes tools that use LLMs, like AI agents, to try to figure out SRE stuff, like what problems are in the system.
Bart Farrell: And for people who don't know you, Fernando, can you tell us a little bit about what you do and where you work?
Fernando Duran: I'm the founder of SadServers, a platform that gives you Linux and DevOps problems that you have to solve in a practical way.
Bart Farrell: Okay. And how did you get into cloud native?
Fernando Duran: First I got into cloud a few years after AWS was created. Then at one of the startups I worked for, I was the DevOps manager and we did a proof of concept of Kubernetes, introducing it to the company as a way to move our platform onto it. My next startup was fully, 100% running on Kubernetes, and I was also DevOps manager there. So that was my full-on introduction to Kubernetes.
Bart Farrell: And what were you before Cloud Native? I know you mentioned getting started with AWS, but prior to that?
Fernando Duran: I'm old, so I've done pretty much everything, I think. I started as a software engineer, and after that I got into the DevOps side, like infrastructure, starting with Linux sysadmin. I worked in startups and then it was more like DevOps. So that was pretty much my path: from software engineering to sysadmin to DevOps, I'd say.
Bart Farrell: Okay. And the Kubernetes and cloud native ecosystem moves very quickly. How do you stay updated on top of all the things that are going on?
Fernando Duran: I don't. Rather than me pulling, it has to be more of a push from the outside, in the form of something that shows up in one of my feeds, or on Hacker News, or in some of the groups where I have friends and we chat with each other, like a DevOps group. So it has to be something that is affecting the community and goes above a threshold, and that's how I learn about it. But unless it's something that I need to work on, like I mentioned at the beginning, something related to solving problems with AI agents, or a direct problem that I have, I cannot keep in touch with everything that's going on.
Bart Farrell: It's a common challenge. If you could go back in time and share one career tip with your younger self, what would it be?
Fernando Duran: In general, I would say that a big part of any job, not only a technical job, is to say that you're going to do something, to do it, and then to say that you've done it. That seems like a very simple thing, but I think it's a good professional tip, whether you're dealing with contractors or really any job in the world. Sometimes people don't do that, and it just bothers me.
Bart Farrell: Now, as part of our monthly content discovery, we found an article that you wrote titled, Migrating Kubernetes Out of the Big Cloud Providers. So you recently wrote about migrating your Kubernetes workloads away from the major cloud providers. What prompted you to explore alternatives to the big three, AWS, Google Cloud, and Azure?
Fernando Duran: The easy answer is cost. Basically, it's super expensive for me. I'm a small startup, so cost is really important for me. The big three all have kind of a monopoly: they have the same price for their control plane, plus a lot of hidden costs in terms of monitoring and other services. Cost would be the issue for me. I was doing a proof of concept for a new product that I'm working on, and it was really expensive for a proof of concept.
Bart Farrell: Kubernetes was often promoted as a way to reduce infrastructure costs through efficient resource packing. In your experience running production workloads, has that promise held true?
Fernando Duran: No, and actually, I don't know anybody for whom it has. This is anecdotal, from what surfaces in my reading, even if, like I said, I don't read a lot of stuff. It hasn't been true in my experience with any of the companies I've seen working on Kubernetes, or the people I talk to who manage Kubernetes clusters. Basically, the advantages are other ones, like having the same way of doing things. When I moved from one company to another, both using Kubernetes, the onboarding was very easy for me, so standardization is very important. The way to extend things is also a good advantage; that's incidentally why I think WordPress, for instance, took over the blog and easy-website world. On cost, we thought bin packing was going to be a big thing, but it didn't live up to that promise in particular.
Bart Farrell: And you chose GKE Autopilot as your starting point. What made it attractive compared to standard GKE or other managed offerings?
Fernando Duran: In my first professional Kubernetes experience, we did proofs of concept with all of the big three public clouds. And Google, possibly or almost surely because of their Borg experience preceding Kubernetes, was the easiest one to work with. Then I worked in a startup that was fully, 100% on GKE, but for me, working on DevOps, one of our biggest pain points was just managing the node pools. GKE Autopilot, which has now been the standard for Google Cloud for a while, abstracts that out, and it was easier. On the other hand, whenever something is abstracted out, it hides a lot of things that you may want to have control of. It's a trade-off, like everything. So I went with GKE Autopilot, and then I learned some things I didn't know about it.
Bart Farrell: GKE Autopilot has a unique billing model based on pod requests rather than nodes. How did this affect your cost planning, and what surprised you about it?
Fernando Duran: It was mostly a me problem, not really fully reading the fine manual. It makes sense that it charges by what's requested; it's the same as when you provision a VM in AWS or wherever, you provision based on CPU and memory, not on what you actually use. But it has some things like minimum ratios of CPU to memory that you cannot change. In my case, I was trying to use super small pods with very little CPU, and then realizing that the memory requirements had a minimum, things like that. Basically, whenever you get into something new, especially if it's Kubernetes, where things are automated for you, you have to be careful about all those hidden costs.
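The effect Fernando describes can be sketched numerically. The figures below, 0.25 vCPU billing granularity and a 1 GiB-per-vCPU memory floor, are illustrative assumptions for this sketch, not Google's published Autopilot price sheet; the real minimums and ratios vary by compute class.

```python
# Sketch of how request-based billing with minimum ratios inflates small pods.
# ASSUMPTIONS: 0.25 vCPU granularity and a 1 GiB-per-vCPU memory floor are
# illustrative numbers, not GKE Autopilot's actual published minimums.
import math

CPU_STEP = 0.25          # assumed billing granularity in vCPU
MIN_MEM_PER_CPU = 1.0    # assumed minimum GiB of memory per vCPU

def billable(cpu_request: float, mem_request_gib: float) -> tuple[float, float]:
    """Round a pod's requests up to the assumed billable minimums."""
    cpu = max(CPU_STEP, math.ceil(cpu_request / CPU_STEP) * CPU_STEP)
    mem = max(mem_request_gib, cpu * MIN_MEM_PER_CPU)
    return cpu, mem

# A "tiny" pod asking for 0.05 vCPU / 0.1 GiB is billed as a much larger one:
print(billable(0.05, 0.1))  # (0.25, 0.25)
```

Under these assumptions, the tiny pod is billed at five times its CPU request, which is the kind of surprise that makes explicit requests worth auditing.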
Bart Farrell: Beyond the billing model, what other hidden costs or surprises did you encounter with GKE Autopilot?
Fernando Duran: The control plane costing the same, funny enough, everywhere; that's the easiest one, and it's well known. But the metrics: I got charged quite a bit for the metrics, and it's not easy to see which ones you're paying for. You would think, oh, this is covered by the basic logging and monitoring, but there were some logging and monitoring costs that you wouldn't initially think of. There are also costs in the sense of having pods just to keep a node alive, these balloon pods that we can talk about later on. Other surprises are, like I said, that when things are very automated for you, there are hidden things that you cannot control. You still get Google Cloud problems like quotas; in plain AWS or Google Cloud you can go, oh, I need to ask for more VMs, but in GKE it's kind of hidden, because it's abstracted for you, and you don't know exactly what quota you're hitting. And also, the errors that you get in the main dashboard don't go away after you fix them; then you realize they go away after a day or two. I don't know if that was just a me case or something more general, but that was my experience. So there are some things that you need to learn because of the abstraction.
Bart Farrell: I know you mentioned something about balloon pods in your GKE setup. For people who aren't aware of that, can you explain what those are and why they matter for costs?
Fernando Duran: What happens is that in your default GKE setup, you have to use four availability zones, and that means at least one CPU each. If you have less workload than that, what Google does is just run a pod doing nothing, so you're paying for that one node, that one CPU, even if you're not using it. Which in a way makes sense, because they have to have a node up. But at the beginning, when you've just read that you're only paying for what you use, or what you request rather, it was a bit surprising. Instead of using the default, I then learned that the minimum is actually two availability zones, so that's what I did, and it reduced the cost quite a bit. That's for GKE Autopilot; I think for regular GKE, you can have just one.
Bart Farrell: And what was the actual cost impact you experienced? And how did it compare to your expectations?
Fernando Duran: The actual workload for the things I was running, it was just a proof of concept, was something silly like 0.2 CPUs. Plus I had some things like certificate automation, a certificate bot that I didn't look at, and because of defaults it was using quite a bit of CPU. So basically, for this proof of concept, I was running quite close to a thousand dollars per month. When you're working in a big company and you're given free rein, that's why some costs just run away. But when it's just you and you have to take care of the costs, you start reading the fine print and doing things like going from four CPUs to two CPUs, using spot nodes, putting requests on all the pods that you're using, not using defaults. Then it went down to a few hundred dollars a month. But still, and this is answering one of the initial questions more, I was looking for something that wasn't fully managed like GKE, some happy middle that wasn't fully managed but also wasn't me starting from a bare-metal VM and managing the full Kubernetes myself. That was what I was looking for.
Bart Farrell: Using spot instances is a common cost reduction strategy. What practical considerations should teams keep in mind when using them in Kubernetes?
Fernando Duran: I think this is an easy one. When you have workloads that can be interrupted at any time, use spot instances, because they are way cheaper. There's actually an easy way, at least in GKE, to set an alert, and I was getting an email every time a spot instance was being destroyed. It seemed to be quite random: some days the spot node would get recreated two or three times a day, and then it would stay there for a couple of days. If you can accept that level of instability, because the workload has to be appropriate for it, then it's a no-brainer to use them for the right workloads.
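For placing a workload onto spot capacity on GKE, the documented mechanism is a `cloud.google.com/gke-spot: "true"` node selector on the pod spec. The sketch below builds such a Deployment spec as a plain Python dict; the workload name, image, replica count, and resource figures are hypothetical placeholders.

```python
# Minimal sketch of a spot-friendly Deployment for GKE.
# The `cloud.google.com/gke-spot` nodeSelector is GKE's documented way to
# request spot capacity; all names and sizes below are made-up placeholders.
import json

spot_deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "batch-worker"},  # hypothetical name
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "batch-worker"}},
        "template": {
            "metadata": {"labels": {"app": "batch-worker"}},
            "spec": {
                # Ask GKE to schedule these pods onto spot nodes only.
                "nodeSelector": {"cloud.google.com/gke-spot": "true"},
                "containers": [{
                    "name": "worker",
                    "image": "example/worker:latest",  # placeholder image
                    # Explicit requests keep request-based billing predictable.
                    "resources": {
                        "requests": {"cpu": "250m", "memory": "512Mi"},
                    },
                }],
            },
        },
    },
}

print(json.dumps(spot_deployment, indent=2))
```

Serializing the dict to YAML or JSON and applying it with `kubectl apply -f` would be the usual workflow; the key point is that only interruptible workloads belong behind that selector.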
Bart Farrell: If someone doesn't need the full ecosystem of a major cloud provider, what alternatives exist for running managed Kubernetes more affordably?
Fernando Duran: I didn't spend a lot of time looking into that, but just from initial research: if you don't need to take advantage of the full cloud infrastructure, for instance if you just need a Kubernetes cluster in one place plus some secrets manager or object storage from the cloud service, then you can go to a cheaper managed Kubernetes. I looked, for instance, at DigitalOcean; they seem slightly cheaper, although kind of in the same price range as the big three. OVHcloud seems to have a free control plane, as do other providers that I didn't try out. There's one called Civo; I don't know how to pronounce that.
Bart Farrell: Civo,
Fernando Duran: Civo. I never know how to pronounce these names. I would say try different ones and see which ones have the minimum credibility that you want and meet the minimum requirements for you. For individuals, or startups at the beginning that are very price sensitive, I think it's worth it to spend just that tiny bit of time trying out a couple of things and seeing which ones are right for them.
Bart Farrell: And you ultimately landed on a combination of Hetzner and Edka. What does each bring to the solution and why this pairing specifically?
Fernando Duran: Hetzner is one of the big providers that are very affordable, very cheap, and they have the basic minimum that you need for this stuff: they have an API, they have servers, load balancers, IPs, firewalls, even object storage. So you have the basics if you just need to run workloads. The API is, I think, especially a big thing, and they even have a Terraform provider. So Hetzner, similar in a way to OVHcloud, seems like a good place if you need basic servers plus some of the minimum services. Then I found out about Edka. It's a startup running out of Barcelona, Spain, which is so nice. What they offer is a control plane, a managed Kubernetes on top of Hetzner, and they specialize in Hetzner specifically, so that was a good selling point. I tried them out and I'm very happy with them. At the time I signed up, and I haven't looked recently, you could have your first cluster for free; they start charging on the second one, so for me it was a very good thing to try out. Like I said, that gave me the happy medium I was looking for: cheap underlying VMs, like servers, without me having to do all the work, and some Kubernetes management without all the bells and whistles like logging and monitoring, things I have to add myself but that are pretty standard.
Bart Farrell: And for teams out there that are considering a similar setup, can you walk through the specific infrastructure components and their costs?
Fernando Duran: For the POC I was running, I was using a dedicated server for the master, which makes sense, and I think is actually required. That's like $15 a month, so very cheap. Then I used a shared-CPU worker, also in the $9 to $10 a month range. A volume is very cheap, like half a dollar per month. The load balancer is like $6, and then a few public IPs, that's like $2. So it comes to $30 a month or so, which is like five times, 500% cheaper than the Google Cloud GKE solution for the same nominal capacity. That's the math I did at the time; I need to review that.
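Summing the per-component figures quoted here lands close to the $30/month total, and comparing against the roughly $1,000/month the original Autopilot proof of concept cost (before the optimizations mentioned earlier) gives the order-of-magnitude gap from the episode intro. The prices below are the approximate figures from the interview, not current Hetzner list prices.

```python
# Rough monthly cost sketch for the Hetzner setup described above.
# Per-item prices are the approximate figures quoted in the interview.
hetzner_monthly = {
    "dedicated control-plane server": 15.0,
    "shared-CPU worker": 10.0,
    "volume": 0.5,
    "load balancer": 6.0,
    "public IPs": 2.0,
}

total = sum(hetzner_monthly.values())
print(f"~${total:.0f}/month")           # ~$34/month

# Versus the ~$1,000/month the unoptimized GKE Autopilot proof of concept cost:
print(f"~{1000 / total:.0f}x cheaper")  # ~30x
```

Against the optimized few-hundred-dollars-a-month GKE figure, the gap is closer to the "five times" Fernando mentions; against the original $1,000/month, it is roughly thirtyfold.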
Bart Farrell: And how did the setup process compare between GKE Autopilot and the Hetzner plus Edka combination?
Fernando Duran: The setup in both of them is completely automated. The difference is the extra things that GKE gives you, which is fine if you're okay with that. With Edka, I didn't have any problems; I just went with the documentation. There were a couple of very small things, and I talked to the founder and they fixed them. Other than that small secondary thing, it was pretty straightforward in both of them.
Bart Farrell: And beyond cost, did you notice any difference in performance or operational characteristics between the two setups?
Fernando Duran: This is anecdotal for me, just spinning up pods in both of them and looking at latencies, but basically, Hetzner seems way faster than GKE. Again, this is anecdotal, I didn't do any benchmarks or anything, but just from getting the command prompt back from pods and seeing how fast they came up, I didn't notice any issues from moving from the big cloud to a more affordable option.
Bart Farrell: What other tools would you recommend teams look at for managing Kubernetes clusters on commodity infrastructure?
Fernando Duran: Whatever your Kubernetes management tool is missing. In the case of Edka, it's the basic Kubernetes control plane, so it doesn't include metrics and logging; you have to add your own: Prometheus, Grafana, and Loki, or whatever logging system you want for your applications. Also an open source tool for a Kubernetes dashboard; Lens is the one that I use, and I think it's a very popular one. You have to add your own observability stack, whatever you're most comfortable with. But once you add that, you are pretty much on par with a cloud solution, so I think it's very worth it to explore this option.
Bart Farrell: And Fernando, based on your migration experience, what advice would you give teams evaluating whether to move away from the big cloud managed Kubernetes?
Fernando Duran: Like with any other abstraction, the trade-off is between your team's time and whether the tool that abstracts out what's underneath is worth it. If you can spare some engineering cycles to set up your own observability, cost is important for you, and you don't have to interact tightly with a lot of the cloud provider's services, then running on commodity hardware plus a tool for the Kubernetes control plane can be a solution for you.
Bart Farrell: Fernando, what's next for you?
Fernando Duran: Besides keeping SadServers growing, I'm working on a superset of SadServers. In SadServers, you get a problem on a VM. Internally, I call the new thing SadSRE: a full-blown SRE simulator, a superset of SadServers with practical problems for SREs and DevOps engineers.
Bart Farrell: And how can people get in touch with you?
Fernando Duran: They can send me an email at info at sadservers.com. Personally, I'm almost everywhere as fduran, on LinkedIn, Twitter, what have you. There's even fduran.com, so I should be easy to find.
Bart Farrell: Great. Well, thanks so much for sharing your experience with us, Fernando. Look forward to speaking to you again in the future. Take care.
Fernando Duran: Thanks for having me.
Bart Farrell: Pleasure.
