Kubernetes on bare-metal: lessons learned
What does it take to build a Kubernetes cluster on bare metal?
In this episode of KubeFM, you will learn how to plan and execute a successful setup for a bare-metal Kubernetes cluster.
You will follow Mathias' journey as he rebuilt his cluster several times and learn how to:
Identify dependencies and priorities between components to avoid incidents in the future.
Leverage FluxCD to have a predictable and documented setup.
Secure the nodes from external traffic with firewalls and Cilium cluster-wide network policies.
Use Talos to have a self-contained Kubernetes operating system.
Mathias also shared tips and advice for other engineers embarking on the same process.
Relevant links
Transcription
Bart: When people have free time, some people like to go for long walks on the beach. Others like to read books. Others like to do sports. And others, yes, others build their own Kubernetes clusters on bare metal at home. In our next episode, you'll be hearing from Mathias, who did it all on his own, from start to finish. We hear about the technologies he used along the way, whether or not he used Argo or Flux, what he decided to use when it came to networking, and lots of other details that I'm sure you'll find interesting. Check it out. All right, Mathias, you've got a brand new Kubernetes cluster. Which three tools are you going to install on it first?
Mathias: I think I'm pretty basic in that regard. Ingress controller, cert-manager, and external-dns.
Bart: Yeah, it's interesting. Our previous guest that we had also mentioned cert-manager. Why that?
Mathias: I mean, I use it primarily for securing endpoints, ingress points, so getting HTTPS on your web services is what I use it primarily for.
Bart: Tell me a little bit more about what you do, who you are, and where you work.
Mathias: Yeah, so I'm a freelance DevOps engineer situated in Copenhagen. So I work primarily with Kubernetes, DevOps, and also a little bit of AWS. And then I'm also really fond of Hetzner, but not so much professional work there yet.
Bart: Now tell me a little bit more about your background. How did you get started with Cloud Native? What were you doing before you started working with Cloud Native Technologies?
Mathias: Yeah, so I'm a self-taught programmer. I started with HTML and CSS and did some AutoIt for some bots for video games and stuff like that. Moved into C++, then I finally got a job working as a C#/.NET developer. But I was embedded in the infrastructure team. So, you know, the infrastructure side of it kind of started rubbing off on me. Got an interest in, you know, virtualization, networking, and all that stuff. And finally decided to see if I could jump into a more infrastructure-oriented role and ended up in a DevOps operations team, which is where my cloud-native experience kind of started, you know, working with AWS and a little bit of Kubernetes. I also started playing around with Docker and stuff like that.
Bart: Very good. You mentioned bots for video games. Can you tell me a little bit more about that?
Mathias: So it was a Danish version of Habbo Hotel, kind of, where you had to stay active to get credits. So I wrote a small bot in AutoIt to move around so you didn't get logged out. I think that was my first wow experience with programming. I was like, okay, I can actually make something that has some use. I can interact with the computer in a whole different way.
Bart: I think it's really important though, right? In a developer's journey, and we'll talk about this a little bit more, it's one thing to be coding in your job, but it's another to take that into personal projects so that you can build something and, like you said, have that sort of wow experience. Now, you learned Kubernetes by just kind of jumping in. What was the process of learning like? What were the things that were challenging for you along the way?
Mathias: When I started using Kubernetes, it was a little bit simpler landscape back then, I think. There weren't as many options. K3s was a thing, but k0s and some of the more, like Talos, for example, didn't exist. So it was a lot more difficult to deal with. I think a lot of people back then ran into the certificate expiration issues where all of a sudden the cluster stopped working after about nine months for some reason, stuff like that.
Bart: With that in mind, if you could go back to tell your previous self, and maybe just remind me, you know, when did you start working with Kubernetes? So let's go back a few years. At that point, what advice would you give to your previous self to make it maybe a little bit less difficult in terms of learning how to use Kubernetes?
Mathias: Oh, that's a difficult one. I mean, I think the best way to learn Kubernetes is to learn by doing, so just jumping into it. I mean, that's also kind of how I got started, but I think just getting into it and getting your hands dirty is the absolute best way to get started with it.
Bart: Makes sense, right? What we say in Spanish is, throw yourself into the swimming pool.
Mathias: Yeah.
Bart: And start swimming, and getting in contact with it, and developing those projects on your own as well. I think it's a good way. In terms of what we want to focus on today, you wrote an article about bare-metal Kubernetes, particularly focusing on Talos on Hetzner. So, once again, this shows an interest in not just coding at work, but also taking it into personal projects. And these can also help people eventually get jobs because they give a different kind of visibility. What are your thoughts on this? Is this something every programmer should think about doing?
Mathias: Programming in your own time or?
Bart: Yeah, developing personal projects, things of that nature.
Mathias: I think it's a very fine line. I think there's a lot of burnout, especially in the development world as well. So I think you have to be careful. But I think as long as your side projects and stuff are something you're really interested in, and it's not something that you're burning yourself out over, I think it's a great way to expand your horizon a little bit. I mean, a lot of the technologies I work with are just stuff that I've heard about or think is interesting, not necessarily something I want to monetize or anything like that. But it's definitely made me a better programmer and developer to just have even a shallow understanding of networking, storage, how all these things work together.
Bart: That's a good point. You don't necessarily have to become an expert, but at least being in contact with it and being able to empathize with the folks that are spending their day-to-day working with those technologies. And a further thing that you mentioned is the topic of burnout, something that comes up a lot: how to detect it when it's happening, how to stop it before it starts. Something that I learned, probably too late because of dealing with burnout on my own, is estimation. With the idea of, oh, I'm just going to create this bot for a video game, or I'm going to spin up a website, or I'm going to do something in my free time, estimations in general, I think, are off by at least 50%. That's from hearing from Cal Newport, who's an expert on deep work and focus time and things of that nature. In general, if you think something's going to take one hour, it'll probably take two; if you think it's going to take one day, two; one week, two; et cetera. So keeping that in mind when doing stuff, being kind to yourself, and trying to be as realistic as possible is, I think, important, so that things that are being done for fun stay fun and don't become a burden and something that builds up resentment. But getting back to the article that you wrote, Bare Metal Kubernetes: there are a lot of different managed offerings, EKS, GKE, AKS, et cetera. Why do it on your own? What was the attraction of, hey, I'm just going to jump right in and do this on my own?
Mathias: I mean, the primary one is definitely cost. It's a little bit expensive to run an EKS cluster 24/7 when you don't really have anything running on it. Like, I'm using my cluster right now for running a TeamSpeak server for some friends and some game servers whenever we want that. But it's basically a big money sink. So that's one of the primary ones. Apart from that, I'm also kind of interested in, you know, privacy and also, you know, I mean, one of the motivations for this cluster is trying to make a self-contained cluster. So that means like no external load balancers and stuff like that. So that was also a big interest for me. And being able to eventually maybe move it onto actual hardware instead of hosting it with Hetzner.
Bart: If we shift over to the side of provisioning, setting up a production-grade Kubernetes cluster is something many would say isn't for beginners or for the faint of heart. And there are lots of different details that have to be kept in mind in order to be able to deploy your first app. How did you divide up the work so that it didn't become so overwhelming and, as you said previously, lead to failure or burnout?
Mathias: Yeah, I mean, it was difficult. I've worked with a lot of the technologies before and I've also rebuilt this cluster a number of times before. So I had a lot of background knowledge and ideas. So I was able to lay it out somewhat. But I mean, one of the biggest problems was figuring out the dependencies. So for example, you need to have, well, Flux for one, to configure things and have a really good overview of what you've actually deployed. But something like persistent storage is really important for a lot of applications, like for example, Harbor. But that also has some dependencies. So trying to figure out which things depend on which, and then trying to lay it out in that order, is the difficult part. And it's also something that I failed at because of the first incident, which is also part of the blog series, where basically the whole cluster stops working because Harbor is trying to reach the persistent storage, but the persistent storage failed because I was rebooting nodes, and the nodes can't get booted because they can't reach Harbor to get the images they need. So the whole thing just kind of falls apart. So that's one of the hard parts.
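The dependency problem Mathias describes can be sketched as a small graph check. This is a minimal illustration only, not his tooling: the component names and the `install_order` and `find_cycle` helpers are hypothetical, and the edges are a simplified reading of the incident as told above. It uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import CycleError, TopologicalSorter


def install_order(deps):
    """Return components ordered so that dependencies come first.

    `deps` maps each component to the set of components it needs.
    Raises graphlib.CycleError if the dependencies are circular.
    """
    return list(TopologicalSorter(deps).static_order())


def find_cycle(deps):
    """Return one dependency cycle as a list of names, or None."""
    try:
        install_order(deps)
    except CycleError as err:
        return err.args[1]  # graphlib reports the cycle as a list of nodes
    return None


# Hypothetical, simplified layout: each component needs the one before it.
planned = {"flux": set(), "storage": {"flux"}, "harbor": {"storage"}}

# The incident as described: Harbor needs storage, storage needs the
# nodes to be up, and the nodes need Harbor to pull their images.
incident = {"harbor": {"storage"}, "storage": {"nodes"}, "nodes": {"harbor"}}

print(install_order(planned))
print(find_cycle(incident))
```

Writing the dependencies down as data is what makes the loop visible: the planned layout sorts cleanly, while the incident layout has no valid start order at all.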
Bart: One of the hard parts, for sure. And with that in mind, what should come first? If you've got to start with one thing, what's your recommendation? First go with this, then that, then what comes next?
Mathias: Yeah, I think Flux or any kind of CD product is a good place to start. Mainly because I believe that, you know, discipline is great, developer discipline, but it's also one of the most unreliable ways of actually keeping track of what you're doing and making sure that you have some sense of what's going on. So deploying Flux as one of the first things and then making sure that pretty much everything I did was through this Git repository was one of the ways to make sure that I actually had a clean cluster and I had a really good idea and history of what I was actually doing with it. Because in the past, you would try and experiment with something and you worked really hard for a week and a half, and then you lose interest or something comes up and you forget all about it. And the next time you need your cluster, you're trying to install something and it just doesn't work because you were halfway through replacing the CNI or mucking about with something else.
Bart: And speaking of mucking about with something else, I just want to ask, because it's true of so many of the different technological choices you're making: how do you go about it? You say, okay, standard practice, I'm going to try Flux. I'm also going to try two or three alternatives. What's your process in terms of reaching the decision about which one is going to work best for you?
Mathias: Yeah. So having rebuilt this cluster quite a number of times, I've played with a lot of both the technologies that I'm using and some of the competitors, like Argo CD, for example. So a lot of it is just the prior experience, you know, trying it out. What are the pain points? Which ones can I live with? Which ones can't I? And then just investigating the alternatives.
Bart: You did settle on Talos eventually as one of the choices that you made. Can you tell us really quickly what Talos is and why you decided to go for it? What were the pain points that you were willing to deal with and accept, and what did you not really find in the alternatives?
Mathias: Yeah, so Talos is basically a fully, what do you call it, fully self-contained Kubernetes operating system, kind of like Flatcar Linux, which is something I've also investigated earlier, but haven't had a chance to work with. So I really like the idea that the whole machine is basically managed through a single YAML document. And you can say what you want about YAML, but it's nice that you know everything about the cluster or about the single machine from this one document. I think that's a great way to get a feeling for the whole setup without having to dive into, oh, how is this thing configured? I mean, like I say in the blog series, they make a lot of decisions for you, but as long as you're okay with these decisions, then I think it's a great choice.
Bart: And in terms of any downsides, major downsides that our listeners should keep in mind if they were to decide to use Talos, what are some things that perhaps, you know, there are always going to be trade-offs? What are things that you think would be important to think about there?
Mathias: I mean, I ran into a very real one. I was experimenting with a new part of the series and I wanted to see if I could set up some virtual machine hosting using KubeVirt. But getting networking working with KubeVirt kind of required something called Kube-OVN or a similar product, which uses Open vSwitch to do the routing between the virtual machines. But trying to deploy Kube-OVN was actually really difficult because it wasn't really designed for this immutable storage solution that Talos works with. So that was something I had to actually walk back and just drop that project, at least for now.
Bart: You've mentioned, you know, one of the things that came up earlier was, you know, storage. And it's something that having spent a lot of time talking to people in the database space and the storage space, we know it's something that's tricky and can involve a lot of complexities. Something that also involves a fair amount of complexity would be networking. And you've settled on Cilium as your CNI of choice. What led you to that decision?
Mathias: I mean, I've used a lot of other ones in the past, like Flannel and stuff like that. I haven't actually benchmarked any of this, but I mean, the performance claims and using eBPF make a lot of sense to me for faster routing. I also didn't need anything super advanced, so Cilium also works well there. And then I also have a really good use case for it with the Cilium cluster-wide network policies for securing the nodes, since I didn't have one of those things that you would have if you go with a provider like Azure or AWS, where you have the firewall in front or the load balancer. Not having that meant that I had to find another way to secure my nodes, and the Cilium cluster-wide network policy was great for that.
Bart: You mentioned a firewall. And it's also interesting because, having been at CiliumCon in Amsterdam and hearing from other folks that are end users of Cilium, a lot of it seems to be that there's an initial understanding of Cilium being very much focused on networking. And another part of it, as you rightfully mentioned, is the part about security and securing the nodes. Did you use a regular firewall to do so? And on top of that, could you go into a little bit more detail about Cilium network policies and policy audit mode?
Mathias: Yeah, so I've locked myself out of clusters at least a couple of times in the past. On the servers, you know, you configure the firewall, you accidentally cut off your own SSH access, and then you're just kind of lost. So that was one of those things I was trying to avoid this time around. So basically what I'm doing is I'm using the Hetzner firewall for securing the Talos and the Kubernetes endpoints, which is just hardcoded to whitelist only my IP, basically, and then I'm using the cluster policy for everything else. Policy audit mode is really useful for debugging these things, but at least for the nodes, which are servers that are exposed to the internet, there is a lot of traffic. So if you're enabling the audit mode so you can go through and check that everything makes sense, there's a lot of data there. So making sure that you don't do something wrong is a little difficult. Also, the way Cilium handles these firewall policies, at least by default, is that it doesn't enforce any rules unless there are rules. So it's basically allow all, except if you have policies enabled. So I've also previously used these cluster-wide network policies but applied them in the wrong order, which also locked me out of the cluster, which is why I'm using just a single policy this time around.
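The "allow all until a policy exists" behaviour Mathias mentions can be modelled in a few lines. The sketch below is a toy model in plain Python, not the Cilium API or its policy format: the `IngressRule` type, the `allowed` function, the node label, the ports, and the documentation-range IP address are all assumptions made for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class IngressRule:
    """Hypothetical ingress rule: which nodes it selects and what it admits."""
    node_label: str
    source_cidr: str
    port: int


def allowed(rules, node_label, src_ip, port):
    """Toy model of the default described above.

    If no rule selects the node, everything is allowed. As soon as one
    rule selects it, only traffic matching some rule gets through.
    """
    selecting = [r for r in rules if r.node_label == node_label]
    if not selecting:
        return True
    return any(
        ip_address(src_ip) in ip_network(r.source_cidr) and r.port == port
        for r in selecting
    )


ADMIN_IP = "203.0.113.7"  # documentation-range address, stands in for "my IP"

# No policies yet: the Kubernetes API port is reachable from anywhere.
print(allowed([], "node", ADMIN_IP, 6443))

# Apply only a web rule first: the admin is now locked out of port 6443.
web_only = [IngressRule("node", "0.0.0.0/0", 443)]
print(allowed(web_only, "node", ADMIN_IP, 6443))

# A single policy carrying both rules avoids the intermediate lockout.
combined = web_only + [IngressRule("node", ADMIN_IP + "/32", 6443)]
print(allowed(combined, "node", ADMIN_IP, 6443))
```

The middle case is the trap: the first policy that selects a node silently switches it from allow-all to deny-by-default, which is one way applying several policies in the wrong order can cut off access.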
Bart: So at this point now, you've got a bare-bones cluster that could accept deployments. What was your next move?
Mathias: Well, I mean, the Ingress and the external-dns and the cert-manager is a great place to start because then you can start getting access to web UIs and stuff like that.
Bart: And I know we touched on it previously, but I want to dig into this a little bit more because we love a little bit of controversy. Looking at, you know, Flux CD versus Argo CD. In terms of the Argo CD fans out there, if they're saying, hey, why did you make that choice? What do you tell them? I imagine you probably got some feedback around that, but I'm just curious. How are those conversations going?
Mathias: Yeah, I actually haven't had many people comment on that, but it was a very deliberate choice. I am a big fan of Argo CD, and I think as far as discoverability and interface, it's a lot better than Flux CD. The only real gripe I had with it was the way it handled Helm releases, basically, where you had to subchart a Helm chart in order to deploy it from a Git repository, which was a little roundabout, and I wasn't exactly sure how I was managing to keep up with the versioning. So if you're subcharting another chart, is Argo CD capable of properly noticing that an upgrade is ready for that subchart? I'm pretty sure that since then they've actually implemented a solution for this. So you can, what do you call it, you can have the values stored alongside your repository. But they didn't have it back then.
Bart: With having Flux set up, you go ahead and configure the rest of the cluster. Is that correct?
Mathias: Yes, so with Flux CD and the Ingress controller and external-dns and cert-manager all running, I decided to actually scale out the cluster because until this point, it's just been one server, which is usually pretty painless because there's no coordination, there's no quorum or anything like that. So the next step was to basically scale the cluster to all three nodes.
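The quorum arithmetic behind moving from one node to three can be written out as follows. This is a minimal sketch that assumes all three nodes act as control-plane (etcd) members, which the conversation does not state explicitly; the function names are illustrative.

```python
def quorum(members):
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def tolerated_failures(members):
    """How many members can be lost while a majority still remains."""
    return members - quorum(members)


for size in (1, 2, 3):
    print(size, quorum(size), tolerated_failures(size))
```

Under that assumption, a single node trivially agrees with itself but tolerates no failures, two members tolerate none either, and three is the smallest size that survives the loss of one member.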
Bart: Alright, I think everyone would be in agreement that this sounds like a fair amount of work. Some would even say a lot of work. Exactly how much time did this take from the start up till this point?
Mathias: Probably around maybe a hundred hours all in all, I think. I mean, it's a lot. Editing and writing the blog post also added a lot of time to the project, but it's also an incredibly worthwhile experience because it forces you to go through your decisions and actually document them. And whenever you have to make a decision, you have to challenge yourself and be like, why exactly am I just defaulting to this? What is the reasoning for me choosing this? And if you don't have an answer, you can't really write about it. So it's a very rewarding experience.
Bart: I see it as an exercise because writing means taking it one step further, having to defend those viewpoints because there are alternatives out there, and you say, well, I tried this one, I tried that one, and this informed my decision. That's as opposed to perhaps being in a team where decisions are made outside and you're simply informed: this is the technology we're using, don't worry about the why, just start using it. In terms of all the different things that you installed and configured, is there anything that really stood out or surprised you, or things that you would like our listeners to know?
Mathias: I think starting with the Flux CD thing is definitely a good move. Just because, like the writing, it forces you to kind of document your progress. You can't just kubectl apply something. You're forced to keep track of what you're actually doing with the cluster. I think that's a super helpful tool.
Bart: Having done this, would you recommend this as a good experience for our audience to build a bare metal cluster?
Mathias: I mean, if cost is a big issue, like if you're an enterprise customer and you're worried about keeping costs down, I think yes. I think also if it's a learning experience, I think yes. There are still, I mean, Kubernetes has improved a lot, but there are still some foot guns out there. So managed solutions are great if you just wanna hit the ground running and you have a product to build. But there are still some edges you can cut yourself on and stuff like that. But it's really fun.
Bart: And in terms of companies that based on the sector that they're in have no choice but to deploy clusters on-prem, any recommendations you would give them?
Mathias: I mean, make sure that you have the people and the education to manage it because it's a little bit different to manage on-prem architecture versus cloud architecture. A lot of the hard things like networking and storage are handled for you in managed solutions. So there's like a whole new world of bugs and strange issues you can run into. So, I mean, I don't wanna like hard sell it too hard. Kubernetes can be difficult, but it doesn't have to be super difficult. Most people can do it, but it also takes time. And I think you need to devote the proper attention and time to know what you're actually doing if you do it this way.
Bart: I think that touches a very important point that you mentioned, the people factor in an organization. Building a culture where folks are out there trying different things and getting their own experience doesn't mean it has to be an obligation to do so, but have you seen any cultural aspects in different organizations that might, let's say, lend themselves more to people going out and doing their own proofs of concept, things of that nature?
Mathias: I mean, I think it's super important that developers have the time to also experiment with some of the potentially huge benefits they might see in future technology. I think it's also one of those things that people overlook a lot in organizations, that you don't have enough slack because you're running a sprint and you need to fill it like 90 to 100% of the time. So I mean, developer downtime, I think, is something that people should focus a lot more on because, well, most of the developers and operations people I've met have like a natural curiosity. So if they get, you know, 20% of their time where they're just sitting there twiddling their thumbs, they're probably going to go on Reddit or Hacker News or some other technology forum and start looking at what other people are doing and seeing if anything is applicable to their own situation.
Bart: Great point. And then those experiences can then make decisions better informed because someone actually has real life experience working with one of these technologies. You said that you spent a hundred hours in this process and then many, many more in terms of writing the blog series. What's the feedback been like? What have people commented on? Have there been any surprises there for you?
Mathias: I mean, I was surprised to get any feedback at all, to be honest. I posted on Reddit and got some pretty good feedback there. And then my girlfriend egged me on to post it on Hacker News and I went along with that.
Bart: Good job. Shout out to her.
Mathias: Yeah. Yeah, thanks Lars. And just, I kind of expected to just get, you know, just drown in the deluge of content that's posted there. So I was actually really surprised to get any feedback at all, and especially such positive feedback. I mean, who cares about my tiny little three node cluster in Germany? But apparently a lot of people do and found it really interesting. Yeah.
Bart: I think it's something for everyone: make community part of your experience and part of your strategy. And by putting those ideas out there, people will interact. Sometimes you may get responses where there could just be a misunderstanding, or people might be really, really passionate about one technology over another. But respecting the effort that goes behind it, I think, is essential. Then having done this, what's next for you? Are you gonna be building more on-prem clusters? What can we expect from Mathias?
Mathias: I'm hoping to build some more clusters. I also have a few more blog ideas in the pipeline, some monitoring and stuff like that. Yeah, I'm hoping to make Kubernetes cluster building my core work, but it hasn't really, what do you call it, cemented yet.
Bart: But it's a work in progress. Yeah. And when you're not building, you know, on-prem clusters in your spare time, what do you like to do for fun?
Mathias: Well, that's what I do for fun. I became a dad in April. So some downtime with the paternity leave is also what allowed me to actually go through with this project. So a six month old daughter and we're moving here in a month. So...
Bart: You've got nothing but free time.
Mathias: I've got no free time. It's just paperwork and diapers. So, uh, yeah, that's what I'm having fun with in my spare time.
Bart: I think it's great though, is that, yeah, being able to blend something that's both professionally beneficial and also, you know, that's personally rewarding and going out and testing out these things, getting that experience, sharing with others to help them grow. I really thank you for your work and I'm sure many other people have done the same. If people want to get in touch with you to ask you more directly about what you've done, what's the best way to do it?
Mathias: Either by email or through LinkedIn, I think. Both of them are on my website.
Bart: So pretty easy to find. We'll be linking your website as well in the show notes so people will be able to reach out if they want to. Mathias, thank you very much for joining us today. Really appreciate your work and hope to see you in the future.
Mathias: Thanks for having me.
Bart: Perfect. Cheers.
Mathias: Cheers.