Kubernetes is simple: it's just Linux

Host:

  • Bart Farrell

Guest:

  • Eric Jalal

Eric Jalal, an independent consultant and Kubernetes developer, explains how Kubernetes is fundamentally built on familiar Linux features. He discusses why understanding Linux is crucial for working with Kubernetes and how this knowledge can simplify your approach to cloud-native technologies.

You will learn:

  • Why Eric considers Kubernetes to be "just Linux" and how it wraps existing Linux technologies.

  • The importance of understanding Linux fundamentals (file systems, networking, storage).

  • How Kubernetes provides a standard and consistent interface for managing Linux-based infrastructure.

  • Why learning Linux deeply can make Kubernetes adoption an incremental step rather than a giant leap.

Transcription

Bart: Kubernetes is just Linux, right or wrong? In today's episode of KubeFM, we'll be speaking to Eric Jalal, who will explain how Kubernetes might seem complex at first, but is fundamentally built on familiar Linux features. In other words, for Eric, Kubernetes isn't an entirely new innovation - it's more of an orchestration layer that wraps existing technologies to manage containerized applications at scale. It shares concepts with other orchestrators like Docker Swarm and Apache Mesos, but stands out due to its simplicity and flexibility. Kubernetes is built on top of Linux and relies on Linux APIs for its functionality. And for Eric, it's essential to understand Linux in order to work effectively with Kubernetes. Eric shares with us his experience about learning the fundamentals of Linux, such as file systems, networking, and storage, as they are crucial for working with Kubernetes. You'll learn from him about how Kubernetes provides a standard and consistent interface for managing Linux-based infrastructure, and also get some great book recommendations. Join us as we discuss how Kubernetes takes the best of Linux and packages it into a powerful, efficient orchestrator.

This episode is sponsored by Learnk8s. Learnk8s is a training organization that helps engineers all over the world level up their Kubernetes skills. From beginners to advanced, there are courses that are online and also in-person, which are 60% hands-on and 40% theory-based. Students who take the courses have access to the materials for the rest of their lives. For more information, go to Learnk8s. Now, let's get into the episode and talk to Eric.

Eric: All right

Bart: Eric, can you tell me about the three emerging tools you're keeping an eye on?

Eric: Thanks for asking about that. To be honest, there isn't one specific tool that I'm really following; there are many tools that I follow all the time. One of the most important areas is the container runtime. This is the biggest challenge that I have in different organizations. When you're setting up a Kubernetes cluster, you usually have lots of challenges with Linux servers at the bare-metal level. If you're using a public cloud provider, you don't have those issues, but then you have other ones. Recently, I've been working a lot with Hetzner and setting up everything myself. I'm choosing different container runtimes for Kubernetes and following many projects. For example, I'm familiar with the Lima project and Colima, which provide container environments. The Kubernetes project itself is something I follow all the time, and I attend meetups. That's the biggest thing I follow. There are many other projects, though. Sometimes I build my own operators and put them in the company's GitHub organization, and I follow those as well.

Bart: Good. And for people who don't know you, what do you do and who do you work for?

Eric: I'm an independent consultant at the moment, and I do Kubernetes development. I started as a DevOps engineer and joined companies in that role. I say DevOps because it's more of an infrastructure-focused role, but nowadays you can see other names like SRE, infrastructure engineer, or DevOps. I believe these distinctions should exist, but there are projects where I have to develop new tools, for example, for GitHub. There's a project where I have to develop a complete new cluster or investigate its security side, which is mostly on the Kubernetes side. That's what I do nowadays, mostly.

Bart: And what was the process like for you getting into cloud native? What were you doing before, and what was it like becoming more of a cloud native person?

Eric: I started as a backend developer, working with the Java programming language about 15 years ago. At that time, Java was quite popular and competing with Microsoft's .NET framework. Internet access was limited in my country, so people were using a lot of Windows applications. I wanted to build software, but since most applications were Windows-based and required a paid license, I turned to Linux and Java as a more accessible option. I used NetBeans as my development environment, and that's how I got started.

It took about seven or eight years, and then I decided to dive deeper into infrastructure-related topics. I asked my managers if I could work on tickets related to virtual machines, and they were supportive. Kubernetes was a particularly interesting topic for me, given its connection to containers, which are closely related to operating systems, specifically Linux. This journey took about two to four years.

Bart: And you know, the Kubernetes and cloud native ecosystems move very quickly. What works best for you to stay on top of things? I know you mentioned you go to meetups. But also, what are other resources that are good for you - if they're books, blogs, videos, or podcasts?

Eric: To keep myself updated, I read a lot, especially about mTLS. I use the watch button on GitHub for projects I'm interested in, and I sometimes contribute to them. For example, I actively contribute to some Facebook repositories, and I also work on RunC and Terraform providers, writing them in Golang. These projects are important to me, so I keep myself updated by checking my GitHub notifications every morning. I closely watch these projects, and I also follow DevOps and technology experts on Twitter, where they often share their experiences. Recently, I got a Medium subscription, which provides me with brief information on new technologies and how they connect to other technologies. I enjoy reading these articles during my free time, such as on the bus.

Bart: If you could go back in time and share one career tip with your younger self, what would it be?

Eric: If I could go back in time and give advice to my younger self, I would say learn Git and C++ more thoroughly. My understanding of C and C++ was limited back then, and I didn't grasp the concepts as well as I do now. I struggled with data structures and building them, and it took me many years to realize what I was missing. If I had a better understanding of these concepts earlier, I would be in a better position today.

I would also tell myself to focus on learning, not just passing exams. At university, I was more focused on passing, but now I realize the importance of understanding the material.

Additionally, I wish I had learned Git earlier, when it was still a relatively new tool. If I had understood how Git works and even contributed to the project, that would have been a huge achievement. Git is an amazingly engineered project with a complicated architecture, and it's the backbone of every technology company that uses code. It's the glue that holds everything together.

Bart: As part of our monthly content discovery, we found an article you wrote titled "Kubernetes is just Linux". The following questions will take a closer look at this topic. It's interesting to know that this year, Kubernetes celebrated its 10th birthday, whereas Linux celebrated its 33rd. But to start out, how can Kubernetes and Linux be the same?

Eric: Kubernetes is basically the child of Linux. Linux is the parent, and it's not only Kubernetes, but also containers, that have a super close relationship with each other. I'm not saying they are the same, as they don't do the same thing. Linux is an operating system, while Kubernetes orchestrates smaller operating systems. All the functions are based on the Linux operating system, and Linux empowers Kubernetes to do its job.

50% of infrastructure work is plumbing. The plumbing concept refers to connecting already existing components that work standalone and creating new automation around them, effectively renovating them into a new piece of technology, in this case Kubernetes. To plumb these components together, Kubernetes uses etcd, control groups (cgroups), the Linux volume and file-system primitives, and networking interfaces. Most of these components have been part of Linux for a long time. Kubernetes connects the dots and adds logic on top, which is why I call it a renovated operating system for a specific domain. It's not inventing anything new, but rather connecting things together.

Bart: And it's not a secret that Kubernetes is a complex piece of technology. However, in the article, you argue that Kubernetes is simple. We've been doing this podcast for over a year now, and I've heard very few people say that Kubernetes is simple. What's your argument for that?

Eric: I don't think people should say that Kubernetes is difficult. I'm not saying it's super easy, but it's not rocket science either. When you already know how things work and how they work together, it's much easier to connect with this technology. For example, if I ask you how networking works in Kubernetes, you can answer based on your current knowledge of IPs, subnets, and protocols. You can debug and diagnose issues using your existing knowledge. You know where to SSH into, read logs, and debug.

This is another example of how fundamentals are essential to learn. They're more important than just learning new technologies. When you know how smaller pieces of technology work, such as networking and storage systems, data structures, and Linux command lines, it makes your life easier. You can dive into the Kubernetes documentation and find your way around because the terms and names used are from Linux technologies.

For instance, the Downward API exposes pod metadata to your container through the kubelet, after the control plane has processed it. You can understand what's happening inside your cluster and worker node. This circulation of data is based on existing Linux mechanisms. If you don't know how Linux works, you won't know that you can read this data when launching something in your container.

Sometimes, you can even contribute to the code. For me, Kubernetes isn't a difficult thing to do. It's a wide technology that requires knowledge of many things at the same time. You need different experiences to struggle with it and fix challenges. What I call difficult is how control groups (cgroups) in Linux are implemented. This is a much deeper side of technology, more science, and technology that's harder to understand and implement, especially on the CPU side, where you're dealing with hardware.

Bart: Is it safe to say that Kubernetes is just a wrapper on top of Linux?

Eric: I can say that the whole Kubernetes idea is not just a wrapper on top of Linux, but, as I said earlier, 50% of these technologies is just plumbing. There are a lot of other components invented around it that connect the dots together. To a large extent, though, it is wrapping Linux APIs. Let's think about it like this: Kubernetes already had many similarities to the code in Docker. Docker is a bunch of plugins that also has an orchestrator that can control the network interface, the storage interface, and so on. But this is not the container part. The container daemon is called dockerd, and dockerd in turn talks to a low-level container runtime called runc. There are also other alternatives to runc. Each of these layers, and there are many, is designed to talk with different parts of the system and to wrap other components of Linux, which at the end of the line are connected to the kernel, and the kernel to the hardware. So there are multiple layers of technology, from runc at the lower levels up to a controller like Docker Desktop, which is a GUI on top of the Docker CLI. They are all based on the operating system, and they all connect to each other using the same APIs. And why do we have so many? We have just one kernel and multiple Linux distros, but many container runtimes, because each of them is designed for a different purpose and has a different scope of engineering. For example, containerd is more lightweight than dockerd. In general, I think I answered the question.

Bart: Let's focus on pods and containers for a second. If Kubernetes is wrapping Linux APIs, why was there a huge fuss when Docker was replaced as the default container runtime?

Eric: I don't know exactly when Docker was replaced as the default container runtime in Kubernetes, but that's one of the beauties of a modular system like Kubernetes - you can easily replace different components with what fits your needs. I was never forced to use Docker, so I didn't need to know when the change happened.

People were probably surprised that Docker was replaced as the default container runtime because Docker has a lot of overhead and many plugins that are often unnecessary. In the Kubernetes clusters I've worked on, we never used Docker as the default container runtime; instead, we used CRI-O or containerd. There are many alternatives available.

To answer your question, Docker and other container runtimes use the same approach to shipping and running software, with differences in optimization. However, the underlying system for all of them is Linux, and the main enablers are the segregation functions defined by the Linux kernel, such as network namespaces, interfaces, memory, and disk managers. They all achieve the same goal, but with different optimizations.
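The kernel namespaces Eric refers to are visible from any Linux process: `/proc/self/ns` holds one entry per namespace the process belongs to. A small sketch (Linux-only; it simply returns an empty list on other systems):

```python
import os

def list_namespaces(proc_ns_dir="/proc/self/ns"):
    """Return the kernel namespaces the current process belongs to.

    On Linux, entries like 'net', 'pid', 'mnt', and 'uts' correspond to the
    isolation primitives that container runtimes wrap. On systems without
    procfs the directory is absent, so we return an empty list.
    """
    if not os.path.isdir(proc_ns_dir):
        return []
    return sorted(os.listdir(proc_ns_dir))

if __name__ == "__main__":
    for ns in list_namespaces():
        print(ns)
```

Running this inside a container versus on the host shows different namespace identities for the same entry names, which is exactly the segregation Eric describes.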

For example, the main difference between Docker and a technology like Lima is that Docker Desktop uses HyperKit to create a Linux environment to run containers, while Lima uses its own Linux VM on macOS. At the end of the day, it's all about achieving the same goal, but with the most optimized approach for our needs.

Bart: It kind of looks like it's turtles all the way down. How many more layers can we have?

Eric: Too many layers exist, not only downward but also upward. You can have many other layers on top of it, which can create other technologies, like GCP or AWS. I don't know why it went this way in technology, but probably because it has become quite commercialized. Companies wanted to make a lot of money from these open-source technologies that they built, for example, AWS. Now, we have to take a lot of courses and certificates to understand how AWS works, even though under the hood it is still using the same technologies - containerization, virtualization, and orchestration - which were invented and are maintained by the open-source community. These technologies were meant to make our lives easier, but on the other hand, they are not doing that and are adding a lot of overhead, too many things that we don't need, and we are paying for them.

There are a lot of limitations, especially when it comes to security. There are so many things that these technologies have that we don't need, and there are so many things they do not have. They also add many limitations to the organization. I can talk a long time about that, especially since I've been setting up some AWS IAM systems and security groups, and I've encountered many logical problems that I don't understand why they were invented that way.

The layers don't really stop; they go down from Kubernetes to your container, to your runtime, then to your hardware, then to your manager. The problem is that to win, you need to understand what you're doing and what each term means and how things were invented. Then you have a much easier way to find your way in. This demystification - understanding what brought you here and how this piece of technology was invented - helps me at least to understand a lot of things when I want to debug my problem. It also helps me design my solutions, which is a very important thing.

I can bring an example from the business side. If you don't know exactly what you are doing when you're working, there are many points where you can make mistakes. For example, say you're creating a new VPC. If you don't know exactly what NAT is or what an internet gateway is, you can easily route internal company traffic through an external IP, and then your NAT gateway generates a lot of costs, even though the traffic could have been free if it was routed through VPC peering, for example. These subtle problems are sometimes caught easily, sometimes not, and they produce a lot of loss for your company. So this is also a business issue. Not knowing the difference between NAT and simple peering can be costly.

Bart: Now, back to the Kubernetes pods and limits and requests, what control groups (cgroups) do they wrap?

Eric: There are many technologies on the Linux side. For example, there are control groups (cgroups), SELinux (a mandatory access control system), and seccomp. In Kubernetes, the allocation of CPU time is handled by the kernel's CFS (Completely Fair Scheduler), which has a bandwidth control mechanism. This defines the processing power allocated to each pod, after the scheduler divides it based on requests and percentages. Sometimes pods can throttle, or accumulate more CPU usage than they are limited to, and Kubernetes tries to enforce the limitation you set in your YAML. However, this is not always strict, as Kubernetes sometimes allows some drift to prevent requests from failing and pods from being restarted or crashing. This is one example of limits and requests. There are still many other things to consider, not just limits on requests but also their expansion, such as the networking part, which includes eBPF or iptables, both of which come from Linux.
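To make the CFS bandwidth mechanism concrete, here is a hedged sketch of the arithmetic involved: a Kubernetes CPU limit such as `500m` becomes a quota of CPU time per scheduling period (100 ms by default), which on cgroup v2 ends up in the `cpu.max` file. The function names and the millicore parsing below are illustrative, not the kubelet's actual code:

```python
# Sketch of how a Kubernetes CPU limit maps to CFS bandwidth settings.
# This mirrors the idea (quota = cores * period), not the kubelet's real code.

CFS_PERIOD_US = 100_000  # default CFS scheduling period: 100 ms

def parse_cpu_limit(limit: str) -> float:
    """Parse a Kubernetes CPU quantity: '500m' -> 0.5 cores, '2' -> 2.0 cores."""
    if limit.endswith("m"):
        return int(limit[:-1]) / 1000.0
    return float(limit)

def cfs_quota_us(limit: str, period_us: int = CFS_PERIOD_US) -> int:
    """CPU time in microseconds the cgroup may consume per period."""
    return int(parse_cpu_limit(limit) * period_us)

def cgroup_v2_cpu_max(limit: str) -> str:
    """Contents of the cgroup v2 'cpu.max' file: '<quota> <period>'."""
    return f"{cfs_quota_us(limit)} {CFS_PERIOD_US}"

if __name__ == "__main__":
    for lim in ["500m", "1", "2"]:
        print(lim, "->", cgroup_v2_cpu_max(lim))
```

A pod that exhausts its quota before the period ends is throttled until the next period begins, which is the throttling behavior Eric mentions.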

Bart: I imagine there must be several knobs that you could turn to adjust settings on how the CPU is allocated to those processes. How are those then wrapped and exposed in Kubernetes?

Eric: They are wrapped and exposed with YAML, which reaches the control plane when pushed and applied. The control plane communicates this to the kubelet on each node and tells it to control the processing power and computing time based on the manifest. This is how the workload gets spread by the scheduler. There are many other options, including tainting and labeling of nodes, handling disruptions such as disaster recovery, and controls that can be applied to nodes and the cluster. Kubernetes is designed with the idea of maximizing uptime, so if the control plane drops and is restarted, none of the worker nodes will be affected. This design aims to keep the cluster acting as expected as close to 100% of the time as possible. There are also mechanisms in place for pods to adapt and be flexible enough to keep the system up and running as much as possible.
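The YAML Eric describes boils down to the `resources` stanza in a pod spec; the kubelet translates these values into the cgroup settings discussed above. A minimal sketch, with illustrative names and values:

```yaml
# Illustrative pod spec: requests guide the scheduler, limits become cgroup caps.
apiVersion: v1
kind: Pod
metadata:
  name: limits-demo
spec:
  containers:
    - name: app
      image: nginx
      resources:
        requests:
          cpu: "250m"      # used by the scheduler to place the pod
          memory: "128Mi"
        limits:
          cpu: "500m"      # enforced via CFS bandwidth control
          memory: "256Mi"  # enforced via the memory cgroup
```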

Bart: Is it fair to say that Kubernetes wraps the Linux API, but has some opinions about it?

Eric: Wrapping and exposing an API is not an easy task. It requires considering many different compatibility scenarios and case studies, as well as conducting numerous tests and gathering extensive experience. This was already happening in huge data centers, such as Google's, and many of those people are still main contributors to Kubernetes. As a result, there are many best practices behind the Kubernetes idea, which has created a model for deploying software and ensuring it is deployed correctly. For instance, the idea that if the control plane drops, the worker nodes should still stay up is a best practice designed for a purpose. Kubernetes serves the purpose of achieving the highest uptime, flexibility, and modularity. We're not just talking about APIs; we're also talking about business, mapping, and organization. Thousands of developers maintain and build hundreds of products, and they all need to work together while having flexibility and agility for CI and CD processes. Kubernetes was designed with this in mind. That's why I say Kubernetes is not just wrapping Linux, but a combination of software developers, dev, and ops coming together to design a system for these purposes.

Bart: So, does the value of Kubernetes mostly come from

Eric: ...an opinionated API? I'm not sure exactly how, but it consolidates everything into a standard and consistent interface. Because it abstracts a single Linux host and offers a way to connect several Linux servers together, it wraps the APIs into a higher-level abstraction. The operator doesn't need to know how many steps are necessary to create, for example, a control group (cgroup) and assign it to a process. So I would say no.

Bart: Okay. So we've talked a lot about Linux and Kubernetes and their relationship. What I got from our conversation is that if you are new to Kubernetes, you'd be better off learning Linux instead. And the corollary to that is that if you've already mastered Linux, learning Kubernetes is an incremental step. You need to map your mental model of how Linux works into Kubernetes. Am I correct?

Eric: To some extent, yes. The idea is that you should learn how to learn. If you want to learn infrastructure well, you need to know your operating system very well, since infrastructure runs on top of either Linux or Windows. Knowing your operating system makes it easier to work with infrastructure software, as it controls the operating system, databases, and other entities.

In the case of Kubernetes, since the control plane only runs on Linux, you definitely need to know Linux. You should understand how Linux works, including file systems, networking, and software-defined storage. This knowledge helps you understand the difference between a volume, a persistent volume (PV), and a persistent volume claim (PVC).

For example, if you have a huge object, like 2,000 gigabytes, you need to understand that you require an object storage solution, such as Ceph. You should be able to configure it for a production system without building a new cluster, and update the current cluster seamlessly, without notifying users. This requires a fundamental understanding of the operating system, especially Linux.
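The storage distinction Eric draws maps onto two Kubernetes objects: a PersistentVolume (the actual storage, for example backed by Ceph) and a PersistentVolumeClaim (a request for it). A minimal sketch; the storage class name `ceph-rbd` and the sizes are placeholders:

```yaml
# Illustrative claim: the workload asks for storage; the cluster binds it to a
# PersistentVolume provisioned by the named storage class (placeholder name).
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: ceph-rbd
  resources:
    requests:
      storage: 100Gi
```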

Bart: In terms of your fundamental understanding of Linux, is there any book, course, or tutorial that you recommend for someone who's starting with Linux? What was your learning journey like there?

Eric: For this, I brought a book that I really loved reading. I don't know if you can see it already. This is the Unix and Linux System Administration Handbook, quite a long book at about 1,000 pages. You really don't need to read all of it, especially the email part, but it gives you great knowledge for administering Linux. I also have another book called the Linux Bible, which contains a lot of the same knowledge but provides a more high-level picture of how Linux is designed. This is very beneficial for understanding the domain my operating system is capable of handling, as it covers not only CPU and storage but also how Linux interacts with external devices, such as printers. These systems are designed at an enterprise level, so you can understand how organized and capable they are.

When you read the administration handbook, it provides a lot of knowledge and names of software that come by default with the kernel and are available in most distributions, although probably not in Alpine Linux. In general, it covers the tools you have at your disposal when SSHing into a machine and trying to diagnose something. For example, it explains what promiscuous mode is, which is useful when debugging a Linux machine, or what mTLS, WireGuard, and other tools are.

Bart: Do you think there will be a Kubernetes without Linux in the future?

Eric: No, I don't think that will ever happen. The Linux and Kubernetes community is already quite advanced, and I don't see any reason to rebuild or reinvent the wheel. I think there are still many things to do on Kubernetes, and there are too many things left for us to work on.

Bart: So, when you're not working on Kubernetes or Linux, we found out that you're a huge fan of Bouldering and also a professional Call of Duty player. Can you tell me about that?

Eric: I don't always have time to play computer games due to my busy schedule, but I try to find at least one hour a week to play. I'm quite skilled at playing Call of Duty. I also make time to go to the gym to stay healthy, as sitting behind a desk all day isn't ideal. I've recently taken up Bouldering, which I find exciting, and I've been doing it for about two years now. I highly recommend it to fellow developers.

As for what's next for me, I have many ideas for the tools I'm creating. I've built various tools for SRE teams in different companies, but I haven't had time to complete them, add features, and release open-source versions on my GitHub. This is something I'd like to focus on. I also want to start blogging about my experiences and share my knowledge with the community. I've learned a lot from overcoming challenges and receiving consultations, and I believe it's worth giving back. However, I haven't had time to do so yet. My main career goals are to work on open-source projects and blogging. I'll continue doing what I'm doing, gaining more knowledge, and enjoying life.

Bart: What's the best way for people to get in touch with you?

Eric: I'm on Twitter, which is now just called X; my handle is @JalalEric, with no spaces or dots. My email is [email protected], which is also where you can get in touch. I'd be glad to help, definitely.

Bart: Great. Well, Eric, thank you very much for your time today and for sharing your knowledge. We're looking forward to your next steps, whether it's Linux, Kubernetes, Bouldering, or Call of Duty. We wish you nothing but the best. We'll be speaking soon. Thanks a lot.
