Innovation at the operating system level and bare-metal layers

Guest:

  • Edward Vielmetti

Edward Vielmetti, Developer Partner Manager at Equinix, discusses:

  • Learning Kubernetes effectively by experimenting with underpowered setups to understand potential failures and error messages.

  • The layers involved in running Kubernetes on bare metal and the innovation happening at the operating system level.

  • The future of Kubernetes and the hardware evolution that might steer its direction.

Transcription

Bart: Who are you? What's your role? And who do you work for?

Ed: Hi, I'm Ed Vielmetti. I work at Equinix Metal, where I'm the developer partner manager for open source. We work with a number of open source projects, including members of the CNCF community, to provide infrastructure to help their projects succeed.

Bart: What are three emerging Kubernetes tools that you are keeping an eye on?

Ed: I'm keeping an eye on about 100 of them at the moment. But if I had to pick, I would pick innovation at the operating system level. It's the level between Kubernetes and the bare metal. Flatcar Linux is something that we work with closely, an operating system distribution focused on providing the infrastructure that lets you run a Kubernetes cluster successfully. Talos Linux is another one, a very innovative Linux distribution, very minimalist in terms of what it provides: just the minimum of what you need to get Kubernetes running successfully, with support for the hardware you need. And then number three is sort of a wild card. I think the biggest challenge people have with Kubernetes technology is observability: trying to figure out what's going on inside their clusters, how it all works, and, if something is going wrong, how to deal with it. I'm really interested in the whole observability space, OpenTelemetry (OTel) and all the tooling up and down the line, from visualizing all that data to inserting it in the first place. We're supporting a project called otel-cli for putting observability into your shell scripts. So, soup to nuts, all of that stuff.
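As a rough sketch of the otel-cli idea, here is what wrapping a shell command in a trace span can look like. The service name, span name, and collector endpoint below are placeholder assumptions, and the exact flag spellings should be checked against the otel-cli documentation:

    # Point otel-cli at an OTLP collector (placeholder address).
    export OTEL_EXPORTER_OTLP_ENDPOINT=localhost:4317

    # Record a single span covering the wrapped command's runtime.
    # "backup-script" and "nightly pg_dump" are illustrative names.
    otel-cli exec --service backup-script --name "nightly pg_dump" -- \
        pg_dump mydb > backup.sql

Each invocation emits one span with the command's timing and outcome, which is the "soup to nuts" point: even plain shell scripts can feed the same tracing pipeline as your services.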

Bart: One of our guests, Mathias, suggested the best way to learn Kubernetes is by doing and getting your hands dirty. He then went on to build his own bare metal Kubernetes cluster in his spare time. What's your strategy to learn Kubernetes tools and features?

Ed: One of the things that is really hard to learn about Kubernetes at scale is what's going to fail if you don't design things right. My favorite Kubernetes learning experience to date has been running a Kubernetes installation on a much too small, much too slow machine with really awful storage. I discovered what can go wrong with etcd when your storage is way too slow, and saw error messages that most people with reasonable installations would never see. There's a tradition in this world of having a homelab of some sort. It doesn't have to be expensive or fancy, but having a system where you can build it, break it, fix it, and repeat is really helpful. If you don't have that, having some sort of access to cloud resources is a good alternative. Of course, I'm going to say that because of where I work. But to be truthful, having something you can just run at home, testing things out on tiny, underpowered, unreasonable equipment, helps you make sense of what goes on when things go wrong.
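To make the slow-storage failure mode concrete: etcd fsyncs its write-ahead log on every commit, so disk sync latency is what bites first. A common pre-flight check on homelab disks is an fdatasync-heavy fio run; this is a sketch whose sizes mirror a widely circulated etcd disk-test recipe, with the ~10ms guideline taken from the etcd documentation:

    # Simulate etcd's write pattern: small sequential writes, each
    # followed by fdatasync, in a directory on the disk etcd would use.
    mkdir -p /var/lib/etcd-disk-test
    fio --name=etcd-disk-test --directory=/var/lib/etcd-disk-test \
        --rw=write --ioengine=sync --fdatasync=1 \
        --size=22m --bs=2300

    # Guideline: if the 99th-percentile fdatasync latency fio reports is
    # well above ~10ms, expect etcd warnings like "apply entries took too
    # long" and, in the worst case, leader-election churn.

On the kind of underpowered machine Ed describes, this is the check that fails, which is exactly the point of the exercise.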

Bart: Mathias also believes that on-premise deployments require proper education and attention, especially regarding managing on-prem architecture versus cloud architecture. After spending a few months building an on-prem Kubernetes cluster, he shared this advice. What's your experience with bare metal clusters? And how does that differ from using Kubernetes in the cloud? What would you have liked to know before starting Kubernetes on bare metal?

Ed: Kubernetes on bare metal at Equinix Metal is a hybrid: we have bare metal resources that you can use, with no hypervisor and no shared tenancy, so you have the whole machine, but there's a bit of an API to help you provision things. I think the hardest thing to understand about bare metal is the layers below the operating system: the control plane of the hardware itself, as opposed to the data you're working on. Again, I think the homelab is the solution to that, to have some of the resources you need to break things. Also, at the bare metal level, you're almost always running some sort of operating system on top of the bare metal that's not Kubernetes, so you need to understand the range of choices you have for that. It's probably not going to look like the operating systems you're used to for desktop use. Exploring that space is really interesting.
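One concrete piece of that below-the-OS control plane is the baseboard management controller (BMC), which you reach out of band, independently of whatever is installed on the machine. A hedged sketch with ipmitool, using a placeholder address and placeholder credentials:

    # Query the hardware control plane (the BMC) over the network.
    # Host, user, and password here are placeholders.
    ipmitool -I lanplus -H 192.0.2.10 -U admin -P 'changeme' \
        chassis power status

    # The same out-of-band channel can power-cycle a wedged node,
    # with no help from the OS running (or not running) on it.
    ipmitool -I lanplus -H 192.0.2.10 -U admin -P 'changeme' \
        chassis power cycle

This is roughly the layer that a bare metal provisioning API automates for you, and it behaves nothing like the desktop operating systems most people are used to.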

Bart: Kubernetes is turning 10 years old this year. What should we expect in the 10 years to come?

Ed: The next 10 years, I think, will see a couple of innovations. People are going to figure out how to solve the problems that Kubernetes solves in a way that's easier to understand and simpler to use. Whether it will still be called Kubernetes is hard to know. People are building platforms these days where Kubernetes is underneath but well hidden from the user: all they see is a container runtime. They don't know about scheduling and don't care about pods. Those complexities are hidden from them, yet there's a Kubernetes layer underneath. In 10 years, there will likely be a complete rewrite of some components. People might start over and say, "All right, we did etcd once. Can we do etcd again?" and start from scratch. Maybe not that project, but something else like it, or some other key component, where the community won't be just one thing; there will be a couple of similar things. I also wonder whether the complexity of Kubernetes will drive people away from it, back to simpler times when you had a server, ran processes on it, and systemd was your friend. Maybe that server will have 100 or 200 cores, but you'll manage a single big box rather than a bunch of little devices. It's hard to say. Part of it will depend on the evolution of hardware. If hardware evolves to have more AI chip interfaces, innovations from that market will creep into this one, and people will work on it that way. So I don't know; 10 years is a long time.
