Dear friend, you have built a Kubernetes

Host:

  • Bart Farrell

Guest:

  • Mac Chaffee

Mac Chaffee, a platform engineer and security champion, examines why developers often underestimate the complexity of running modern applications and how overconfidence leads to expensive technical mistakes.

You will learn:

  • Why teams reject Kubernetes then rebuild it piece by piece - understanding the psychological factors, like overconfidence, that drive initial rejection of complex but proven tools

  • How to identify the tipping point when DIY solutions become more complex than adopting established orchestration tools, especially around scaling and high availability challenges

  • The right approach to abstracting Kubernetes complexity - why hiding the Kubernetes API often backfires and how to build effective guardrails instead of reinventing interfaces

  • Why mentorship gaps lead to poor technical decisions - how the lack of proper apprenticeship programs in tech results in teams making expensive mistakes when building infrastructure

Transcription

Bart: In this episode of KubeFM, we are joined, or should I say rejoined, by Mac Chaffee, an internal platform engineer who knows what it's like to build infrastructure that developers actually use. We talk about why teams that reject Kubernetes often end up rebuilding it, what it really means to design the right level of abstraction, and how failing to understand Kubernetes can lead to very expensive mistakes. We also get into the deeper challenges of knowledge transfer in our industry and why mentorship might be the missing piece behind so many bad technical decisions. If you've ever had to justify using Kubernetes or tried simplifying it without breaking everything, this episode's for you.

This episode of KubeFM is brought to you by LearnK8s. Since 2017, LearnK8s has provided in-person and online training to engineers all over the world, helping them level up their Kubernetes skills. Courses are instructor-led, delivered both in person and online, to groups as well as individuals, and students keep access to the course materials for life. For more information, check out LearnK8s.io.

Now, let's get into the episode. Mac, welcome back to KubeFM. To get started, what are three emerging Kubernetes tools that you're keeping an eye on?

Mac: One that's new to me is Kargo, which adds real continuous delivery features on top of Argo CD. Second, I'm really looking into the WebAssembly space. And third, anything that doesn't have a pushy sales team - oddly enough, that will make you stand out these days.

Bart: So you heard it from the source. Don't have a pushy sales team. Make sure it's focused on providing value and helping folks out. Now, we know a fair amount about you already because we interviewed you on a previous episode. But for those of our listeners who may not know you, can you just tell us quickly what you do for work?

Mac: I work on an internal platform engineering team. We build tools that make it easier for developers to deploy their applications and provision infrastructure, such as databases and S3 buckets. I also have a somewhat specialized focus in security and do SOC 2 compliance work. I recommend that people develop a specialization like this because it helps them stand out in interviews and adds variety to their work.

Bart: And how did you get into Cloud Native?

Mac: It's basically resume-driven development. At my first job, my mentors decided we needed Kubernetes when we definitely didn't need it. But I'm really glad that they did because it launched my whole career. If you don't have an employer that's as patient, I recommend standing up a k3s cluster, deploying some stuff in it, and kicking the tires that way.
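
If you want to try that yourself, here is a minimal sketch, assuming the official Kubernetes Python client (pip install kubernetes) and a kubeconfig pointing at a local k3s cluster; the deployment name and image are placeholders, not anything from the episode:

```python
# A minimal "kick the tires" example, not from the episode: deploy nginx to a
# local k3s cluster with the official Kubernetes Python client.
# Assumes: pip install kubernetes, and KUBECONFIG pointing at the cluster,
# e.g. export KUBECONFIG=/etc/rancher/k3s/k3s.yaml (the default k3s location).
from kubernetes import client, config

config.load_kube_config()  # reads $KUBECONFIG or ~/.kube/config

labels = {"app": "hello-nginx"}  # placeholder name
deployment = client.V1Deployment(
    metadata=client.V1ObjectMeta(name="hello-nginx"),
    spec=client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels=labels),
        template=client.V1PodTemplateSpec(
            metadata=client.V1ObjectMeta(labels=labels),
            spec=client.V1PodSpec(
                containers=[
                    client.V1Container(
                        name="nginx",
                        image="nginx:1.27",
                        ports=[client.V1ContainerPort(container_port=80)],
                    )
                ]
            ),
        ),
    ),
)

client.AppsV1Api().create_namespaced_deployment(namespace="default", body=deployment)
print("Created deployment hello-nginx; try: kubectl get pods -w")
```

Running that and then watching kubectl get pods -w is usually enough to get a first feel for how Kubernetes reconciles toward the desired state.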

Bart: If you could go back in time and share one career tip with your younger self, what would it be?

Mac: A challenge for me has been finding ways to learn sustainably on the job. The industry moves quickly, so you have to keep learning. Early on, I tried to do that in my free time, which meant having no life. So, don't do that.

The other extreme is resume-driven development—picking a bunch of flashy tools that end up not working out. I wouldn't recommend that either because it's not sustainable. Instead, I recommend finding just a few new pieces of technology that actually solve legitimate problems and then find ways to sell that to your company.

This is its own skill set and is really important. If you can do that, you can achieve a good, sustainable balance of working on new tools and learning in a way that prevents burnout.

Bart: Now, as part of our monthly content discovery, we found an article that you wrote titled "Dear Friend, You Have Built a Kubernetes". So Mac, tell us what this post is all about.

Mac: It's basically a fake letter addressed to someone who's just trying to deploy a simple application. In the process, they stubbornly refuse to use Kubernetes, but end up rebuilding Kubernetes piece by piece. The reality is that Kubernetes solves a bunch of legitimate problems. I didn't come up with the format. The original was called "Dear Sir, You Have Built a Compiler", which has the same vibe: someone trying to avoid building a compiler but building one anyway. I just loved the overly formal, smug writing style, which works well on an internet that loves clicking on things that make it angry.

Bart: Now, what was the community's reaction to your post? What kind of feedback did you get?

Mac: So it kind of started a trend. There was at least one other person who did a "Dear Sir" or "Dear Friend" article. The internet loves a good copypasta trend. Unfortunately, I feel like many people didn't take away what I wanted them to. There were many people defending their choice of something simpler than Kubernetes, saying they just use EC2, but they would neglect to mention the dozen other supporting tools they use to make EC2 work.

Bart: Your article captures how developers reject Kubernetes, only to recreate it piece by piece. What psychological factors do you think drive this initial rejection of established complex tools?

Mac: The psychological factor for stubborn, know-it-all tech workers is the trap of overconfidence. Many people underestimate how difficult it is to run a modern application. It's not easy to recognize that difficulty, which is how you get trapped in overconfidence. The designers of Kubernetes didn't set out to build an overcomplicated piece of software. It grew organically, and there's a lot of hard-won knowledge baked into the codebase. It would be unwise to totally reject that.

Bart: After rejecting Kubernetes, teams often start with shell scripts that gradually grow more complex. At what point in this evolution do the scales tip, and when should teams reconsider adopting the tool they initially rejected?

Mac: I think the tipping point depends. It's easy to tell where the tipping point is if you really understand Kubernetes and the niche that it fills. There are people who have explicitly chosen a non-Kubernetes option for running their applications. Those people aren't worrying daily about whether they chose the wrong thing because they understand Kubernetes and their application and don't see a need to change.

If you don't actually understand Kubernetes, that's a risk. You may end up needing some of the tools Kubernetes provides, like auto-scaling, service discovery, or failover. You would only encounter this at the worst possible time—for example, Black Friday for an e-commerce company. That would be a bad time to realize you shouldn't have rejected what you thought was overly complex.

Bart: When traffic is peaking and your site is crashing because so many people are trying to buy something, that's when you realize you should have moved to a different kind of infrastructure and gotten Kubernetes in there. At least understand the trade-offs behind your decision and why Kubernetes might not be necessary, rather than rejecting it outright. But this pattern of rejection followed by reinvention seems to repeat across our industry in different domains, whether it's compilers, orchestration tools, or databases. What does this cycle reveal about how knowledge transfers, or fails to transfer, in software engineering?

Mac: I think it's a mentorship problem. In our industry, we struggle because we don't have proper apprenticeship programs like other industries. If you don't have a mentor and look at something like Kubernetes, you're going to think it is really overcomplicated and that you could build something simpler.

Re-implementing a system can be a great learning tool—like in my databases class in college, where we re-implemented a database. But this approach only works if you have a mentor who can guide you through the proper steps and help you complete the project in a semester.

Without a mentor, if you're working for a company and think you can reinvent Kubernetes without fully understanding it, you're probably going to make very expensive mistakes.

Bart: Your article suggests that scaling to multiple servers is a key inflection point. With today's cloud providers offering increasingly powerful single machines, couldn't teams simply scale vertically on one large box rather than dealing with distributed systems complexity?

Mac: This might come as a surprise, but I'm a big proponent of the one big box infrastructure. I think you can get really far with it, and it does simplify things a lot. But there's still one question left when you use one big box: how do you solve failover and high availability?

The way Kubernetes solves that is through distributed computing. You don't necessarily need to use distributed computing. My point is that understanding Kubernetes means understanding the problems that it solves, and then you can insert your own solutions to them, whether that's high availability, disaster recovery, or some other failover mechanism.

On the subject of single-node Kubernetes or single-node infrastructure, I think many companies are starting to realize that Kubernetes solves numerous problems inherent to running applications, not just failover. They're starting to ship on-premises products that run on a single node in Kubernetes; my company is doing that. It just makes sense to use Kubernetes because it's essentially an operating system for running applications. It solves many problems you'd otherwise have to address with other tools like shell scripts.

Bart: One critical aspect that your article touches on is security concerns when mounting Docker sockets. What other security implications do you see when teams build these DIY orchestration systems?

Mac: The biggest security concern is simply ignoring security altogether. Both Kubernetes and security are fields that are deeper than they seem. It's hard to know what you don't know about either Kubernetes or security. You can't just stumble your way into building a secure orchestration system. If you set out to do that without deep knowledge of security or Kubernetes, you're likely to make many mistakes without even realizing it, which is really dangerous.

Kubernetes isn't the most secure system and doesn't solve all security problems. OpenShift arguably does a better job. However, if you read through Kubernetes enhancement proposals, many security experts are embedding their expertise into the features that come out in Kubernetes. It would be unwise to ignore all that expertise.
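
To make the Docker-socket concern from the question concrete: bind-mounting /var/run/docker.sock into a container effectively hands that container control of the Docker daemon, and with it root-level access to the host, and it's easy to miss in a DIY setup. A minimal sketch of a CI-style check, assuming PyYAML and a hypothetical docker-compose.yml, might look like this:

```python
# A minimal sketch, not from the episode: flag Compose services that
# bind-mount the Docker socket. Assumes PyYAML (pip install pyyaml) and a
# hypothetical docker-compose.yml in the current directory.
import sys
import yaml

DOCKER_SOCKET = "/var/run/docker.sock"

with open("docker-compose.yml") as f:
    compose = yaml.safe_load(f) or {}

offenders = []
for name, service in (compose.get("services") or {}).items():
    for volume in service.get("volumes", []) or []:
        # Volumes can be short strings ("/var/run/docker.sock:/var/run/docker.sock")
        # or long-form mappings ({"type": "bind", "source": ..., "target": ...}).
        source = volume.split(":", 1)[0] if isinstance(volume, str) else volume.get("source", "")
        if source == DOCKER_SOCKET:
            offenders.append(name)

for name in offenders:
    print(f"FAIL: service '{name}' mounts the Docker socket (root-equivalent on the host)")
sys.exit(1 if offenders else 0)
```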

Bart: Many developers wish for a Kubernetes Lite—something with core features but less complexity. Based on your observations of what teams actually need, is such a simplified orchestration tool viable or valuable?

Mac: I have a lot of respect for the Podman team and what they're doing with Quadlet, because if you read through those docs, it definitely reads as a group that actually understands Kubernetes and is setting out to build something simpler. Kubernetes will inevitably be replaced by something simpler, and I welcome that. It would make my job easier. However, I don't think you can achieve this without first understanding Kubernetes and the problem it solves.

Bart: For teams who recognize they need Kubernetes, but want to avoid overwhelming their developers, what advice would you give about building internal platforms that abstract the right level of complexity?

Mac: I like this question because it's exactly what my job is: trying to find the right level of abstraction. I've got a hot take on the subject: I don't think we should hide or cover the Kubernetes API with additional layers of abstraction, because you end up reinventing the Kubernetes API.

If you think of Docker Compose, it has similar concepts like health checks and security context, but in a more condensed format that's not as feature-rich. I believe we should expose the Kubernetes API to developers. The challenge is building guardrails around that exposure.

You will need templates because there's a lot of boilerplate. You need CI checks, linters, and policy engines to enforce best practices—such as ensuring health checks, setting resources, and avoiding CPU limits. This is the real work of the platform team: building those guardrails.
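
As one small illustration of such a guardrail, here is a minimal sketch of a CI lint, assuming PyYAML and a hypothetical deployment.yaml in the repository, that flags missing probes, missing resource requests, and CPU limits:

```python
# A minimal guardrail sketch, not a full policy engine: lint a Deployment
# manifest for the checks mentioned above. Assumes PyYAML and a hypothetical
# deployment.yaml path.
import sys
import yaml

def lint_deployment(manifest: dict) -> list[str]:
    problems = []
    containers = (
        manifest.get("spec", {})
        .get("template", {})
        .get("spec", {})
        .get("containers", [])
    )
    for c in containers:
        name = c.get("name", "<unnamed>")
        if "livenessProbe" not in c or "readinessProbe" not in c:
            problems.append(f"{name}: missing liveness/readiness probes")
        resources = c.get("resources") or {}
        if not resources.get("requests"):
            problems.append(f"{name}: no resource requests set")
        if "cpu" in (resources.get("limits") or {}):
            problems.append(f"{name}: CPU limit set (can cause throttling)")
    return problems

if __name__ == "__main__":
    failures = []
    with open("deployment.yaml") as f:
        for doc in yaml.safe_load_all(f):
            if doc and doc.get("kind") == "Deployment":
                failures.extend(lint_deployment(doc))
    for problem in failures:
        print(f"FAIL: {problem}")
    sys.exit(1 if failures else 0)
```

In practice a policy engine such as Kyverno or OPA Gatekeeper would enforce the same rules cluster-side, but a lint like this catches problems in CI before anything is deployed.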

Bart: Beyond technical implementation, platform adoption requires cultural change. How should platform teams approach the human and organizational aspects of moving to Kubernetes-based platforms?

Mac: The key to addressing this challenge is empathy. A lack of empathy leads development teams to think infrastructure work is easy, making them resistant to adopting platforms. Empathy is a two-way street. Platform teams often think developers have it easy and will look down on them, telling them to "just learn Kubernetes" instead of building tools and documentation to help them improve.

Platform teams should set a good example by empathizing with developers and responding to their feedback. It's also necessary to set clear boundaries around ownership and deprecation cycles, which goes hand in hand with empathy. This lack of empathy extends to other departments like IT, finance, and sales. Every department needs to learn to respect the work of other departments. Otherwise, we'll continue to see situations where people only learn to respect each other after being told "I told you so" — an approach that won't get us very far.

Bart: Much agreed. Mac, what's next for you?

Mac: I have to go back to work. I think it would be funny to say that I have another blog post planned, but I can't imagine having something interesting to say more than once every few months.

Bart: And just one topic I want to touch on since it came up towards the beginning: you mentioned knowledge transfer and mentoring. In your experience, how has that worked in terms of upskilling and getting to where you are now? Is there anything you've seen that works? As you said, there's not necessarily a formalized process, but perhaps there are things you've noticed that have worked, or things you would like to see in the future. Anything you'd like to comment on?

Mac: I think we need a complete rewriting of the tech industry to be similar to other fields that have proper apprenticeship. I've worked at companies that are all juniors and companies that are all seniors. I believe you really need to have that mixture. We've kind of lost sight of the fact that when juniors and seniors work together, both groups benefit. Seniors can mentor juniors, and juniors can learn from seniors. In a couple of years, you have a team of really talented people. It seems like such a good deal. I am shocked that we don't do it. Any kind of formal, industry-wide mentorship process would be great. I'll just add that to my Christmas list.

Bart: That's good. You heard it here first, folks: an approach to matching people at different skill levels in order to build a really good team. Mac, thank you once again for sharing your knowledge with our community. I look forward to interacting with you soon. Take care.