Kubernetes as the AI platform and the need for better developer experience

Guest:

  • Brian Douglas

This interview explores the intersection of developer experience, emerging technologies, and open source project management in the Kubernetes ecosystem.

In this interview, Brian Douglas, Head of Ecosystem at CNCF (The Linux Foundation), discusses:

  • AI and ML workloads on Kubernetes - Why Kubernetes is becoming the essential platform for scaling AI infrastructure.

  • Developer experience and UI/UX in cloud-native tools - How his front-end development background influences his approach to Kubernetes tooling, emphasizing the need for better graphical interfaces.

  • Open source project health metrics - Practical frameworks for measuring project vitality beyond simple download counts or GitHub stars, focusing on metrics like unique issue authors and contribution velocity.

Relevant links
Transcription

Bart: Who are you? Where do you work and what's your role?

Brian: My name is Brian Douglas, known as BDougie on the Internet, and I work for the CNCF now as Head of Ecosystems.

Bart: What are three emerging Kubernetes tools that you're keeping an eye on?

Brian: Two days ago, NVIDIA launched KAI, an autoscaler for GPUs, which I found really fascinating. Another brand-new thing I discovered was Headlamp. As a front-end developer, I like the idea of having a GUI for managing Kubernetes. The third, Ollama, is not specifically a Kubernetes tool, but I'm really interested in scaling my Ollama models. I'm building a lot of code assistance locally in VS Code using Continue.dev, and I'm trying to figure out how to scale and host my model somewhere other than my laptop.
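The local setup Brian describes talks to Ollama over its HTTP API, and that same endpoint is what you would put behind a service when moving off the laptop. A minimal sketch, assuming a stock Ollama install listening on localhost:11434; the model name `llama3` is just an example of a model you might have pulled:

```python
import json
import urllib.request

# Default endpoint for a local `ollama serve` (assumption: stock install).
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON payload Ollama's /api/generate endpoint expects."""
    # stream=False asks for one JSON response instead of a chunk stream.
    return {"model": model, "prompt": prompt, "stream": False}

def generate(model: str, prompt: str) -> str:
    """Send a completion request to a locally running Ollama server."""
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires `ollama serve` and a pulled model, e.g. `ollama pull llama3`):
# print(generate("llama3", "Explain Kubernetes in one sentence."))
```

Pointing the same client at an Ollama instance hosted in a cluster is then just a matter of changing the URL.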

Bart: I want to go back to the second point you mentioned about being a front-end developer. Since 2020, I haven't been interviewing many people with a front-end background. Walk me through this: What are things people might be missing when it comes to developing open source projects, particularly in thinking about UI or user experience and developer experience?

Brian: I was a full-stack developer managing many parts of the stack. I worked at Netlify back in 2016, around the time the Silicon Valley show came out. There was a joke about the developer dashboard where our infrastructure was just a bunch of point-and-click AWS interfaces. My experience has always been about considering the happy path: What's the experience for someone who doesn't want to get hands-on with keyboard commands or drag-and-drop interfaces? How do you build an experience so people can just get in and get out? My entire role at GitHub previously was always about new user experience, onboarding, and how to get in and out of YAML as fast as possible.

Bart: One of our podcast guests, John McBride, expressed that Kubernetes is a platform of the future for AI and ML, particularly for scaling GPU compute. Do you agree with this assessment, and what challenges do you see in running AI workloads on Kubernetes?

Brian: I know John McBride really well; I worked with him on AI at OpenSauced. I would say Kubernetes, 100%. As a front-end developer, I use a lot of Vercel, Vercel AI, and off-the-shelf tools like OpenAI. These are great for prototyping, getting stuff off the ground quickly, and testing. But if you're building serious infrastructure at an enterprise or scaling a startup, you want control. You want to be able to pull the levers and turn the knobs. Kubernetes is that platform.

We have Linux. There's no question about building on top of Linux. Now we're at the point where there's no question about building on Kubernetes—we're here at KubeCon with 12,000 people in London. The next step is how we start automating and scaling infrastructure in multi-cluster environments. It's one thing to run something on your machine or in a Docker image, but now you have to run it in multiple clusters.
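The multi-cluster step Brian describes often starts as little more than a loop over kubeconfig contexts. A minimal sketch, assuming `kubectl` is on the PATH; the `apply_to_clusters` helper and the context names are hypothetical, and the default `dry_run=True` only builds the commands without running them:

```python
import subprocess
from typing import List

def apply_to_clusters(manifest: str, contexts: List[str],
                      dry_run: bool = True) -> List[List[str]]:
    """Build (and optionally run) `kubectl apply` against each cluster context."""
    commands = []
    for ctx in contexts:
        cmd = ["kubectl", "--context", ctx, "apply", "-f", manifest]
        commands.append(cmd)
        if not dry_run:
            # Real run: shell out to kubectl and fail fast on errors.
            subprocess.run(cmd, check=True)
    return commands

# Hypothetical context names; list yours with `kubectl config get-contexts -o name`.
cmds = apply_to_clusters("deploy.yaml", ["staging-eu", "staging-us"])
```

In practice this is where purpose-built multi-cluster tooling takes over, but the underlying model is the same: one desired state, fanned out per cluster.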

We also need to think about data sovereignty, which is a big conversation at this conference. I'm using AI to enhance features for end users and customers, but what about the data part? How do you scale that? These are questions being answered through conversations and talks at this event.

Kubernetes will continue to grow and expand this ecosystem. Companies like SF Compute are hiring Kubernetes engineers. OpenAI has hired many Kubernetes experts. Even the cutting-edge companies behind $40 billion funding rounds are hiring Kubernetes engineers. This is the future.

Bart: John challenged upstream companies to dedicate more resources to making Kubernetes the platform of the future for AI and ML workloads. What specific improvements would you like to see in Kubernetes to better support AI workloads?

Brian: Currently, our ecosystem needs more public speaking about how people are building their technologies. At this conference, I enjoyed hearing Ant Group and ByteDance share their experiences with LLMs. We need more stories to influence the existing CNCF landscape. We're in the vendor hall with numerous companies supporting AI, and we're developing a more strategic approach.

By the summer, we'll likely have more recipes and playbooks that can help the industry look back and support those trying to scale these technologies. The key is to share and tell.

Bart: If you could pick one Kubernetes feature you'd like to see improved (to put it another way, your least favorite), which one would it be?

Brian: That's a good question. I come from the front-end space. I've done Kubernetes the hard way. I haven't managed clusters on my own, but I've had to fix whitespace in YAML files. I like Headlamp; I wasn't even aware of it until this event. I would love for more people to build interfaces for folks like me, people who understand the technology but would rather not manage servers by hand. Give me something I can point and click with. Headlamp is open source. I know there are many vendors with solutions, but I like the idea of a community building something. If you're already contributing to this ecosystem, take a look at that project so we can start elevating the experience.

Bart: I remember last year attending a talk you gave at State of Open Con in London, where you spoke about open source projects and their health metrics. There are debates about the definition of success: Are we thinking about downloads? Are we thinking about GitHub stars? Could you walk me through that and share knowledge that would be good for all open source maintainers and people driving these projects to keep in mind?

Brian: I spent a lot of time thinking about project health metrics. There are some established ways of measuring, like counting issues. But when you look at issues in isolation, it's hard to know if it's good or bad.

You can track unique issue authors as a metric: how many distinct people are actively opening issues on the project. That gives you a sense of how active the project is and what its community looks like.

The other important factor is velocity: not just issue authors, but also comments, commits, and pull requests. When you start examining how often these occur, you can make a good educated guess about the project's health.

Other key metrics include the frequency of contributions from core team maintainers and how much contribution happens outside the core team. Looking at outside contributions is very important. Issues opened by people who are not the core team are a critical metric for any project.

If you don't have issues being opened by people outside the core team, the project is probably dying. It's a harsh statement, but one that everyone should examine and challenge. By sharing and noticing these metrics, we can create opportunities for companies and individuals to get involved and contribute.
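The signals Brian lists (unique issue authors, contribution velocity, and contributions from outside the core team) are straightforward to compute once you have an event export. A minimal sketch with made-up records; real data would come from the GitHub API or project-health tooling such as the CHAOSS project's:

```python
from datetime import date

# Hypothetical event export: (author, kind, date).
events = [
    ("alice", "issue",   date(2024, 3, 1)),
    ("bob",   "pr",      date(2024, 3, 2)),
    ("alice", "commit",  date(2024, 3, 5)),
    ("carol", "issue",   date(2024, 3, 9)),
    ("dave",  "comment", date(2024, 3, 12)),
]
core_team = {"alice", "bob"}  # assumed maintainer list

# Unique issue authors: how many distinct people open issues.
issue_authors = {a for a, kind, _ in events if kind == "issue"}

# Velocity: events per week across the observed window.
days = (max(d for *_, d in events) - min(d for *_, d in events)).days or 1
velocity_per_week = len(events) / (days / 7)

# Outside-core share: Brian's warning sign is when this approaches zero.
outside = [a for a, *_ in events if a not in core_team]
outside_share = len(outside) / len(events)
```

On this toy data there are two unique issue authors and 40% of activity comes from outside the core team; on a real project you would watch how these numbers trend over time rather than any single snapshot.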

Bart: What's next for Brian Douglas?

Brian: Next for me, I wish I was going to KubeCon Japan, but I'm going to have to bow out. I am actually heads down looking at Atlanta this fall. I'll be engaging more with end users. I've been observing the TOC and the TAB, as well as the governing board. I've been listening a lot. What I want to do is engage more end users to help them get involved, contributing, and elevating this ecosystem.

Bart: If people want to get in touch with you, what's the best way to do that?

Brian: I'm BDougie on everything on the internet. B.Dougie.dev is my new portfolio website, with my email included. Reach out—I answer questions and don't shy away from DMs.

Podcast episodes mentioned in this interview