eBPF, sidecars, and the future of the service mesh

Host:

  • Bart Farrell

Guest:

  • William Morgan

This episode is sponsored by Learnk8s — become an expert in Kubernetes

Service meshes and the community's opinion of them have changed drastically over the years.

From being perceived as unnecessary, complicated and bloated, they matured into security and observability powerhouses (while still retaining much of their complexity).

In this KubeFM episode, William deep dives into the world of service meshes and explains a few of the technical choices and trade-offs of service meshes in simple terms.

You will learn:

  • What a service mesh is and its design (i.e. control plane and data plane).

  • How Ambient Mesh departs from the traditional sidecar model and how it affects reliability and security.

  • Why there's more than just eBPF in sidecarless service meshes and the limitations of this technology.

  • The direct costs (compute) and human factors involved in operating a service mesh.

Transcription

Bart: When talking about Kubernetes, getting to the subject of networking can often be tricky. On top of that, when we start discussing service meshes, the best choices to make out there, and what you need to be thinking about when approaching this topic, it can sometimes even be scary for practitioners. So we got a chance to speak to William Morgan, who's the CEO of Buoyant, and Buoyant are the creators of Linkerd, very active in the service mesh space, something he's been working on for quite a long time. Now, if we're talking about this, we're going to end up talking, of course, about eBPF, sidecar and also sidecarless models, the cost and complexity regarding those particular choices, and the data plane and control plane having different responsibilities and distinct roles inside of a service mesh. Of course, we also speak about Ambient Mesh and standards such as the Gateway API. William really has a lot to say about all of these. I'm sure we have enough for a second conversation. This episode is sponsored by Learnk8s. Learnk8s is an organization that provides training so that Kubernetes practitioners can level up in their careers. The trainings are given both in person as well as online. They are 60% hands-on practical and 40% theoretical. Once done, you have access to all the materials for the rest of your life so that you can continue to dig into those topics and improve your Kubernetes skills. For more info, check out Learnk8s.io. All right, William, you got a brand new Kubernetes cluster. What three tools are you going to be installing first?

William: Well, first I would install Linkerd, of course, the best tool out there for any Kubernetes cluster, a necessity, I would say. After that, you know, I'm kind of... you know, I'm a little biased in that really the things I'm most exposed to are the things that people use alongside Linkerd, so it'd probably be something like Argo and something like cert-manager. Very boring, maybe, but that's, I think with those three projects, you got yourself a pretty good setup.

Bart: We definitely hear, you know, cert-manager comes up a lot. Argo's mentioned a fair amount as well, talking about GitOps. Why not Flux?

William: Oh, actually, Flux would be totally fine. I'm agnostic between those two. So, yeah.

Bart: But it's something from the GitOps space. Okay. Right.

William: Yeah, yeah, that's right. Especially for things like progressive delivery. I think that's a really powerful tool when combined with Linkerd, you know, if you want to be really safe rolling out new code. So, I like that stuff.

Bart: Okay. You are a CEO. So who do you work for? What do you do? Tell me more about that.

William: Yeah. I've got the worst boss in the world, myself, you know, taskmaster, uncaring, you know, just horrible, miserable person. So, yeah, I think the role of CEO is one, and it's not like I'm a real expert at this. I was an engineer for all my life until I decided to take on this lofty mantle. And it's not like, oh, I'm the CEO of Boeing or something, where it's like, I'm the CEO of a small tech startup. So what is that like? A lot of it is, I think, is trying to stay focused on the big picture and then trying to... use the people around you, maybe use is a little mercenary of a word, but in the way that kind of drives everyone in the same direction. And I think those are the two skills coming into this as an engineer, where mostly I was like, okay, how do I write the code and how do I make it maintainable and stuff like that? Those are the two skills that I end up using the most or that I've had to develop rapidly and use today. Is that helpful?

Bart: Yeah, no, it is. From a technical perspective, as an engineer, how did you get into cloud native and what were you doing before cloud native was a thing?

William: Yeah. So, you know, I actually, funnily enough, most of my early kind of career and stuff was very much around AI and NLP and, you know, machine learning and that kind of stuff. Long before it was cool, you know. I went to school, I did some research. Originally, I thought I was going to go into academia. And then, you know, I quickly got to the point where I was like, oh, my gosh, I really like writing code. And I really don't like doing research. And when I was supposed to be doing research, I'd be, you know, writing my little side projects and stuff. So I eventually, you know, ended up kind of working at a bunch of startups in the Bay Area. You know, mostly in kind of NLP and machine learning roles. I worked at a company called Powerset that was building this whole natural language processing kind of Wikipedia alternative and engine and stuff, that got acquired by Microsoft. And then some other companies kind of in that space. And then I ended up working at Twitter in kind of the early days of Twitter. Well, not super early. I was there in 2010, so it was about 200 people. It wasn't like the 20-person days. It was the 200-person days. I started in there on, you know, kind of the natural language processing, machine learning side, AI side, I guess. But then that was where I actually transitioned to infrastructure, because I was like, this stuff is so hard, and so hard to make a product, you know, that really is satisfying. You know, at least at that point in time, it was incredibly hard. And meanwhile, all this interesting stuff was happening on the infrastructure side. You know, people were building these really interesting projects or solving these kind of really immediate challenges for the rest of Twitter. And so I kind of got sucked into that world there. And Twitter, you know, at that point in time, I would describe it as cloud native, even though the nouns were all totally different, the verbs were kind of the same. So, you know, we didn't have Kubernetes. We didn't have Docker, you know, at that point in time. We didn't really have containers. You know, even the word microservices was like a very nascent word at that point. But we were doing all those things. You know, we had Mesos. We were basically turning Mesos from a grad student project into a real project. We had this idea of orchestration. You know, we didn't have containers, but we had the JVM. So all the stuff was built in Scala or in Java on the JVM, which gave us isolation. And we had cgroups, which we used. So, you know, you kind of had a basic form of containerization. And then we built stuff. When I was there, we moved it from this, you know, monolithic Ruby on Rails service into this massive, what we would now call microservices deployment. So like all those building blocks were there. And then when I left Twitter, you know, that's when we kind of tried to take those same ideas and apply them to this new world that was developing in front of us around Kubernetes and Docker and, like, all these new things that were kind of new technologies to us, but that were exactly the same as what we were doing, you know, at Twitter.

Bart: In terms of the changes that you mentioned, whether it's from the infrastructure perspective and also moving from AI and NLP into a different area, how do you stay up to date with all these things? The Kubernetes ecosystem, the cloud-native ecosystem moves very, very quickly. Do you read blogs? Do you listen to podcasts? What's your strategy?

William: Yeah, I think my strategy is kind of informed by, you know, my job at this point. So a lot of what I do is learn from the engineers and the people who are using, who are building Linkerd. So I kind of get to absorb what they're, you know, they're on the bleeding edge of this stuff and I get to absorb what they're seeing. You know, other than that, I don't think I do anything really special. You know, I read r/kubernetes on Reddit. I read r/Linkerd. You know, I, like, make some incendiary comments every once in a while, or every once in a while look at Hacker News. But yeah, mostly I just absorb it from the people that I work with day to day who are like the experts in all this stuff. So, you know, I kind of get a cheat mode on that, I think.

Bart: That's good. But it's also, you know, building a community that solves a problem or, you know, resorting to a community for those things. In terms of career advice, if you could go back... you know, whether your experience at Twitter or prior to that, if you could go back and give yourself one career tip, what would that be?

William: You know, the tip I would, yeah, I think the tip I would give myself is: look, optimize your impact. You know, I think as an engineer, the tendency I had, and I think this is just kind of part and parcel of that role, is I love building stuff. And, like, I kind of didn't care early on what I was building. I was just like, hey, just the act of building stuff. I'm like, oh, there's a cool new language or a cool new library. And look, now I get to think about things this way. And, you know, I learned about continuations, like, wow, you know, that's a mind-blowing thing. And for a long time, I kind of just went down that path, because it was gratifying. And I don't think there's anything wrong with that. But career advice: once I started, you know, kind of bringing my head up a little bit and looking around and saying, you know, how do I actually make an impact on this company? Or what's the highest leverage point? Then, career-wise, things got a lot clearer to me of, you know, where I should be spending my time. You know, part of that, I think, was responsible for the shift that happened to me at Twitter, where I was like, you know what? The infrastructure thing is the thing that's really making a difference at this company. Let me focus on that. And, like, as much as I loved, you know, the hashtag predictor or whatever kind of interesting, you know, NLP stuff I was working on, infrastructure was the thing that was really making a difference. So yeah, that's my advice. Optimize for impact.

Bart: Now, in terms of what we want to focus the conversation on today, you wrote an article called Sidecarless eBPF Service Mesh Sparks Debate, which is a follow-up to an earlier piece that you had written, eBPF, sidecars, and the future of the service mesh. You're one of the creators of Linkerd, which is a service mesh. For those who don't know what a service mesh is, can you just walk me through it? Why would someone need a service mesh? What is the role that it plays, the value that it adds?

William: Just imagine, like, a magical silver bullet that no matter what problem you have, it makes that problem go away. That's what a service mesh is. Boy, that makes it sound like... makes it sound like heroin. Yeah.

Bart: Shut up and take my money. Come on, take my money, boy.

William: Right. Yeah, it makes all the pain go away. You know, you'll feel great. Yeah. No, it's, you know, so there's kind of two ways I like to describe it. The first, which I kind of prefer, is like, what does it do? And then the second way, which is kind of interesting if you're an engineer, is like, how does it work? So, you know, what does it do? It is a layer that you add on top of Kubernetes that solves a bunch of stuff that Kubernetes itself doesn't solve. And there's kind of three buckets of things that it solves. It solves a bucket around security, especially around the communication, right? So can we make every connection that's happening in your cluster or between your clusters encrypted by default and authorized and authenticated? And can we have, like, policy on top of that that describes things in terms of services and gRPC methods or HTTP routes, like: A is allowed to talk to /foo, but it's not allowed to talk to /bar. Right, that's one set of things that it solves. It solves another set of things in the bucket of reliability. So can I fail over gracefully between clusters? Can I shift traffic between cluster A and cluster B in a way that's transparent to the application? Can I do progressive delivery? We talked about that earlier with Argo, where I've got this new code, I want to deploy it, but I don't want to send production traffic to it immediately. Can I send 1%? Can I send 2%? And, like, ease into it. Oh, if things are screwed up, like, roll back. Okay, we, like, minimize the damage there. All of those mechanisms, you know, load balancing, circuit breaking, retries, timeouts, all that stuff. And then the third bucket is around observability. You know, so can I have a uniform layer of metrics for every workload that's running in my application? Things like: what is the success rate? What is the latency distribution of the requests? How much traffic is this thing getting? And can I do that in a way, you know, that's uniform across all my services? And most importantly, kind of the power of the service mesh is it does this in a way that doesn't require changing application code. So you as a platform owner, as, you know, kind of like the person who's responsible for building this Kubernetes-based platform, you get control over those features without having to bug the developers or ask them to implement TLS or, you know, fight with the product managers. So that's what it does. The way it works is maybe, you know, kind of less important in some ways, but that's, I guess, the topic of this podcast, because people like to get into, you know, the details and compare this engine versus that engine. But kind of the original way that it worked, or, no, I should say the most common way that it works today: it's basically a whole lot of proxies. A whole lot of proxies, you know. And kind of the reason why this makes sense today, I think, and, you know, didn't necessarily make sense 20 years or even 10 years ago, is that with something like Kubernetes and containers, it's very easy to deploy a whole lot of proxies and kind of, you know, treat them uniformly as a fleet. So the idea, you know, which would have been laughable a decade ago, like, hey, I want you to deploy 10,000 proxies, you know, now it's actually doable, right? It's just like, that's the thing and it kind of works. So the fact that we can deploy all these proxies gives us a lot of power.
There's a lot of power over the communication that's happening there, because that's the insertion point for a lot of this functionality. Now, how do you deploy those proxies? Where are they? What kind of proxies, what language are they written in? Like, that's all the debate portion. But at the macro level, all the service meshes basically work by running these L7 proxies, layer 7 proxies, that understand HTTP, HTTP/2, gRPC, and can do stuff.
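
To make William's "whole lot of proxies" point concrete, here is a minimal sketch of a transparent TCP forwarder of the kind a mesh data plane multiplies across every pod. It is illustrative only: the ports are invented, it assumes the tokio crate, and a real sidecar proxy would additionally parse HTTP/1.1, HTTP/2, and gRPC to deliver the L7 features described above.

```rust
use tokio::io::copy_bidirectional;
use tokio::net::{TcpListener, TcpStream};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // The proxy is the insertion point: every connection to the app
    // transits it, so mTLS, policy, and metrics can all live here.
    let listener = TcpListener::bind("127.0.0.1:4143").await?;
    loop {
        let (mut inbound, peer) = listener.accept().await?;
        tokio::spawn(async move {
            // Forward to the application container in the same pod.
            match TcpStream::connect("127.0.0.1:8080").await {
                Ok(mut outbound) => {
                    // A real mesh proxy would terminate TLS, check
                    // authorization, and record latency around this copy.
                    let _ = copy_bidirectional(&mut inbound, &mut outbound).await;
                }
                Err(e) => eprintln!("connect failed for {peer}: {e}"),
            }
        });
    }
}
```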

Bart: Really quickly, can you just touch on how things can change when we're talking about the data and control planes across service meshes? A little bit about how the old sidecar model worked, with an extra container, in order to better understand where we're at today.

William: Yeah, yeah. So, you know, typically the macro level architecture for a service mesh is you have this idea of a control plane and a data plane. And, you know, that's a very old idea from a lot of networking terminology, right? The data plane is the thing that actually handles the traffic. You know, the control plane is the thing that sits off to the side and allows you to interface with the data plane. You as the operator to interface with the data plane, or allows the data plane to share state and things like that. So, you know, in Kubernetes terms, the control plane for any service mesh, you know, not just Linkerd, is basically a bunch of, you know, regular old Kubernetes services that run somewhere, run in a namespace or run in a cluster. And the data plane is where things get interesting, because that's the actual, you know, proxies, right? And so you mentioned the word sidecar, that's a very common approach. And that's the one that Linkerd uses today, where we add the proxy to every pod, right? And it's kind of transparently injected in there. So if you have 20 pods, well, now you have 20 little proxies, and there's pros and cons, and we'll get into the gory details of that. But that set of proxies kind of collectively is your data plane. And, you know, if those proxies are large and slow, then everything sucks. And if those proxies are fast and small, then, like, everything's great, you know? And so, you know, that's the architecture, but the implementation details make a big difference, right? Because, if you think about this in the sidecar approach, service A is talking to service B, that's kind of like our mental model. Well, what that really turns into is pod, you know, A1 is talking to pod B7, right? And then what that really turns into is pod A1 is talking through its proxy to the destination proxy of pod B7, to the destination service. So now we're adding a couple hops there, right? So if we're adding those hops, there's going to be a consequence, right? There's a cost to doing that, right? And we have to make sure that the value we're getting is, you know, much greater than the cost we're imposing to do that.
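
The control plane/data plane split William describes can be sketched with a watch channel: the control plane publishes desired state once, every data-plane proxy observes it independently, and application traffic never passes through the control plane. This is a toy illustration under our own assumptions (the tokio crate, an invented RoutePolicy type); real meshes push far richer state over gRPC APIs such as xDS.

```rust
use std::time::Duration;
use tokio::sync::watch;

// Invented stand-in for mesh configuration: real control planes
// distribute endpoints, routes, and certificates, not just paths.
#[derive(Clone, Debug)]
struct RoutePolicy {
    denied_paths: Vec<String>,
}

#[tokio::main]
async fn main() {
    // The control plane owns the sending side...
    let (control_plane, _rx) = watch::channel(RoutePolicy { denied_paths: vec![] });

    // ...and each data-plane proxy independently watches for updates.
    for id in 0..3 {
        let mut rx = control_plane.subscribe();
        tokio::spawn(async move {
            while rx.changed().await.is_ok() {
                // Traffic handling consults the latest policy locally;
                // it never blocks on a round trip to the control plane.
                println!("proxy {id} applied policy: {:?}", *rx.borrow());
            }
        });
    }

    // The operator states intent once, centrally.
    let _ = control_plane.send(RoutePolicy { denied_paths: vec!["/bar".into()] });

    // Give the proxy tasks a moment to observe the change before exiting.
    tokio::time::sleep(Duration::from_millis(100)).await;
}
```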

Bart: Related to that in terms of cost, you know, running an extra container for each pod sounds like it could get a little bit pricey. Does cost become an issue here?

William: Yeah, you know, it's interesting. Yes, there's a compute cost to running a service mesh, kind of like there is to adding any component onto your system. You know, you're paying for CPU, you're paying for memory. Memory usually is the thing you're a little more worried about in terms of raw compute, because, in aggregate, that means, okay, well, we might have to scale up our instances, or we might need to scale up our cluster and add another node. Typically, that actually hasn't been a big deal for Linkerd, because what we do in Linkerd, which is unique, is we have what we call a micro-proxy, which is a proxy that we wrote in a language called Rust, which is super cool and allows us to deliver these very, very fast and very lightweight proxies, and also allows us to avoid a whole class of... you know, this is kind of what Rust is most known for. It has memory management, you know, kind of like compile-time memory management, in such a way where you avoid a whole class of CVEs and buffer overflow exploits that are kind of endemic to languages like C and C++. You avoid that by the compiler being very, very aware of how memory is managed. And so, you know, if you're a programming languages geek, at the rough level there are kind of two basic classes of language. You either have something that has, like, a runtime environment that's doing management for you and keeping you safe and has, like, the garbage collector and things like that. So Go is an example of that, right? You know, you run your code in Go and you're like, okay, I'm probably not going to have a big memory vulnerability in here, because the Go runtime environment is going to manage this stuff for me, and it'll crash the program rather than allow me to do something dangerous. And then on the other side you've got languages like C and C++ that are like, hey, you know what, go nuts, here's the full memory, do whatever you want. And, you know, it turns out that basically it's impossible for a human being to really write secure C or C++ code. So you achieve security in those projects basically over decades by finding all the bugs and issuing CVE warnings and then going and correcting them and then waiting for the next one to crop up. So, you know, you eventually get there, but it's through a whole lot of, you know, human-in-the-loop kind of involvement. In fact, I think you can look at these studies. I saw one from Google and I saw one from Microsoft, too. Both have the same number: basically 70% of the security bugs that they find in C and C++ code are because of memory management mistakes. So, you know, these are the top programmers in the world, and even they can't do this. So anyways, that's all to say, Rust is here, you know, basically as a reaction to that, right? It's a way of saying, hey, we want to write code that's as fast as C and C++. And we want to write it, you know, in a full-fledged modern language with, you know, type inference and all those other niceties. But we also...
want to be able to be sure, when we compile this and we give you a binary, that the binary is not going to be full of, you know, use-after-free exploits or whatever it is, buffer overflows and things like that. So we wrote a proxy in Rust, right? And the reason, I think one of the many reasons, why Linkerd chose that approach, and you got to remember, this was in 2018 when we picked this, so, you know, nowadays Rust is like the hot thing; in 2018, it was like, whoa, that's a gamble, right? The language had just stabilized at 1.0. You know, prior to that, the language itself was still evolving. And the network library ecosystem was, you know, very rudimentary back then. We had to do a lot of work, you know, kind of core investments in some of the underlying libraries like Tokio and Tower and h2. These are all things that, you know, Buoyant and the Linkerd team invested in to build that up. But, you know, the reason why we went through all that effort is because the data plane, getting back to your original question, the data plane is the thing that the application data has to transit. And that means in the case of something like Linkerd, this is medical data, it's financial data, it's sensitive customer records, it's 911 call data. Like, this is the most critical stuff you can imagine. And we want to be able to give our customers and our adopters every assurance that, hey, this data plane, not only is it reliable, but it's also secure. And if someone breaks in and does some naughty thing, you know, we're not going to be the failure point there. We're not going to be the weak link in the chain. So, you know, we use Rust, we build these micro-proxies, and these proxies, you know, because of the way we design them and because of kind of the powers of Rust, are very, very small, very, very fast, very lightweight. And so, you know, you run them and everything scales up with the amount of traffic. In their baseline configuration, they're like two megs of memory or something. It's, you know, kind of ridiculous how small they are. If you pour a ton of traffic into it, okay, well, we've got to consume memory to handle all that. So things will grow, but the footprint tends to be really, really small. And so that's usually not a concern for us. Usually the thing that, you know, is a big cost when you're designing these big distributed systems is the latency. It's the end-to-end latency. So what does a user see? Are they seeing a tail latency, a P99 latency, of, like, five seconds, or are they seeing a tail latency of, you know, 30 milliseconds? Because that affects user experience. It affects, you know, whether your customers are staying around or not. And that's something that the service mesh can actually help with, you know, oddly enough. Even though you're adding these hops, you know, you can actually optimize your tail latencies. So that's all to say, in terms of your cost question: usually, for Linkerd, it's not an issue because of the way we've designed those proxies. For other service meshes, it is, because if those proxies are big and bloated, you know, then it's a problem, right? Then you end up consuming a lot of memory. Even then, though, the thing that really bites you, the true cost, is: what's the operational complexity of these things?
Do I have to care and feed for these things? Do I have to hire three engineers to maintain these proxies because they're really complicated? Or does this thing just work? Because in most cases, it'll dwarf your computational costs. It's like the cost of the humans who have to be involved in making this thing stay alive.
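
A toy example of the compile-time memory management William credits for avoiding that class of CVEs. This is a generic Rust illustration, not Linkerd proxy code:

```rust
fn main() {
    let data = vec![1, 2, 3];
    let first = &data[0]; // shared borrow of `data`

    // What would be a use-after-free in C never becomes a binary here.
    // Uncommenting the next line fails the build with:
    //   error[E0505]: cannot move out of `data` because it is borrowed
    // drop(data);

    println!("first = {first}"); // the borrow ends, then `data` is freed
}
```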

Bart: Very good point about the human side. As much as we look at the technical challenges, when it comes down to, like you said, the time spent sifting through and rooting out errors, that time is time that can't be spent on other things. In terms of those concerns and costs, you know, the community has sort of responded, developing different, you know, products and projects to alleviate some of those concerns. One being, you know, Istio with Ambient Mesh, and the other is sidecarless service meshes like Cilium Service Mesh. Starting with Ambient Mesh, what is it and how can it improve on the classic, if we want to call it classic, sidecar model of service meshes?

William: Yeah, yeah. So both of these things are things that we've looked at deeply in Linkerd land. And I think there's a point in the future where it may make sense for us to adopt those approaches. I don't think that point is here yet. So, you know, I'm going to say a bunch of critical things, but I don't want that to be taken as, oh, you know, we're anti this or anti whatever. Every decision we're making here is a trade-off decision. You know, when you come down to distributed systems, and especially the intersection of distributed systems and having to run this in production, you know, at a company, and usually a company for whom the platform is a means to an end, not this amazing thing that they, you know, kind of are building for its own purpose. It's there to serve the application. So in that framework, everything is a trade-off, and the trade-off that we always make in Linkerd land is: what is the thing that's going to reduce the operational burden? So that's the lens by which I see all of this. So, what Ambient does, and I think in both cases, you talked about Ambient and eBPF, they're not a reaction to the complexity, I don't think. I think they're a reaction to the fact that sidecars are annoying in some ways, right? Because in both cases, kind of their punchline is: hey, and guess what? Now you don't have to have sidecars. Now, how do you get there, right? And that's where you find out the cost of this approach. So in Ambient, kind of at the rough level, you don't have sidecar containers. Instead, you have some connective tissue, a tunnel or something that you run in the namespace. And then those talk to some proxies that are running somewhere else on the cluster. So you get a bunch of proxies that are running outside of the pod. And you have a tunnel thing, and the pods use the tunnel to talk to the proxy, and the proxy does some stuff. But the reason why that's interesting, and the reason why, you know, I said originally sidecars are annoying, is because sidecars have some implications when you run them. There's, you know, the cost thing, which we talked about, which in Linkerd's case, you know, is not really a big deal. I think the bigger one is, in order to upgrade the proxy, in order to update it with the newest version, you have to restart the pod, right? Because pods in Kubernetes are immutable. And I think there are very good reasons for that to be the case. But what that means is, okay, now we have to kind of do two things, right? We're keeping our applications up to date, and also, if we ever want to upgrade our service mesh, well, now we've updated the control plane, and now we have to update the data plane. And that's kind of annoying. There's some other kind of little nuisances. For complicated reasons, we run the proxy as the first container, which means that kubectl logs, you know, if you just ask it for the logs, gives you the proxy logs, when usually you want your application's. There's little sources of friction like that. And I also think, honestly, there's kind of a psychological thing, which is: I don't want to see the proxy. The fact that I have to see it is annoying. Now, who sees it, right? Like, do the developers see it? No, the platform team sees it.
Can't, you know, can't the networking just be invisible to me, right? So I think there's a little bit of that. But that's all to say, both Ambient and the eBPF stuff, which are really not orthogonal, as we'll get into, those are all kind of a reaction to: I don't want to see the sidecars. I don't want the sidecars to be there. And in the case of something like Istio, and most service meshes, which are built on Envoy: Envoy is complicated, and Envoy is memory-hungry. And Envoy is something that you have to, you know, care and feed for. Like, you're tuning that thing based on the specifics of the traffic, you know. And I think a lot of our investment early on kind of got us into a better spot. So I kind of bristle a little at being painted with the same brush. Like, a lot of the problems that you guys have with sidecars, Linkerd doesn't have. Those are, you know, Envoy problems, they're not, you know, sidecar problems per se. But, you know, this is the cloud native world, you got to get some marketing in there and paint everything with a brush and, like, you know, write the blog post that says we're killing sidecars, you know, death of sidecars. All right, you know, sure. I guess, you know, now I have to write the article that's like, well, hold on. Let's make an informed engineering-based consideration. And that's much less exciting to read, right? I don't want to read the informed analysis. I want to read the death-of-the-enemy kind of blog post. Okay, so for Ambient, you know, you get rid of sidecars because you're running the proxy now somewhere else, and you have these tunnel components. And that's nice because, you know, you can kind of maintain the proxies separately. So if you need to upgrade them, you don't have to reboot your apps. Now, I would argue that if you're deploying that stuff in Kubernetes anyway, the whole model of Kubernetes is that your pods should basically be rebootable at any point. Like, you know, Kubernetes can reschedule these things whenever it wants. You know, there's no guarantee that a pod is going to live for any particular period of time. This is the whole point of Kubernetes. You build your application as a distributed system, and you should be able to scale it out and scale it down and shuffle it around between nodes. But, you know, there's legacy applications, there's things written in some way where it's annoying to reboot them. Okay, I get it, sure. And in those cases, the sidecar approach is annoying. There's other warts too. I think jobs, for a long time cron jobs and other types of jobs, were annoying to have sidecar proxies on, because basically there was no mechanism in Kubernetes to tell the proxy that, hey, the job's done, so you can shut down now. There's literally no Kubernetes way to know that. So you had to go through these hoops where you were like, okay, I'm running this job, and now I have to have it signal to its proxy that, hey, I'm done. And, you know, that kind of violates the premise of the service mesh in the first place, which is: hey, we want these things to be decoupled, right? We don't want the developers to have to know about the sidecar and have to do some extra code. Well, you can do it in the Docker config. You know, it was just annoying. Those warts are being addressed, so there's, like, you know, there's a...
Kubernetes, very famous Kubernetes KEP, like, enhancement, Kubernetes enhancement proposal, called the sidecar containers KEP. That was first opened in 2019. I wrote a blog post about that. I'm actually giving a talk about this at KubeCon in Paris. You know, first opened in 2019, and that finally, you know, four years later, just last year, made it to alpha in 1.28, and will be in beta in Kubernetes 1.29, where you can mark your proxy as a sidecar container. And then there's a bunch of nice consequences to that. One of them being, hey, your jobs can actually terminate, you know, like, the proxy, once they know that they're done. So there's little warts like that that are slowly being ironed out. I guess I should say wrinkles, you know, you iron wrinkles, not warts. They're all wrinkles that are...

Bart: sorted out, dealt with.

William: Surgically removed, yeah.

Bart: The main thing is about progress, yeah.

William: Yeah, yeah, that's right. So, you know, my analysis of Ambient, and we did a pretty deep dive just now, my analysis of Ambient was that it adds more complexity than it's worth for Linkerd, because now you have these extra components. You've got a tunnel component, and now you've got the proxies running over here. It's like, you know, I understand the motivation, but we haven't seen the pain in Linkerd that I think Istio has seen because of their choices. So the trade-off is not there for us. I don't think it's a bad approach, you know, in contrast to the eBPF one, which I think actually is bad. I think Ambient is fine. It's just we haven't seen that pain. So, like, why bother? And the final point I'll make on this is what's really nice about the sidecar approach. What's really, really nice is that the operational boundary and the security boundary are very clear, right? They are the pod. The pod itself makes its own decisions about security, makes its own decisions about, you know, operations. And you can treat that thing as a unit. And that's kind of the Kubernetes way, right? And so, you know, even when you're talking about something like zero trust, oh, we've got the, you know, zero trust philosophy, which, you know, kind of has all these aphorisms and sayings and whatever. But one of them is: you got to have enforcement of these things at the most granular level possible, right? We don't have our firewall anymore, where we treat everything inside the firewall as all trusted, right? Every security decision has to be made every time, at every level. And so when you have a sidecar in the pod acting as a security boundary, you get that, right? No request can come in here without the sidecar itself validating, in the case of Linkerd: is this a valid TLS request? Do you have a valid client ID? Not IP address, because you can't trust the network, but a cryptographic service identity. Do I trust that identity? You know, can I have a TLS cryptographic chain of trust where I believe that identity? And then, are you allowed to talk to me? Are you allowed to call this particular method? All that stuff can happen in the pod, at the most granular level possible in Kubernetes, with sidecars. And as soon as you move away from that, and, oh, the proxy is over here and the TLS certificate is over there, then you start losing that very clear model. So that's kind of the security argument. There's an operational argument, too, that we can get into. Maybe we want to get more into the eBPF stuff.
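
A hedged sketch of the per-request decision a sidecar makes under the zero-trust model William describes. The types and the SPIFFE-style identity string are invented for illustration; the real point is that the decision keys on a cryptographic identity from the mTLS handshake, never the source IP, and defaults to deny:

```rust
struct ClientIdentity {
    // Derived from the mTLS handshake, not from the network.
    spiffe_id: String,
}

fn authorize(client: &ClientIdentity, method: &str, path: &str) -> bool {
    // Checked inside the destination pod, for every request, every time.
    matches!(
        (client.spiffe_id.as_str(), method, path),
        ("spiffe://cluster.local/ns/default/sa/service-a", "GET", "/foo")
    ) // default deny: everything not explicitly allowed is refused
}

fn main() {
    let a = ClientIdentity {
        spiffe_id: "spiffe://cluster.local/ns/default/sa/service-a".into(),
    };
    assert!(authorize(&a, "GET", "/foo")); // A may talk to /foo...
    assert!(!authorize(&a, "GET", "/bar")); // ...but not to /bar
    println!("policy checks passed");
}
```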

Bart: Yeah, good. And speaking of which, you know, that's the next point I want to touch on. It's gotten a lot of attention the past few years, you know, through not just Cilium Service Mesh; we're hearing a lot about eBPF in many different areas, sometimes talking about networking, sometimes talking about observability, sometimes talking from a security perspective. When you're asked about eBPF, how do you define it?

William: Well, I think defining it is pretty easy. It is a kernel technology that allows you to run certain types of code inside the kernel. And that's interesting for, I think, two reasons. One, because inside the kernel, you can be very, very fast. You know, if you're processing network packets, you don't have to bounce back and forth between kernel space and user space, right? And it's also interesting because the kernel has access to, like, everything, you know, so this has to be done in a really, really secure way. So, traditionally, right, you wouldn't be allowed to do that. You as a user space program would have to talk to the kernel and say, hey, can you please give me the data for packet one, two, three, four, and the kernel would say, okay, here you go, right? And you'd have this very, very secure way to do that. The syscall boundary is what it's called, the system call. And that's expensive if you're doing it a billion times a minute or whatever. Modern network hardware can get really, really fast, you know, you get a lot of packets in there. And so that part is expensive. So, hey, maybe we can just do that analysis in the kernel. But if you're doing it in the kernel, well, now it's like you let someone into the inner sanctum. Like, anyone here, they're in the Oval Office standing next to the president, you know, and they can do whatever they want. So the way eBPF kind of worked around that is it said, okay, we're going to have this bytecode compiler and we're going to have all these checks in here that will only run things under certain conditions. And it's very, very restrictive. So in terms of things that you can actually run there, you know: you can't have any unbounded loops. You know, the bytecode verifier, which is what it's called, has to be able to explore every possible, you know, outcome and make sure you're not doing anything naughty. Your number of instructions is limited to a certain size. It even checks, this is the crazy part to me, it even checks that the bytecode is GPL licensed, by, like, looking at the first couple of bytes of the bytecode and looking for the string GPL or something crazy like that. You know, there's all these things that they put in there in the name of security. And so the result is, okay, we've got this really cool mechanism that actually unlocks things we've never been able to do before, but it's also very restrictive. So you have to be really clear, you know, in: what is eBPF capable of and what is it not capable of? And I think that's where the marketing really tries to blur those lines. Hey, eBPF is cool, you know, brand new solution. It allows us to, you know, throw away sidecars. And that's not really the case, right? There's a choice to use eBPF or not. And then there's a separate choice, an orthogonal choice, an independent choice, that says: are we deploying things with sidecars, or are we deploying these proxies in some other way? Are we deploying as a daemon set, which is what, um...

Bart: Cilium does. Okay, so as you said, there seems to be perhaps some confusion about limits that maybe aren't evident at first. If eBPF is limited, does that also mean that all service meshes, I don't know why I'm stumbling on that. If eBPF is limited, does that mean also that all service meshes that use eBPF are subject to those limitations?

William: Yeah, no, no, not at all. And this is where there's a little bit of smoke and mirrors. Like, in the eBPF service mesh, the heavy lifting there is done by an Envoy proxy anyway. So, you know, in order to get that functionality... so, with eBPF, and here I'm just talking about the network kind of usage, there's a whole bunch of applications, I would say. There's security applications. There's performance monitoring applications for eBPF. I think that's actually the most interesting area for eBPF, instrumenting applications, because since you're sitting in the kernel, you can compute usage, you know, of CPU and function calls and stuff like that, in a very, very direct way. But I'm just talking about the network kind of usage. For eBPF, one of the things that's hard, bordering on impossible, but not quite impossible, to do is to maintain significant state. So you parsing a bunch of TCP packets and counting, you know, how many come through with this IP address and this port combination: that's very easy, right? That's a finite, you know, data structure. You parsing HTTP/2 is basically impossible. It's basically impossible. Now, it's not actually impossible. You can have these, I would describe them as academic exercises, where you're like, okay, the way we're going to do HTTP/2 parsing is we're going to split it out so that we're going to re-enter eBPF land, and then we're going to maintain the state structure over here, and then we're going to re-enter, and we keep doing that. You have to go through contortions where, at the end of it, you're like, well, I guess you technically did parse a small amount of HTTP/2, but it's not what anyone would normally do. So in practice, what they do is use eBPF for the layer 4 stuff, right? You know, how many TCP packets are passing between A and B? We can do that really well. And there's some other cool stuff you can do there too. But then anything that involves what we call layer 7 stuff, right, HTTP/2 parsing, like, I want this request and I want to look at the header, you know, and it's encoded via HPACK or whatever, and then I want to make a routing decision based on that, or I want to do a TLS handshake or anything like that, you kind of have to do outside of eBPF. So in the case of something like Cilium, there's an Envoy proxy, and they deploy that proxy as a daemon set. And that's the part of it that I don't like, because that's the part of it that actually makes it sidecar-free, but that's also the part that actually makes it worse, right? So you could use eBPF with sidecars, and then I think you'd actually have something quite nice. And this is kind of the direction, you know, we're looking at in Linkerd: hey, let's keep sidecars, because they've got a clear operational boundary, a clear security boundary, and let's add eBPF, because it actually can help with some things. So for example, sometimes you're asking Linkerd just to proxy a TCP connection from A to B, right? So, you know, hey, this is some obscure protocol, or this is some, you know, protocol that we invented, or this is a client-initiated TLS thing or something. We're like, I don't want you to interpret this thing, Linkerd. I don't want you to look at requests and responses and do load balancing. I just want you to proxy.
In that case, there's no real reason for Linkerd to do anything other than just hand it off to the kernel and say, hey, send these bytes from here to there, right? So that is something that eBPF would be able to help us with. And that would speed up that kind of raw TCP proxying case, I think, quite dramatically, possibly, for Linkerd. But for anything else, you know, for anything that is kind of interesting from the service mesh perspective, and by interesting, I mean, you know, load balancing gRPC requests based on the latency of the endpoints. You got 30 different endpoints, you know, which one's the fastest? Okay, send the request to that. Or initiating a TLS handshake and doing, like, you know, L7 policy authorization on top of that. You basically need a proxy running in user space. And so then your choice is, okay, eBPF doesn't help me with that, right? I have to run a proxy in user space. Do I want to run that proxy as a sidecar, or do I want to run it as a daemon set? And if you run it as a daemon set, yes, you have no sidecars, but the problem is now, if you're doing TLS, all those TLS certificates are being mixed in the memory of that daemon set proxy. Well, that's crappy for security reasons because, you know, there's... oh gosh, I'm forgetting the name. It's a classic security problem. The deputy, the confused deputy. Okay, I was blanking on the name. But you've basically deputized that thing to act on your behalf, and it could go rogue. Like, you've got no idea. It's got all the TLS certificates for all the pods. So what's the point of doing TLS, you know, if they're all being mixed together and this is acting as your representative? And you've got a much crappier operational model, because if that thing goes down, then some random set of pods that were scheduled on that machine now suddenly are unable to communicate. Or if you're rebooting that thing, well, now you've got a random set of pods from random applications that Kubernetes has decided to schedule on that machine, you know, who are now unavailable. And the reason why I'm so familiar with that is because Linkerd 1.x actually ran as a daemon set. So we knew this model inside and out. We were like, great, why bother with sidecars? Let's do daemon sets. And then, you know, every single user was like, oh, this sucks. Every time we reboot this thing, it's really hard to figure out what's going on, because random pods go down. And, you know, how do we do mTLS in a way that we feel safe with? So, you know, I think if you unwind from there, basically what our analysis, and we dug pretty deep into this, what our analysis kind of led us to is: eBPF is cool. I think there's ways we could use it in Linkerd, but it doesn't change anything about the trade-offs between running things as a sidecar, running things as a daemon set, or even the Ambient approach of running them somewhere else, you know, and having a tunnel component. That trade-off is totally independent of whether you're using eBPF or not. And the thing that kind of irks me is that it gets smeared together in marketing land. Hey, we've used eBPF, this amazing technology, to kill sidecars. Well, no, those are two independent choices that you've chosen to present as one. But, you know, if you take it from the engineering perspective, they're kind of independent.
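
William's layer 4 versus layer 7 split can be made concrete. Counting traffic per address-and-port pair needs only a small, fixed-shape map, exactly the bounded state an eBPF program can keep (modeled here in plain user-space Rust for brevity, not actual eBPF bytecode); parsing HTTP/2 means tracking per-connection stream state and HPACK tables across packets, which is why that work lands in a user-space proxy:

```rust
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr};

fn main() {
    // L4 accounting: a finite data structure keyed by (address, port).
    let mut counts: HashMap<(IpAddr, u16), u64> = HashMap::new();
    let packets = [
        (IpAddr::V4(Ipv4Addr::new(10, 0, 0, 7)), 8080u16),
        (IpAddr::V4(Ipv4Addr::new(10, 0, 0, 7)), 8080u16),
        (IpAddr::V4(Ipv4Addr::new(10, 0, 0, 9)), 4143u16),
    ];
    for key in packets {
        *counts.entry(key).or_insert(0) += 1;
    }
    println!("per-connection packet counts: {counts:?}");

    // L7 work (HTTP/2 framing, HPACK, TLS handshakes) needs effectively
    // unbounded state per connection, so it runs in a user-space proxy,
    // whether that proxy is deployed as a sidecar or as a daemon set.
}
```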

Bart: And like you said, there is...

William: Apologies for my very long...

Bart: No, no, not at all. It's like you said, whether it's... But we see this all the time. And that's often why, as well, in the very beginning I asked: why one and not the other? What are the trade-offs? And if we're in the Kubernetes world, there are folks that were in the container orchestration wars, Kubernetes versus Mesos. Some people to this day will still say that Mesos was the better technology; Kubernetes had other factors behind it that led to that. But that's another podcast for another day. The point is there's a lot of experimentation and innovation in this field. A lot of people are working really hard in different communities. If we dig deep enough, I'm sure we could find a romantic angle and Netflix would want to produce a series about it. But, thinking seriously, what do you expect to happen in the next, let's say, couple of years regarding this? Have we reached the peak in terms of the conflict or doubts around service meshes and sidecars? Do you think there will be some kind of a peace treaty? Can we expect new factions? What do you anticipate in the coming months and years?

William: Yeah, well, I love the fact that this world is so open to experimentation and so open to trying new things, and so, you know, kind of aggressive about pursuing new angles. I think that's great, you know, and I think the only aspect of it that I don't like is when the marketing is used as a way to push one solution over another versus, you know, trying to take an engineering analysis. And especially, I think, for those of us who are so focused on the operational aspects, the implications for our customers, you know, it's just not a very sexy, exciting argument to make. I'm like, how do I gussy this up? You know? Yeah. So I think that's all great. And yeah, where is this all going? You know, I think the value prop for the service mesh is pretty clear in people's minds by now. Early on, it was like, oh, lots of hype. What does this thing mean? I want to learn about it. And now it's like, okay, whatever. I know I need a service mesh. I don't really care. Pick one and move on. In some ways, that's nice. In some ways, it's like, all right, we're at a much more mature state now, and I'm not telling people, well, I guess I am occasionally telling people, here's what a service mesh is and here's why you should care, but that's not 100% of my conversations all the time. And I think they're kind of converging in terms of functionality, too. You know, a lot of the features are clear now. Okay, you need mTLS. Great. You need load balancing and circuit breaking. Okay, great. You need mesh expansion, so you need to be able to pull in non-Kubernetes workloads. That's actually the big announcement we're making in about mid-February, whenever anyone listens to this.

Bart: The past is the future. It's exciting. Yeah, check it out.

William: So yeah, come check out Linkerd 2.15. We've added the ability, and this is a huge step for the project, to run the data plane outside of Kubernetes. And so you get the same TLS communication to and from your VM workloads as we've been providing for your pods. So, a profound step forward for the project. But these things are all kind of moving in the same direction. And in fact, you even see things like the Gateway API, which is an API built into Kubernetes itself, that is slowly, and we're part of this, slowly converging to consume not only how do I configure ingress, but how do I configure service mesh, right? So can I use the same configuration primitives to talk about traffic, not just as it comes into the cluster, but as it transits the cluster, and maybe even as it egresses the cluster? I think the answer is yes. You know, we're not quite there today, at, you know, the beginning of 2024, but we're not that far away from it. So you even see these standards evolving to kind of consume this. So I think that's great. That's a sign of a healthy technology. I think we'll continue to refine things. I think that the industry as a whole will continue to refine, you know, some of the little bits and pieces of, okay, can we use eBPF in these situations? Does Ambient give us advantages in those situations? But kind of the value prop is the same for all those things. So what I'm excited about, kind of the next step, is: well, how do we apply this to an organization as a whole? The customers that we see, you know, the most forward-thinking customers, our foot in the door is Kubernetes. They're like, okay, we've got this new Kubernetes thing. Sure, let's make it secure and compliant and reliable. And then they're like, okay, well, how do we expand this to the rest of our organization? And that's the part that's really interesting to me, because I feel like the service mesh kind of represents a new level of networking. It's a new abstraction for networking, where, kind of at the core, you're not talking about IP addresses and, like, can I establish a TCP connection and safely, reliably pass packets from this IP address and port to that IP address and port. Instead, you're talking about a network where kind of the primitive is: I'm service A, you know, represented as this actual set of instances juggling across machines, IP addresses change all the time, blah, blah, blah. And you're service B. Can I establish a secure, reliable connection to you? And that's the baseline primitive. It's not just: establish a TCP connection. It's: I need a secure, reliable connection, and then I can start building stuff on top of that. So that's the part that gets me excited, and that's kind of the direction we're heading with Linkerd. You know, the technology is always fun to talk about, and it's fun to argue about, but kind of at the end of the day, you want this stuff to become plumbing. You want it to become, like, you know, things that hide in the walls, and, you know, you don't really have to think about it. You're, you know, you're, like, using the sink. I don't know.

Bart: I've thought of pretty exciting analogies. No, just normal, functional. Yeah, it's like, you know, the Kelsey Hightower quote, that eventually Kubernetes will just kind of disappear into the background, will become boring, you know, things of that nature. And I think a hell of a lot of work has to go into it for that to happen. Like you said also, it doesn't always have to be an emotionally charged conversation. Look, we're trying to make people's lives easier, and hopefully this will become something as routine as, like you said, using the kitchen sink. Shifting away from the professional and more to the personal side now: we found out that you have an interesting tattoo. Is it of Linkerd, or what is that?

William: Yeah, I did tweet, I did tweet this picture. So, I'm going to disappoint you, which is: that was a temporary tattoo. It's not a real tattoo. So it's part of our, this was at KubeCon, I think a year ago, one of the bits of swag we had to give out were temporary tattoos. You know, everyone gives out stickers and stuff. And I was like, well, you know, we can be a little cooler than that. So, it's a temporary tattoo of Linky the lobster, you know, who is a representative of Linkerd. When Linkerd graduated, so, you know, CNCF has these tiers of project, and graduation, it's the top tier. Okay, project's really mature. So Linkerd, of course, was the first service mesh to graduate, a testament to the maturity of this beautiful project. But when they graduated, they made us pick a mascot. You know, and most of the mascots, you've seen these, you know, it's like, oh, you got the cute little gopher for Go, and you've got a cuddly, I don't know, something else. And we were like, well, what's the least cuddly object you could think of, what's the least cuddly critter? And it was a lobster. So we're like, all right, Linky. Linky's a blue lobster, which in some ways is, like, Linkerd's, you know, kind of position in the CNCF universe. We're kind of the ugly, you know, rare thing that people are like, I guess the lobster's here again, you know, I'm going to go over here. At least it feels that way sometimes. So yeah, we got Linky the lobster. That was a tattoo, and I think in that case Linky the lobster was actually crushing some sailboats in his hands, which of course is, you know, a certain logo. So we put in a little, you know, a little Easter egg in there: Linky the lobster, you know, a giant lobster rising out of the sea, crushing some sailboats. And that was a temporary tattoo, if you were there at KubeCon, uh, well, I don't know where that was, you know, Amsterdam, maybe.

Bart: Okay, yeah, you could get your temporary tattoo. So you don't have any tattoos, not as of yet? Not yet, but, you know, let's see how Linkerd does. There's opportunity. All right, we'll look for it, we look forward to hearing more about that. What's next for you? You know, you're obviously a very active mind. Apart from this, is there anything else that you're working on, any side projects?

William: For me personally? Gosh, I'm devoting all of my life energies to Linkerd and to Buoyant. Sounds like it, you know. I am, yeah, you know, I'm a dad, I've got kids, and now they're learning the piano. I'm like, okay, I'm going to start learning the piano too. As an adult learner, I can tell you, it is humbling how fast kids will learn and how slow you are. So, you know, it's nice to be in that kind of role sometimes, where, you know, I take lessons from a teacher who's probably a third my age, who's just phenomenal. And, you know, so much of my day I'm acting as the authority figure on something, and now I'm like, okay, I'm really, I'm, like, the shitty student here. So, you know, it's nice to have that kind of little reminder to be humble, which maybe I need a little more than most people do.

Bart: So yeah, that's really, you know, that's good balance. I mean, I think it's really good balance. I think it's a nice reminder for a lot of people that it's okay to do something that you're not super good at. It's not a big deal. And it's hard, you know. As a kid, you're used to doing that all the time, like, you never expected to be particularly good, because it's all good, exactly, it's no judgment, you know. Yeah, no, but as an adult, on top of it a CEO, I think, yeah, it can be something to struggle with.

William: Actually, I've given Linkerd talks to hundreds of people and felt very confident, never felt a moment of panic. And then I've, you know, tried to do a rehearsal for a piano recital in front of, like, 20 people, and it's terrifying. You know, it's just the complete opposite. So yeah, it's a good...

Bart: So next KubeCon, would you play piano if I bring a keyboard?

William: Hell no. There's no way.

Bart: I tried. I had to try. I had to try. Yeah. Okay. Well, it's all good. If people want to get in touch with you, what's the best way to do it?

William: Yeah, you can send me an email. That's easy. William at buoyant.io. Just make sure you spell it the right way. Look it up in the dictionary if you have to. It's B-U-O-Y-A-N-T. It's a real English word, but, you know, it's weird because the U is before the O. Or, you know, I'm on the Linkerd Slack, so you can always jump in there. It's slack.linkerd.io. I used to be really active on Twitter. I'm kind of not so much anymore, but I guess you could send me a DM and I might see it. I'm just WM on Twitter. But yeah, yeah. Come talk to me. I'd love to, you know, tell you about like New York, you know, maybe, maybe more realistically, I'd love to hear about what kind of challenges, you know, you're trying to face and see if there's something.

Bart: Sounds great. William, thank you very much for your time today. Really generous with your knowledge, how you share things, the perspective that you provide. Really appreciate it. Looking forward to seeing you at KubeCon in a month or so.

William: Thanks, Bart. It's great to be here.

Bart: Pleasure.
