Migrating 24 services from Docker Compose to Kubernetes

Host:

Bart Farrell

Guests:

Ronald Ramazanov
Vasily Kolosov

Should every project start with Kubernetes?

And if not, when is the right time to switch without incurring (unbearable) technical debt?

In this episode of KubeFM, you will learn how the team at Loovatech designed an app from scratch and decided to use Docker Compose to host their infrastructure cheaply and effectively in a single virtual machine.

As the project grew, the team had to make the difficult choice to rearchitect their infrastructure and plan for scalability and fault tolerance.

Follow their journey and learn:

How to migrate from a single Docker Compose file with 24 containers to Kubernetes.
How to verify that your apps are stateless and what changes are necessary to deploy them into Kubernetes.
How to manage expectations and explain the value of a complex migration to your boss or (non-tech-savvy) customers.

Vasily and Ronald also shared how they integrated ArgoCD and their existing CI/CD to leverage push and pull-based GitOps and their plans to incorporate multi-tenancy and custom metrics.

Transcription

Bart: Planning Kubernetes infrastructure can be challenging, particularly when the end users, the final clients, are at different levels of technical maturity when it comes to using these technologies. I got a chance to speak to Vasily and Ronald from Loovatech. They told me about their experience with application migration from Docker Compose to Kubernetes: the how, the what, and the problems that they encountered along the way. Not only does this dynamic duo work together when it comes to Kubernetes challenges, but they also play music together, playing different instruments and singing. I think you'll enjoy this episode. Check it out. Ronald, Vasily, welcome to KubeFM.

Vasily: Hey, happy to be here.

Bart: Great to have you.

Ronald: Hey, Bart, yeah, thank you for having us on your podcast.

Bart: Absolute pleasure. Ronald, starting out with you, what three tools would you install on a new Kubernetes cluster?

Ronald: Yeah, I would say it definitely would be ArgoCD, a powerful GitOps tool with a pretty useful web interface. And it's the first one because then, by using ArgoCD, I would install the other components in the cluster. The next tool, I think, would be Prometheus, because it's important to monitor what's happening in your cluster and with your application, and Prometheus with Grafana and Alertmanager is a good way to do it. And I think KEDA is the third one, because one of the useful features of Kubernetes is scaling, and KEDA provides more opportunities here because we can use custom metrics.

Bart: We have a previous guest from the podcast, Jorge, who's a KEDA maintainer, who will be very happy to hear that in addition to Argo CD and Prometheus, as you mentioned. Vasily, do you agree, disagree, anything you would add or modify there?

Vasily: Yeah, I totally agree with this. And I'd say that in most of our projects, these are like the three go-to tools that go anywhere, because they're like a Swiss Army knife. You can just go and run with them. If you go this route, you will have things covered for sure.

Bart: Fantastic. That being said, Ronald, just to dig into your background a little bit more, can you tell us a little bit about who you are and where you're working at the moment?

Ronald: So I'm a DevOps lead in a team of four engineers, and I've been working at Loovatech for four years. Me and my team, we mainly work on cloud infrastructure, mostly in AWS, and we prefer Kubernetes as a container orchestration tool. My team works on the applications that Loovatech develops, and we also help customers who have their own development teams, acting as a DevOps vendor for them.

Bart: All right. Vasily, what about you?

Vasily: Me and Ronald, we're co-workers. We both work at Loovatech, and I'm the CTO, so essentially I'm responsible for the delivery of our services. What we do is custom development as a service, product development as a service, as an outsourcing service. So people come to us when they need to develop a product, or support an existing application, or maybe do some cloud infrastructure project. And what I basically do in the company is make sure that we deliver what we promised to the client. This is a very important part, and it encompasses software development and infrastructure, and also, you know, talking to the client when the client is maybe panicking or thinking that we might not be on time. But we will be on time, of course, as always. Me and Ronald, we met back in the day, in 2018, when we were just looking for a systems administrator. We were not cloud native at the time, like, at all. We just needed someone to help us with, you know, managing Linux machines and just doing things the old-fashioned way. And that's how we met. Ever since, I'd say that me and Ronald, we grew together as professionals, and we grew with the technology that we use. So that was our joint journey at Loovatech for the two of us.

Bart: Fantastic. With that in mind, you said about six years ago, what was that process like of getting into cloud-native technologies, specifically with Kubernetes? What was it like before using it and how did you go about learning it?

Vasily: Yeah, I'd say that we as a company mostly evolve based on the projects that we do. And the projects that we had in our early days were not as complex, and we were not as tech savvy as we are right now. And 2018, that was a different time. That was a time when you would seriously consider whether you needed to go the container route, or you would just, you know, do it the classic old-fashioned way. That was a choice back then. Now you don't have a choice, like, you just fire up Docker and that's what everyone does. Back in the day it was not like that, and I'd say that it was precisely the moment when we understood that we needed to level up our game a bit, because we had already started struggling with some things that involved, like, high availability and reliability of the services. And we are a small company, and we don't have the luxury of having a crowd of engineers that may do a lot of things manually. We strived for automation from day one. We needed to cover a lot of things with just a few people. And we came to a point when we understood that even with automation tools in place (we mostly use Ansible and Terraform), that was not enough to manage the applications that we were working with later. So we needed to make that leap, because we understood that we just spent too much time fighting things that were solved in Kubernetes just by design.

Bart: That's great. One thing thinking about there too is like you said, learn the technology is one thing, but on top of it based on the team that you have, you know, how many engineers are at your disposal, that's definitely going to shape it more. Ronald, in your experience, were there any key resources or things that helped you learn Kubernetes?

Ronald: Yeah. So, I remember clearly that when I just started learning about Kubernetes, it wasn't easy. For theory, I mostly read articles on Medium or other resources and watched videos from conferences. And I think I just kept a balance between theory and practice: I had, like, a real task and used those sources. And of course, the Kubernetes documentation and Stack Overflow did their job.

Bart: They are go-to resources for a reason. Just want to see if there's anything that makes sense. If it's not on Google or Stack Overflow, it probably doesn't exist, right?

Vasily: Yes. And I mean, you know, it's funny that back in the day you would go and buy a book on a subject. Nowadays, you can't just write a book on Kubernetes, because it will be obsolete in a year or something like that. So we've got a new way to learn things now.

Bart: Like you said, keeping it fresh, and it's no coincidence that what we're speaking about today is, more specifically, an article. That being said, if you could go back in time and give any advice to your previous self about, you know, hey, avoid this or focus on that when it comes to learning Kubernetes, what would you recommend there, Vasily?

Vasily: I'd say that I really think that we went right with the technology side. I wish we had done this earlier, though. There's maybe some time lost when we were just drifting, in a way: should we do this or not, we already have a way of doing things. And I mean, it's a good question. As an engineer, you should be more open to trying new things and devote some time to tinkering with stuff. 80% of what you try will be, like, useless tools that you will just throw away in the future. But still, you gain experience and you look at new things. So I think that as an engineer, it's vital that you, you know, always spend time on some bleeding-edge technology that's completely new, and that really moves you forward, because people who didn't do that at the time have a huge disadvantage right now. And the second point, I think, is that we might have been better off reaching out more to the community, because back in those days we were, like, constrained in our own box. It was just our team and that's it. And right now we're speaking here, and one question I ask is, why wouldn't we do something like this five years earlier? Like, what held us back? Nothing held us back, just ourselves. So yeah, you know, the community and what you're doing here is a great way to spread knowledge, you know, bring people together to share things. So that's a beautiful thing to do. And you can start doing it anytime. Everyone has something to bring in. Everybody has their unique experiences, and you always have something to tell the world about.

Bart: I think it's a wonderful point about two things. A, the part about trial and error with different tools: you know, not everything is going to work, but that doesn't mean it's a waste of time. There's going to be a process of elimination. You're going to be keeping your eyes and ears open both to the technologies themselves and to the people who are building them, to understand their logic and the pain points they're trying to solve. Then the other point that you mentioned is about making community part of your solution. And there are so many different ways to do that, which is very beneficial, because some people don't want to do a podcast or write a blog, but everyone can find their community fit, whether it's being a lurker in a Slack channel or a Discord server. It's really up to everyone to decide what's going to work best for them. But I think it's a solid point. Ronald, anything that you would add there?

Ronald: Oh, you know, I think I just agree with everything that Vasily said. I think nothing to really add here.

Bart: You published an article about application migration from Docker Compose to Kubernetes, the how, why, and what. Now you work with different kinds of customers that have different maturity levels. Can you give us some examples of things that you've encountered regarding that, where some companies might be a little bit more advanced than others, things that you have to keep in mind when helping them migrate from one technology to another?

Vasily: Yeah, our portfolio of clients is very diverse, because some of them are startups in early stages and some are internal projects of an enterprise company. For example, there's one project where there's a food manufacturer that has large manufacturing plants, and the software they needed was a system that could help them simplify the relationship with their suppliers: to help them, you know, better communicate with their suppliers, organize that process, et cetera. This is not a system that is getting any sort of high load, and they need to host it on-premises. That's their requirement, that it's got to be installed somewhere. And then you go talk to their IT team, and if you mention something like Kubernetes, they will say, oh, that's too much overhead. Why? We don't need this. We'll just give you two machines and that's it. And you say, what about failover? What about high availability? They'll say, oh, that's too much. It doesn't matter if the system goes down for half a day; nothing will break so hard that it would justify the extra overhead, the extra management overhead, et cetera. Many of our projects are too simple, or don't have requirements for availability and scalability that are high enough to justify the overhead that you inevitably get with Kubernetes. Like I say, Docker Compose is not going anywhere, because it's super simple. You just look at an example of how a Docker Compose file is structured and you can write a Docker Compose file yourself. Every developer can write a Docker Compose file with no learning curve, like, at all. And one thing that we always do is keep every application containerized, so this can help us migrate to something like Kubernetes in the future. But yeah, most of our projects are not as infrastructure heavy or demanding as others. But some of them are very infrastructure demanding, and that's where our engineers spend a lot of time, you know, working on the infrastructure. And for some projects we literally just have templates for Docker Compose files for, you know, the typical services. So we have building blocks that we can use to quickly spin up a deployment for a simpler application.

Bart: Now, one of the clients that, you know, where the infrastructure seemed to be a little bit trickier or that at least stood out in the article is a client of yours called Picvario.

Vasily: Yeah. Yeah.

Bart: So can you just walk us through, you know, the app and the infrastructure when they first asked for help, as opposed to where they're at now? What does that look like?

Vasily: We actually started developing Picvario from scratch, because the founder came to us with an idea. He said, I see a market for digital asset management systems that is not covered yet. So what Picvario does is help a company that has a lot of media content, say a sports team or a news outlet or a museum. The first problem is organizing that content, but the biggest problem is finding the right content when you need it. Like, for a sports team, when you need to pull up a picture of one of your athletes, you need to pull it up from a game that happened three years ago against a particular team. And how do you do it? You just go to a Google Drive folder and try to click through things to find it. You need something better in place. So Picvario does exactly that, and unlike many systems of this class, it's geared towards companies that want to, you know, build their own media content bank. But that was just an idea in the beginning. So when we started working on it, we had, like, zero infrastructure in place. And the only task at hand was moving very fast. We needed to iterate super fast, because they had some initial investment and they needed to be very careful with that budget. You couldn't just burn it without thinking; you needed to account for every cent spent. And they were also working on the first sale. That first sale was to an enterprise client, and they found, like, a minimum viable product concept that involved very basic features, like being able to upload the content, process it, and find it in the simplest way, and we were working on delivering that MVP. Now, with that in mind, we just had neither the time nor the resources to work on some complex infrastructure. We just needed the developers to, you know, bring in features, bring in features, bring in features.
And as that was an MVP launch, there weren't any hardcore requirements for how well it should handle load, because the client had, like, 50 gigabytes of content. It's just not a lot. You don't need a lot of computing power to process that. And I think one important point here is that your infrastructure, and generally the technical complexity of the project (that involves infrastructure, architecture, how the code looks, how it's structured), should follow the development stages of the product. So when it's early stage and a lot of things are undefined and you don't know how the product will turn out, like what features will be in place and how they will consume resources, you don't need to come up with something complex at this stage. You just want to keep things simple. Then, as the product evolves, you naturally support it by introducing more complexity in the parts that need it. And that's the way we did it with Picvario. We started with, like, nothing: it was a Docker Compose file with five services, then it had ten services, then after a year it had 25 services in it. And that was the moment when we started scratching our heads and thinking, hey, this is growing out of control. We've got to do something about this. And we were also having issues with handling spikes in load, because there was just some static computing power allocated to the project, and people started uploading a lot of things. The interesting thing about Picvario is that it features all kinds of workloads. You have file uploads; you have picture and video processing, like transcoding, preparing small previews, preparing for streaming, for different formats; you have AI workflows to detect things in photos; and you have search, so the ability to find things in the index when there are millions of assets. So Picvario covers all types of workloads that you have in a modern system. And that was a challenge too.
So that's when me and Ronald started to think that we needed to change things up and, you know, elevate the infrastructure to follow the product development.

Bart: With that in mind, Ronald, you had 25 different services running on a single host. How big was that? I just want that to stand out. I want that to sink in. How big was that virtual machine? And was it on steroids?

Ronald: Yeah, I think so. Yeah, it was big. But it worked. Of course, with some limitations. So the infrastructure was quite cheap and easy to maintain, but it had a number of drawbacks that we just needed to fix. Like, you know, the lack of fault tolerance: when we had problems with the virtual machine or the network in a specific availability zone of the cloud provider, that could result in the application failing. We had a single point of failure, so problems with one application component could impact other components, in the case of memory leaks or high CPU usage. It's not critical when one worker is not performing correctly, but it's a major issue when, because of that, the frontend or backend, the main components of the application, are not performing correctly or, even worse, are down. So yeah, those were the problems.

Bart: And so getting to that point, you know, the problem is starting to realize, all right, this is not sustainable. We've got to take it somewhere else. And when you were evaluating, you know, the different options, whether it was Kubernetes, Ansible, Docker Swarm, eventually you did decide on Kubernetes. What influenced that decision and did the migration start immediately after that? Or was there a waiting time? Walk me through that.

Ronald: The existing solution required reworking from scratch, so we basically analyzed different options. First of all, of course, we were thinking about Kubernetes, because it's the current standard, the most popular platform for container orchestration. But we also looked at other options, because some team members had doubts about Kubernetes and thought that it's, like, an overkill solution and that we should go with something simpler to maintain and cheaper. So yeah, we discussed and researched different options, for example, using Ansible with its Docker module to orchestrate containers across several hosts and distribute traffic with a load balancer. We also considered Docker Swarm. But after the research, we understood that all these solutions looked like a compromise. They could help with some of the challenges that we had, but only for a short time and not as effectively as Kubernetes. So yeah, we chose Kubernetes, but we didn't start migrating right after that.

Vasily: The issue was that when you read something about how to migrate to Kubernetes, the first thing you hear is that your application should be stateless, right? So you shouldn't store persistent state on the node or in the container itself. But the devil is in the details, because it's easy to say, make your application stateless, blah, blah, but how do you actually do it when you, you know, have the actual challenges that come with it? And we had two. One of them was that we had a sophisticated file upload process, because you just can't upload a large file, like several gigabytes, via a browser in just one POST request. It just won't work. No web server will let that pass through, and you will encounter timeouts, et cetera. So we use chunked uploads: we split the uploaded file into multiple smaller chunks, they are sent to the backend separately, and then the backend just stitches them together to get the final file. And that was easy to do when you had just that one machine. We actually had all the files in a file system path, so it's, like, images, and there it goes, a folder structure with the media content that you have. And the chunked uploads just went into the temp folder on the machine, and then the backend could easily just use the file names to stitch them together, and here we go, here's your file. But that's something you just can't do in Kubernetes, because you can't rely on this folder being available to everyone. It's just not easy to pull off this trick. So what we ended up with: first, of course, we moved everything to S3. And to this day, it's just S3. You can never go wrong with using S3 for storage. It just works. And for the upload part, we use an interesting feature of S3 called multipart uploads.
The S3 protocol allows you to create multipart uploads, so you literally upload those chunks and then tell it to stitch them together, and there it goes. It all happens on the bucket without you even having to write specific code for it yourself. It's just an API call. That's the first thing. The second was the processing, because a lot of things happen to every uploaded asset. You need to transcode it, do some AI magic, extract metadata. You have a lot of things to work on, and it's all asynchronous. It all happens in the background via Celery workers that, you know, have a task in the queue; they go, they fetch the asset, and they work with it. And the problem is that many tools, like FFmpeg, require you to provide the name of a file on a file system. Most of the tools that we use take a file system path as an argument. And when you have that distributed, you just can't put a temporary file somewhere and rely on it being present on every node. So we made some changes there. This went in two stages. First, we were just uploading the file to S3, then a worker pulled it back into a temp folder to work on it, with a small optimization that it doesn't download the file twice if it's already present. But in the end, we just went with a shared scratch file system. It ended up introducing less delay, and you can always be sure that the files in the scratch space are accessible. So my advice, if you encounter something like this: don't be shy about using things like Amazon Elastic File System. Just throw it in, connect it everywhere, and it will just work, and it will size itself according to the scratch space you use. Because sometimes people upload, like, 50-gigabyte video files, raw, uncompressed video files, and you must be sure that you have space to accommodate such huge uploads in your temporary scratch space.
So just go with the Elastic File System or something like that. You can't go wrong. It will take so much headache off you. So those were the two things that we changed in the application, you know, for it to become stateless.
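
The multipart flow described here (create an upload, send the chunks as numbered parts, then ask S3 to assemble them) can be sketched in Python roughly as follows. This is a minimal illustration, not Picvario's code. It assumes `s3` behaves like a boto3 S3 client, it collapses the browser-to-backend chunking into a single function that reads from one stream, and it assumes the stream is not empty. The function names, bucket, and key are hypothetical.

```python
from typing import BinaryIO, Iterator

# S3 requires every part except the last one to be at least 5 MiB.
MIN_PART_SIZE = 5 * 1024 * 1024


def iter_chunks(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes from stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            return
        yield chunk


def multipart_upload(s3, bucket: str, key: str, stream: BinaryIO,
                     chunk_size: int = MIN_PART_SIZE) -> int:
    """Upload stream to S3 in parts and return the number of parts sent.

    `s3` is assumed to expose the boto3 S3 client methods used below.
    """
    upload_id = s3.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        for number, chunk in enumerate(iter_chunks(stream, chunk_size), start=1):
            response = s3.upload_part(
                Bucket=bucket, Key=key, UploadId=upload_id,
                PartNumber=number, Body=chunk,
            )
            # S3 needs each part's number and ETag to assemble the object.
            parts.append({"PartNumber": number, "ETag": response["ETag"]})
        # S3 stitches the parts together itself; no local temp folder is used.
        s3.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so that already uploaded parts do not linger in the bucket.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return len(parts)
```

With a real client this would be called as `multipart_upload(boto3.client("s3"), "my-bucket", "uploads/video.mov", open(path, "rb"))`, where the bucket and key are placeholders. Because no chunk ever touches a node-local folder, any backend replica can handle any part of the upload.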

Bart: And with deploying and, you know, planning the infrastructure. How did you organize your Kubernetes clusters? And also on top of that, were you working with a specific cloud provider or were you totally agnostic at that point?

Ronald: At that moment, all the infrastructure was already deployed with a cloud provider, and no changes were planned there. So we were thinking about how to create a Kubernetes cluster there. There were two options. The first option was to deploy a self-managed Kubernetes cluster on virtual machines using Kubespray or another similar tool. The second option was to use a managed service from a cloud provider. Almost all providers offer such services, for example, AWS EKS, Azure AKS, et cetera. And the main point of using a Kubernetes service in the cloud is that the responsibility for control plane configuration and performance is on the cloud provider's side. So as an engineer, I can focus directly on deploying and managing my application in the cluster and on setting up Kubernetes system components like autoscaling, the ingress controller, worker nodes, monitoring, and so on. You don't need to set up a control plane or troubleshoot it. It's also easier to update the Kubernetes version. Yeah, of course, when you perform an upgrade, you need to do some research and take some actions. For example, if you upgrade the Kubernetes version in AWS EKS, you need to go through the AWS documentation, check what changed in your version and what preparations you need to make, but there's a lot less to do. So yeah, we chose the option of using a cloud service. As I said, it requires less time from engineers, it's easier to maintain, and it's also more cost-effective. Because, you know, for example, in EKS you pay something like $75 per month for a highly available EKS control plane. So yeah.

Bart: And yeah, 2023 is definitely the year of cost optimization. And Vasily, I'm sure you hear a lot about that from customers. But going back to our magic number 25, you're running, deploying 25 services in test and prod could require some tweaking of variables and things of that nature. Did you have, did you use any templating engine for that?

Ronald: Yeah, yeah, we did. We are using Helm charts for that, because it allows us to minimize the number of configuration files. So instead of using just standard Kubernetes manifests, we have several universal hand-made charts that we use for all environments and all application components, and we can set the necessary parameters with values files. So it makes configuration easier and allows us to implement a don't-repeat-yourself approach. And of course, we made some changes in our CI/CD tools and approaches. Before the migration, we used TeamCity for building Docker images and deploying them. After that, we continued using it for building images, but started to use ArgoCD as a deployment tool. So ArgoCD performs the deployment of our Helm charts to the cluster.
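
The don't-repeat-yourself idea behind values files is that one chart carries shared defaults and each environment supplies only what differs. Helm merges nested maps key by key, with later values overriding earlier ones. The Python sketch below mimics that merge to make the behavior concrete. It is an illustration only: the keys, image names, and numbers are made up and are not taken from the Picvario charts.

```python
from copy import deepcopy


def merge_values(base: dict, override: dict) -> dict:
    """Return base with override applied; nested dicts are merged key by key."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_values(merged[key], value)
        else:
            # Scalars and lists from the override replace the default outright.
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults shared by every component and environment.
CHART_DEFAULTS = {
    "replicaCount": 1,
    "image": {"repository": "registry.example.com/app", "tag": "latest"},
    "resources": {"requests": {"cpu": "100m", "memory": "128Mi"}},
}

# Hypothetical production overrides: the role a per-environment values file plays.
PROD_VALUES = {
    "replicaCount": 3,
    "image": {"tag": "1.4.2"},
}

prod = merge_values(CHART_DEFAULTS, PROD_VALUES)
```

Here `prod` keeps the default repository and resource requests but picks up the production replica count and image tag, so the production file stays two entries long no matter how large the defaults grow.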

Bart: And just taking Argo CD further: you mentioned at the beginning of the podcast why it's one of the three tools you would install on a new Kubernetes cluster. In the article, you talked about the GitOps push and pull comparison. Can you share with us what exactly happened there?

Ronald: Yeah. So there are two different models of deployment. The push model is where your CI/CD tool performs the deployment steps. It has access to the environment and pushes changes. So for example, it just runs a helm install or kubectl apply command. Actually, we used this approach before the migration: TeamCity just did an SSH exec to the host and ran Docker Compose commands on the virtual machine. The pull model is how GitOps works. You have an operator like Argo CD or Flux installed in your Kubernetes cluster, and it monitors the state of your Git repository, and when it changes, it automatically starts synchronization and updates your application. Both approaches have their advantages and disadvantages, so we decided to combine them. As before, we use TeamCity to build images. And then we still use TeamCity for launching the deploy pipeline, which just passes the image ID variable to ArgoCD and triggers it to sync applications. So that's how it works.
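
The combined model can be pictured with a toy Python simulation: an in-cluster operator compares desired state with live state and applies the difference (pull), while the CI tool only hands over a new image and asks for a sync (the push-style trigger). Everything here is hypothetical, including the class and function names. It does not use or imitate the Argo CD or TeamCity APIs.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for the live state: component name -> running image."""
    running: dict = field(default_factory=dict)


@dataclass
class GitOpsOperator:
    """Toy pull-model operator: it owns cluster access, the CI tool does not."""
    desired: dict  # desired state, as it would be read from Git
    cluster: Cluster

    def diff(self) -> dict:
        """Return components whose running image differs from the desired one."""
        return {
            name: image
            for name, image in self.desired.items()
            if self.cluster.running.get(name) != image
        }

    def sync(self) -> list:
        """Bring the cluster to the desired state; return what was changed."""
        changes = self.diff()
        self.cluster.running.update(changes)
        return sorted(changes)


def ci_deploy(operator: GitOpsOperator, component: str, image: str) -> list:
    """Hybrid step: CI hands over the new image and asks the operator to sync."""
    operator.desired[component] = image
    return operator.sync()
```

A call such as `ci_deploy(operator, "backend", "backend:2")` changes only the backend and leaves every other component untouched, and the CI tool never needs direct credentials for the cluster in this picture.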

Bart: Now, we've heard quite a few different things here. This migration was no easy task from what it sounds. And, but what was the result after so much work? What was the client's feedback? What did they have to say about that? What were your conversations at that point, Vasily?

Vasily: I'd say that we really got what we were after, because the main pain point for us at that time was that with the setup we had before, if we had multiple clients uploading a lot of content into the system, that would become a huge bottleneck, because our queue would just get overloaded. We had too much stuff waiting to be processed. And the clients, of course, were complaining, because, I mean, we all expect that if we upload things to the system, they just get processed instantly. And if you have to wait, then what is happening there?

Bart: They want to see this right now. Yeah. The peak time with certain demand. Yeah. We can understand that for sure.

Vasily: Of course. Of course. If you are, like, a photographer at a sports game, you just want to upload the stuff with as little delay as possible. So after, like, the first period, you go upload it and send it to, you know, your editors' team, and you don't want to wait on things. And that was the point when it became evident that we needed to, you know, change things. Transcoding, uploading, and the processing of uploads was the largest pain point. And with Kubernetes, we nailed exactly that, because with the cloud service we had an easy way to, you know, order more resources, and scale up and scale down as we needed. That was a textbook example of using cloud technology to be effective with your budget and get resources as you need them. So we just used the thing that clouds were designed for, you know, by their nature. And with Kubernetes, it was so easy to make that happen, because you didn't have to worry about, you know, just anything. You set up your scaling policy, you set the boundaries for it and the triggers, and here you go. It just works, and, you know, it's some kind of magic that in the end justified the overhead and the things we had to go through with that. So the quality of service improved dramatically, and we no longer had those moments when clients complained about that. So that was a huge win for us.
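
A scaling policy with a trigger and boundaries boils down to a small calculation. The Python sketch below shows roughly what a queue-driven rule does: pick enough workers to keep the backlog per worker at a target, then clamp the result to a minimum and a maximum. The function and parameter names are hypothetical, and the numbers in the usage note are illustrative rather than Picvario's settings.

```python
import math


def desired_workers(queue_length: int, tasks_per_worker: int,
                    min_workers: int, max_workers: int) -> int:
    """Return the worker count for the current backlog, clamped to the bounds."""
    if tasks_per_worker <= 0:
        raise ValueError("tasks_per_worker must be positive")
    if min_workers > max_workers:
        raise ValueError("min_workers must not exceed max_workers")
    # Trigger: enough workers that each handles at most tasks_per_worker tasks.
    wanted = math.ceil(queue_length / tasks_per_worker)
    # Boundaries: never drop below the floor or exceed the budget ceiling.
    return max(min_workers, min(max_workers, wanted))
```

With a target of 10 queued tasks per worker and bounds of 1 to 20, a backlog of 95 tasks asks for 10 workers, an empty queue falls back to 1, and a burst of 1000 tasks is capped at 20. The cap is what keeps an upload spike from turning into an unbounded cloud bill.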

Bart: That's great. You know, we've talked a lot about the technical elements here, in terms of all the different choices you had to make about what you were going to use. In terms of helping the client level up, what was that process like? Because, you know, you mentioned at the beginning how, in some instances, you mention the word Kubernetes and it's like, no, no, no, no, we're definitely not doing that, or that's overkill, things like that. You talked about your own journeys with Stack Overflow and, you know, the Kubernetes documentation. What was it like, you know, communicating these technical choices to the client and bringing them along? Sometimes we hear that it's not necessary for clients to know about everything, not because you're trying to hide something, but just to reduce complexity. What was your approach there?

Vasily: That was not easy. Because we are super open with our clients, and I'd say that we never hide things from them. We believe that transparency is the way to go. But, you know, we engineers understand why Kubernetes is the way to go, why a certain technology is more advanced and gives you more capabilities than another. We all understand that as engineers, but then you go to a client and you say, hey, we need to do X. We need to go to Kubernetes, change our game. He says, all right, how much would that cost? How much is that gonna cost me is question number one. And the second is, what do we get out of it? What is the value that you provide? And this requires two things. First, the client has got to trust you. You've got to earn the trust of the client so he or she understands that you can be trusted with making such decisions. And sometimes, if you have a good relationship with the client, you can come to a point where you say that we absolutely need this. And he says, OK, I trust you with that. Just don't burn too much money when you do it. The second is that you've got to think hard about justifying the value that the client gets. Because if you want to sell something like this, you've got to come with a value proposition. In our case, we did attempt that before, but we didn't have a solid value proposition at hand. We just said, okay, this will improve the availability, this will improve the reliability, et cetera. And the client said, well, we don't need that too much right now. We have 20 clients and they do not have that much data in the system right now. And if the system goes down for an evening, that'll do. So it was just not enough justification for the client to go with something like that.
But then when we grew, when we started experiencing those pains, our value proposition became clear. We do this, we spend this much budget on migrating to Kubernetes, and we will solve this problem that directly impacts your business and the satisfaction of your clients. So that's a great value proposition, and anyone would agree with something like that. So if you want to sell some project like this to your boss, to the business, or to your clients, you've got to think very hard about coming up with a good value proposition. Otherwise they just shrug their shoulders and say, hey, let's save the budget for now. We will spend it on adding new features or something like that.

Bart: I like that. And that's a common topic. You know, it's like, don't just throw bells and whistles at me. Reliability, availability, latency, multi-tenancy, downtime, all this sort of stuff sounds nice. But how does this translate into practical business value? I know that's tough. And engineers are like, look, I didn't get an MBA, I'm not a business owner. But your customers are definitely going to be thinking about that. So I think it's really important to keep that in mind. One other thing I wanted to ask: you created this article. What was your objective overall in creating it? What's the reaction that you're hoping people will have?

Ronald: First of all, I think I just wanted to share my experience with the community because, as I said, when I was learning and when I actually did this migration, I used a lot of information from the community: articles, videos, documentation, and so on. So I thought it might be useful for someone. Plus, I think I just wanted to try my hand at writing, so it was my first and, I think, not my last experience. And also, you know, it's a kind of documentation for new team members. They use this article as information about the project and infrastructure; they can find out what was done, how it was implemented, and even why we decided on this solution.

Bart: That's great. And I think, you know, sharing is caring and also internally on your team. You know that when people are getting started, you don't have to repeat yourself 500 times a day, just go read this. It's all, it's all there. And it also stimulates that general curiosity that I think all good engineers have of that, you know, of course you can ask questions to people directly, but what effort have you made to find that answer on your own? And when you haven't been able to find it, that's when you jump in a Slack channel or you go to a coworker and say, Hey, I'm going to ask you a question. I've checked these three other places and I haven't found something. What's your experience been, or could you share something that might be helpful? So that's a great lesson that's learned there. Now Vasily, what's next for you? Having had this experience, do you plan on writing more blogs? Will you start hosting your own podcasts?

Vasily: What's going to happen? I think, yeah, we want to write more material on what we're doing, because since then we've had several more developments. Even in this project, for example, we've introduced things like scaling on custom metrics, and we did some interesting things regarding the implementation of fair queues. For example, we have a multi-tenant architecture and we need to make sure that we treat each of the tenants equally, so we don't have one of them overflowing other clients. So we have an interesting case in Picvario about this, and several other things that happened. Ronald is right now working on some huge infrastructure projects that involve hundreds of thousands of people from all over the globe consuming them and working with them. So I think that's something that we'll be sharing with the community too. Ronald, could you elaborate on that?
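
One common way to get the fairness Vasily mentions is to keep a separate queue per tenant and serve the tenants round-robin. The episode does not say how Picvario implements its fair queues, so the following is only a sketch of that general technique; the `FairQueue` class and the tenant and job names are hypothetical.

```python
from collections import OrderedDict, deque


class FairQueue:
    """Round-robin queue: each tenant gets one job served per turn."""

    def __init__(self):
        # Maps tenant -> pending jobs; dict order is the serving order.
        self._queues = OrderedDict()

    def put(self, tenant, job):
        # A new tenant joins at the back; an existing one keeps its place.
        self._queues.setdefault(tenant, deque()).append(job)

    def get(self):
        if not self._queues:
            raise IndexError("queue is empty")
        tenant, jobs = next(iter(self._queues.items()))
        job = jobs.popleft()
        # Move the tenant to the back of the line if it still has work.
        del self._queues[tenant]
        if jobs:
            self._queues[tenant] = jobs
        return tenant, job

    def __len__(self):
        return sum(len(jobs) for jobs in self._queues.values())


queue = FairQueue()
for name in ("a1", "a2", "a3"):
    queue.put("tenant-a", name)   # tenant-a floods the queue first
queue.put("tenant-b", "b1")       # tenant-b arrives later with one job

order = [queue.get()[1] for _ in range(len(queue))]
print(order)  # ['a1', 'b1', 'a2', 'a3']
```

Even though tenant-a submitted three jobs before tenant-b submitted any, tenant-b's job is served second rather than last.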

Ronald: Yeah. No spoilers.

Bart: Or one spoiler if you want.

Ronald: It's fine. Yeah, Vasily already mentioned that we are using CloudFormation, but we plan to migrate to Terraform. I want to explain why. Besides Picvario, right now I'm working on another project with AWS infrastructure and Kubernetes. For Picvario, we need to maintain infrastructure for a single large application with multiple environments. In this project, I need to work on a large number of small or medium-sized applications with a dynamic life cycle. All of them are separated from each other. They run in different EKS clusters because of customer requirements. So it's really necessary here to be able to quickly launch, update, migrate between accounts, or completely remove an application and its infrastructure. Right now we're talking about 15 applications, and it seems like their number will continue to grow. So automation is definitely needed here, and we're already using CloudFormation, which is the native infrastructure-as-code tool for AWS. But currently I'm working on migrating from that to Terraform and Terragrunt because, despite the fact that CloudFormation is a good tool, in our case the combination of Terraform and Terragrunt will make working with the infrastructure more efficient and convenient. Terragrunt allows you to define Terraform code once and then specify parameters using variables at the application, environment, region, or AWS account level. It lets you keep your automation configuration DRY, which is something we lack in CloudFormation. So I'm now at an early stage of the work, but I can already see that it is worth it.
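
The layering idea Ronald describes can be illustrated with a small analogy. Terragrunt itself is configured in HCL, not Python, so this is not Terragrunt syntax; it only sketches how parameters defined once at a broad level (account, region) can be reused and overridden at a narrower level (environment, application). All names and values below are hypothetical.

```python
def resolve_inputs(*layers):
    """Merge parameter layers; later (more specific) layers win."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Each level is defined once and shared by everything beneath it.
account = {"aws_account_id": "111111111111", "instance_type": "t3.medium"}
region = {"aws_region": "eu-west-1"}
environment = {"environment": "staging", "instance_type": "t3.small"}
application = {"app_name": "example-app", "replicas": 2}

inputs = resolve_inputs(account, region, environment, application)
print(inputs["instance_type"])  # t3.small (environment overrides account)
```

Adding a sixteenth application then means writing only its application-level parameters; the account, region, and environment values are inherited rather than copied.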

Bart: Wow, you're very busy. And on top of all this, when you're not writing articles and diving into these technical topics, you're a guitarist. Can you tell me about that experience and your evolution as a guitarist, when you started, what you're doing now?

Ronald: Yeah, I play electric guitar. I think I've been playing for about two or three years. I have lessons every week with my teacher, I learn new techniques and songs that I like to listen to, and I record them for myself. I think mostly I play metal and some indie rock.

Bart: Metal, indie rock, not necessarily neighbors all the time, but I think that's healthy. No, I think it's good. I think it's good, no, no, I really mean that. I think it's good to listen to everything. And when I say everything, I mean everything. And in the same way we were talking about tools, just because you don't like music doesn't mean you can't appreciate, understand, oh, well, how did they go about creating this song? What's the song structure? Things of that nature. If people listen to metal and only hear noise, they're missing out on a lot.

Vasily: Yeah.

Bart: And if we're talking about a purely technical perspective, the speed and also the music theory that goes behind a lot of metal is very, very complex. As a drummer, like I said, you can do CrossFit or you can play metal drums. You know, it's really intense, really, really sharp tempos that you have to be keeping in mind. Vasily, I understand you have a bit of a music background as well. Can you tell me about that?

Vasily: Yeah, I'm a one-man band. I produce music, I sing, I write lyrics, melodies, et cetera. So I'm just putting it all together. Our topic on this podcast is technology, and one thing I really appreciate about producing music nowadays is how it depends on technology, because nowadays music is very technological. You have all those plugins and processing and sound design techniques that were impossible years ago. And in the same way that Kubernetes opens up your possibilities to do more and come up with better, higher-quality services, the same happens in music too, because the new technology, new plugins, and new AI tools in music that come up these days level up the game too. And again, as with technology, in music you've got to keep up with the current advancements to improve your game as well. So it's interesting how these two things seem similar in a way.

Bart: That's a great point. You know, the democratization of these different tools and technologies, whether it's with open source and communities that you can use, or technologies in music that were previously only in recording studios, if they were. And it's interesting as well with the particular client you talked about with Picvario. You know, video production is something that nowadays a lot of people do with the phone and CapCut or programs like that. So there's a lot to be learned through the availability of these different technologies, and a lot of it comes down to, once again, how willing are you to learn? Are you willing to accept the fact that you don't know things and that the process takes time? A really good thing that I've seen on Twitter multiple times, from someone in the cloud native community, I think it's Ian Coldwater, is that the first step in getting good at something is being terrible at it. And just embrace that, it's fine. You know, Ronald, can you remember when you first started playing guitar, how it sounded?

Ronald: Well, it's better to... yeah. Actually, when I started to play, I uploaded all of it to YouTube with private access. So now I can just watch how I played two or three years ago and complain. Yeah, it's better not to show it to anyone.

Bart: but now we really want to see it. But I think the point is, at an individual level, you can see how far you've come. And you can really appreciate that process of leveling up. That being said, we can't access your private YouTube videos. But if people want to get in touch with you, what's the best way to do it?

Ronald: I think the best way is LinkedIn. It should be easy to find me there by using my name. Yeah.

Bart: Pretty good.

Vasily: Yeah. I'll see you. We'll be happy to talk.

Bart: Fantastic. Looking forward to the next steps. Be sure to ping me when that next blog article comes out about what you're working on right now, Ronald. This was a fantastic conversation, very eye-opening for people. Everyone's at a different phase. That's the thing: you walk into a room and you say Kubernetes, and some people will be like, no, no, no, talk to me in 2025 or 2028 or when I'm retired. I think it's really good to get these different perspectives. Also, one of the things that was mentioned is really understanding and empathizing with your customers, thinking about what their objectives are. How do we translate these things into business value? Because they're going to go off to explain this to their boss, and they want to be equipped with answers: how much does this cost, and what do we get out of it? It's such a basic point, but it's so valuable for everybody out there. So thank you both very much for joining us today on the podcast. I'm looking forward to seeing you soon.

Vasily: Bart, thank you so much. That was a pleasure.

Bart: Likewise.

Ronald: Thank you, Bart.

Bart: And I expect to see you and hear you on future podcasts with us as well as others, because this is very, very informative. So thank you for the work that you put into creating the article and for sharing your knowledge.

Vasily: Perfect. Let's do this.

Bart: All right. Bye-bye.

Vasily: Thanks.
