Modern Kubernetes observability: beyond metrics and monitoring

Guest:

  • Shahar Azulay

Kubernetes observability goes beyond traditional monitoring with emerging tools and approaches.

In this interview, Shahar Azulay, CEO and co-founder of Groundcover, discusses:

  • The distinction between monitoring and true observability, explaining how correlating traces, logs, and metrics in one place provides deeper insights than traditional metric-focused tools like Prometheus and Grafana

  • How Groundcover's bring-your-own-cloud approach stores observability data in customers' own cloud environments, letting them retain 10x more data without cost concerns while meeting security compliance requirements

  • The blurring line between security and observability in Kubernetes environments, where eBPF technology enables both comprehensive system visibility and security insights

Transcription

Bart: Who are you, what's your role, and where do you work?

Shahar: Hi, I'm Shahar Azulay, CEO and co-founder of GroundCover.

Bart: What are three emerging Kubernetes tools that you're keeping an eye on?

Shahar: Security, observability, and automatically scaling Kubernetes and maintaining the scale are the three hot topics currently at the conference and in general.

Bart: Any open source projects that you've tried out recently?

Shahar: Groundcover is very involved in both OpenTelemetry and eBPF, as well as in the databases we use, such as ClickHouse, which we contribute to.

Bart: Running stateful workloads such as ClickHouse on Kubernetes has been a big pain point and challenge. What's been your experience with ClickHouse?

Shahar: [answer missing from the transcript]

Bart: One of our guests, Isala, expressed interest in using AIOps and eBPF to build tools that make SREs' lives easier. What are your thoughts on the potential of AI and advanced observability tools to improve DevOps practices?

Shahar: We definitely believe in that. The key to improving observability with AIOps is having the right data in place. Groundcover is well positioned to do that because we use bring-your-own-cloud: we store all of the logs, metrics, and traces on your cloud premises. We are heavily building AI experiences on top of that to help you get to the root cause more easily and correlate the right data when troubleshooting. That's the key point. Access to the very rich data that eBPF can collect, kept private on your own cloud premises, is the cornerstone of these advanced AI capabilities.

Bart: Our guest Artem stated that if you deploy Grafana on Prometheus, it only allows you to see what's happening in the system. I think this does not solve any problems. What do you consider a complete observability solution beyond Grafana and Prometheus?

Shahar: Grafana and Prometheus are more built to be a monitoring stack than an observability stack. It's the old bit about how monitoring is not the same as getting insights about a system. Focusing on just metrics is one thing, but observability is about correlating the right data together.

GroundCover uses the same sensor, the same technology basically, to correlate traces, logs, and metrics in one place. That's where the magic starts to happen, because you can see a log alongside the infrastructure metrics, or a trace that failed or had high latency. This is where insights start to form from raw data.

Such insights are hard to get without technologies like eBPF. Prometheus and Grafana are more focused on metrics and raw data rather than the deeper insights you can get from collecting everything and correlating it with the right context.
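As an illustration of what "correlating the right data together" can mean in practice, here is a minimal sketch in Python. It is not Groundcover's implementation: the record types `Trace`, `LogLine`, and `MetricSample`, their fields, and the `slack` parameter are all hypothetical. It simply gathers the logs and metric samples from the same pod that fall inside the time window of a given trace.

```python
from dataclasses import dataclass
from typing import Dict, List


# Hypothetical record shapes; real telemetry carries many more fields.
@dataclass
class Trace:
    pod: str
    start: float      # seconds since epoch
    duration: float   # seconds
    status: int       # e.g. an HTTP status code


@dataclass
class LogLine:
    pod: str
    ts: float
    message: str


@dataclass
class MetricSample:
    pod: str
    ts: float
    name: str
    value: float


def correlate(trace: Trace, logs: List[LogLine],
              metrics: List[MetricSample], slack: float = 1.0) -> Dict:
    """Collect logs and metrics from the trace's pod within its time window.

    `slack` (an assumed tuning knob) widens the window on both sides to
    tolerate small clock differences between signal sources.
    """
    lo = trace.start - slack
    hi = trace.start + trace.duration + slack
    return {
        "trace": trace,
        "logs": [l for l in logs if l.pod == trace.pod and lo <= l.ts <= hi],
        "metrics": [m for m in metrics
                    if m.pod == trace.pod and lo <= m.ts <= hi],
    }
```

Joining on pod name and time is the simplest possible key. A real system would more likely join on a shared trace or request identifier where one is available.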

Bart: Artem also mentioned that successfully setting up observability tools does not necessarily equal the value generated by the company. How do you ensure your observability implementation generates business value?

Shahar: Groundcover doesn't price by data, which decouples us from the old world where you would send logs, 50% of them unusable, that you would never query or use, yet still pay for them. The first step in providing value is ensuring that the amount of data you store doesn't determine your price, because people don't know how to control everything they send.

When they want to investigate a severe failure from last night, they might want to go deep. But in the usual case, they shouldn't have to pay for tons of unused logs or metrics. By doing this, we provide much more value to our customers because we ensure all the data is available, and they're paying in proportion to their infrastructure size, not the noise their data creates.

Second, we concentrate all data in one place. Our backend integrates with the Prometheus stack, OpenTelemetry, cloud metrics, and our eBPF sensor. We can provide value even when customers are using multiple platforms that Groundcover might not have fully replaced. Our goal is to be a single pane of glass where customers can store 10x more data without concern.

Bart: [question missing from the transcript]

Shahar: I think Kubernetes is still not fully mature when it comes to utilizing resources properly and making sure people really know how to monitor their stack. People use Kubernetes because it's amazing, but mostly because they anticipate needing it as they scale in two years. This doesn't mean they know how to work with it, are power users, or know how to monitor it properly.

I see Kubernetes evolving to become more natively integrated with solutions like GroundCover and other security solutions to provide immediate value to customers. If you're using Kubernetes at high scale, you want to ensure you can monitor and secure it without adding extra components.

The community is moving towards making Kubernetes about more than just cluster orchestration—focusing on resource utilization, cost monitoring, observability, and security built directly into the Kubernetes project. We see many efforts in this direction, and I'm confident it will continue to develop accordingly.

Bart: If there was one Kubernetes feature that you could improve, what would it be?

Shahar: I think the key challenge is understanding what's happening inside a Kubernetes cluster from an infrastructure perspective when it comes to your application. This is where we feel a lot of pain. For example, one of the features our customers love is the ability to identify when an application is running alongside other applications that might take the same resources on the same host. This can help explain hard-to-troubleshoot issues, like why a server is taking so long to respond—it might be a different application interfering with its resources.

I think everything around this area is still a bit raw and requires observability systems like Groundcover to complete the puzzle. Kubernetes will likely be better at this in the future.
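The co-location check described in this answer can be sketched roughly as follows. This is an illustrative Python example, not how Groundcover does it: the `PodUsage` record, the `cpu_cores` field, and the `min_cpu` threshold are hypothetical, and the usage figures are assumed to come from whatever metrics source is already in place.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PodUsage:
    name: str
    node: str         # host the pod is scheduled on
    cpu_cores: float  # hypothetical: average CPU cores used over a window


def noisy_neighbors(pods: List[PodUsage], target: str,
                    min_cpu: float = 1.0) -> List[PodUsage]:
    """Return pods sharing a node with `target`, heaviest CPU users first.

    Only neighbors using at least `min_cpu` cores (an assumed threshold)
    are reported, since light neighbors are unlikely to explain a slowdown.
    """
    by_node: Dict[str, List[PodUsage]] = defaultdict(list)
    for p in pods:
        by_node[p.node].append(p)

    target_pod = next(p for p in pods if p.name == target)
    neighbors = [
        p for p in by_node[target_pod.node]
        if p.name != target and p.cpu_cores >= min_cpu
    ]
    return sorted(neighbors, key=lambda p: p.cpu_cores, reverse=True)
```

CPU is used here only as an example. The same grouping by node applies to memory, disk I/O, or network usage.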

Bart: Shahar, what's next for you?

Shahar: Groundcover is focused on expanding its platform. We've just released form reviews and monitoring and did a conference on RBAC patterns (role-based access control). We're expanding the platform to basically replace solutions like Datadog, New Relic, Grafana Cloud, and many other vendors we compete with. The future for us is to keep building the platform and supporting our customers with more complex use cases, allowing us to be natively integrated with our Kubernetes stack using technologies like eBPF or bring-your-own-cloud.

Bart: Let's backtrack a bit, since you mentioned the connection between observability and security. I think a Red Hat report from a few years ago said 71% of vulnerabilities in Kubernetes came from misconfigurations. Tell me about the role of observability in that picture. How does observability relate to potential misconfigurations, vulnerabilities, and security problems in Kubernetes?

Shahar: Once you use a technology like eBPF, the borderline between security and observability becomes a gray area. For example, GroundCover provides a map of all your APIs, showing which were encrypted, which were external-facing, and which included PII. It's observability focused on providing security.

The second point is ensuring that observability data is secure, because observability data is essentially the most vulnerable information an organization has. It contains logs about customers logging into your system and potentially performing PII-related activities. That's why we're steering away from the SaaS model our competitors use (like Datadog, New Relic, and Grafana Cloud), instead storing data under the security policies of your organization inside a bring-your-own-cloud approach.

These are the two domains where we see security and observability blend: making sure the data is secure and ensuring that observability system insights look into the security domain, correlating with container security, API security, and other aspects.

Bart: How can people get in touch with you?

Shahar: GroundCover.com has a trial. You can log in for free and use our playground to experiment with the platform at high scale without installing anything. That's the best way to contact us. Try out the platform, see how it works for you. Try out eBPF and talk to us when you're satisfied.
