Kubernetes Monitoring: An Introduction

One of the first things you’ll learn when you start managing application performance in Kubernetes is that doing so is, in a word, complicated. No matter how well you’ve mastered performance monitoring for conventional applications, it’s easy to find yourself lost inside a Kubernetes cluster.

Not only are there more layers to monitor in the context of Kubernetes, but simply getting at the data you need can be a lot more challenging as well, given the way application data is hidden within Kubernetes clusters.

To help you get started with Kubernetes performance management, this article offers an overview of which metrics you’ll want to monitor and how to collect them. We won’t walk through every aspect of Kubernetes application performance management because that would require much more than a blog post, but we’ll cover the essentials that you’ll need to know in order to begin building an application performance monitoring strategy for Kubernetes.



Kubernetes Application Monitoring vs. Kubernetes Monitoring

A basic concept that you need to understand before going further is that there are two different types of performance monitoring that you will want to perform in a Kubernetes environment.

The first involves monitoring the applications that run in your Kubernetes cluster in the form of containers or pods (which are collections of interrelated containers). This is what we’ll be covering below.

The second is monitoring the performance of Kubernetes itself, meaning the various components, like the API server, the kubelet, and so on, that make up Kubernetes. The metrics you’ll want to watch to keep your Kubernetes cluster itself healthy (and the way you access those metrics) are different from those that matter when monitoring the performance of individual applications (and are fodder for a future blog post).

You could think of the differences here as being akin to the differences between monitoring the performance of an operating system on a conventional server and monitoring an application running on that server. In the case of Kubernetes, Kubernetes is the operating system, and the containers or pods deployed on it are the application. Obviously, the performance of Kubernetes itself impacts the performance of your applications, but each layer of the environment exposes different types of metrics and stores them in different ways.

Metrics for Kubernetes Application Performance Monitoring

If this sounds a bit confusing, the good news is that the actual metrics you’ll want to collect for application performance monitoring in Kubernetes are generally the same as those you’d collect in a conventional environment.

Those metrics will vary depending on which applications you’re managing, but they’ll generally include data such as:

  • Request rate.
  • Response time.
  • Error rate.
  • Memory usage.
  • CPU usage.
  • Persistent storage usage.

Usually, applications running in Kubernetes expose these performance metrics either in log files or by printing them to standard output and standard error streams, just as they would if they were running on a standard server instead of in Kubernetes.
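As a concrete illustration of the stdout approach, an application might emit each metric as a structured log line on standard output, where the container runtime captures it like any other log. The sketch below is hypothetical; the field names and metric names are illustrative assumptions, not a standard format:

```python
import json
import sys
import time

def emit_metric(name, value, unit):
    """Write one metric as a JSON line to stdout, where the
    container runtime captures it like any other log output."""
    record = {
        "ts": time.time(),   # timestamp in seconds since the epoch
        "metric": name,
        "value": value,
        "unit": unit,
    }
    sys.stdout.write(json.dumps(record) + "\n")

# Example metrics; in a real app these would come from instrumentation.
emit_metric("response_time", 0.042, "seconds")
emit_metric("error_rate", 0.01, "ratio")
```

Writing one JSON object per line keeps the output easy for a downstream logging agent to parse, though any line-oriented format works.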

Easy peasy, right?

How Kubernetes Handles Application Metrics and Logs

Well, not quite. Matters get more complicated when you get into actually collecting application performance metrics in Kubernetes.

The big difference between Kubernetes and a conventional server is that, in Kubernetes, data that pods and containers write to their internal file systems is not stored persistently when the pods or containers shut down. It disappears permanently, unless you collect it and move it somewhere else first.

What’s more, pods and containers don’t write monitoring data to a single location in your Kubernetes cluster. Each pod or container stores its log and event data in a different location (usually within the internal file systems of its containers). That means that Kubernetes doesn’t provide you with a way of aggregating or querying monitoring data from all of your applications using a single interface or command.

In other words, you can’t simply run something like “tail /var/log/syslog” in Kubernetes and get all relevant application data; the closest built-in equivalent is running “kubectl logs” against each pod individually.

How to Collect Kubernetes Application Data

Fortunately, it’s still quite possible to collect application metrics and log data in Kubernetes. You just need to work a little harder than you would in a conventional server environment.

There are two main ways to get application monitoring data from Kubernetes. One is to deploy a logging agent on the nodes within your Kubernetes cluster. The nodes are the individual servers that host your containers. When containers write monitoring data to their internal file systems, a logging agent running on the node can pull that data out, then forward it to an external monitoring tool. There, the data will persist for as long as you want it to, even if the container or pod shuts down.
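The node-agent approach is usually implemented as a DaemonSet, which ensures one copy of the agent runs on every node and reads the container log files that the runtime writes under /var/log. The manifest below is only a sketch: the Fluent Bit image tag, namespace, and paths are illustrative assumptions, and a real deployment would also configure where the agent forwards the data.

```yaml
# Illustrative DaemonSet running one logging-agent pod per node.
# The agent image and its configuration are assumptions; swap in
# whatever agent and output backend you actually use.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: logging-agent
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: logging-agent
  template:
    metadata:
      labels:
        app: logging-agent
    spec:
      containers:
      - name: agent
        image: fluent/fluent-bit:2.2   # hypothetical version tag
        volumeMounts:
        - name: varlog
          mountPath: /var/log
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log               # where the runtime writes container logs
```

Because the DaemonSet runs at the node level, one agent covers every container on that node without any changes to the applications themselves.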

The other approach is to run what’s known in Kubernetes speak as a “sidecar” container that hosts a logging agent. Under this technique, the sidecar container runs alongside the other containers you want to monitor within the same pod and collects monitoring data from them. It then forwards the monitoring data to an external logging and monitoring system.
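A sidecar setup typically shares a volume between the application container and the logging container so the sidecar can read what the application writes. The pod spec below is a sketch of that pattern; the image names and log path are assumptions, and the tail command stands in for a real logging agent that would forward the data externally:

```yaml
# Illustrative pod pairing an app container with a logging sidecar.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-logging-sidecar
spec:
  containers:
  - name: app
    image: my-app:1.0                  # hypothetical application image
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app          # the app writes its logs here
  - name: log-sidecar
    image: busybox:1.36
    # A real sidecar would run a logging agent that ships this data
    # to an external system; tail just illustrates the idea.
    args: ["/bin/sh", "-c", "tail -n+1 -F /var/log/app/app.log"]
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
      readOnly: true
  volumes:
  - name: app-logs
    emptyDir: {}                       # shared scratch volume, pod-lifetime only
```

Note that the emptyDir volume lives only as long as the pod does, which is exactly why the sidecar must forward the data somewhere persistent.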

A third solution is to build logic into your applications themselves that exports their monitoring data directly to an external logging system. Because this requires changes to the applications, however, it’s a less commonly used technique than the other two.



What about Kubernetes Cluster Monitoring?

Again, to manage all aspects of application performance, you’ll also want to monitor the performance of your Kubernetes clusters themselves and correlate that data with data from individual applications. That’s the only way to know whether an application that is responding slowly due to a lack of available memory has an internal memory leak, for example, or is simply starved of resources at the cluster level.

You can collect this data from sources such as the operating system logs on the Kubernetes master and worker nodes. That topic is, again, outside the scope of this article. But we mention it as a reminder that Kubernetes application performance monitoring needs to include monitoring of Kubernetes, too, and not just applications.

Conclusion

Monitoring applications in Kubernetes may seem daunting to the uninitiated. Fortunately, it’s not very different at all from application monitoring in other contexts. The main difference is in the way application data is exposed within a Kubernetes cluster. Getting at the data you need is a little more challenging than you’re likely used to, but it’s hardly an insurmountable task once you understand the architectures at play.

This is a guest blog post from Chris Tozzi, Senior Editor of content and a DevOps Analyst at Fixate IO. Chris has worked as a journalist and Linux systems administrator, and has particular interests in open source, agile infrastructure, and networking. This posting does not necessarily represent Splunk's position, strategies, or opinion.

Posted by

Stephen Watts

Stephen Watts works in growth marketing at Splunk. Stephen holds a degree in Philosophy from Auburn University and is an MSIS candidate at UC Denver. He contributes to a variety of publications including CIO.com, Search Engine Journal, ITSM.Tools, IT Chronicles, DZone, and CompTIA.