Kubernetes is an open-source orchestration platform that allows you to manage and scale your containerized workloads. You can run Kubernetes anywhere—on-premises or in a public or hybrid cloud. Kubernetes helps you build scalable services by providing functionalities like declarative configuration, immutable infrastructure, horizontal scaling, load balancing, service discovery, and self-healing systems.

Despite its advantages, a Kubernetes environment has several moving components and introduces monitoring challenges. You’ll need to have complete observability of the services running in the Kubernetes cluster to troubleshoot any performance issues. You’ll also need to track the health of the Kubernetes master and worker nodes. Luckily, the Epsagon interfaces give you the ability to visualize all nodes, pods, containers, and deployment metrics in detail.

In this article, you’ll learn how to efficiently monitor your Kubernetes cluster using the Epsagon platform.


Epsagon provides observability support for your Kubernetes cluster running on Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (EKS), or anywhere else. The integration process is straightforward: Epsagon automatically discovers your workload, sets up a dedicated Prometheus instance for it, and generates the required metrics across your application stack.

There is no need to manually configure dashboards for visualizing your application state. Epsagon provides out-of-the-box dashboards to monitor your Kubernetes cluster and review the real-time state of your nodes, pods, deployments, and container metrics like CPU and memory requests.


It’s critical to have a strategy to monitor your microservices workload efficiently and troubleshoot issues. Epsagon lets you do this with ease by providing access to the three pillars of observability (logs, metrics, and traces) on a single platform, which in turn helps you build complex systems.

Epsagon offers an advanced monitoring solution for applications running on a Kubernetes cluster, either on-premises or in the cloud. Kubernetes allows you to autoscale your applications based on CPU utilization or memory consumption, while the Epsagon platform provides you with a simple view to monitor your application state and autoscaling metrics.
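To make the autoscaling behavior concrete, here is a minimal sketch of the scaling rule that the Kubernetes documentation gives for the Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The function name and the default bounds are assumptions made for illustration, and the sketch leaves out details of the real controller such as its tolerance band and stabilization window.

```python
import math


def desired_replicas(current_replicas: int, current_value: float, target_value: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return ceil(current_replicas * current_value / target_value), clamped to bounds.

    current_value and target_value must use the same unit, for example
    average CPU utilization as a percentage of the requested CPU.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    raw = math.ceil(current_replicas * current_value / target_value)
    # Keep the result inside the configured minimum and maximum (illustrative defaults).
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 4 replicas averaging 90% CPU against a 60% target.
    print(desired_replicas(4, 90, 60))
```

Watching the replica count next to the CPU and memory panels makes it easier to tell whether a scale-up was triggered by real load or by a target that is set too low.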


Epsagon gives you a consolidated platform to help you troubleshoot issues and detect bottlenecks in your Kubernetes cluster. You’ll gain in-depth visibility into application-performance issues and can make informed decisions to optimize your infrastructure. That, in turn, increases your team’s productivity by letting them focus on feature development rather than on maintaining existing applications.

Troubleshooting issues with Epsagon is faster since you have the required tooling in a single platform at your disposal. You can correlate your application logs, metrics, and traces under the same set of pre-configured dashboards—a very helpful capability for development teams since they don’t have to navigate different observability tools to monitor their workloads.


Epsagon features robust real-time alerting to notify teams about issues with their workload, so they don’t need to manually monitor the Kubernetes cluster 24/7. A common alerting scenario in a Kubernetes cluster is CPU or memory utilization on a node or pod exceeding a particular threshold.
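The threshold scenario can be sketched in a few lines of Python. This is only an illustration of the logic, not Epsagon’s alerting API: the `Sample` type, the `breaches` function, and the resource names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Sample:
    resource: str       # node or pod name (hypothetical values in the demo below)
    metric: str         # "cpu" or "memory"
    utilization: float  # fraction of capacity in use, 0.0 to 1.0


def breaches(samples: List[Sample], thresholds: Dict[str, float]) -> List[Sample]:
    """Return the samples whose utilization exceeds the threshold for their metric."""
    return [
        s for s in samples
        if s.metric in thresholds and s.utilization > thresholds[s.metric]
    ]


if __name__ == "__main__":
    samples = [
        Sample("node-a", "cpu", 0.92),
        Sample("node-a", "memory", 0.40),
        Sample("pod-x", "memory", 0.88),
    ]
    for s in breaches(samples, {"cpu": 0.80, "memory": 0.85}):
        print(f"ALERT {s.resource}: {s.metric} at {s.utilization:.0%}")
```

In practice you would usually also require the condition to hold for a few minutes before notifying anyone, so that short spikes don’t page the team.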

You can also use Epsagon’s integrations with a wide range of industry-standard alerting tools, such as Opsgenie, PagerDuty, ServiceNow, Slack, and Microsoft Teams. Once the teams receive alerts, they can use the Epsagon platform to drill into the service and review the logs, metrics, and traces, all on the same dashboard, to quickly troubleshoot and fix any issues.


Epsagon’s Trace Search screen lets you search across all requests in the workload. It is highly customizable: you can search traces by timeframe and by other available filters such as application, duration, HTTP status code, and Kubernetes resources (cluster, node, pod, container, namespace, etc.). Once you run a search, you’ll see the events that match the conditions. Best of all, clicking on any of these events opens a diagrammatic representation showing the details of the payload and the interaction between components.
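Conceptually, a trace search is a conjunction of optional filters: a trace matches when it satisfies every filter you have set. The sketch below shows that idea in plain Python. The `Trace` fields and the `search` function are assumptions chosen to mirror the filters listed above; they are not Epsagon’s data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Trace:
    application: str
    duration_ms: float
    http_status: int
    namespace: str
    pod: str


def search(traces: List[Trace],
           application: Optional[str] = None,
           min_duration_ms: Optional[float] = None,
           http_status: Optional[int] = None,
           namespace: Optional[str] = None) -> List[Trace]:
    """Return traces that satisfy every filter that is not None."""
    results = []
    for t in traces:
        if application is not None and t.application != application:
            continue
        if min_duration_ms is not None and t.duration_ms < min_duration_ms:
            continue
        if http_status is not None and t.http_status != http_status:
            continue
        if namespace is not None and t.namespace != namespace:
            continue
        results.append(t)
    return results


if __name__ == "__main__":
    traces = [
        Trace("checkout", 120.0, 200, "prod", "checkout-1"),
        Trace("checkout", 950.0, 500, "prod", "checkout-2"),
        Trace("cart", 30.0, 200, "staging", "cart-1"),
    ]
    # Slow, failing checkout requests.
    print(search(traces, application="checkout", min_duration_ms=500, http_status=500))
```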

Kubernetes Dashboards

The Kubernetes monitoring screen displays an overview of cluster, node, pod, container, and deployment metrics.


Figure 1: View of Kubernetes cluster metrics



A Kubernetes cluster can have several nodes to run your application workload. Kubernetes schedules pods onto the nodes in a cluster depending on the resources available on each node. The Epsagon dashboard displays node metrics, such as CPU, memory, disk, and network, to help you troubleshoot any performance issues in the cluster.


Figure 2: View of Kubernetes node metrics
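To see why node-level resource numbers matter for scheduling, consider a simplified version of the resource check the scheduler performs: a pod can only be placed on a node whose allocatable CPU and memory, minus what the pods already there have requested, still cover the new pod’s requests. The sketch below models just that one check. The type and function names are hypothetical, and the real scheduler also applies many other filters and scoring rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Resources:
    cpu_millicores: int
    memory_bytes: int


def fits(allocatable: Resources, requested: Resources, pod: Resources) -> bool:
    """True if the pod's requests fit into the node's unrequested capacity."""
    free_cpu = allocatable.cpu_millicores - requested.cpu_millicores
    free_mem = allocatable.memory_bytes - requested.memory_bytes
    return pod.cpu_millicores <= free_cpu and pod.memory_bytes <= free_mem


def feasible_nodes(nodes: Dict[str, Tuple[Resources, Resources]],
                   pod: Resources) -> List[str]:
    """Return the names of nodes (sorted) that can hold the pod.

    Each dict value is (allocatable, already requested) for that node.
    """
    return sorted(
        name for name, (allocatable, requested) in nodes.items()
        if fits(allocatable, requested, pod)
    )


if __name__ == "__main__":
    GIB = 1024 ** 3
    nodes = {
        "node-a": (Resources(4000, 8 * GIB), Resources(3800, 2 * GIB)),
        "node-b": (Resources(2000, 4 * GIB), Resources(500, 1 * GIB)),
    }
    print(feasible_nodes(nodes, Resources(500, 1 * GIB)))
```

When a pod stays in Pending, comparing its requests against the per-node figures in this way is often the quickest path to the cause.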



Pods are the smallest deployable unit in Kubernetes. You can have multiple containers running inside a pod, and they share the same storage and network resources. The Epsagon dashboard lets you easily visualize pod metrics and monitor pod performance. These include the pod’s namespace, along with the deployment and cluster it belongs to; CPU utilization; in-use memory; network traffic, both received and transmitted; disk storage details; and the pod’s state, e.g., Running, Pending, Succeeded, Failed, or CrashLoopBackOff.


Figure 3: View of Kubernetes pods metrics
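As a rough illustration of how the pod states above can be triaged, the following sketch counts pods per state and lists the ones that are not in a healthy state. The pod names and the choice of which states count as healthy are assumptions made for the example. Pending is flagged as well, since a pod that stays in Pending usually needs investigation.

```python
from collections import Counter
from typing import Dict, List, Tuple

# States treated as healthy in this sketch (an assumption, adjust to taste).
HEALTHY_STATES = {"Running", "Succeeded"}


def summarize(pod_states: Dict[str, str]) -> Tuple[Counter, List[str]]:
    """Return (count of pods per state, sorted names of pods needing attention)."""
    counts = Counter(pod_states.values())
    needs_attention = sorted(
        name for name, state in pod_states.items() if state not in HEALTHY_STATES
    )
    return counts, needs_attention


if __name__ == "__main__":
    states = {
        "api-1": "Running",
        "api-2": "CrashLoopBackOff",
        "job-1": "Succeeded",
        "worker-1": "Pending",
    }
    counts, attention = summarize(states)
    print(dict(counts))
    print(attention)
```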



A container separates an application from its underlying host infrastructure and serves as a standalone, lightweight, executable package consisting of the entire runtime environment needed to run the application. In Epsagon’s UI, you can easily inspect the containers running in your cluster by clicking on the “Containers” tab to view details such as container and pod name, namespace, and CPU and memory usage. You can also access “Disk” data, the local ephemeral storage measured in bytes, and “Actions,” i.e., the logs of all running containers. 

In the Containers console, you can view the container logs with a single click and troubleshoot issues in your workload. For example, in scenarios where your container crashes, you will want to access your application logs to identify the root cause.


Figure 4: View of Kubernetes container metrics



In Kubernetes, the deployment controller makes sure that the desired state and actual state are always in sync. The deployment takes care of the management of your Kubernetes resources so that you don’t have to create, update, or delete the pods manually—this will be automatically managed for you.
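The idea of keeping desired and actual state in sync is often described as a reconciliation loop. The sketch below shows one step of such a loop in a deliberately simplified form: it compares a desired replica count with the pods that are currently running and returns the actions needed to converge. The function name and return shape are hypothetical. In a real cluster the deployment controller manages ReplicaSets, which in turn create and delete pods.

```python
from typing import Dict, List, Union


def reconcile(desired_replicas: int,
              running_pods: List[str]) -> Dict[str, Union[int, List[str]]]:
    """Return the actions that bring the actual state to the desired state.

    "create" is the number of pods to start; "delete" lists the pods to stop.
    """
    if desired_replicas < 0:
        raise ValueError("desired_replicas must not be negative")
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        return {"create": diff, "delete": []}
    if diff < 0:
        # Too many pods: remove the surplus, taken from the end of the list.
        return {"create": 0, "delete": running_pods[diff:]}
    return {"create": 0, "delete": []}


if __name__ == "__main__":
    print(reconcile(3, ["web-1"]))
    print(reconcile(1, ["web-1", "web-2", "web-3"]))
```

Running this step repeatedly, each time against the freshly observed state, is what lets the system recover from crashed pods without manual intervention.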

If you want to inspect the deployments in your cluster, click on the “Deployments” tab to view details such as the deployment name, the namespace and name of the cluster where the deployment was performed, and the date the application was created. You can also see the deployment.kubernetes.io/revision annotation under “Observed Generation,” as well as the number of replicas available to users.


Figure 5: View of Kubernetes deployment metrics


Epsagon Dashboards

Epsagon provides a combination of out-of-the-box and custom dashboards that you can leverage to visualize metrics and improve the observability of your services deployed in a Kubernetes cluster. The dashboards are customizable, allowing you to add/remove panels as needed.


Figure 6: Epsagon Dashboards


The Kubernetes Overview dashboard gives you in-depth details of the health of cluster components, like nodes and namespaces. You can view metrics such as Node CPU, Node Memory, Filesystem Usage, and Network I/O. You can also visualize the memory, CPU, and network I/O details at the pod and container level. This Kubernetes dashboard provides the key performance indicators required to monitor and troubleshoot any performance issues with your cluster resources. 


Figure 7: Kubernetes Overview Dashboard


Custom Dashboards

You can also create a custom dashboard from scratch or duplicate an already-created dashboard and modify it. Plus, you can select any available metrics that Epsagon tracks and add them to your custom dashboard or even create your own widget/panel inside it. Epsagon collects a large variety of metrics based on your application data, enabling you to add those metrics as separate panels in your customized dashboards. 


Figure 8: Kubernetes custom dashboards


With Epsagon, you can also make dynamic and interactive dashboards by creating template variables anywhere in the dashboard. For example, you can create a custom dashboard for your microservices and use the application name as a “template” variable. This will ensure that you have a standard dashboard for all of your microservices, which is advantageous from an observability perspective.


Figure 9: Template variables
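The idea behind a template variable is simple substitution: one panel query is written once with a placeholder, and the selected value is filled in at view time. Here is a minimal sketch using Python’s standard `string.Template`. The metric name `http_requests_total`, the `app` label, and the `$application` placeholder syntax are assumptions for illustration and may differ from what your Epsagon dashboards use.

```python
from string import Template

# One panel query shared by every microservice; $application is the variable.
# The metric and label names here are hypothetical.
PANEL_QUERY = Template('sum(rate(http_requests_total{app="$application"}[5m]))')


def render(application: str) -> str:
    """Fill the template variable with a concrete application name."""
    return PANEL_QUERY.substitute(application=application)


if __name__ == "__main__":
    for app in ("checkout", "cart"):
        print(render(app))
```

Because every service is rendered from the same template, the panels stay comparable across services.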


Epsagon Metrics

Epsagon leverages the power of Prometheus so you can customize your monitoring queries. You can build your query by selecting any metrics stored in Prometheus and perform an aggregation over a given timeframe. You can also switch to using PromQL syntax to query the time-series datastore.


Figure 10: View of Epsagon metrics
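If you switch to PromQL, a typical query selects a metric, filters it by labels, and aggregates a rate over a time window. The helper below is a small sketch that assembles such a query string. The function is hypothetical, and the example assumes your cluster exposes the common cAdvisor counter `container_cpu_usage_seconds_total` with a `namespace` label. Note that `rate` only makes sense for counter metrics.

```python
from typing import Dict, Sequence


def build_query(metric: str, labels: Dict[str, str], window: str,
                aggregation: str = "sum", by: Sequence[str] = ()) -> str:
    """Build '<aggregation> by (<labels>) (rate(<metric>{<selector>}[<window>]))'."""
    # Sort the label matchers so the output is deterministic.
    selector = ",".join(f'{key}="{value}"' for key, value in sorted(labels.items()))
    expr = f"rate({metric}{{{selector}}}[{window}])"
    group = f" by ({', '.join(by)})" if by else ""
    return f"{aggregation}{group} ({expr})"


if __name__ == "__main__":
    # Per-pod CPU usage rate over the last five minutes in one namespace.
    print(build_query(
        "container_cpu_usage_seconds_total",
        {"namespace": "default"},
        "5m",
        by=("pod",),
    ))
```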



The ability to view traces, logs, and metrics all in one place without any manual configuration is a powerful feature that makes Epsagon stand out as one of the best tools available in the observability space for monitoring and troubleshooting Kubernetes workloads.

To take the next step, start Epsagon’s 14-day free trial. To learn more about integrating and monitoring your microservices-based environments with Epsagon, check out the Epsagon onboarding documentation.

