Monitoring is a crucial component of modern distributed applications. It helps administrators stay up-to-date with the current state and performance of their applications, as well as with their application infrastructure and environments. Integrating a monitoring pipeline is a major requirement for cloud-native applications that run in complex and dynamic clusters with a lot of moving parts affecting application availability and performance.

This article is the first part of our two-part series about Prometheus—a popular monitoring solution for cloud-native applications. In this article, we’ll introduce you to Prometheus’ architecture and key features, discuss some basic use cases, and compare Prometheus to other monitoring alternatives such as Thanos, Grafana Cloud Agent, Victoria Metrics, and Cortex. 

What Is Prometheus?

Prometheus is an open-source monitoring solution designed for cloud-native applications. Due to its built-in integration with containers, Prometheus is one of the most popular monitoring agents for Kubernetes.

Key benefits offered by Prometheus for cloud-native applications are the following:

  • Multidimensional data model: Time series identified by a metric name and a set of key/value labels
  • PromQL: A native domain-specific language (DSL) for querying multidimensional data
  • Multiple metric types: Counters for monotonically increasing values, gauges for single numerical values that can go up or down, histograms, and summaries
  • Automatic service discovery: Detection of cloud-native microservices and containers based on a user-provided configuration, and scraping of metrics from them
  • Alert management: Alerting rules that fire in response to certain metric conditions (e.g., low storage, low memory, network problems), with notifications handled by the Alertmanager component

Architecture and Key Components

Prometheus is composed of multiple components implemented as microservices, which usually run in the same environment as the monitoring targets. 

The Prometheus server handles all interactions with monitoring targets: it discovers services, scrapes their metrics, and persists the metrics in storage. It also acts as the API server for client requests. Short-lived jobs that cannot be scraped directly can push their metrics to the Pushgateway, which the server then scrapes like any other target.

Metrics scraped by the Prometheus server are stored on a local on-disk time series database in a storage-efficient format. One limitation here is that data persistence is linked to the lifecycle of the Prometheus server instance, as no remote storage is configured by default.

Prometheus expects applications to expose metrics in the Prometheus format, which can be a challenge for developers looking to ship metrics to the Prometheus monitoring pipeline. Fortunately, Prometheus has a well-developed ecosystem with dozens of client libraries that make it easier to add Prometheus metrics to application code. Official client libraries exist for Go, Ruby, Java, Python, and Scala, and there are many unofficial ones as well. The Prometheus architecture also includes special-purpose exporters for shipping metrics from various monitoring targets like HAProxy, StatsD, and Graphite.


Figure 1: Prometheus architecture


Prometheus Metrics Format

Prometheus is designed for time series data identified by metric name and key/value pairs:

<metric name>{<label name>=<label value>, …}

For example, this format lets you represent the CPU utilization rate for the user space and kernel space in Linux as cpu_utilization{space="user"} and cpu_utilization{space="kernel"}.

Prometheus allows multiple key/value pairs (labels) on one metric, which is very useful for high-dimensional metrics data. For example, to track HTTP requests by method and path, you can define the metric as http_requests{method="POST", path="assets"}.
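To make the notation concrete, here is a minimal Python sketch that renders a single sample in this format. The helper names (format_sample, escape_label_value) and the sample value are hypothetical and chosen only for illustration; in practice, an official client library does this rendering for you.

```python
def escape_label_value(value):
    """Escape backslashes, double quotes, and newlines inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_sample(name, labels, value):
    """Render one sample as <metric name>{<label name>="<label value>",...} <value>."""
    if not labels:
        return f"{name} {value}"
    pairs = ",".join(
        f'{key}="{escape_label_value(val)}"' for key, val in sorted(labels.items())
    )
    return f"{name}{{{pairs}}} {value}"


# Illustrative value only; a real application would report its own measurements.
print(format_sample("http_requests", {"method": "POST", "path": "assets"}, 1027))
```

Each distinct combination of label values produces its own time series, so the POST and GET requests for the same path are stored and queried separately.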

Metrics scraped in the format above can then be queried and aggregated using PromQL, which has some powerful querying features. These include range vectors, which let you query historical metrics within a user-provided time frame, and regular-expression matchers, which let you select series whose label values match a regex.
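As a rough illustration of regex-based selection, the Python sketch below emulates what a PromQL selector such as http_requests{method=~"GET|PUT"} does. The in-memory SERIES list and the select helper are hypothetical, and Python's re module only approximates the regex dialect Prometheus uses. The sketch does reproduce two PromQL behaviors: regex matchers are fully anchored, and a missing label is treated as an empty string.

```python
import re

# Hypothetical in-memory series: (metric name, labels, latest value).
SERIES = [
    ("http_requests", {"method": "POST", "path": "assets"}, 12.0),
    ("http_requests", {"method": "GET", "path": "assets"}, 40.0),
    ("http_requests", {"method": "GET", "path": "api/users"}, 7.0),
]


def select(series, name, label, pattern):
    """Emulate the selector name{label=~"pattern"} over the given series."""
    regex = re.compile(pattern)
    return [
        (labels, value)
        for metric, labels, value in series
        # fullmatch mirrors PromQL's anchoring: the whole label value must match.
        if metric == name and regex.fullmatch(labels.get(label, ""))
    ]


get_or_put = select(SERIES, "http_requests", "method", "GET|PUT")
api_paths = select(SERIES, "http_requests", "path", "api/.*")
```

Because of the anchoring, the pattern "ass" would not match the path "assets"; you would need "ass.*" instead.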


Prometheus Metrics Scraping

Prometheus uses a pull model to collect metrics over HTTP. At a user-defined scrape interval, it sends requests to the /metrics endpoint exposed by each application. Developers who want to ship application metrics to the Prometheus server need to perform the following steps:

  • Expose a /metrics HTTP endpoint from the application, for example through an embedded web server.
  • Convert application metrics into the Prometheus format using the client library for the programming language in which their application is written.
  • Configure Prometheus to scrape metrics from the /metrics endpoint.
    If you want to scrape metrics from Kubernetes or other providers that already expose Prometheus-format metrics, you only need to define these targets in the configuration file; everything else is already configured.
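The first two steps can be sketched with nothing but the Python standard library. This is a simplified illustration, not a replacement for an official client library: the REQUEST_COUNTS data, the http_requests_total metric name, and the serve helper with its default port are assumptions made for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical application state; a real application would update these
# counters while handling traffic.
REQUEST_COUNTS = {("POST", "assets"): 3, ("GET", "assets"): 11}


def render_metrics():
    """Render the counters in the Prometheus text exposition format."""
    lines = [
        "# HELP http_requests_total Total HTTP requests handled.",
        "# TYPE http_requests_total counter",
    ]
    for (method, path), count in sorted(REQUEST_COUNTS.items()):
        lines.append(f'http_requests_total{{method="{method}",path="{path}"}} {count}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only the /metrics path is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics().encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, answering scrapes on the given port (an assumed default)."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling serve() starts the endpoint. The third step is then a matter of adding the application's host and port as a target in the Prometheus configuration file.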

When to Use Prometheus?

Prometheus is a great tool for recording purely numeric data that can be represented as a time series. You can implement it for machine-centric monitoring or for more dynamic service-oriented architectures such as Kubernetes. 

However, out of the box, Prometheus has limitations in large-scale production deployments because it is not configured by default for high availability, scalability, or multi-cluster metric scraping. Running Prometheus in production may also require provisioning persistent cloud storage, because the data held by a standalone Prometheus server is tied to the machine on which it runs.

Finally, according to the official Prometheus documentation, Prometheus is not ideal if you need the collected data to be 100% accurate. For example, due to potential outages, data for per-request billing may contain gaps and may not be as detailed as required. 

Running Prometheus on Kubernetes

You can deploy Prometheus on Kubernetes manually or with an automation solution such as the Prometheus Operator, which automates the deployment and lifecycle management of Prometheus monitoring pipelines.

Generally, the manual deployment of Prometheus involves the following steps:

  • Create and deploy a ConfigMap for Prometheus jobs and scraping targets. You can configure Kubernetes targets via the kubernetes_sd_configs setting. This built-in service discovery mechanism uses the Kubernetes REST API to find scrape targets such as nodes, services, endpoints, and ingresses.
  • Deploy the Prometheus containers using a StatefulSet.
  • Configure other Kubernetes resources such as RBAC and Services.

For both deployment options, you’ll also need to ship Prometheus-format metrics from your application unless you just want to monitor Kubernetes and containers.

Prometheus Alternatives

Several software products build on top of or extend Prometheus with new features that enable a production-grade monitoring pipeline. Let’s briefly describe these alternatives. 


Thanos

Thanos is a Prometheus-based monitoring solution that enables long-term storage and data retention, high availability, and scalability of Prometheus deployments. For object storage, it supports GCP, S3, Azure, Swift, and Tencent COS.

To enable long-term data storage for Prometheus metrics, Thanos builds on the native Prometheus 2.0 storage engine, which periodically writes immutable data blocks that each cover a fixed time range. Thanos uploads these blocks to the configured object storage. Retrieval from object storage is managed by the Thanos store nodes, which synchronize metrics data and translate client queries into object storage requests.

Store nodes filter relevant blocks by their metadata and cache frequent lookups, optimizing query performance. Also, Thanos supports compaction and downsampling of historical data to speed queries up considerably.
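The idea behind downsampling can be shown with a short Python sketch: raw samples are grouped into fixed windows, and each window is reduced to a handful of aggregates. This is only a conceptual illustration and not how Thanos is implemented. The downsample function, the 300-second window, and the choice of aggregates are assumptions made for the example.

```python
from collections import defaultdict


def downsample(samples, window_seconds=300):
    """Collapse (timestamp, value) samples into one aggregate per fixed window."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Align each sample to the start of the window that contains it.
        window_start = timestamp - (timestamp % window_seconds)
        buckets[window_start].append(value)
    return {
        start: {
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
        }
        for start, values in sorted(buckets.items())
    }


# Hypothetical raw samples: (Unix timestamp in seconds, value).
raw = [(0, 1.0), (60, 3.0), (299, 2.0), (300, 5.0)]
aggregated = downsample(raw)
```

A query over months of history then reads one aggregate per window instead of every raw sample, which is where the speedup comes from.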

Finally, Thanos lets you query metrics across multiple Prometheus servers and clusters, allowing you to scale your Prometheus deployments. 

Grafana Cloud Agent

Grafana is a popular data visualization and analytics tool that’s widely used to compose observability dashboards and apply statistical measures to metrics ingested by Prometheus. 

Grafana Cloud Agent (GCA) ingests and ships metrics to Grafana Cloud; it is seen by some as the lightweight alternative to Prometheus that leverages the most relevant parts of the Prometheus code (service discovery, scraping, write-ahead log, remote writes, etc.) and is tuned for Grafana Cloud and Grafana Enterprise Stack. 

The agent lets you send Prometheus metrics to any system that supports Prometheus remote writes, and it aims to improve the performance and memory footprint of your monitoring pipeline. GCA reportedly uses about 40% less memory than Prometheus at equal scrape loads.


Cortex

Cortex is a time series database and monitoring solution based on Prometheus whose key innovation is adding horizontal scaling and long-term data retention on cloud storage. Cortex uses cloud-native storage like AWS S3 or DynamoDB to store metrics indefinitely. Other important features include:

  • Faster PromQL queries: Better performance is achieved through caching and intensive parallelization. 
  • Global view of historical data: Aggregated metrics from multiple clusters and servers can be accessed and used to generate insights from long-term historical data.
  • Horizontal scalability: The Cortex cluster can ingest metrics from multiple Prometheus servers.
  • High availability: Data replication between machines helps you survive machine failures without any gaps in your data.

Victoria Metrics

Victoria Metrics is a Prometheus-compatible cross-protocol monitoring solution that offers long-term metrics storage, multiple data-source integration, scalability of your metrics pipeline, and improved performance.

In a typical Prometheus setting on Kubernetes, the environment from which metrics are ingested is limited to a single cluster. Victoria Metrics provides a global query view that integrates multiple Prometheus instances and clusters. It also implements its own query language (MetricsQL) that is backward-compatible with PromQL. In addition, Victoria Metrics accepts data over the ingestion protocols used by InfluxDB, OpenTSDB, and Graphite, and it supports arbitrary data in CSV format, JSON Lines format, and native binary format.

Various low-level optimizations implemented in Victoria Metrics aim at high performance and storage efficiency. Its developers report that Victoria Metrics uses up to 7x less RAM than Prometheus and occupies up to 7x less storage space.


Conclusion

Prometheus is a great monitoring solution for cloud-native applications with a powerful query language and advanced service discovery. Tools like Prometheus Operator facilitate fast and seamless deployment as well as the management of Prometheus on Kubernetes. Although Prometheus can serve as the basis of a production-grade monitoring pipeline, it does not provide scalability, long-term metrics persistence, or cross-cluster aggregation out of the box, so you should plan for these aspects.

In the second part of our Prometheus series, we’ll focus on some concrete examples of using Prometheus for scraping metrics and creating and shipping custom application metrics. We’ll also discuss advanced features like pushing metrics via the Pushgateway and leveraging a remote-write API with Epsagon and Prometheus.

