Application Programming Interfaces (APIs) give external users a controlled entry point into your internal world, that is, your business logic. The alternative would be to give those users direct access to your code and your system’s internal architecture, which would be a bad idea.

Instead, by abstracting the contact between the outside world and your internal one, you’re also opening the door to a couple of interesting benefits:

  • Your API is the same for all clients, so anyone who needs to build more than one client can reuse code.
  • You’re free to modify the internal business logic of your application without affecting the external API, like rearranging your furniture but keeping the outside of your house the same. The mailman won’t realize anything has changed, and you’ll be happier and more comfortable living inside your updated home.

Now, take this concept and apply the principle of “Separation of concerns,” which loosely states that if a piece of business logic is taking care of more than one concept at a time, it should be separated into multiple entities. By doing this, you end up with the concept of a microservices-based architecture.

Moving to the Cloud

Taking microservices one step further and into the proverbial cloud, you find yourself inside a very powerful architecture. Each service can scale individually depending on demand and internal needs (such as processing power and memory consumption), and the architecture as a whole gains high availability thanks to the automatic response from cloud management services (e.g., if one or more of your microservices crash, new ones can be spawned automatically).

Applications can easily and organically scale by focusing on individual microservices or by only adding new ones when required, without affecting the rest. A properly orchestrated microservices architecture is a real work of art, requiring you to plan ahead and automate a lot. But if you do it right, you’re left with the perfect basis for any application you’re looking to build.

There are many resources out there that focus on the benefits of taking your architecture to the cloud, especially when dealing with serverless platforms. The only real downside to this approach is that because you can potentially end up with so many “moving pieces” interconnected with each other but functioning individually, you need to find a reliable way of monitoring the performance of all these elements.

Lucky for you, in this article, we’ll cover several solutions that do just this.

The Why & What of API Performance Monitoring

The performance of your API is crucial, so it needs a dedicated set of services to correctly translate a set of custom KPIs into numbers that you can understand and that you or your system can act upon.

But which KPIs should you look for when monitoring and measuring your API? Typical metrics include CPU load, memory utilization, and the like, but you need to monitor both the infrastructure and code output to cover all potential problems that might arise. 

Below are some KPIs you’ll want to keep an eye on when measuring your API’s performance:

Requests per Minute

Measuring the number of requests per minute (RPM) means being able to count how many requests are received and served correctly every minute. Remember, your API is normally doing something, and even if that something translates into a “200 OK” message to the client, that message needs to happen.
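As a minimal sketch of that counting step, the snippet below buckets requests by minute. It assumes you already have access-log entries as (timestamp in seconds, HTTP status) pairs, and it treats 2xx and 3xx responses as “served correctly”; both the input shape and the function name are assumptions made for illustration.

```python
from collections import Counter


def requests_per_minute(log_entries):
    """Count correctly served requests per minute.

    `log_entries` is an iterable of (unix_timestamp_seconds, status_code)
    pairs. This input shape is hypothetical; adapt it to your own logs.
    """
    counts = Counter()
    for timestamp, status in log_entries:
        # Assumption: 2xx and 3xx responses count as "served correctly".
        if 200 <= status < 400:
            minute_start = int(timestamp // 60) * 60
            counts[minute_start] += 1
    return dict(counts)


sample = [(0, 200), (15, 200), (59, 500), (60, 200), (61, 204), (119, 301)]
rpm = requests_per_minute(sample)
print(rpm)  # {0: 2, 60: 3}
```

The keys are the start of each minute in seconds, so the failed request at second 59 is left out of the first bucket.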

This KPI can be affected by many things, such as inadequate infrastructure that keeps you from processing requests fast enough, which can lead to a denial of service if you don’t have some sort of buffer structure in place. But there could be other reasons for this issue that lie outside of infrastructure, such as a heavy dependency on external services. For example, take a look at the following diagram:


Figure 1: Classic microservices-based architecture depending on internal and external services

The blue arrows depict the data flow required for the request to reach its final destination, while the red arrows show the path the response has to follow in order to be sent back to the public-facing API.

While there is a lot of communication with internal microservices, it is expected to be fast if they are inside the same internal network. However, sending requests to third-party services outside of your network will incur higher latency, potentially killing your own response times. You need to look at RPM alongside resource consumption because the bottlenecks might not be directly related to resources.
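To see where a request actually spends its time, you could time each downstream call separately. The sketch below is one possible approach, not a feature of any particular tool: the DependencyTimer class and the dependency names are hypothetical, and time.sleep stands in for real network calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class DependencyTimer:
    """Accumulates the time spent waiting on each dependency."""

    def __init__(self, clock=time.perf_counter):
        self._clock = clock  # injectable so the timer can be tested
        self.totals = defaultdict(float)

    @contextmanager
    def track(self, dependency):
        start = self._clock()
        try:
            yield
        finally:
            # Record the elapsed time even if the call raised.
            self.totals[dependency] += self._clock() - start

    def share(self, dependency):
        """Fraction of all tracked time spent on `dependency`."""
        total = sum(self.totals.values())
        return self.totals[dependency] / total if total else 0.0


timer = DependencyTimer()
with timer.track("internal:orders"):
    time.sleep(0.001)  # stand-in for a call inside your network
with timer.track("external:payments"):
    time.sleep(0.02)   # stand-in for a third-party call
print({name: round(seconds, 3) for name, seconds in timer.totals.items()})
```

If one external dependency accounts for most of the tracked time, adding CPU or memory to your own services is unlikely to improve RPM.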

Average and Max Latency

As you just saw, latency can affect the behavior of the entire platform. But if you only look at an aggregated value, as we did above, the actual culprit might never surface.

Aggregating latency across all requests gives more weight to the values that occur most often. So if you have 100,000 read requests per second and only an occasional couple of really heavy write requests, the overall number might not stray far from the typical read latency, let alone reach what you would consider a max.

Essentially, this masks the problem by not drilling hard enough into the data. So instead, try to break these numbers down along several dimensions, such as:

  • By request type
  • By route
  • By the time of day
  • By geography
  • By client type

There could be many other options unique to your platform. Breaking down data this way lets you debug and find the problem much faster.
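Here is a small sketch of that breakdown, using made-up numbers: many fast reads and two very slow writes. The sample fields (method, route, latency_ms) and the latency_summary helper are assumptions for illustration; your monitoring tool will have its own way of grouping data.

```python
from collections import defaultdict
from statistics import mean


def latency_summary(samples, key):
    """Group latency samples by `key` and report count, average, and max."""
    groups = defaultdict(list)
    for sample in samples:
        groups[sample[key]].append(sample["latency_ms"])
    return {
        name: {"count": len(values), "avg_ms": mean(values), "max_ms": max(values)}
        for name, values in groups.items()
    }


# Hypothetical traffic: 1,000 fast reads and 2 very slow writes.
samples = (
    [{"method": "GET", "route": "/items", "latency_ms": 10}] * 1000
    + [{"method": "POST", "route": "/items", "latency_ms": 4000}] * 2
)

overall_avg = mean(s["latency_ms"] for s in samples)
by_method = latency_summary(samples, "method")

print(round(overall_avg, 1))  # roughly 18 ms, which looks healthy
print(by_method["POST"])      # the slow writes stand out once split out
```

The overall average barely moves, while the per-method view shows the 4,000 ms writes right away. The same helper can group by "route" or any other field you record.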

Errors per Minute Using HTTP Code from the Response

Unlike RPM, which gives you a measure of the overall health of your API, errors per minute (EPM) gives you a general idea of how buggy and error-prone it is. You would normally want to track any 4xx or 5xx response codes since everything else is potentially a correct response.

That being said, this also implies you’re making proper use of the HTTP response codes; if you aren’t, you might want to start thinking about them. The key takeaways here should be:

  • A higher number of 4xx errors suggests your users don’t know how to use your API. So consider reviewing the documentation or providing better error messages.
  • A higher count of 5xx errors suggests your code is failing. There are many reasons why this could be happening, and following a disaggregation strategy, as with latency above, might help you to determine where the errors lie.
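The two takeaways above suggest counting client errors and server errors separately. A minimal sketch, assuming the same hypothetical (timestamp in seconds, HTTP status) log entries you would use for RPM:

```python
from collections import defaultdict


def errors_per_minute(log_entries):
    """Count client (4xx) and server (5xx) errors per minute.

    `log_entries` is an iterable of (unix_timestamp_seconds, status_code)
    pairs; the shape is an assumption for this example. Minutes without
    any errors do not appear in the result.
    """
    buckets = defaultdict(lambda: {"4xx": 0, "5xx": 0})
    for timestamp, status in log_entries:
        minute_start = int(timestamp // 60) * 60
        if 400 <= status < 500:
            buckets[minute_start]["4xx"] += 1
        elif 500 <= status < 600:
            buckets[minute_start]["5xx"] += 1
    return dict(buckets)


sample = [(5, 200), (10, 404), (20, 400), (50, 500), (70, 503), (80, 200)]
epm = errors_per_minute(sample)
print(epm)
```

Keeping the two classes apart tells you whether to look at your documentation and error messages first or at your own code.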

Optimizing Your API, What Can You Improve?

There are many ways for you to optimize your API, but they really depend on the output you’re getting from your monitoring service and the real root cause of the detected problems.

It’s important to understand that optimizing for the sake of optimizing is normally not recommended because you might end up affecting sections of your platform that were already working at peak performance. You should first measure and understand where the problem lies and only then start the optimization process.

Some typical ways you’d want to optimize your API are:

  • Moving to a serverless architecture: Having less infrastructure to manage normally makes everything easier. True, you might need to restructure and refactor most of your code to make it work, so only do it if there is no simpler choice. But if your issues come from a hard-to-scale infrastructure, this move could solve many of your performance problems.
  • Batching requests: Letting clients send several requests in a single call reduces the number of incoming requests, although each batched call will probably take longer to respond. If you manage to keep that response time below the total time the same requests would take when sent individually, then it’s definitely a win.
  • Consider using external services: This might seem contrary to what you would expect, but many companies already provide APIs focused on performing a single task, such as Auth0 when you need a secure login service or Stripe if you need to deal with money. In these situations, using external services that have already been optimized and have a great deal of support behind them could benefit your own metrics (such as reducing EPM).
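To make the batching idea more concrete, here is a rough sketch of a batch handler. The payload shape (a list of items with id, op, and params), the handle_batch function, and the get_user operation are all hypothetical; a real API would define its own contract.

```python
def handle_batch(batch, handlers):
    """Run every sub-request in `batch` and return one result per item.

    Each item gets its own status, so one failing sub-request does not
    fail the whole batch.
    """
    responses = []
    for item in batch:
        handler = handlers.get(item.get("op"))
        if handler is None:
            responses.append(
                {"id": item.get("id"), "status": 400, "error": "unknown operation"}
            )
            continue
        try:
            body = handler(item.get("params", {}))
            responses.append({"id": item.get("id"), "status": 200, "body": body})
        except Exception as exc:  # report the failure for this item only
            responses.append({"id": item.get("id"), "status": 500, "error": str(exc)})
    return responses


# Hypothetical operation table and batch payload.
handlers = {"get_user": lambda params: {"user_id": params["user_id"]}}
batch = [
    {"id": 1, "op": "get_user", "params": {"user_id": 7}},
    {"id": 2, "op": "delete_everything"},
    {"id": 3, "op": "get_user", "params": {}},
]
results = handle_batch(batch, handlers)
print([r["status"] for r in results])  # [200, 400, 500]
```

Returning a status per item also keeps your EPM numbers meaningful, since you can still count individual failures inside a batch.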

There is one key thing to remember when trying to optimize your APIs: It’s not all about performance; in fact, it should be about the KPIs you defined above. So maybe you don’t need to make your API work faster, but instead need to make it more reliable. In that situation, an increase in latency might be required to achieve a considerable decrease in errors per minute. 
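One way this trade-off can show up is retrying a failed call before giving up. The sketch below is only an illustration and is not taken from any specific tool: call_with_retries and its parameters are hypothetical, and the sleep function is injectable so the delays are easy to inspect.

```python
import time


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call `operation`, retrying with exponential backoff on failure.

    Each retry adds latency (base_delay, then 2x, then 4x, ...), but a
    transient failure no longer has to surface as an error to the client.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: let the error through
            sleep(base_delay * (2 ** attempt))


# Hypothetical flaky dependency: fails twice, then succeeds.
calls = {"count": 0}

def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

delays = []
result = call_with_retries(flaky, sleep=delays.append)
print(result, delays)  # ok [0.1, 0.2]
```

The client waits a little longer for this response, but the request that would have counted toward your EPM now succeeds.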

At the end of the day, this is a balancing act, where all concurrent factors must be taken into consideration. 

Tools You Can Use to Monitor Your API Performance

When it comes to performance monitoring, you have many options, so let’s review a few here.

Cloud Provider Tools

These are usually your first option because if you’re already deploying in a cloud environment, you normally have some kind of monitoring tool at your disposal:

  • AWS provides you with CloudWatch, which can gather information from their API Gateway and transform it into metrics for you.
  • Azure offers Azure Monitor. You can connect all of your applications and services (including your APIs) to it, and it’ll act as a centralized and generic solution for monitoring on its own platform.
  • GCP gives you a monitoring dashboard out of the box as well. It can receive information from all APIs you set up in order to centralize and report on three key metrics: overall traffic, error rate, and latency. While you don’t have a lot of freedom regarding the metrics used, you do have the ability to create the type of visualization you want for all available data. 

Cloud provider tools are a great way to get started with API monitoring simply because they’re already there and normally add little or no extra cost. However, if you’re looking for more detail, you’ll have to look beyond the Big 3.

Open-Source Solutions

The open-source community has produced many solutions when it comes to monitoring, although not all of them are up to the standards you’d want. Some also have no support team, which means you’re on your own if you run into any trouble.

However, there are two options that are under active development and come with an active support team: Prometheus and OpenTSDB.


Prometheus is one of the top open-source solutions used when it comes to monitoring cloud solutions. It provides a time-series approach for representing and monitoring key metrics and comes with an integrated querying language (PromQL) that allows you to filter and gather more specific statistics about the data. Plus, Prometheus has an integrated alerting system, capable of making smart decisions and aggregating alerts when required. 

The best part is that it has support for most of the common programming languages through existing integration libraries, including Go, Java, Ruby, C, C++, R, Rust, and many more. Prometheus is a valid option when you need to instrument monitoring into a very customized environment.


OpenTSDB is a scalable time-series database, a great approach for tackling metric storage. It runs on top of HBase and Hadoop, which means it can scale on very cheap hardware; you can also go with Google Cloud Bigtable for something even more scalable and reliable. That’s right, it supports Bigtable as a storage backend, which is great if you’re already using Google’s cloud.

The only downside to this database is that it doesn’t have an alerting system, so it’s more of a storage layer for your metrics, requiring you to hook something else on top of it to add an extra layer for visualization and alerts.


Finally, leaving the open-source realm and entering the land of paid products, you have Epsagon. After a 14-day free trial, you are charged a fee, but the product’s level of integration, coupled with personalized features that are directly focused on monitoring cloud-based solutions, makes it worth every penny.

With Epsagon’s service, you’ll be able to auto-discover your tech stack and orchestrate it without having to write a single line of code. You also get an advanced UI, which allows you to trace requests to the level of granularity you need and then correlate all KPIs to understand how one affects the other.

Like the previous solutions, Epsagon has out-of-the-box integration with most common technologies, including Python and JavaScript, and with other platforms, such as JIRA, AWS, Azure, and more. 

Out of the solutions covered here, Epsagon is the one that will most likely give you the most bang for your buck.

Closing Words

API monitoring is not an optional practice if you’re trying to deliver a production-ready and highly available service. In fact, it is a must. Because of this, you need to understand what to look for when selecting the right fit for your organization’s needs. 

This means you need to define your own KPIs, make sure your chosen product has support for them, and finally consider how you can automate your actions based on them.

If you don’t have the budget for monitoring just yet, go with solutions from your current cloud provider since they’ll give you quality tools that require very little configuration. Then, make the jump to paid solutions, such as Epsagon, once your budget allows for it. Only go for open-source options when you have a team of experts that can spend time to understand, review, and configure these tools.
