Microservices, by nature, have many moving parts, so a microservices-based application has many potential points of failure and many possible causes of performance issues. Even when your microservices communicate efficiently with one another, the limitations of a third-party service or API can still cause significant problems for the application. Finding performance issues in your application is therefore a complicated task.
Detecting Performance Issues with Epsagon
Epsagon gives users several ways to visualize and detect performance issues within serverless applications. The issues can be at the serverless function level, the trace level, or the infrastructure level. Let us see how users can detect them.
1. Service Maps
Service maps are a visual representation of the entire application. In Epsagon service maps, the arrows between microservices depict the dependencies and the latency between them.
Clicking on a microservice with higher latency opens a side panel that shows the average duration and the duration breakdown by operation. This information is useful for spotting latency anomalies.
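To make the "duration breakdown by operation" idea concrete, here is a minimal sketch of how such a breakdown could be computed from raw span data. It is not Epsagon's implementation or API; the span tuples, service names, and the `duration_breakdown` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical span records: (service, operation, duration in milliseconds).
spans = [
    ("orders", "GET /orders", 120.0),
    ("orders", "GET /orders", 180.0),
    ("orders", "POST /orders", 400.0),
    ("payments", "POST /charge", 250.0),
]


def duration_breakdown(spans, service):
    """Return {operation: average duration in ms} for one service."""
    totals = defaultdict(lambda: [0.0, 0])  # operation -> [sum, count]
    for svc, operation, duration_ms in spans:
        if svc == service:
            totals[operation][0] += duration_ms
            totals[operation][1] += 1
    return {op: total / count for op, (total, count) in totals.items()}


breakdown = duration_breakdown(spans, "orders")
```

An operation whose average stands well above the others for the same service is the kind of anomaly the side panel is meant to surface.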
2. Trace Search
The Traces screen is another way to detect performance issues. Using the filter “duration” > X, users can list the traces that take longer than X seconds.
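The same duration filter can be expressed in a few lines of code. The sketch below assumes traces have been exported as plain dictionaries with a `duration_s` field; that field name, the sample data, and the `slow_traces` helper are assumptions for illustration, not part of Epsagon's API.

```python
# Hypothetical exported traces; "duration_s" is an assumed field name.
traces = [
    {"id": "t1", "duration_s": 0.4},
    {"id": "t2", "duration_s": 3.2},
    {"id": "t3", "duration_s": 1.7},
]


def slow_traces(traces, threshold_s):
    """Return traces slower than threshold_s, slowest first."""
    matching = [t for t in traces if t["duration_s"] > threshold_s]
    return sorted(matching, key=lambda t: t["duration_s"], reverse=True)


slow = slow_traces(traces, threshold_s=1.0)
```

Sorting slowest-first puts the worst offenders at the top, which is usually where investigation should start.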
3. Functions Screen
Specifically for Lambda functions, users can use the “Average Duration” column in the Functions screen to see which of their functions are problematic.
Epsagon provides out-of-the-box (OOTB) dashboards to help users monitor different metrics. For latency issues specifically, users can use the Kubernetes Overview dashboard to understand latency at the node, pod, or container level.
Users can also use the Application Overview dashboard to understand latency issues at the application level, or use the many OOTB dashboards for AWS and open-source services to understand latency issues within individual services. An example of the AWS RDS dashboard is shown in Figure 7.
Users can also create their own custom dashboards for specific performance monitoring tasks. With this range of options, users are far less likely to miss a performance issue.
Finally, users can configure alerts on the latency-related metrics that they are interested in. With Epsagon, users can create alerts based on Lambda metrics, traces, and Prometheus and AWS CloudWatch metrics.
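Conceptually, a latency alert is a threshold rule evaluated over a window of recent measurements. The following is a minimal sketch of such a rule, not Epsagon's alerting engine; the `should_alert` function, its parameters, and the sample values are hypothetical.

```python
from statistics import mean


def should_alert(samples_ms, threshold_ms, min_samples=3):
    """Fire when the average latency of the window exceeds the threshold."""
    if len(samples_ms) < min_samples:
        return False  # too little data to judge; avoids alerting on one outlier
    return mean(samples_ms) > threshold_ms


# Hypothetical window of recent latency measurements, in milliseconds.
recent_latencies_ms = [900, 1200, 1500]
fire = should_alert(recent_latencies_ms, threshold_ms=1000)
```

Requiring a minimum number of samples is one simple way to keep a single slow request from paging someone.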
Troubleshooting Latency Issues using Epsagon
Epsagon helps users reduce their mean time to detection and resolution (MTTD/R). To troubleshoot performance issues, users can use the Trace Search screen and compare traces to find the root cause. With the timeline view, users can identify which services are taking longer than usual. Epsagon also correlates metrics, logs, and traces, so users can jump from the trace search straight to the logs to find out why a particular service took a long time.
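Comparing a slow trace against a typical one boils down to checking, service by service, where the extra time went. Here is a minimal sketch of that comparison, assuming each trace has been reduced to a mapping from service name to time spent; the data, the `slower_services` helper, and the 2x factor are illustrative assumptions rather than anything Epsagon prescribes.

```python
# Hypothetical per-service durations (ms) for a typical and a slow trace.
baseline = {"api-gateway": 20.0, "orders": 80.0, "payments": 150.0}
suspect = {"api-gateway": 22.0, "orders": 85.0, "payments": 900.0}


def slower_services(baseline, suspect, factor=2.0):
    """Return {service: (baseline_ms, suspect_ms)} for services whose time
    in the suspect trace is more than `factor` times the baseline."""
    return {
        service: (baseline[service], duration_ms)
        for service, duration_ms in suspect.items()
        if service in baseline and duration_ms > factor * baseline[service]
    }


culprits = slower_services(baseline, suspect)
```

In this example only `payments` is flagged, so its logs would be the next place to look.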
Performance issues can lead to a horrible customer experience, and detecting and troubleshooting them in microservices-based environments can be very tricky. Because microservices have many moving parts, latency issues can arise in any of those parts, so it is important to have different views for detecting them. Epsagon gives users many such views so that they don’t miss latency issues. With a powerful troubleshooting toolset, Epsagon users can quickly root-cause performance issues and meet their SLAs and SLOs with confidence.