Creating applications, or even platforms, that scale with the traffic you receive (whatever that traffic looks like) is not as easy as slapping a few microservices together and hoping for the best. A successfully scalable architecture depends on several practices and complementary activities.
In this article, we’ll cover one of these activities: the monitoring and tracing of AWS AppSync APIs that serve content from multiple data sources.
A Quick Intro to GraphQL and AppSync
GraphQL is a query language for APIs that provides a more natural way of gathering data, especially compared to the previous alternatives (REST and SOAP). It gives consumers the ability to specifically request the data they need and nothing more (i.e., they can specify the fields they need and avoid having to parse 30+ fields of a record just to get an ID).
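To make the field selection concrete, here is a minimal sketch of a GraphQL request built in Python. The `getUser` query, its fields, and the `id` variable are hypothetical and do not come from any real schema. The only convention assumed is the common one of sending GraphQL over HTTP as a JSON body with `query` and `variables` keys.

```python
import json

# Hypothetical query: ask for the ID and name only, even if the
# underlying record has dozens of other fields.
GET_USER_QUERY = """
query GetUser($id: ID!) {
  getUser(id: $id) {
    id
    name
  }
}
"""


def build_graphql_request(query: str, variables: dict) -> str:
    """Serialize a GraphQL query and its variables into a JSON request body."""
    return json.dumps({"query": query, "variables": variables})


# The resulting string is what a client would POST to the GraphQL endpoint.
body = build_graphql_request(GET_USER_QUERY, {"id": "42"})
```

The server answers with only the fields named in the selection set, so the client never has to sift through the rest of the record.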
If you’re working with APIs serving content from multiple data sources, AppSync will help you simplify the task of gathering and serving this content.
AppSync is a managed service from AWS that provides a GraphQL interface, allowing clients to query and retrieve data coming from multiple sources. Because it is managed, it takes care of much of the legwork required to scale such an application or make it highly available.
Aside from a GraphQL interface, AppSync also features resolvers for many common cases. Resolvers are the code that connects each field of a query to its data source, which you would otherwise have to write yourself on a self-hosted GraphQL server.
This essentially means you can create a microservices-based architecture to serve up data from multiple sources with what is now a pretty much standard API in just a few clicks.
Monitoring Your App
Even applications based on managed resources require monitoring to understand why they fail and why they behave the way they do. With a proper monitoring setup, you’ll gain insight into the performance of your services, their resource consumption, and actions you can take to prevent them from failing.
Another benefit of monitoring is that you’ll generate enough data over time to perform other types of analysis, such as failure prediction.
AWS X-Ray & CloudWatch
AWS offers two services that cover these needs. X-Ray allows you to understand what happens inside your distributed platform through a detailed analysis of the requests going across it. CloudWatch allows you to centralize the performance data coming from your different services into a single location, which, in turn, lets you visualize, compare, and even set up automated actions based on the values received.
With these two tools, you can pretty much get the details of everything happening on your platform from a single place and react accordingly, making the most informed decisions possible and without any fear of missing some important context.
This is possible because CloudWatch can receive information from more than 70 different AWS services out of the box and show it all in the same place. Aside from service-specific metrics, you can also make CloudWatch your centralized logging repository by sending every log your services write into it. This way, you can monitor that information as well and even manually query it when trying to troubleshoot a problem.
Monitoring AppSync Apps with X-Ray & CloudWatch
Now that you understand what monitoring a microservices-based architecture is all about, let’s go back to our use case, AppSync. AppSync is essentially a set of microservices pulling data from different sources and serving it through a GraphQL interface.
So how can you monitor a managed application like this? In fact, the first question you need to ask yourself is: What exactly do I want to monitor? After all, a managed architecture doesn’t really have any infrastructure for you to monitor. But you do need to focus on performance and the actual execution time of your queries, as that’s all your application is doing: serving data through queries.
To achieve this, you need to enable logging on your AppSync API; when you do, make sure you create (or select) an IAM role that allows AppSync to write to CloudWatch Logs. AppSync references that role by its ARN. And it’s that easy! You can now receive data from AppSync inside CloudWatch, although you still have to make sense of it all.
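If you prefer to script this step instead of using the console, the following is a minimal sketch using boto3. The API ID and role ARN you pass in are placeholders for your own values, and the helper names are made up for this example. It assumes the role already exists and only handles the long-standing `NONE`, `ERROR`, and `ALL` log levels.

```python
def build_log_config(role_arn: str, level: str = "ERROR") -> dict:
    """Build the logConfig structure passed to AppSync's update call."""
    if level not in {"NONE", "ERROR", "ALL"}:
        raise ValueError(f"unsupported field log level: {level}")
    return {
        "fieldLogLevel": level,
        "cloudWatchLogsRoleArn": role_arn,
        # Leave headers and mapping-template details out of the logs.
        "excludeVerboseContent": True,
    }


def enable_logging(api_id: str, role_arn: str, level: str = "ERROR") -> None:
    """Turn on CloudWatch logging for an existing AppSync API."""
    import boto3  # imported lazily so build_log_config works without AWS access

    client = boto3.client("appsync")
    api = client.get_graphql_api(apiId=api_id)["graphqlApi"]
    # Assumption: the API uses a simple auth mode. APIs with user pool,
    # OpenID Connect, or additional auth providers need those settings
    # passed along too, or the update may fail or drop them.
    client.update_graphql_api(
        apiId=api_id,
        name=api["name"],
        authenticationType=api["authenticationType"],
        logConfig=build_log_config(role_arn, level),
    )
```

Starting with `ERROR` keeps log volume low; `ALL` is more useful while you are actively troubleshooting, at the cost of more log data.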
Here are some of the metrics you have access to and can utilize to monitor your application:
- HTTP Error Codes: These are caused by either a problem with the request itself (e.g., a malformed query) or by an error during the execution of the query (e.g., an incorrectly set schema).
- Latency: The amount of time it takes for a query to be resolved, from the moment it hits AppSync to the moment the data is returned to the client.
- Real-time subscription metrics: I encourage you to read the full documentation for this category, but these metrics are related to connections and disconnects (both successful and unsuccessful ones) to your API as well as active connections at any given time.
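You can also pull these metrics programmatically. The sketch below assumes the metrics live in the `AWS/AppSync` namespace under names such as `Latency`, `4XXError`, and `5XXError`, keyed by a `GraphQLAPIId` dimension. The API ID is a placeholder and the helper names are made up for this example.

```python
from datetime import datetime, timedelta, timezone


def build_metric_request(api_id, metric_name, statistic, hours=1, now=None):
    """Build the arguments for CloudWatch's get_metric_statistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/AppSync",
        "MetricName": metric_name,
        "Dimensions": [{"Name": "GraphQLAPIId", "Value": api_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # one datapoint per five minutes
        "Statistics": [statistic],
    }


def fetch_datapoints(api_id, metric_name, statistic, hours=1):
    """Return the datapoints for one metric, oldest first."""
    import boto3  # imported lazily so the builder works without AWS access

    cloudwatch = boto3.client("cloudwatch")
    request = build_metric_request(api_id, metric_name, statistic, hours)
    response = cloudwatch.get_metric_statistics(**request)
    return sorted(response["Datapoints"], key=lambda point: point["Timestamp"])
```

For example, `fetch_datapoints("example-api-id", "Latency", "Average")` would return average latency in five-minute buckets, while `fetch_datapoints("example-api-id", "5XXError", "Sum")` would count server-side errors over the same window.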
If you want to take your monitoring to the next level, you can also add X-Ray to the mix; as before, you just have to go to your AppSync’s settings and turn on “Enable X-Ray.”
With X-Ray enabled, you’ll get a new level of insight into what goes on under the hood every time your API receives a request. So, when things don’t look right on CloudWatch, you can go into X-Ray, sample your requests, and understand the latency between the different stages of the queries. For example, a query that maps to a DynamoDB data source will give you details about things like:
- The time it takes AppSync to parse the field mapping between the query and the data source’s schema
- The actual request to DynamoDB and the time it takes for that to happen
- The time AppSync takes to translate the response from DynamoDB into the schema you defined
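X-Ray exposes this breakdown as JSON segment documents, where each segment has a `name`, a `start_time`, an `end_time`, and an optional list of nested `subsegments`. The following sketch flattens one such document into a list of stage durations. The sample document and its stage names are invented for illustration and are not the names AppSync actually emits. In practice, the documents would come from X-Ray, for example from the `Document` field of each segment returned by boto3's `batch_get_traces`.

```python
import json


def stage_durations(segment_document: str) -> list:
    """Flatten a segment document into (name, seconds) pairs, parent first."""

    def walk(node, out):
        # Segments that are still in progress have no end_time yet.
        if "start_time" in node and "end_time" in node:
            seconds = round(node["end_time"] - node["start_time"], 3)
            out.append((node["name"], seconds))
        for child in node.get("subsegments", []):
            walk(child, out)
        return out

    return walk(json.loads(segment_document), [])


# Made-up sample: a request with three stages, timestamps in epoch seconds.
sample = json.dumps({
    "name": "example-api",
    "start_time": 100.0,
    "end_time": 100.5,
    "subsegments": [
        {"name": "RequestMapping", "start_time": 100.0, "end_time": 100.05},
        {"name": "DynamoDB", "start_time": 100.05, "end_time": 100.45},
        {"name": "ResponseMapping", "start_time": 100.45, "end_time": 100.5},
    ],
})

durations = stage_durations(sample)
```

Sorting the result by duration is usually enough to see which stage deserves attention first.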
With X-Ray and CloudWatch, you have everything you need to understand the performance of your application and figure out the problem when something goes wrong.
Tracing AWS AppSync Apps
To expand a bit on the use of X-Ray, you also have the practice known as “tracing.” This entails reviewing the details of the execution of an operation while measuring each internal segment of it. A request that takes 30 seconds to be resolved is a warning sign, but without a breakdown, you would need to debug the code to understand where the problem resides. Let’s say that inside a particular request you have:
- 1 second for data parsing
- 0.5 seconds for query creation
- 27 seconds of waiting for the data source to answer your request
- 1.5 seconds for parsing the response from the data source and sending the right formatted data to your client
With this data, you know where the problem resides, and you can take action to solve it.
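As a quick worked example, here is the same breakdown in a few lines of Python. The stage labels are just names for the four items listed above; the durations are the ones from that list.

```python
# Durations in seconds, taken from the example breakdown above.
stages = {
    "data parsing": 1.0,
    "query creation": 0.5,
    "waiting for data source": 27.0,
    "response parsing and formatting": 1.5,
}

total = sum(stages.values())
slowest = max(stages, key=stages.get)
share = stages[slowest] / total

print(f"total: {total:.1f}s")
print(f"slowest stage: {slowest} ({share:.0%} of the request)")
# total: 30.0s
# slowest stage: waiting for data source (90% of the request)
```

With 90% of the time spent waiting on the data source, optimizing the parsing or formatting stages would barely move the needle; the data source is where to look.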
Although X-Ray provides a good amount of detail when trying to trace your AppSync applications, there are other options out there that improve the experience and provide developers and DevOps with an even higher level of detail.
Epsagon is one of those products. Its integration with AWS means you don’t have to add anything extra to your already-managed infrastructure. The main benefit of Epsagon over AWS solutions such as X-Ray is that X-Ray is limited in the amount of detail it can show. This matters most for very complex queries, which require far deeper insight.
This is where Epsagon can help. It allows you to automatically trace and understand the requests your API is receiving and what it’s doing with the data along the entire way; at the same time, it allows you to access the full payload at any given time.
Epsagon was the first to provide distributed tracing and monitoring for AWS AppSync APIs without any code change, which helps developers work quickly and accurately.
Epsagon Benefits, Right Out of the Box
The best part about Epsagon is that once you sign up for their service, you immediately get:
- Automatic discovery of your architecture presented as part of an easy-to-understand diagram generated by the platform. As your application grows, this map is continuously updated.
- Automatic monitoring of your platform. You don’t have to do anything other than sign up, after which monitoring becomes an automated task, notifying you whenever there’s something out of place.
- Payload visibility and troubleshooting help. You can visualize your data at any point within your transactions, providing an ultimate level of detail when simple performance charts aren’t enough to show the entire story.
If you’re running a complex architecture where availability and performance are crucial (which, let’s be honest, is a standard for most production architectures today), Epsagon is a tool you should definitely consider. Start your 14-day free trial and see for yourself.
As a recap, any client-facing platform should implement some form of monitoring to help understand how it is performing and to be notified the minute something starts to go wrong.
And once things do start going wrong, tracing is a great tool to help you and your team figure out where the root cause of those problems resides. In the case of tracing AWS AppSync applications, you have options such as X-Ray that AWS already provides, or you can go with Epsagon to build on top of those services for even better results.