Drum.io, an e-commerce company, provides a new platform where promoters called “Drummers” recommend businesses for the platform and use their own networks to attract buyers. Buyers save through promotions and offers, and businesses extend their customer reach.

Positioned for Growth with Epsagon

Drum.io was “built for the cloud” on AWS cloud services, according to Joe Kearney, Head of Cloud Engineering at Drum. Drum went 100% AWS, with API Gateway, Lambda, and DynamoDB, plus CodePipeline for development and CloudFormation for automated orchestration, enabling consistency and speed across the development environment. The business goals were rapid time to market, scalability, and cost-effectiveness.

With a combination of AWS, used primarily as a serverless environment, and Epsagon’s automated tracing with payload visibility for monitoring and troubleshooting cloud services, Drum is well-positioned for growth.

“We built our cloud-based serverless environment so we could prioritize spend on engineering and building services versus building out and supporting the IT infrastructure. With that approach, the path to services really speeds up. We get our products out the door quicker and have a solid foundation to move forward.”

Mobile Apps Add Complexity

A rapid path to services is critical to the success of this e-commerce company, with the ongoing addition of “Drummers” and of businesses whose offers and features can change daily, as well as the introduction of new services.

While serverless was the perfect choice for rapid development, visibility into distributed microservices and serverless can be difficult, as developers at Drum discovered. 

Ninety percent of Drum’s usage is generated by Drummers and buyers via mobile applications. But “mobile apps, in fact, most apps, are difficult to see on the backend, relationships are hard to understand, and user interactions and event flow hard to correlate,” Joe explained.

Joe also noted that serverless is a newer technology: “Documentation is not as robust” as with legacy monolithic applications and tools, and serverless tools are “limited and/or scarce.”

“As we pushed code to AWS cloud services, developers immediately had performance questions about what they were seeing or not seeing in terms of problems. And I began auditing and demoing tools.”

Since Drum did not have legacy applications or a hybrid environment, enterprise legacy monitoring tools were not appropriate in Drum’s serverless environment. 

“While the majority of available tools were built for the introspection of infrastructure, serverless functionality is typically grafted on, and they’re hard to configure. When something is hard to configure, that is a barrier to adoption and efficiencies.”

And other serverless tools are still manual and lack automation, Joe explained. 

“One of the biggest gaps in serverless and microservices is how to correlate logs. There often is no logical flow around how your software is working.  It’s drop some breadcrumbs and follow the trail.”

As for fixing that log correlation issue, Joe said, “I didn’t want to slow down the development of the product and ask the team to add functionality with correlation IDs for the logging piece.”
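For readers unfamiliar with the approach Joe wanted to avoid, the following is a minimal sketch of what hand-rolled correlation IDs can look like. It is not Drum’s or Epsagon’s code; the service names, field names, and helper functions are all hypothetical, and it assumes each service writes structured JSON log lines. An ID is created once per request, every service includes it in its log lines, and the “trail” for one request is recovered by filtering on that ID.

```python
import json
import logging
import uuid
from typing import Dict, List

# Hypothetical logger name; any service-specific name would do.
logger = logging.getLogger("example-service")


def new_correlation_id() -> str:
    """Create an ID once, at the edge of the system (the first handler)."""
    return uuid.uuid4().hex


def log_event(correlation_id: str, service: str, message: str) -> str:
    """Emit one structured log line that carries the correlation ID."""
    line = json.dumps(
        {"correlation_id": correlation_id, "service": service, "message": message}
    )
    logger.info(line)
    return line


def trail(lines: List[str], correlation_id: str) -> List[Dict[str, str]]:
    """Follow the breadcrumbs: keep only the entries for one request, in order."""
    entries = [json.loads(line) for line in lines]
    return [e for e in entries if e["correlation_id"] == correlation_id]


if __name__ == "__main__":
    request_a = new_correlation_id()
    request_b = new_correlation_id()
    # Two interleaved requests passing through two hypothetical services.
    lines = [
        log_event(request_a, "api", "offer requested"),
        log_event(request_b, "api", "offer requested"),
        log_event(request_a, "worker", "offer resolved"),
    ]
    for entry in trail(lines, request_a):
        print(entry["service"], entry["message"])
```

Every function in every service has to accept and forward the ID for this to work, which is the kind of extra development effort the quote above describes.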

Automation and Correlation Win

Viewing Epsagon in action, Joe immediately saw the benefits. “We didn’t have to do any development or configuration work; we could use a CloudFormation stack with Epsagon.” Epsagon’s ability to auto-connect without using agents “made for ease of entry that just sold me. It’s also easy to disable.”

Drum Production Environment with Epsagon

In addition, Epsagon’s visualization of everything in production, together with automated tracing, metrics, and correlation of traces, logs and payloads in one view, enabled instant observability and the ability to fix issues fast. As a result, developer velocity improved.

Epsagon Correlating Traces, Logs and Payloads for Drum

As its buyer app went live, Drum started to see performance issues with AWS AppSync. The team was trying to get at the issues by correlating AppSync logs using CloudWatch Insights. Then Joe asked Epsagon about its AppSync integration for tracing.

“In five minutes with Epsagon versus a couple of hours, we identified immediately the AppSync issues and were able to diagnose the problems using tracing and metrics. We also are using Epsagon alerting that integrates seamlessly with Slack, PagerDuty, and JIRA.”  

Epsagon’s integrations, for example with JIRA, were key since “a tool must be able to talk to other tools in the environment. I wanted a single, flexible tool like Epsagon,” Joe noted.

Data and observability, Joe said, are critical to ensuring performance. “Everything we do in our own product is data-driven. We modify our product based on data we receive.”

“Epsagon helps us improve product performance and reliability with data about how things are working together. We were having some buyer (user) problems with the performance of photo processing in the app and Epsagon showed us the exceptions and individual user problems with the app as a pattern over time. We will probably rewrite or remove that code based on that information.”

Improved Performance and Reliability 

As for advice to other businesses, Joe recommends not being “scared” of younger tools. “You want a solution that fits your purpose.” He thinks it makes sense to go cloud-native with tools, to avoid struggling to build your own monitoring solution, and to avoid spending too much time on bundled provider tools that are not targeted to your purpose or issues, or that simply perform inadequately.

Essentially for Drum, Epsagon provided improved QA for services and time to market for new services, the ability to push new features out automatically, scalability and cost savings.

“It’s almost as if we have a co-development relationship.” Joe noted that Epsagon not only adds functionality like AWS AppSync integration routinely as part of its roadmap but also is open to working on customers’ needs within the context of that roadmap.

“When infrastructure engineers and developers agree, that’s a win. Genuinely, my team sees the value of Epsagon. The AppSync integration showed that value dramatically.”
