By Diego Lewin, Head of Development at Onceit
Onceit is one of New Zealand’s fastest-growing online fashion sites. It launched in May 2010 with a staff of one and a dream to become a leading destination for online designer sales. Onceit has become a one-stop destination for more than 500 local and international brands that keep the company’s customers on top of the latest styles and trends.
As part of our serverless transformation, Onceit is developing three different projects:
- An application to synchronize products from e-commerce to a marketplace platform.
- A complete rebuild of a WMS – Warehouse Management System.
- A merchant portal for suppliers.
Every new project is built serverless, and we are also migrating existing code to serverless.
As part of the implementation, we are using multiple AWS services: AWS Lambda, API Gateway, SNS, SQS, DynamoDB, CloudWatch, RDS, and more. We are using MailChimp and other APIs as well.
Debugging and troubleshooting using only the AWS console was quite difficult. We were working with CloudWatch, going through every log, which was very time-consuming.
The hardest part was not being able to see the whole picture. We have multiple moving parts: API Gateway, Lambdas calling Lambdas, DynamoDB, RDS, SQS, and SNS. It was impossible to follow a whole request from end to end.
Previously Proposed Solutions
We tried AWS X-Ray. It was useful, but we struggled because it couldn’t show us the whole picture, such as Lambda-to-Lambda calls or MySQL requests. We had to go back to using logs, which was a setback.
We also tried solutions from other vendors, which offered some good features on top of X-Ray. The problem was that they couldn’t visualize the entire architecture, and they required a lot of manual instrumentation instead of instrumenting automatically.
The Chosen Solution – Epsagon
There were several reasons we were interested in trying Epsagon:
- Architecture visualization
- Performance and timings of API calls
- Getting the CloudWatch log for every request
- Intelligent cost prediction
- Identifying our unused functions
- Out-of-the-box experience without manually changing functions
The onboarding was indeed very easy. We used the Serverless Framework plugin, which made it simple to get started. We were finally able to get the complete picture of all the requests and their lifetime, and of the time spent in every part of the application. This was very important to us because we wanted to identify bottlenecks in our application.
We have multiple databases. In one particular case, Epsagon helped us find a problem by showing how long each query took and which Lambda function issued it.
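The idea of timing each query per function can be sketched as follows. This is a minimal illustration using only the Python standard library; it is not Epsagon’s API or Onceit’s actual code, and the names `timed_query` and `slowest_queries`, as well as any function and query labels passed to them, are hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-memory store:
# (function_name, query_label) -> list of durations in milliseconds.
_timings = defaultdict(list)


@contextmanager
def timed_query(function_name, query_label):
    """Record how long the wrapped block (for example, a database call) takes."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the wrapped block raises.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        _timings[(function_name, query_label)].append(elapsed_ms)


def slowest_queries(top_n=3):
    """Return the top_n (key, average_ms) pairs, slowest first."""
    averages = {key: sum(v) / len(v) for key, v in _timings.items()}
    return sorted(averages.items(), key=lambda kv: kv[1], reverse=True)[:top_n]
```

Wrapping each database call in `with timed_query("some-function", "some-query"):` and then calling `slowest_queries()` gives a ranked list of the slowest query and function pairs. A tracing tool collects this kind of data automatically and across services, which is what made the difference for us.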
Results Achieved Using Epsagon
Our MTTR (mean time to resolution) has decreased significantly. We are using Epsagon’s Trace Search technology to easily find specific requests in our application and all the information related to them. Overall, we are seeing a 90% decrease in our troubleshooting time.
We finally have visibility into our application. We are able to see all the requests and where they’re coming from, which gives us more confidence when developing and operating our application.
We can also improve our performance faster. Since we can easily identify where issues are, we can optimize the memory and costs of Lambda functions. We found optimizations that we wouldn’t have known about otherwise.
We also had some third-party integrations that weren’t working, and we didn’t even know about it. Epsagon helped us find them.
Epsagon’s cost monitoring has helped us find potential optimizations of memory and timeouts. We use it to monitor the health of our system. It also helps keep the technical debt in our infrastructure under control. We are using cost as an indicator of issues and optimization opportunities.
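To make the link between memory, duration, and cost concrete, here is a rough sketch of how a Lambda bill can be estimated. It assumes the usual billing model of a per-request charge plus a charge per GB-second of compute, and it ignores the free tier and billing granularity. The function name `lambda_monthly_cost` is hypothetical, and the prices are parameters on purpose: they vary by region and architecture and should be taken from the current AWS price list.

```python
def lambda_monthly_cost(invocations, avg_duration_ms, memory_mb,
                        price_per_gb_second, price_per_million_requests):
    """Estimate monthly cost of one Lambda function (free tier ignored)."""
    # Compute is billed on memory (in GB) multiplied by run time (in seconds).
    gb_seconds = invocations * (avg_duration_ms / 1000.0) * (memory_mb / 1024.0)
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = (invocations / 1_000_000) * price_per_million_requests
    return compute_cost + request_cost


def compare_memory_settings(invocations, settings,
                            price_per_gb_second, price_per_million_requests):
    """settings maps memory_mb -> measured avg_duration_ms at that memory size."""
    return {
        memory_mb: lambda_monthly_cost(invocations, duration_ms, memory_mb,
                                       price_per_gb_second,
                                       price_per_million_requests)
        for memory_mb, duration_ms in settings.items()
    }
```

Because more memory often shortens the run time, a larger memory setting is not always more expensive. Comparing measured durations at two or three memory sizes with `compare_memory_settings` shows which setting is cheapest for a given workload.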
Quality improvement was also a major benefit. By identifying slow queries and errors in third-party APIs, we are able to identify architecture problems. For example, we decided to split one Lambda into several Lambda functions, which improved the quality of both the queries and the code. So far, we are seeing a 40% quality improvement.
Before, we didn’t understand the full impact of our decisions. With every addition to the serverless equation, more issues kept popping up. Epsagon gave us the confidence to use advanced and different design patterns.
Today, a developer’s day breaks down to roughly 60% coding, 20% testing, and 20% debugging. Before Epsagon, the time spent debugging was much higher, and we had very limited visibility and little confidence when resolving complex issues. Now, with Epsagon, things are running smoothly and coherently, and are easy to work with.