In this article, we look at two approaches to implementing a distributed tracing solution: manual and automatic. We compare them, consider real-life scenarios, and weigh their advantages and disadvantages. We also look at a third approach that combines the best parts of the manual and automatic solutions while avoiding most of their downsides.
Over the last decade, the DevOps movement has had a profound influence on the development and operational sides of IT. Before DevOps, the benefits of automating operational and deployment processes were understood, but the perceived risks involved made it almost impossible to pursue. Today, IT pros and developers assume that all processes should be automated and only realize the possible dangers as they reach the point of no return.
With the transition to distributed applications and microservices, distributed tracing has emerged as one of the best ways to monitor and troubleshoot distributed software. Still, despite the benefits of this approach, it is hard to deny that implementing and configuring distributed tracing takes considerable work and effort, and that this cost can seem to outweigh the benefits.
Manual Distributed Tracing
This approach uses existing solutions and open-source technologies to build a distributed tracing solution. You can start by using existing building blocks, such as the OpenTracing and OpenCensus frameworks.
These frameworks support most common high-level languages, allowing you to build your own tracing and logging tools and integrate them into your existing applications and development environment. They also generate large amounts of data, so once you have used them to build a solid foundation, you can use tools such as Jaeger and Zipkin for trace management and analytics.
Both of these tools are free and open source, and both originated at major tech companies: Jaeger at Uber and Zipkin at Twitter. Once you have all the pieces in place, you can integrate your code with the management apps and configure them manually.
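To give a feel for what "integrating your code manually" involves, here is a minimal sketch in Python that uses only the standard library. It is not the OpenTracing or OpenCensus API: the `Span` class, the `start_span` and `inject` helpers, the header names, and the two example services are all hypothetical and exist only to illustrate the work a developer takes on, namely opening spans and passing the trace context along with every call.

```python
import time
import uuid
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical header names, chosen for this sketch only.
TRACE_HEADER = "x-trace-id"
PARENT_HEADER = "x-parent-span-id"


@dataclass
class Span:
    name: str
    trace_id: str
    span_id: str
    parent_id: Optional[str]
    start: float = field(default_factory=time.time)
    end: Optional[float] = None


# Stand-in for a tracing backend: finished spans are collected in memory.
FINISHED: List[Span] = []


@contextmanager
def start_span(name: str, headers: Optional[Dict[str, str]] = None):
    """Open a span, continuing the trace found in `headers` if there is one."""
    headers = headers or {}
    span = Span(
        name=name,
        trace_id=headers.get(TRACE_HEADER) or uuid.uuid4().hex,
        span_id=uuid.uuid4().hex[:16],
        parent_id=headers.get(PARENT_HEADER),
    )
    try:
        yield span
    finally:
        span.end = time.time()
        FINISHED.append(span)


def inject(span: Span) -> Dict[str, str]:
    """Build the headers the caller must attach to every outgoing request."""
    return {TRACE_HEADER: span.trace_id, PARENT_HEADER: span.span_id}


def inventory_service(headers: Dict[str, str]) -> str:
    # The callee has to pick up the incoming context by hand as well.
    with start_span("inventory.lookup", headers):
        return "in stock"


def checkout_service() -> str:
    with start_span("checkout.handle") as span:
        # Forgetting this inject() call would silently split the trace in two.
        return inventory_service(inject(span))


result = checkout_service()
```

After this runs, `FINISHED` holds two spans that share one trace ID, with the inventory span pointing at the checkout span as its parent. Every service boundary needs the same explicit plumbing, which is where both the flexibility and the cost of the manual approach come from.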
Again, the tools mentioned in this section are free and open source. This means that you can experiment with them without the threat of vendor lock-in. As a result, this approach gives you a high level of flexibility and enables you to build a generic system that you can customize to meet your needs. You can use each framework and management system on its own, combine them as needed, or integrate them with third-party solutions.
Your solution can also include multiple types of services, design patterns, and communication protocols. Best of all, these tools do not limit you to using a single high-level language. Instead, they support the most popular languages and can be implemented as part of a polyglot programming environment.
Of course, the flexibility offered by this approach also has its costs. For a start, this mix-and-match approach involves a steep learning curve. Any developers involved in this type of distributed tracing project will have to master the low-level frameworks as well as the higher-level management tools. Also, the more resources and developers you have available for this type of project, the better.
But you might end up committing all of your available personnel to make it work. Due to time constraints, you may also run into the problem of not being able to give team members sufficient training. As a result, they will need to learn on the job, mainly by trial and error, thus possibly incurring high maintenance costs to fix defective code. By some estimates, the time spent on developing and maintaining manual tracing can be up to 30% of the total development time.
Automated Distributed Tracing
If manual tracing seems like a big commitment, you may want to consider automated tracing. Let’s take a look at the pros and cons.
This solution uses some elements of the previous approach, such as a tracing framework, in conjunction with an automated, cloud-based service that handles the setup, configuration, and management aspects. As distributed tracing enters the mainstream, this approach is supported by large cloud providers, such as Amazon Web Services (AWS), as well as many existing monitoring and logging solution providers.
Like most task automation, the main advantage of this approach is that it handles a majority of the hard and repetitive tasks for you. This lets you focus on the business of development instead of building a distributed tracing solution. As we noted in the previous section, building your own distributed tracing system is time-consuming.
In contrast, deploying automatic tracing takes minutes or hours rather than days, weeks, or months. Moreover, instead of requiring you to build your own integration code, these systems provide highly abstracted code that achieves the same result via a limited number of API calls. Less code means fewer potential errors, and this ultimately produces better results.
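The idea of "a limited number of API calls" can be sketched as follows. This is an illustration under our own assumptions, not any vendor's actual library: `traced`, `auto_instrument`, the in-memory `RECORDED` list, and the example functions are hypothetical names. The point is that the business code contains no tracing calls at all, and a single setup call wraps it from the outside.

```python
import functools
import time
from types import SimpleNamespace
from typing import Any, Callable, Dict, List

# In-memory stand-in for the vendor's backend.
RECORDED: List[Dict[str, Any]] = []


def traced(func: Callable) -> Callable:
    """Wrap `func` so each call is timed and recorded without changing its body."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        error = None
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            error = repr(exc)
            raise
        finally:
            RECORDED.append({
                "name": func.__name__,
                "duration_s": time.perf_counter() - start,
                "error": error,
            })
    return wrapper


def auto_instrument(namespace: Any) -> None:
    """Hypothetical one-call setup: wrap every public callable on `namespace`."""
    for attr, value in list(vars(namespace).items()):
        if callable(value) and not attr.startswith("_"):
            setattr(namespace, attr, traced(value))


# Business code, written with no tracing calls at all.
def get_price(sku: str) -> float:
    return {"apple": 0.5}[sku]


def reserve(sku: str, qty: int) -> bool:
    return qty > 0


services = SimpleNamespace(get_price=get_price, reserve=reserve)

auto_instrument(services)  # the only tracing-related line in the application

services.get_price("apple")
services.reserve("apple", 2)
```

Each call through `services` now leaves a record with its name, duration, and any error. The trade-off is also visible here: the wrapper only captures what its author decided to capture, so anything specific to your application still has to be added some other way.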
Compared to the manual approach, automated tracing favors convenience over flexibility, which may be better suited to your needs. Still, before you commit to automatic tracing, you should consider its downsides. Not only is this approach less flexible, but it also requires you to choose and trust a third party to create the solution and provide the necessary support. Companies such as Google can be notorious for announcing new consumer and enterprise services only to quietly shutter them when they fail to get significant traction.
This means that if you choose a distributed tracing product or service from a major vendor, you should continually check that your specific use cases and scenarios are still supported. Even if your chosen supplier gives you the support you need for as long as you need it, you may still find that it does not cover every scenario, and you may need to use open APIs to provide the missing functionality.
Automated Distributed Tracing With Manual Benefits
In many situations, you are normally forced to choose between only two options. But sometimes, there is a third option that combines the best elements of the other two. In our case here, the ideal solution is one that provides the flexibility of the manual approach with the convenience of a fully automatic solution.
You need a fully automated solution that takes into account that roughly 90% of software is built using standard components, while the remaining 10% consists of components that are unique to your organization and can’t be handled by generic test cases and processes. This means that most of your tracing needs can be handled by predefined scenarios, letting you invest your time, resources, and energy in the 10% that matters most to you and your organization.
There will be times when you’ll need to extend your system and provide new functionality, and for this, you will need support for open frameworks and tools. This is exactly the approach taken by Epsagon. Epsagon abstracts away the hardest parts of the process, allowing you to focus on building effective business logic while it takes care of the mechanics of distributed tracing.
Epsagon also enables developers to set manual labels that are queryable in the Epsagon dashboard, complete with alerts. Epsagon provides an example of its custom labels in Node.js. Plus, Epsagon supports open frameworks and tools for those new scenarios that pop up and are not covered by its automatic tracing.
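Conceptually, manual labels on top of automatic tracing might look like the sketch below. To be clear, this is not Epsagon's API: the `label`, `handle_request`, and `find` functions and the in-memory `TRACES` list are hypothetical stand-ins. We assume an automatic layer that opens a trace for each request, and we add only the business-specific labels by hand.

```python
from contextvars import ContextVar
from typing import Any, Callable, Dict, List, Optional

# The trace being handled right now; None means no trace is active.
_current: ContextVar[Optional[Dict[str, Any]]] = ContextVar(
    "current_trace", default=None
)

# In-memory stand-in for the data behind a tracing dashboard.
TRACES: List[Dict[str, Any]] = []


def label(key: str, value: Any) -> None:
    """Attach a business-level label to the current trace (hypothetical helper)."""
    trace = _current.get()
    if trace is not None:
        trace["labels"][key] = value


def handle_request(name: str, handler: Callable, *args: Any) -> Any:
    """Stand-in for the automatic layer: opens and stores one trace per request."""
    trace: Dict[str, Any] = {"name": name, "labels": {}}
    token = _current.set(trace)
    try:
        return handler(*args)
    finally:
        _current.reset(token)
        TRACES.append(trace)


def find(key: str, value: Any) -> List[str]:
    """Query traces by label, roughly as a dashboard search might."""
    return [t["name"] for t in TRACES if t["labels"].get(key) == value]


def place_order(user_id: str, total: float) -> str:
    # The only manual steps: two labels that matter to this business.
    label("user_id", user_id)
    label("large_order", total > 100)
    return "ok"


handle_request("order-1", place_order, "u42", 250.0)
handle_request("order-2", place_order, "u7", 20.0)
```

With this in place, `find("user_id", "u42")` returns the traces for that user, and a label such as `large_order` could be the basis for an alert. The automatic layer does the repetitive work, and the manual part shrinks to a few lines that encode what is unique to the application.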
Whether you want the flexibility of a manual system or the convenience of an automatic solution, both options will help you get started with distributed tracing.
The manual approach will give you everything you need to build what you want. This flexibility comes at a high cost, however, since you will have to spend valuable resources and time to get it off the ground. A fully automated solution will let you get started more quickly, flatten the learning curve, and simplify major elements of integration and deployment. On the other hand, in addition to the hidden financial and other costs involved in a cloud-based solution, this approach is far less flexible and makes you vulnerable to vendor lock-in.
As we noted, Epsagon provides an alternative model that combines the best elements of both the automatic and manual approaches. This gives you a balance between flexibility and convenience and avoids unnecessary compromise in either direction. Epsagon supports real-life applications and saves time by automating common scenarios.
Plus, if you encounter a scenario not yet supported by Epsagon, you can build a custom solution. So, instead of trying to find a way to force Epsagon to work with your problem, you can use an open framework, such as OpenTracing, if and when necessary.