The Right Way to Distribute Messages Effectively in Serverless Applications

As a developer, I find it very convenient to develop and deploy Lambda functions. My favorite programming language is Python, so both the Serverless Framework and Zappa work well for me.

When a serverless application grows into a real architecture composed of many resources and elements, it becomes hard to design properly. Questions such as “how many functions should I have?” or “how should my functions communicate with each other?” often remain unanswered for most of us.

In the following post, I will shed some light on how to distribute messages between AWS Lambda functions, taking into consideration service decoupling, end-to-end performance, troubleshooting, and more.

Inter-Function Communication

When building a distributed architecture, communication between processes (or in our case — functions) is critical. Luckily, if you are using Lambda, AWS provides several ways to do it.

Let’s explore the options using two Lambda functions (A)→(B):

AWS SDK (AKA boto3 in Python)

The AWS SDK allows us to manage and use our AWS resources. In our case, we want to invoke Lambda B, with a payload from Lambda A:
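A minimal sketch of such a call with boto3 could look like the following. The function name `lambda-b` and the helper name `invoke_sync` are assumptions for illustration, and the client can be passed in so the helper can be exercised with a stub instead of a real AWS account.

```python
import json


def invoke_sync(payload, function_name="lambda-b", client=None):
    """Invoke Lambda B and block until it returns its result.

    `function_name` is a hypothetical name; use your own function's name or ARN.
    """
    if client is None:
        import boto3  # imported lazily so a stub client can be injected

        client = boto3.client("lambda")

    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # synchronous: wait for Lambda B's result
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The result arrives as a streaming body holding Lambda B's JSON return value.
    return json.loads(response["Payload"].read())
```

Inside Lambda A's handler this would be called as `result = invoke_sync({"id": 1})`.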

This method runs Lambda B synchronously. Before diving into the details, let’s look at an asynchronous alternative:
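An asynchronous invocation with boto3 differs mainly in the `InvocationType` parameter. The sketch below assumes a hypothetical function name `lambda-b` and a helper called `invoke_async`; neither comes from the original code.

```python
import json


def invoke_async(payload, function_name="lambda-b", client=None):
    """Queue an invocation of Lambda B and return without waiting for its result.

    `function_name` is a hypothetical name; use your own function's name or ARN.
    """
    if client is None:
        import boto3  # imported lazily so a stub client can be injected

        client = boto3.client("lambda")

    response = client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # asynchronous: Lambda only acknowledges the request
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # An accepted asynchronous invocation is acknowledged with status code 202.
    return response["StatusCode"]
```

Note that Lambda B's return value is not available to Lambda A in this mode; only the acknowledgement is.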

The differences between the methods are:

  1. Synchronous: Lambda A waits for the response from Lambda B. You pay for the running time of both Lambdas, even though Lambda A is idle while it waits.
  2. Asynchronous: Lambda A invokes Lambda B and immediately continues, without receiving Lambda B’s response.

The only reason to use synchronous invocation is that you depend on Lambda B’s response. In that case, consider making changes to decouple your functions.

Message Queue #1 — SNS

SNS acts as a fully managed, simple message queue (Pub/Sub model) that integrates seamlessly with Lambda. With a simple setup, we create an SNS topic and configure it so that published messages trigger Lambda B (with the payload).
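On the publishing side, Lambda A only needs the topic's ARN. This is a minimal sketch assuming the payload is sent as a JSON string; the helper name `publish_to_sns` and the ARN in the usage note are placeholders, not values from the original setup.

```python
import json


def publish_to_sns(payload, topic_arn, client=None):
    """Publish the payload to an SNS topic that is configured to trigger Lambda B."""
    if client is None:
        import boto3  # imported lazily so a stub client can be injected

        client = boto3.client("sns")

    response = client.publish(
        TopicArn=topic_arn,
        Message=json.dumps(payload),  # SNS message bodies are strings
    )
    return response["MessageId"]
```

A call might look like `publish_to_sns({"id": 1}, "arn:aws:sns:us-east-1:123456789012:my-topic")`, where the ARN is hypothetical. Lambda B then finds the string under `event["Records"][0]["Sns"]["Message"]`.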

Message Queue #2 — Kinesis Data Streams

Kinesis Data Streams offers a real-time queue, dedicated to ingesting and processing massive amounts of data (such as video streams or data from users).

Setting up Kinesis is as simple as setting up SNS, with a built-in integration that triggers Lambda B when new records arrive.
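Putting a record on the stream from Lambda A can be sketched as below. The helper name `put_to_kinesis`, the stream name, and the partition key are assumptions; in a real application the partition key determines which shard receives the record.

```python
import json


def put_to_kinesis(payload, stream_name, partition_key, client=None):
    """Write one JSON-encoded record to a Kinesis data stream."""
    if client is None:
        import boto3  # imported lazily so a stub client can be injected

        client = boto3.client("kinesis")

    response = client.put_record(
        StreamName=stream_name,
        Data=json.dumps(payload).encode("utf-8"),  # record data is sent as bytes
        PartitionKey=partition_key,
    )
    return response["SequenceNumber"]
```

For example, `put_to_kinesis({"id": 1}, "my-stream", "user-42")` with a hypothetical stream and key. Lambda B receives the record data base64-encoded under `event["Records"][i]["kinesis"]["data"]`.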

The downsides of distributed architectures

So far, we have only gone through the options, without considering the implications of developing a distributed architecture.

Troubleshooting, for example, is much more difficult. If an exception is raised in Lambda B, we would like to trace back:

  1. Which message triggered the Lambda (via SNS, Kinesis, or Lambda A).
  2. What happened in Lambda A.
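As a starting point for the first question, Lambda B can inspect the shape of the incoming event to tell which path delivered it. The sketch below is an assumption-laden illustration: the helper name `describe_trigger` is made up, it looks only at the first record, and it assumes Lambda A always sends a JSON payload.

```python
import base64
import json


def describe_trigger(event):
    """Return (source, payload) for an event received by Lambda B."""
    records = event.get("Records") if isinstance(event, dict) else None
    if records:
        first = records[0]
        # SNS records use the key "EventSource"; Kinesis records use "eventSource".
        if first.get("EventSource") == "aws:sns":
            return "sns", json.loads(first["Sns"]["Message"])
        if first.get("eventSource") == "aws:kinesis":
            # Kinesis record data is delivered base64-encoded.
            raw = base64.b64decode(first["kinesis"]["data"])
            return "kinesis", json.loads(raw)
    # No recognized record envelope: assume Lambda A invoked us directly.
    return "lambda", event
```

Logging the returned source and payload at the top of Lambda B's handler gives a basic, manual trail to follow back when an exception is raised.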

It gets even more complex when we have hundreds of Lambda functions (that run real code). Using distributed tracing, we can understand the behavior of such applications and troubleshoot issues.

Additionally, performance analysis of asynchronous, end-to-end events in serverless is still a complex task. Most solutions require developers to log everything manually. Yet end-to-end performance analysis remains critical in serverless, because it shows the impact on our end users.

These problems are not new, but they are amplified when using serverless resources.

End-to-end performance analysis

Let’s get back to inter-function communication and analyze our end-to-end performance results. The overall code:

At Epsagon, we are developing a fully automated solution for end-to-end tracing of serverless applications. Let’s see the visual results:

The two main results that I would like to share are:
  1. Average operation duration: the time it took Lambda A to execute the request (invoking a Lambda, publishing to SNS, or putting a record to Kinesis).
  2. Average Lambda A to Lambda B duration: the time from the moment Lambda A sent the message until it arrived at Lambda B.
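To make the two metrics concrete, here is a rough sketch of how they could be measured by hand. It is not how Epsagon collects them. The helper names and the `sent_at` payload field are assumptions, and the second metric compares wall clocks from two different execution environments, so small values can be distorted by clock skew.

```python
import time


def timed_send(send, payload, clock=time.time):
    """Stamp the payload, call `send(stamped_payload)`, and return the operation duration in seconds."""
    stamped = dict(payload, sent_at=clock())  # hypothetical field read by Lambda B
    started = time.perf_counter()
    send(stamped)
    return time.perf_counter() - started


def delivery_latency(payload, clock=time.time):
    """Seconds between Lambda A stamping the payload and Lambda B reading it."""
    return clock() - payload["sent_at"]
```

In Lambda A, `send` would be whichever function performs the invoke, publish, or put-record call; in Lambda B, `delivery_latency` would be called on the decoded payload as early as possible in the handler.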

Luckily, with Epsagon these results are automatically generated:

Results analysis:
  1. Invoking synchronously is the fastest end-to-end. This makes sense, because the call goes straight to Lambda B and we wait for it to terminate. However, that waiting happens inside Lambda A as well, so the longer Lambda B runs, the longer the operation takes in Lambda A.
  2. Invoking asynchronously seems to have excellent overall performance, since we don’t need to wait for the response.
  3. Using SNS helps us decouple our services better, but results in a longer end-to-end duration.
  4. The Kinesis Data Streams result is the most surprising. The operation takes only 10ms to execute, but the message takes a very long time (>0.5 seconds) to get from Lambda A to Lambda B. The likely reason is that Kinesis is designed to ingest massive amounts of data in real time (under 10ms) rather than to output each record quickly, and Lambda reads from the stream by polling it in batches.


Serverless brings us, the developers, attractive features and much easier ops. However, even when we design our application properly with decoupled services, we will still encounter troubleshooting and performance issues.

A dedicated monitoring solution for serverless architectures allows us to gain visibility into the managed resources, and understand what works best for our applications.