As a developer, I find it very convenient to develop and deploy AWS Lambda functions. My favorite programming language is Python, so both Serverless and Zappa are great.

When a serverless application grows into a real architecture composed of many resources and elements, it becomes hard to design properly. Questions such as “how many functions should I have?” or “how should my functions communicate with each other?” often remain unanswered for most of us.

In the following post, I will shed some light on how to distribute messages in serverless applications, specifically between AWS Lambda functions, taking into consideration service decoupling, end-to-end performance, troubleshooting, and more.

Check out the updated AWS Lambda and SQS guide for the performance results of SQS as well!

Inter-Function Communication

When building a distributed architecture, communication between processes (or, in our case, functions) is critical. Luckily, if you are using Lambda, AWS provides several ways to do it.

Let’s explore the options using two Lambda functions (A)→(B):

Communication between functions

AWS SDK (AKA boto3 in Python)

The AWS SDK allows us to manage and use our AWS resources. In our case, we want to invoke Lambda B, with a payload from Lambda A:

Invoked AWS Lambda
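
The original snippet is not reproduced here, so the following is only a minimal sketch of what such a call might look like with boto3. The function name "lambda-b" and the payload are placeholders, not values from the original code.

```python
import json


def invoke_sync(client, function_name, payload):
    """Invoke a Lambda function and block until its result comes back."""
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",  # wait for the invoked function
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # The response payload is a stream holding Lambda B's JSON return value.
    return json.loads(response["Payload"].read())


def handler(event, context):
    # Lambda A. boto3 is imported here so that invoke_sync can also be tried
    # with a stub client; real code would usually create the client once at
    # module load. "lambda-b" is a placeholder name for Lambda B.
    import boto3

    client = boto3.client("lambda")
    return invoke_sync(client, "lambda-b", {"message": "hello from A"})
```

Passing the client in as an argument keeps the call easy to exercise locally without touching AWS.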

This method runs Lambda B synchronously. Before diving into the details, let’s look at an asynchronous alternative:

Asynchronous Alternative
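
As a hedged sketch (again with a placeholder function name), the asynchronous variant differs only in the invocation type passed to boto3:

```python
import json


def invoke_async(client, function_name, payload):
    """Queue an invocation of a Lambda function without waiting for its result."""
    response = client.invoke(
        FunctionName=function_name,
        InvocationType="Event",  # fire and forget
        Payload=json.dumps(payload).encode("utf-8"),
    )
    # Lambda only acknowledges that the event was accepted (status code 202);
    # the invoked function's return value is not available to the caller.
    return response["StatusCode"]


def handler(event, context):
    # Lambda A. "lambda-b" is a placeholder name for Lambda B.
    import boto3

    client = boto3.client("lambda")
    return invoke_async(client, "lambda-b", {"message": "hello from A"})
```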

The differences between the methods are:

  1. Synchronous: Lambda A waits for the response from Lambda B. You pay for both Lambdas (although Lambda A is idle).
  2. Asynchronous: Lambda A invokes Lambda B and immediately continues.

The only reason to use synchronous invocation is that Lambda A depends on Lambda B’s response. In that case, consider making some changes to decouple your functions.

Message Queue #1 — SNS

SNS acts as a fully managed, simple messaging service (Pub/Sub model) that integrates seamlessly with Lambda. With a simple setup, we create an SNS topic and configure it so that each published message triggers Lambda B (with the payload).

SNS message queue
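
A minimal sketch of both sides, assuming a topic that already exists and is subscribed to Lambda B. The topic ARN is a placeholder, and the handler relies on the event shape that SNS delivers to Lambda (the published string under `Records[].Sns.Message`).

```python
import json


def publish_message(sns_client, topic_arn, payload):
    """Lambda A side: publish a JSON payload to an SNS topic."""
    response = sns_client.publish(TopicArn=topic_arn, Message=json.dumps(payload))
    return response["MessageId"]


def lambda_b_handler(event, context):
    """Lambda B side: unpack the payloads from an SNS-triggered event."""
    return [json.loads(record["Sns"]["Message"]) for record in event["Records"]]


def lambda_a_handler(event, context):
    # The topic ARN below is a placeholder, not a real resource.
    import boto3

    topic_arn = "arn:aws:sns:us-east-1:123456789012:example-topic"
    return publish_message(boto3.client("sns"), topic_arn, {"message": "hello from A"})
```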

Message Queue #2 — Kinesis Data Streams

Kinesis Data Streams offers a real-time data stream, dedicated to handling and processing massive amounts of data (such as video streams or data from users).

Setting up Kinesis is as simple as setting up SNS, with a built-in integration that triggers Lambda B with the new records.

Kinesis message queue
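
A minimal sketch of both sides, assuming an existing stream that is configured as an event source for Lambda B. The stream name and partition key are placeholders. Note that Lambda receives Kinesis records in batches, with each record's data base64-encoded.

```python
import base64
import json


def put_message(kinesis_client, stream_name, payload, partition_key):
    """Lambda A side: put a single JSON record on a Kinesis data stream."""
    response = kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(payload).encode("utf-8"),
        PartitionKey=partition_key,  # decides which shard gets the record
    )
    return response["SequenceNumber"]


def lambda_b_handler(event, context):
    """Lambda B side: decode every record in the batch."""
    messages = []
    for record in event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        messages.append(json.loads(raw))
    return messages


def lambda_a_handler(event, context):
    # "example-stream" and the partition key are placeholders.
    import boto3

    client = boto3.client("kinesis")
    return put_message(client, "example-stream", {"message": "hello from A"}, "key-1")
```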

The downsides of distributed architectures

So far, we have only gone through the options, without considering the implications of developing a distributed architecture.

Troubleshooting, for example, is much more difficult. If an exception is raised in Lambda B, we would like to trace back:

  1. Which message triggered the Lambda (via SNS, Kinesis, or Lambda A).
  2. What happened in Lambda A.

It gets even more complex when we have hundreds of Lambda functions (that run real code). Using the technique of distributed tracing, we can understand the behavior of such applications and troubleshoot issues.
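
To make the idea concrete, here is a hand-rolled sketch of the simplest form of trace propagation: a correlation ID that travels with every message and appears in every log line. This is an illustration only, not how any particular tracing product works; the `trace_id` field name and both helper functions are assumptions.

```python
import json
import logging
import uuid

logger = logging.getLogger(__name__)


def with_trace(payload, incoming_event=None):
    """Return a copy of payload that carries a trace ID.

    The ID of the incoming event is reused when there is one, so a whole
    chain of functions shares a single ID; otherwise a new one is created.
    """
    trace_id = (incoming_event or {}).get("trace_id") or str(uuid.uuid4())
    return {**payload, "trace_id": trace_id}


def log_step(function_name, event, message):
    """Emit a structured log line that can later be searched by trace ID."""
    line = {
        "trace_id": event.get("trace_id"),
        "function": function_name,
        "message": message,
    }
    logger.info(json.dumps(line))
    return line
```

Lambda A would wrap its outgoing payload with `with_trace`, and Lambda B would call `log_step` with the event it receives, so an exception in Lambda B can be matched to the message and to the Lambda A invocation that produced it.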

Additionally, analyzing the performance of asynchronous, end-to-end events in serverless is still a complex task. Most of the solutions require developers to log everything manually. Yet end-to-end performance analysis is critical in serverless for understanding the impact on our end users.

These problems are not new, but they are amplified when using serverless resources.

End-to-end performance analysis

Let’s get back to inter-function communication and analyze our end-to-end performance results. The overall code:

End-to-end performance results

At Epsagon, we are developing a fully automated solution for end-to-end tracing of serverless applications. Let’s see the visual results:

Epsagon architecture screen

The two main results that I would like to share are:
  1. Average operation duration: the time it took Lambda A to execute the request (invoking a Lambda, publishing to SNS, or putting a record to Kinesis).
  2. Average Lambda A to Lambda B duration: the time it took to deliver the message, from the moment Lambda A sent it until it arrived at Lambda B.
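
For readers who want to approximate these two numbers by hand, a rough sketch follows. It assumes the payload is a JSON object that Lambda A can extend with a `sent_at` timestamp (a made-up field name), and it relies on the wall clocks of both functions being reasonably in sync, so treat the delivery figure as an estimate.

```python
import time


def timed(operation):
    """Run operation() and return (result, duration in milliseconds)."""
    start = time.perf_counter()
    result = operation()
    return result, (time.perf_counter() - start) * 1000.0


def stamp(payload):
    """Lambda A side: add the send time (epoch seconds) to the payload."""
    return {**payload, "sent_at": time.time()}


def delivery_ms(event, received_at=None):
    """Lambda B side: milliseconds between sending and receiving the message."""
    if received_at is None:
        received_at = time.time()
    return (received_at - event["sent_at"]) * 1000.0
```

In Lambda A, the operation duration would come from wrapping the send call, for example `timed(lambda: client.invoke(...))`, while Lambda B would call `delivery_ms(event)` as the first statement of its handler.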

Luckily, with Epsagon these results are automatically generated:

Method and operation duration for functions

Results analysis:

  1. Invoking synchronously is the fastest end-to-end. It makes sense because we wait for Lambda B to terminate. However, Lambda A is waiting for Lambda B as well, so a longer duration in Lambda B means a longer operation duration in Lambda A.
  2. Invoking asynchronously has excellent overall performance, since we don’t need to wait for the response.
  3. Using SNS helps us decouple our services better, but results in a longer end-to-end duration.
  4. The Kinesis Data Streams result is the surprising one. The operation takes only 10 ms to execute, but the message takes a very long time (more than 0.5 seconds) to get from Lambda A to Lambda B. The reason is that Kinesis is designed to ingest massive amounts of data in real time (under 10 ms) rather than to output the data quickly.


Serverless brings us, the developers, attractive features and much easier ops. However, even if we design our application properly with decoupled services, we will encounter troubleshooting and performance issues. A dedicated monitoring solution for serverless architectures allows us to gain visibility into the managed resources and understand what works best for our applications.