Serverless adoption has been steadily picking up throughout 2017 and 2018 as more and more people have come to rely on serverless technologies for critical production systems. This is clearly evident in the recent survey by Serverless, Inc.
Something else that stands out for me in the survey results is the diverse range of use cases.
This is in keeping with what I see in the industry as a practitioner, speaker, trainer, and occasional consultant. In this post, we will discuss the five best use cases for a beginner looking to adopt serverless. We will talk about why serverless is a good fit for these use cases and how you can get started.
For the purpose of this post, we will define serverless as any technology where you don’t need to manage servers. This includes storage services like S3 or databases such as DynamoDB, as well as function-as-a-service (FAAS) offerings such as AWS Lambda.
It’s easy to underestimate the inertia inter-team dependencies create, especially in an enterprise. Many people have told me how serverless empowers the development teams to move faster and deliver higher quality software. It’s no surprise that serverless adoptions are often driven by application developers as they pursue greater autonomy and ownership.
Interestingly, the opposite is also true! I have spoken with quite a few companies where serverless adoption is driven by the platform teams (often branded as devops teams even though devops shouldn’t be considered as a role, but that’s for another post!).
A great example is moving cron jobs into Lambda functions. Nobody wants to take on the extra overhead of having to manage servers and pay for them 24/7 just to run a small task from time to time. Instead, you can create a schedule in CloudWatch Events and use it to trigger Lambda functions to perform the cron jobs.
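As a rough illustration, here is a minimal sketch of what the Lambda side of such a cron job might look like. The `run_cleanup` function is a hypothetical stand-in for whatever the job actually does. The schedule itself lives in the CloudWatch Events rule, as an expression such as `rate(5 minutes)` or `cron(0 12 * * ? *)`.

```python
from datetime import datetime, timezone


def run_cleanup(fired_at):
    """Hypothetical stand-in for the real cron task."""
    return {"ran_at": fired_at.isoformat(), "status": "ok"}


def handler(event, context):
    # Scheduled rules invoke the function with this detail-type.
    if event.get("detail-type") != "Scheduled Event":
        raise ValueError("expected a CloudWatch Events scheduled event")

    # "time" is a UTC timestamp such as "2018-06-01T12:00:00Z".
    fired_at = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%SZ")
    fired_at = fired_at.replace(tzinfo=timezone.utc)

    return run_cleanup(fired_at)
```

Using the timestamp from the event, rather than the current time, keeps the job's notion of "now" tied to the schedule even if the invocation starts a little late.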
Automation through CloudWatch Events
In a previous post, I talked about how you can automate the process of subscribing new log groups in CloudWatch Logs to a Lambda function. This is achieved using an event pattern in CloudWatch Events to trigger a Lambda function whenever a new log group is created.
This is a repeatable design pattern. Once you have enabled AWS CloudTrail in your account, you can create event patterns against any API call that is captured by CloudTrail. It even extends to non-API events such as console logins, which allow you to send out automated alerts when there are suspicious login attempts.
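To make the pattern concrete, here is a minimal sketch of the log group example. It shows an event pattern that matches the `CreateLogGroup` API call as captured by CloudTrail, and a handler that pulls the new log group's name out of the event. The constant and function names are my own, and the actual subscription step is left as a comment.

```python
# Event pattern for the CloudWatch Events rule, shown as a Python dict.
# It matches CreateLogGroup calls recorded by CloudTrail.
CREATE_LOG_GROUP_PATTERN = {
    "source": ["aws.logs"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["logs.amazonaws.com"],
        "eventName": ["CreateLogGroup"],
    },
}


def handler(event, context):
    detail = event["detail"]
    if detail.get("eventName") != "CreateLogGroup":
        return None  # not the API call we care about

    log_group = detail["requestParameters"]["logGroupName"]
    # This is where you would subscribe the log group, for example with
    # the CloudWatch Logs put_subscription_filter API (omitted here so
    # the sketch stays free of external dependencies).
    return log_group
```

Swapping the `eventSource` and `eventName` values is all it takes to react to a different API call.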
Many similar automations are possible through Lambda. Here are just a few that I can think of:
- Create CloudWatch alarms for latency spikes and 5xx responses whenever an API is deployed in API Gateway.
- Create a dashboard in CloudWatch whenever an API is deployed in API Gateway.
- Send alerts when there are EC2 activities in unused regions.
- Use AWS Config to monitor and evaluate configuration changes to AWS resources and send non-compliance notifications to CloudWatch Events to trigger a Lambda function.
I see many companies building web applications with a single-page application (SPA) on the frontend and implementing the backend APIs with API Gateway, Lambda, and DynamoDB or Aurora. Indeed, as you can see from the aforementioned survey, nearly a third of the respondents are using serverless to write backend APIs! The SPA is usually hosted in S3 and served through a CloudFront distribution.
For RESTful APIs, API Gateway and Lambda are your technologies of choice. API Gateway offers many features out of the box:
- A wide array of ways to control access
- Request throttling down to each endpoint and method
- Canary deployment
- Tracing, logging, and metrics
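For a sense of how little code sits behind such an endpoint, here is a minimal sketch of a Lambda function behind API Gateway, assuming the Lambda proxy integration. The `id` path parameter and the response payload are made up for illustration.

```python
import json


def _response(status_code, body):
    # With the proxy integration, API Gateway expects this shape back.
    return {
        "statusCode": status_code,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def handler(event, context):
    # pathParameters is null (None) when the route has no parameters.
    params = event.get("pathParameters") or {}
    user_id = params.get("id")  # hypothetical route: GET /users/{id}
    if user_id is None:
        return _response(400, {"message": "missing path parameter: id"})

    # A real handler would look the user up, e.g. in DynamoDB.
    return _response(200, {"id": user_id})
```

Access control, throttling, and logging stay in API Gateway's configuration, so the function only has to deal with the request itself.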
GraphQL has been gaining traction over the last couple of years and AWS introduced the AppSync service in 2017 to make it really easy for you to implement GraphQL APIs. With AppSync, you no longer need to manage and run GraphQL servers yourself. Just define the object schemas and point AppSync to data resources and you’re done!
Besides the direct integration with DynamoDB and Elasticsearch, you can also use Lambda to integrate your AppSync API with other data sources such as Aurora.
User authentication can be implemented with Cognito and integrated directly with both API Gateway and AppSync. Auth0 is also a very popular alternative. I find Auth0 to be a lot easier to work with than Cognito, but it lacks the direct integration with other AWS services.
The services mentioned above all provide some built-in scalability and resilience. For instance, Lambda functions are deployed to multiple availability zones (AZs) by default, and are auto-scaled by traffic. Since the introduction of DynamoDB Global Tables, you can now also create multi-region, active-active APIs with ease and achieve better latency and even greater resilience.
However, being able to build scalable and resilient web applications easily and quickly is still not enough. You need to really understand user behavior and use that knowledge to improve the quality of the product itself. Fortunately, serverless technologies can help you here as well!
In AWS, S3 is the obvious choice for a data lake. You can use IAM to control access to your analytics data in S3, and you can protect the data at rest by enabling server-side encryption using the KMS service. You can also use the Macie service, announced at re:Invent 2017, to alert when you have sensitive data in the bucket such as personally identifiable information (PII).
You can use Athena to analyze these analytics events and turn data into understanding. Athena is a serverless solution for querying large amounts of data and getting results in seconds, paying only for data that you scan. You can then create dashboards to visualize query results with QuickSight. Together, S3, Athena, and QuickSight make for a powerful combination of tools for business intelligence (BI) users.
You can use Kinesis Data Streams to collect these analytics events (user-login, user-logout, etc.) and use Lambda functions to process them in real time. To gobble up all the events and put them into the data lake, you can use Kinesis Data Firehose. Kinesis Data Firehose collects the data, batches it up, and then saves it into the designated S3 bucket without the need to write any custom code yourself. Furthermore, you can also transform the analytics data with Lambda before it is put into S3.
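The transformation step might look something like the sketch below. Firehose hands the function a batch of base64-encoded records and expects every `recordId` back with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`. The `event_type` field and the lowercasing are assumptions made purely for illustration.

```python
import base64
import json


def handler(event, context):
    output = []
    for record in event["records"]:
        try:
            payload = json.loads(base64.b64decode(record["data"]))
            if not isinstance(payload, dict):
                raise ValueError("expected a JSON object")
        except ValueError:
            # Hand the record back untouched and flag it as failed.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
            continue

        # Hypothetical transformation: normalize the event type.
        payload["event_type"] = str(payload.get("event_type", "unknown")).lower()

        # Firehose does not add delimiters, so append a newline to keep
        # the objects in S3 as newline-delimited JSON.
        data = (json.dumps(payload) + "\n").encode("utf-8")
        output.append({
            "recordId": record["recordId"],
            "result": "Ok",
            "data": base64.b64encode(data).decode("utf-8"),
        })
    return {"records": output}
```

Returning failed records instead of raising lets the rest of the batch go through.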
Whenever a batched events file is put into S3, it can trigger a Lambda function for further processing. For instance, if you want to use Google BigQuery instead of Athena, then you can use this Lambda function to stream the events to BigQuery. Or, perhaps you want to use these data to update a DynamoDB table or an Elasticsearch domain.
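Here is a minimal sketch of the first step of such a function: working out which objects triggered it. The function names are my own, and what happens to each file afterwards (BigQuery, DynamoDB, Elasticsearch) is left out.

```python
from urllib.parse import unquote_plus


def extract_objects(event):
    """Return (bucket, key) pairs for every object in an S3 notification."""
    objects = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Keys arrive URL-encoded, with spaces turned into "+".
        key = unquote_plus(record["s3"]["object"]["key"])
        objects.append((bucket, key))
    return objects


def handler(event, context):
    objects = extract_objects(event)
    for bucket, key in objects:
        # Download and forward the batched events here.
        print(f"new events file: s3://{bucket}/{key}")
    return len(objects)
```

Decoding the key matters more than it looks: partitioned prefixes such as `dt=2018-06-01/` show up encoded in the event.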
Just like that, you can create an entire analytics pipeline without having to run and manage any servers by yourself. You only pay for what you use, so it can be an extremely cost-effective solution—especially when you are just starting out and not operating at scale yet.
Another popular use case for beginners is to implement batch processing or fan-out. I think this is because these systems tend to have spiky loads, and it’s difficult to strike the right balance between:
- provisioning sufficient resources to be able to react quickly to a large batch of tasks
- keeping the infrastructure cost in check and not paying for too many unused resources
SNS and Lambda are very popular choices here. SNS is an asynchronous event source for Lambda, where every published message triggers a Lambda invocation. It allows you to achieve very high throughput quickly, limited only by the regional concurrency limit for Lambda (which defaults to 1,000) and the 500-per-minute limit for scaling up that concurrency.
In addition, you also get built-in retries when the Lambda invocation fails. Out of the box, the failed asynchronous invocation is retried two more times. Should problems persist, Lambda sends the failed invocation event to the dead letter queue (DLQ) configured for the function.
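On the receiving end, a fan-out worker can be very small. The sketch below assumes each SNS message body is a JSON-encoded task, and `process_task` is a hypothetical placeholder for the real work.

```python
import json


def process_task(task):
    """Hypothetical unit of work for one fanned-out task."""
    if "id" not in task:
        raise ValueError("task is missing an id")
    return {"id": task["id"], "status": "done"}


def handler(event, context):
    results = []
    for record in event["Records"]:
        # The published message body is a string under Sns.Message.
        task = json.loads(record["Sns"]["Message"])
        results.append(process_task(task))
    return results
```

Note that the handler does not catch exceptions. Letting an error escape is what marks the invocation as failed, which is what triggers the retries and, eventually, the DLQ.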
When dealing with downstream systems that aren’t as scalable as SNS and Lambda—such as a legacy database—you need to control the number of concurrent Lambda executions. If you publish many messages to SNS at once, then you can overwhelm the downstream system if it’s not ready to receive such a spike in the load. For these scenarios, you should consider a different event source. Both Kinesis and SQS allow you to process messages in batches, and both offer better control over concurrency of the subscriber function.
For SQS, the concurrency of the subscriber function is limited to an increase of 60 per minute, which is a much lower ramp-up compared to SNS.
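To put those two ramp-up rates side by side, here is a back-of-the-envelope calculation. It assumes a simple linear ramp from zero, which is a simplification of how scaling behaves in practice, and the function name is my own.

```python
def minutes_to_reach(target, step_per_minute, start=0):
    """Whole minutes needed to reach `target` concurrent executions,
    assuming concurrency grows by `step_per_minute` every minute."""
    remaining = max(target - start, 0)
    return -(-remaining // step_per_minute)  # ceiling division


# Ramping to the default regional limit of 1,000 concurrent executions:
sqs_minutes = minutes_to_reach(1000, step_per_minute=60)
burst_minutes = minutes_to_reach(1000, step_per_minute=500)
```

Under these assumptions, `sqs_minutes` is 17 and `burst_minutes` is 2, which gives the downstream system far more time to absorb the increase in load.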
For Kinesis, you can even control the concurrency of the Lambda function yourself with the number of shards in the stream.
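A Kinesis-triggered function receives records in batches, one batch per shard at a time. Below is a minimal sketch that decodes a batch, assuming the producers put JSON-encoded events into the stream. The `write_batch` function is a hypothetical stand-in for the call to the downstream system.

```python
import base64
import json


def decode_records(event):
    """Decode the base64 payload of every record in a Kinesis batch."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]


def write_batch(events):
    """Hypothetical stand-in for a bulk write to the downstream system."""
    return len(events)


def handler(event, context):
    events = decode_records(event)
    # One bulk write per batch keeps the number of downstream calls low.
    return write_batch(events)
```

Because each shard is processed by one invocation at a time, the shard count caps how many of these bulk writes can hit the downstream system concurrently.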
AWS offers a range of IoT services, and at the heart of it all is IoT Core. IoT Core allows you to connect IoT devices with your backend services using MQTT, WebSockets, or HTTP. You can easily collect, process and analyze data generated by your connected devices with services such as S3, Kinesis, Lambda, and QuickSight.
iRobot is perhaps the best known user of these services, and the following quote by Ben Kehoe perfectly summarises why serverless is such a big hit with companies in this space.
Many of the companies in the IoT space are startups operating with a small team and constrained by tight budgets. Serverless technologies like Lambda and IoT Core are a natural fit for them because they allow their small teams to focus on feature delivery as opposed to wasting time on infrastructure. The pay-per-use pricing model also allows them to grow their operational costs linearly and not worry about big up-front costs.
So there you have it, five of the best, most popular use cases for serverless for your consideration. As you can see, some of these use cases are deliberately very generic as there are many possible variations, depending on the specific technologies you use.
You might have noticed a consistent theme in all of the use cases—that Lambda is used in conjunction with a wide array of other services, all of which can be considered as “serverless” based on our earlier definition. In all of these use cases, we are able to build entire architectures or workflows by linking various specialized services together with Lambda.
And THAT is the true power of AWS Lambda. It’s so much more than an abstraction layer over compute resources. What makes it so appealing is the fact that it brings an entire ecosystem to you and makes the services that are part of this ecosystem easy to consume.
It’s the gateway drug to the rest of the AWS ecosystem.
I have no doubt that the battle for supremacy among cloud providers will be decided by their FAAS offerings—not just by the quality of their FAAS solutions themselves, but by how well-connected the FAAS solution is to the rest of the ecosystem.
Unsurprisingly, AWS seems to be leading the way.