
AWS Lambda: Cold Start & Provisioned Concurrency

Posted on: January 3, 2020 at 04:20 AM

We can all agree that serverless architecture has changed the way we develop applications, and everyone loves AWS Lambda (the serverless compute service).


Some of the features of AWS Lambda:

AWS Lambda is asynchronous and event-driven, which makes it a good fit for workloads where we do not care about waiting on the response.

We also use it to build web applications without worrying about managing infrastructure or autoscaling.

But this great power comes with responsibilities as well, and the most common pain point is the cold start.

In this post we will discuss how to work around the cold start issue.

What is an AWS Lambda cold start?

Before we look into the cold start issue, we have to understand the AWS Lambda concurrency model.

Lambda Concurrency Model

Despite Lambda's simplicity, most of us tend to overlook its concurrency model.

Concurrency is the number of requests that your function is serving at any given time. When your function is invoked, Lambda allocates an instance of it to process the event. When the function code finishes running, it can handle another request. If the function is invoked again while a request is still being processed, another instance is allocated, which increases the function’s concurrency.

A single Lambda instance serves only one request at any point in time. If two requests hit the function at the same moment, one instance processes the first request while the second either waits in a queue or is handled by a newly allocated instance, depending on your concurrency settings.

Lambda serves only one request at any single point in time.

Lambda processes one request at a time, and other parallel requests get queued.

Lambda scales out to handle all parallel requests.
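To see this scaling behaviour for yourself, here is a minimal sketch, assuming boto3 is installed, AWS credentials are configured, and a function named my-function (a placeholder) is deployed. It fires several synchronous invocations in parallel, so Lambda allocates one instance per in-flight request:

```python
# Minimal sketch: fire parallel synchronous invocations so Lambda allocates
# one instance per in-flight request. "my-function" is a placeholder name;
# boto3 and AWS credentials are assumed to be configured.
import json
from concurrent.futures import ThreadPoolExecutor

import boto3

lambda_client = boto3.client("lambda")

def invoke_once(i):
    response = lambda_client.invoke(
        FunctionName="my-function",        # hypothetical function name
        InvocationType="RequestResponse",  # synchronous call
        Payload=json.dumps({"request": i}),
    )
    return response["StatusCode"]

# Five requests in flight at the same time means a concurrency of five,
# so Lambda spins up (up to) five separate instances to serve them.
with ThreadPoolExecutor(max_workers=5) as pool:
    print(list(pool.map(invoke_once, range(5))))
```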

Now that we understand the concurrency model, let’s talk about Lambda Cold Start.

Lambda Cold Starts

A Lambda cold start is responsible for delays in API response times.

When we make an API request to a particular Lambda for the first time, AWS spins up a Lambda container (instance) as follows:

  1. Assign a new Lambda container instance.
  2. Pull/download your code (AWS manages that artifact; it can live in S3 or another internal service).
  3. Launch the execution context based on the configuration we provided: runtime (Node.js, Python, Go, C#, etc.), environment variables, memory, timeout, concurrency, and so on.
  4. Bootstrap/initialize your code and resolve external dependencies.

Process of a cold start

This process takes a while before your API request is served, which results in an inflated response time; this delay is known as a cold start.

The cold start is a big concern for the applicability of serverless technologies to latency-sensitive workloads.
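The cost of the last two steps is easiest to see in the function code itself: everything at module level runs once per cold start, while the handler body runs on every request. A minimal sketch, where the DynamoDB table and the TABLE_NAME environment variable are purely illustrative:

```python
# Runs once per cold start: imports, SDK clients, configuration parsing.
import os

import boto3

TABLE_NAME = os.environ.get("TABLE_NAME", "orders")  # illustrative env var
dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table(TABLE_NAME)

def handler(event, context):
    # Runs on every request; a warm instance skips all of the setup above.
    item = table.get_item(Key={"id": event["id"]})
    return {"statusCode": 200, "body": str(item.get("Item"))}
```

Heavy initialization at module level is amortized across warm invocations, but it also lengthens the cold start, so keep it as lean as you can.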

Cases when we hit a cold start

We typically hit a cold start on the first request after a deployment or configuration change, when a request arrives while all existing instances are busy (forcing a scale-out), and when a request comes in after the function has been idle long enough for its instances to be reclaimed.

How to handle cold starts

  1. Lambda warmer (best effort)
  2. Configuring Provisioned Concurrency (preferred): Provisioned Concurrency was released at AWS re:Invent 2019.

1. Lambda warmer:

This is a simple technique where we make n ping requests to a particular Lambda in parallel, triggered by a CloudWatch cron rule that fires every 5 or 10 minutes.
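The warmer can itself be a small Lambda wired to that cron rule. A minimal sketch; the target function name, the ping payload, and the count of five warm instances are all assumptions, and the target handler is expected to recognise the ping and return quickly (ideally after a tiny pause so the parallel calls overlap and fan out to separate instances):

```python
# Warmer Lambda: triggered by a CloudWatch Events/EventBridge cron rule,
# e.g. rate(5 minutes). It makes N synchronous pings in parallel so they
# overlap and fan out to N separate instances, keeping them warm.
import json
from concurrent.futures import ThreadPoolExecutor

import boto3

lambda_client = boto3.client("lambda")
TARGET_FUNCTION = "my-api-function"  # hypothetical target function
WARM_INSTANCES = 5                   # how many instances to keep warm

def ping(_):
    lambda_client.invoke(
        FunctionName=TARGET_FUNCTION,
        InvocationType="RequestResponse",      # synchronous, so pings overlap
        Payload=json.dumps({"warmer": True}),  # target should short-circuit on this
    )

def handler(event, context):
    with ThreadPoolExecutor(max_workers=WARM_INSTANCES) as pool:
        list(pool.map(ping, range(WARM_INSTANCES)))
    return {"warmed": WARM_INSTANCES}
```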

This isn't a perfect solution and only does the job partially, but it was the best effort available: the warmed instances can still be reclaimed by AWS, and any traffic spike beyond the number of warmed instances will still hit cold starts.

2. Provisioned Concurrency

Provisioned Concurrency is a pre-defined number of concurrent (warm) Lambda instances that AWS keeps ready for us to serve our requests with low latency.

Before this, if we had a latency-sensitive application we had to resort to hacks like a Lambda warmer, which is best effort at most. Thankfully, AWS heard this concern and released the Provisioned Concurrency feature at re:Invent 2019.

Although Provisioned Concurrency solves our cold start problem, it does come at an extra cost.

See Provisioned Concurrency pricing, which includes a good worked example.

Reducing Provisioned Concurrency Cost

Provisioned Concurrency is the right solution for latency-sensitive workloads, but to reduce cost we do not want to keep it enabled all the time.

Take food delivery services like Swiggy, Zomato, or Uber Eats as an example. We can be fairly sure that around lunch and dinner time on any given day there will be a spike in requests to the ordering service. In such a case, we can schedule CloudWatch cron events that add or remove the Provisioned Concurrency configuration for that particular service (Lambda), as sketched below.
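One way to do this is a tiny scheduler Lambda wired to two CloudWatch cron rules: one fires before the rush with an "add" action, the other fires after it with a "remove" action. A minimal boto3 sketch, where the function name, alias, and concurrency value are placeholders:

```python
# Scheduler Lambda: invoked by two CloudWatch Events cron rules, for example
#   cron(30 11 * * ? *) with input {"action": "add"}     -> before the lunch rush
#   cron(30 14 * * ? *) with input {"action": "remove"}  -> after the lunch rush
import boto3

lambda_client = boto3.client("lambda")
ORDERING_FUNCTION = "ordering-service"  # hypothetical function name
ALIAS = "live"                          # Provisioned Concurrency is set on an alias/version
PEAK_CONCURRENCY = 50                   # illustrative value for the rush hours

def handler(event, context):
    if event.get("action") == "add":
        lambda_client.put_provisioned_concurrency_config(
            FunctionName=ORDERING_FUNCTION,
            Qualifier=ALIAS,
            ProvisionedConcurrentExecutions=PEAK_CONCURRENCY,
        )
    else:
        lambda_client.delete_provisioned_concurrency_config(
            FunctionName=ORDERING_FUNCTION,
            Qualifier=ALIAS,
        )
    return {"action": event.get("action", "remove")}
```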

Note: Provisioned Concurrency and Reserved Concurrency are different things. Read more about this here.

Configuring Provisioned Concurrency

Serverless Framework: Provisioned Concurrency.

CLI: Provisioned Concurrency
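Whichever way you configure it, the provisioned instances only start absorbing traffic once their status reaches READY. A minimal boto3 sketch to check that; the function name and alias are placeholders:

```python
# Check whether the provisioned instances are initialized and ready to serve.
import boto3

lambda_client = boto3.client("lambda")

config = lambda_client.get_provisioned_concurrency_config(
    FunctionName="ordering-service",  # hypothetical function name
    Qualifier="live",                 # hypothetical alias
)
print(config["Status"])                                    # IN_PROGRESS | READY | FAILED
print(config["AvailableProvisionedConcurrentExecutions"])  # instances ready right now
```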

Conclusion:
