Sunday, March 14, 2021

Wake up, sleepy lambda

AWS Lambda functions with painfully slow startup times are a problem. But there's hope.

Clojure anywhere

After going through some of the 12 Stages of learning Clojure, I have found this Lisp-style language to be very nice, and it has become my favorite programming language.

Some of the nice things with Clojure are that data is immutable, the REPL is like magic and the code you write looks minimalistic. Also, you can run Clojure code almost anywhere: as backend services, web frontends and even in shell scripts.

Even in Lambda functions?

I found out that it is not that difficult to make Clojure code run in AWS Lambda as well. The code can live in a Java runtime. Lambda events are routed to a handler function written in Clojure when the namespace implements a Java request handler interface (here's an example). Yes, some interop is needed at the entry point of the Lambda code to make it work. But don't worry.
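A minimal sketch of such an entry point, assuming the `RequestStreamHandler` interface from the `aws-lambda-java-core` library (the linked example may use a different handler interface). The namespace `hello.handler`, the class name `hello.Handler` and the `handle` function are hypothetical names chosen for illustration, and the event is assumed to be valid JSON.

```clojure
(ns hello.handler
  (:require [clojure.java.io :as io])
  ;; :gen-class only takes effect during AOT compilation; it generates a
  ;; Java class named hello.Handler that Lambda can instantiate.
  (:gen-class
   :name hello.Handler
   :implements [com.amazonaws.services.lambda.runtime.RequestStreamHandler]))

(defn handle
  "Plain Clojure: takes the raw event as a JSON string, returns a JSON string."
  [event-json]
  (str "{\"message\":\"Hello from Clojure\",\"event\":" event-json "}"))

;; The interop part: with the default "-" prefix, this function backs
;; handleRequest(InputStream, OutputStream, Context) on the generated class.
(defn -handleRequest
  [_this in out _context]
  (let [event-json (slurp in)]
    (with-open [^java.io.Writer w (io/writer out)]
      (.write w ^String (handle event-json)))))
```

Only `-handleRequest` touches Java types; everything it calls is ordinary Clojure. With a setup like this, the handler configured in Lambda would presumably be the generated class name, here `hello.Handler`.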

In addition to the Java interop, the source code also has to be ahead-of-time (AOT) compiled and packaged. I've used the uberdeps library, which makes the process smooth when using tools.deps.
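As a rough sketch, a `deps.edn` for this kind of setup could look like the following. The version numbers, the `classes` output directory and the `hello.handler` namespace name are assumptions for illustration; check the uberdeps README for the currently recommended alias.

```clojure
;; deps.edn (a sketch; version numbers are assumptions)
{:paths ["src" "classes"]   ; "classes" holds the AOT-compiled output
 :deps  {org.clojure/clojure                {:mvn/version "1.10.3"}
         com.amazonaws/aws-lambda-java-core {:mvn/version "1.2.1"}}
 :aliases
 {:uberdeps {:replace-deps  {uberdeps/uberdeps {:mvn/version "1.0.4"}}
             :replace-paths []
             :main-opts     ["-m" "uberdeps.uberjar"]}}}

;; Build steps, run from the project root:
;;   mkdir -p classes
;;   clojure -M -e "(compile 'hello.handler)"   ; AOT compile (hypothetical namespace)
;;   clojure -M:uberdeps                        ; package sources, classes and deps into a jar
```

The compile step has to run before packaging, since uberdeps packages whatever is on the project's paths and does not compile anything itself.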

If you are not familiar with the Java lingo (like me: I have a background in Node.js, frontend and .NET), words like jar, AOT and even Java can be intimidating. I guess it is possible to sidestep all of this by writing the Lambda function in ClojureScript and running it in a Node.js runtime. But I don't want to opt out of the rich ecosystem of Clojure libraries built for the server side.

Sleepy lambda, slow cold starts

It seems that a Lambda running in a Java or .NET runtime often has painfully slow startup times. Setting a timeout of 3 seconds is probably not even enough. A simple solution to this problem is to use AWS provisioned concurrency with a Lambda alias. No code changes required, only configuration and money. I wanted to find out if there are other ways to solve this problem.

What about GraalVM?

I found this great post about Clojure and Lambda written by Esko Luntola. The code is compiled with GraalVM and runs in Docker, which solves the issues with slow cold starts and even makes requests in general super fast. Wow!

However, I haven't succeeded in going through the steps described in the post, and I'm stuck on build failures that I don't know how to solve yet. But I will try this approach some more. Even though it requires some initial setup with configs and Docker containers, this seems to be the way to go.

What about GraalVM in a custom runtime?

When digging deeper into how to run code in AWS Lambda, I found out that you can create your own runtime. To me, this approach looks simple and straightforward.

In this repo, I have written a Hello World example with:

  • a custom runtime (the file called bootstrap) written in bash (grabbed from the official AWS docs, with some additional error handling).
  • Clojure code, with a main function as the entry point. It is not necessary to implement the Java request handler interface, and the main function returns data via standard output.

Input args as a JSON string
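A minimal sketch of such an entry point, assuming the bootstrap script passes the event to the program as its first command-line argument. The namespace `hello.core` and the `respond` function are hypothetical names, and the response is built by hand so the sketch needs no JSON library.

```clojure
(ns hello.core
  (:gen-class))   ; generates a class with a static main method when AOT compiled

(defn respond
  "Builds the response JSON by hand to keep the sketch dependency-free.
  Assumes event-json is itself valid JSON."
  [event-json]
  (str "{\"message\":\"Hello from Clojure\",\"event\":" event-json "}"))

(defn -main
  [& args]
  ;; Fall back to an empty JSON object when no event is passed in.
  (let [event-json (or (first args) "{}")]
    ;; The custom runtime reads the result from standard output.
    (println (respond event-json))))
```

In a real function you would probably parse the event with a JSON library instead of passing it through as a string, as long as that library works with Native Image.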

The Clojure code is compiled, and built with GraalVM by using the Native Image feature. Have a look at the Makefile for details.

Compile and build with GraalVM Native Image

I've tried to keep things simple and followed the guides at clj-graal-docs and watched Michiel Borkent's excellent beginner's guide to GraalVM on YouTube.

The custom runtime and the function can be deployed separately. You can reuse the same runtime for several Lambda functions by creating a Lambda Layer in AWS. The function code can be deployed directly: just upload the zipped file to AWS Lambda.

I like this approach; it works fine in my simple hello world example. But I haven't tried it in a real-world scenario. When going beyond an experiment, I think additional configuration flags for Native Image might be required, probably the ReflectionConfigurationFiles setting.
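One way to reduce how much reflection configuration is needed is to avoid reflection in your own code in the first place. A small sketch of that idea, using Clojure's built-in `*warn-on-reflection*` flag and a type hint (the namespace and function names are hypothetical):

```clojure
(ns hello.reflection)

;; Ask the compiler to print a warning for every reflective interop call.
(set! *warn-on-reflection* true)

;; The compiler cannot tell what type s is, so this call is resolved with
;; reflection at run time and triggers a reflection warning when compiled.
(defn shout-reflective [s]
  (.toUpperCase s))

;; With a type hint the call is compiled to a direct method call.
(defn shout-hinted [^String s]
  (.toUpperCase s))
```

Both functions return the same result on the JVM. The difference matters for Native Image, where reflective calls are typically the ones that need explicit configuration, so fixing the warnings should mean less to configure. Reflection inside third-party libraries may still need entries of its own.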

The tradeoffs?

From what I can see in my totally non-scientific weekend testing, the code running in my Custom Runtime has a cold start of somewhere between 100 and 300 milliseconds. Good enough. When warmed up, requests are processed in 15 to 30 milliseconds. Not bad.

When comparing with my other example lambda that is running in a Java runtime (with cold starts usually taking over 2000 milliseconds), the Custom Runtime with GraalVM is way faster.

But once warmed up, the Java runtime is actually super fast, with durations between 1 and 30 milliseconds. Also, I haven't yet solved how to build all of this in a CI/CD setup. The code was built with GraalVM locally on my machine.

I would very much appreciate your input on the experiments shared in this blog post.

Photo by Abdelrahman Hassanein on Unsplash