Exploring Serverless with Python, StepFunctions, and Web Front-end
Serverless slack invite and community signup project. Serverless Python project boilerplate. Story of my serverless exploration. Tips, tricks, links. Originally posted by dzimine@medium
Slack invite has been a canonical example of a serverless application. But what if you want to do something more involved? Like our StackStorm community sign-up, where we ask Slack to invite a user, but also record it into ActiveCampaign CRM with appropriate tags, and create a user record in the database? A perfect case for a serverless starter project, more realistic than aws-python-rest-api-with-dynamodb yet less daunting than the “Hello Retail” showcase. And, for a change, it’s in Python!
The code is at https://github.com/dzimine/slack-signup-serverless. There, you will find:
- Python boilerplate project for Lambda functions with lean dependency management, unit testing consistent with the lambda deployment, and a tox-based build for packaging and continuous integration.
- AWS StepFunctions to define the logic of Lambda invocations.
- Web front-end deployed on S3 with API Gateway.
Serverless community signup application diagram.
Deployment is fully automated. You can use it “as is”: clone the code, set up your credentials, and run (enjoy, Evan Powell!). Or use the code as a starting point: modify and add steps, re-style the web form, adjust to your liking, or completely repurpose it.
Read on for the story of my serverless exploration. Enjoy and use the tips, tricks, and links spread along the way. Or jump straight to the Summary and comment on my takeaways.
1. Lambda
Getting the Lambda functions going is a breeze with serverless.com. To appreciate it, I first tried to implement the original serverless-slack-invite in Python with raw AWS Lambda. I spent a day on it, and API Gateway configuration was so painful that < some serious swearing removed from here>. It took me an hour to redo it with Serverless: a 10x productivity boost, and the joy of having a clear, reproducible infrastructure as code. A few notes on effective Lambda with Python:
Project structure. Should you follow “hello-retail” and create an uber-project, where each lambda is an individual serverless service? Or go for a single serverless project (== single service) with fine-grained management of the parts? Trying both, I figured that a single project wins for 3–7 services. As the number of services grows beyond a dozen, with different developers working on different services with different deployment cadences, it’s time to break it up. Be aware that all the management of the uber-project (build, deployment, service dependencies, versioning) is not covered by the serverless framework and will be your fun to invent and maintain.
Environment variables: an elegant way to manage environment variables is a single env.yml YAML file with a section for each service. You can then use environment: ${file(env.yml):my_service} to pass the service-specific section of env.yml to Lambda as a single set of environment variables. The file is “.gitignored” to prevent committing your secrets to GitHub.
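On the consuming side, the handler only sees ordinary environment variables. Here is a minimal sketch of reading them, assuming the my_service section of env.yml defines the keys SLACK_TOKEN and SLACK_TEAM. Both names are hypothetical and used only for illustration; they are not taken from the project.

```python
import os

# Hypothetical variable names: the real keys are whatever your env.yml section defines.
REQUIRED_VARS = ("SLACK_TOKEN", "SLACK_TEAM")


def load_config(environ=None):
    """Collect the required settings, failing fast if any of them is missing."""
    environ = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Passing the mapping as an argument keeps the function easy to unit test locally without touching the real environment.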
Python Dependencies: Making Lambda packages lean with just enough dependencies keeps you safely within Lambda package size limits, and makes deployments enjoyably fast.
The first step is to tell serverless to manage Lambda functions individually. It is achieved by setting package/individually: true, and defining a package section per function in the serverless.yml.
Next, prepare Python dependencies. There are a couple of serverless plugins to help with this. For simple cases I recommend serverless-python-requirements. If your Lambda is called as a web service, and you configure API Gateway as lambda-proxy, taking the request-response management onto the Lambda function code, the serverless-wsgi plugin (https://github.com/logandk/serverless-wsgi) is a good way to go.
I preferred not to use plugins and to manage dependencies from the code, building upon this serverless forum hint. “Lean” lambda packages make deployment fast and keep you safe within AWS lambda package size limits. Most importantly, this setup makes local unit testing match the remote execution. The build.sh script automates the dependency building, and tox is used for a full Python build & test. Note that setup.cfg is required per this AWS hint.
Programming models and invocation discrepancies. In my development and testing process, I called my Lambda functions in different ways: as a separate service, as part of testing from the serverless CLI and AWS console, and as a step in a StepFunction. A gotcha to watch out for here: the type and structure of the input and return values, as well as exception handling, are all different for different kinds of invocations. When Lambda is called from the API Gateway, the Serverless framework encourages the lambda-proxy method: "We highly recommend using the lambda-proxy method if it supports your use-case, since the lambda method is highly tedious". With “lambda-proxy”, a full HTTP request is passed to the handler in the event input parameter, leaving the handler to parse the body and return a proper HTTP response for success and failures, e.g. 500 in case of service errors. (If you take this route, consider the serverless-wsgi plugin mentioned above.)
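To make the lambda-proxy contract concrete, here is a minimal sketch of such a handler. The `invite` function and the `email` field are hypothetical stand-ins rather than the project's code. The response shape (statusCode, headers, and a string body) is what the proxy integration expects.

```python
import json


def invite(email):
    """Hypothetical stand-in for the real Slack call."""
    return {"invited": email}


def _response(status, payload):
    """Build a lambda-proxy response: the body must be a string."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Parse the HTTP body ourselves and map outcomes to HTTP status codes."""
    try:
        body = json.loads(event.get("body") or "{}")
        email = body["email"]  # hypothetical input field
    except (ValueError, KeyError, TypeError):
        return _response(400, {"error": "email is required"})
    try:
        result = invite(email)
    except Exception as exc:  # report service failures as 500
        return _response(500, {"error": str(exc)})
    return _response(200, result)
```

Note that nothing here raises to the caller: with lambda-proxy, failures are reported through the status code rather than through Python exceptions.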
Alternatively, you can stick to the AWS mainstream of doing string-to-JSON transformation on API Gateway, and signaling failures by throwing Python exceptions. This simplifies the Python lambda handler, but complicates the API Gateway configuration in serverless.yml. Pick your poison.
When Lambda is called from a StepFunction, the event contains the input to the StepFunction step. The handler returns an object for the step output, and throws an exception on failures. Special care is needed for passing the API input to Lambdas down the StepFunction (more on this gotcha in the “StepFunction” section below).
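For contrast with the lambda-proxy style, a StepFunction-oriented handler can be sketched as plain data in, plain data out. The `SignupError` class and the `email` field are hypothetical names chosen for illustration.

```python
class SignupError(Exception):
    """Raised to fail the step; a hypothetical name used for illustration."""


def handler(event, context):
    """The event is the step input (already a dict); the return value is the step output."""
    email = event.get("email")  # hypothetical input field
    if not email:
        # An uncaught exception marks the step as failed.
        raise SignupError("email is required")
    # A real step would call Slack here; this sketch only reports the outcome.
    return {"email": email, "invited": True}
```

There is no HTTP envelope here: no status codes, and no JSON parsing in the handler.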
Finally, we often call Lambda from the serverless CLI, as part of the convenient dev workflow offered by serverless:
sls invoke stepf --name signup --data '{"foo":"bar"}'
Serverless first tries to parse the input as JSON and passes a dict, falling back to a string if parsing fails. It’s a bummer if you have followed the Serverless recommendation of using lambda-proxy and expect the event as a JSON string in your handler. No worries, though: if you tried it you already know it; if you haven’t, this bug will likely be fixed by the time you try.
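One defensive option is to accept both shapes in the handler. This is a minimal sketch, and `parse_payload` is a hypothetical helper rather than part of the project or the framework:

```python
import json


def parse_payload(event):
    """Return the parsed payload whether it arrives as a dict or as a JSON string."""
    if isinstance(event, dict):
        return event  # already parsed, e.g. by the CLI
    if isinstance(event, (str, bytes)):
        return json.loads(event)  # raises ValueError on malformed JSON
    raise TypeError("Unsupported event type: " + type(event).__name__)
```

With this in place, the same handler code can be exercised from the CLI and from other callers without caring who did the JSON parsing.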
2. Wire logic with Step Function workflow.
Workflows are instrumental in stitching the logic of individual Lambda invocations into complete serverless solutions. The serverless framework doesn’t support StepFunctions in the “core”, but there is a serverless-step-functions plugin that does. I opted for using it over the raw CloudFormation approach shown in HelloRetail.
AWS StepFunctions deserves a dedicated review, but I’ll give you a BIG BOLD WARNING right here. On any change, the execution history is gone. That is: step functions are immutable; any change creates a new instance, the previous instance is deleted (ok so far), and the execution history of the previous instance is gone with the deleted instance (Really? WTF!!!)
If this doesn’t stop you from using StepFunctions, read on to pick up a few tricks.
Passing workflow input to steps. The StepFunction data flow is designed on a “need-to-know” basis: no extra data is passed by default. The first Lambda receives the workflow input (the body of the REST API call). The second Lambda only gets the output of the first Lambda. But I want all the Lambda steps to access the workflow input! The trick here is to use StepFunction’s ResultPath: it is a way to control the key where the result of the Lambda execution is placed in the workflow state. The default ResultPath=$ sets the key to the root context, so the result overwrites the initial workflow input. But if we place the result of each lambda under its own key, like ResultPath: $.results.InviteSlack, the workflow input is preserved and can be used by all Lambda steps.
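The effect of ResultPath on the state can be modeled in a few lines of Python. This is a simplified illustration of the behavior described above, not the Step Functions implementation. The `apply_result_path` function is a hypothetical helper that handles only simple dotted paths such as $.results.InviteSlack.

```python
import copy


def apply_result_path(state_input, result, result_path="$"):
    """Model how a step's result is merged into the state passed to the next step."""
    if result_path == "$":
        return result  # default: the result replaces the whole input
    if not result_path.startswith("$."):
        raise ValueError("Only simple '$.a.b' paths are supported in this sketch")
    output = copy.deepcopy(state_input)  # leave the caller's input untouched
    keys = result_path[2:].split(".")
    node = output
    for key in keys[:-1]:
        node = node.setdefault(key, {})  # create intermediate nodes as needed
    node[keys[-1]] = result
    return output


workflow_input = {"email": "your@email.com"}
overwritten = apply_result_path(workflow_input, {"ok": True})
preserved = apply_result_path(workflow_input, {"ok": True}, "$.results.InviteSlack")
```

In this model, `overwritten` no longer contains the email, while `preserved` keeps it at the top level and adds the step result under results.InviteSlack, which is what lets later steps read the original input.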
Invoking step-functions from serverless is a joy: the plugin starts the workflows, watches the execution and prints out the results of each step as it progresses:
sls invoke stepf --name signup --data '{"email":"your@email.com", "first_name":"Donald", "last_name":"Trump"}'
Exposing via API, calling from Web, and dealing with CORS. Exposing step function methods with the API Gateway is just as easy with a plugin as with serverless lambda. But I need to call it from the Web UI hosted on S3. This requires CORS configuration that the plugin didn’t support at the time (now it does). I wondered what to do. Have a “lambda” in front of the StepFunction just to invoke it? And pay for it? No-o-o way. Go to the dark side and use CloudFormation to do all the API Gateway configuration? No-o-o fun. The answer? Read on to “Add Web, no CORS”.
What to choose? Knight at the Crossroads, the classic Russian epic on False Choice Fallacy.
3. Add Web, no CORS.
The third, winning way was presented by the serverless-apig-s3 plugin. The plugin does two things. It puts web content on S3, as pointed to in serverless.yml (./web in my case), and enables web access. It also configures API Gateway to serve the static content from the same domain, sidestepping any CORS issues. There are some other plugins for handling the web front-end, but to me sidestepping CORS wins.
A couple of gotchas to watch out for:
- When you add or remove web files, client deploy is not enough; a full sls deploy is required. It makes sense when you know how things work: the API Gateway part of the plugin changes the CloudFormation stack, which needs to be updated, and the sls client deploy part puts the files in the S3 bucket.
- I added an image to the form while customizing it for StackStorm community signup. The image didn’t go through. It took me a frustrating while to figure out why: binary payloads are not enabled by default and need to be configured. If you happen to need this, configure it manually via CloudFormation, try the serverless-apigw-binary plugin, or do a PR to fix https://github.com/sdd/serverless-apig-s3/issues/12.
Summary
The Serverless framework works like a charm for Lambda. As you expand beyond “FaaS” to a broader “Serverless” set of services, things get harder. You either rely on plugins, where mileage varies, or go to the dark side of AWS CloudFormation. Yet the Serverless framework is way ahead of the competition, recognizing and supporting the fact that Serverless > FaaS.
The serverless framework is well-designed. It makes the common tasks and configurations really easy, and provides solid extensibility to modify and expand the “core” function set via plugins. For advanced configurations and things not yet supported, the user can fall back to cloud-native configurations. You will fall back to writing CloudFormation templates for anything but a trivial project, but the framework helps to place them in the code and weave them into the development lifecycle.
The plugin ecosystem is dynamic: many issues I hit had been fixed over the time of writing the blog. Check the curated list of serverless plugins when you miss something: it may have it already, or give you a clue. The ecosystem still feels “wild” and “early”, and would benefit from more attention, curation and contributions from the @goserverless core team. I am sure they’ll get there.
Hmm… What’s the difference between the two plugins, really?
The user community is large, lively and helpful. Between documentation, Gitter chat, forum, and GitHub, I quickly found answers to my questions.
Linguistic nit: With serverless, the framework name, and serverless, the big trend, writing about using serverless for serverless gets close to poetry: “A serverless is a serverless is a serverless”[*].
The best part? Once the code is written, it deploys like a charm with serverless. Clone the code, configure credentials, build dependencies, deploy:
serverless deploy
serverless client deploy
and enjoy the benefits of #serverless computing.
Check the latest stories by Dmitri Zimine or follow @dzimine on Twitter.
https://github.com/Miserlou/Zappa if you are doing python on lambda ;). (Understanding the mechanics makes sense of course anyway!)
Thanks for the pointer; to me Zappa looks like a python version of Zeit’s now. Which is perfect for something small, but likely won’t cut it for the case I describe here or for larger solutions. Note that a similar thing exists as a plugin to the serverless framework: and for completeness, https://serverless.com/blog/serverless-python-packaging/ is a good blog post on how to manage dependencies with this plugin.
Yep, I wanted to do it “raw” to understand the mechanics.
Hm, OK, from their home page I thought they were for JS only. (Now, after looking at the README on github, I see that it supports quite a few languages/stacks.)