How shared Lambda functions help microservices access control
Shared Lambda functions can help combat error message issues that arise when developers create custom authentication and authorization for microservices. Here's how they work.
Like they do with any other application, developers need to think about how to control access to their microservices. But while Amazon API Gateway enables you to customize authentication and authorization operations for your microservices to meet your unique needs, there is an issue when it comes to error messages.
Amazon API Gateway users can add microservices access control to any APIs with custom authorizers. A request brings up a Lambda function to control access to the API. This approach provides authentication and authorization across multiple API endpoints. However, API Gateway won't return custom error messages from a custom authorizer function. When a failed authentication returns the same error message as a failed authorization, there's no way to decipher whether a denied user lacks the credentials for a certain function or the permissions for it.
Use a shared Lambda function
Instead of relying on API Gateway's custom authorizers for microservices access control, set up a shared Lambda function that the other Lambda functions can call directly. This approach requires the developer to add a bit of bootstrapping code to each Lambda function, but those Lambda functions can then execute directly instead of only through API Gateway. More importantly, this setup enables custom error messages for each type of failure. If the user isn't logged in or has invalid credentials, you can send a 401: Unauthorized response. If the user tries to do something they don't have permission for, such as viewing someone else's email, you can return a 403: Forbidden response.
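As a rough illustration of that bootstrapping code, the sketch below maps the result of the shared login call to either a 401 or a 403 response. The result shape (`authenticated`, `authorized`, `user`) and the names `authFailureResponse` and `withAuth` are assumptions made for this example; the article does not specify what the login function returns.

```javascript
// Hypothetical result shape returned by the shared login function:
//   { authenticated: boolean, authorized: boolean, user: object }
function authFailureResponse(loginResult) {
  // No result or failed authentication: the caller has no valid credentials.
  if (!loginResult || !loginResult.authenticated) {
    return {
      statusCode: 401,
      body: JSON.stringify({ error: 'Unauthorized', message: 'Missing or invalid credentials.' }),
    };
  }
  // Valid credentials, but not allowed to perform this action.
  if (loginResult.authorized === false) {
    return {
      statusCode: 403,
      body: JSON.stringify({ error: 'Forbidden', message: 'You do not have permission for this action.' }),
    };
  }
  // null means the request may proceed.
  return null;
}

// Wraps a Lambda handler so the login check runs first.
// `checkLogin` is whatever function invokes the shared login Lambda.
function withAuth(checkLogin, handler) {
  return async (event) => {
    const loginResult = await checkLogin(event);
    const failure = authFailureResponse(loginResult);
    if (failure) {
      return failure;
    }
    return handler(event, loginResult.user);
  };
}
```

Keeping the status mapping in one small function means every microservice reports authentication and authorization failures the same way.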
The single Lambda function for all microservices access control simplifies logging, too. No other Lambda functions need access to the user table in a DynamoDB database, unless they need to write changes to the user's account. Since a single Lambda function contains the authorization microservice, you can upload it directly into the CodeBuild CI service, using a custom set of build configurations. Here's an example of what that looks like:
version: 0.2
phases:
  install:
    commands:
      # Install dependencies needed for running tests
      - npm install
  pre_build:
    commands:
      # Discover and run unit tests in the 'tests' directory
      - npm test
  build:
    commands:
      # Build the js files
      - ./node_modules/.bin/tsc
      # Zip everything up excluding the node_modules/@types directory
      - zip -r archive.zip *.js src/*.js node_modules -x "node_modules/@types/*"
      # Use the AWS CLI directly to upload our lambda function
      - aws lambda update-function-code --function-name login --zip-file fileb://archive.zip --publish
This action updates the $LATEST version of the Lambda function, but this example's production environments point to a Lambda alias. This means that developers must test new code in the development environment first. You can then tag a Lambda build with the staging alias to make the beta API endpoints use it, or tag the build with the prod alias to direct all of the production API endpoints to it.
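One possible way to move an alias to a tested build is the `updateAlias` operation of the AWS SDK for JavaScript (v2). The helper name `pointAliasAtVersion` is made up for this sketch, and it assumes the alias already exists and that `lambda` is a client created with `new AWS.Lambda()`.

```javascript
// Points an existing alias (for example 'staging' or 'prod') of the
// login function at a published version number.
// `lambda` is assumed to be an AWS SDK v2 Lambda client.
async function pointAliasAtVersion(lambda, aliasName, version) {
  return lambda.updateAlias({
    FunctionName: 'login',
    Name: aliasName,
    // The API expects the version as a string, such as '7'.
    FunctionVersion: String(version),
  }).promise();
}
```

For example, once version 7 has passed testing in staging, `await pointAliasAtVersion(lambda, 'prod', 7)` would direct callers that use the prod qualifier to that version.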
The example below has a shared library, which uses environment variables to determine which version of the Lambda function to call, based on the API environment. The prod tag calls the production version:
const lambda_resp = await lambda.invoke({
  FunctionName: 'login',
  Qualifier: process.env.API_STAGE || 'prod',
  Payload: JSON.stringify({
    ...event,
    API_STAGE: process.env.API_STAGE,
  }),
}).promise();
if (lambda_resp && lambda_resp.Payload) {
  return JSON.parse(lambda_resp.Payload.toString());
}
Since every other function uses the login function, run manual integration tests in development and staging environments before you release the function to production. Since login is a simple Lambda function, APIs running on Docker containers and even applications running on Elastic Compute Cloud instances can also depend on it.
An extensible permission model
Now that there's a function that signals whether a user has permission to enter the system, what about permissions for each API? You can set up permissions for individual users or user types to enhance microservices access control.
For example, an organization wants a simple user control model based on both fine-grained and high-level permissions. To do this, the organization maps permissions into a tree in which a grant can sit on a broad category or on a specific option beneath it. For example, low-level users may have permission to search only product1 and product2:
search: {
  products: {
    product1: Granted,
    product2: Granted,
  }
}
The next level up, group user, has access to the vendorlist:
search: {
  products: {
    product1: Granted,
    product2: Granted,
    vendorlist: Granted,
  }
}
And the top-level users, enterprise, can search any product, so their permissions are set as such:
search: {
  products: Granted
}
When checking permissions, Lambda functions always use the most granular check. Therefore, if a user searches the vendorlist product, the authorization query is:
user.hasPermission('search.products.vendorlist')
Since the low-level user plan doesn't include a Granted flag on search, products or vendorlist, the check returns false. Group users have a Granted flag on search -> products -> vendorlist, so the check returns true. For the enterprise user, the function checks search and then finds a Granted flag on products, so it doesn't need to check vendorlist; the grant on products implies that everything beneath it is Granted.
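The lookup described above can be sketched as a walk down the permission tree that stops as soon as it reaches a Granted flag. This is a minimal illustration, not the article's actual implementation: it uses a standalone `hasPermission(tree, path)` function rather than a method on a user object, and it represents the Granted flag as a string constant.

```javascript
// Assumption: the Granted flag is stored as this string constant.
const Granted = 'Granted';

// Returns true when `path` (dot-separated) or any of its ancestors
// is marked Granted in the permission tree.
function hasPermission(tree, path) {
  let node = tree;
  for (const key of path.split('.')) {
    // A grant on an ancestor implies everything beneath it.
    if (node === Granted) {
      return true;
    }
    // A missing branch means the permission was never granted.
    const isBranch = node !== null && typeof node === 'object';
    if (!isBranch || !Object.prototype.hasOwnProperty.call(node, key)) {
      return false;
    }
    node = node[key];
  }
  // The full path was walked; it must end on a Granted flag.
  return node === Granted;
}

// The three plans from the examples above.
const lowLevel = { search: { products: { product1: Granted, product2: Granted } } };
const group = { search: { products: { product1: Granted, product2: Granted, vendorlist: Granted } } };
const enterprise = { search: { products: Granted } };
```

With these trees, `hasPermission(lowLevel, 'search.products.vendorlist')` is false, while the same check against `group` and `enterprise` is true.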
You can also add permission checks to the login function call by passing in a REQUIRE_PERMISSIONS variable:
const lambda_resp = await lambda.invoke({
  FunctionName: 'login',
  Qualifier: process.env.API_STAGE || 'prod',
  Payload: JSON.stringify({
    ...event,
    API_STAGE: process.env.API_STAGE,
    REQUIRE_PERMISSIONS: 'search.products.vendorlist'
  }),
}).promise();
You can store permission trees with a default value based on the user permission levels, as well as provide overrides on a per-user basis. For example, managers can create other users on their same account, but nonmanagers cannot.
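One way to combine a level's default tree with per-user overrides is a recursive merge, sketched below. The function name `mergePermissions` and the `users.create` permission key are hypothetical, chosen only to illustrate the manager example.

```javascript
// Merges per-user overrides on top of the default tree for the
// user's permission level. Neither input is modified.
function mergePermissions(defaults, overrides) {
  if (overrides === undefined) {
    return defaults;
  }
  const isBranch = (value) => value !== null && typeof value === 'object';
  // A leaf on either side means the override replaces the default outright.
  if (!isBranch(defaults) || !isBranch(overrides)) {
    return overrides;
  }
  const merged = { ...defaults };
  for (const key of Object.keys(overrides)) {
    merged[key] = mergePermissions(defaults[key], overrides[key]);
  }
  return merged;
}

// Default tree for a hypothetical permission level.
const levelDefaults = { search: { products: { product1: 'Granted' } } };
// Override stored on a manager's own user record (hypothetical key).
const managerOverrides = { users: { create: 'Granted' } };
const managerTree = mergePermissions(levelDefaults, managerOverrides);
```

Here `managerTree` keeps the default search permissions and gains `users.create`, while a nonmanager with no overrides simply receives `levelDefaults`.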
Caching is key
Lambda functions can execute quickly, but they are often slow to pull in authentication details from other identity and access management systems, such as Auth0 or Amazon Cognito. Since all of the other microservices call the login function, speed is of the essence. To help performance, deploy a DynamoDB table that caches requests to validate a user's credentials and stores information on the user's permission tree.
Instead of taking time to compile user account information, such as permission settings and whether the account is active, the authentication request simply has to read from the DynamoDB table, which takes only a few milliseconds rather than a few seconds. As a result, the Lambda login function typically returns in less than 20 milliseconds for repeat authentication requests. DynamoDB's Time To Live (TTL) feature can expire those cached authorizations after 15 minutes.
When designing a login setup, take into account that DynamoDB does not delete expired items in real time, so a cached item may remain readable after the TTL time specified. When you read a cached session, check that its TTL timestamp has not already passed. Try Memcached or another in-memory cache system to help keep track of data.
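A minimal sketch of that validity check follows. DynamoDB TTL attributes hold a Unix epoch time in seconds; the attribute name `expiresAt` and both function names are assumptions for this example.

```javascript
// Cached authorizations live for 15 minutes.
const SESSION_TTL_SECONDS = 15 * 60;

// Computes the TTL attribute value (epoch seconds) for a new cache entry.
function newExpiry(nowMs = Date.now()) {
  return Math.floor(nowMs / 1000) + SESSION_TTL_SECONDS;
}

// Returns true only if the cached item exists and its TTL has not passed.
// DynamoDB may still return an item after it expires, so check on every read.
function isSessionFresh(item, nowMs = Date.now()) {
  if (!item || typeof item.expiresAt !== 'number') {
    return false;
  }
  return item.expiresAt > Math.floor(nowMs / 1000);
}
```

If `isSessionFresh` returns false, the login function would treat the entry as a cache miss and fall back to the full credential lookup.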