For me, this project was a neat little training exercise: I had to explore different service APIs of the AWS Cloud while also freshening up my TypeScript and Serverless skills. As a bonus, I learned how easy it is to develop Slack applications with custom slash commands, and where the common pitfalls are along the way.
In this article I will give a short overview of how the Bot is set up using multiple Lambda functions and how Slack apps work behind the scenes.
I decided to go fully serverless, which is why the Bot is built on top of the Serverless Framework. To keep the architecture clean and modular right from the start, every sub-command (or action) is performed by a dedicated AWS Lambda Function. Unfortunately, Slack does not take care of sub-commands or other input parameters; it just sends a POST request to the same HTTP endpoint every time a slash-command is used. One additional Lambda Function therefore acts as the central entry point: it parses the sub-command and, where needed, triggers another Lambda through a message in an SQS queue. Between Slack and this Lambda sit a CloudFront distribution and an API Gateway endpoint.
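To make the entry point's job more concrete, here is a minimal sketch of the parsing and routing step. It assumes the form-encoded body Slack sends for slash commands (with the fields command, text, response_url and user_id); the type names, the function names and the single-entry action list are made up for illustration and are not the Bot's actual code.

```typescript
// The slash-command fields this sketch cares about.
interface SlashCommand {
  command: string; // e.g. "/aws-playground"
  text: string; // everything the user typed after the command
  responseUrl: string; // temporary endpoint for delayed answers
  userId: string;
}

// Either answer with the help text or hand the work to a sub-command Lambda.
type Route =
  | { kind: "help" }
  | { kind: "dispatch"; subCommand: "describe-instances"; args: string[] };

// Decode the application/x-www-form-urlencoded request body sent by Slack.
function parseSlashCommand(body: string): SlashCommand {
  const params = new URLSearchParams(body);
  return {
    command: params.get("command") ?? "",
    text: (params.get("text") ?? "").trim(),
    responseUrl: params.get("response_url") ?? "",
    userId: params.get("user_id") ?? "",
  };
}

// The first word of the text is the sub-command; the rest are its arguments.
function route(cmd: SlashCommand): Route {
  const [first = "", ...args] = cmd.text.split(/\s+/);
  switch (first.toLowerCase()) {
    case "describe-instances": // the only action named in this article
      return { kind: "dispatch", subCommand: "describe-instances", args };
    default: // "help", empty input, or anything unknown
      return { kind: "help" };
  }
}
```

Keeping the parsing separate from the routing means the routing can be tested without constructing an API Gateway event.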
The central entry point Lambda immediately sends an HTTP response back to Slack, because Slack enforces a strict 3-second response timeout, which might be too short to perform a sub-command of the Bot. The immediate response can contain the full answer to a slash-command, but it is mostly meant to show the user (and Slack) that her request was delivered.
In the case of this Bot, the content of the immediate answer depends on the text the user entered after the slash-command itself (for example /aws-playground help). The Bot treats this text as a sub-command and uses it to trigger the part of the application that can fulfill the request. When the user sends no valid sub-command, or the command help, the first Lambda immediately answers with a list of available commands and a short explanation of how to use the application.
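The immediate answer could be built roughly like this. The sketch assumes an API Gateway Lambda proxy response shape (statusCode, headers, body) and Slack's JSON message format with response_type and text; the wording of both messages and the names HELP_TEXT and immediateReply are invented for this example.

```typescript
// Response shape expected from a Lambda behind an API Gateway proxy integration.
interface HttpResponse {
  statusCode: number;
  headers: { [name: string]: string };
  body: string;
}

// Hypothetical usage guide; the real Bot's help text is not shown in this article.
const HELP_TEXT = [
  "Usage: /aws-playground <action>",
  "Available actions:",
  "• help",
  "• describe-instances",
].join("\n");

// Pass the recognised action, or null when the input was "help" or invalid.
function immediateReply(action: string | null): HttpResponse {
  const text =
    action === null
      ? HELP_TEXT
      : `Request "${action}" received. Please hold the line while the report is generated.`;
  return {
    statusCode: 200,
    headers: { "Content-Type": "application/json" },
    // "ephemeral" shows the message only to the user who sent the command.
    body: JSON.stringify({ response_type: "ephemeral", text }),
  };
}
```

Because this function does no AWS calls at all, it returns well within the 3-second window regardless of how long the actual sub-command takes.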
When the user specifies a valid sub-command, the Lambda instead answers with an acknowledgement message, essentially telling the user to "hold the line". Slack always provides a temporary HTTP endpoint for answering a user's request at a later time, which makes sense, as many applications (including most actions this Bot can perform) take longer than 3 seconds to complete. As the temporary endpoint URL is different for every request, it is transmitted in the same SQS message that triggers the Lambda Function implementing the sub-command. These Lambdas send their output to that temporary Slack endpoint, which renders it as a Slack message to the user.
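The hand-over of the temporary endpoint URL can be sketched as follows. The Job format is an assumption made up for this example; only the Records[].body shape of an SQS-triggered Lambda event and the JSON POST to Slack's response_url are taken from the respective services. The sketch builds the outgoing request as plain data instead of sending it, so it runs without network access.

```typescript
// Hypothetical job format the entry point puts on the SQS queue.
interface Job {
  action: string;
  args: string[];
  responseUrl: string; // Slack's temporary endpoint for this one request
}

// Entry point side: serialise the job as the SQS message body.
function toMessageBody(job: Job): string {
  return JSON.stringify(job);
}

// Worker side: the part of an SQS-triggered Lambda event used here.
interface SqsEvent {
  Records: { body: string }[];
}

// Decode every queued job contained in the event.
function jobsFrom(sqsEvent: SqsEvent): Job[] {
  return sqsEvent.Records.map((record) => JSON.parse(record.body) as Job);
}

// A plain description of the HTTP call that delivers the delayed answer.
interface OutgoingRequest {
  url: string;
  method: "POST";
  headers: { [name: string]: string };
  body: string;
}

// Build the POST that sends the sub-command's output back to Slack.
function delayedReply(job: Job, text: string): OutgoingRequest {
  return {
    url: job.responseUrl,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ response_type: "ephemeral", text }),
  };
}
```

In a Lambda runtime with a global fetch, the worker could then send the result with something like fetch(request.url, request), while the entry point would pass toMessageBody(job) as the message body when sending to SQS.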
The full setup and information flow is visualised in the following diagram.
As an end user in Slack, you can use the app like any other slash-command installed in the workspace.
Step 1: Trying out the slash-command
The user does not need to know how to use the slash-command yet. All she needs to do is send a Slack message containing the command.
As this message does not contain a valid sub-command, the application answers with a brief manual that lists and explains the actions the Bot can perform.
Step 2: Using a valid command
After reading this usage guide, the user might send another message containing a valid action like "describe-instances".
This will then produce the following output:
As you might see in this screenshot, the Bot will answer in two separate messages:
- Immediately after receiving the command through Slack, the AWS Playground Bot sends a message confirming that the request was accepted and stating that a report will be generated.
Behind the scenes another Lambda Function was already triggered through a message in its SQS event queue.
- After a few seconds, another message is sent back to the user containing the requested report. When the report does not fit within one Slack message, it is instead made available as CSV files in an S3 bucket, which the user can download through pre-signed URLs for a limited time period.
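The choice between an inline report and a CSV download could look roughly like the following sketch. The length budget is passed in as a parameter because the exact limit depends on how the message is built; the row format and all names here are assumptions for illustration, and the upload to S3 and the URL signing are left out.

```typescript
// One report line, keyed by column name.
type Row = Record<string, string | undefined>;

// Quote a field when it contains a comma, a double quote or a line break.
function csvField(value: string): string {
  return /[",\r\n]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}

// Render a header line followed by one line per row.
function toCsv(columns: string[], rows: Row[]): string {
  const lines = [columns, ...rows.map((row) => columns.map((c) => row[c] ?? ""))];
  return lines.map((line) => line.map(csvField).join(",")).join("\r\n");
}

type Delivery =
  | { kind: "inline"; text: string } // fits into one Slack message
  | { kind: "download"; csv: string }; // too long, offer a file instead

// maxLength is an assumed character budget for a single message.
function chooseDelivery(columns: string[], rows: Row[], maxLength: number): Delivery {
  const text = rows
    .map((row) => columns.map((c) => `${c}: ${row[c] ?? ""}`).join(", "))
    .join("\n");
  return text.length <= maxLength
    ? { kind: "inline", text }
    : { kind: "download", csv: toCsv(columns, rows) };
}
```

For the download case, the CSV would then be written to the bucket and linked through a pre-signed URL with a short expiry, for example using getSignedUrl from @aws-sdk/s3-request-presigner if the project uses the AWS SDK v3.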
One thing worth mentioning is that this application is currently not available to every user in our Slack workspace, as it is still under development. Most users (my amazing colleagues) will only get a response stating that the Bot does not talk to strangers.
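Such a restriction can be as simple as an allowlist of Slack user IDs checked in the entry point. The sketch below assumes the IDs come from a comma-separated configuration value and are compared against the user_id field of the slash-command request; the function names and the rejection wording are hypothetical.

```typescript
// Hypothetical wording of the answer for users who are not on the list.
const REJECTION_TEXT = "Sorry, I do not talk to strangers.";

// Turn a comma-separated setting such as "U123, U456" into a list of user IDs.
function parseAllowlist(raw: string | undefined): string[] {
  return (raw ?? "")
    .split(",")
    .map((id) => id.trim())
    .filter((id) => id.length > 0);
}

// Returns null for allowed users, otherwise the text to send back.
function rejectionFor(userId: string, allowlist: readonly string[]): string | null {
  return allowlist.includes(userId) ? null : REJECTION_TEXT;
}
```

An unset or empty setting yields an empty list, so the Bot rejects everyone by default instead of accidentally opening up to the whole workspace.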
I won't document the whole application / Bot here as this would make quite a long blog post. Here are just a few learnings I found most interesting:
- There is a neat editor for Slack messages, which really helps during the design process for the Bot's responses. Nice feature, Slack! 👏
- There is a maximum length for Slack messages and also for certain parts within the message. When you don't know exactly how long a Bot's response might be in the future, think about producing reports in some standard output format like CSV or HTML and providing the user with a download link in your Bot's response instead.
- Slack only allows a small subset of Markdown in its messages, which lacks certain features like tables. Some apps fall back to using pictures instead, which is quite annoying when you need to copy some information out of them. 😕
- Slack does not forward any information to you when an error occurs during the POST request of the slash-command. In my case, a misconfiguration in CloudFront blocked access to the application endpoint from other regions (we are using EU-West-1 exclusively, but Slack has its servers somewhere in the US). Try using a proxy like BurpSuite or https://beeceptor.com/ to get the actual error message from AWS. 🚀
- IAM rules for S3 are kind of confusing. Granting full access to the Bucket itself does not also grant access to the objects within the Bucket, so make sure to grant access to both "arn:aws:s3:::bucket-name" and "arn:aws:s3:::bucket-name/*". The IAM policy simulator does not help you in this case. 🤦‍♂️
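To illustrate the S3 point from the last item: bucket-level actions such as s3:ListBucket apply to the bucket ARN, while object-level actions such as s3:GetObject and s3:PutObject apply to the ARN with the /* suffix. A minimal sketch of the two statements, with an invented helper name and an action list that is only an assumption about what a report bucket might need:

```typescript
// Minimal shape of an IAM policy statement as used in this sketch.
interface IamStatement {
  Effect: "Allow";
  Action: string[];
  Resource: string[];
}

// Build one statement for the bucket itself and one for the objects inside it.
function bucketStatements(bucketName: string): IamStatement[] {
  const bucketArn = `arn:aws:s3:::${bucketName}`;
  return [
    // Bucket-level action: listing the bucket's contents.
    { Effect: "Allow", Action: ["s3:ListBucket"], Resource: [bucketArn] },
    // Object-level actions: reading and writing the report files.
    { Effect: "Allow", Action: ["s3:GetObject", "s3:PutObject"], Resource: [`${bucketArn}/*`] },
  ];
}
```

In a Serverless Framework project configured through TypeScript, the returned statements could be placed under provider.iam.role.statements, assuming a framework version that supports that key.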