Updated: Mar 1
DISCLAIMER: The following post is outdated, please see our docs for updated information.
Over the last few months I’ve been working on Grapl, a platform for DFIR (digital forensics and incident response) built largely around graph structures.
I wanted Grapl to be trivial to deploy, both because that would make it easier for others to get started with it, and because it would make my test cycle a lot faster.
Grapl consists of around 7 Lambdas, some S3 buckets, SNS topics, SQS queues, and the connections and policies between them. Configuring and changing these in the console quickly became untenable. I evaluated two projects to make my life easier here, with the goal of having a near-one-command deployment.
The first project I tried was Terraform. Terraform is developed by HashiCorp, and is basically a DSL, HCL, for describing infrastructure such as AWS resources and their policies.
I found HCL to mostly be unbearable. It’s very ‘stringy’ and weird - it definitely feels like a DSL, and not like a typical programming language, which I don’t really see the appeal of since I spend 99% of my time using typical languages.
Here’s an example:
This defines a VPC subnet resource, referencing a VPC by id through string interpolation.
On top of that, Terraform does very little to help you out. Everything has to be defined explicitly - policies, subnets, etc. Why? Why must I subscribe my Queue to my SNS Topic, and then also state that the Topic is allowed to send messages to that Queue? It felt so redundant and easy to get wrong - I had to understand the underlying AWS resources and policies in detail.
I ditched Terraform and, for a while, just did things with the console.
Later I found out about aws-cdk, a new approach to configuring AWS resources through code.
The nice thing about CDK is that it is not a DSL. It’s a library that I can use from various programming languages.
By leveraging real languages I can use the tools I’m used to - classes, functions, generics, loops, branches, etc. I didn’t have to learn the CDK way, I just approached it as I would any problem.
What this meant was I could move really fast, building my configurations in a way that was very readable and, at least to some extent, DRY.
Here’s a function I wrote to factor out some common logic I had - subscribing my lambdas to an SQS Queue.
Note that in this case I had to add the policy to my lambda. This is actually atypical - cdk will, in almost every case, generate these for you. I only had to do so here because it’s a very young library and they haven’t automated this yet.
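As a rough sketch of the kind of policy involved (not the original helper, and not CDK's API), the permissions a Lambda needs to consume from an SQS queue can be built by one small function and reused for every queue. The PolicyStatement type, the function name, and the ARNs below are hypothetical and exist only for illustration.

```typescript
// Hypothetical shape for an IAM policy statement; this is not a CDK class.
interface PolicyStatement {
  Effect: "Allow";
  Action: string[];
  Resource: string;
}

// Actions a Lambda execution role needs in order to poll an SQS queue.
const SQS_CONSUMER_ACTIONS: string[] = [
  "sqs:ReceiveMessage",
  "sqs:DeleteMessage",
  "sqs:GetQueueAttributes",
];

// Build the statement in one place so every lambda/queue pair gets the same policy.
function sqsConsumerStatement(queueArn: string): PolicyStatement {
  return {
    Effect: "Allow",
    Action: SQS_CONSUMER_ACTIONS.slice(), // copy, so callers cannot mutate the shared list
    Resource: queueArn,
  };
}

// Example ARNs (made up): one statement per queue, scoped to that queue only.
const exampleQueueArns: string[] = [
  "arn:aws:sqs:us-east-1:123456789012:example-queue-a",
  "arn:aws:sqs:us-east-1:123456789012:example-queue-b",
];
const consumerStatements: PolicyStatement[] = exampleQueueArns.map(sqsConsumerStatement);
```

Because the statement is scoped to a single queue ARN, each lambda only gets access to the queue it actually reads from.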
But consider this code:
This code defines a Topic and a Queue, and subscribes the Queue to the Topic. I don’t need to define any policy. It’s obvious what it should be - allow the topic to send messages to the queue - so CDK just does it for me.
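For a sense of what gets generated on your behalf, here is a minimal sketch of the queue policy that an SNS-to-SQS subscription implies. It is plain TypeScript that builds the policy document by hand; the function name and ARNs are hypothetical, and the exact document CDK emits may differ in detail.

```typescript
// Build the access policy a queue needs so that one specific SNS topic may deliver to it.
function snsToSqsQueuePolicy(topicArn: string, queueArn: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        // The SNS service is the caller that sends messages into the queue.
        Principal: { Service: "sns.amazonaws.com" },
        Action: "sqs:SendMessage",
        Resource: queueArn,
        // Restrict delivery to this one topic rather than any topic.
        Condition: { ArnEquals: { "aws:SourceArn": topicArn } },
      },
    ],
  };
}

// Example ARNs (made up for illustration).
const exampleTopicArn = "arn:aws:sns:us-east-1:123456789012:example-topic";
const exampleQueueArn = "arn:aws:sqs:us-east-1:123456789012:example-queue";
const exampleQueuePolicy = snsToSqsQueuePolicy(exampleTopicArn, exampleQueueArn);
```

This is the boilerplate that has to be written out by hand when the tool does not derive it from the subscription.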
I felt confident that CDK would build me policies that are least privilege by default, and that I couldn’t accidentally mess them up.
I also use a database, and I wanted its username and password to be stored in environment variables. Because I was using TypeScript this was trivial - just npm install node-env-file and use it.
I pass my custom HistoryDb class the necessary information, pulling the credentials from my .env, and I’m done.
Similarly, I pass these credentials to my lambdas that need to access the history db (I intend to move to KMS later).
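To illustrate the idea without depending on any package, here is a minimal sketch of what a .env loader does: parse KEY=VALUE lines and fail loudly when a required setting is missing. This is not node-env-file's implementation, and the variable names HISTORY_DB_USERNAME and HISTORY_DB_PASSWORD are assumptions made up for the example.

```typescript
// Parse KEY=VALUE lines; blank lines and lines starting with '#' are skipped.
function parseEnv(text: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no '=' at all, or an empty key
    vars[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return vars;
}

// Look up a setting and throw if it is absent, so a bad config fails before deployment.
function requireVar(vars: Record<string, string>, name: string): string {
  const value: string | undefined = vars[name];
  if (value === undefined || value === "") {
    throw new Error(`Missing required setting: ${name}`);
  }
  return value;
}

// Example .env contents (made-up values).
const exampleEnv = parseEnv(
  "# history db credentials\nHISTORY_DB_USERNAME=grapl\nHISTORY_DB_PASSWORD=example-password\n"
);
const historyDbCredentials = {
  username: requireVar(exampleEnv, "HISTORY_DB_USERNAME"),
  password: requireVar(exampleEnv, "HISTORY_DB_PASSWORD"),
};
```

The same credentials object can then be handed to both the database definition and any lambda that needs it, so the values live in exactly one place.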
This same approach allows me to deal with the fact that S3 bucket names are globally unique. If someone else wants to set up Grapl they just provide a unique prefix in the .env file and all services will be made aware of it. Easy.
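A minimal sketch of the prefixing idea, assuming bucket names are formed as prefix-suffix. The function name and the suffixes are hypothetical; the check covers only the basic S3 naming rules (3 to 63 characters, lowercase letters, digits, dots and hyphens, starting and ending with a letter or digit).

```typescript
// Basic S3 bucket name shape: 3-63 chars, lowercase alphanumerics, dots and hyphens,
// beginning and ending with a letter or digit.
const BUCKET_NAME_PATTERN = /^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$/;

// Combine the user-supplied prefix with a fixed suffix, rejecting names S3 would refuse.
function prefixedBucketName(prefix: string, suffix: string): string {
  const name = `${prefix}-${suffix}`;
  if (!BUCKET_NAME_PATTERN.test(name)) {
    throw new Error(`Invalid S3 bucket name: ${name}`);
  }
  return name;
}

// Example: every bucket in the deployment derives its name from one prefix.
const examplePrefix = "acme-security";
const exampleBucketNames = ["raw-logs", "analyzers"].map((suffix) =>
  prefixedBucketName(examplePrefix, suffix)
);
```

Validating the name up front turns a confusing deploy-time failure into an immediate, readable error.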
CDK is still early days but I really couldn’t recommend it more. Deploying Grapl is practically trivial and adding new CloudFormation stacks, or modifying existing ones, has been incredibly smooth.