Detecting Anomalies in AWS CloudTrail Using Unomaly
AWS CloudTrail is a powerful tool for auditing your AWS usage, giving you complete insight into what’s going on across your AWS account(s) through access to the raw events, formatted as JSON. However, making sense of the CloudTrail logs and finding the needle-in-a-haystack events, the ones you really want to detect, is not necessarily the easiest task. Usually, it involves creating advanced search patterns and knowing beforehand which events to look for, which you typically don’t.
This is where Unomaly comes into the picture. It can help you make sense of your CloudTrail data without the need for creating complex regex patterns or setting up a whole bunch of rules beforehand: it just accepts the raw data, learns what normal behavior looks like, and highlights the anomalies for you.
The events emitted by CloudTrail are JSON records that share a common set of top-level fields, with slight deviations depending on, for example, which AWS service emitted the event.
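As a rough illustration of that structure, the Go sketch below decodes a few of the top-level fields of a CloudTrail record. The field names follow the CloudTrail record format, but the sample event is made up and heavily trimmed, and the Record type and ParseRecord function are hypothetical names used only for this example.

```go
package main

import (
	"encoding/json"
	"time"
)

// Record holds a handful of the top-level fields of a CloudTrail event.
// Service-specific payloads are kept as raw JSON because their shape
// differs from one AWS service to the next.
type Record struct {
	EventVersion      string          `json:"eventVersion"`
	EventTime         time.Time       `json:"eventTime"`
	EventSource       string          `json:"eventSource"`
	EventName         string          `json:"eventName"`
	AWSRegion         string          `json:"awsRegion"`
	SourceIPAddress   string          `json:"sourceIPAddress"`
	RequestParameters json.RawMessage `json:"requestParameters"`
	ResponseElements  json.RawMessage `json:"responseElements"`
}

// sampleEvent is an invented, trimmed-down event for illustration only.
const sampleEvent = `{
  "eventVersion": "1.05",
  "eventTime": "2019-03-01T12:34:56Z",
  "eventSource": "sns.amazonaws.com",
  "eventName": "CreateTopic",
  "awsRegion": "eu-west-1",
  "sourceIPAddress": "203.0.113.10",
  "requestParameters": {"name": "example-topic"},
  "responseElements": null
}`

// ParseRecord decodes a single CloudTrail event from JSON.
func ParseRecord(data []byte) (Record, error) {
	var r Record
	err := json.Unmarshal(data, &r)
	return r, err
}
```

Calling ParseRecord([]byte(sampleEvent)) yields a Record whose EventSource and EventName identify the emitting service and the API call, which are the fields you will most often look at when reviewing an event.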
In this post, we will walk through how to set up our CloudTrail logs for automated anomaly detection, using Unomaly.
We are going to assume that you have a Unomaly instance of version 2.27 or later up and running inside your AWS account, so that we can continuously send log events to the Unomaly ingest API via HTTP. For more information on how you can get started and get a free trial of Unomaly, visit unomaly.com.
To feed the data to Unomaly, we will be relying on two other AWS services: S3 and Lambda. This setup will give you close to zero-maintenance, automated anomaly detection on your CloudTrail logs.
When we are all done, it will all look something like this:
Using S3 & Lambda in this way is a proven and very common pattern for event-based processing.
Let’s break down what will happen here:
- CloudTrail will write files into S3 at regular intervals, each containing an unknown number of CloudTrail records (like the ones described above)
- An S3 event will trigger the Lambda function
- The Lambda function will read the object containing the CloudTrail records from S3 and post them in a batch-oriented fashion to the Unomaly ingest API
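The steps above can be sketched in Go roughly as follows. This is a minimal sketch under stated assumptions, not the code from the repository: the S3 download and the Lambda handler wiring are left out, all function names are hypothetical, and the URL and payload shape used in postBatch are placeholders rather than the documented Unomaly ingest API. What it does rely on is that CloudTrail delivers gzip-compressed files containing a single JSON object with a "Records" array.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

// logFile mirrors the envelope of a delivered CloudTrail log file:
// one JSON object with a "Records" array.
type logFile struct {
	Records []json.RawMessage `json:"Records"`
}

// gunzip decompresses a CloudTrail log object read from S3.
func gunzip(r io.Reader) ([]byte, error) {
	zr, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	return io.ReadAll(zr)
}

// parseRecords extracts the individual records from a decompressed file.
func parseRecords(data []byte) ([]json.RawMessage, error) {
	var f logFile
	if err := json.Unmarshal(data, &f); err != nil {
		return nil, err
	}
	return f.Records, nil
}

// batches splits records into chunks of at most size elements.
func batches(records []json.RawMessage, size int) [][]json.RawMessage {
	if size < 1 {
		size = 1
	}
	var out [][]json.RawMessage
	for len(records) > 0 {
		n := size
		if len(records) < n {
			n = len(records)
		}
		out = append(out, records[:n])
		records = records[n:]
	}
	return out
}

// postBatch sends one batch as a JSON array over HTTP.
// Assumption: the URL and payload shape are placeholders, not the
// documented Unomaly ingest API.
func postBatch(client *http.Client, url string, batch []json.RawMessage) error {
	body, err := json.Marshal(batch)
	if err != nil {
		return err
	}
	resp, err := client.Post(url, "application/json", bytes.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("ingest returned %s", resp.Status)
	}
	return nil
}
```

Batching matters because the number of records per file is not known in advance: a fixed upper bound per request keeps each HTTP call to the ingest endpoint reasonably small regardless of how busy the account was.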
To manage all of this, I’ve opted to use the AWS Serverless Application Model (SAM), a way to define Lambda functions and the necessary AWS infrastructure as code, in a clean and simple syntax. The SAM template contains the specification for the CloudTrail trail, the S3 bucket and its corresponding policies, and the Lambda function. On deployment, the SAM template is transformed into a CloudFormation template, i.e. infrastructure as code, yay!
Since we’re huge fans of Go here at Unomaly, the Lambda function itself is written in Go and the necessary configuration is passed to it as environment variables.
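Reading configuration from environment variables in Go could look something like the sketch below. The variable names UNOMALY_INGEST_URL and UNOMALY_BATCH_SIZE, the default batch size, and the Config type are assumptions made for this example; check the repository for the names the actual function uses.

```go
package main

import (
	"fmt"
	"strconv"
)

// Config holds the settings the function needs at runtime.
// The fields are hypothetical and chosen for illustration.
type Config struct {
	IngestURL string
	BatchSize int
}

// loadConfig builds a Config from a lookup function such as os.Getenv.
// Taking the lookup as a parameter keeps the function easy to test.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{
		IngestURL: getenv("UNOMALY_INGEST_URL"), // assumed variable name
		BatchSize: 100,                          // assumed default
	}
	if cfg.IngestURL == "" {
		return cfg, fmt.Errorf("UNOMALY_INGEST_URL is not set")
	}
	if v := getenv("UNOMALY_BATCH_SIZE"); v != "" {
		n, err := strconv.Atoi(v)
		if err != nil || n < 1 {
			return cfg, fmt.Errorf("invalid UNOMALY_BATCH_SIZE %q", v)
		}
		cfg.BatchSize = n
	}
	return cfg, nil
}
```

Inside the Lambda function you would call loadConfig(os.Getenv) once at startup and fail fast if a required value is missing, so that a misconfigured deployment shows up in the function logs right away.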
All the code and templates are available on GitHub at https://github.com/unomaly/unomaly-blog/tree/master/cloudtrail-lambda-blog. Feel free to go ahead and try it out; deployment instructions should be in the README.md file in the repository root.
The CloudTrail logs in Unomaly
Once things are up and running and the first CloudTrail logs have been processed, you should start seeing the various AWS services you are using pop up as systems in the sidebar in Unomaly. That way, you’ll be able to drill down into the anomalies per service, giving you the right granularity while doing anomaly reviews.
You can of course go ahead and group these systems together as needed, in a way that makes sense for you and your team(s).
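One way the per-service systems described above could be derived is from the eventSource field of each record, which names the emitting service (for example sns.amazonaws.com). The sketch below is an assumption about how such a mapping might work, not a description of what the function in the repository does; systemName is a hypothetical helper.

```go
package main

import "strings"

// systemName maps a CloudTrail eventSource to a short system name by
// dropping the ".amazonaws.com" suffix. Values without that suffix are
// returned unchanged, and an empty source becomes "unknown".
func systemName(eventSource string) string {
	name := strings.TrimSuffix(eventSource, ".amazonaws.com")
	if name == "" {
		return "unknown"
	}
	return name
}
```

With this mapping, an event from sns.amazonaws.com would be attributed to a system called sns, which matches the per-service granularity mentioned above.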
As new, never-before-seen events (anomalies) start appearing across your AWS account(s), they’ll be surfaced for you to review, as seen in the picture below.
In the example above, Unomaly has flagged the SNS CreateTopic event as an anomaly, since it has never happened before. We can easily convert this previously unknown event into a known by clicking the plus button next to the anomaly.
By adding classifications and tags to the known, we can later route this now-known event to the appropriate destination, for example a Slack channel where you monitor changes across your AWS infrastructure.
Looking to perform anomaly detection on any or all of your log data? You can download Unomaly here!