AWS (S3+SQS) collector
Overview
Logs generated by most AWS services (CloudTrail, VPC Flow Logs, Elastic Load Balancing, etc.) can be exported as blob objects to S3. Many third-party services have adopted the same approach, so it has become a common pattern across many technologies. The Devo Professional Services and Technical Acceleration teams maintain a base collector that uses this S3 pattern to collect logs and can be customized for the different log technologies a customer stores in S3.
This guide walks through setting up your AWS infrastructure so that the collector integration works out of the box:
Sending data to S3 (this guide uses CloudTrail as the data source service)
Setting up S3 event notifications to SQS
Enabling SQS and S3 access using a cross-account IAM role
Gathering information to be provided to Devo for collector setup
General architecture diagram
Requirements
Access to S3, SQS, IAM, and CloudTrail services
Permissions to send data to S3
Knowledge of the log format or technology type being stored in S3
Creating an S3 bucket and setting up a data feed (CloudTrail example)
The following will be set up during this section:
S3 bucket for data storage
CloudTrail trail for data logging into an S3 bucket
Create an S3 bucket
Set up a CloudTrail trail to log events into an S3 bucket
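For CloudTrail to deliver log files, the bucket needs a policy that allows the CloudTrail service to write to it. The console normally generates this policy when you create the trail. The sketch below only illustrates its usual shape. The function name, bucket name, and account ID are placeholders made up for illustration, and the sketch assumes the default AWSLogs/<account-id>/ key layout with no custom prefix.

```python
import json


def cloudtrail_bucket_policy(bucket: str, account_id: str) -> dict:
    """Build a bucket policy that lets CloudTrail write log files to `bucket`.

    Assumes the default key layout AWSLogs/<account-id>/... with no prefix.
    """
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # CloudTrail checks the bucket ACL before delivering logs.
                "Sid": "AWSCloudTrailAclCheck",
                "Effect": "Allow",
                "Principal": {"Service": "cloudtrail.amazonaws.com"},
                "Action": "s3:GetBucketAcl",
                "Resource": bucket_arn,
            },
            {
                # CloudTrail writes objects under AWSLogs/<account-id>/.
                "Sid": "AWSCloudTrailWrite",
                "Effect": "Allow",
                "Principal": {"Service": "cloudtrail.amazonaws.com"},
                "Action": "s3:PutObject",
                "Resource": f"{bucket_arn}/AWSLogs/{account_id}/*",
                "Condition": {
                    "StringEquals": {"s3:x-amz-acl": "bucket-owner-full-control"}
                },
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values; replace with your own bucket and account ID.
    print(json.dumps(cloudtrail_bucket_policy("my-trail-bucket", "123456789012"), indent=2))
```

The printed JSON can be compared against the policy attached to your bucket to confirm that CloudTrail has write access.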
Creating an SQS queue and enabling S3 event notifications
SQS provides the following benefits from our perspective:
Built-in retries when processing a message fails
Dead-letter queueing (if enabled when setting up the SQS queue)
Tolerance of downstream outages without losing the processing state
Parallelization of workers when data volume is very high
Guaranteed at-least-once delivery (guaranteed by both S3 and SQS)
Ability to have multiple S3 buckets, including buckets in other accounts, send events to the same SQS queue by routing S3 event notifications through SNS to SQS in the target account
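To make the flow concrete, here is a minimal sketch of how a consumer could turn the body of one SQS message into the list of S3 objects to fetch. It is not the Devo collector's code. The function name is hypothetical, and the sketch relies only on the documented S3 event notification layout (a Records list with s3.bucket.name and a URL-encoded s3.object.key).

```python
import json
from typing import List, Tuple
from urllib.parse import unquote_plus


def s3_objects_from_message(body: str) -> List[Tuple[str, str]]:
    """Return (bucket, key) pairs for the objects announced in one SQS message body."""
    event = json.loads(body)
    objects = []
    # S3 test events ("s3:TestEvent") carry no "Records" key and yield an empty list.
    for record in event.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue  # ignore deletions and other event types
        s3 = record["s3"]
        # Object keys are URL-encoded in the notification ("+" stands for a space).
        objects.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return objects
```

Because delivery is at-least-once, the same object may be announced more than once, so a consumer built this way should tolerate processing an object twice.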
Optional - Using event notifications with SNS
Sending S3 event notifications to SNS may be useful, or even required, for teams that consume the bucket event notifications in multiple applications. This is fully supported as long as the original S3 event notification message is passed through SNS to SQS unmodified. In that case, you do not need to follow the steps for sending event notifications directly to a single SQS queue; instead, follow the Amazon documentation to set up the following:
A brief write-up of this architecture can be found in this AWS blog post. This approach also helps if you have buckets in different regions or accounts and want one centralized queue per technology for all of your logging.
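When SNS sits between S3 and SQS, the SQS message body depends on the subscription settings. With raw message delivery enabled, the body is the original S3 event. Otherwise SNS wraps it in a JSON envelope whose Message field holds the original event as a string. The sketch below handles both cases. The function name is hypothetical, and this section does not state which mode the Devo collector expects, so confirm that with Devo.

```python
import json


def unwrap_sns(body: str) -> dict:
    """Return the S3 event from an SQS body, with or without an SNS envelope."""
    payload = json.loads(body)
    # An SNS envelope has "Type": "Notification" and the original message
    # serialized as a string under "Message".
    if payload.get("Type") == "Notification" and isinstance(payload.get("Message"), str):
        return json.loads(payload["Message"])
    # Raw message delivery (or a direct S3-to-SQS notification): already the S3 event.
    return payload
```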
Create an SQS queue for a specific service event type (e.g. CloudTrail)
In this example, we will continue by setting up an SQS queue for our CloudTrail technology logs.
Set up S3 event notifications
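Two pieces of configuration are involved here: the SQS queue needs an access policy that lets S3 send messages to it, and the bucket needs a notification configuration that points at the queue. The following sketch builds both documents with the standard library only. The function names, the Sid, and all ARNs and IDs are illustrative assumptions. Applying the documents through the console, CLI, or an SDK is left out.

```python
import json


def queue_access_policy(queue_arn: str, bucket: str, account_id: str) -> dict:
    """Queue policy allowing one bucket (owned by `account_id`) to send messages."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowS3ToSendMessages",  # illustrative name
                "Effect": "Allow",
                "Principal": {"Service": "s3.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                "Condition": {
                    # Restrict the sender to the expected bucket and account.
                    "ArnLike": {"aws:SourceArn": f"arn:aws:s3:::{bucket}"},
                    "StringEquals": {"aws:SourceAccount": account_id},
                },
            }
        ],
    }


def bucket_notification_config(queue_arn: str, prefix: str = "") -> dict:
    """Notification configuration that sends object-created events to the queue."""
    config = {"QueueArn": queue_arn, "Events": ["s3:ObjectCreated:*"]}
    if prefix:
        # Optionally notify only for keys under a prefix, e.g. "AWSLogs/".
        config["Filter"] = {"Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}}
    return {"QueueConfigurations": [config]}


if __name__ == "__main__":
    arn = "arn:aws:sqs:us-east-1:123456789012:cloudtrail-queue"  # placeholder
    print(json.dumps(queue_access_policy(arn, "my-trail-bucket", "123456789012"), indent=2))
    print(json.dumps(bucket_notification_config(arn, prefix="AWSLogs/"), indent=2))
```

The queue policy has to be in place before the notification is saved, because S3 checks that it can deliver to the destination when the configuration is applied.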
Enabling SQS and S3 access using a cross-account IAM role
To allow the Devo collector to pull data from your AWS environment, an IAM cross-account role is needed in your account. You will have to provide this role’s ARN to Devo.
Create an IAM policy
This IAM policy will:
Allow the role to read messages from the SQS queue and acknowledge (delete) them after they have been processed successfully
Allow the role to retrieve the S3 object referenced in each SQS message so that Devo can read and process its contents
Limit access to the specified resources only (minimal permissions)
Follow these steps to create the IAM policy:
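As a rough illustration of the three goals listed above, a policy document could look like the output of the sketch below. The function name, Sid values, queue ARN, and bucket name are placeholders. The action list is the minimum implied by the description (receive, delete, get object); the policy given in Devo's own steps may include further actions.

```python
import json


def collector_policy(queue_arn: str, bucket: str) -> dict:
    """Identity-based policy scoped to one queue and one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read messages and delete them once processed.
                "Sid": "ReadAndAckQueue",
                "Effect": "Allow",
                "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
                "Resource": queue_arn,
            },
            {
                # Fetch the objects referenced in the messages.
                "Sid": "ReadLogObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(
        collector_policy("arn:aws:sqs:us-east-1:123456789012:cloudtrail-queue", "my-trail-bucket"),
        indent=2,
    ))
```

Naming explicit resources rather than using "*" is what keeps the permissions minimal: the role can touch only this queue and this bucket.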
Create a cross-account role
Cross-account roles let roles or users from other AWS accounts (in this case, the AWS account of the Devo collector server) assume a role in your account. This removes the need to exchange permanent credentials: credentials stay in their respective accounts, and AWS itself authenticates the identities. For more information, check this document.
Follow these steps to create the cross-account role:
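What makes a role "cross-account" is its trust policy, which names the external account allowed to assume it. The sketch below shows one plausible shape, with an optional ExternalId condition. The function name and the trusted account ID are placeholders: the real Devo account ID is not given in this section and must come from Devo.

```python
import json
from typing import Optional


def trust_policy(trusted_account_id: str, external_id: Optional[str] = None) -> dict:
    """Trust policy letting principals in another account assume the role."""
    statement = {
        "Effect": "Allow",
        "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
        "Action": "sts:AssumeRole",
    }
    if external_id:
        # The caller must present the same ExternalId when assuming the role.
        statement["Condition"] = {"StringEquals": {"sts:ExternalId": external_id}}
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # "111111111111" is a placeholder, NOT Devo's real account ID.
    print(json.dumps(trust_policy("111111111111", external_id="example-external-id"), indent=2))
```

If an ExternalId is set here, the same value has to be passed to Devo along with the role ARN.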
Information to be provided to Devo
At the end of this configuration process, the following information must be provided to Devo to complete the collector setup:
Technology type or log format to be consumed (if the collector pulls data from an AWS service, such as CloudTrail in this guide, only the service name is needed)
SQS Queue URL
Cross-account role ARN (e.g. arn:aws:iam::<YOUR-ACCOUNT-ID>:role/devo-xs-collector-role) and, optionally, the ExternalID (if used in the cross-account role trust policy)
Once this information is provided and Devo confirms that a parser for your technology logs is available (or finishes creating one), a new collector will be deployed to Devo’s collector server cluster and will start consuming data from the SQS queue and S3 bucket.
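Before handing these values over, a quick format check can catch copy-and-paste mistakes such as a leftover <YOUR-ACCOUNT-ID> placeholder. The sketch below is only a local sanity check with hypothetical names. The patterns cover the common queue URL form https://sqs.<region>.amazonaws.com/<account-id>/<queue-name> and standard role ARNs, so other endpoint forms (for example, the China regions) would be flagged even though they are valid.

```python
import re
from typing import List

# Common SQS queue URL form; FIFO queue names end in ".fifo".
QUEUE_URL_RE = re.compile(
    r"^https://sqs\.[a-z0-9-]+\.amazonaws\.com/\d{12}/[A-Za-z0-9_-]{1,80}(?:\.fifo)?$"
)
# IAM role ARN with a 12-digit account ID and an optional path.
ROLE_ARN_RE = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")


def check_handoff(queue_url: str, role_arn: str) -> List[str]:
    """Return a list of human-readable problems; an empty list means both look fine."""
    problems = []
    if not QUEUE_URL_RE.match(queue_url):
        problems.append(f"Queue URL does not look like an SQS URL: {queue_url}")
    if not ROLE_ARN_RE.match(role_arn):
        problems.append(f"Role ARN does not look like an IAM role ARN: {role_arn}")
    return problems


if __name__ == "__main__":
    for problem in check_handoff(
        "https://sqs.us-east-1.amazonaws.com/123456789012/cloudtrail-queue",
        "arn:aws:iam::<YOUR-ACCOUNT-ID>:role/devo-xs-collector-role",
    ):
        print(problem)
```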