AWS collector

Service description

Amazon Web Services (AWS) provides on-demand cloud computing platforms and APIs to individuals and companies. Each available AWS service generates information related to different aspects of its functionality. The available data types include service events, audit events, metrics, and logs.

You can use the AWS collector to retrieve data from the AWS APIs and send it to your Devo domain. Once the gathered information arrives at Devo, it will be processed and included in different tables in the associated Devo domain so users can analyze it.

Data source description

From the monitoring point of view, AWS generates the following types of information:

Events

Information that happens at a specific timestamp. This information will always be linked to that timestamp, and can be categorized into two different subtypes:

Service events

The different available services usually generate information related to their internal behaviors, such as "A virtual machine has been started", "A new file has been created in an S3 bucket", or "An AWS Lambda function has been invoked".

Note that this type of event can be triggered with no human interaction. These kinds of events are managed by the CloudWatch Events service (CWE). Recently, AWS has created a new service called Amazon EventBridge that will replace the CWE service.
The findings detected by AWS Security Hub are also managed by CloudWatch Events (CWE).

Audit events

These events are more specific because they require human interaction, regardless of the interface used to trigger them (API, web console, or CLI command).

These events are managed by the CloudTrail service.

Metrics

According to the standard definition, this kind of information is usually generated at the exact moment it is requested because it is typically a query related to the status of a service (everything inside AWS is considered a service).

AWS does something slightly different: it generates metrics at fixed intervals, such as every 1 min, 5 min, 30 min, or 1 h, even if no one makes a request.

This kind of information is managed by the CloudWatch Metrics service (CWM).

Logs

Logs can be defined as information with a non-fixed structure that is sent to one of the available logging services. These services are CloudWatch Logs and S3.

There are some very customizable services, such as AWS Lambda, or even any developed application which is deployed inside an AWS virtual machine (EC2), that can generate custom log information. This kind of information is managed by the CloudWatch Logs service (CWL) and also by the S3 service.

There are also some other services that generate logs with a fixed structure, such as VPC Flow Logs or CloudFront Logs. These services require a dedicated way of collecting their data.

Some services generate information that can be sent to different targets at the same time. For example, the CloudTrail service generates audit-related information. This information is really an "audit event", but it can be treated as a "simple event" and sent to the CloudWatch Events service. It can also be sent as a string "logline" to the CloudWatch Logs service, or sent as a file to a bucket inside the S3 service.

CloudWatch Events is in the process of being renamed to Amazon EventBridge.
Almost all services that generate service events send them to the CloudWatch Events service (CWE), and the same service events are also sent to Amazon EventBridge.
CloudWatch Events (CWE), CloudWatch Metrics (CWM), and CloudWatch Logs (CWL) are considered different services.

Setup

Some manual actions are necessary in order to get all the required information/services and allow the Devo collector to gather the information from AWS.

The following sections describe how to get the required AWS credentials and how to proceed with the different required setups depending on the gathered information type.

Credentials

There are several ways to define the credentials; this section covers only some of them.

We recommend creating the following IAM policies (or making sure they are already available) before creating the IAM user that the AWS collector will use.

Policy details

Source type	AWS Data Bus	Recommended policy name	Variant	Details
Service events	CloudWatch Events	`devo-cloudwatch-events`	All resources	No new policy is required because no permissions are needed.
Audit events	CloudTrail API	`devo-cloudtrail-api`	All resources	{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": "cloudtrail:LookupEvents", "Resource": "*" } ] }
			Specific resource	There is no way to limit the accessed resources.
	CloudTrail SQS+S3	`devo-cloudtrail-s3`	All resources	{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": "s3:GetObject", "Resource": "*" } ] }
			Specific S3 bucket. Note that the value of the `Resource` property must be replaced with the proper value. The `/*` string at the end of each bucket ARN is very important.	{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": "s3:GetObject", "Resource": [ "arn:aws:s3:::devo-cloudtrail-storage-bucket1/*", "arn:aws:s3:::devo-cloudtrail-storage-bucket2/*" ] } ] }
Metrics	CloudWatch Metrics	`devo-cloudwatch-metrics`	All resources	{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": [ "cloudwatch:GetMetricData", "cloudwatch:ListMetrics" ], "Resource": "*" } ] }
Metrics	CloudWatch Metrics	`devo-cloudwatch-metrics`	Specific resource	There is no way to limit the accessed resources.
Logs	CloudWatch Logs	`devo-cloudwatch-logs`	All log groups. Note that the value of the `Resource` property must be adapted with the proper account ID.	{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": [ "logs:DescribeLogGroups", "logs:DescribeLogStreams", "logs:FilterLogEvents" ], "Resource": "arn:aws:logs:*:936082584952:log-group:*" } ] }
	CloudWatch Logs	`devo-cloudwatch-logs`	Specific log groups. Note that the values inside the `Resource` property are only examples and must be replaced with the proper values.	{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": [ "logs:DescribeLogGroups", "logs:DescribeLogStreams", "logs:FilterLogEvents" ], "Resource": [ "arn:aws:logs:*:936082584952:log-group:/aws/events/devo-cloudwatch-test-1:*", "arn:aws:logs:*:936082584952:log-group:/aws/events/devo-cloudwatch-test-2:*" ] } ] }
	Logs to S3 + SQS	`devo-vpcflow-logs`	All resources	{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": "s3:GetObject", "Resource": "*" } ] }
	Logs to S3 + SQS	`devo-vpcflow-logs`	Specific resource	{ "Version": "2012-10-17", "Statement": [ { "Sid": "VisualEditor0", "Effect": "Allow", "Action": "s3:GetObject", "Resource": "arn:aws:s3:::vpc-flowlogs-test1/*" } ] }
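
The bucket-scoped policies in the table can also be generated programmatically. The following is a minimal sketch rather than part of the collector: the helper name `s3_read_policy` is hypothetical, and all it does is build the `s3:GetObject` statement with the `/*` suffix that the table calls out as required.

```python
import json


def s3_read_policy(bucket_names):
    """Build an IAM policy document granting s3:GetObject on the given buckets.

    Hypothetical helper for illustration; it is not part of the Devo collector.
    """
    if not bucket_names:
        raise ValueError("at least one bucket name is required")
    # s3:GetObject applies to objects, so each resource must be the
    # bucket ARN followed by "/*".
    resources = [f"arn:aws:s3:::{name}/*" for name in bucket_names]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "VisualEditor0",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": resources,
            }
        ],
    }


policy = s3_read_policy(
    ["devo-cloudtrail-storage-bucket1", "devo-cloudtrail-storage-bucket2"]
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the JSON tab of the IAM policy editor.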

Using a user account and local policies

Depending on which source types are collected, one or more of the policies described above will be used.

Once the required policies are created, each one must be associated with an IAM user. To create it, visit the AWS Console and log in with a user account with enough permissions to create and access AWS structures:

Go to IAM → Users.
Click the Add users button.
Enter the required value in the field User name.
Enable the checkbox Access key - Programmatic access.
Click the Next: Permissions button.
Choose the box with the text Attach existing policies directly.
Use the search box to locate all your required policies and check the boxes at the left of each policy name.
Click Next: Tags. Optionally, add any desired tags to the new user.
Click Next: Review.
Click the Create user button.
A new Access Key ID and Secret Access Key will be created. You can click the Download .csv button to save a copy to your local machine, or you can copy the values shown on the screen. These will be used as the AWS collector credentials.
Finally, click the Close button.

Assuming a role (self-account)

It is a best practice to assume roles that are granted just the required privileges to perform an action. If the customer does not want to use their own AWS user to perform these actions required by the collector (because it has far more privileges than required), they can use this option. Note that this option requires the use of AWS account credentials. To avoid sharing those credentials, check the Cross Account option below.

The customer must attach the required policies in AWS to the role that is going to be assumed. For more information about the AssumeRole feature, check the AWS documentation.

Regarding configuration, these are the fields required to use this authentication method:

...,
"credentials":{
  "access_key": "<CUSTOMER_AWS_ACCOUNT_ACCESS_KEY>",
  "access_secret": "<CUSTOMER_AWS_ACCOUNT_SECRET_ACCESS_KEY>",
  "base_assume_role": "arn:aws:iam::<CUSTOMER_AWS_ACCOUNT_ID>:role/<ROLE_TO_BE_ASSUMED>"
}
...,

access_key is the Access Key ID provided by AWS, during the user creation process, to those users that ask for programmatic access.
access_secret is the Secret Access Key provided by AWS, during the user creation process, to those users that ask for programmatic access. This value is shown once, so it must be saved upon creation.
base_assume_role is the ARN of the role that is going to be assumed by the user authenticated with the parameters above, access_key and access_secret. This role has to be properly granted to allow the actions the collector is going to perform.

If the customer does not want to share their credentials with Devo, there is another way to run the collector. It is called Cross Account and relies on the AssumeRole functionality. It is explained step by step at the following link: https://docs.devo.com/space/latest/94655615/AWS (S3%2BSQS) collector#Enabling-SQS-and-S3-access-using-a-cross-account-IAM-role

In addition, some parameters must be added to the configuration file (config.json, or the equivalent config.yaml for on-premise deployments). In the credentials section, instead of sharing access_key and access_secret, these other parameters must be used:

...,
"credentials":{
  "base_assume_role": "arn:aws:iam::<BASE_SYSTEM_AWS_ACCOUNT_ID>:role/<BASE_SYSTEM_ROLE>",
  "target_assume_role": "arn:aws:iam::<CUSTOMER_AWS_ACCOUNT_ID>:role/<CUSTOMER_ROLE_TO_BE_ASSUMED>",
  "assume_role_external_id": "<OPTIONAL__ANY_STRING_YOU_WANT>"
}
...,

base_assume_role is the ARN of the role that is going to be assumed by the profile bound to the machine/instance where the collector is running. As explained in the link above, this role is going to be trusted by the customer’s AWS account, so it can assume the role in the target account. That role assumed from the customer’s account will allow the collection of data without the need of sharing the credentials. This role already exists in Devo’s AWS account and to deploy the collector on Devo’s Collector Server its value must be arn:aws:iam::837131528613:role/devo-xaccount-cs-role.
target_assume_role is the ARN of the role in the customer’s AWS account. That role will allow the collector to have access to the resources specified in that role. To keep your data secure, please, use policies that grant just the necessary permissions.
assume_role_external_id is an (optional) parameter to add more security to this Cross Account operation. This value should be a string added in the request to assume the customer’s role.
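
To make the difference between the credential layouts concrete, here is a minimal sketch that classifies a `credentials` block by the fields it contains. It is not the collector's own validation: the function name `credentials_mode` and the returned labels are hypothetical, the ARN check assumes the standard `aws` partition and a 12-digit account ID, and it assumes the plain user-account approach uses only `access_key` and `access_secret`.

```python
import re

# IAM role ARN in the standard "aws" partition (assumption for this sketch).
ROLE_ARN = re.compile(r"^arn:aws:iam::\d{12}:role/.+$")


def credentials_mode(credentials):
    """Return which authentication approach a `credentials` block describes.

    Hypothetical helper for illustration; the collector's checks may differ.
    """
    has_keys = "access_key" in credentials and "access_secret" in credentials
    base = credentials.get("base_assume_role")
    target = credentials.get("target_assume_role")

    for arn in (base, target):
        if arn is not None and not ROLE_ARN.match(arn):
            raise ValueError(f"not a valid IAM role ARN: {arn!r}")

    if has_keys and not base and not target:
        return "user-account"
    if has_keys and base and not target:
        return "assume-role-self-account"
    if not has_keys and base and target:
        return "assume-role-cross-account"
    raise ValueError("unrecognized combination of credential fields")


cross_account = {
    "base_assume_role": "arn:aws:iam::837131528613:role/devo-xaccount-cs-role",
    "target_assume_role": "arn:aws:iam::123456789012:role/example-role",
    "assume_role_external_id": "any-string",
}
print(credentials_mode(cross_account))
```

The account ID `123456789012` and the role name `example-role` are placeholders.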

Service events

All the service events generated on AWS are managed by CloudWatch. However, Devo's AWS collector offers two different services that collect CloudWatch events:

sqs-cloudwatch-consumer - This service is used to collect Security Hub events.
service-events-all - This service is used to collect events from the rest of the services on AWS.

AWS services generate service events per region, so the following instructions must be applied in each region from which data needs to be collected (use the same values for all your configured regions).

In order to collect these service events, there are some structures that must be created: one FIFO queue in the SQS service and one Rule+Target in the CloudWatch service.

If the auto-setup functionality is enabled in the configuration and the related credentials have enough permissions to create all the required AWS structures, the following steps are not required.

To create these required structures manually, follow these steps:

SQS FIFO queue creation

Go to Simple Queue Service and click Create queue.
In the Details section, choose the FIFO queue type and set the Name field to the value you prefer (it must end with the .fifo suffix).
In the Configuration section, set the Message retention period field to 5 Days. Make sure that the Content-based deduplication checkbox is marked and leave the rest of the options with their default values.
In the Access policy section, choose the Basic method and select Only the queue owner for both the send and receive permissions.
Optionally, in the Tags section, you can create a tag with Key usedBy and Value devo-collector.
Click Create queue.
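
The console steps above map onto a small set of queue attributes. The sketch below only builds the request as a dictionary; the helper name `fifo_queue_request` is hypothetical, while `FifoQueue`, `ContentBasedDeduplication`, and `MessageRetentionPeriod` are attribute names of the SQS CreateQueue API.

```python
def fifo_queue_request(name, retention_days=5):
    """Return keyword arguments describing the FIFO queue from the steps above.

    Hypothetical helper for illustration; it does not call AWS.
    """
    if not name.endswith(".fifo"):
        raise ValueError("FIFO queue names must end with the .fifo suffix")
    return {
        "QueueName": name,
        "Attributes": {
            "FifoQueue": "true",
            "ContentBasedDeduplication": "true",
            # SQS expects the retention period in seconds, as a string.
            "MessageRetentionPeriod": str(retention_days * 24 * 60 * 60),
        },
        # Optional tag suggested in the steps above.
        "tags": {"usedBy": "devo-collector"},
    }


request = fifo_queue_request("devo-service-events.fifo")
print(request["Attributes"]["MessageRetentionPeriod"])  # 432000
```

Assuming boto3 is installed and the credentials allow `sqs:CreateQueue`, the dictionary could be passed to an SQS client as `client.create_queue(**request)`. The queue name `devo-service-events.fifo` is only an example.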

CloudWatch Rule + Target creation

Go to CloudWatch → Rules and click Create rule.
In the Event Source section, select Event Pattern and Build event pattern to match events by service.
In the Service Name field, enter the service to be monitored (check the note below for Security Hub Findings).
In the Event Type field, choose All Events (check the note below for Security Hub Findings).
In the Targets section, click Add target and choose SQS queue as the target type.
In the Queue dropdown, choose the previously created queue.
In the Message group ID field, set the value devo-collector.
Then, click Configure details.
In the Rule definition section, set the Name you prefer. Be sure that State checkbox is marked.

To retrieve Security Hub Findings, select Security Hub in the Service Name field, and Security Hub Findings - Custom Action in the Event Type field.
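
Behind the console form, a rule is defined by a JSON event pattern. The following sketch builds the two patterns discussed above; the helper name `event_pattern` is hypothetical and `aws.ec2` is just an example source, so the output should be compared with the pattern preview that the console shows.

```python
import json


def event_pattern(source, detail_type=None):
    """Build a CloudWatch Events / EventBridge event pattern as a JSON string.

    `source` is the event source of the service (for example "aws.ec2");
    `detail_type` narrows the match to a single event type when given.
    """
    pattern = {"source": [source]}
    if detail_type is not None:
        pattern["detail-type"] = [detail_type]
    return json.dumps(pattern)


# All events from one service (the "All Events" choice in the console).
ALL_EC2_EVENTS = event_pattern("aws.ec2")

# Security Hub findings sent through a custom action.
SECURITY_HUB_FINDINGS = event_pattern(
    "aws.securityhub", "Security Hub Findings - Custom Action"
)

print(ALL_EC2_EVENTS)
print(SECURITY_HUB_FINDINGS)
```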

Audit events

No actions are required in the CloudTrail service to retrieve this kind of information.

Metrics

No actions are required in the CloudWatch Metrics service to retrieve this kind of information.

Logs

Logs can be collected from different services. Depending on the service type, you may need to apply some settings on AWS:

CloudWatch Logs

No actions are required in this service to retrieve this kind of information.

VPC Flow Logs

Before enabling the generation of these logs, you must create one bucket in the S3 service and one Standard queue in the SQS service. To create these required structures manually, follow these steps:

SQS queue creation

Go to Simple Queue Service and click Create queue.
In the Details section, choose the Standard queue type and set the Name field value you prefer.
In the Configuration section, set the Message retention period field value to 5 Days and leave the rest of the options with their default values.
In the Access policy section, choose the Advanced method and replace "Principal": {"AWS":"<account_id>"} with "Principal": "*". Leave the rest of the JSON as default.
Optionally, in the Tags section, you can create a tag with Key usedBy and Value devo-collector.
Click Create queue.

S3 bucket creation/configuration

Go to S3 and click Create bucket.
Set the preferred value in the Bucket name field.
Choose the required Region value and click Next.
Optionally, in the Tags section, you can create a tag with Key usedBy and Value devo-collector. Leave the rest of the fields with their default values and click Next.
Click Create bucket.
Mark the checkbox next to the previously created S3 bucket.
In the popup box, click Copy Bucket ARN and save the content. You will need it later.
In the S3 bucket list, click the previously created bucket name link.
Click the Properties tab, then click the Events box.
Click Add notification.
Set the preferred value in the Name field.
Select the All object create events checkbox.
In the Send to field, select SQS Queue.
Select the previously created SQS queue in the SQS field.
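
With this notification in place, every new object in the bucket produces a message in the queue. As an illustration of what a consumer reads from that queue, here is a minimal sketch that extracts the bucket name and object key from a message body. The function name `created_objects` is hypothetical and the collector's own parsing may differ.

```python
import json
from urllib.parse import unquote_plus


def created_objects(message_body):
    """Yield (bucket, key) pairs from an S3 event notification message body."""
    payload = json.loads(message_body)
    # The s3:TestEvent message that S3 sends when the notification is
    # configured has no "Records" key, so it yields nothing.
    for record in payload.get("Records", []):
        if not record.get("eventName", "").startswith("ObjectCreated:"):
            continue
        bucket = record["s3"]["bucket"]["name"]
        # Object keys are URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        yield bucket, key


sample = json.dumps({
    "Records": [{
        "eventName": "ObjectCreated:Put",
        "s3": {
            "bucket": {"name": "vpc-flowlogs-test1"},
            "object": {"key": "AWSLogs/my+file%3A1.log.gz"},
        },
    }]
})
for bucket, key in created_objects(sample):
    print(bucket, key)
```

The sample message is hand-written for the example and contains only the fields the sketch reads.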

VPC service

Once the required AWS structures are created, go to the VPC service and follow these steps:

Select any available VPC (or create a new one).
Go to the Flow Logs tab and click Create flow log.
Choose the preferred Filter value and the required Maximum aggregation interval value.
In the Destination field, select Send to an S3 bucket.
In the S3 bucket ARN field set the ARN of the previously created S3 bucket.
Make sure that the Format field has the value AWS default format.
Optionally, in the Tags section, you can create a tag with Key usedBy and Value devo-collector.
Finally, click Create.
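
The AWS default format matters because each record is a fixed sequence of space-separated fields. The sketch below splits one default-format (version 2) record into named fields; the function name `parse_flow_log_line` is hypothetical, and values are kept as strings because records with no data use `-` in place of numbers.

```python
# Field order of the AWS default (version 2) flow log format.
FIELDS = (
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
)


def parse_flow_log_line(line):
    """Split one default-format VPC flow log record into a dict of strings."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} fields, got {len(values)}"
        )
    return dict(zip(FIELDS, values))


record = parse_flow_log_line(
    "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
    "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
)
print(record["srcaddr"], record["dstport"], record["action"])
```

A custom format would change the number or order of fields, and this parser would then reject the record.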

CloudFront Logs

Before enabling the generation of these logs, you must create one bucket in the S3 service and one Standard queue in the SQS service. To create these required structures manually, follow these steps:

SQS queue creation

Go to Simple Queue Service and click Create queue.
In the Details section, choose the Standard queue type and set the Name field value you prefer.
In the Configuration section, set the Message retention period field value to 5 Days and leave the rest of the options with their default values.
In the Access policy section, choose the Advanced method and replace "Principal": {"AWS":"<account_id>"} with "Principal": "*". Leave the rest of the JSON as default.
Optionally, in the Tags section, you can create a tag with Key usedBy and Value devo-collector.
Click Create queue.

S3 bucket creation/configuration

Go to S3 and click Create bucket.
Set the preferred value in the Bucket name field.
Choose the required Region value and click Next.
Optionally, in the Tags section, you can create a tag with Key usedBy and Value devo-collector. Leave the rest of the fields with their default values and click Next.
Click Create bucket.
Mark the checkbox next to the previously created S3 bucket.
In the popup box, click Copy Bucket ARN and save the content. You will need it later.
In the S3 bucket list, click the previously created bucket name link.
Click the Properties tab, then click the Events box.
Click Add notification.
Set the preferred value in the Name field.
Select the All object create events checkbox.
In the Send to field, select SQS Queue.
Select the previously created SQS queue in the SQS field.

CloudFront service

Once the required AWS structures are created, go to the CloudFront service and follow these steps:

Click the ID of the target Distribution item and open the Distribution Settings options. Then, click Edit.
In the Logging field, select On.
In the Bucket for Logs field, enter the ARN of the previously created S3 bucket.
Finally, click the Yes, Edit button.

Collector service details

The following tables show details about the predefined services available to be used in the collector configuration.

Devo collector service name	Complete service name	CloudWatch filter used	CloudTrail source filter used	Metrics namespace used	Description	Service events (type: `events`)	Audit events (type: `audits`)	Metrics (type: `metrics`)	Logs (type: `logs`)
`service-events-all`	All service events	`{"account":["<account_id>"]}`	N/A	N/A	This service will collect all service events information available in the CloudWatch service, no matter the source defined in the event.	✓	X	X	X
`audit-events-all`	All audit events	N/A	`all_sources`	N/A	This service will collect all audit events information available in the CloudTrail service, no matter the source defined in the event.	X	✓	X	X
`metrics-all`	All metrics	N/A	N/A	`all_metrics_namespaces`	This service will collect all metric information from CloudWatch service. Metrics from all the available metric namespaces will be retrieved.	X	X	✓	X
`<cwl_custom>`	CloudWatch Logs	N/A	N/A	N/A	This service will collect the different “Log Streams” that are part of a “Log Group” from the CloudWatch Logs service. Since it is common to have more than one “Log Group” defined, this will require creating one `<cwl_custom>` entry per “Log Group”.	X	X	X	✓
`non-cloudwatch-logs`	Non-CloudWatch Logs	N/A	N/A	N/A	This service will collect data from the following services: VPC Flow Logs and CloudFront Logs.	X	X	X	✓
`sqs-cloudwatch-consumer`	Service events generated by CloudWatch Events service	Check more info here.	N/A	N/A	This service will collect all Security Hub findings that have been sent to CloudWatch, no matter the source defined in the finding.	✓	X	X	X

In the service-events-all collector service, the <account_id> string is automatically replaced with the real value.

The values entered in <cwl_custom> must be unique values.

Collector configuration details

Depending on the data type chosen for collecting, the following service definitions could be added to the configuration inside the services section. The following are common properties that all services have:

regions (mandatory) - It must be a list with valid target region names to be used when collecting data. One processing thread will be created per region. See more info about the available regions here.
request_period_in_seconds (optional) - The period in seconds to be used between pulling executions (default value: 60)
pull_retries (optional) - Number of retries that will be executed when a pulling error occurs (default value: 3)
tag (optional) - Used for sending the data to a table different from the default one (in the configuration examples, they appear as commented lines).
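
As a rough illustration of how these common properties behave, the sketch below fills in the documented defaults and rejects a missing `regions` list. The function name `with_common_defaults` is hypothetical; the collector applies its own logic internally.

```python
# Defaults documented for the common service properties.
DEFAULTS = {"request_period_in_seconds": 60, "pull_retries": 3}


def with_common_defaults(service_config):
    """Return a copy of a service definition with the documented defaults."""
    regions = service_config.get("regions")
    if not isinstance(regions, list) or not regions:
        raise ValueError("`regions` must be a non-empty list of region names")
    # Explicit values in the service definition win over the defaults.
    return {**DEFAULTS, **service_config}


config = with_common_defaults(
    {"regions": ["us-east-1", "eu-west-1"], "pull_retries": 5}
)
print(config)
```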

Global predefined services

These service definitions can be used to collect, globally, the different data types available in AWS.

Service events

This is the configuration to be used when any service event needs to be collected from AWS, except Security Hub.

service-events-all:
  #tag: my.app.aws_service_events
  cloudwatch_sqs_queue_name: <queue_name>
  #auto_event_type: <bool>
  regions:
    - <region_a>
    - <region_b>
    - <region_c>

The default target table is cloud.aws.cloudwatch.events

This is the configuration to be used when Security Hub events need to be collected.

sqs-cloudwatch-consumer:
  #tag: <str>
  cloudwatch_sqs_queue_name: <queue_name>
  #auto_event_type: <bool>
  regions:
    - <region_a>
    - <region_b>
    - <region_c>

The SQS queue name is required

The default target table is cloud.aws.securityhub.findings

All audit events

There are two ways to get audit events. In case just a few events are going to be generated in the platform, using the API may be enough. However, when mid or high volumes are expected, saving those audit events in an S3 bucket would be the best choice. In this case, an SQS queue should be created to consume those events from the collector.

This is how the config file should be defined to retrieve audit events via API:

audit-events-all:
  #tag: <str with {placeholders}>
  #types:
    #- audits_api <str>
  #auto_event_type: <bool>
  #request_period_in_seconds: <int>
  #start_time: <datetime_iso8601_format>
  #drop_event_names: ["event1", "event2"] <list of str>
  regions:
    - <region_a>
    - <region_b>
    - <region_c>

Field	Type	Mandatory	Description
`tag`	string	no	Tag or tag format to be used, e.g. `my.app.aws_audit_events` or `cloud.aws.cloudtrail.{event_type}.{account_id}.{region_id}.{collector_version}`
`types`	list of strings (in yaml format)	no	Enable/Disable modules only when several modules per service are defined. To get audit events from API, this field should be set to `audits_api`.
`request_period_in_seconds`	integer	no	Period in seconds between data pulls. This value overrides the default value (60 seconds).
`start_time`	datetime	no	Datetime from which to start collecting data. It must match ISO-8601 format.
`auto_event_type`	boolean	no	Used to enable the auto categorization of message tagging.
`drop_event_names`	list of strings	no	If the value of the `eventName` field matches any of the values in this list, the event is discarded. For example, if this parameter is set to `["Decrypt", "AssumeRole"]` and the `eventName` of an event is `Decrypt` or `AssumeRole`, that event is discarded.
`regions`	list of strings (in yaml format)	yes, if defined in the “Collector definitions”.	Property name (`regions`) should be aligned with the one defined in the submodules_property property from the “Collector definitions”
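
The effect of `drop_event_names` can be pictured as a simple filter on the `eventName` field of each audit event. This is a minimal sketch of that behavior as the table describes it, not the collector's actual code; the function name `keep_event` is hypothetical.

```python
def keep_event(event, drop_event_names=()):
    """Return False when the event's eventName is in the drop list."""
    return event.get("eventName") not in set(drop_event_names)


events = [
    {"eventName": "Decrypt"},
    {"eventName": "RunInstances"},
    {"eventName": "AssumeRole"},
]
kept = [e for e in events if keep_event(e, ["Decrypt", "AssumeRole"])]
print(kept)  # [{'eventName': 'RunInstances'}]
```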

On the other hand, if S3 + SQS is the chosen option to get the audit events, the config file should match the following format:

audit-events-all:
  #tag: <str with {placeholders}>
  #types:
    #- audits_s3 <str>
  #request_period_in_seconds: <int>
  #start_time: <datetime_iso8601_format>
  #auto_event_type: <bool>
  audit_sqs_queue_name: <str>
  #s3_file_type_filter: <str (RegEx)>
  #use_region_and_account_id_from_event: <bool>
  regions:
    - region_a <str>
    - region_b <str>
    - region_c <str>

The default target table is cloud.aws.cloudwatch.events

Field	Type	Mandatory	Description
`tag`	string	no	Tag or tag format to be used, e.g. `my.app.aws_audit_events` or `cloud.aws.cloudtrail.{event_type}.{account_id}.{region_id}.{collector_version}`
`types`	list of strings (in yaml format)	no	Enable/Disable modules only when several modules per service are defined
`request_period_in_seconds`	integer	no	Period in seconds between data pulls. This value overrides the default value (60 seconds).
`start_time`	datetime	no	Datetime from which to start collecting data. It must match ISO-8601 format.
`auto_event_type`	boolean	no	Used to enable the auto categorization of message tagging.
`audit_sqs_queue_name`	string	yes	Name of the SQS queue to read from.
`s3_file_type_filter`	string	no	RegEx to retrieve proper file type from S3
`use_region_and_account_id_from_event`	bool	no	If `true` the `region` and `account_id` are taken from the event; else if `false`, they are taken from the account used to do the data pulling. Default: `true`
`regions`	list of strings (in yaml format)	yes, if defined in the “Collector definitions”.	Property name (`regions`) should be aligned with the one defined in the submodules_property property from the “Collector definitions”
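
To show how a tag format with placeholders and `use_region_and_account_id_from_event` could interact, here is a minimal sketch. It is an assumption-laden illustration: the function name `render_tag` is hypothetical, and reading the values from the CloudTrail record fields `recipientAccountId` and `awsRegion` is a guess at what "taken from the event" means, not confirmed collector behavior.

```python
def render_tag(template, event, puller_account_id, puller_region,
               event_type, collector_version,
               use_region_and_account_id_from_event=True):
    """Fill the placeholders of a tag format for one audit event."""
    if use_region_and_account_id_from_event:
        # Assumption: these CloudTrail record fields are the source.
        account_id = event.get("recipientAccountId", puller_account_id)
        region_id = event.get("awsRegion", puller_region)
    else:
        account_id, region_id = puller_account_id, puller_region
    # str.format ignores unused keyword arguments, so a plain tag
    # without placeholders is returned unchanged.
    return template.format(
        event_type=event_type,
        account_id=account_id,
        region_id=region_id,
        collector_version=collector_version,
    )


event = {"recipientAccountId": "111122223333", "awsRegion": "eu-west-1"}
template = ("cloud.aws.cloudtrail.{event_type}.{account_id}."
            "{region_id}.{collector_version}")
print(render_tag(template, event, "999999999999", "us-east-1", "events", "1"))
```

The account IDs, the `events` event type, and the version string `1` are placeholder values.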

All metrics

metrics-all:
  #tag: my.app.aws_metrics
  regions:
    - <region_a>
    - <region_b>
    - <region_c>

The default target table is cloud.aws.cloudwatch.metrics

CloudWatch Logs

One entry must be defined for each Log Group to be processed. In this example, two different entries (cwl_1, cwl_2) have been created to process the Log Groups called /aws/log_stream_a and /aws/log_stream_b.

cwl_1:
  #tag: my.app.aws_cwl
  types:
    - logs
  log_group: /aws/log_stream_a
  regions:
    - <region_a>
    - <region_b>
    - <region_c>
cwl_2:
  #tag: my.app.aws_cwl
  types:
    - logs
  log_group: /aws/log_stream_b
  regions:
    - <region_a>
    - <region_b>
    - <region_c>

As shown in the examples, the types list must always contain the fixed value logs.

The default target table is cloud.aws.cloudwatch.logs
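
When many Log Groups are involved, the per-group entries can be generated instead of written by hand. The sketch below builds uniquely named entries with the same keys used in the example above; the function name `cwl_services` and the `cwl_<n>` naming scheme are hypothetical, and the result is printed as JSON, which would still need converting to the YAML layout of the configuration file.

```python
import json


def cwl_services(log_groups, regions, prefix="cwl"):
    """Build one CloudWatch Logs service entry per log group."""
    if len(set(log_groups)) != len(log_groups):
        raise ValueError("log groups must not be repeated")
    return {
        # Numbering the entries keeps every service name unique.
        f"{prefix}_{index}": {
            "types": ["logs"],
            "log_group": group,
            "regions": list(regions),
        }
        for index, group in enumerate(log_groups, start=1)
    }


services = cwl_services(
    ["/aws/log_stream_a", "/aws/log_stream_b"], ["us-east-1"]
)
print(json.dumps(services, indent=2))
```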

Non-CloudWatch Logs

non-cloudwatch-logs:
  #tag: my.app.aws_cwl
  #vpcflowlogs_sqs_queue_name: <custom_queue_a>
  #cloudfront_sqs_queue_name: <custom_queue_b>
  #auto_event_type: <bool>
  regions:
    - <region_a>
    - <region_b>
    - <region_c>

The default target tables are cloud.aws.vpc.flowlogs and cloud.aws.cloudfront.

By default, this service expects existing SQS queues named devo-ncwl-vpcfl-<short_unique_identifier> and devo-ncwl-cfl-<short_unique_identifier>.

The properties vpcflowlogs_sqs_queue_name and cloudfrontlogs_sqs_queue_name can be used to set custom queue names instead of the default expected ones.

Run the collector

Once the data source is configured, you can send us the required information and we will host and manage the collector for you (Cloud collector), or you can host the collector on your own machine using a Docker image (On-premise collector).