Amazon Security Lake collector

Overview

Amazon Security Lake is a service that automates the sourcing, aggregation, normalization, and data management of security data across your organization into a security data lake stored in your account. A security data lake helps make your organization’s security data broadly accessible to your preferred security analytics solutions to power use cases such as threat detection, investigation, and incident response.

Security Lake has adopted the Open Cybersecurity Schema Framework (OCSF), an open standard. With OCSF support, the service can normalize and combine security data from AWS and a broad range of enterprise security data sources.

Devo collector features

| Feature | Details |
| --- | --- |
| Allow parallel downloading (multipod) | allowed |
| Running environments | collector server, on-premise |
| Populated Devo events | table |
| Flattening preprocessing | no |
| Allowed source events obfuscation | yes |

Data sources

| Data source | Description | API endpoint | Collector service name | Devo table | Available from release |
| --- | --- | --- | --- | --- | --- |
| Amazon Security Lake | Security Lake records (in OCSF format) | AWS S3 and SQS | events | cloud.aws.amazon.security_lake.events | v1.0.0 |

For more information on how the events are parsed, visit our page.

Minimum configuration required for basic pulling

Although this collector supports advanced configuration, the fields required to retrieve data with a basic configuration are defined below.

This minimum configuration covers only the parameters specific to this integration. Other required parameters relate to the generic behavior of the collector; check the settings sections for details.

| Setting | Details |
| --- | --- |
| aws_access_key_id | Credentials aws_access_key_id |
| aws_secret_access_key | Credentials aws_secret_access_key |
| region | AWS region |
| queue_name | AWS SQS queue_name |
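As a rough illustration of the minimum configuration, the sketch below checks that the four settings listed above are present and non-empty. The flat dictionary layout and the `missing_settings` helper are assumptions made for this example; they do not describe the collector's actual configuration schema or validation code.

```python
from typing import List

# The four setting names come from the table above. The flat dict layout is
# an assumption for illustration, not the collector's real config schema.
REQUIRED_SETTINGS = (
    "aws_access_key_id",
    "aws_secret_access_key",
    "region",
    "queue_name",
)


def missing_settings(config: dict) -> List[str]:
    """Return the required settings that are absent or empty in `config`."""
    return [
        name
        for name in REQUIRED_SETTINGS
        if not str(config.get(name) or "").strip()
    ]


example = {
    "aws_access_key_id": "AKIAEXAMPLE",
    "aws_secret_access_key": "",  # left empty on purpose
    "region": "us-east-1",
}
print(missing_settings(example))  # ['aws_secret_access_key', 'queue_name']
```

Running a check like this before deployment can catch a missing value earlier than waiting for the collector to report it at startup.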

Accepted authentication methods

| Authentication method | AWS Access Key | AWS Secret Access Key |
| --- | --- | --- |
| AWS access key/secret | Required | Required |

Vendor setup

For a detailed walk-through of enabling/configuring Security Lake and enabling/configuring Security Lake providers, see the Security Lake Getting Started Guide.

This guide assumes that you have created an account with the necessary permissions to access the SQS queue(s) and S3 bucket(s) that were created during the initial enablement of Amazon Security Lake.

| Action | Steps |
| --- | --- |
| Obtain the credentials for the AWS Security Lake user. | Create and/or obtain the AWS access key ID and secret access key of the account that will be used to fetch the Security Lake SQS messages and S3 log files. Refer to the Security Lake Getting Started Guide for more information on creating an administrative user. |
| Add a Security Lake subscriber. | 1. Open the Security Lake console at https://console.aws.amazon.com/securitylake/. 2. Using the AWS Region selector in the upper-right corner of the page, select the Region where you want to create the subscriber. 3. In the navigation pane, choose Subscribers. |
| Enter subscriber details. | - |
| Choose to collect either all log and event sources or only specific log and event sources. | - |
| Choose the S3 data access method. | - |
| Set subscriber credentials. | 1. Enter the AWS Account ID that will be used to fetch SQS messages and the corresponding Security Lake log files from S3. 2. Enter a user-specified placeholder value for the external ID (for example, “devo-collector-external-id”). This value is not used by the collector but is required to create a subscriber. |
| Select SQS Queue for Notification details. | - |
| Choose Create to create the subscriber. | - |
| Obtain the SQS queue name from the newly created subscriber details page. | - |
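To make the relationship between the SQS queue and the S3 log files more concrete, the sketch below extracts the bucket name and object key from one notification body. The message shape used here (an EventBridge-style S3 notification with `detail.bucket.name` and `detail.object.key`) is an assumption for illustration and should be checked against a real message from your subscriber queue; the helper `s3_location` and the sample values are hypothetical.

```python
import json
from typing import Tuple


def s3_location(sqs_message_body: str) -> Tuple[str, str]:
    """Extract (bucket, key) from one notification body.

    Assumes an EventBridge-style S3 notification; verify the real shape
    against messages in your own subscriber queue.
    """
    event = json.loads(sqs_message_body)
    detail = event["detail"]
    return detail["bucket"]["name"], detail["object"]["key"]


# Hypothetical sample body, trimmed to the fields the helper reads.
sample_body = json.dumps({
    "source": "aws.s3",
    "detail": {
        "bucket": {"name": "example-security-lake-bucket"},
        "object": {"key": "example/prefix/file.gz.parquet"},
    },
})

bucket, key = s3_location(sample_body)
print(bucket, key)
```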

Assigning necessary permissions

The user must have already configured an instance of Amazon Security Lake. For general information and detailed steps on how to do so, please refer to the official Getting started - Amazon Security Lake documentation.

Credentials (aws_access_key_id and aws_secret_access_key) must be provided for a user that has access to the Security Lake S3 bucket and associated SQS queue (configured in the steps above).
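One way to sanity-check the permissions before running the collector is a short boto3 script such as the sketch below. It is not part of the collector; `check_access` and `build_clients` are hypothetical helpers, and the queue and bucket names are whatever your Security Lake subscriber created. Note that the calls used here need the `sqs:GetQueueUrl`, `sqs:GetQueueAttributes`, and S3 bucket-level read permissions, so a failure may only mean that those specific permissions are missing.

```python
def check_access(sqs_client, s3_client, queue_name: str, bucket_name: str) -> str:
    """Raise if the queue or bucket is not reachable; return the queue URL."""
    # Resolve the queue name to its URL (requires sqs:GetQueueUrl).
    queue_url = sqs_client.get_queue_url(QueueName=queue_name)["QueueUrl"]
    # Reading an attribute confirms the queue can actually be queried.
    sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=["ApproximateNumberOfMessages"],
    )
    # head_bucket raises a ClientError if the bucket is missing or access is denied.
    s3_client.head_bucket(Bucket=bucket_name)
    return queue_url


def build_clients(aws_access_key_id: str, aws_secret_access_key: str, region: str):
    """Create SQS and S3 clients from the collector credentials."""
    import boto3  # imported lazily so check_access can be exercised with stubs

    session = boto3.session.Session(
        aws_access_key_id=aws_access_key_id,
        aws_secret_access_key=aws_secret_access_key,
        region_name=region,
    )
    return session.client("sqs"), session.client("s3")
```

A typical use would be `check_access(*build_clients(key_id, secret, region), queue_name, bucket_name)`, run once from the machine that will host the collector.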

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

Events service

Once the collector has been launched, it is important to check that ingestion is working properly. To do so, go to the collector’s logs console.

This service has the following components:

| Component | Description |
| --- | --- |
| Setup | The setup module is in charge of authenticating the service and managing the token expiration when needed. |
| Puller | The puller module is in charge of pulling the data in an organized way and delivering the events via SDK. |

Setup output

A successful run has the following output messages for the setup module:

INFO MainThread -> [SETUP] ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) - Starting thread
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Puller Setup Started
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> We do not have a token. Getting a new one from the server.
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Attempting to get OAuth2 token from ThreatQuotient server....
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Attempting to get from client_id ThreatQuotient server....
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Successfully received a client_id token from (...)/assets/js/config.js
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Successfully received JWT token from (...) which expires in 3599 seconds
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Puller Setup Terminated
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Setup for module "ThreatQuotientDataPuller" has been successfully executed

Puller output

A successful initial run has the following output messages for the puller module:

Note that the PrePull action is executed only one time before the first run of the Pull action.

INFO MainThread -> [INPUT] ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) - Starting thread
WARNING ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Waiting until setup will be executed
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> PrePull Started
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> PrePull terminated
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Starting data collection every 5 seconds
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Pull Started. Retrieving timestamp: 2022-06-28 13:00:59.276966+00:00
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Started getting events from ThreatQuotient
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Started getting events from ThreatQuotient
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Started sending events to Devo
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Statistics for this pull cycle (@devo_pulling_id=1656421259.276966): Number of requests performed: 2; Number of events received: 1; Number of duplicated events filtered out: 0; Number of events generated and sent: 2 (from 1 unflattened events); Average of events per second: 4.179186813829765.
WARNING ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> last_fetched_event_id and last_update_time saved in state: {'last_polled_time': 1656342870.739774, 'reset_persistence_auth': '', 'all_events_ids': [17653]}
WARNING ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Last polled time saved in state: {'last_polled_time': 1656421259.276966, 'reset_persistence_auth': '', 'all_events_ids': [17653]}
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Pull terminated
INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Data collection completed. Elapsed time: 0.483 seconds. Waiting for 4.517 second(s) until the next one

After a successful collector execution (that is, no error logs found), you will see the following log message:

INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Statistics for this pull cycle (@devo_pulling_id=1655983326.290848): Number of requests performed: 2; Number of events received: 52; Number of duplicated events filtered out: 0; Number of events generated and sent: 52 (from 52 unflattened events); Average of events per second: 92.99414315733.

The value @devo_pulling_id is injected in each event to group all events ingested by the same pull action. You can use it to get the exact events downloaded in that Pull action in Devo’s search window.
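If you review collector logs offline, a small parser can pull the @devo_pulling_id and the counters out of each statistics line. The following is a minimal sketch based only on the message format shown in the sample above; the regular expression and the `parse_pull_stats` helper are illustrative and may need adjusting if the message format differs in your collector version.

```python
import re
from typing import Optional

# Pattern derived from the sample statistics line shown above.
STATS_RE = re.compile(
    r"@devo_pulling_id=(?P<pulling_id>[\d.]+)\):"
    r".*?Number of events received: (?P<received>\d+);"
    r".*?Number of duplicated events filtered out: (?P<duplicates>\d+);"
    r".*?Number of events generated and sent: (?P<sent>\d+)"
)


def parse_pull_stats(line: str) -> Optional[dict]:
    """Return the pull statistics found in `line`, or None if it is another message."""
    match = STATS_RE.search(line)
    if match is None:
        return None
    return {
        # Kept as a string so the identifier is not altered by float rounding.
        "pulling_id": match["pulling_id"],
        "received": int(match["received"]),
        "duplicates": int(match["duplicates"]),
        "sent": int(match["sent"]),
    }


sample = (
    "Statistics for this pull cycle (@devo_pulling_id=1655983326.290848): "
    "Number of requests performed: 2; Number of events received: 52; "
    "Number of duplicated events filtered out: 0; "
    "Number of events generated and sent: 52 (from 52 unflattened events); "
    "Average of events per second: 92.99414315733."
)
print(parse_pull_stats(sample))
# {'pulling_id': '1655983326.290848', 'received': 52, 'duplicates': 0, 'sent': 52}
```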

This collector uses persistent storage to download events in an orderly fashion and avoid duplicates. In case you want to re-ingest historical data or recreate the persistence, you can restart the persistence of this collector by following these steps:

  1. Edit the configuration file.

  2. Change the value of the reset_persistence_auth parameter to a different one.

  3. Save the changes.

  4. Restart the collector.

The collector will detect this change and restart the persistence using the parameters in the configuration file, or the default configuration if none has been provided.
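The idea behind the reset can be sketched as a comparison between the token stored in the persisted state and the one in the configuration. This is an illustration of the concept only, not the collector's implementation: the `load_state` helper is hypothetical, and the state keys are taken from the state dictionaries printed in the sample puller logs.

```python
def load_state(saved_state: dict, configured_auth: str) -> dict:
    """Return the state to start from, discarding it when the reset token changed."""
    if saved_state.get("reset_persistence_auth", "") != configured_auth:
        # Token changed: start from a clean state that remembers the new token,
        # so the reset happens only once per change.
        return {
            "last_polled_time": None,
            "reset_persistence_auth": configured_auth,
            "all_events_ids": [],
        }
    return saved_state


saved = {
    "last_polled_time": 1656421259.276966,
    "reset_persistence_auth": "",
    "all_events_ids": [17653],
}

# Same token: the saved state is reused as-is.
print(load_state(saved, "") is saved)
# Different token: a fresh state is returned.
print(load_state(saved, "reset-1"))
```

Any value works as the new token as long as it differs from the one currently stored.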

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

| Error type | Error ID | Error message | Cause | Solution |
| --- | --- | --- | --- | --- |
| SetupError | 1 | api_base_url must be specified | The api_base_url setting is missing. | Make sure the api_base_url setting is present under the events service in your configuration. |
| SetupError | 2 | api_base_url not of expected type: str | The api_base_url setting has a type other than string. | Make sure the api_base_url setting is a string. |
| SetupError | 3 | api_base_url must match one of two regexes: [...] | The api_base_url setting does not follow the expected format. | Make sure your api_base_url has this format: http[s]://{ip_address_or_domain}{:optional_port} |
| SetupError | 4 | Required setting, credentials not found in user configuration | There is no credentials section in your input settings. | Make sure there is a credentials section under the threatquotient_data_puller input in your configuration. |
| SetupError | 5 | Required setting, credentials not of expected type: dict | The credentials section is empty or has a simple type (is not an object). | Make sure the credentials section has username and password fields. |
| SetupError | 6 | Required setting, username not found in user configuration | The username setting is missing. | Make sure the username setting from the credentials section has a value. |
| SetupError | 7 | Required setting, username not of expected type: str | The username setting has a type other than string. | Make sure the username setting from the credentials section is a string. |
| SetupError | 8 | Required setting, password not found in user configuration | The password setting is missing. | Make sure the password setting from the credentials section has a value. |
| SetupError | 9 | Required setting, password not of expected type: str | The password setting has a type other than string. | Make sure the password setting from the credentials section is a string. |
| SetupError | 10 | Optional setting, verify_host_ssl_cert not of expected type: bool | The verify_host_ssl_cert setting has a type other than boolean. | Make sure the verify_host_ssl_cert setting is a boolean value (true/false). |
| SetupError | 11 | event_fetch_limit_in_items must be greater than or equal to [...] and less than equal to [...] | The event_fetch_limit_in_items setting has a value too low or too high for the specified limits. | Make sure the event_fetch_limit_in_items setting is an integer within the specified limits. |
| SetupError | 12 | devo_tag_map must have an entry named "default" | This error is not expected to happen in a regular flow. | This needs to be investigated by the collector’s developers. |
| SetupError | 13 | Required setting, reset_persistence_auth not of expected type: str | The reset_persistence_auth setting has a value, but its type is other than string. | Make sure the reset_persistence_auth setting is a string. |
| SetupError | 14 | Required setting, historical_poll_datetime not of expected type: str | The historical_poll_datetime setting has a type other than string. | Make sure the historical_poll_datetime setting is a string. |
| SetupError | 15 | historical_poll_datetime does not match expected format [...] | The historical_poll_datetime setting does not look like a valid date. | Make sure the historical_poll_datetime setting meets the mentioned format (a reference of this representation can be found here). |
| SetupError | 16 | Please enter valid date for historical_poll_datetime less than or equal to one year | The historical_poll_datetime setting is a date older than one year. | Make sure the historical_poll_datetime setting does not represent a date older than one year. |
| SetupError | 17 | Please enter valid date for historical_poll_datetime less than or equal to the current date | The historical_poll_datetime setting is a future date. | Make sure the historical_poll_datetime setting does not represent a future date. |
| InitVariablesError | 100 | Unexpected status code when fetching ThreatQuotient JWT: [...] | The response to the token request had an unexpected status code. | Make sure your credentials are correct. |
| InitVariablesError | 101 | Unexpected status code when fetching ThreatQuotient client_id: [...] | The collector is having issues connecting to the ThreatQ instance. | Make sure you have properly configured the api_base_url setting and that you can access the {api_base_url}/assets/js/config.js URL. |
| InitVariablesError | 102 | Cannot parse client_id from ThreatQuotient server | The collector was expecting to find the client’s ID, but could not find it. This is likely because the ThreatQ instance has been upgraded to a version the collector does not support. | This needs to be investigated by the collector’s developers. |
| ApiError | 400 | Unexpected status code when fetching ThreatQuotient events: [...] | This error happens when the collector tries to fetch the ThreatQ events from its REST API. | The error includes the HTTP error code as well as the response’s text. This information should be enough to understand why the error is happening. Otherwise, please contact support. |

Collector operations

This section is intended to explain how to proceed with specific operations of this collector.

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the initializer module:

Sender services

The Integrations Factory Collector SDK has three different sender services, depending on the type of event to deliver (internal, standard, and lookup). This collector uses the following Sender Services:

| Logging trace | Description |
| --- | --- |
| Number of available senders: 1 | Displays the number of concurrent senders available for the given Sender Service. |
| sender manager internal queue size: 0 | Displays the items available in the internal sender queue. |
| Total number of messages sent: 44, messages sent since "2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds) | Displays the number of events sent in total and since the last checkpoint. From the given example, the following conclusions can be drawn: 44 events were sent to Devo since the collector started; the last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00; 21 events were sent to Devo between the last UTC checkpoint and now; those 21 events required 0.007 seconds to be delivered. |
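The conclusions drawn from the sample trace can also be extracted programmatically. The sketch below parses a sender trace of the form shown above; the regular expression and the `parse_sender_trace` helper are assumptions based only on that sample and are not part of the SDK.

```python
import re
from datetime import datetime
from typing import Optional

# Pattern derived from the sample sender trace shown above.
TRACE_RE = re.compile(
    r'Total number of messages sent: (?P<total>\d+), '
    r'messages sent since "(?P<since>[^"]+)": (?P<recent>\d+) '
    r'\(elapsed (?P<elapsed>[\d.]+) seconds\)'
)


def parse_sender_trace(line: str) -> Optional[dict]:
    """Return the counters found in a sender trace, or None if the line does not match."""
    match = TRACE_RE.search(line)
    if match is None:
        return None
    return {
        "total_sent": int(match["total"]),
        # The checkpoint is an ISO-like timestamp with a UTC offset.
        "checkpoint": datetime.fromisoformat(match["since"]),
        "sent_since_checkpoint": int(match["recent"]),
        "elapsed_seconds": float(match["elapsed"]),
    }


trace = (
    'Total number of messages sent: 44, messages sent since '
    '"2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)'
)
stats = parse_sender_trace(trace)
print(stats["total_sent"], stats["sent_since_checkpoint"], stats["elapsed_seconds"])
# 44 21 0.007
```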

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

| Logging trace | Description |
| --- | --- |
| Number of available senders: 1 | Displays the number of concurrent senders available for the given Sender Service. |
| sender manager internal queue size: 0 | Displays the items available in the internal sender queue. |
| Standard - Total number of messages sent: 57, messages sent since "2023-01-10 16:09:16.116750+00:00": 0 (elapsed 0.000 seconds) | Displays the number of events sent in total and since the last checkpoint. From the given example, the following conclusions can be drawn: 57 events were sent to Devo since the collector started; the last checkpoint timestamp was 2023-01-10 16:09:16.116750+00:00; 0 events were sent to Devo between the last UTC checkpoint and now; delivery since the checkpoint took 0.000 seconds. |

To check the memory usage of this collector, look for the following log records, which are displayed every 5 minutes by default, always after running the memory-free process.

  • The used memory is displayed per running process; the sum of both values gives the total memory used by the collector.

  • The global pressure on the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory, in the form previous -> after.

Change log

| Release | Released on | Release type | Details | Recommendations |
| --- | --- | --- | --- | --- |
| v1.0.0 | Jun 1, 2023 | initial release | Initial release | |