Custom SQS Collector

Purpose

The SQS collector can be configured to write any log to any table. Devo recommends use of a pre-built service that fits your logs. If the pre-built services do not fit, you should engage Devo professional services to create a custom service.

If you need to modify or filter logs, Devo recommends AWS Lambda.

Authorize It

Authorize SQS Data Access.
Add data to the S3 bucket. Preferably, the data should be in a consistent format. For example:
1. If the data is JSON objects, the keys of the JSON objects should be the same. Some objects can omit some keys.
2. If the data is comma separated value format, the number of columns must always be the same.
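The two consistency guidelines above can be checked on a local sample before any data reaches the bucket. The following is a minimal sketch, not part of the collector: the helper names `json_key_report` and `csv_column_counts` and the sample data are hypothetical, and the JSON sample is assumed to hold one object per line.

```python
import csv
import io
import json


def json_key_report(lines):
    """Return (all_keys, missing_per_object) for one-JSON-object-per-line input."""
    objects = [json.loads(line) for line in lines if line.strip()]
    all_keys = set()
    for obj in objects:
        all_keys.update(obj.keys())
    # Omitted keys are allowed; the report only makes them visible.
    missing = [sorted(all_keys - set(obj)) for obj in objects]
    return sorted(all_keys), missing


def csv_column_counts(text):
    """Return the set of distinct column counts found in CSV text."""
    return {len(row) for row in csv.reader(io.StringIO(text)) if row}


sample_json = [
    '{"eventName": "HeadObject", "awsRegion": "us-west-1"}',
    '{"eventName": "ListObjects"}',
]
keys, missing = json_key_report(sample_json)
print(keys)     # union of keys across the sample
print(missing)  # keys that each object omits

sample_csv = "a,b,c\n1,2,3\n4,5\n"
# More than one distinct count means the rows are inconsistent.
print(csv_column_counts(sample_csv))
```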

Gather Information

Get a log sample from the S3 bucket.
Determine if the S3 contents are compressed.
Choose a destination tag. Contact us for help checking whether a tag already exists, or use a my.app tag.
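One way to answer the compression question for a downloaded sample is to look at its first two bytes, since gzip data always starts with the magic bytes 0x1f 0x8b. This is a minimal sketch under that single assumption (gzip is the only compression checked); the helper names `looks_gzipped` and `file_is_gzipped` are hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_gzipped(first_bytes):
    """Return True if the given bytes start with the gzip magic number."""
    return first_bytes[:2] == GZIP_MAGIC


def file_is_gzipped(path):
    """Check a local copy of an S3 object by reading only its first two bytes."""
    with open(path, "rb") as handle:
        return looks_gzipped(handle.read(2))


plain = b'{"eventName": "HeadObject"}\n'
packed = gzip.compress(plain)
print(looks_gzipped(plain))
print(looks_gzipped(packed))
```

A file extension such as .gz is only a hint; checking the bytes avoids relying on naming.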

Run It

Simple Template

In the Cloud Collector App, create an SQS Collector instance using this parameters template, replacing the values enclosed in < >.

{
  "inputs": {
    "sqs_collector": {
      "id": "<FIVE_UNIQUE_DIGITS>",
      "services": {
        "custom_service": {<OPTIONS>,
          "routing_template": "<DESTINATION TAG>"
        }
      },
      "credentials": {
        "aws_cross_account_role": "arn:<PARTITION>:iam::<YOUR_AWS_ACCOUNT_NUMBER>:role/<YOUR_ROLE>",
        "aws_external_id": "<EXTERNAL_ID>"
      },
      "region": "<REGION>",
      "base_url": "https://sqs.<REGION>.amazonaws.com/<YOUR_AWS_ACCOUNT_NUMBER>/<QUEUE_NAME>"
    }
  }
}
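If you prefer to generate the Simple Template rather than edit it by hand, the placeholders can be filled in programmatically. The sketch below is illustrative only: `build_simple_params` is a hypothetical helper, every value passed to it is a made-up example, and the `<OPTIONS>` placeholder is left out, so any service options still have to be added to `custom_service`.

```python
import json


def build_simple_params(collector_id, partition, account, role,
                        external_id, region, queue_name, destination_tag):
    """Fill the Simple Template placeholders and return the parameters as a dict."""
    if not (collector_id.isdigit() and len(collector_id) == 5):
        raise ValueError("id must be five digits")
    return {
        "inputs": {
            "sqs_collector": {
                "id": collector_id,
                "services": {
                    "custom_service": {
                        "routing_template": destination_tag,
                    }
                },
                "credentials": {
                    "aws_cross_account_role": f"arn:{partition}:iam::{account}:role/{role}",
                    "aws_external_id": external_id,
                },
                "region": region,
                "base_url": f"https://sqs.{region}.amazonaws.com/{account}/{queue_name}",
            }
        }
    }


# All values below are placeholders for illustration.
params = build_simple_params(
    collector_id="12345",
    partition="aws",
    account="123456789012",
    role="example-role",
    external_id="example-external-id",
    region="us-west-1",
    queue_name="example-queue",
    destination_tag="my.app.example.logs",
)
print(json.dumps(params, indent=2))
```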

Flexible Template

{
  "global_overrides": {
    "debug": false
  },
  "inputs": {
    "sqs_collector": {
      "id": "12351",
      "enabled": true,
      "credentials": {
        "aws_access_key_id": "",
        "aws_secret_access_key": "",
        "aws_base_account_role": "arn:aws:iam::837131528613:role/devo-xaccount-cs-role",
        "aws_cross_account_role": "",
        "aws_external_id": ""
      },
      "ack_messages": true,
      "direct_mode": false,
      "do_not_send": false,
      "compressed_events": false,
      "base_url": "https://us-west-1.queue.amazonaws.com/id/name-of-queue",
      "region": "us-west-1",
      "sqs_visibility_timeout": 240,
      "sqs_wait_timeout": 20,
      "sqs_max_messages": 1,
      "services": {
        "custom_service": {
          "file_field_definitions": {},
          "filename_filter_rules": [],
          "encoding": "gzip",
          "send_filtered_out_to_unknown": false,
          "file_format": {
            "type": "line_split_processor",
            "config": {
              "json": true
            }
          },
          "record_field_mapping": {
            "event_simpleName": {
              "keys": [
                "event_simpleName"
              ]
            }
          },
          "routing_template": "destination tag",
          "line_filter_rules": [
            [
              {
                "source": "record",
                "key": "event_simpleName",
                "type": "match",
                "value": "EndOfProcess"
              }
            ],
            [
              {
                "source": "record",
                "key": "event_simpleName",
                "type": "match",
                "value": "DeliverLocalFXToCloud"
              }
            ]
          ]
        }
      }
    }
  }
}

Parameters

File Format Processors

Processors are selected in the type section within file_format. The processor must match the format of the event in the queue.
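The Flexible Template selects line_split_processor with "json": true. Judging by the name alone, this suggests input where each line holds one JSON object. The sketch below only illustrates that input shape so you can compare it with your own sample; it is an assumption based on the processor name, not the collector's implementation, and `split_lines_as_json` is a hypothetical helper.

```python
import json


def split_lines_as_json(text):
    """Parse text that holds one JSON object per line, skipping blank lines."""
    records = []
    for line in text.splitlines():
        if line.strip():
            records.append(json.loads(line))
    return records


sample = '{"event_simpleName": "EndOfProcess"}\n{"event_simpleName": "ProcessRollup2"}\n'
for record in split_lines_as_json(sample):
    print(record["event_simpleName"])
```

If a sample from the bucket fails to parse this way, its format probably differs from the one this configuration expects.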

Line-level filters

Example 1

We want to discard all the events that match these conditions:

if record.eventName == "HeadObject" or record.eventName == "ListObjects" or record.eventName == "HeadBucket" or record.eventName == "GetBucketLocation":
    do_not_send_record()

eventName is one of these values: HeadObject, ListObjects, HeadBucket, GetBucketLocation

In Devo, the following query expresses the same criteria. Once the collector is configured correctly, this query should return no events:

from cloud.aws.cloudtrail.s3 where eventName = "HeadObject" or eventName = "ListObjects" or eventName = "HeadBucket" or eventName = "GetBucketLocation"

In this case, the filter key is eventName, so we first add that key to the collector in the record_field_mapping section. We then apply the corresponding filters in the line_filter_rules section, as follows:

"record_field_mapping": {
  "eventName": {
    "keys": ["eventName"]
  }
},
"line_filter_rules": [
    [{"source": "record", "key": "eventName", "type": "match", "value": "HeadObject"}],
    [{"source": "record", "key": "eventName", "type": "match", "value": "ListObjects"}],
    [{"source": "record", "key": "eventName", "type": "match", "value": "HeadBucket"}],
    [{"source": "record", "key": "eventName", "type": "match", "value": "GetBucketLocation"}]
]

Elements in different lists are OR conditions. Elements in the same list are AND conditions.

Note that these filters are exclusion rules: if an event matches them, the collector does not send it to Devo.
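The OR/AND grouping can be pictured as a small evaluator. This is a sketch of the logic as described here, not the collector's code: `condition_holds` and `should_drop` are hypothetical names, "match" is assumed to be a regular-expression search on the field value, and a missing key is assumed to behave like an empty string.

```python
import re

# Rules from Example 1: four single-condition groups, combined with OR.
RULES = [
    [{"source": "record", "key": "eventName", "type": "match", "value": "HeadObject"}],
    [{"source": "record", "key": "eventName", "type": "match", "value": "ListObjects"}],
    [{"source": "record", "key": "eventName", "type": "match", "value": "HeadBucket"}],
    [{"source": "record", "key": "eventName", "type": "match", "value": "GetBucketLocation"}],
]


def condition_holds(condition, record):
    """Evaluate one condition against a record (regex search is an assumption)."""
    value = str(record.get(condition["key"], ""))
    found = re.search(condition["value"], value) is not None
    return found if condition["type"] == "match" else not found


def should_drop(rules, record):
    """Drop the record if every condition in at least one group holds."""
    for group in rules:                      # groups are OR-ed
        if all(condition_holds(c, record) for c in group):  # conditions are AND-ed
            return True
    return False


print(should_drop(RULES, {"eventName": "HeadBucket"}))
print(should_drop(RULES, {"eventName": "PutObject"}))
```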

Example 2

What if we want to filter out the events that match this pseudocode query that has mixed conditions?

if record.type != "ldap" OR (record.main-log_ornot == "main-log" AND record.type == "kube-apiserver-audit"):
    do_not_send_record()

In this case, the keys for the filter are type and main-log_ornot, so first we need to add the keys to the collector in the record_field_mapping section. Once we’ve added the keys, we apply the corresponding filters. In this case, the filters would be as follows:

"record_field_mapping": {
  "type": {
    "keys": ["type"]
  },
  "main-log_ornot": {
    "keys": ["main-log_ornot"]
  }
},
"line_filter_rules": [
    [{"source": "record", "key": "type", "type": "doesnotmatch", "value": "ldap"}],
    [
        {"source": "record", "key": "main-log_ornot", "type": "match", "value": "main-log"}, 
        {"source": "record", "key": "type", "type": "match", "value": "kube-apiserver-audit"}
    ]
]

Elements in different lists are OR conditions. Elements in the same list are AND conditions.

Note that these filters are exclusion rules: if an event matches them, the collector does not send it to Devo.
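To see how the mixed conditions of Example 2 play out, the rules can be run against a few sample records. This is a rough sketch under assumptions: `holds` and `is_dropped` are hypothetical helpers, "match" and "doesnotmatch" are treated as a regular-expression search and its negation, and the sample records are invented.

```python
import re

RULES = [
    [{"source": "record", "key": "type", "type": "doesnotmatch", "value": "ldap"}],
    [
        {"source": "record", "key": "main-log_ornot", "type": "match", "value": "main-log"},
        {"source": "record", "key": "type", "type": "match", "value": "kube-apiserver-audit"},
    ],
]


def holds(cond, record):
    """One condition: regex search on the field, negated for doesnotmatch."""
    found = re.search(cond["value"], str(record.get(cond["key"], ""))) is not None
    return found if cond["type"] == "match" else not found


def is_dropped(record):
    """OR across groups, AND inside each group."""
    return any(all(holds(c, record) for c in group) for group in RULES)


samples = [
    {"type": "ldap", "main-log_ornot": "other"},
    {"type": "ldap", "main-log_ornot": "main-log"},
    {"type": "kube-apiserver-audit", "main-log_ornot": "main-log"},
    {"type": "syslog", "main-log_ornot": "other"},
]
for record in samples:
    print(record, "dropped" if is_dropped(record) else "sent")
```

Running a table like this against your own sample records is a quick way to confirm that a rule set excludes what you intend before deploying it.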

File-level filters

This is a list of rules that filter out entire files by applying the specified pattern to the file name.

Example 1

"filename_filter_rules": [
    [{"type": "match", "pattern": "CloudTrail-Digest"}],
    [{"type": "match", "pattern": "ConfigWritabilityCheckFile"}]
]

This filters out files whose names contain CloudTrail-Digest or ConfigWritabilityCheckFile. For example:

2024/01/01/CloudTrail-Digest-2024-01-01-00-00-00-123456789012.gz will be skipped.
2024/01/01/ConfigWritabilityCheckFile-2024-01-01-00-00-00-123456789012.gz will be skipped.
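The effect of these rules on a given object key can be previewed locally. The sketch below is illustrative and rests on assumptions: `file_is_skipped` is a hypothetical helper, "match" is treated as a regular-expression search anywhere in the file name, and rules inside one list are assumed to combine with AND in the same way as line-level filters.

```python
import re

# Rules from Example 1 of the file-level filters.
RULES = [
    [{"type": "match", "pattern": "CloudTrail-Digest"}],
    [{"type": "match", "pattern": "ConfigWritabilityCheckFile"}],
]


def file_is_skipped(rules, filename):
    """Return True if any rule group fully applies to the file name."""
    def holds(rule):
        found = re.search(rule["pattern"], filename) is not None
        return found if rule["type"] == "match" else not found

    return any(all(holds(rule) for rule in group) for group in rules)


names = [
    "2024/01/01/CloudTrail-Digest-2024-01-01-00-00-00-123456789012.gz",
    "2024/01/01/ConfigWritabilityCheckFile-2024-01-01-00-00-00-123456789012.gz",
    "2024/01/01/CloudTrail-2024-01-01-00-00-00-123456789012.gz",
]
for name in names:
    print(name, "skipped" if file_is_skipped(RULES, name) else "processed")
```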

Example 2

"filename_filter_rules": [
    [{"type": "doesnotmatch", "pattern": "CloudTrail"}],
    [{"type": "match", "pattern": "CloudTrail-Digest"}]
]

This filters out files whose names do not contain CloudTrail, or that do contain CloudTrail-Digest. For instance:

2024/01/01/CloudTrail-2024-01-01-00-00-00-123456789012.gz will be processed.
2024/01/01/CloudTrail-Digest-2024-01-01-00-00-00-123456789012.gz will be skipped.

The config can include "debug_mode": true to print useful information as logs come in.

For local testing, it is useful to set ack_messages to false so that the collector processes messages without removing them from the queue. Remember to remove this setting or set it back to true before launching the collector. If ack_messages is not set, messages are acknowledged by default.

If something seems wrong at launch, you can set the following in the collector parameters/job config:

"debug": true,
"do_not_send": true,
"ack_messages": false

While ack_messages is false you will see duplicates, so set it back to true when you are done.

These settings print the data as it is processed, stop messages from being acknowledged, and skip the final step of sending the data to Devo. This makes it easy to check whether something is not working properly.
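To avoid leaving these troubleshooting values in the production configuration by accident, they can be applied to a copy of the parameters. This is a minimal sketch: `with_debug_overrides` is a hypothetical helper, and the placement of each key (debug under global_overrides, the other two under inputs.sqs_collector) is an assumption taken from the layout of the Flexible Template.

```python
import copy


def with_debug_overrides(params):
    """Return a copy of the parameters with the troubleshooting flags set."""
    debug_params = copy.deepcopy(params)
    debug_params.setdefault("global_overrides", {})["debug"] = True
    collector = debug_params["inputs"]["sqs_collector"]
    collector["do_not_send"] = True    # process but do not send to Devo
    collector["ack_messages"] = False  # leave messages on the queue
    return debug_params


production = {
    "global_overrides": {"debug": False},
    "inputs": {"sqs_collector": {"ack_messages": True, "do_not_send": False}},
}
troubleshooting = with_debug_overrides(production)
print(troubleshooting)
print(production)  # the original parameters are left unchanged
```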