General setup for S3 + SQS collector
Overview
| Data source | Description | Collector service name | Devo table | Available from |
|---|---|---|---|---|
| Any | Any source you send to an SQS queue can be collected. | - | - | - |
| VPC Flow Logs, CloudTrail, CloudFront, and/or AWS Config logs | - | - | - | - |
| - | Use this service if the files are so large and hard to pull that the service above fails. | - | - | - |
| Relational Database Audit Logs | - | - | - | - |
For each setup, you can use this general config:
```json
{
  "global_overrides": {
    "debug": false
  },
  "inputs": {
    "sqs_collector": {
      "id": "34523",
      "enabled": true,
      "credentials": {
        "aws_cross_account_role": "if provided",
        "aws_external_id": "if needed/supplied"
      },
      "region": "us-east-2",
      "base_url": "https://sqs.us-east-2.amazonaws.com/",
      "sqs_visibility_timeout": 120,
      "sqs_wait_timeout": 20,
      "sqs_max_messages": 1,
      "ack_messages": false,
      "direct_mode": false,
      "do_not_send": false,
      "compressed_events": false,
      "debug_md5": false,
      "services": {
        "aws_sqs_kubernetes": {
          "encoding": "gzip",
          "type": "unseparated_json_processor",
          "config": {
            "key": "logEvents"
          }
        }
      }
    }
  }
}
```
The available services are listed in the table above. Every part of a service definition can be overridden, so if you need to change the encoding, for example, you can do so freely. You can also leave a service empty, as in `"service_name": {}`.
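For instance, a `services` block that overrides the encoding for one service and leaves another entirely at its defaults might look like this (the service names here are placeholders for illustration, not a definitive list):

```json
"services": {
  "aws_sqs_cloudtrail": {
    "encoding": "gzip"
  },
  "aws_sqs_vpcflow": {}
}
```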
Custom services or overrides
For a custom service or override, the config can look like this:
"services": {
"custom_service": {
"file_field_definitions": {},
"filename_filter_rules": [],
"encoding": "parquet",
"file_format": {
"type": "line_split_processor",
"config": {"json": true}
},
"record_field_mapping": {},
"routing_template": "my.app.ablo.backend",
"line_filter_rules": []
}
}
The main things you need:
- `file_format`: the type of processor to apply to each file.
- `routing_template`: the tag the events should be sent with.
Collectors that need custom tags
Types of processors
| Processor | Description |
|---|---|
| - | Use when events come in as one massive JSON object. |
| - | Determines what the log is split by. |
| - | For AWS access logs and `\n` splits. |
| - | For a single JSON object. |
| - | Similar to the other separators. |
| - | For the Bluecoat recipe. |
| - | Splits by a configured value. |
| - | For JSON array processing. |
| - | Similar to the other separators. |
| - | For Jamf log processing. |
| - | For Parquet encoding. |
| - | For GuardDuty processing. |
| - | VPC service processor. |
| - | VPC service processor. |
| - | For the Kolide service. |
| - | VPC service processor. |
| - | RDS processor for the RDS service. |
More on processors:
- `config: {"key": "log"}` extracts the object under the given key when the file object looks like `{..., "log": {...}}`.
- Config options: `"json": true` or `false`. Setting `json` to `true` assumes the logs are newline-separated JSON and allows them to be parsed by the collector, which enables record-field mapping.
- Config options: specify the separator, e.g. `"separator": "||"`. The default is newline if left unset.
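As an illustration of what a separator-based processor does, the sketch below splits a file's contents into individual events on a configured separator (this mimics the behavior; it is not the collector's actual implementation):

```python
# A file's raw contents split into individual events on a configured
# separator, mimicking {"separator": "||"}; the default is newline.
raw = "event one||event two||event three"
events = raw.split("||")
print(events)  # ['event one', 'event two', 'event three']
```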
`record_field_mapping` is a dictionary where each key defines a variable that can be parsed out of each record and referenced later in filtering. For example, you might parse a value out of a certain key in the record (which may be multiple layers deep) and call it `type`. Each entry's `keys` value is a list describing how to find the value and handle nesting, essentially defining a path through the data. In many cases you will only need one key; for a flat JSON that isn't nested, you would just specify the key name, and the mapping would grab whatever is located at that key. Once a variable such as `type` is defined in `record_field_mapping`, you can change the tag dynamically based on its value by referencing it in the `routing_template`, or filter out (not send) any records whose `type` matches a filter rule.
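To show what a key path does, this hypothetical helper (not part of the collector) walks a nested record the way a `keys` list describes:

```python
from functools import reduce

def extract(record, keys):
    # Follow the list of keys through nested dicts, the way a
    # record_field_mapping path locates a value.
    return reduce(lambda obj, key: obj[key], keys, record)

record = {"file": {"type": "login", "size": 123}}
print(extract(record, ["file", "type"]))  # login
```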
`file_field_definitions` works the same way, but over the file name rather than the record. Given a filename, you could write a parsing rule against it, and the result would be stored as a named variable that can be referenced in the `routing_template` or in filters.
Tagging
Tagging can be done in many different ways. One way is by using the file field definitions. These are the elements parsed from the `filename` object. In the example below, the filename is split and the value at index 2 is selected; indexing starts at 0, as with arrays:

0 = `cequence-data`
1 = `cequence-devo-6x-NAieMI`
2 = `detector`

`"routing_template": "my.app.test_cequence.[file-log_type]"`

Our final tag is `my.app.test_cequence.detector`.
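The substitution above can be sketched as follows (the example filename is a hypothetical reconstruction; the collector performs this internally):

```python
def build_tag(filename, template):
    # Split the object key on "/" and take element 2 (0-based), as in
    # the cequence example, then substitute into the routing template.
    log_type = filename.split("/")[2]
    return template.replace("[file-log_type]", log_type)

tag = build_tag("cequence-data/cequence-devo-6x-NAieMI/detector/events.json",
                "my.app.test_cequence.[file-log_type]")
print(tag)  # my.app.test_cequence.detector
```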
Here is another example:

| Field | Description |
|---|---|
| `file_field_definitions` | A dictionary mapping variable names (you decide) to lists of parsing rules. Each parsing rule has an operator with its own keys. Parsing rules are applied in the order they are listed in the configuration. |
| `routing_template` | A string defining how to build the tag each message is sent with. Variables defined in `file_field_definitions` can be referenced in the template. |
Options for filtering
Line-level filters
These are a list of rules for filtering out individual events (`line_filter_rules`).
File-level filters
These are a list of rules for filtering out entire files, with the specified pattern applied to the file name (`filename_filter_rules`).
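As an illustration of file-level filtering, the sketch below drops any file whose name matches a configured pattern (this shows the idea only; it does not use the collector's actual rule syntax):

```python
import re

# Hypothetical patterns: drop temp files and anything under logs/ignore/.
filename_filter_rules = [r"\.tmp$", r"^logs/ignore/"]

def keep_file(name):
    # A file is processed only if no filter pattern matches its name.
    return not any(re.search(p, name) for p in filename_filter_rules)

print(keep_file("logs/app/2024-01-01.json"))  # True
print(keep_file("logs/app/part-0001.tmp"))    # False
```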