...
Data source | Description | Collector service name | Devo table | Available from |
---|---|---|---|---|
Any | Any source you send to an SQS queue can be collected. | - | - | - |
- | VPC Flow Logs, CloudTrail, CloudFront, and/or AWS Config logs | - | - | - |
- | Use this service if the files are so large and hard to pull that the service above fails. | - | - | - |
- | Relational Database Audit Logs | - | - | - |
...
```json
{
    "services": {
        "custom_service": {
            "file_field_definitions": {},
            "filename_filter_rules": [],
            "encoding": "parquet",
            "file_format": {
                "type": "line_split_processor",
                "config": {"json": true}
            },
            "record_field_mapping": {},
            "routing_template": "my.app.ablo.backend",
            "line_filter_rules": []
        }
    }
}
```
The main things you need: `file_format` is the type of processor, and `routing_template` is the tag the data will be sent with.
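If the remaining fields can simply be left at their empty values (the full example above sets them to empty objects and lists), a pared-down service entry might look like the sketch below. This is an illustration, not a confirmed minimal config:

```python
# Sketch of a minimal service entry, keeping only the two fields called out
# above. Whether the other fields can be omitted entirely is an assumption;
# the full example simply sets them to {} / [].
minimal_service = {
    "file_format": {
        "type": "line_split_processor",  # which processor parses the files
        "config": {"json": True},        # written as lowercase true in the JSON config
    },
    "routing_template": "my.app.ablo.backend",  # the Devo tag to send to
}
```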
Collectors that need custom tags
Types of processors
Processor type | Description |
---|---|
- | Use this if the events come in as JSON in one massive object. |
- | Determines what the log is split by. |
- | For AWS access logs and `\n` splits. |
- | For one JSON object. |
- | Similar to the other separator processors. |
- | For the Bluecoat recipe. |
- | Splits by a configured value. |
- | For JSON arrays. |
- | Similar to the other separator processors. |
- | For Jamf log processing. |
- | For Parquet encoding. |
- | For GuardDuty processing. |
- | VPC service processor. |
- | VPC service processor. |
- | For the Kolide service. |
- | VPC service processor. |
- | RDS processor for the RDS service. |
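The `file_format` block couples a processor type with that processor's config. As a sketch of the two config shapes described in the options table further below (only `line_split_processor` appears verbatim on this page; the second type name is an assumption for illustration):

```python
# file_format for newline-separated JSON records. line_split_processor is
# the processor used in the example config earlier on this page.
line_split_format = {
    "type": "line_split_processor",
    "config": {"json": True},  # written as lowercase true in the JSON config
}

# file_format for records split on a custom separator. The type name below
# is an assumption for illustration; the "separator" option itself is
# described in the options table (the default separator is newline).
separated_format = {
    "type": "separated_json_processor",  # assumed name, not confirmed here
    "config": {"separator": "||"},
}
```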
Tagging
Tagging can be done in many different ways. One way is to use the file field definitions:
```json
"file_field_definitions": {
    "log_type": [
        {
            "operator": "split",
            "on": "/",
            "element": 2
        }
    ]
},
```
...
Our final tag is `my.app.test_cequence.detector`.
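As a sketch of how that parsing produces the tag, here is the equivalent logic in Python. The filename and the `[file-log_type]` reference syntax are assumptions for illustration; this page only confirms the `[record-<variable>]` form shown later:

```python
# Hypothetical object key delivered in the SQS message (assumption).
filename = "bucket/prod/detector/part-0000.json"

# file_field_definitions rule: operator "split" on "/", taking element 2.
log_type = filename.split("/")[2]  # -> "detector"

# Assumed routing_template using a file-variable reference; the bracketed
# [file-log_type] syntax mirrors the [record-type] syntax described below.
routing_template = "my.app.test_cequence.[file-log_type]"
tag = routing_template.replace("[file-log_type]", log_type)
print(tag)  # my.app.test_cequence.detector
```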
Options for filtering
Option | Description |
---|---|
file_field_definitions | Defined as a dictionary mapping of variable names (you decide) to lists of parsing rules. Each parsing rule has an operator with its own keys. Parsing rules are applied in the order they are listed in the configuration. For an example, see the filename walkthrough below. |
filename_filter_rules | A list of rules to filter out entire files. |
encoding | Takes any string. The example above uses `parquet`. |
- | Decides whether or not to delete messages from the queue after processing. Takes boolean values; if not specified, a default is used. |
file_format | A `type` (allowed values are the processor types listed above) plus a `config` holding that processor's options. For `line_split_processor`, the config option is `"json"`: true or false; setting json to true assumes the logs are newline-separated JSON and allows them to be parsed by the collector, enabling record-field mapping. For separator-based processors, specify the separator, e.g. `"separator": "||"`; the default is newline if left unset. |
record_field_mapping | A dictionary in which each key defines a variable that can be parsed out of each record and referenced later in filtering. See the walkthrough below. |
routing_template | A string defining how to build the tag to send each message, e.g. `my.app.datasource.[record-type]`. |
line_filter_rules | A list of rules to filter out individual records before sending. |
record_field_mapping is a dictionary: each key defines a variable that can be parsed out from each record (and may be referenced later in filtering). `keys` is a list of the key names in the record to look into to find the value; it exists to handle nesting, essentially defining a path through the data. Suppose we have nested logs: to get the log_type, we list all the keys needed to parse through the JSON in order, as in the sketch below.
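Here is a sketch of that nested case in Python, with a hypothetical record shaped to match the `log["file"]["type"]` path this section refers to:

```python
# Hypothetical nested record (assumption for illustration).
log = {"file": {"type": "security_log"}, "message": "..."}

# record_field_mapping entry: walk the keys in order to reach the value.
record_field_mapping = {
    "type": {"keys": ["file", "type"]}  # grabs log["file"]["type"] as "type"
}

# What the collector's extraction amounts to, step by step:
value = log
for key in record_field_mapping["type"]["keys"]:
    value = value[key]
print(value)  # security_log
```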
In many cases you will probably only need one key, e.g. flat JSON that isn't nested; there you would just specify `"keys": ["log_type"]`. A few operations are supported that can further alter the parsed information (like split and replace). The sketch above grabs whatever is located at `log["file"]["type"]` and names it "type". record_field_mapping defines variables by taking them from logs, and these variables can then be used for filtering. Say the log above, in JSON format, will be sent to Devo: specifying "type" in the record_field_mapping allows the collector to extract the value "security_log" and save it as type. Now suppose you want to change the tag dynamically based on that value: change the routing_template to something like `my.app.datasource.[record-type]`; the log above would then be sent to `my.app.datasource.security_log`. Finally, suppose you want to filter out (not send) any records whose type is security_log: you could write a line_filter_rule like the sketch below.
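This page does not show the rule schema, so every field name in the sketch below is an assumption for illustration; in pseudocode the intent is simply: if record["type"] == "security_log", drop the record.

```python
# Assumed line_filter_rules shape -- the field names here are illustrative
# guesses, not confirmed by this page. The intent, in pseudocode:
#   if record["type"] == "security_log": do not send the record
line_filter_rules = [
    [
        {
            "source": "record",       # filter on a record_field_mapping variable
            "key": "type",            # the variable defined above
            "type": "match",          # assumed operator name
            "value": "security_log",  # records matching this are dropped
        }
    ]
]
```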
Let's say you have a filename "logs/account_id/folder_name/filename" and you want to save the account_id as a variable to use for tag routing or filtering. You could write a file_field_definition like the sketch below: it stores a variable called account_id by taking the entire filename, splitting it into pieces wherever it finds a forward slash, and taking the element at position one.
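A sketch of that definition and its plain-Python equivalent, reusing the operator/on/element rule shape from the tagging example above:

```python
# file_field_definition storing account_id from the filename.
file_field_definitions = {
    "account_id": [
        {"operator": "split", "on": "/", "element": 1}
    ]
}

# In plain Python, the same extraction is:
filename = "logs/account_id/folder_name/filename"
account_id = filename.split("/")[1]  # -> "account_id"
```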
(Internal) Notes + Debugging

If something seems wrong at launch, you can set the following in the collector parameters / job config:

`"debug_mode": true`

This will print out data as it is being processed, stop messages from getting acked, and, at the last step, not actually send the data. That lets you see whether something is breaking without the customer receiving wrongly formatted repeat data, and without consuming from the queue and losing data.
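For orientation, here is a sketch of where the flag sits relative to the services block; the exact nesting of the collector parameters is an assumption here:

```python
# Assumed placement of debug_mode alongside the services config.
job_config = {
    "debug_mode": True,  # print records, skip acking, skip sending
    "services": {
        "custom_service": {
            "file_format": {"type": "line_split_processor", "config": {"json": True}},
            "routing_template": "my.app.ablo.backend",
        }
    },
}
```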