Files fetcher
Devo File Fetcher is an extension built on osquery’s extensibility framework that allows the Devo UA Manager to carve file contents and upload them to Devo. As a result, endpoints can be configured to scan an arbitrary number of files and folders, process them, and upload their contents automatically.
Three tables are added to the standard osquery schema:
- devo_files_config: returns general configuration information from the fetchfiles extension.
- devo_files_info: shows statistics about the files processed.
- devo_files: provides access to the content of the files processed by the extension.
By default, the DevoFetchFilesPack is included in the Universal Agent solution, containing the minimum set of queries that implement the file-processing functionality. The pack makes use of the tables listed above.
Basic configuration
In general terms, the File Fetcher extension works by processing one or more paths defined in its configuration and uploading the contents of the files in those paths line by line. This configuration is defined in the Ansible role duam-packs along with the rest of the settings for the UA Manager. To add a new configuration to the File Fetcher extension, follow these steps:
1. Edit the options.yaml file of the duam-packs role, located at $HOME/devo-ua-deployer/playbooks/roles/duam-packs/files/devo-packs/options.yaml.
2. Locate the devo_extensions and fetchfiles sections in that file. They should read as shown in the following code snippet:
devo_extensions:
  fetchfiles:
    watchdog:
      tag: box.devo_ua.files
      file_buffer_size: 131072 # 128K
      max_number_of_parts_per_file: 2000
      paths:
        - pattern: /var/log/syslog
        - pattern: /var/log/system.log
        - pattern: C:\Program Files (x86)\Apache Software Foundation\Tomcat*\logs\*
        - pattern: C:\Program Files\Apache Software Foundation\Tomcat*\logs\*
3. Add a pattern entry under paths for each file or set of files to be ingested. As an example, the last pattern in the following snippet ingests the default IIS website logs into Devo:
devo_extensions:
  fetchfiles:
    watchdog:
      tag: box.devo_ua.files
      file_buffer_size: 131072 # 128K
      max_number_of_parts_per_file: 2000
      paths:
        - pattern: /var/log/syslog
        - pattern: /var/log/system.log
        - pattern: C:\Program Files (x86)\Apache Software Foundation\Tomcat*\logs\*
        - pattern: C:\Program Files\Apache Software Foundation\Tomcat*\logs\*
        - pattern: C:\inetpub\logs\LogFiles\W3SVC1\*
4. Save all changes to the options.yaml file.
5. Run the duam-packs playbook to apply the changes by executing the following commands (make sure the path to the inventory file is correct):
$ cd $HOME/devo-ua-deployer
$ ansible-playbook -i inventories/<YOUR INVENTORY FILE NAME>.yaml playbooks/duam-packs.yaml
Once the playbook is run, File Fetcher will automatically start reading the data and upstreaming it to Devo.
Data access in Devo
By default, the content of all uploaded files is ingested into Devo under the box.devo_ua.files tag.
This destination data structure can be configured to point at any my.app.*.* tag.
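As a quick pre-deployment check, the my.app.*.* shape mentioned above can be tested with a few lines of Python. This is only a sketch: the helper name is_custom_tag is hypothetical, and it assumes that a custom tag has exactly four dot-separated, non-empty levels starting with my.app, which is how the tags in this document are written.

```python
def is_custom_tag(tag: str) -> bool:
    """Return True if the tag looks like my.app.<level3>.<level4>.

    Assumption: exactly four non-empty levels, as in the my.app.*.* form
    used in this document. This is not an official validation rule.
    """
    parts = tag.split(".")
    return len(parts) == 4 and parts[:2] == ["my", "app"] and all(parts[2:])


if __name__ == "__main__":
    for candidate in ("my.app.ua.files", "box.devo_ua.files", "my.app.ua"):
        print(candidate, is_custom_tag(candidate))
```

The default box.devo_ua.files tag is deliberately reported as not custom, since it does not follow the my.app.*.* form.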
Options
File Fetcher supports two different levels of configuration:
- Global: Sets the overall behavior of the extension.
- Per pattern: Allows specific settings for each configured source file path.
Global options
The following options are available as global settings of the extension:
- config_refresh: specifies the interval at which the agent looks for updates to the configuration of the File Fetcher extension in the UAM. Can be expressed in seconds (s), minutes (m), and hours (h).
- watchdog: configuration block specific to the capturing function. It accepts the following settings:
  - scan_each: specifies the interval at which all specified paths are re-scanned for changes (e.g., detection of new files). The examples in this document use values such as 30s and 1m.
  - file_buffer_size (number): size of each processed chunk. The examples in this document give the value in bytes (e.g., 131072 for 128K).
  - max_number_of_parts_per_file (number): maximum number of processed events per chunk.
  - tag (Devo tag): default destination in Devo for all ingested files. Can be overridden in the patterns options.
  - allow_empty_paths (false | true): allows the usage of an empty paths section (i.e., paths: []).
The following example illustrates how these options are configured in the yaml file:
devo_extensions:
  fetchfiles:
    config_refresh: 30s
    watchdog:
      scan_each: 30s
      file_buffer_size: 102400 # 100k
      max_number_of_parts_per_file: 10000
      tag: my.app.ua.files
      allow_empty_paths: true
      paths: []
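To make the units in the example above concrete, the following Python sketch converts interval strings such as 30s or 10m into seconds and reproduces the buffer sizes shown in the comments (128K and 100k). The helper names interval_to_seconds and kib_to_bytes are hypothetical, and the sketch assumes that an interval is a whole number followed by a single unit letter. How the agent itself parses these values is not documented here.

```python
import re

# Seconds per unit letter, following the s/m/h units described above.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}


def interval_to_seconds(value: str) -> int:
    """Convert a string such as '30s', '10m' or '1h' into seconds.

    Assumption: a whole number followed by one unit letter.
    """
    match = re.fullmatch(r"(\d+)([smh])", value.strip())
    if match is None:
        raise ValueError(f"unsupported interval: {value!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


def kib_to_bytes(kib: int) -> int:
    """Convert a size in units of 1024 bytes into bytes."""
    return kib * 1024


if __name__ == "__main__":
    print(interval_to_seconds("30s"))  # 30
    print(interval_to_seconds("10m"))  # 600
    print(kib_to_bytes(128))           # 131072
    print(kib_to_bytes(100))           # 102400
```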
Patterns options
The patterns section defines the file paths to scan along with their respective scanning options. These options are described in the following list:
- pattern (string): Specifies the set of files to scan using a subset of glob patterns. Additionally, ‘**’ patterns are supported to denote the processing of full folders and subfolders. Examples of valid pattern definitions are:
- pattern: /tmp/test/*
- pattern: /tmp/test/**/*.log
- pattern: /tmp/test/**/*
- pattern: /tmp/test/a/**
- pattern: /tmp/test/a/**/*
- pattern: /tmp/test/a/*/*
- pattern: /tmp/test/a1/b/{1,2}*.txt
- tag: data structure in Devo where the content of the files matching the pattern will be uploaded.
- content_separator (string): defines an event delimiter string. By default, events are processed as full line events.
- file_processor (fixed | multiline): enables multiline event processing in conjunction with the content_separator string. The default value is fixed (single-line events).
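Before deploying a pattern, it can help to preview roughly which files it would match. The sketch below uses Python's standard glob module as a stand-in; it is not the extension's own matcher. The extension supports only a subset of glob patterns, and Python's glob does not expand brace sets such as {1,2}, so results can differ. The helper names and the sample file names are hypothetical.

```python
import glob
import os
import tempfile


def preview_matches(pattern: str) -> list[str]:
    """Return the regular files matched by a glob pattern, sorted."""
    # recursive=True lets '**' span any number of directory levels.
    hits = glob.glob(pattern, recursive=True)
    return sorted(path for path in hits if os.path.isfile(path))


def build_sample_tree(root: str) -> None:
    """Create a small throwaway tree with hypothetical file names."""
    for rel in ("a.log", "a/b.log", "a/b/c.log", "a/b/notes.txt"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write("sample\n")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        build_sample_tree(root)
        for pattern in ("*", "**/*.log", "a/*/*"):
            matches = preview_matches(os.path.join(root, pattern))
            print(pattern, [os.path.relpath(m, root) for m in matches])
```

In this sample tree, the single-level pattern * only reaches a.log, while **/*.log also picks up the log files in the subfolders.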
The following example illustrates the usage of these options:
devo_extensions:
  fetchfiles:
    config_refresh: 10m
    watchdog:
      scan_each: 1m
      tag: my.app.ua.files
      paths:
        - pattern: C:\flog\logs\apache\**\error*log
          tag: my.app.ua.apache-error
        - pattern: C:\flog\logs\apache\common*log
          tag: my.app.ua.apache-common
          content_separator: "a"
        - pattern: C:\flog\logs\apache\combined*log
          tag: my.app.ua.apache-combined
        - pattern: C:\flog\logs\xml\notes_xml?.log
          content_separator: <note>
          file_processor: multiline
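The notes_xml pattern in the example above combines content_separator with the multiline processor. The following Python sketch illustrates one plausible reading of that combination, in which every occurrence of the separator starts a new event. This is an assumption made for illustration: the exact splitting rules of the extension (for instance, whether the separator is kept in the event) are not described in this document, and split_events is a hypothetical helper.

```python
def split_events(text: str, separator: str) -> list[str]:
    """Split text into multiline events.

    Assumption: each occurrence of the separator marks the start of a
    new event, and the separator is kept as part of that event.
    """
    parts = text.split(separator)
    events = []
    # Any text before the first separator is treated as its own event.
    head = parts[0].strip()
    if head:
        events.append(head)
    for part in parts[1:]:
        events.append((separator + part).strip())
    return events


if __name__ == "__main__":
    sample = (
        "<note>\n  <to>Ana</to>\n</note>\n"
        "<note>\n  <to>Luis</to>\n</note>\n"
    )
    for event in split_events(sample, "<note>"):
        print(repr(event))
```

With the default fixed processor, the same sample would instead be handled as six single-line events.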
Update configuration of patterns↔tags
To avoid incoherent data in Devo, updating tags on the fly isn't permitted. The File Fetcher checks that each pattern and its tag match those set during the initial configuration.
To reconfigure a pattern to a different tag:
- Remove the pattern that you want to reconfigure from the options.yaml file and deploy the duam-packs playbook (ansible-playbook -i inventories/your_inventory.yaml playbooks/duam-packs.yaml).
- Wait until the configuration is propagated to the agents. The default configuration refresh interval is 15 minutes and is set with the config_refresh option in options.yaml.
- Once the configuration is propagated, recreate the pattern in the options.yaml file, this time pointing to the new tag.
- Deploy the duam-packs playbook (ansible-playbook -i inventories/your_inventory.yaml playbooks/duam-packs.yaml).
- The data will start ingesting from the beginning of the file in the new tag.
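Since a pattern cannot simply be pointed at a new tag, it may be useful to detect such changes before deploying. The sketch below compares two watchdog blocks, represented as plain Python dictionaries with the same keys as the YAML examples, and lists the patterns whose effective tag differs. The function names are hypothetical, and the check is only an approximation of whatever validation the extension performs internally.

```python
DEFAULT_TAG = "box.devo_ua.files"  # default destination named in this document


def effective_tags(watchdog: dict) -> dict[str, str]:
    """Map each pattern to its own tag, or to the block-level default."""
    default_tag = watchdog.get("tag", DEFAULT_TAG)
    return {
        entry["pattern"]: entry.get("tag", default_tag)
        for entry in watchdog.get("paths", [])
    }


def retagged_patterns(deployed: dict, proposed: dict) -> list[str]:
    """Return patterns present in both configurations with a different tag."""
    before = effective_tags(deployed)
    after = effective_tags(proposed)
    shared = before.keys() & after.keys()
    return sorted(p for p in shared if before[p] != after[p])


if __name__ == "__main__":
    deployed = {
        "tag": "my.app.ua.files",
        "paths": [{"pattern": "/var/log/syslog"}],
    }
    proposed = {
        "tag": "my.app.ua.files",
        "paths": [{"pattern": "/var/log/syslog", "tag": "my.app.ua.syslog"}],
    }
    print(retagged_patterns(deployed, proposed))
```

Any pattern reported by retagged_patterns would need to go through the remove, wait, and recreate procedure described above.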
Performance considerations
Depending on the configuration of the file-fetching mechanism, there might be an impact on the sizing of the UAM elements as well as on the data volumes ingested into Devo. A general recommendation is to introduce configurations one by one, with a clear and precise specification of the files and contents to be uploaded.
Furthermore, consider combining the file fetcher functionality with automatic labeling of endpoints and their corresponding configuration profiles (e.g., scan Apache logs in the designated paths only if the endpoint is running an Apache webserver process).