Files Fetcher

Devo Files Fetcher is an extension built on top of Osquery’s extensibility framework that allows the Devo EA Manager to carve file contents and upload them to Devo. Endpoints can thus be configured to scan an arbitrary number of files and folders, process them, and upload their contents automatically.

Three tables are added to the standard Osquery schema:

  • devo_files_config: Returns general configuration information from the fetchfiles extension.

  • devo_files_info: Shows statistics about the files processed.

  • devo_files: Provides access to the content of the files processed by the extension.

By default, the DevoFetchFilesPack is included in the Endpoint Agent solution. It contains the minimum set of queries needed to implement the file processing functionality and makes use of the tables listed above.

Basic configuration

From EA 1.3, the platform is deployed using virtual environments to avoid issues with software already installed on the server. Make sure to activate the virtual environment before running any operation that requires Ansible or Python tools.

In general terms, the filefetcher extension works by processing one or more paths defined in its configuration and uploading the contents of the files in those paths line by line. This configuration is defined in the Ansible role ‘deam-packs’ along with the rest of the settings for the EA Manager. To add a new configuration to the Files Fetcher extension, follow these steps:

  1. Edit the inventory file used in the Devo EA Manager deployment.

  2. Under the “vars” section, add one of the following fields, matching the operating system of the endpoint.

  • Windows

    deam_fleet_config_devoext_fetchfiles_paths_win:

  • Linux

    deam_fleet_config_devoext_fetchfiles_paths_nix:

  • Darwin

    deam_fleet_config_devoext_fetchfiles_paths_darwin:

  3. Add a pattern section for each individual file or set of files to be found in the path. As an example, the last pattern in the snippet shows the configuration to ingest the contents of the default website IIS logs into Devo:

  4. Save all changes to the inventory file.

  5. Run the deam-packs playbook to apply the changes, making sure the path to the inventory file is correct (for example, ansible-playbook -i inventories/your_inventory.yaml playbooks/deam-packs.yaml).

Once the playbook is run, File Fetcher will automatically start reading the data and upstreaming it to Devo.

Data access in Devo

By default, the contents of all uploaded files are ingested into Devo under box.devo_ea.files.

Some dedicated tables are enabled under box.devo_ea.files for specific event types retrieved with filesfetcher:

Event type                   Devo table

Windows DNS debug logging    box.devo_ea.files.dns_windows

Windows IIS logs             box.devo_ea.files.iis

This destination data structure can be configured to point at any my.app.*.* tag.

(Only available from EAM v1.1.0) If the data being sent to Devo already has an associated technology and parser in Devo, the File Fetcher can be configured to use them.

Options

filesfetcher supports two different levels of configuration:

  • Global: Sets the overall behavior of the extension.

  • Per pattern: Allows setting specific configurations for each specified source file path.

Global options

The following options are available as global settings of the extension:

  • deam_fleet_config_devoext_fetchfiles_config_refresh (Duration): Specifies the interval at which the agent checks the EAM for updates to the configuration of the filesfetcher extension. Can be expressed in seconds (s), minutes (m) and hours (h).

  • deam_fleet_config_devoext_fetchfiles_watchdog_opts: Configuration block specific to the capturing function.
    This option applies to all operating systems. Equivalent options are also available per operating system:

    • deam_fleet_config_devoext_fetchfiles_watchdog_win → Windows

    • deam_fleet_config_devoext_fetchfiles_watchdog_nix → Linux

    • deam_fleet_config_devoext_fetchfiles_watchdog_darwin → Darwin

  • deam_fleet_config_devoext_fetchfiles_buffer_size: Total size in bytes per processed chunk.

  • deam_fleet_config_devoext_fetchfiles_buffer_max_number_of_parts_per_file: Maximum number of processed events per chunk.

  • deam_fleet_config_devoext_fetchfiles_default_tag: Default destination in Devo for all ingested files. Can be overridden in the patterns options.

The following example illustrates how these options are configured in the inventory file:

Patterns options

The patterns section allows for the definition of files scanning paths along with their respective scanning options. These options are described in the following list:

  • pattern (string): Specifies the set of files to scan using a subset of glob patterns. Additionally, ‘**’ patterns are supported to denote recursive processing of folders and subfolders. Examples of valid pattern definitions are:

  • tag: Data structure in Devo where the content of the files matching the pattern will be uploaded.

  • content_separator (string): Defines an event delimiter string. By default, each full line is processed as one event.

  • file_processor (fixed | multiline): Enables multiline event processing in conjunction with the content_separator string. The default value is fixed (single-line events).
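As a rough illustration of how a ‘**’ pattern differs from a single-level wildcard, the sketch below uses Python's standard glob module on a temporary directory. The directory layout and file names are hypothetical, and Python's glob semantics are only an approximation of the glob subset that the extension supports.

```python
import glob
import os
import tempfile


def match(root, pattern):
    """Return the files under root that match pattern, relative to root."""
    hits = glob.glob(os.path.join(root, pattern), recursive=True)
    return sorted(
        os.path.relpath(hit, root).replace(os.sep, "/")
        for hit in hits
        if os.path.isfile(hit)
    )


def build_sample_tree(root):
    """Create a small, hypothetical log directory to match against."""
    for rel in ("logs/app.log", "logs/app.txt", "logs/archive/old.log"):
        path = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write("sample\n")


with tempfile.TemporaryDirectory() as root:
    build_sample_tree(root)
    # '*' stays inside one folder; '**' also descends into subfolders.
    single_level = match(root, "logs/*.log")
    recursive = match(root, "logs/**/*.log")
    print(single_level)
    print(recursive)
```

In this sketch the single-level pattern only finds the log file directly inside logs, while the ‘**’ pattern also finds the one in the archive subfolder.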

The following example illustrates the usage of these options:

  • (Only available from EAM v1.1.0) threshold_file_modification_time (duration): Negative duration that represents how long the File Fetcher waits before considering an event fully written. If the scanned file has been modified within now + threshold_file_modification_time, the last event is not sent but is marked as the offset to be sent in the next scan iteration. For larger multiline events that can take longer to write, a wider window (for example, -10s instead of -500ms) lowers the chance of truncating a log. The default value is -500ms. Valid examples are -500ms, -10s and -5s. If the value is 0 or positive, every scan sends up to the end of the file.

  • (Only available from EAM v1.1.0) payload_format (c:event): Removes the JSON wrapper around each event sent to Devo so that events are sent “as is”. This makes it possible to use existing technologies in Devo that do not use JSON. As of EAM v1.1.0, the only supported technologies are those that modify neither the tag nor the payload. The only valid value for this parameter is c:event. To use it, follow the configuration snippet below:

Example of an event sent to Devo when not using payload_format: c:event

Example of an event sent to Devo using payload_format: c:event
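Separately from the payload examples, the threshold_file_modification_time rule described above can be sketched in Python as follows. This is not the extension's code: the function names parse_duration and should_hold_last_event are hypothetical, the duration parser only covers the ms, s, m and h units mentioned in this page, and treating a modification exactly at the boundary as "old enough" is an assumption.

```python
import re
from datetime import datetime, timedelta

# Microseconds per unit, for the units mentioned in this page.
_UNIT_MICROSECONDS = {"ms": 1_000, "s": 1_000_000, "m": 60_000_000, "h": 3_600_000_000}


def parse_duration(text):
    """Parse strings such as '-500ms', '-10s', '15m' or '0' into a timedelta."""
    text = text.strip()
    if text == "0":
        return timedelta(0)
    found = re.fullmatch(r"([+-]?\d+)(ms|s|m|h)", text)
    if found is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = int(found.group(1)), found.group(2)
    return timedelta(microseconds=value * _UNIT_MICROSECONDS[unit])


def should_hold_last_event(modified_at, now, threshold):
    """Return True if the last event should wait for the next scan."""
    if threshold >= timedelta(0):
        # Zero or positive: every scan sends up to the end of the file.
        return False
    # Negative threshold: hold the last event if the file changed
    # more recently than now + threshold.
    return modified_at > now + threshold


now = datetime(2024, 1, 1, 12, 0, 0)
threshold = parse_duration("-500ms")
recently_written = should_hold_last_event(now - timedelta(milliseconds=100), now, threshold)
settled = should_hold_last_event(now - timedelta(seconds=2), now, threshold)
print(recently_written, settled)
```

With the default of -500ms, a file touched 100 ms ago keeps its last event back for the next scan, while a file last touched 2 s ago is sent in full.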

Multiline events using regular expressions

Files Fetcher enables the user to use regular expressions as delimiters for the events. This is a powerful tool to parse and interpret log files where the delimitation between events is not clear.

The regular expression defined as delimiter should follow the syntax defined here, and it should always be placed at the beginning of the line in the log file. If the delimiter does not start at the beginning of the line it will be ignored.

Example:

We want to ingest into Devo a log file from an application with the following structure:

In the above example, there are three events that need to be parsed and sent to Devo. Looking at the log file, we can consider the event date as the delimiter between events. When there is a new date at the beginning of the line, it is considered that there is a new event.

Configuration in File Fetcher should be something like the following:

The regular expression defined in content_separator specifies that any line starting with the structure “4 digits-2 digits-2 digits 2 digits:2 digits:2 digits,3 digits” marks the start of a new event and the end of the current one (for example, 2012-01-19 10:13:25,393).
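The splitting behavior described above can be approximated with a short Python sketch. The regular expression is derived from the digit structure in the previous paragraph; the sample log text and the split_events function are hypothetical, and Python's re module is used only for illustration, so it may differ from the regular expression syntax the extension expects.

```python
import re

# Timestamp structure described above: YYYY-MM-DD hh:mm:ss,mmm
DELIMITER = r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}"

# Hypothetical application log with one multiline event.
SAMPLE_LOG = (
    "2012-01-19 10:13:25,393 ERROR Unhandled exception\n"
    "Traceback (most recent call last):\n"
    '  File "app.py", line 10, in main\n'
    "2012-01-19 10:13:26,101 INFO Retrying at 2012-01-19 10:13:27,000\n"
    "2012-01-19 10:13:27,004 INFO Recovered\n"
)


def split_events(text, delimiter=DELIMITER):
    """Group lines into events; a new event starts when a line begins with the delimiter."""
    starts_event = re.compile(delimiter).match  # match() only matches at the start of the line
    events = []
    for line in text.splitlines():
        if starts_event(line) or not events:
            events.append(line)
        else:
            # Continuation line: append it to the current event.
            events[-1] += "\n" + line
    return events


events = split_events(SAMPLE_LOG)
for number, event in enumerate(events, start=1):
    print(f"--- event {number} ---")
    print(event)
```

Note that the timestamp appearing in the middle of the second event's line does not start a new event, because the delimiter is only honored at the beginning of a line.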

Update configuration of patterns<->tags

To avoid inconsistent data in Devo, tags cannot be updated on the fly. The File Fetcher checks whether each pattern and its tag match those that were set at initial configuration time.

If you need to re-configure a pattern to a different tag:

  1. Remove the pattern to be reconfigured from the inventory file and deploy the deam-packs playbook (ansible-playbook -i inventories/your_inventory.yaml playbooks/deam-packs.yaml). As an example, in the screenshot below the highlighted pattern line has been removed.

  2. Wait until the configuration is propagated to the agents. The default value for configuration refresh is 15 minutes and is set in the config_refresh tag in the inventory.

  3. Enable the virtual environment by running:

  4. Once the configuration is propagated, create the pattern again in the inventory file, pointing to the new tag, and deploy the deam-packs playbook (ansible-playbook -i inventories/your_inventory.yaml playbooks/deam-packs.yaml). As an example, in the screenshot below the pattern has been added again, pointing to a new tag.

  5. The data will be ingested into the new tag, starting from the beginning of the file.

Performance considerations

Depending on the configuration of the files fetching mechanism, there may be an impact on the sizing of the EAM elements as well as on the data volumes ingested into Devo. A general recommendation is to introduce configurations one at a time, with a clear and narrow specification of the files and contents to be uploaded.

In addition, consider combining the File Fetcher functionality with automatic labeling of endpoints and their corresponding configuration profiles (for example, scan Apache logs in the designated paths only if the endpoint is running an Apache webserver process).