Table of Contents
Note

If you are migrating from old v1.x.x versions to v2.0.0, you can find a complete guide in the Azure collector migration guide section of this article.

Overview

Microsoft Azure is an ever-expanding set of cloud computing services to help your organization meet its business challenges. Azure gives you the freedom to build, manage, and deploy applications on a massive, global network using your preferred tools and frameworks.

...

Features

Details

Allow parallel downloading (multipod)

Partial (supported for event_hubs services using Azure Blob Storage)

Running environments

  • collector server

  • on-premise

Populated Devo events

table

Flattening pre-processing

no

Allowed source events obfuscation

yes

Data source description

Data source

Description

API endpoint

Collector service name

Devo table

VM Metrics

Using the Microsoft Azure API, the collector obtains metrics about the deployed Virtual Machines and gathers them in Devo, making them easier to query and analyze in the Devo platform and Activeboards.

Azure Compute Management Client SDK and Azure Monitor Management Client SDK

vm_metrics

cloud.azure.vm.metrics_simple

Event Hubs

Several Microsoft Azure services can generate execution information that can be sent to an Event Hub service (see the next section).

Azure Event Hubs SDK

event_hubs and event_hubs_auto_discover

<auto_tag_description>

...

Info

Valid for all cloud.azure tables by setting the output option to stream to Event Hub.

Event hubs: Auto-categorization of Microsoft Azure service messages

...

Note

If the amount of egress data exceeds the Throughput Unit limits set by Azure (2 MB/s or 4096 events per second), it won’t be possible for Devo to continue reliable ingestion of data. You can monitor ingress/egress throughput in the Azure Portal Event Hub Namespace, and based on trends/alerts, you can add another Event Hub to resolve this. To avoid this from happening in the first place, please follow the scalability guidance provided by Microsoft in their technical documentation.

Learn more in this article.

Vendor setup

The Microsoft Azure collector centralizes the data with an Event Hub using the Azure SDK. To use it, you need to configure the resources in the Azure Portal and set the right permissions to access the information.

Anchor
virtual-machine-metrics
virtual-machine-metrics
Virtual Machine metrics

...

  1. After creating the App registration (or Service Principal), go to the desired Resource Group (or subscription if you want to retrieve metrics from all the available virtual machines).

  2. Select Access control (IAM) in the left menu and click Add.

  3. Select at least the Reader role and choose the previously created App registration.

  4. Confirm the changes.

...

Anchor

...

eventhubevents
eventhubevents
Event Hub events

Getting credentials (Storage Account) (Optional)

...

Setting up the Event Hubs

  1. Now, search for the Monitor service and click on it.

  2. Click the Diagnostic Settings option in the left area.

  3. A list of the deployed resources will be shown. Search for the resources that you want to monitor, select them, and click Add diagnostic setting.

  4. Type a name for the rule and check the required category details (logs will be sent to the cloud.azure.eh.events table, and metrics will be sent to the cloud.azure.eh.metrics table).

  5. Check Stream to an Event Hub, and select the corresponding Event hub namespace, Event hub name, and Event hub policy name.

  6. Click Save to finish the process.

Event Hub Auto Discover

To configure access to event hubs for the auto-discovery feature, you need to grant the registered application the necessary permissions to access the Event Hub without using the RootManageSharedAccessKey. The auto-discovery feature enumerates a namespace and resource group for all available event hubs. It can optionally create consumer groups (if the configuration specifies a consumer group other than $Default and that consumer group does not exist when the collector connects to the event hub) and optionally create Azure Blob Storage containers for checkpointing purposes (if the user specifies a storage account and container in the configuration file).
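As a reference, a minimal sketch of an event_hubs_auto_discover configuration with a non-default consumer group and Azure Blob Storage checkpointing could look like the following; the placeholder values are illustrative and the full set of options is described later in this article.

Code Block
languageyaml
inputs:
  azure_event_hub:
    id: 100001
    enabled: true
    credentials:
      subscription_id: <subscription_id_value>
      client_id: <client_id_value>
      client_secret: <client_secret_value>
      tenant_id: <tenant_id_value>
    services:
      event_hubs_auto_discover:
        resource_group: <resource_group_value>
        namespace: <namespace_value>
        consumer_group: <consumer_group_value>   # created by the collector if it does not exist
        blob_storage_account_name: <blob_storage_account_name_value>   # checkpoint containers are created by the collector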

...

Info

For Azure Event Hub, the event hub name and the connection string (and optionally the consumer group) are enough. No credentials are required.
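As an illustration of the note above, a minimal sketch of a queue configured only with the event hub name and connection string could look like this (placeholder values are illustrative):

Code Block
languageyaml
inputs:
  azure_event_hub:
    id: 100001
    enabled: true
    services:
      event_hubs:
        queues:
          queue_a:
            event_hub_name: <event_hub_name_value>
            event_hub_connection_string: <event_hub_connection_string_value>
            consumer_group: <consumer_group_value>   # optional, defaults to $Default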

Accepted authentication methods

Authentication method

Tenant ID

Client ID

Client secret

Subscription ID

OAuth2

REQUIRED

REQUIRED

REQUIRED

REQUIRED

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector in your own machine using a Docker image (On-premise collector).

Parameters marked as "Mandatory" are required for the collector's configuration. Optional parameters can be omitted or removed if not used.
Rw ui tabs macro
Rw tab
titleOn-premise collector

This data collector can be run on any machine that has the Docker service available, because it is executed as a Docker container. The following sections explain how to prepare the required setup to get the data collector running.

Structure

The following directory structure will be required as part of the setup procedure (it can be created under any directory):

Code Block
<any_directory>
└── devo-collectors/
    └── azure/
        ├── certs/
        │   ├── chain.crt
        │   ├── <your_domain>.key
        │   └── <your_domain>.crt
        ├── state/
        └── config/ 
            └── config-azure.yaml           

Devo credentials

In Devo, go to Administration → Credentials → X.509 Certificates, download the Certificate, Private key, and Chain CA, and save them in <any_directory>/devo-collectors/azure/certs. Learn more about security credentials in Devo here.

Editing the config-azure.yaml file

In the config-azure.yaml file, replace the <subscription_id_value>, <client_id_value>, <client_secret_value>, and <tenant_id_value> placeholders with the values that you got in the previous steps. In the <short_unique_id> placeholder, enter the value that you choose.

Code Block
languageyaml
globals:
  debug: false
  id: <collector_id_value>
  name: <collector_name_value>
  persistence:
    type: filesystem
    config:
      directory_name: state
outputs:
  devo_us_1:
    type: devo_platform
    config:
      address: <devo_address>
      port: 443
      type: SSL
      chain: <chain_filename>
      cert: <cert_filename>
      key: <key_filename>
inputs:
  azure:
    id: <short_unique_id>
    enabled: true
    credentials:
      subscription_id: <subscription_id_value>
      client_id: <client_id_value>
      client_secret: <client_secret_value>
      tenant_id: <tenant_id_value>
    environment: <environment_value>
    services:
      vm_metrics:
        request_period_in_seconds: <request_period_in_seconds_value>
        start_time_in_utc: <start_time_in_utc_value>
        include_resource_id_patterns: [<include_resource_id_patterns_values>]
        exclude_resource_id_patterns: [<exclude_resource_id_patterns_values>]
        override_tag: <override_tag_value>
  azure_event_hub:
    id: <short_unique_id>
    enabled: true
    credentials:
      subscription_id: <subscription_id_value>
      client_id: <client_id_value>
      client_secret: <client_secret_value>
      tenant_id: <tenant_id_value>
    environment: <environment_value>
    services:
      event_hubs:
        override_pull_report_frequency_seconds: <override_pull_report_frequency_seconds_value>
        override_consumer_client_ttl_seconds: <override_consumer_client_ttl_seconds_value>
        queues:
          <queue_name_value>:
            namespace: <namespace_value>
            event_hub_name: <event_hub_name_value>
            event_hub_connection_string: <event_hub_connection_string_value>
            consumer_group: <consumer_group_value>
            events_use_autocategory: <events_use_autocategory_value>
            blob_storage_connection_string: <blob_storage_connection_string_value>
            blob_storage_container_name: <blob_storage_container_name_value>
            blob_storage_account_name: <blob_storage_account_name_value>
            compatibility_version: <compatibility_version_value>
            duplicated_messages_mechanism: <duplicated_messages_mechanism_value>
            override_starting_position: <override_starting_position_value>
            override_tag: <override_tag_value>
            client_thread_limit: <client_thread_limit_value>
            uamqp_transport: <uamqp_transport_value>
            partition_ids: [<partition_id>]
      event_hubs_auto_discover:
        resource_group: <resource_group_value>
        namespace: <namespace_value>
        blob_storage_account_name: <blob_storage_account_name_value>
        blob_storage_connection_string: <blob_storage_connection_string_value>
        consumer_group: <consumer_group_value>
        events_use_autocategory: <events_use_autocategory_value>
        duplicated_messages_mechanism: <duplicated_messages_mechanism_value>
        override_pull_report_frequency_seconds: <override_pull_report_frequency_seconds_value>
        override_consumer_client_ttl_seconds: <override_consumer_client_ttl_seconds_value>
        override_starting_position: <override_starting_position_value>
        override_blob_storage_container_prefix: <override_blob_storage_container_prefix_value>
        client_thread_limit: <client_thread_limit_value>
        uamqp_transport: <uamqp_transport_value>
Info

The tag field is optional and is only available in the eh_services service type.

Note

For compatibility reasons, the default value of the events_use_autocategory property is false

Note

For new deployments, we recommend the following values:

  • events_use_autocategory → true

  • compatibility_version → enter your current collector version

Info

If you need to use a custom tag for generated messages, it can be done using the property tag inside any queue name, next to event_hub_name or event_hub_connection_string. For example, tag: my.app.azure.{service_name}
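For example, a custom tag for a specific queue could be configured as in the following sketch of the azure_event_hub input (queue and tag values are illustrative):

Code Block
languageyaml
inputs:
  azure_event_hub:
    services:
      event_hubs:
        queues:
          queue_a:
            event_hub_name: <event_hub_name_value>
            event_hub_connection_string: <event_hub_connection_string_value>
            tag: my.app.azure.{service_name}   # custom tag applied to the events pulled from this queue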

Parameter

Data type

Requirement

Value range / Format

Description

collector_id_value

str

Mandatory

Min length: 1, Max length: 5

Unique identifier for the collector.

collector_name_value

str

Mandatory

Min length: 1, Max length: 10

Name assigned to the collector.

devo_address

str

Mandatory

One of: collector-us.devo.io, collector-eu.devo.io

Devo Cloud destination for events.

chain_filename

str

Mandatory

Min length: 4, Max length: 20

Filename of the chain.crt file from your Devo domain.

cert_filename

str

Mandatory

Min length: 4, Max length: 20

Filename of the file.cert from your Devo domain.

key_filename

str

Mandatory

Min length: 4, Max length: 20

Filename of the file.key from your Devo domain.

short_unique_id

str

Mandatory

Min length: 1, Max length: 5

Short, unique ID for input service, used in persistence addressing. Avoid duplicates to prevent collisions.

tenant_id_value

str

Mandatory

Min length: 1

Tenant ID for Azure authentication.

client_id_value

str

Mandatory

Min length: 1

Client ID for Azure authentication.

client_secret_value

str

Mandatory

Min length: 1

Client secret for Azure authentication.

subscription_id_value

str

Mandatory

Min length: 1

Azure subscription ID.

environment_value

str

Optional

Min length: 1

Differentiates environments (e.g., dev, prod). Remove if unused.

request_period_in_seconds_value

int

Optional

Min: 60

Custom period in seconds between data pulls, overriding default (300s).

start_time_in_utc_value

str

Optional

UTC datetime format: %Y-%m-%dT%H-%M-%SZ

Custom start date for data retrieval, for historical data download. Remove if unused.

include_resource_id_patterns_values

[str]

Optional

Glob patterns e.g., ["*VM-GROUP-1*"]

Includes resources matching patterns. Remove if unused.

exclude_resource_id_patterns_values

[str]

Optional

Glob patterns e.g., ["*VM-GROUP-1*"]

Excludes resources matching patterns. Remove if unused.

queue_name_value

str

Mandatory

Min length: 1

Name for the queue, appears in related logs.

event_hub_name_value

str

Mandatory

Min length: 1

Name of the Event Hub to pull events from.

event_hub_connection_string_value

str

Mandatory

Min length: 1

Connection string for the Event Hub.

consumer_group_value

str

Optional

Min length: 1, Default: $Default

Consumer group for the Event Hub. Defaults to $Default.

events_use_autocategory_value

bool

Optional

Default: false

Enables/disables auto-tagging of events.

blob_storage_connection_string_value

str

Optional

Min length: 1

Connection string for blob storage, optional for Azure Blob Storage checkpointing.

blob_storage_container_name_value

str

Optional

Min length: 1

Blob storage container name, required if using Azure Blob Storage checkpointing.

blob_storage_account_name_value

str

Optional

Min length: 1

Blob storage account name, alternative to using connection string for checkpointing.

compatibility_version_value

str

Optional

Version strings

Compatibility version for event processing.

duplicated_messages_mechanism_value

str

Optional

One of: "local", "global", "none"

Deduplication mechanism for messages: local, global, or none (see note below).

override_starting_position_value

str

Optional

One of: "-1", "@latest", "[UTC datetime value]"

Starting position for event fetching: from the beginning of available data (-1), from the latest data (@latest), or a specific datetime (%Y-%m-%dT%H-%M-%SZ format).

override_tag_value

str

Optional

Tag-friendly string

Optional tag to override the default tagging mechanism. Remove if unused.

override_pull_report_frequency_seconds_value

int

Optional

Default: 60

Frequency in seconds for reporting pull statistics in logs.

override_consumer_client_ttl_seconds_value

int

Optional

Default varies by service

Time-to-live in seconds for consumer clients, after which the collector restarts the pull cycle.

resource_group_value

str

Mandatory

Min length: 1

Azure resource group for event hub discovery.

namespace_value

str

Mandatory

Min length: 1

Namespace within Azure for event hub discovery.

override_blob_storage_container_prefix_value

str

Optional

Min length: 3, Max length: 10; Default: devo-

Prefix for blob storage containers created by auto-discovery service. Remove if unused.

uamqp_transport_value

bool

Optional

Default: false

Allows users to override/force the Event Hub SDK to use the legacy uAMQP transport mechanism (true) instead of the default/current PyAMQP mechanism (false).

<partition_ids>

str

Optional

List of partition numbers, e.g. ["1","3","5","7"]

Defines which partitions this instance of the collector connects to. It overrides client_thread_limit.

Info

Parameters marked as "Mandatory" are required for the collector's configuration. Optional parameters can be omitted or removed if not used, but they provide additional customization and control over the collector's behavior.
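As an example, the partition_ids parameter described above can be used to pin a collector instance to specific partitions of an event hub. The following sketch of the azure_event_hub input is illustrative (partition numbers are assumptions for the example):

Code Block
languageyaml
inputs:
  azure_event_hub:
    services:
      event_hubs:
        queues:
          queue_a:
            event_hub_name: <event_hub_name_value>
            event_hub_connection_string: <event_hub_connection_string_value>
            partition_ids: ["0", "1"]   # a second collector instance would list the remaining partitions, e.g. ["2", "3"]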

Note

Local deduplication means that duplicates are deleted in the data received by the current collector instance. Global means that duplicates are searched across all the instances of the collector. None means that duplicates are not deleted.

See more details in the section Internal Process and Deduplication Method.

If you deploy one collector, use local. If you deploy several instances of the collector, use global.
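For instance, a deployment with several collector instances would set the parameter as in the following sketch (queue details are illustrative):

Code Block
languageyaml
inputs:
  azure_event_hub:
    services:
      event_hubs:
        queues:
          queue_a:
            event_hub_name: <event_hub_name_value>
            event_hub_connection_string: <event_hub_connection_string_value>
            duplicated_messages_mechanism: global   # use "local" when only one collector instance is deployed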

Note

override_tag_value can be used to create new categories. If needed, consult this section.

Download the Docker image

The collector should be deployed as a Docker container. Download the Docker image of the collector as a .tgz file by clicking the link in the following table:

Collector Docker image

SHA-256 hash

collector-azure_collector-docker-image-2.2.0

Code Block
504ef2c7d3a857468b7bd3794f6b8ac8e9c9f7e09a4bcaad8aa96f8219592508

Use the following command to add the Docker image to the system:

Code Block
gunzip -c collector-azure-docker-image-<version>.tgz | docker load
Info

Once the Docker image is imported, it will show the real name of the Docker image (including version info).

The Docker image can be deployed on the following services:

Anchor
docker
docker
Docker

Execute the following command on the root directory <any_directory>/devo-collectors/azure/

Code Block
docker run \
--name collector-azure \
--volume $PWD/certs:/devo-collector/certs \
--volume $PWD/config:/devo-collector/config \
--volume $PWD/state:/devo-collector/state \
--env CONFIG_FILE=config-azure.yaml \
--rm -it docker.devo.internal/collector/azure:<version>
Note

Replace <version> with the corresponding value.

Anchor
dockercompose
dockercompose
Docker Compose

The following Docker Compose file can be used to execute the Docker container. It must be created in the <any_directory>/devo-collectors/azure/ directory.

Code Block
languageyaml
version: '3'
services:
  collector-azure:
    image: docker.devo.internal/collector/azure:${IMAGE_VERSION:-latest}
    container_name: collector-azure
    volumes:
      - ./certs:/devo-collector/certs
      - ./config:/devo-collector/config
      - ./state:/devo-collector/state
    environment:
      - CONFIG_FILE=${CONFIG_FILE:-config-azure.yaml}

To run the container using docker-compose, execute the following command from the <any_directory>/devo-collectors/azure/ directory:

Code Block
IMAGE_VERSION=<version> docker-compose up -d
Note

Replace <version> with the corresponding value.

Rw tab
titleCloud collector

We use a piece of software called Collector Server to host and manage all our available collectors.

To enable the collector for a customer:

  1. In the Collector Server GUI, access the domain in which you want this instance to be created.

  2. Click Add Collector and find the one you wish to add.

  3. In the Version field, select the latest value.

  4. In the Collector Name field, set the value you prefer (this name must be unique inside the same Collector Server domain).

  5. In the sending method, select Direct Send. Direct Send configuration is optional for collectors that create Table events, but mandatory for those that create Lookups.

  6. In the Parameters section, set the Collector Parameters as shown below:

Editing the JSON configuration

Code Block
{
  "global_overrides": {
    "debug": false
  },
  "inputs": {
    "azure": {
      "id": "<short_unique_id>",
      "enabled": true,
      "credentials": {
        "subscription_id": "<subscription_id_value>",
        "client_id": "<client_id_value>",
        "client_secret": "<client_secret_value>",
        "tenant_id": "<tenant_id_value>"
      },
      "environment": "<environment_value>",
      "services": {
        "vm_metrics": {
          "request_period_in_seconds": "<request_period_in_seconds_value>",
          "start_time_in_utc": "<start_time_in_utc_value>",
          "include_resource_id_patterns": [
            "<include_resource_id_patterns_values>"
          ],
          "exclude_resource_id_patterns": [
            "<exclude_resource_id_patterns_values>"
          ],
          "override_tag": "<override_tag_value>"
        }
      }
    },
    "azure_event_hub": {
      "id": "<short_unique_id>",
      "enabled": true,
      "credentials": {
        "subscription_id": "<subscription_id_value>",
        "client_id": "<client_id_value>",
        "client_secret": "<client_secret_value>",
        "tenant_id": "<tenant_id_value>"
      },
      "environment": "test-env<environment_value>",
      "services": {
        "event_hubs": {
          "override_pull_report_frequency_seconds": "<override_pull_report_frequency_seconds_value>",
          "override_consumer_client_ttl_seconds": "<override_consumer_client_ttl_seconds_value>",
          "queues": {
            "<queue_name_value>": {
              "namespace": "<namespace_value>",
              "event_hub_name": "<event_hub_name_value>",
              "event_hub_connection_string": "<event_hub_connection_string_value>",
              "consumer_group": "<consumer_group_value>",
              "events_use_autocategoryauto_category": "<events_use_auto_autocategorycategory_value>",
              "blob_storage_connection_string": "<blob_storage_connection_string_value>",
              "blob_storage_container_name": "<blob_storage_container_name_value>",
              "blob_storage_account_name": "<blob_storage_account_name_value>",
              "compatibility_version": "<compatibility_version_value>",
              "duplicated_messages_mechanism": "<duplicated_messages_mechanism_value>",
              "override_starting_position": "<override_starting_position_value>",
              "override_tag": "<override_tag_value>",
              "client_thread_limit": "<client_thread_limit_value>",
              "uamqp_transport": "<uamqp_transport_value>",
              "partition_ids": ["<partition_id>"]
            }
          }
        },
        "event_hubs_auto_discover": {
          "resource_group": "<resource_group_value>",
          "namespace": "<namespace_value>",
          "blob_storage_account_name": "<blob_storage_account_name_value>",
          "consumer_groupblob_storage_connection_string": "<consumer_group<blob_storage_connection_string_value>",
          "eventsconsumer_use_autocategorygroup": "<events<consumer_use_autocategorygroup_value>",
          "events_use_auto_category": "<events_use_auto_category_value>",
          "duplicated_messages_mechanism": "<duplicated_messages_mechanism_value>",
          "override_pull_report_frequency_seconds": "<override_pull_report_frequency_seconds_value>",
          "override_consumer_client_ttl_seconds": "<override_consumer_client_ttl_seconds_value>",
          "override_starting_position": "<override_starting_position_value>",
          "override_blob_storage_container_prefix": "<override_blob_storage_container_prefix_value>",
          "override_tagclient_thread_limit": "<client_thread_limit_value>",
          "uamqp_transport": "<override<uamqp_tagtransport_value>"
        }
      }
    }
  }
}

The following table outlines the parameters available for configuring the collector. Each parameter is categorized by its necessity (mandatory or optional), data type, acceptable values or formats, and a brief description.

Parameter

Data type

Requirement

Value range / Format

Description

short_unique_id

str

Mandatory

Min length: 1, Max length: 5

Short, unique ID for input service, used in persistence addressing. Avoid duplicates to prevent collisions.

tenant_id_value

str

Mandatory

Min length: 1

Tenant ID for Azure authentication.

client_id_value

str

Mandatory

Min length: 1

Client ID for Azure authentication.

client_secret_value

str

Mandatory

Min length: 1

Client secret for Azure authentication.

subscription_id_value

str

Mandatory

Min length: 1

Azure subscription ID.

environment_value

str

Optional

Min length: 1

Differentiates environments (e.g., dev, prod). Remove if unused.

request_period_in_seconds_value

int

Optional

Min: 60

Custom period in seconds between data pulls, overriding default (300s).

start_time_in_utc_value

str

Optional

UTC datetime format: %Y-%m-%dT%H-%M-%SZ

Custom start date for data retrieval, for historical data download. Remove if unused.

include_resource_id_patterns_values

[str]

Optional

Glob patterns e.g., ["*VM-GROUP-1*"]

Includes resources matching patterns. Remove if unused.

exclude_resource_id_patterns_values

[str]

Optional

Glob patterns e.g., ["*VM-GROUP-1*"]

Excludes resources matching patterns. Remove if unused.

queue_name_value

str

Mandatory

Min length: 1

Name for the queue, appears in related logs.

event_hub_name_value

str

Mandatory

Min length: 1

Name of the Event Hub to pull events from.

event_hub_connection_string_value

str

Mandatory

Min length: 1

Connection string for the Event Hub.

consumer_group_value

str

Optional

Min length: 1, Default: $Default

Consumer group for the Event Hub. Defaults to $Default.

events_use_autocategory_value

bool

Optional

Default: false

Enables/disables auto-tagging of events.

blob_storage_connection_string_value

str

Optional

Min length: 1

Connection string for blob storage, optional for Azure Blob Storage checkpointing.

blob_storage_container_name_value

str

Optional

Min length: 1

Blob storage container name, required if using Azure Blob Storage checkpointing.

blob_storage_account_name_value

str

Optional

Min length: 1

Blob storage account name, alternative to using connection string for checkpointing.

compatibility_version_value

str

Optional

Version strings

Compatibility version for event processing.

duplicated_messages_mechanism_value

str

Optional

One of: "local", "global", "none"

Deduplication mechanism for messages: local, global, or none.

override_starting_position_value

str

Optional

One of: "-1", "@latest", "[UTC datetime value]"

Starting position for event fetching: from the beginning of available data (-1), from the latest data (@latest), or a specific datetime (%Y-%m-%dT%H-%M-%SZ format).

override_tag_value

str

Optional

Tag-friendly string

Optional tag to override the default tagging mechanism. Remove if unused.

override_pull_report_frequency_seconds_value

int

Optional

Default: 60

Frequency in seconds for reporting pull statistics in logs.

override_consumer_client_ttl_seconds_value

int

Optional

Default varies by service

Time-to-live in seconds for consumer clients, after which the collector restarts the pull cycle.

resource_group_value

str

Mandatory

Min length: 1

Azure resource group for event hub discovery.

namespace_value

str

Mandatory

Min length: 1

Namespace within Azure for event hub discovery.

override_blob_storage_container_prefix_value

str

Optional

Min length: 3, Max length: 10; Default: devo-

Prefix for blob storage containers created by auto-discovery service. Remove if unused.


uamqp_transport_value

bool

Optional

Default: false

Allows users to override/force the Event Hub SDK to use the legacy uAMQP transport mechanism (true) instead of the default/current PyAMQP mechanism (false).

<partition_ids>

str

Optional

List of partition numbers, e.g. ["1","3","5","7"]

Defines which partitions this instance of the collector connects to. It overrides client_thread_limit.

Info

Parameters marked as "Mandatory" are required for the collector's configuration. Optional parameters can be omitted or removed if not used, but they provide additional customization and control over the collector's behavior.

Note

Local deduplication means that duplicates are deleted in the data received by the current collector instance. Global means that duplicates are searched across all the instances of the collector. None means that duplicates are not deleted.

See more details in the section Internal Process and Deduplication Method.

If you deploy one collector, use local. If you deploy several instances of the collector, use global.

Note

override_tag_value can be used to create new categories. If needed, consult this section.

...

Expand
titleEvent Hubs Auto Discover (event_hubs_auto_discover)

General principles

Refer to Event Hubs - General Principles for general principles.

Configuration options

Devo supports only one authentication mechanism (client credentials) for this service. Connection strings are not supported.

Event Hubs Auto Discover authentication configuration

Event Hubs authentication can be via connection strings or client credentials (assigning the Azure Event Hubs Data Receiver role).

Preference is given to connection string configuration when both are available.

Required parameters

Connection string configuration

  • event_hub_connection_string

  • event_hub_name

Code Block
inputs:
  azure_event_hub:
    id: 100001
    enabled: true
    services:
      event_hubs:
        queues:
          queue_a:
            event_hub_name: event_hub_value
            event_hub_connection_string: event_hub_connection_string_value

Client credentials configuration

  • event_hub_name

  • namespace

  • Credentials.client_id

  • Credentials.client_secret

  • Credentials.tenant_id

Code Block
inputs:
  azure_event_hub:
    id: 100001
    enabled: true
    credentials:
      client_id: client_id_value
      client_secret: client_secret_value
      tenant_id: tenant_id_value
    services:
      event_hubs:
        queues:
          queue_a:
            namespace: namespace_value
            event_hub_name: event_hub_name_value
Azure Blob storage checkpoint configuration

Optional and configurable via connection strings or client credentials.

If all possible parameters are present, the collector will favor the connection string configuration.

Required parameters

Connection string configuration

  • blob_storage_connection_string

  • blob_storage_container_name

Code Block
inputs:
  azure_event_hub:
    id: 100001
    enabled: true
    services:
      event_hubs:
        queues:
          queue_a:
            event_hub_name: event_hub_value
            event_hub_connection_string: event_hub_connection_string_value
            blob_storage_connection_string: blob_storage_connection_string_value
            blob_storage_container_name: blob_storage_container_name_value

Client credentials configuration

  • blob_storage_account_name

  • blob_storage_container_name

  • Credentials.client_id

  • Credentials.client_secret

  • Credentials.tenant_id

Code Block
inputs:
  azure_event_hub:
    id: 100001
    enabled: true
    credentials:
      client_id: client_id_value
      client_secret: client_secret_value
      tenant_id: tenant_id_value
    services:
      event_hubs:
        queues:
          queue_a:
            event_hub_name: event_hub_value
            event_hub_connection_string: event_hub_connection_string_value
            blob_storage_account_name: blob_storage_account_name_value
            blob_storage_container_name: blob_storage_container_name_value
Internal process and deduplication method

The collector uses the event_hubs_auto_discover service to dynamically query a given resource group and namespace for all available event hubs.

All deduplication methods and checkpointing methods listed in the event_hubs service apply; however, there are some additional considerations one should make when configuring the event_hubs_auto_discover service.

The event_hubs_auto_discover service will effectively restart all event hub consumers after one hour (this time can be overridden via the override_consumer_client_ttl_seconds parameter). On restart, the collector will re-discover all available event hubs and begin pulling data again. Any event hubs that might have been created between the last run and the current run will be discovered and pulled from.

Due to the nature of this service, if a user has configured Azure Blob Storage checkpointing, the collector will attempt to create containers in the configured Azure Blob Storage account. If the configured credentials do not have write access to the storage account, an error will be written to the logs indicating that the user must grant write access to the credentials.

Checkpointing

The collector supports two forms of checkpointing.

Local persistence checkpointing

By default, the collector will utilize local persistence checkpointing to ensure that events are not fetched multiple times from a given partition in a given event hub. The collector will store the last event offset as messages are consumed.

Azure Blob Storage checkpointing

Optionally, users can specify an Azure Blob Storage account or an Azure Blob Storage connection string to use Azure Blob Storage checkpointing. This allows the collector to run in multi-pod mode and all checkpointing data is stored within the Azure Storage account.

Unlike the event_hubs service, the event_hubs_auto_discover service will create containers for the discovered event hubs in the configured Azure Blob Storage account. The containers are prefixed with devo- (though this value can be overridden in the configuration) and a hash calculated from the resource group, namespace, event hub name, and consumer group. This hash is used to ensure that the container name is unique, does not conflict with other container names, and stays within the character limit for Azure container names.
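For instance, the container prefix used by the auto-discovery service can be overridden as in the following sketch (values are illustrative); the resulting containers are named with the configured prefix followed by the calculated hash:

Code Block
languageyaml
inputs:
  azure_event_hub:
    services:
      event_hubs_auto_discover:
        resource_group: <resource_group_value>
        namespace: <namespace_value>
        blob_storage_account_name: <blob_storage_account_name_value>
        override_blob_storage_container_prefix: devo-prod-   # 3-10 characters; defaults to devo-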

Troubleshooting

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

Common logic

Error type

Error ID

Error message

Cause

Solution

InitVariablesError

1

Invalid start_time_in_utc: {ini_start_str}. Must be in parseable datetime format.

The configured start_time_in_utc parameter is a non-parseable format.

Update the start_time_in_utc value to have the recommended format as indicated in the guide.

InitVariablesError

2

Invalid start_time_in_utc: {ini_start_str}. Must be in the past.

The configured start_time_in_utc parameter is a future date.

Update the start_time_in_utc value to a past datetime.

PullError

350

Could not match tag to record and no default tag provided: {record}

Advanced tagging is configured, but no default tag was provided and the record did not match any tag pathway.

Provide a default tag in the advanced tag mapping object.

ApiError

401

An error occurred while trying to authenticate with the Azure API. Exception: {e}

The collector is unable to authenticate with the Azure API.

Check the credentials and ensure that the collector has the necessary permissions to access the Azure API.

ApiError

410

An error occurred while trying to check if container '{container_name}' exists. Ensure that the blob storage account name or connection string is correct. Exception: {e}

The collector was unable to locate the specified blob storage container name.

Ensure the container exists and the credentials have READ access to the container

ApiError

411

An error occurred while trying to check if container '{container_name}' exists. Ensure that the application has necessary permissions to access the containers. Exception: {e}

The collector was unable to access the specified blob storage container name.

Ensure the container exists and the credentials have READ access to the container

ApiError

412

An error occurred while trying to create container '{container_name}'. Ensure that the application has necessary permissions to create containers. Exception: {e}

The collector was unable to create the container for the auto discover service and the user indicated to use Azure Blob Storage checkpointing.

Ensure the credentials have WRITE access to the container storage account.

ApiError

420

An error occurred while trying to get consumer group '{consumer_group_name}'. Exception: {e}

The collector was unable to access the specified consumer group name.

Ensure the consumer group exists and the credentials have READ access to the consumer group

ApiError

421

An error occurred while trying to create consumer group '{consumer_group_name}'. Ensure that the application has necessary permissions to create consumer groups. Exception: {e}

The collector was unable to create the consumer group for the auto discover service.

Ensure the credentials have WRITE access to the event hub namespace or use the $Default consumer group.

Typical issues
  • CBS token error - This issue usually happens when the connection string includes the event hub namespace name instead of the event hub name. Both values are different and it is easy to mix them up (see the sketch after this list). You can find an explanation here.

  • Delayed events - You can use the @devo_event_enqueued_time value in the cloud.azure tables to check the time at which the events were enqueued in Azure. Delayed events can be caused by Event Hub itself (high enqueued time) or by a lack of processing capacity in the collector. In that case, it is necessary to add more collector instances, or to create a collector for each partition.

  • Duplicated events - Adjust the value of the config parameter duplicated_messages_mechanism according to your deployment. If you are running several instances of the collector, change the value to global. See the Internal process and deduplication method section for more details.
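The following sketch shows where the namespace and the event hub (entity) name appear in a typical Event Hubs connection string; all values are illustrative:

Code Block
languageyaml
queues:
  queue_a:
    # event_hub_name must be the event hub (entity) name, not the namespace name.
    # The namespace is the host in the Endpoint part of the connection string.
    event_hub_name: my-event-hub
    event_hub_connection_string: "Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=my-policy;SharedAccessKey=<key>"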

Collector operations

This section is intended to explain how to proceed with specific operations of this collector.

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services, as well as validating the given configuration.

A successful run has the following output messages for the initializer module:

Expand
titleVerify collector operations
Code Block
2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Loading configuration using the following files: {"full_config": "config-test-local.yaml", "job_config_loc": null, "collector_config_loc": null}
2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Using the default location for "job_config_loc" file: "/etc/devo/job/job_config.json"
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> "\etc\devo\job" does not exists
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> Using the default location for "collector_config_loc" file: "/etc/devo/collector/collector_config.json"
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> "\etc\devo\collector" does not exists
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> Results of validation of config files parameters: {"config": "C:\git\collectors2\devo-collector-<name>\config\config.yaml", "config_validated": True, "job_config_loc": "/etc/devo/job/job_config.json", "job_config_loc_default": True, "job_config_loc_validated": False, "collector_config_loc": "/etc/devo/collector/collector_config.json", "collector_config_loc_default": True, "collector_config_loc_validated": False}
2023-01-10T15:22:57.171 WARNING MainProcess::MainThread -> [WARNING] Illegal global setting has been ignored -> multiprocessing: False

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the event delivery module:

Code Block
2023-01-10T15:23:00.788    INFO OutputProcess::MainThread -> DevoSender(standard_senders,devo_sender_0) -> Starting thread
2023-01-10T15:23:00.789    INFO OutputProcess::MainThread -> DevoSenderManagerMonitor(standard_senders,devo_1) -> Starting thread (every 300 seconds)
2023-01-10T15:23:00.790    INFO OutputProcess::MainThread -> DevoSenderManager(standard_senders,manager,devo_1) -> Starting thread
2023-01-10T15:23:00.842    INFO OutputProcess::MainThread -> global_status: {"output_process": {"process_id": 18804, "process_status": "running", "thread_counter": 21, "thread_names": ["MainThread", "pydevd.Writer", "pydevd.Reader", "pydevd.CommandThread", "pydevd.CheckAliveThread", "DevoSender(standard_senders,devo_sender_0)", "DevoSenderManagerMonitor(standard_senders,devo_1)", "DevoSenderManager(standard_senders,manager,devo_1)", "OutputStandardConsumer(standard_senders_consumer_0)",

Sender services

The Integrations Factory Collector SDK has 3 different sender services depending on the event type to deliver (internal, standard, and lookup). This collector uses the following Sender Services:

Sender services

Description

internal_senders

In charge of delivering internal metrics to Devo such as logging traces or metrics.

standard_senders

In charge of delivering pulled events to Devo.

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Info

This value helps detect bottlenecks and the need to increase the performance of data delivery to Devo. This can be done by increasing the number of concurrent senders.

Total number of messages sent: 44, messages sent since "2022-06-
28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)

Displays the number of events sent to Devo since the collector started and since the last checkpoint. Following the given example, the following conclusions can be obtained:

  • 44 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00.

  • 21 events were sent to Devo between the last UTC checkpoint and now.

  • Those 21 events required 0.007 seconds to be delivered.

Info

By default these traces will be shown every 10 minutes.

Expand
titleCheck memory usage

To check the memory usage of this collector, look for the following log records, which are displayed every 5 minutes by default, always after running the memory freeing process.

  • The used memory is displayed per running process, and the sum of both values gives the total memory used by the collector.

  • The global pressure of the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory (previous -> after).

Code Block
INFO InputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(34.50MiB -> 34.08MiB), VMS(410.52MiB ->
 410.02MiB)
 INFO OutputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(28.41MiB -> 28.41MiB), VMS(705.28MiB ->
 705.28MiB)

Azure collector migration guide

This section will walk you through the process of updating your configuration from the old version (1.x.x) to the new version (2.0.0). The new version introduces significant improvements and changes to the configuration style to enhance performance, usability, and security.

Overview of changes

The new configuration format introduces several key changes:

  • Multiple inputs: The configuration now supports multiple inputs to better represent the different data sources and access mechanisms (azure and azure_event_hub)

  • Rename credential config parameters: The credentials configuration field names are now consistent with Microsoft Azure documentation: tenant_id, client_id, client_secret.

  • Azure Blob storage checkpoint support: The configuration now accepts Azure Blob Storage related parameters in the queue-specific configuration: blob_storage_connection_string, blob_storage_container_name, blob_storage_account_name.

  • Moved VM metrics to dedicated service: The VM metrics input has been moved to a dedicated service. The custom service configuration is no longer valid.

  • Moved Event Hub to dedicated service: The Event Hub input has been moved to a dedicated service. The custom service configuration is no longer valid.

Preparing for migration

Before starting the migration process, we recommend the following steps:

  1. Backup your current configuration: Always ensure you have a backup of your existing configuration files to prevent any data loss.

  2. Review the new configuration documentation: Familiarize yourself with the new configuration options available in version 2.0.0.

Migration steps

Step 1: Update credential configuration parameter field names

The credential configuration field names have been updated:

  1. active_directory_id → tenant_id

  2. secret → client_secret

  3. app_id → client_id

An example of the old and new configuration is shown below:

Code Block
# Old Version (1.x.x)
credentials:
  app_id: <app_id_value>
  active_directory_id: <active_directory_id_value>
  subscription_id: <subscription_id_value>
  secret: <secret_value>
↓
# New Version (2.0.0)
credentials:
  client_id: <client_id_value>
  tenant_id: <tenant_id_value>
  subscription_id: <subscription_id_value>
  client_secret: <client_secret_value>

Step 2: Update VM metrics configuration

The VM Metrics service has been moved to the azure input and a dedicated vm_metrics service.

An example of the new configuration is shown below:

Code Block
azure:
  id: <short_id>
  enabled: true
  credentials:
    client_id: <client_id_value>
    client_secret: <client_secret_value>
    tenant_id: <tenant_id_value>
  environment: <environment_value>
  services:
    vm_metrics:
      start_time_in_utc: <start_time_in_utc_value>
      request_period_in_seconds: 300

If you wish to continue from the old configuration, you must input the time of the latest event in Devo in the start_time_in_utc field to indicate the time from which the puller will start collecting data.
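For example, if the latest VM metrics event already in Devo was from 2024-05-01 00:00:00 UTC (an illustrative value), the migrated service could be configured as in the following sketch, using the documented %Y-%m-%dT%H-%M-%SZ format:

Code Block
languageyaml
azure:
  id: <short_id>
  enabled: true
  services:
    vm_metrics:
      request_period_in_seconds: 300
      start_time_in_utc: 2024-05-01T00-00-00Z   # datetime of the latest VM metrics event already in Devo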

Step 3: Update Event Hub Configuration

The Event Hub service(s) have been moved to the azure_event_hub input and a dedicated event_hubs service.

Code Block
azure_event_hub:
  id: <short_id>
  enabled: true
  credentials:
    client_id: <client_id_value>
    client_secret: <client_secret_value>
    tenant_id: <tenant_id_value>
  environment: <environment_value>
  services:
    event_hubs:
      queues:
        <queue_name>:
          event_hub_name: <event_hub_name_value>
          event_hub_connection_string: <event_hub_connection_string_value>
          consumer_group: <consumer_group_value>
          events_use_autocategory: <events_use_autocategory_value>
          blob_storage_connection_string: <blob_storage_connection_string_value>
          blob_storage_container_name: <blob_storage_container_name_value>
          blob_storage_account_name: <blob_storage_account_name_value>
          compatibility_version: <compatibility_version_value>
          duplicated_messages_mechanism: <duplicated_messages_mechanism>
          override_starting_position: <override_starting_position_value>

The new configuration now accepts the blob_storage_connection_string, blob_storage_container_name, and blob_storage_account_name parameters in the queue-specific configuration. These parameters are new, optional, and only required for those users who wish to leverage Azure Blob Storage for checkpointing. This guide focuses on migrating the configuration from the old version to the new version -- for this reason, the new Azure Blob Storage checkpoint parameters are not relevant to older configurations because they use local, file-based checkpointing.

By default, the collector will begin pulling from the latest event in the queue if there is not already a pre-existing checkpoint. To ensure your migrated collectors fetch from the last event previously sent to Devo, identify the datetime of the last event in Devo for the relevant queue and input it into the override_starting_position field in the format %Y-%m-%dT%H:%M:%SZ. When the collector begins pulling from the queue, it will start fetching from the indicated datetime for the first checkpoint.
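For example, a migrated queue could pin the first checkpoint to the datetime of the last event already in Devo, as in the following sketch (the datetime is illustrative):

Code Block
languageyaml
azure_event_hub:
  id: <short_id>
  enabled: true
  services:
    event_hubs:
      queues:
        <queue_name>:
          event_hub_name: <event_hub_name_value>
          event_hub_connection_string: <event_hub_connection_string_value>
          override_starting_position: "2024-05-01T00:00:00Z"   # datetime of the last event in Devo for this queue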

Step 4: Example before and after configuration

Putting it all together, see below for an example of the old and new configuration:

Code Block
# Old Version (1.x.x)
inputs:
  azure:
    id: 10001
    enabled: true
    credentials:
      app_id: app_id_acme
      active_directory_id: active_directory_id_acme
      subscription_id: subscription_id_acme
      secret: secret_acme
    environment: test_environment
    requests_limits:
      - period: 1d
        number_of_requests: -1
    services:
      my_service_1:
        request_period_in_seconds: 300
        types:
          - eh_services
        queues:
          queue_a:
            event_hub_name: the-event-hub-name
            consumer_group: the-consumer-group
            connection_str: the-connection-string
            events_use_autocategory: true
            compatibility_version: 1.2.1
            duplicated_messages_mechanism: global
            use_global_counter_per_queue: true
      my_service_2:
        request_period_in_seconds: 300
        types:
          - vm_metrics
Code Block
# New Version (2.0.0)
inputs:
  azure:
    id: 100001
    enabled: true
    credentials:
      subscription_id: subscription_id_acme
      client_id: app_id_acme
      client_secret: secret_acme
      tenant_id: active_directory_id_acme
    environment: test-env
    services:
      vm_metrics:
        request_period_in_seconds: 300
  azure_event_hub:
    id: 100001
    enabled: true
    credentials:
      subscription_id: subscription_id_acme
      client_id: app_id_acme
      client_secret: secret_acme
      tenant_id: active_directory_id_acme
    environment: test-env
    services:
      event_hubs:
        queues:
          queue_a:
            event_hub_name: the-event-hub-name
            event_hub_connection_string: the-connection-string
            consumer_group: the-consumer-group
            events_use_autocategory: true
            compatibility_version: 1.2.0
            duplicated_messages_mechanism: global
            override_starting_position: "2022-01-01T00:00:00Z" # Replace with the datetime of the last event in Devo; otherwise, the collector pulls from the latest event for the first checkpoint.

Tag mapping configuration guide

The events from Event Hubs are by default auto-categorized to Devo tags according to the values explained here. Sometimes it may be necessary to change this categorization or to create a new categorization for a new kind of event. It is possible to change the categories by editing the config file, without creating a new version of the collector.

This guide explains how to configure mapping using the tag parameter in the YAML configuration.

Overview

By default, the override_tag parameter accepts a simple string that will be applied to all records; however, the advanced override_tag parameter allows you to define a default tag and a set of tag mapping rules based on JMESPath expressions. The collector will use these rules to assign tags to records based on their content.

Info

You can find a tutorial and a complete reference for JMESPath here.

Template / Example

Code Block
override_tag:
 default_tag: <default_tag_value>
 jmespath_refs:
   <jmespath_ref_placeholder_name>: <jmespath_ref_placeholder_value>
 tag_map:
   - jmespath: <jmespath_expression_value>
     tag: <tag_value>

Configuration

Default tag

  • Use the default_tag parameter to specify the default tag that will be applied to records that do not match any JMESPath expression.

JMESPath references (Optional)

  • Define reusable JMESPath expressions in the jmespath_refs section.

  • These expressions can be referenced in the tag_map section using placeholders (e.g., {events_base}).

Tag map

  • Define a list of tag mapping rules in the tag_map section.

  • Each rule consists of a jmespath expression and a corresponding tag.

  • The jmespath expression is evaluated against each record, and if it matches, the corresponding tag is applied to the record.

  • The tag value can include placeholders (e.g., {queue_name}, {collector_version}) that will be substituted with values from the record itself or the collector variables.

Example simple configuration (used internally by the Google Workspace Logs in BigQuery collector)

Code Block
override_tag:
  default_tag: my.app.gsuite_activity.{record_type}
  tag_map:
    - jmespath: "[?record_type == 'gmail']"
      tag: cloud.gcp.bigquery.gmail

Example advanced configuration (used internally by the Azure collector)

Code Block
override_tag:
 default_tag: my.app.cloud_azure.unknown_events
 jmespath_refs:
   lower_resource_id: "lower(resourceid || resourceId || _ResourceId)"
   lower_category: "lower(category || Category)"
   events_base: "[?not_null(category, Category)]"
   metrics_base: "[?not_null(metricName)]"
   vm_base: "[?SourceSystem == 'Linux' || SourceSystem == 'OpsManager']"
 tag_map:
   - jmespath: "{events_base}"
     tag: cloud.azure.others.events.{queue_name}.{collector_version}.eh
   - jmespath: "{metrics_base}"
     tag: cloud.azure.eh.metrics.{queue_name}.{collector_version}
   - jmespath: "{vm_base} | [?Type == 'SecurityEvent' || (Type == 'Event' && EventLog == 'Security')]"
     tag: cloud.azure.vm.securityevent.{queue_name}.{collector_version}.eh
   - jmespath: "{vm_base} | [?Type == 'Syslog' && SourceSystem == 'Linux']"
     tag: cloud.azure.vm.unix.{queue_name}.{collector_version}.eh
   - jmespath: "{vm_base} | [?Type == 'Event' && EventLog == 'Application']"
     tag: cloud.azure.vm.applicationevent.{queue_name}.{collector_version}.eh
   - jmespath: "{vm_base} | [?Type == 'Event' && EventLog == 'System']"
     tag: cloud.azure.vm.systemevent.{queue_name}.{collector_version}.eh
   - jmespath: "{vm_base}"
     tag: cloud.azure.vm.unknown_events.{queue_name}.{collector_version}.eh

Evaluation process

  1. The collector evaluates each record against the JMESPath expressions in the tag_map section, in top-down order.

  2. If a record matches a JMESPath expression, the corresponding tag is applied, and the record is not evaluated against subsequent expressions.

  3. If a record does not match any JMESPath expression, the default_tag is applied.

Sending records

  • After all evaluations are made for a given recordset, the collector groups the records by their assigned tags.

  • The collector sends each group of records to Devo on a per-tag basis.

...

Troubleshooting

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

Error type

Error ID

Error message

Cause

Solution

InitVariablesError

1

Invalid start_time_in_utc: {ini_start_str}. Must be in parseable datetime format.

The configured start_time_in_utc parameter is a non-parseable format.

Update the start_time_in_utc value to have the recommended format as indicated in the guide.

InitVariablesError

2

Invalid start_time_in_utc: {ini_start_str}. Must be in the past.

The configured start_time_in_utc parameter is a future date.

Update the start_time_in_utc value to a past datetime.

PullError

350

Could not match tag to record and no default tag provided: {record}

Advanced tagging is configured, but no default tag was provided and the record did not match any tag pathway.

Provide a default tag in the advanced tag mapping object.

ApiError

401

An error occurred while trying to authenticate with the Azure API. Exception: {e}

The collector is unable to authenticate with the Azure API.

Check the credentials and ensure that the collector has the necessary permissions to access the Azure API.

ApiError

410

An error occurred while trying to check if container '{container_name}' exists. Ensure that the blob storage account name or connection string is correct. Exception: {e}

The collector was unable to locate the specified blob storage container name.

Ensure the container exists and the credentials have READ access to the container

ApiError

411

An error occurred while trying to check if container '{container_name}' exists. Ensure that the application has necessary permissions to access the containers. Exception: {e}

The collector was unable to access the specified blob storage container name.

Ensure the container exists and the credentials have READ access to the container

ApiError

412

An error occurred while trying to create container '{container_name}'. Ensure that the application has necessary permissions to create containers. Exception: {e}

The collector was unable to create the container for the auto discover service and the user indicated to use Azure Blob Storage checkpointing.

Ensure the credentials have WRITE access to the container storage account.

ApiError

420

An error occurred while trying to get consumer group '{consumer_group_name}'. Exception: {e}

The collector was unable to access the specified consumer group name.

Ensure the consumer group exists and the credentials have READ access to the consumer group

ApiError

421

An error occurred while trying to create consumer group '{consumer_group_name}'. Ensure that the application has necessary permissions to create consumer groups. Exception: {e}

The collector was unable to create the consumer group for the auto discover service.

Ensure the credentials have WRITE access to the event hub namespace or use the $Default consumer group.

Collector operations

This section is intended to explain how to proceed with specific operations of this collector.

Expand
titleVerify collector operations

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services, and of validating the given configuration.

A successful run has the following output messages for the initializer module:

Code Block
2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Loading configuration using the following files: {"full_config": "config-test-local.yaml", "job_config_loc": null, "collector_config_loc": null}
2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Using the default location for "job_config_loc" file: "/etc/devo/job/job_config.json"
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> "\etc\devo\job" does not exists
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> Using the default location for "collector_config_loc" file: "/etc/devo/collector/collector_config.json"
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> "\etc\devo\collector" does not exists
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> Results of validation of config files parameters: {"config": "C:\git\collectors2\devo-collector-<name>\config\config.yaml", "config_validated": True, "job_config_loc": "/etc/devo/job/job_config.json", "job_config_loc_default": True, "job_config_loc_validated": False, "collector_config_loc": "/etc/devo/collector/collector_config.json", "collector_config_loc_default": True, "collector_config_loc_validated": False}
2023-01-10T15:22:57.171 WARNING MainProcess::MainThread -> [WARNING] Illegal global setting has been ignored -> multiprocessing: False

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues, where all events are injected by the pullers, and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the event delivery module:

Code Block
2023-01-10T15:23:00.788    INFO OutputProcess::MainThread -> DevoSender(standard_senders,devo_sender_0) -> Starting thread
2023-01-10T15:23:00.789    INFO OutputProcess::MainThread -> DevoSenderManagerMonitor(standard_senders,devo_1) -> Starting thread (every 300 seconds)
2023-01-10T15:23:00.790    INFO OutputProcess::MainThread -> DevoSenderManager(standard_senders,manager,devo_1) -> Starting thread
2023-01-10T15:23:00.842    INFO OutputProcess::MainThread -> global_status: {"output_process": {"process_id": 18804, "process_status": "running", "thread_counter": 21, "thread_names": ["MainThread", "pydevd.Writer", "pydevd.Reader", "pydevd.CommandThread", "pydevd.CheckAliveThread", "DevoSender(standard_senders,devo_sender_0)", "DevoSenderManagerMonitor(standard_senders,devo_1)", "DevoSenderManager(standard_senders,manager,devo_1)", "OutputStandardConsumer(standard_senders_consumer_0)",

Sender services

The Integrations Factory Collector SDK has 3 different sender services, depending on the event type to deliver (internal, standard, and lookup). This collector uses the following sender services:

Sender services

Description

internal_senders

In charge of delivering internal metrics, such as logging traces, to Devo.

standard_senders

In charge of delivering pulled events to Devo.

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Info

This value helps detect bottlenecks and the need to increase the performance of data delivery to Devo, which can be achieved by increasing the number of concurrent senders.

Total number of messages sent: 44, messages sent since "2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)

Displays the number of events sent since the last checkpoint. From the given example, the following conclusions can be drawn:

  • 44 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00.

  • 21 events were sent to Devo between the last UTC checkpoint and now.

  • Those 21 events required 0.007 seconds to be delivered.

Info

By default, these traces are shown every 10 minutes.

Expand
titleCheck memory usage

To check the memory usage of this collector, look for the following log records, which are displayed every 5 minutes by default, always after the memory-freeing process runs.

  • The used memory is displayed per running process, and the sum of both values gives the total memory used by the collector.

  • The global value shows the overall pressure on the available system memory.

  • All metrics (global, RSS, VMS) show the value before freeing memory -> the value after freeing memory.

Code Block
INFO InputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(34.50MiB -> 34.08MiB), VMS(410.52MiB -> 410.02MiB)
INFO OutputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(28.41MiB -> 28.41MiB), VMS(705.28MiB -> 705.28MiB)
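
For the sample traces above, the total memory used by the collector after freeing is the sum of both RSS values: 34.08 MiB + 28.41 MiB ≈ 62.49 MiB.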

Change log

Release

Released on

Release type

Details

Recommendations

v2.2.0

Status
colourGreen
titleIMPROVEMENTS

Feature

  • Added Intune Service

Improvements

  • Updated DCSDK from 1.11.1 to 1.12.2

  • Fixed high vulnerability in Docker Image

  • Upgrade DevoSDK dependency to version v5.4.0

  • Fixed error in persistence system

  • Applied changes to make DCSDK compatible with MacOS

  • Added new sender for relay in house + TLS

  • Added persistence functionality for gzip sending buffer

  • Added Automatic activation of gzip sending

  • Improved behaviour when persistence fails

  • Upgraded DevoSDK dependency

  • Fixed console log encoding

  • Restructured python classes

  • Improved behaviour with non-utf8 characters

  • Decreased the default size value for internal queues (Redis limitation, from 1GiB to 256MiB)

  • New persistence format/structure (compression in some cases)

  • Removed dmesg execution (It was invalid for docker execution)

Recommended version

v2.0.0

Status
colourGreen
titleIMPROVEMENTS

Improvements

  • Complete reimplementation of the collector, refactoring all the services

Update

v1.9.0

Status
colourGreen
titleIMPROVEMENTS

Improvements

  • Updated DCSDK from 1.10.3 to 1.11.0

    • Resolved the UTF-16 issues.

    • Fixed some bugs related to development.

Update

v1.8.0

Status
colourGreen
titleIMPROVEMENTS

Status
colourRed
titleBUG FIXING

Improvements

  • Update DCSDK from 1.9.2 to 1.10.3:

  • Updated DevoSDK to v5.1.9

  • Fixed some bugs related to development on macOS

  • Added an extra validation and fix when the DCSDK receives a wrong timestamp format

  • Added an optional config property to use the Syslog timestamp format in a strict way

Bug fixing

  • Fixed a bug related to UTF-16 that caused the collector to stop sending events

Update

v1.7.1

Status
colourRed
titleBUG FIXING

Bug fixing

  • Azure metrics were using an incorrect timestamp format, which caused logs to be sent to unknown tables

Update

v1.7.0

Status
colourGreen
titleIMPROVEMENTS

Status
colourRed
titleBUG FIXING

Improvements

  • Update DCSDK from 1.8.0 to 1.9.2:

    • Upgrade internal dependencies

    • Store lookup instances into DevoSender to avoid creation of new instances for the same lookup

    • Ensure service_config is a dict into templates

    • Ensure special characters are properly sent to the platform

    • Changed log level to some messages from info to debug

    • Changed some wrong log messages

    • Upgraded some internal dependencies

    • Changed queue passed to setup instance constructor

  • Update internal Azure libraries

Bug fixing

  • Enhancement for event category calculation

Update

v1.6.0

Status
colourGreen
titleIMPROVEMENTS

Status
colourRed
titleBUG FIXING

Improvements

  • Update DCSDK from 1.3.0 to 1.8.0:

    • Added log traces for knowing the execution environment status (debug mode)

    • Fixes in the current puller template version

    • The Docker container exits with the proper error code

    • New controlled stopping condition when any input thread fatally fails

    • Improved log trace details when runtime exceptions happen

    • Refactored source code structure

    • New "templates" functionality

    • Functionality for detecting some system signals for starting the controlled stopping

    • Input objects send the internal messages to the devo.collectors.out table again

    • Upgraded DevoSDK to version 3.6.4 to fix a bug related to a connection loss with Devo

    • Refactored source code structure

    • Changed way of executing the controlled stopping

    • Minimized probabilities of suffering a DevoSDK bug related to "sender" to be null

    • Ability to validate collector setup and exit without pulling any data

    • Ability to store in the persistence the messages that couldn't be sent after the collector stopped

    • Ability to send messages from the persistence when the collector starts and before the puller begins working

    • Ensure special characters are properly sent to the platform

    • Added a lock to enhance sender object

    • Added new class attrs to the setstate and getstate queue methods

    • Fix sending attribute value to the setstate and getstate queue methods

    • Added log traces when queues are full and have to wait

    • Added log traces of queues time waiting every minute in debug mode

    • Added method to calculate queue size in bytes

    • Block incoming events in queues when there is no space left

    • Send telemetry events to Devo platform

    • Changed

    • Upgraded internal Python dependency Redis to v4.5.4

    • Upgraded internal Python dependency DevoSDK to v5.1.3

    • Fixed obfuscation not working when messages are sent from templates

Bug fixing

  • Updated Azure libraries for Python to share common cloud patterns.

  • Change in the authentication mechanism:

    • Previous version: Used ServicePrincipalCredentials in azure.common to authenticate to Azure.

    • New version: Uses the azure.identity library to provide unified authentication for all Azure SDKs.

Update

v1.5.0

Status
colourRed
titleBUG FIXING

Bug fixing

  • Accept a batch of events that come as an array.

  • Filter out non-VM-related events in the SourceSystem branch.

Update

v1.4.1

Status
colourGreen
titleIMPROVEMENTS

Improvements

  • Upgraded the underlying IFC SDK from v1.3.0 to v1.4.0.

  • Updated the underlying DevoSDK package to v3.6.4 and its dependencies. This upgrade increases the resilience of the collector when the connection with Devo or the Syslog server is lost; the collector is able to reconnect in some scenarios without running the self-kill feature.

  • Support for stopping the collector when a GRACEFULL_SHUTDOWN system signal is received.

  • Re-enabled the logging to devo.collector.out for Input threads.

  • Improved self-kill functionality behavior.

  • Added more details in log traces.

  • Added log traces for knowing system memory usage.

Update

v1.4.0

Status
colourGreen
titleIMPROVEMENTS

Improvements

New event types are accepted for the vm_events service autocategorizer.

  • cloud.azure.vm.securityevent:

    • Type: Event

    • EventID: all

    • EventLog: Security

  • cloud.azure.vm.applicationevent:

    • Type: Event

    • EventID: all

    • EventLog: Application

  • cloud.azure.vm.systemevent:

    • Type: Event

    • EventID: all

    • EventLog: System

Update

v1.3.2

Status
colourRed
titleBUG FIXING

Bug fixing

A configuration bug has been fixed to enable the autocategorization of the following events:

  • RiskyUsers

  • AzurePolicyEvaluationDetails

Update