Microsoft Azure collector

If you are migrating from v1.x.x to v2.0.0, you can find a complete guide in this article.

Overview

Microsoft Azure is an ever-expanding set of cloud computing services to help your organization meet its business challenges. Azure gives you the freedom to build, manage, and deploy applications on a massive, global network using your preferred tools and frameworks.

Devo collector features

Features

Details

Allow parallel downloading (multipod)

Partial (supported for event_hubs services using Azure Blob Storage)

Running environments

  • collector server

  • on-premise

Populated Devo events

table

Flattening pre-processing

no

Allowed source events obfuscation

yes

Data source description

Data source

Description

API endpoint

Collector service name

Devo table

VM Metrics

The collector uses the Microsoft Azure API to retrieve metrics from your deployed Virtual Machines and sends them to Devo, where they are easier to query and analyze, including in Activeboards.

Azure Compute Management Client SDK and Azure Monitor Management Client SDK

vm_metrics

cloud.azure.vm.metrics_simple

Event Hubs

Several Microsoft Azure services can generate execution information and send it to an EventHub service (see the next section).

Azure Event Hubs SDK

event_hubs and event_hubs_autodiscover

<auto_tag_description>

Valid for all cloud.azure tables by setting the output option to stream to Event Hub.

Event hubs: Auto-categorization of Microsoft Azure service messages

Many of the available Microsoft Azure services can generate some type of execution information to be sent to an EventHub service. This type of data can be categorized as events or metrics. The events, in turn, can be from different subtypes: audits, status, logs, etc.

All such data will be gathered by Devo’s Microsoft Azure collector and sent to our platform, where message auto-categorization functionality is enabled for sending the messages to relevant Devo tables in an automatic way.

Although EventHub is the service used for centralizing Azure services' data, it also generates information that can be sent to itself.

If the amount of egress data exceeds the limits per Throughput Unit set by Azure (2 MB/s or 4096 events per second), Devo cannot continue to ingest data reliably. You can monitor ingress/egress throughput for the EventHub Namespace in the Azure Portal and, based on trends or alerts, add another EventHub to resolve this. To prevent this from happening in the first place, follow the scalability guidance provided by Microsoft in their technical documentation.

Learn more in this article.
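As a rough sizing aid, the egress limits quoted above can be turned into a small calculation. This is a minimal sketch, not part of the collector: the function name and its inputs are hypothetical, and the constants are simply the per-unit egress figures stated in this section.

```python
import math

# Egress limits per Throughput Unit, as quoted in this section.
EGRESS_MB_PER_SEC_PER_TU = 2.0
EGRESS_EVENTS_PER_SEC_PER_TU = 4096


def required_throughput_units(egress_mb_per_sec: float, egress_events_per_sec: float) -> int:
    """Return the smallest number of Throughput Units that covers both egress limits."""
    by_bandwidth = math.ceil(egress_mb_per_sec / EGRESS_MB_PER_SEC_PER_TU)
    by_event_rate = math.ceil(egress_events_per_sec / EGRESS_EVENTS_PER_SEC_PER_TU)
    # At least one unit is always needed; the stricter of the two limits wins.
    return max(1, by_bandwidth, by_event_rate)
```

For example, calling `required_throughput_units(3.5, 5000)` compares 3.5 MB/s against the bandwidth limit and 5000 events/second against the event-rate limit, and returns the larger of the two unit counts.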

Vendor setup

The Microsoft Azure collector centralizes the data with an Event Hub using the Azure SDK. To use it, you need to configure the resources in the Azure Portal and set the right permissions to access the information.

Virtual Machine metrics

Getting credentials

To log in to the Azure subscription, the collector uses a Service Principal object. You need to get the subscription ID, Active Directory ID, Application ID (service principal identification), and the client secret (service principal "password"). To get them, follow these steps:

  1. Log in to your Azure account and search for Azure Active Directory.

  2. Now, click App registrations in the left menu and click the app (or Service Principal) that you are going to use.

  3. In the Overview area, find the Application (client) ID and the Directory (tenant) ID.

  4. Now click Certificates & Secrets on the menu and create a new client secret by clicking the New client secret button.

    Don't forget to save the client secret value; it is shown only upon creation.

  5. Get the subscription ID by searching for Subscriptions on the home page.

  6. Find the correct subscription and note down the subscription ID.

Setting up permissions

  1. After creating the App registration (or Service Principal), go to the desired Resource Group (or subscription if you want to retrieve metrics from all the available virtual machines).

  2. Select Access control (IAM) in the left menu and click Add.

  3. Select at least the Reader role and choose the previously created App registration.

  4. Confirm the changes.

Event Hub events

Getting credentials (Storage Account) (Optional)

If you want to use Azure Blob Storage for checkpointing purposes, you need to create a storage account to store the checkpoints. If you do not wish to use Azure Blob storage (i.e. you will use Devo local persistence), you can skip the Blob Storage configuration steps.

  1. From the left portal menu, select Storage accounts to display a list of your storage accounts. If the portal menu isn't visible, select the menu button to toggle it on.

  2. On the Storage accounts page, select Create.

  3. After the storage account is created, select it from the list of storage accounts, click on Access keys in the left menu, and copy the connection string.

Alternatively, users can grant the necessary permissions to the registered application to access the Storage Account without using the connection string. Roles can be assigned in a variety of ways (e.g. inherited from the subscription group), but the following steps show how to assign the necessary roles directly to the Storage Account.

Repeat steps 1-2 from the Connection String section to create the Storage Account.

  1. In the Storage Account, click Access control (IAM) in the left menu, click + Add, and click Add Access Role Assignment.

  2. Search for either the Storage Blob Data Contributor or Storage Blob Data Owner role and select it and then click Next.

  3. Click + Select members and search for the previously created App registration, select it, click Next.

  4. Click Review + Assign.

Getting credentials (Event Hubs)

Users can either obtain a connection string or use Role Assignments to allow the collector to access the Event Hub.

  1. In your Azure account, search for the Event Hubs service and click on it. 

  2. Create an Event Hub resource per region (repeat the steps below for each region):

    • Click Add.

    • Fill in the mandatory fields, keeping in mind that the Event Hub must be in the same region as the resources that you are going to monitor (you only need one per region). The Throughput Units option refers to the ingress/egress limit (each unit allows 1 MB/s or 1000 events/second of ingress, and 2 MB/s or 4096 events/second of egress). Adjust it according to your data volume (this can be modified later).

    • The previous steps create an EventHub namespace; now go to Event Hubs, find the namespace you created, and click on it.

    • Now click on the + Event Hub button and create a new resource. You only need to fill the Name and Partition Count fields (the Partition Count field will divide the data into different partitions to make it easier to read large volumes of data). Write down the EventHub name to be used later in the configuration file.

    • Once the Event Hub is created in the namespace, click it and select Consumer Group in the left menu. Note that a dedicated Consumer Group for Devo needs to be created if the existing consumer groups are already in use.

    • Here you will see the Event Hub consumer groups. This will be used by the collector (or other applications) for reading data from the Event Hub. Write down the Consumer group name that you will use later in the configuration file.
      Now, in the Event Hub Namespace, click on Shared access policies, search the default policy named RootManageSharedAccessKey and click it.

    • Copy and write down the primary (or secondary) connection string to be used later in the configuration file.
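Before pasting the connection string into the configuration file, it can help to check that it was copied completely. The sketch below is illustrative only and is not the collector's own parsing code: it assumes the usual `Endpoint=...;SharedAccessKeyName=...;SharedAccessKey=...` layout of an Event Hubs namespace connection string, and the function name is hypothetical.

```python
def parse_connection_string(connection_string: str) -> dict:
    """Split an Event Hubs connection string into its key/value parts."""
    parts = {}
    for segment in connection_string.strip().split(";"):
        if not segment:
            continue  # tolerate a trailing semicolon
        # Split on the first "=" only: base64 keys often end with "=" padding.
        key, sep, value = segment.partition("=")
        if not sep:
            raise ValueError(f"Malformed segment: {segment!r}")
        parts[key] = value
    # Assumed minimum set of fields for a shared access policy connection string.
    for required in ("Endpoint", "SharedAccessKeyName", "SharedAccessKey"):
        if required not in parts:
            raise ValueError(f"Missing {required} in connection string")
    return parts
```

A string that was truncated while copying (for example, one missing the `SharedAccessKey` part) raises a `ValueError` instead of failing later at connection time.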

Alternatively, users can grant the necessary permissions to the registered application to access the Event Hub without using the RootManageSharedAccessKey. Roles can be assigned in a variety of ways (e.g. inherited from the subscription group), but the following steps will show how to assign the necessary roles directly to the Event Hub Namespace.

Repeat steps 1-2.7 from the previous section to create the Event Hub.

  1. In the Event Hub Namespace, click Access control (IAM) in the left menu, click + Add, and click Add Access Role Assignment.

  2. Search for either the Azure Event Hubs Data Receiver or Azure Event Hubs Data Owner role and select it and then click Next.

  3. Click + Select members and search for the previously created App registration, select it, click Next.

  4. Click Review + Assign.

Setting up the Event Hubs

  1. Now, search for the Monitor service and click on it.

  2. Click the Diagnostic Settings option in the left area.

  3. A list of the deployed resources will be shown. Search for the resources that you want to monitor, select them, and click Add diagnostic setting.

  4. Type a name for the rule and check the required category details (logs will be sent to the cloud.azure.eh.events table, and metrics will be sent to the cloud.azure.eh.metrics table).

  5. Check Stream to an Event Hub, and select the corresponding Event hub namespace, Event hub name, and Event hub policy name.

  6. Click Save to finish the process.
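The logs/metrics split described in step 4 can be pictured with a short sketch. It is not the collector's actual categorization logic, which is not described in this document. It assumes that each Event Hub message body is JSON with a top-level `records` list and that metric records carry a `metricName` field; the function name is hypothetical, and only the two table names come from the steps above.

```python
import json

EVENTS_TABLE = "cloud.azure.eh.events"
METRICS_TABLE = "cloud.azure.eh.metrics"


def route_records(message_body: str) -> list:
    """Return (table, record) pairs for every record in one Event Hub message."""
    payload = json.loads(message_body)
    # Assumption: diagnostic settings wrap records in a top-level "records" list.
    records = payload.get("records", [payload]) if isinstance(payload, dict) else payload
    routed = []
    for record in records:
        # Assumption: metric records carry "metricName"; anything else is treated as a log.
        table = METRICS_TABLE if "metricName" in record else EVENTS_TABLE
        routed.append((table, record))
    return routed
```

Keeping the decision in one small function makes it easy to extend the rule if a resource emits records with a different shape.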

Event Hub Auto Discover

To configure access to event hubs for the auto-discovery feature, you need to grant the necessary permissions to the registered application to access the Event Hub without using the RootManageSharedAccessKey. The auto-discovery feature enumerates a namespace and resource group for all available event hubs. It can also create consumer groups (if the configuration specifies a consumer group other than $Default and that consumer group does not exist when the collector connects to the event hub) and Azure Blob Storage containers for checkpointing purposes (if the user specifies a storage account and container in the configuration file).
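The consumer group rule in the paragraph above can be written as a small decision function. This is a sketch under stated assumptions rather than the collector's implementation: the function name and the shape of the `event_hubs` mapping are hypothetical, and no Azure API is called here.

```python
DEFAULT_CONSUMER_GROUP = "$Default"


def consumer_groups_to_create(event_hubs: dict, configured_group: str) -> list:
    """Return the event hubs on which the configured consumer group must be created.

    event_hubs maps each discovered event hub name to the set of consumer
    groups that already exist on it (hypothetical input shape).
    """
    if configured_group == DEFAULT_CONSUMER_GROUP:
        return []  # $Default exists on every event hub; nothing to create
    return sorted(
        name for name, groups in event_hubs.items() if configured_group not in groups
    )
```

With `{"hub-a": {"$Default"}, "hub-b": {"$Default", "devo"}}` and a configured group of `devo`, only `hub-a` is reported as needing a new consumer group.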

Role assignment (Namespace)

Repeat the steps from the Event Hubs Role Assignment section, except that the necessary role is the Azure Event Hubs Namespace Data Owner role. This allows the collector to enumerate the event hubs in the namespace and create consumer groups if necessary.

Minimum configuration required for basic pulling

Although this collector supports advanced configuration, the fields required to retrieve data with basic configuration are defined below.

Setting

Details

tenant_id

The Azure application tenant ID.

client_id

The Azure application client ID.

client_secret

The Azure application client secret.

subscription_id

The Azure application subscription ID.

Accepted authentication methods

Authentication method

Tenant ID

Client ID

Client secret

Subscription ID

OAuth2

REQUIRED

REQUIRED

REQUIRED

REQUIRED
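Since all four values are required, a quick pre-flight check can catch an incomplete configuration before the collector starts. The sketch below is illustrative only: the field names come from the settings table above, while the function name and the idea of passing the credentials as a plain dictionary are assumptions.

```python
REQUIRED_FIELDS = ("tenant_id", "client_id", "client_secret", "subscription_id")


def missing_credentials(credentials: dict) -> list:
    """Return the required fields that are absent or blank."""
    return [
        field
        for field in REQUIRED_FIELDS
        if not str(credentials.get(field) or "").strip()
    ]
```

An empty list means every required field has a non-blank value; otherwise the list names the fields that still need to be filled in.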

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

Internal process and deduplication method

All VM metrics data are pulled with a time grain value of PT1M (1 minute). The collector polls for all available VM resource IDs and then pulls the metrics for each resource ID. Checkpoints are persisted to ensure that duplicate data is not sent to Devo.
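One way to picture how a checkpoint prevents duplicates with a PT1M time grain is to list only the complete one-minute windows that follow the last checkpoint. This is a simplified sketch, not the collector's persistence code: the function name and the use of a single timestamp as the checkpoint are assumptions.

```python
from datetime import datetime, timedelta, timezone

TIME_GRAIN = timedelta(minutes=1)  # PT1M


def pending_windows(checkpoint: datetime, now: datetime) -> list:
    """Return the complete one-minute windows between the checkpoint and now."""
    windows = []
    start = checkpoint
    # Only whole windows are returned, so a partially elapsed minute is left for later.
    while start + TIME_GRAIN <= now:
        windows.append((start, start + TIME_GRAIN))
        start += TIME_GRAIN
    return windows
```

After the returned windows are processed, storing the end of the last window as the new checkpoint means the next poll starts where this one stopped, so no window is requested twice.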

Devo categorization and destination

All events of this service are ingested into the table cloud.azure.vm.metrics_simple

Restart the persistence

This collector uses persistent storage to download events in an orderly fashion and avoid duplicates. In case you want to re-ingest historical data or recreate the persistence, you can restart the persistence of this collector by following these steps:

  1. Edit the configuration file.

  2. Change the value of the start_time_in_utc parameter to a different one.

  3. Save the changes.

  4. Restart the collector.

The collector will detect this change and will restart the persistence using the parameters of the configuration file or the default configuration in case it has not been provided.
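The change detection described above can be sketched as a comparison between the saved state and the configured value. This is a hypothetical illustration: only the `start_time_in_utc` parameter comes from this document, while the function name, the `last_polled` field, and the dictionary layout of the persisted state are assumptions.

```python
def load_state(persisted: dict, configured_start_time: str) -> dict:
    """Return the state to resume from, resetting it if start_time_in_utc changed."""
    if persisted and persisted.get("start_time_in_utc") == configured_start_time:
        return persisted  # same start time: continue from the saved checkpoint
    # Changed value (or no saved state): discard the checkpoint and start over.
    return {
        "start_time_in_utc": configured_start_time,
        "last_polled": configured_start_time,  # hypothetical checkpoint field
    }
```

Because the reset is triggered by any difference in the stored value, even a small change to `start_time_in_utc` is enough to rebuild the persistence.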

Collector operations

This section is intended to explain how to proceed with specific operations of this collector.

Change log

Release

Released on

Release type

Details

Recommendations

v2.2.0

Jul 10, 2024

IMPROVEMENTS

Feature

  • Added Intune Service

Improvements

  • Updated DCSDK from 1.11.1 to 1.12.2

  • Fixed high vulnerability in Docker Image

  • Upgrade DevoSDK dependency to version v5.4.0

  • Fixed error in persistence system

  • Applied changes to make DCSDK compatible with MacOS

  • Added new sender for relay in house + TLS

  • Added persistence functionality for gzip sending buffer

  • Added Automatic activation of gzip sending

  • Improved behaviour when persistence fails

  • Upgraded DevoSDK dependency

  • Fixed console log encoding

  • Restructured python classes

  • Improved behaviour with non-utf8 characters

  • Decreased default size value for internal queues (Redis limitation, from 1GiB to 256MiB)

  • New persistence format/structure (compression in some cases)

  • Removed dmesg execution (It was invalid for docker execution)

Recommended version

v2.0.0

May 16, 2024

IMPROVEMENTS

Improvements

  • Complete reimplementation of the collector, refactoring all the services

Update

v1.9.0

Feb 20, 2024

IMPROVEMENTS

Improvements

  • Updated DCSDK from 1.10.3 to 1.11.0

    • Resolved the UTF-16 issues.

    • Fixed some bugs related to development.

Update

v1.8.0

Feb 14, 2024

IMPROVEMENTS

BUG FIXING

Improvements

  • Update DCSDK from 1.9.2 to 1.10.3:

  • Updated DevoSDK to v5.1.9

  • Fixed some bugs related to development on MacOS

  • Added an extra validation and fix when the DCSDK receives a wrong timestamp format

  • Added an optional config property to use the Syslog timestamp format in a strict way

Bug fixing

  • A bug related to UTF-16 causing the collector to stop sending events

Update

v1.7.1

Oct 6, 2023

BUG FIXING

Bug fixing

  • Azure metrics were using the incorrect timestamp format which caused logs to go to unknown

Update

v1.7.0

Sep 6, 2023

IMPROVEMENTS

BUG FIXING

Improvements

  • Update DCSDK from 1.8.0 to 1.9.2:

    • Upgrade internal dependencies

    • Store lookup instances into DevoSender to avoid creation of new instances for the same lookup

    • Ensure service_config is a dict into templates

    • Ensure special characters are properly sent to the platform

    • Changed log level to some messages from info to debug

    • Changed some wrong log messages

    • Upgraded some internal dependencies

    • Changed queue passed to setup instance constructor

  • Update internal Azure libraries

Bug fixing

  • Enhancement for event category calculation

Update

v1.6.0

Jun 12, 2023

IMPROVEMENTS

BUG FIXING

Improvements

  • Update DCSDK from 1.3.0 to 1.8.0:

    • Added log traces for knowing the execution environment status (debug mode)

    • Fixes in the current puller template version

    • The Docker container exits with the proper error code

    • New controlled stopping condition when any input thread fatally fails

    • Improved log trace details when runtime exceptions happen

    • Refactored source code structure

    • New "templates" functionality

    • Functionality for detecting some system signals for starting the controlled stopping

    • Input objects send the internal messages to the devo.collectors.out table again

    • Upgraded DevoSDK to version 3.6.4 to fix a bug related to a connection loss with Devo

    • Refactored source code structure

    • Changed way of executing the controlled stopping

    • Minimized probabilities of suffering a DevoSDK bug related to "sender" to be null

    • Ability to validate collector setup and exit without pulling any data

    • Ability to store in the persistence the messages that couldn't be sent after the collector stopped

    • Ability to send messages from the persistence when the collector starts and before the puller begins working

    • Ensure special characters are properly sent to the platform

    • Added a lock to enhance sender object

    • Added new class attrs to the setstate and getstate queue methods

    • Fix sending attribute value to the setstate and getstate queue methods

    • Added log traces when queues are full and have to wait

    • Added log traces of queues time waiting every minute in debug mode

    • Added method to calculate queue size in bytes

    • Block incoming events in queues when there is no space left

    • Send telemetry events to Devo platform

    • Changed

    • Upgraded internal Python dependency Redis to v4.5.4

    • Upgraded internal Python dependency DevoSDK to v5.1.3

    • Fixed obfuscation not working when messages are sent from templates

Bug fixing

  • Azure libraries for Python are updated to share common cloud patterns.

  • Change in the authentication mechanism:

    • Previous version: Used ServicePrincipalCredentials in azure.common to authenticate to Azure.

    • New version: Uses the azure.identity library to provide unified authentication for all Azure SDKs.

Update

v1.5.0

Feb 21, 2023

BUG FIXING

Bug fixing

  • Accept a batch of events that come as an array.

  • Filter out non-VM-related events in the SourceSystem branch.

Update

v1.4.1

Aug 12, 2022

IMPROVEMENTS

Improvements

  • Upgraded the underlying IFC SDK from v1.3.0 to v1.4.0.

  • Updated the underlying DevoSDK package to v3.6.4 and dependencies. This upgrade increases the resilience of the collector when the connection with Devo or the Syslog server is lost; the collector is able to reconnect in some scenarios without running the self-kill feature.

  • Support for stopping the collector when a GRACEFULL_SHUTDOWN system signal is received.

  • Re-enabled the logging to devo.collector.out for Input threads.

  • Improved self-kill functionality behavior.

  • Added more details in log traces.

  • Added log traces for knowing system memory usage.

Update

v1.4.0

Aug 12, 2022

IMPROVEMENTS

Improvements

New event types are accepted for the service vm_events autocategorizer.

  • cloud.azure.vm.securityevent:

    • Type: Event

    • EventID: all

    • EventLog: Security

  • cloud.azure.vm.applicationevent:

    • Type: Event

    • EventID: all

    • EventLog: Application

  • cloud.azure.vm.systemevent:

    • Type: Event

    • EventID: all

    • EventLog: System

Update

v1.3.2

Jun 14, 2022

BUG FIXING

Bug fixing

A configuration bug has been fixed to enable the autocategorization of the following events:

  • RiskyUsers

  • AzurePolicyEvaluationDetails

Update