Microsoft Defender Cloud Apps

Configuration requirements

To run this collector, you need to consider the configurations detailed below.

Configuration

Details

Microsoft Defender account

You need an active Microsoft Defender account.

Microsoft Azure account

You need to have a Microsoft Azure account to register your app.

Application Secret Key

You will need to create an application secret key in Microsoft Azure.

More information

Refer to the Vendor setup section to learn more about these configurations.

Overview

Microsoft Defender for Cloud Apps is a Cloud Access Security Broker (CASB) that operates on multiple clouds. It provides rich visibility, control over data travel, and sophisticated analytics to identify and combat cyber threats across all your cloud services.

Devo collector features

Feature

Details

Allow parallel downloading (multipod)

Not allowed

Running environments

Collector server

On-premise

Populated Devo events

Table

Flattening preprocessing

Yes

Data sources

Data source

Description

API endpoint

Collector service name

Devo table

Available from release

Activity

Returns information regarding who logs in to which app and when, which files are being downloaded from suspicious locations, and so on.

/api/v1/activities/

activities

casb.microsoft_defender.cloud_apps.activities

v1.0.0

Alerts

Alerts indicate suspicious behavior and known threats in your environment.

/api/v1/alerts/

alerts

casb.microsoft_defender.cloud_apps.alerts

v1.0.0

Entities

Entities provides you with basic information about the users and accounts using your organization's cloud apps, allowing you to understand service use patterns.

/api/v1/entities/

entities

casb.microsoft_defender.cloud_apps.entities

v1.0.0

Data Enrichment

Data Enrichment enables you to manage identifiable IP address ranges, such as your physical office IP addresses. IP address ranges allow you to tag, categorize, and customize the way logs and alerts are displayed and investigated.

/api/v1/subnet/

data_enrichment

casb.microsoft_defender.cloud_apps.data_enrichment

v1.0.0

Files

Files provides you with metadata about the files and folders stored in your cloud apps, such as last modification date, ownership, and more.

/api/v1/files/

files

casb.microsoft_defender.cloud_apps.files

v1.0.0
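To illustrate how the rows above relate to each other, the following minimal sketch joins a base URL with each API endpoint and pairs it with its Devo table. The names SERVICES and endpoint_url are hypothetical and are not part of the collector; only the endpoints, service names, and table names come from the table above.

```python
from urllib.parse import urljoin

# Collector service name -> (API endpoint, Devo table), copied from the table above.
SERVICES = {
    "activities": ("/api/v1/activities/", "casb.microsoft_defender.cloud_apps.activities"),
    "alerts": ("/api/v1/alerts/", "casb.microsoft_defender.cloud_apps.alerts"),
    "entities": ("/api/v1/entities/", "casb.microsoft_defender.cloud_apps.entities"),
    "data_enrichment": ("/api/v1/subnet/", "casb.microsoft_defender.cloud_apps.data_enrichment"),
    "files": ("/api/v1/files/", "casb.microsoft_defender.cloud_apps.files"),
}


def endpoint_url(api_base_url: str, service: str) -> str:
    """Build the full URL for a service (hypothetical helper, for illustration only)."""
    endpoint, _table = SERVICES[service]
    return urljoin(api_base_url, endpoint)
```

For example, calling endpoint_url with "https://portal.cloudappsecurity.com/" and "alerts" yields the base URL followed by /api/v1/alerts/.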

 

Flattening preprocessing

Data source

Collector service

Optional

Flattening details

Alerts

alerts

Yes

The easiest way to describe the flattening logic is via an example. If:

base_event = {
    "id": "baseEventId1",
    "type": "alert"
}

args[0] = {
    "type": "domain",
    "should_flatten": True,
    "value": [
        { "id": 1, "name": "domain1" },
        { "id": 2, "name": "domain2" }
    ]
}

args[1] = {
    "type": "user",
    "should_flatten": True,
    "value": [
        { "id": 1, "name": "user1" },
        { "id": 2, "name": "user2" },
        { "id": 3, "name": "user3" }
    ]
}

The assumptions are:

  1. The base_event must be a dictionary

  2. You can pass any number of arguments in "args" but they must conform to the structure shown above.

In the above example, a total of 6 events will be generated (because there are 2 domains and 3 users).
If multiple "args" are passed, the total number of flattened events = len(args[0]["value"]) * len(args[1]["value"]) * len(args[2]["value"]) ...
To generate an event, the logic is:

  1. Take the base_event as-is.

  2. Add the number of domains and users to the base event. At this point the base_event becomes:

{ "id": "baseEventId1", "type": "alert", "related_domains": 2, "related_users": 3 }

  3. For each domain and each user, prepend the keys with "domain_" and "user_" respectively, then add those keys to the base event. An example event (using the modified base_event from the previous step) is:

{
    "id": "baseEventId1",       # From the base event
    "type": "alert",            # From the base event
    "related_domains": 2,       # Total number of related domains
    "related_users": 3,         # Total number of related users
    "domain_id": 1,             # The "id" field from a domain, with "domain_" prepended
    "domain_name": "domain1",   # The "name" field from a domain, with "domain_" prepended
    "user_id": 1,               # The "id" field from a user, with "user_" prepended
    "user_name": "user1"        # The "name" field from a user, with "user_" prepended
}
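The flattening logic described above can be sketched in Python as follows. This is an illustrative approximation, not the collector's actual code: the function name flatten_event is hypothetical, the counter key is assumed to be "related_<type>s" (as in the example), and arguments whose "should_flatten" is false are assumed to be skipped.

```python
from itertools import product
from typing import Any, Dict, List


def flatten_event(base_event: Dict[str, Any], *args: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one event per combination of the flattenable values (illustrative sketch)."""
    if not isinstance(base_event, dict):
        raise TypeError("base_event must be a dictionary")

    # Assumption: arguments with a false "should_flatten" are ignored.
    to_flatten = [arg for arg in args if arg.get("should_flatten")]

    # Step 2: add the number of related items of each type to a copy of the base event.
    enriched = dict(base_event)
    for arg in to_flatten:
        # Assumption: the counter key is "related_<type>s", as in the example.
        enriched[f"related_{arg['type']}s"] = len(arg["value"])

    # Step 3: build one event for every combination (the Cartesian product).
    events = []
    for combination in product(*(arg["value"] for arg in to_flatten)):
        event = dict(enriched)
        for arg, item in zip(to_flatten, combination):
            for key, value in item.items():
                event[f"{arg['type']}_{key}"] = value
        events.append(event)
    return events
```

With the base_event and the two arguments from the example, this sketch returns 6 events, and the first one matches the example event shown above.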

Vendor setup

This collector has some minimum requirements that must be met before you set it up.

Accepted authentication methods

Authentication method

Client ID

Tenant ID

Client Secret

Files Access Token

Azure Authentication for the following services:

  • Activity

  • Alerts

  • Entities

  • Data Enrichment

REQUIRED

REQUIRED

REQUIRED

 

Azure Authentication for Files service:

REQUIRED

REQUIRED

REQUIRED

REQUIRED

Minimum configuration required for basic pulling

Although this collector supports advanced configuration, the fields required to retrieve data with a basic configuration are defined below.

This minimum configuration refers exclusively to those specific parameters of this integration. There are more required parameters related to the generic behavior of the collector. Check the settings section for details.

Setting

Details

api_base_url

This parameter is the URL of the CloudApps instance. It is usually https://portal.cloudappsecurity.com/.

token_url

This parameter is the URL used to request the access token. It is usually https://login.microsoftonline.com/.

tenant_id

Enter the tenant ID created in the CloudApps console.

client_id

Enter the client ID created in the CloudApps console.

client_secret

Enter the client secret created in the CloudApps console.

files_access_token

Required only for the Files service; remove this setting if the service is not used.

Enter the access token value created in the CloudApps console.
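As a quick sanity check before launching the collector, the settings above can be verified with a small helper such as the one sketched below. The function missing_settings and the flat dictionary layout are assumptions made for illustration; the collector's real configuration file has its own structure and performs its own validation.

```python
from typing import Dict, Iterable, List

# Settings required by every service, taken from the table above.
BASE_SETTINGS = ("api_base_url", "token_url", "tenant_id", "client_id", "client_secret")


def missing_settings(settings: Dict[str, str], services: Iterable[str]) -> List[str]:
    """List the required settings that are absent or empty (hypothetical helper)."""
    required = list(BASE_SETTINGS)
    # files_access_token is needed only when the files service is enabled.
    if "files" in set(services):
        required.append("files_access_token")
    return [name for name in required if not settings.get(name)]
```

An empty result means that every setting required for the selected services has a value.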

Running the data collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector in your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

Events service

Once the collector has been launched, it is important to check that ingestion is working properly. To do so, go to the collector’s logs console.

This service has the following components:

Component

Description

Setup

The setup module is in charge of authenticating the service and managing the token expiration when needed.

Puller

The puller module is in charge of pulling the data in an organized way and delivering the events via SDK.

Setup output

A successful run has the following output messages for the setup module:

Puller output

A successful initial run has the following output messages for the puller module:

After a successful collector’s execution (that is, no error logs found), you will see the following log message:

This collector uses persistent storage to download events in an orderly fashion and avoid duplicates. In case you want to re-ingest historical data or recreate the persistence, you can restart the persistence of this collector by following these steps:

  1. Edit the configuration file.

  2. Change the value of the historic_date_utc parameter to a different one.

  3. Save the changes.

  4. Restart the collector.

The collector will detect this change and will restart the persistence using the parameters of the configuration file or the default configuration in case it has not been provided.
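The reset behavior described above can be pictured with the following minimal sketch. It is not the collector's implementation: the function resolve_state, the state dictionary layout, the last_event_time key, and the date format used in the usage example are all hypothetical. Only the historic_date_utc parameter comes from this documentation.

```python
from typing import Any, Dict, Optional


def resolve_state(saved_state: Optional[Dict[str, Any]], config: Dict[str, Any]) -> Dict[str, Any]:
    """Keep the saved state unless historic_date_utc changed (illustrative sketch)."""
    configured_date = config.get("historic_date_utc")
    if saved_state is None or saved_state.get("historic_date_utc") != configured_date:
        # The configured value differs from the persisted one, so start over.
        # "last_event_time" is a hypothetical checkpoint field.
        return {"historic_date_utc": configured_date, "last_event_time": None}
    return saved_state
```

In this sketch, an unchanged historic_date_utc keeps the existing checkpoint, while any different value discards it and pulling starts again from the configured date.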

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

 

Error type

Error ID

Error message

Cause

Solution

SetupError

100

The remote data is not pullable with the given credentials. Check the error traces for details.

This error is raised when remote data cannot be accessed with current credentials.

Check that the credentials are correct and that they have the necessary permissions.

101

The token/header/authentication was refreshed but is still expired. Check the error traces for details.

This error is raised when the token cannot be refreshed.

Check that the credentials are correct and contact the internal team if the problem persists.

101

access token has expired or invalid. status code: 401. Error message: <response.text>

This error is raised when the credentials do not have sufficient permissions to access the data.

Give the credentials the correct permissions to access the data.

102

The provided credentials are valid but they do not have the permission to generate access token. status code: 403. Error message: <response.text>

This error is raised when the access token does not have sufficient permissions to access the data.

Give the token the correct permissions to access the data.

103

Unexpected error occurred at the Microsoft Defender for Cloud Apps server. status code: <response.status_code>. Error message: <response.text>

This error is raised when an unexpected and unknown error occurs in Microsoft Defender for Cloud Apps.

Contact the Devo Support team.

InitVariablesError

0

The internal config did not pass the format validation. Contact Devo Support.

This error is raised when the collector_definitions.yaml does not comply with the json schema validation.

This is an internal issue. Contact the Devo Support team.

1

The user config did not pass the format validation. Check error traces for details and visit our documentation.

This error is raised when the user configuration information does not comply with the json schema validation.

Check in the collector documentation what are the allowed parameters and their formats.

2

<rate_limiter> setting has been defined with wrong type. Expected <dict> but received <type(rate_limiter_config)>

This error is raised when the required rate_limiter property in the collector_definitions.yaml is not a dictionary.

This is an internal issue. Contact the Devo Support team.

3

The user config did not pass the format validation. Check error traces for details and visit our documentation.

This error is raised when the user configuration information does not comply with the json schema validation.

Check in the collector documentation what are the allowed parameters and their formats.

PullError

300

Error occurred while retrieving data from Microsoft Defender for Cloud Apps server. Error details: <str(e)>

This error is raised when an unknown error occurs in the Microsoft Defender for Cloud Apps request.

This is an internal issue. Contact the Devo Support team.

301

Error in the filters : <response.json()>

This error is raised when the filters are not in the correct format.

The error description indicates if there are disallowed elements in the filter. Review the Microsoft Defender CloudApps API documentation for the correct format of filters.

302

access token has expired or invalid. status code: 401. Error message: <response.json()>

This error is raised when the credentials do not have sufficient permissions to access the data.

Give the credentials the correct permissions to access the data.

303

The access token does not have valid permissions to perform this request. Add the required permissions in the Microsoft Defender for Cloud Apps azure portal (Azure Active Directory -> App registration -> select the register app -> API Permission) status code: 403. Error message: <response.json()>

This error is raised when the access token does not have sufficient permissions to access the data.

Give the token the correct permissions to access the data.

304

The resource requested is not found. Resource Url: <base_url + endpoint_url> status code: 404. Error message: <response.json()>

This error is raised when the requested resource is not found.

This is an internal issue. Contact the Devo Support team.

305

Unexpected error occurred at the Microsoft Defender for Cloud Apps server. status code: <response.status_code>. Error message: <response.json()>

This error is raised when an unknown error occurs in the Microsoft Defender for Cloud Apps request.

This is an internal issue. Contact the Devo Support team.
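When triaging logs, it can help to relate an HTTP status code to the PullError IDs listed above. The sketch below is inferred solely from the status codes quoted in the error messages (401, 403, 404) and the catch-all message of error 305; the function pull_error_id is hypothetical, and error 301 (filter format) is left out because the table does not state its status code.

```python
# Status code -> PullError ID, inferred from the error messages in the table above.
PULL_ERROR_IDS = {
    401: 302,  # access token has expired or is invalid
    403: 303,  # access token lacks the required permissions
    404: 304,  # requested resource not found
}


def pull_error_id(status_code: int) -> int:
    """Return the PullError ID for a failed response (hypothetical helper)."""
    # Any other failure is assumed to fall under the "unexpected error" ID 305.
    return PULL_ERROR_IDS.get(status_code, 305)
```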

Collector operations

This section is intended to explain how to proceed with the specific operations of this collector.

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the event delivery module:

Sender services

The Integrations Factory Collector SDK has three different sender services, depending on the type of event to deliver (internal, standard, and lookup). This collector uses the following sender services:

Sender services

Description

internal_senders

In charge of delivering internal metrics to Devo such as logging traces or metrics.

standard_senders

In charge of delivering pulled events to Devo.

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Total number of messages sent: 44, messages sent since "2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)

Displays the number of events sent since the last checkpoint. From the given example, the following conclusions can be drawn:

  • 44 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00.

  • 21 events were sent to Devo between the last UTC checkpoint and now.

  • Those 21 events required 0.007 seconds to be delivered.
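If you want to extract these figures from the collector logs automatically, a regular expression along the following lines may work. It is a sketch built only from the sample trace shown above; the function name parse_sender_stats and the result keys are hypothetical, and the pattern may need adjusting if the real log line carries extra prefixes or a different layout.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

# Pattern derived from the sample logging trace above.
STATS_PATTERN = re.compile(
    r'Total number of messages sent: (?P<total>\d+), '
    r'messages sent since "(?P<since>[^"]+)": (?P<recent>\d+) '
    r'\(elapsed (?P<elapsed>[\d.]+) seconds\)'
)


def parse_sender_stats(line: str) -> Optional[Dict[str, Any]]:
    """Extract the sender statistics from a log line, or return None if absent."""
    match = STATS_PATTERN.search(line)
    if match is None:
        return None
    return {
        "total_sent": int(match["total"]),
        "checkpoint": datetime.fromisoformat(match["since"]),
        "sent_since_checkpoint": int(match["recent"]),
        "elapsed_seconds": float(match["elapsed"]),
    }
```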

To check the memory usage of this collector, look for the following log records in the collector which are displayed every 5 minutes by default, always after running the memory-free process.

  • The used memory is displayed by running processes and the sum of both values will give the total used memory for the collector.

  • The global pressure of the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory, in the form previous -> after.

Change log

Release

Released on

Release type

Details

Recommendations

v1.4.0

Oct 24, 2024

BUG FIXING

IMPROVEMENTS

  • Improvements:

    • Upgraded SDK image base from 1.1.0 to 1.3.1

    • Upgraded the DCSDK from 1.12.4 to 1.13.1

      • Change internal queue management for protecting against OOMK

      • Extracted ModuleThread structure from PullerAbstract

      • Improved controlled stop when both processes fail to instantiate

      • Improved controlled stop when InputProcess is killed

      • Fixed an error related to a ValueError exception that was not properly handled

      • Fixed error related with loss of some values in internal messages

  • Bug fixing:

    • Fixed an issue related to `files` service not working

Recommended version

v1.3.0

Feb 5, 2024

BUG FIXING

IMPROVEMENTS

Improvements:

  • Upgraded SDK image base from 1.0.2 to 1.1.0

Bug fixing:

  • Fixed an issue during persistence reset that caused events to be duplicated

  • Fixed an issue in the pull logic that caused events to be duplicated in the alerts service

  • Modified the pull logic to receive events in time intervals with the request_period_in_seconds parameter


Update

v1.2.0

Jan 12, 2024

BUG FIXING

IMPROVEMENTS

Improvements:

  • Updated API limits

  • Upgraded DCSDK from 1.9.2 to 1.10.2:

    • Added input metrics

    • Modified output metrics

    • Updated DevoSDK to version 5.1.6

    • Standardized exception messages for traceability

    • Added more detail in queue statistics

    • Updated PythonSDK to version 5.0.7

    • Introduced pyproject.toml

    • Added requirements.dev.txt

    • Fixed error in pyproject.toml related to project scripts endpoint

Bug fixing:

  • Removed old rate_limiter parameter

  • Modification to the prepull reset steps, which changed due to a previous update.

Update

v1.1.1

Sep 20, 2023

BUG FIXING

Bug fixing:

  • Set request_limits using schemas

Update

v1.1.0

Sep 9, 2023

IMPROVEMENT

Upgraded DCSDK from 1.5.1 to 1.9.2:

  • Store lookup instances into DevoSender to avoid creation of new instances for the same lookup

  • Ensure service_config is a dict into templates

  • Ensure special characters are properly sent to the platform

  • Changed the log level of some messages from info to debug

  • Changed some wrong log messages

  • Upgraded some internal dependencies

  • Changed queue passed to setup instance constructor

  • Ability to validate collector setup and exit without pulling any data

  • Ability to store in the persistence the messages that couldn't be sent after the collector stopped

  • Ability to send messages from the persistence when the collector starts and before the puller begins working

  • Added a lock to enhance sender object

  • Added new class attrs to the setstate and getstate queue methods

  • Fix sending attribute value to the setstate and getstate queue methods

  • Added log traces when queues are full and have to wait

  • Added log traces of queues time waiting every minute in debug mode

  • Added method to calculate queue size in bytes

  • Block incoming events in queues when there is no space left

  • Send telemetry events to Devo platform

  • Upgraded internal Python dependency Redis to v4.5.4

  • Upgraded internal Python dependency DevoSDK to v5.1.3

  • Fixed obfuscation not working when messages are sent from templates

  • New method to figure out if a puller thread is stopping

  • Upgraded internal Python dependency DevoSDK to v5.0.6

  • Improved logging on messages/bytes sent to Devo platform

  • Fixed wrong bytes size calculation for queues

  • New functionality to count bytes sent to Devo Platform (shown in console log)

  • Upgraded internal Python dependency DevoSDK to v5.0.4

  • Fixed bug in persistence management process, related to persistence reset

  • Aligned source code typing with Python 3.9.x

  • Inject environment property from user config

  • Obfuscation service can now be configured from user config and module definition

  • Obfuscation service can now obfuscate items inside arrays

Update to 1.1.1

v1.0.0

Oct 28, 2022

NEW FEATURE

New features:

Released first version of Microsoft Defender Cloud Apps Collector with the following services:

  • Activities: Returns information regarding who logs in to which app and when, which files are being downloaded from suspicious locations, and so on.

  • Alerts: Alerts indicate suspicious behavior and known threats in your environment.

  • Entities: Entities provides you with basic information about the users and accounts using your organization's cloud apps, allowing you to understand service use patterns.

  • Data Enrichment: Data Enrichment enables you to manage identifiable IP address ranges, such as your physical office IP addresses. IP address ranges allow you to tag, categorize, and customize the way logs and alerts are displayed and investigated.

  • Files: Files provides you with metadata about the files and folders stored in your cloud apps, such as last modification date, ownership, and more.