BigID collector

Overview

The BigID API lets you programmatically perform all the actions you would otherwise perform through the BigID user interface. This suits scenarios like this collector, where the same operation must run on a scheduled basis.

Devo collector features

Feature

Details

Allow parallel downloading (multipod)

  • not allowed

Running environments

  • collector server

  • on-premise

Populated Devo events

  • table

Flattening preprocessing

  • no

Allowed source events obfuscation

  • yes

Data sources

Data source | API endpoint | Collector service name | Devo table
Audit | /v1/api/audit-logs | audit | dspm.bigid.audit.events

For more information on how the events are parsed, visit our page.

Vendor setup

To log in to the BigID environment, the collector needs a user token. The steps below follow the vendor documentation.

Getting Credentials

User Token - A user token (generated from Administration -> Access Management by a System Administrator) allows you to access BigID by exchanging a user token for a session token at the /refresh endpoint. This means you don't have to store your username and password within an application, but user tokens are only valid for a maximum of 999 days.

  1. First we'll need to create a user token for us to use through the BigID UI.

  2. To do this we need to navigate to the Access Management screen under Administration -> Access Management. On the Access Management screen, select the user you want to create a token for from the System Users List. Then press the Generate button to start the token creation process.

  3. Tokens can only be valid for up to 999 days. Since we're just using this token for testing, let's set it to 30 days and then click Generate.

  4. On the next screen you'll see a name for the token as well as the token value. Copy the token value by clicking the icon to the right of it then close the dialog. You can't see the token value again so be sure you have saved it someplace safe.

  5. Finally, save the user so the token can take effect.

Note that you are responsible for rotating this token yourself in your collector configuration before it expires.
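The token exchange described above can be sketched as follows. This is a minimal illustration, not the collector's actual code: the source only names a `/refresh` endpoint, so the `/api/v1` prefix, the use of the raw user token in the `Authorization` header, and the `systemToken` response field are all assumptions to verify against your BigID instance. The helper names are hypothetical.

```python
import json
import urllib.request

# Assumptions (not confirmed by this page): path prefix and response field name.
REFRESH_PATH = "/api/v1/refresh"
SESSION_TOKEN_FIELD = "systemToken"


def build_refresh_request(environment: str, user_token: str) -> urllib.request.Request:
    """Build the GET request that exchanges a user token for a session token."""
    url = f"https://{environment}{REFRESH_PATH}"
    # Assumption: the user token is sent as-is in the Authorization header.
    return urllib.request.Request(url, headers={"Authorization": user_token}, method="GET")


def extract_session_token(response_body: bytes) -> str:
    """Pull the session token out of the JSON body returned by the refresh call."""
    payload = json.loads(response_body)
    token = payload.get(SESSION_TOKEN_FIELD)
    if not token:
        raise ValueError("no session token found in the refresh response")
    return token


def fetch_session_token(environment: str, user_token: str, timeout: float = 30.0) -> str:
    """Perform the exchange over the network (requires a reachable BigID host)."""
    request = build_refresh_request(environment, user_token)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_session_token(response.read())
```

Keeping request construction and response parsing separate from the network call makes both easy to check without touching a live BigID host.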

Minimum configuration required for basic pulling

Although this collector supports advanced configuration, the fields required to retrieve data with basic configuration are defined below.

This minimum configuration covers only the parameters specific to this integration. Other required parameters relate to the generic behavior of the collector. Check the settings sections for details.

Setting

Details

integration_key

The key (user token) you generated in the Vendor setup section.

environment

The domain name of the cloud where your BigID instance is hosted. The collector uses this value to build the URL for its API calls.
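As a rough illustration of how these two settings fit together, the sketch below checks that both are present and builds the audit URL from the `environment` value and the `/v1/api/audit-logs` endpoint listed under Data sources. The function names are hypothetical, and the assumptions that `environment` is a bare domain and that calls use HTTPS are not confirmed by this page.

```python
from urllib.parse import urlunsplit

REQUIRED_SETTINGS = ("integration_key", "environment")
AUDIT_ENDPOINT = "/v1/api/audit-logs"  # taken from the Data sources table


def missing_settings(settings: dict) -> list[str]:
    """Return the names of required settings that are absent or blank."""
    return [
        name
        for name in REQUIRED_SETTINGS
        if not str(settings.get(name) or "").strip()
    ]


def build_audit_url(environment: str) -> str:
    """Combine the environment domain with the audit endpoint.

    Assumption: environment is a bare domain name and the API is served over HTTPS.
    """
    return urlunsplit(("https", environment.strip(), AUDIT_ENDPOINT, "", ""))
```

For example, `missing_settings({"integration_key": "abc"})` reports that `environment` still needs a value before the collector can build any URL.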

Accepted Authentication Methods

Authentication method | Integration key | Environment
API key | REQUIRED | REQUIRED

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

Events are deduplicated by event ID: the ID returned with each event is stored and checked against on each pull.
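The idea behind ID-based deduplication can be sketched as below. This is an in-memory stand-in for illustration only: the real collector persists its state between pulls, and the `id` field name and the `SeenEventIds` class are assumptions, not the collector's actual implementation.

```python
class SeenEventIds:
    """Tracks event IDs that have already been delivered."""

    def __init__(self, seen=None):
        # In the collector this state would be loaded from persistence.
        self.seen = set(seen or ())

    def filter_new(self, events, id_field="id"):
        """Return only events whose ID has not been seen, recording them as seen."""
        fresh = []
        for event in events:
            event_id = event[id_field]  # assumption: each event carries its ID here
            if event_id in self.seen:
                continue  # duplicate, filtered out
            self.seen.add(event_id)
            fresh.append(event)
        return fresh
```

Because the stored IDs drive the filtering, clearing them makes previously pulled events look new again on the next pull.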

All events are tagged with the service that pulled them.

2024-04-09T09:56:19.561389088Z 2024-04-09T09:56:19.561 WARNING InputProcess::BigIDPullerSetup(BigID,BigID#56752,audit#predefined) -> Testing fetch from /mcm/get-audit.
2024-04-09T09:56:19.771898077Z 2024-04-09T09:56:19.771 INFO OutputProcess::DevoSenderManagerMonitor(lookup_senders,relay_0) -> Number of available senders: 1, sender manager internal queue size: 0
2024-04-09T09:56:19.772167710Z 2024-04-09T09:56:19.771 INFO OutputProcess::DevoSenderManagerMonitor(internal_senders,relay_0) -> Number of available senders: 1, sender manager internal queue size: 0
2024-04-09T09:56:19.772327665Z 2024-04-09T09:56:19.772 INFO OutputProcess::DevoSenderManagerMonitor(lookup_senders,relay_0) -> enqueued_elapsed_times_in_seconds_stats: {}
2024-04-09T09:56:19.772395427Z 2024-04-09T09:56:19.772 INFO OutputProcess::DevoSenderManagerMonitor(internal_senders,relay_0) -> enqueued_elapsed_times_in_seconds_stats: {}
2024-04-09T09:56:19.772425735Z 2024-04-09T09:56:19.772 INFO OutputProcess::DevoSenderManagerMonitor(internal_senders,relay_0) -> Sender: DevoSender(internal_senders,devo_sender_0), status: {"internal_queue_size": 0, "is_connection_open": True}
2024-04-09T09:56:19.772519048Z 2024-04-09T09:56:19.772 INFO OutputProcess::DevoSenderManagerMonitor(lookup_senders,relay_0) -> Sender: DevoSender(lookup_senders,devo_sender_0), status: {"internal_queue_size": 0, "is_connection_open": False}
2024-04-09T09:56:19.772601144Z 2024-04-09T09:56:19.772 INFO OutputProcess::DevoSenderManagerMonitor(internal_senders,relay_0) -> Internal - Total number of messages: 6333 messages/bytes sent since/to "2024-04-09T09:51:19.771169+00:00/2024-04-09T09:56:19.772408+00:00": 25/13762, (elapsed 0.052 seconds)
2024-04-09T09:56:19.772631900Z 2024-04-09T09:56:19.772 INFO OutputProcess::DevoSenderManagerMonitor(lookup_senders,relay_0) -> Lookup - Total number of messages sent: 0, messages sent since "2024-04-09 09:51:19.770969+00:00": 0 (elapsed 0.000 seconds)
2024-04-09T09:56:20.261298432Z 2024-04-09T09:56:20.261 INFO InputProcess::BigIDPullerSetup(BigID,BigID#56752,audit#predefined) -> Successfully tested fetch from /mcm/get-audit. Source is pullable.
2024-04-09T09:56:20.262730750Z 2024-04-09T09:56:20.262 INFO InputProcess::BigIDPullerSetup(BigID,BigID#56752,audit#predefined) -> Setup for module <BigIDPuller> has been successfully executed
2024-04-09T09:56:22.704294203Z 2024-04-09T09:56:22.704 INFO InputProcess::BigIDPuller(BigID,56752,audit,predefined) -> Pull Started
2024-04-09T09:56:23.124016843Z 2024-04-09T09:56:23.123 INFO InputProcess::BigIDPuller(BigID,56752,audit,predefined) -> Updating the persistence
2024-04-09T09:56:23.124811682Z 2024-04-09T09:56:23.124 INFO InputProcess::BigIDPuller(BigID,56752,audit,predefined) -> (Partial) Statistics for this pull cycle (@devo_pulling_id=1712656582704):Number of requests made: 535; Number of events received: 2; Number of duplicated events filtered out: 0; Number of events generated and sent: 2; Average of events per second: 4.759.

To reset the stored event IDs:

  1. Update the unique ID of the collector and restart it. This removes the stored IDs of events that have already been pulled.

  2. Update start_datetime_utc to a new date to avoid as many duplicates as possible.

  3. Note that changing these values multiple times can cause duplication.

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

Error type | Error ID | Error message | Cause | Solution
ApiError | 401, 403 | Failed to fetch data from {endpoint}. Source is not pullable. Exception: {response.text} | Could not connect to the host. | Ensure the endpoint is reachable with the credentials.
SetUpError | 102 | Failed to fetch from {endpoint}. Error: {response.text}. API Error: {response.status_code} | The collector was unable to access the specified endpoint. | Ensure the endpoint is reachable with the credentials.

Collector operations

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Loading configuration using the following files: {"full_config": "config-test-local.yaml", "job_config_loc": null, "collector_config_loc": null}
2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Using the default location for "job_config_loc" file: "/etc/devo/job/job_config.json"
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> "\etc\devo\job" does not exists
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> Using the default location for "collector_config_loc" file: "/etc/devo/collector/collector_config.json"
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> "\etc\devo\collector" does not exists
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> Results of validation of config files parameters: {"config": "C:\git\collectors2\devo-collector-<name>\config\config.yaml", "config_validated": True, "job_config_loc": "/etc/devo/job/job_config.json", "job_config_loc_default": True, "job_config_loc_validated": False, "collector_config_loc": "/etc/devo/collector/collector_config.json", "collector_config_loc_default": True, "collector_config_loc_validated": False}
2023-01-10T15:22:57.171 WARNING MainProcess::MainThread -> [WARNING] Illegal global setting has been ignored -> multiprocessing: False

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues, where all events are injected by the pullers, and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the event delivery module:

2023-01-10T15:23:00.788 INFO OutputProcess::MainThread -> DevoSender(standard_senders,devo_sender_0) -> Starting thread
2023-01-10T15:23:00.789 INFO OutputProcess::MainThread -> DevoSenderManagerMonitor(standard_senders,devo_1) -> Starting thread (every 300 seconds)
2023-01-10T15:23:00.790 INFO OutputProcess::MainThread -> DevoSenderManager(standard_senders,manager,devo_1) -> Starting thread
2023-01-10T15:23:00.842 INFO OutputProcess::MainThread -> global_status: {"output_process": {"process_id": 18804, "process_status": "running", "thread_counter": 21, "thread_names": ["MainThread", "pydevd.Writer", "pydevd.Reader", "pydevd.CommandThread", "pydevd.CheckAliveThread", "DevoSender(standard_senders,devo_sender_0)", "DevoSenderManagerMonitor(standard_senders,devo_1)", "DevoSenderManager(standard_senders,manager,devo_1)", "OutputStandardConsumer(standard_senders_consumer_0)",

Sender services

The Integrations Factory Collector SDK has 3 different sender services depending on the event type to deliver (internal, standard, and lookup). This collector uses the following Sender Services:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

Sender manager internal queue size: 0

Displays the items available in the internal sender queue.

This value helps detect bottlenecks and the need to increase the performance of data delivery to Devo, which can be done by increasing the number of concurrent senders.

Total number of messages sent: 44, messages sent since "2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)

Displays the number of events from the last time the collector executed the pull logic. Following the given example, the following conclusions can be obtained:

  • 44 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00.

  • 21 events were sent to Devo between the last UTC checkpoint and now.

  • Those 21 events required 0.007 seconds to be delivered.

    By default these traces will be shown every 10 minutes.
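The reading of the trace described above can be automated with a small parser. The sketch below is an illustration based only on the trace format shown in this table; the `SenderStats` and `parse_sender_trace` names are hypothetical and not part of the collector.

```python
import re
from dataclasses import dataclass
from datetime import datetime

# Pattern derived from the example trace shown above.
TRACE_PATTERN = re.compile(
    r'Total number of messages sent: (?P<total>\d+), '
    r'messages sent since "(?P<since>[^"]+)": (?P<recent>\d+) '
    r'\(elapsed (?P<elapsed>[\d.]+) seconds\)'
)


@dataclass
class SenderStats:
    total_sent: int              # events sent since the collector started
    checkpoint: datetime         # last checkpoint timestamp
    sent_since_checkpoint: int   # events sent between the checkpoint and now
    elapsed_seconds: float       # time needed to deliver those events


def parse_sender_trace(line: str) -> SenderStats:
    """Extract the four values of a sender statistics trace."""
    match = TRACE_PATTERN.search(line)
    if match is None:
        raise ValueError("not a sender statistics trace")
    return SenderStats(
        total_sent=int(match["total"]),
        checkpoint=datetime.fromisoformat(match["since"]),
        sent_since_checkpoint=int(match["recent"]),
        elapsed_seconds=float(match["elapsed"]),
    )
```

Because the pattern is searched rather than anchored, it also matches traces carrying a service prefix such as `Standard - ` or `Lookup - `.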

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service

Sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Standard - Total number of messages sent: 57, messages sent since "2023-01-10 16:09:16.116750+00:00": 0 (elapsed 0.000 seconds)

Displays the number of events from the last time the collector executed the pull logic. Following the given example, the following conclusions can be obtained:

  • 57 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2023-01-10 16:09:16.116750+00:00.

  • 0 events were sent to Devo between the last UTC checkpoint and now.

  • Those 0 events required 0.000 seconds to be delivered.

To check the memory usage of this collector, look for the following log records in the collector which are displayed every 5 minutes by default, always after running the memory-free process.

  • Used memory is displayed per running process, and the sum of both values gives the total memory used by the collector.

  • The global pressure on the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory, in the form previous -> after.

INFO InputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(34.50MiB -> 34.08MiB), VMS(410.52MiB -> 410.02MiB)
INFO OutputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(28.41MiB -> 28.41MiB), VMS(705.28MiB -> 705.28MiB)
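To make the "sum of both values" concrete, the sketch below parses the [GC] records and adds up the RSS reported after freeing memory for each process. It is an illustration built only from the log format shown above; the function names are hypothetical.

```python
import re

# Pattern derived from the [GC] log records shown above.
GC_PATTERN = re.compile(
    r"(?P<process>\w+)::\w+ -> \[GC\] "
    r"global: (?P<global_before>[\d.]+)% -> (?P<global_after>[\d.]+)%, "
    r"process: RSS\((?P<rss_before>[\d.]+)MiB -> (?P<rss_after>[\d.]+)MiB\), "
    r"VMS\((?P<vms_before>[\d.]+)MiB -> (?P<vms_after>[\d.]+)MiB\)"
)

SAMPLE = (
    "INFO InputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, "
    "process: RSS(34.50MiB -> 34.08MiB), VMS(410.52MiB -> 410.02MiB)\n"
    "INFO OutputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, "
    "process: RSS(28.41MiB -> 28.41MiB), VMS(705.28MiB -> 705.28MiB)"
)


def parse_gc_records(log_text: str) -> list[dict]:
    """Return one dict per [GC] record, with numeric fields converted to float."""
    records = []
    for match in GC_PATTERN.finditer(log_text):
        fields = match.groupdict()
        process = fields.pop("process")
        record = {name: float(value) for name, value in fields.items()}
        record["process"] = process
        records.append(record)
    return records


def total_rss_mib(records: list[dict]) -> float:
    """Sum the RSS after freeing memory across all processes."""
    return sum(record["rss_after"] for record in records)
```

Calling `total_rss_mib(parse_gc_records(SAMPLE))` adds the input and output process values from the example records.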

Change log

Release | Released on | Release type | Recommendations
v1.0.0 | Dec 16, 2024 | NEW FEATURE | Recommended version