Overview

This collector pulls events from the admin activities endpoint of the Box API.

Devo collector features

| Feature | Details |
|---|---|
| Allow parallel downloading (multipod) | not allowed |
| Running environments | collector server, on-premise |
| Populated Devo events | table |
| Flattening preprocessing | no |
| Allowed source events obfuscation | yes |

Data sources

| Data source | API endpoint | Collector service name | Devo table |
|---|---|---|---|
| Admin | https://api.box.com/2.0/events | admin | cloud.box.events.json |
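
As a rough illustration of how the endpoint above might be queried, the following sketch only builds a request URL and makes no network call. The query parameters (`stream_type=admin_logs`, `created_after`, `stream_position`, `limit`) are assumptions taken from Box's public API documentation, not from this collector's source, and `build_admin_events_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Endpoint listed in the data sources table.
BOX_EVENTS_URL = "https://api.box.com/2.0/events"


def build_admin_events_url(created_after=None, stream_position=None, limit=500):
    """Return the URL for one page of admin events (hypothetical helper)."""
    # Assumption: admin activity is exposed through stream_type=admin_logs.
    params = {"stream_type": "admin_logs", "limit": limit}
    if created_after is not None:
        params["created_after"] = created_after
    if stream_position is not None:
        params["stream_position"] = stream_position
    return f"{BOX_EVENTS_URL}?{urlencode(params)}"


url = build_admin_events_url(created_after="2024-04-02T12:35:07Z")
print(url)
```

An actual request would also need an `Authorization: Bearer <token>` header obtained through the JWT flow described in the vendor setup.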

For more information on how the events are parsed, visit our page.

Vendor setup

To set up access to the Box environment, follow the steps below, which are based on the vendor documentation:

Enable the Box API Playground

  1. Navigate to the Developer Console: log in to Box, go to the Developer Console, and select Create New App.

  2. Select application type: select Custom App from the list of application types. A modal will appear prompting a selection for the next step.

  3. Provide basic application information:

| Purpose | Details |
|---|---|
| Automation, Custom Portal | Specify if the app is built by a customer or partner. |
| Integration | Specify the integration category and the external system name, and if the app is built by a customer or partner. |
| Other | Specify the app purpose and if it is built by a customer or partner. |

  4. Select application authentication: select Server Authentication (with JWT) if you would like to verify application identity with a key pair, and confirm with Create App. Once the app is created, the authentication method cannot be changed.

  5. Public and private key pair: once a Custom App is created using Server Authentication with JWT, a key pair can be generated via the Configuration tab within the Developer Console. Alternatively, you can generate your own key pair and supply Box with the public key. Regardless of the method you select, your Box account needs to have 2FA enabled for security purposes.

    1. Generate a keypair (Recommended): if you would like to use a Box-generated keypair, navigate to the Developer Console, where you can generate a configuration file. This file includes a public/private keypair and a number of other application details that are necessary for authentication.

    2. To generate this file, navigate to the Configuration tab of the Developer Console and scroll down to the Add and Manage Public Keys section. Once this step is completed, a file will be downloaded. It contains the information you will copy into your collector in the Self Service Application.

  6. App Authorization: before the application can be used, a Box Admin needs to authorize the application within the Box Admin Console.

Navigate to the Authorization tab for your application within the Developer Console. Your Box Admin will need to authorize the application. Once this is completed, your app is ready to use.

Minimum configuration required for basic pulling

Although this collector supports advanced configuration, the fields required to retrieve data with basic configuration are defined below.

This minimum configuration refers exclusively to those specific parameters of this integration. There are more required parameters related to the generic behavior of the collector. Check setting sections for details.

| Setting | Details |
|---|---|
| client_id | The Box client ID |
| client_secret | The Box client secret |
| public_key_id | The Box public key ID |
| private_key | The Box private key |
| passphrase | The Box passphrase |
| enterprise_id | The Box account enterprise ID |
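
The settings above correspond to values in the configuration file downloaded from the Developer Console. The sketch below shows one way to map that file onto the collector setting names. The JSON key names (`boxAppSettings`, `appAuth`, `clientID`, and so on) are an assumption based on the layout Box documents for its generated JWT configuration file, and both helper functions are hypothetical; the sample values are placeholders.

```python
import json


def load_box_config(path):
    """Read the JSON configuration file downloaded from the Developer Console."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def settings_from_box_config(box_config):
    """Map the Box config layout (assumed) to this collector's setting names."""
    app = box_config["boxAppSettings"]
    auth = app["appAuth"]
    return {
        "client_id": app["clientID"],
        "client_secret": app["clientSecret"],
        "public_key_id": auth["publicKeyID"],
        "private_key": auth["privateKey"],
        "passphrase": auth["passphrase"],
        # Kept as a string so it can later be used as the JWT "sub" claim.
        "enterprise_id": str(box_config["enterpriseID"]),
    }


# Placeholder values only; a real file comes from the Developer Console.
sample_config = {
    "boxAppSettings": {
        "clientID": "example-client-id",
        "clientSecret": "example-client-secret",
        "appAuth": {
            "publicKeyID": "abcd1234",
            "privateKey": "-----BEGIN ENCRYPTED PRIVATE KEY-----\n...",
            "passphrase": "example-passphrase",
        },
    },
    "enterpriseID": "123456",
}

settings = settings_from_box_config(sample_config)
print(sorted(settings))
```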

Accepted Authentication Methods

| Authentication Method | Client ID | Client Secret | Public Key Id | Private Key | Passphrase | Enterprise Id |
|---|---|---|---|---|---|---|
| JWT | Required | Required | Required | Required | Required | Required |

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

 Event duplication

Events are deduplicated by event ID: the IDs returned by the API are stored, and each new pull is checked against them.
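
A minimal sketch of this kind of ID-based deduplication is shown below. It is not the collector's actual implementation: the `event_id` field name is assumed from Box's event objects, `filter_new_events` is a hypothetical helper, and a real collector would persist the seen IDs between runs rather than keep them in memory.

```python
def filter_new_events(events, seen_ids):
    """Return only events whose ID has not been seen; update seen_ids in place."""
    fresh = []
    for event in events:
        event_id = event["event_id"]  # assumed field name
        if event_id in seen_ids:
            continue  # duplicate from an earlier (or the same) pull
        seen_ids.add(event_id)
        fresh.append(event)
    return fresh


seen = set()
first_pull = [{"event_id": "a1"}, {"event_id": "b2"}]
second_pull = [{"event_id": "b2"}, {"event_id": "c3"}]

sent_first = filter_new_events(first_pull, seen)
sent_second = filter_new_events(second_pull, seen)
print(len(sent_first), len(sent_second))
```

In the second pull only the event with ID `c3` is kept, because `b2` was already recorded during the first pull.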

 Devo categorization and destination

All events are tagged according to the service that pulled them.

 Setup/Puller Output
2024-04-02T12:36:10.881042712Z 2024-04-02T12:36:10.880    INFO InputProcess::MainThread -> InputThread(Box,45635) - Starting thread (execution_period=300s)
2024-04-02T12:36:10.900848871Z 2024-04-02T12:36:10.900    INFO InputProcess::MainThread -> ServiceThread(Box,45635,admin,predefined) - Starting thread (execution_period=300s)
2024-04-02T12:36:10.901635871Z 2024-04-02T12:36:10.901    INFO InputProcess::MainThread -> BoxPullerSetup(Box-collector,Box#45635,admin#predefined) -> Starting thread
2024-04-02T12:36:10.902970384Z 2024-04-02T12:36:10.902    INFO InputProcess::MainThread -> BoxPuller(Box,45635,admin,predefined) - Starting thread
2024-04-02T12:36:10.903841384Z 2024-04-02T12:36:10.903 WARNING InputProcess::BoxPuller(Box,45635,admin,predefined) -> Waiting until setup will be executed
2024-04-02T12:36:10.910390935Z 2024-04-02T12:36:10.909 WARNING InputProcess::BoxPullerSetup(Box-collector,Box#45635,admin#predefined) -> The token/header/authentication has not been created yet
2024-04-02T12:36:10.912045728Z 2024-04-02T12:36:10.911    INFO InputProcess::BoxPullerSetup(Box-collector,Box#45635,admin#predefined) -> using base url: https://manage.office.com
2024-04-02T12:36:11.221983503Z 2024-04-02T12:36:11.221    INFO InputProcess::BoxPullerSetup(Box-collector,Box#45635,admin#predefined) -> Setup for module <BoxPuller> has been successfully executed
2024-04-02T12:36:11.906707525Z 2024-04-02T12:36:11.905    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> BoxPuller(Box,45635,admin,predefined) Starting the execution of pre_pull()
2024-04-02T12:36:11.907795456Z 2024-04-02T12:36:11.906    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Reading persisted data
2024-04-02T12:36:11.910462424Z 2024-04-02T12:36:11.909    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Data retrieved from the persistence: {'@persistence_version': 1, 'start_time_in_utc': None, 'last_event_time_in_utc': '2024-04-02 12:35:07'}
2024-04-02T12:36:11.911358075Z 2024-04-02T12:36:11.910    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Start time not found in config, using 2024-04-02 12:35:11
2024-04-02T12:36:11.912847398Z 2024-04-02T12:36:11.911    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Running the persistence upgrade steps
2024-04-02T12:36:11.915154717Z 2024-04-02T12:36:11.913    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Running the persistence corrections steps
2024-04-02T12:36:11.916748235Z 2024-04-02T12:36:11.915    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Running the persistence corrections steps
2024-04-02T12:36:11.918276116Z 2024-04-02T12:36:11.917    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> No changes were detected in the persistence
2024-04-02T12:36:11.919248467Z 2024-04-02T12:36:11.918    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> BoxPuller(Box,45635,admin,predefined) Finalizing the execution of pre_pull()
2024-04-02T12:36:11.920446419Z 2024-04-02T12:36:11.919    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Starting data collection every 60 seconds
2024-04-02T12:36:11.924162570Z 2024-04-02T12:36:11.923    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Pull Started
2024-04-02T12:36:12.045307400Z 2024-04-02T12:36:12.044    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Found 1 removed 0
2024-04-02T12:36:12.221770395Z 2024-04-02T12:36:12.221    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> (Partial) Statistics for this pull cycle (@devo_pulling_id=1712061371905):Number of requests made: 1; Number of events received: 30; Number of duplicated events filtered out: 0; Number of events generated and sent: 30; Average of events per second: 101.027.
2024-04-02T12:36:12.222243522Z 2024-04-02T12:36:12.222    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Statistics for this pull cycle (@devo_pulling_id=1712061371905):Number of requests made: 1; Number of events received: 30; Number of duplicated events filtered out: 0; Number of events generated and sent: 30; Average of events per second: 100.751.
2024-04-02T12:36:12.222631040Z 2024-04-02T12:36:12.222    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> The data is up to date!
2024-04-02T12:36:12.223216005Z 2024-04-02T12:36:12.223    INFO InputProcess::BoxPuller(Box,45635,admin,predefined) -> Data collection completed. Elapsed time: 0.318 seconds. Waiting for 59.682 second(s) until the next one
 Restart the persistence

To restart the persistence, edit the short_id of the collector in the config.

 Troubleshooting

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

Error Type

Error Id

Error Message

Cause

Solution

BoxAPIException

The stack trace will be generated dynamically from the events session method of the box sdk.

JWT token set up

{"error":"invalid_grant","error_description":"Please check the 'sub' claim. The 'sub' specified is invalid."}

Whether the remote source can be pulled depends on how your credentials are set up in the config.

The "sub" claim in the JWT should always be a Box ID. Depending on the value of "box_sub_type", it is either the ID of the user you are generating tokens for or the ID of the enterprise whose service account you are authenticating as. Verify that this value is correct and that you are passing it as a string.
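
To make the "sub" requirement concrete, the sketch below assembles an unsigned claims dictionary and rejects a non-string "sub". The claim names and the token URL are assumptions based on Box's public JWT documentation; `build_box_jwt_claims` is a hypothetical helper and does not sign or send anything.

```python
import secrets
import time

# Assumed audience value for Box JWT assertions.
BOX_TOKEN_URL = "https://api.box.com/oauth2/token"


def build_box_jwt_claims(client_id, sub, box_sub_type="enterprise", now=None):
    """Build the (unsigned) claims for a Box JWT assertion."""
    if not isinstance(sub, str):
        # A numeric enterprise or user ID is a common cause of invalid 'sub' errors.
        raise TypeError("'sub' must be passed as a string")
    if box_sub_type not in ("enterprise", "user"):
        raise ValueError("box_sub_type must be 'enterprise' or 'user'")
    issued = int(time.time()) if now is None else int(now)
    return {
        "iss": client_id,
        "sub": sub,
        "box_sub_type": box_sub_type,
        "aud": BOX_TOKEN_URL,
        "jti": secrets.token_hex(32),  # random unique token identifier
        "exp": issued + 45,  # short-lived assertion
    }


claims = build_box_jwt_claims("example-client-id", "123456", now=1_700_000_000)
print(claims["sub"], claims["box_sub_type"], claims["exp"])
```

Passing `sub=123456` (an integer) raises `TypeError` here, which mirrors the advice above to pass the ID as a string.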

Collector operations

 Operations to verify collector

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

2023-01-10T15:22:57.146    INFO MainProcess::MainThread -> Loading configuration using the following files: {"full_config": "config-test-local.yaml", "job_config_loc": null, "collector_config_loc": null}
2023-01-10T15:22:57.146    INFO MainProcess::MainThread -> Using the default location for "job_config_loc" file: "/etc/devo/job/job_config.json"
2023-01-10T15:22:57.147    INFO MainProcess::MainThread -> "\etc\devo\job" does not exists
2023-01-10T15:22:57.147    INFO MainProcess::MainThread -> Using the default location for "collector_config_loc" file: "/etc/devo/collector/collector_config.json"
2023-01-10T15:22:57.148    INFO MainProcess::MainThread -> "\etc\devo\collector" does not exists
2023-01-10T15:22:57.148    INFO MainProcess::MainThread -> Results of validation of config files parameters: {"config": "C:\git\collectors2\devo-collector-<name>\config\config.yaml", "config_validated": True, "job_config_loc": "/etc/devo/job/job_config.json", "job_config_loc_default": True, "job_config_loc_validated": False, "collector_config_loc": "/etc/devo/collector/collector_config.json", "collector_config_loc_default": True, "collector_config_loc_validated": False}
2023-01-10T15:22:57.171 WARNING MainProcess::MainThread -> [WARNING] Illegal global setting has been ignored -> multiprocessing: False

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method. A successful run has the following output messages for the event delivery module:

2023-01-10T15:23:00.788    INFO OutputProcess::MainThread -> DevoSender(standard_senders,devo_sender_0) -> Starting thread
2023-01-10T15:23:00.789    INFO OutputProcess::MainThread -> DevoSenderManagerMonitor(standard_senders,devo_1) -> Starting thread (every 300 seconds)
2023-01-10T15:23:00.790    INFO OutputProcess::MainThread -> DevoSenderManager(standard_senders,manager,devo_1) -> Starting thread
2023-01-10T15:23:00.842    INFO OutputProcess::MainThread -> global_status: {"output_process": {"process_id": 18804, "process_status": "running", "thread_counter": 21, "thread_names": ["MainThread", "pydevd.Writer", "pydevd.Reader", "pydevd.CommandThread", "pydevd.CheckAliveThread", "DevoSender(standard_senders,devo_sender_0)", "DevoSenderManagerMonitor(standard_senders,devo_1)", "DevoSenderManager(standard_senders,manager,devo_1)", "OutputStandardConsumer(standard_senders_consumer_0)",

Sender services

The Integrations Factory Collector SDK has 3 different sender services depending on the event type to deliver (internal, standard, and lookup). This collector uses the following Sender Services:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

Sender manager internal queue size: 0

Displays the items available in the internal sender queue.

This value helps detect bottlenecks and shows when the performance of data delivery to Devo needs to be increased. This can be done by increasing the number of concurrent senders.

Total number of messages sent: 44, messages sent since "2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)

Displays the number of events from the last time the collector executed the pull logic. Following the given example, the following conclusions can be obtained:

  • 44 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00.

  • 21 events were sent to Devo between the last UTC checkpoint and now.

  • Those 21 events required 0.007 seconds to be delivered.

    By default these traces will be shown every 10 minutes.
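
When monitoring these traces automatically, the counters can be extracted with a regular expression. The sketch below parses the example trace shown above; the pattern is written against that one example, so treat it as an assumption about the format rather than a guaranteed contract, and `parse_sender_trace` is a hypothetical helper.

```python
import re

# Pattern derived from the example trace in this section.
SENDER_TRACE = re.compile(
    r'Total number of messages sent: (?P<total>\d+), '
    r'messages sent since "(?P<checkpoint>[^"]+)": (?P<since>\d+) '
    r'\(elapsed (?P<elapsed>[\d.]+) seconds\)'
)


def parse_sender_trace(line):
    """Return the counters from a sender trace, or None if the line does not match."""
    match = SENDER_TRACE.search(line)
    if match is None:
        return None
    return {
        "total_sent": int(match.group("total")),
        "checkpoint": match.group("checkpoint"),
        "sent_since_checkpoint": int(match.group("since")),
        "elapsed_seconds": float(match.group("elapsed")),
    }


example = (
    'Total number of messages sent: 44, messages sent since '
    '"2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)'
)
stats = parse_sender_trace(example)
print(stats)
```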

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

Sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Standard - Total number of messages sent: 57, messages sent since "2023-01-10 16:09:16.116750+00:00": 0 (elapsed 0.000 seconds)

Displays the number of events from the last time the collector executed the pull logic. Following the given example, the following conclusions can be obtained:

  • 57 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2023-01-10 16:09:16.116750+00:00.

  • 0 events were sent to Devo between the last UTC checkpoint and now.

  • Delivery therefore took 0.000 seconds.

 Check memory usage

To check the memory usage of this collector, look for the following log records in the collector which are displayed every 5 minutes by default, always after running the memory-free process.

  • The used memory is displayed by running processes and the sum of both values will give the total used memory for the collector.

  • The global pressure of the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory (before -> after).

  INFO InputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(34.50MiB -> 34.08MiB), VMS(410.52MiB -> 410.02MiB)
  INFO OutputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(28.41MiB -> 28.41MiB), VMS(705.28MiB -> 705.28MiB)
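
Since the total memory used by the collector is the sum of the per-process values, a small script can add them up from the log. The sketch below parses two `[GC]` records of the form shown in this section and sums the RSS values after freeing; the regular expression is derived from those examples only, and `parse_gc_line` is a hypothetical helper.

```python
import re

# Pattern derived from the [GC] log records shown in this section.
GC_LINE = re.compile(
    r"(?P<process>\w+)::MainThread -> \[GC\] "
    r"global: (?P<global_before>[\d.]+)% -> (?P<global_after>[\d.]+)%, "
    r"process: RSS\((?P<rss_before>[\d.]+)MiB -> (?P<rss_after>[\d.]+)MiB\), "
    r"VMS\((?P<vms_before>[\d.]+)MiB -> (?P<vms_after>[\d.]+)MiB\)"
)


def parse_gc_line(line):
    """Return the process name and memory metrics as floats, or None if no match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    process = fields.pop("process")
    return {"process": process, **{key: float(value) for key, value in fields.items()}}


log_lines = [
    "INFO InputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, "
    "process: RSS(34.50MiB -> 34.08MiB), VMS(410.52MiB -> 410.02MiB)",
    "INFO OutputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, "
    "process: RSS(28.41MiB -> 28.41MiB), VMS(705.28MiB -> 705.28MiB)",
]

records = [parse_gc_line(line) for line in log_lines]
# Total resident memory after freeing, summed over both processes.
total_rss_mib = sum(record["rss_after"] for record in records)
print(f"Total RSS after GC: {total_rss_mib:.2f} MiB")
```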

Change log

| Release | Released on | Release type | Recommendations |
|---|---|---|---|
| v1.0.0 | | NEW FEATURE | Recommended version |