Kiteworks API collector

Overview

Kiteworks APIs provide broad coverage of the platform. The APIs can be categorized into Content, Collaboration, Preferences, Contacts, Security, Clients, and Kiteworks Maintenance APIs.

Devo collector features

Feature

Details

Allow parallel downloading (multipod)

  • not allowed

Running environments

  • collector server

  • on-premise

Populated Devo events

  • table

Flattening preprocessing

  • no

Allowed source events obfuscation

  • Yes

Data sources

Data source

API endpoint

Collector service name

Devo table

Admin

/rest/admin/activities

admin

dmp.kiteworks.admin.event

For more information on how the events are parsed, visit our page.

Vendor setup

Getting Credentials

To obtain credentials, log in to your Kiteworks environment and follow the steps from the vendor documentation, summarized here:

Enable the Kiteworks API Playground

The following steps help you get started with the Kiteworks API Playground. Exploring the Kiteworks APIs requires development experience. To enable the Kiteworks API Playground UI:

  1. In the new Kiteworks Admin console, go to Applications > Client Management > Custom Applications.

  2. Turn on Enable Kiteworks API Playground UI. The Kiteworks Developer Documentation is added to the Help menu.

  3. To view the complete list of APIs, go to the Help (?) menu and click Kiteworks Developer Documentation. The Developer Documentation page opens and lists the library of APIs.

Setting Up Credentials

  1. Create a custom application, ensuring that Signature Authorization is enabled.

  2. Go to the playground at https:///rest/index.html, using your Kiteworks domain as the host.

  3. On the Kiteworks API Documentation toolbar, click the Get a Token button.

  4. In the Request OAuth Token dialog box, select Signature-based Access Token from the grant list.

  5. Fill in the information based on the application you just created in the administrator console.

  6. Test all the API endpoints through the playground.

If you have not already done so, register at https://developer.kiteworks.com.

Minimum configuration required for basic pulling

Although this collector supports advanced configuration, the fields required to retrieve data with basic configuration are defined below.

This minimum configuration covers only the parameters specific to this integration. The collector also requires other parameters related to its generic behavior. Check the settings sections for details.

Setting

Details

client_id

The Kiteworks client ID

client_secret

The Kiteworks client secret

signature_secret

The Kiteworks signature secret

user_email

The Kiteworks user email

base_url

Add your domain to the URL

token_url

Add your domain to the URL
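A quick way to catch an incomplete configuration before starting the collector is to check that the six settings listed above are present and non-empty. The following is a minimal sketch only: the function name `missing_settings` and the flat dictionary layout are assumptions made for illustration, and the collector's real configuration file may nest these values differently.

```python
# Setting names taken from the table above; everything else is illustrative.
REQUIRED_SETTINGS = (
    "client_id",
    "client_secret",
    "signature_secret",
    "user_email",
    "base_url",
    "token_url",
)


def missing_settings(config: dict) -> list:
    """Return the required setting names that are absent or blank in config."""
    missing = []
    for name in REQUIRED_SETTINGS:
        value = config.get(name)
        # Treat None, empty strings, and whitespace-only strings as missing.
        if value is None or not str(value).strip():
            missing.append(name)
    return missing
```

Calling `missing_settings` on a dictionary loaded from your configuration returns an empty list when all six settings are filled in.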

Accepted authentication methods

Authentication Method

Client ID

Client Secret

Signature Secret

User Email

Signature-based Access Token

Required

Required

Required

Required

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

Services (Services)

Internal Process and Deduplication Method

The collector deduplicates events using the IDs that it pulls and stores.

Event Deduplication

Overview

The API is queried by time interval. After each pull, the event IDs are checked against the stored IDs, the new IDs are stored, and only non-duplicate events are sent to Devo.
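The ID-based deduplication described above can be sketched as follows. This is an illustration of the general technique, not the collector's actual implementation: the function name `filter_new_events`, the `id` field name, and the use of an in-memory set in place of the collector's persistence are all assumptions.

```python
def filter_new_events(events, seen_ids, id_field="id"):
    """Return only events whose ID has not been seen, recording new IDs.

    events   -- iterable of dicts pulled for one time interval
    seen_ids -- set of IDs already processed (stands in for the persistence)
    id_field -- hypothetical name of the event ID field
    """
    fresh = []
    for event in events:
        event_id = event[id_field]
        if event_id in seen_ids:
            # Already delivered in this or a previous pull: skip it.
            continue
        seen_ids.add(event_id)
        fresh.append(event)
    return fresh
```

Because `seen_ids` is updated in place, passing the same set to consecutive pulls filters out events that appear in overlapping time intervals.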

Devo Categorization and Destination

Events are tagged according to the service that pulled them.

Setup/Puller Output

2024-04-02T12:36:10.881042712Z 2024-04-02T12:36:10.880 INFO InputProcess::MainThread -> InputThread(kiteworks,45635) - Starting thread (execution_period=300s)
2024-04-02T12:36:10.900848871Z 2024-04-02T12:36:10.900 INFO InputProcess::MainThread -> ServiceThread(kiteworks,45635,admin,predefined) - Starting thread (execution_period=300s)
2024-04-02T12:36:10.901635871Z 2024-04-02T12:36:10.901 INFO InputProcess::MainThread -> ManagementPullerSetup(kiteworks-collector,kiteworks#45635,admin#predefined) -> Starting thread
2024-04-02T12:36:10.902970384Z 2024-04-02T12:36:10.902 INFO InputProcess::MainThread -> ManagementPuller(kiteworks,45635,admin,predefined) - Starting thread
2024-04-02T12:36:10.903841384Z 2024-04-02T12:36:10.903 WARNING InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Waiting until setup will be executed
2024-04-02T12:36:10.910390935Z 2024-04-02T12:36:10.909 WARNING InputProcess::ManagementPullerSetup(kiteworks-collector,kiteworks#45635,admin#predefined) -> The token/header/authentication has not been created yet
2024-04-02T12:36:10.912045728Z 2024-04-02T12:36:10.911 INFO InputProcess::ManagementPullerSetup(kiteworks-collector,kiteworks#45635,admin#predefined) -> using base url: https://manage.office.com
2024-04-02T12:36:11.221983503Z 2024-04-02T12:36:11.221 INFO InputProcess::ManagementPullerSetup(kiteworks-collector,kiteworks#45635,admin#predefined) -> Setup for module <ManagementPuller> has been successfully executed
2024-04-02T12:36:11.906707525Z 2024-04-02T12:36:11.905 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> ManagementPuller(kiteworks,45635,admin,predefined) Starting the execution of pre_pull()
2024-04-02T12:36:11.907795456Z 2024-04-02T12:36:11.906 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Reading persisted data
2024-04-02T12:36:11.910462424Z 2024-04-02T12:36:11.909 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Data retrieved from the persistence: {'@persistence_version': 1, 'start_time_in_utc': None, 'last_event_time_in_utc': '2024-04-02 12:35:07'}
2024-04-02T12:36:11.911358075Z 2024-04-02T12:36:11.910 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Start time not found in config, using 2024-04-02 12:35:11
2024-04-02T12:36:11.912847398Z 2024-04-02T12:36:11.911 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Running the persistence upgrade steps
2024-04-02T12:36:11.915154717Z 2024-04-02T12:36:11.913 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Running the persistence corrections steps
2024-04-02T12:36:11.916748235Z 2024-04-02T12:36:11.915 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Running the persistence corrections steps
2024-04-02T12:36:11.918276116Z 2024-04-02T12:36:11.917 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> No changes were detected in the persistence
2024-04-02T12:36:11.919248467Z 2024-04-02T12:36:11.918 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> ManagementPuller(kiteworks,45635,admin,predefined) Finalizing the execution of pre_pull()
2024-04-02T12:36:11.920446419Z 2024-04-02T12:36:11.919 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Starting data collection every 60 seconds
2024-04-02T12:36:11.924162570Z 2024-04-02T12:36:11.923 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Pull Started
2024-04-02T12:36:12.045307400Z 2024-04-02T12:36:12.044 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Found 1 removed 0
2024-04-02T12:36:12.221770395Z 2024-04-02T12:36:12.221 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> (Partial) Statistics for this pull cycle (@devo_pulling_id=1712061371905):Number of requests made: 1; Number of events received: 30; Number of duplicated events filtered out: 0; Number of events generated and sent: 30; Average of events per second: 101.027.
2024-04-02T12:36:12.222243522Z 2024-04-02T12:36:12.222 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Statistics for this pull cycle (@devo_pulling_id=1712061371905):Number of requests made: 1; Number of events received: 30; Number of duplicated events filtered out: 0; Number of events generated and sent: 30; Average of events per second: 100.751.
2024-04-02T12:36:12.222631040Z 2024-04-02T12:36:12.222 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> The data is up to date!
2024-04-02T12:36:12.223216005Z 2024-04-02T12:36:12.223 INFO InputProcess::ManagementPuller(kiteworks,45635,admin,predefined) -> Data collection completed. Elapsed time: 0.318 seconds. Waiting for 59.682 second(s) until the next one
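When monitoring the puller, the "Statistics for this pull cycle" trace in the output above is the one that reports how many events were received, filtered as duplicates, and sent. The sketch below extracts those numbers from a single log line. The function name `parse_pull_statistics` and the returned key names are assumptions for illustration, and the pattern is written only against the trace wording shown in this sample output.

```python
import re

# Pattern built from the statistics trace shown in the sample output.
STATS_PATTERN = re.compile(
    r"Number of requests made: (?P<requests>\d+); "
    r"Number of events received: (?P<received>\d+); "
    r"Number of duplicated events filtered out: (?P<duplicates>\d+); "
    r"Number of events generated and sent: (?P<sent>\d+); "
    r"Average of events per second: (?P<rate>\d+(?:\.\d+)?)"
)


def parse_pull_statistics(line):
    """Return the pull-cycle counters from a log line, or None if absent."""
    match = STATS_PATTERN.search(line)
    if match is None:
        return None
    stats = {
        key: int(value)
        for key, value in match.groupdict().items()
        if key != "rate"
    }
    stats["rate"] = float(match.group("rate"))
    return stats
```

A line that does not contain the statistics trace returns `None`, so the function can be applied to every line of the collector output.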

Restart the persistence

This collector uses persistent storage to download events in an orderly fashion and avoid duplicates. In case you want to re-ingest historical data or recreate the persistence, you can restart the persistence of this collector by following these steps:

  1. Edit the configuration file.

  2. Change the value of the start_time_in_utc parameter to a different one.

  3. Save the changes.

  4. Restart the collector.

The collector will detect this change and will restart the persistence using the parameters of the configuration file or the default configuration in case it has not been provided.
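Step 2 needs a new value for start_time_in_utc. The sketch below computes one a given number of days in the past. The default format string is an assumption: it mirrors the last_event_time_in_utc value that appears in the persistence trace (for example 2024-04-02 12:35:07), so confirm the format the collector expects for start_time_in_utc before using it. The function name `new_start_time` is also hypothetical.

```python
from datetime import datetime, timedelta, timezone


def new_start_time(days_back, now=None, fmt="%Y-%m-%d %H:%M:%S"):
    """Return a UTC timestamp string days_back days before now.

    fmt is an assumed format copied from the persistence trace; adjust it
    to whatever the collector configuration actually requires.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - timedelta(days=days_back)).strftime(fmt)
```

For example, `new_start_time(7)` yields a value one week before the current UTC time, which you would then paste into the configuration file.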

Collector operations

Verify collector operations

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Loading configuration using the following files: {"full_config": "config-test-local.yaml", "job_config_loc": null, "collector_config_loc": null}
2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Using the default location for "job_config_loc" file: "/etc/devo/job/job_config.json"
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> "\etc\devo\job" does not exists
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> Using the default location for "collector_config_loc" file: "/etc/devo/collector/collector_config.json"
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> "\etc\devo\collector" does not exists
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> Results of validation of config files parameters: {"config": "C:\git\collectors2\devo-collector-<name>\config\config.yaml", "config_validated": True, "job_config_loc": "/etc/devo/job/job_config.json", "job_config_loc_default": True, "job_config_loc_validated": False, "collector_config_loc": "/etc/devo/collector/collector_config.json", "collector_config_loc_default": True, "collector_config_loc_validated": False}
2023-01-10T15:22:57.171 WARNING MainProcess::MainThread -> [WARNING] Illegal global setting has been ignored -> multiprocessing: False

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method. A successful run has the following output messages for the event delivery module:

2023-01-10T15:23:00.788 INFO OutputProcess::MainThread -> DevoSender(standard_senders,devo_sender_0) -> Starting thread
2023-01-10T15:23:00.789 INFO OutputProcess::MainThread -> DevoSenderManagerMonitor(standard_senders,devo_1) -> Starting thread (every 300 seconds)
2023-01-10T15:23:00.790 INFO OutputProcess::MainThread -> DevoSenderManager(standard_senders,manager,devo_1) -> Starting thread
2023-01-10T15:23:00.842 INFO OutputProcess::MainThread -> global_status: {"output_process": {"process_id": 18804, "process_status": "running", "thread_counter": 21, "thread_names": ["MainThread", "pydevd.Writer", "pydevd.Reader", "pydevd.CommandThread", "pydevd.CheckAliveThread", "DevoSender(standard_senders,devo_sender_0)", "DevoSenderManagerMonitor(standard_senders,devo_1)", "DevoSenderManager(standard_senders,manager,devo_1)", "OutputStandardConsumer(standard_senders_consumer_0)",

Sender services

The Integrations Factory Collector SDK has 3 different sender services depending on the event type to deliver (internal, standard, and lookup). This collector uses the following Sender Services:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

Sender manager internal queue size: 0

Displays the items available in the internal sender queue.

This value helps detect bottlenecks and shows when the performance of data delivery to Devo needs to be increased, which can be done by increasing the number of concurrent senders.

Total number of messages sent: 44, messages sent since "2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)

Displays the number of events from the last time the collector executed the pull logic. Following the given example, the following conclusions can be obtained:

  • 44 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00.

  • 21 events were sent to Devo between the last UTC checkpoint and now.

  • Those 21 events required 0.007 seconds to be delivered.

    By default these traces will be shown every 10 minutes.
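The conclusions listed above can also be read programmatically from the "Total number of messages sent" trace. The following sketch assumes the trace keeps exactly the wording of the example; the function name `parse_sender_trace` and the returned key names are hypothetical.

```python
import re

# Pattern built from the example trace shown above.
SENDER_PATTERN = re.compile(
    r'Total number of messages sent: (?P<total>\d+), '
    r'messages sent since "(?P<since>[^"]+)": (?P<recent>\d+) '
    r'\(elapsed (?P<elapsed>\d+(?:\.\d+)?) seconds\)'
)


def parse_sender_trace(line):
    """Return the sender counters found in a log line, or None if absent."""
    match = SENDER_PATTERN.search(line)
    if match is None:
        return None
    return {
        "total": int(match.group("total")),        # sent since the collector started
        "since": match.group("since"),             # last checkpoint timestamp
        "recent": int(match.group("recent")),      # sent since that checkpoint
        "elapsed": float(match.group("elapsed")),  # delivery time in seconds
    }
```

Applied to the example trace, the result holds a total of 44, a recent count of 21, and an elapsed time of 0.007 seconds, matching the bullets above.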

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service

Sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Standard - Total number of messages sent: 57, messages sent since "2023-01-10 16:09:16.116750+00:00": 0 (elapsed 0.000 seconds)

Displays the number of events from the last time the collector executed the pull logic. Following the given example, the following conclusions can be obtained:

  • 57 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2023-01-10 16:09:16.116750+00:00.

  • No events were sent to Devo between the last UTC checkpoint and now.

  • The elapsed delivery time was therefore 0.000 seconds.

Check memory usage

To check the memory usage of this collector, look for the following log records in the collector which are displayed every 5 minutes by default, always after running the memory-free process.

  • The used memory is displayed by running processes and the sum of both values will give the total used memory for the collector.

  • The global pressure of the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory, in the form previous -> after.

Change log

Release

Released on

Release type

Details

Recommendations

v1.0.0

Aug 29, 2024

NEW collector

New collector

Initial version