Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).
On-premise collector
This data collector can run on any machine that has the Docker service available, because it is executed as a Docker container. The following sections explain how to prepare the setup required to run the data collector.
Structure
Create the following directory structure to use when running the collector:
In Devo, go to Administration → Credentials → X.509 Certificates, download the Certificate, Private key and Chain CA and save them in <product_name>/certs/. Learn more about security credentials in Devo here.
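As a rough illustration of the directory layout and certificate location described above, the following Python sketch creates a working directory and reports which credential files are still missing. Only the `devo-collectors/<product_name>/certs/` path and the `chain.crt` file name come from this page; the `config` and `state` subdirectory names, the other file names and both helper names are assumptions made for the example.

```python
from pathlib import Path


def prepare_layout(base_dir, product_name, subdirs=("certs", "config", "state")):
    """Create <base_dir>/devo-collectors/<product_name>/ and its subdirectories.

    The "config" and "state" names are assumptions; only "certs" is taken
    from the documentation.
    """
    root = Path(base_dir) / "devo-collectors" / product_name
    for name in subdirs:
        (root / name).mkdir(parents=True, exist_ok=True)
    return root


def missing_credentials(root, filenames):
    """Return the credential file names that are not yet present in certs/."""
    certs = Path(root) / "certs"
    return [name for name in filenames if not (certs / name).is_file()]
```

Calling `missing_credentials(root, ["chain.crt", "my_domain.crt", "my_domain.key"])` after downloading the files from Devo should return an empty list; the two `my_domain` names are placeholders.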
All defined service entities will be executed by the collector. If you do not want to run any of them, just remove the entity from the services object.
Replace the placeholders with your required values following the description table below:
| Parameter | Data type | Type | Value range | Details |
| --- | --- | --- | --- | --- |
| collector_id | str | Mandatory | Minimum length: 1. Maximum length: 5 | Use this parameter to give a unique ID to this collector. |
| collector_name | str | Mandatory | Minimum length: 1. Maximum length: 10 | Use this parameter to give a valid name to this collector. |
| multiprocessing_mode | bool | Mandatory | false / true | If the value is true, the collector will run using a multiprocessing architecture. If the value is false, the collector will use only one CPU. |
| devo_address | str | Mandatory | collector-us.devo.io or collector-eu.devo.io | Use this parameter to identify the Devo Cloud where the events will be sent. |
| chain_filename | str | Mandatory | Minimum length: 4. Maximum length: 20 | Use this parameter to identify the chain certificate file downloaded from your Devo domain. This file is usually named chain.crt. |
| cert_filename | str | Mandatory | Minimum length: 4. Maximum length: 20 | Use this parameter to identify the certificate file downloaded from your Devo domain. |
| key_filename | str | Mandatory | Minimum length: 4. Maximum length: 20 | Use this parameter to identify the private key file downloaded from your Devo domain. |
| input_id | int | Mandatory | Minimum length: 1. Maximum length: 5 | Use this parameter to give a unique ID to this input service. |
| api_token | str | Mandatory | Any | API token to authenticate to the service. |
| period_in_seconds | int | Mandatory | Minimum length: 1. Recommended value: 60 | This parameter allows you to customize this behavior for each service. As this collector uses websockets, this is the period elapsed between reconnections. Initial time period used when fetching data from the endpoint. |
| override_tag_base | str | Optional | Format: str.str.str.str. Example: mail.proofpoint.pod.maillog | This parameter allows you to override the destination table in Devo. |
| override_url_base | str | Optional | Valid connection URL | This parameter allows you to change the connection URL of the websocket. |
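The constraints in the table above can be checked before starting the collector. The following is a minimal sketch, not part of the collector itself: it assumes the parameters have been gathered into a flat dictionary (the real configuration file may nest them), it treats the "length" of an `int` as its number of digits, and it reads "Minimum length: 1" for `period_in_seconds` as "a positive integer". The function and constant names are hypothetical.

```python
# Length limits taken from the parameter table (applied to str(value)).
LENGTH_LIMITS = {
    "collector_id": (1, 5),
    "collector_name": (1, 10),
    "chain_filename": (4, 20),
    "cert_filename": (4, 20),
    "key_filename": (4, 20),
    "input_id": (1, 5),
}
DEVO_ADDRESSES = {"collector-us.devo.io", "collector-eu.devo.io"}
MANDATORY = [
    "collector_id", "collector_name", "multiprocessing_mode", "devo_address",
    "chain_filename", "cert_filename", "key_filename", "input_id",
    "api_token", "period_in_seconds",
]


def validate_params(params):
    """Return a list of human-readable problems; an empty list means no issue found."""
    errors = []
    for name in MANDATORY:
        if name not in params:
            errors.append(f"{name}: missing mandatory parameter")
    for name, (low, high) in LENGTH_LIMITS.items():
        if name in params and not low <= len(str(params[name])) <= high:
            errors.append(f"{name}: length must be between {low} and {high}")
    if "devo_address" in params and params["devo_address"] not in DEVO_ADDRESSES:
        errors.append("devo_address: unknown Devo Cloud address")
    mode = params.get("multiprocessing_mode")
    if mode is not None and not isinstance(mode, bool):
        errors.append("multiprocessing_mode: must be true or false")
    period = params.get("period_in_seconds")
    if period is not None:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(period, bool) or not isinstance(period, int) or period < 1:
            errors.append("period_in_seconds: must be a positive integer")
    return errors
```

Returning every problem at once, rather than raising on the first one, lets all placeholders be fixed in a single pass.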
Download the Docker image
The collector should be deployed as a Docker container. Download the Docker image of the collector as a .tgz file by clicking the link in the following table:
Once the Docker image is imported, Docker shows the real name of the image (including version information). Replace <image_file> and <version> with the proper values.
The Docker image can be deployed on the following services:
Docker
Execute the following command from the root directory <any_directory>/devo-collectors/<product_name>/:
Replace <product_name>, <image_name> and <version> with the proper values.
Docker Compose
The following Docker Compose file can be used to execute the Docker container. It must be created in the <any_directory>/devo-collectors/<product_name>/ directory.
To run the container using docker-compose, execute the following command from the <any_directory>/devo-collectors/<product_name>/ directory:
IMAGE_VERSION=<version> docker-compose up -d
Note: Replace <product_name>, <image_name> and <version> with the proper values.
Cloud collector
We use a piece of software called Collector Server to host and manage all our available collectors.
To enable the collector for a customer:
1. In the Collector Server GUI, access the domain in which you want this instance to be created.
2. Click Add Collector and find the one you wish to add.
3. In the Version field, select the latest value.
4. In the Collector Name field, set the value you prefer (this name must be unique inside the same Collector Server domain).
5. In the Sending method field, select Direct Send. Direct Send configuration is optional for collectors that create Table events, but mandatory for those that create Lookups.
6. In the Parameters section, set the collector parameters as follows:
All defined service entities will be executed by the collector. If you do not want to run any of them, just remove the entity from the services object.
Replace the placeholders with your required values following the description table below:
| Parameter | Data type | Type | Value range / Format | Details |
| --- | --- | --- | --- | --- |
| input_id | int | Mandatory | Minimum length: 1. Maximum length: 5 | Use this parameter to give a unique ID to this input service. |
| enabled | bool | Mandatory | false / true | If the value is true, the input definition will be executed. If the value is false, the service will be ignored. |
| cluster_id | str | Mandatory | Any | Cluster ID to get data from. |
| api_key | str | Mandatory | Any | API key to authenticate to the service. |
| request_period_in_seconds | int | Mandatory | Minimum length: 1 | This parameter allows you to customize this behavior for each service. Initial time period used when fetching data from the endpoint. Note: Due to the large amount of data produced by this service, using this parameter is discouraged except in special cases. This parameter can be left blank, removed or commented out. |
| override_tag_base | str | Optional | Format: str.str.str.str. Example: mail.proofpoint.pod.maillog | This parameter allows you to override the destination table in Devo. |
| override_url_base | str | Optional | Valid connection URL | This parameter allows you to change the connection URL of the websocket. |
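The two optional override parameters have a fixed shape that is easy to check before submitting them. The sketch below is illustrative only: it assumes that `str.str.str.str` means exactly four non-empty, dot-separated parts, and that a "valid connection URL" for a websocket uses the `ws` or `wss` scheme. Both function names are hypothetical.

```python
from urllib.parse import urlparse


def is_valid_tag_base(tag):
    """True if tag has exactly four non-empty parts separated by dots."""
    parts = tag.split(".")
    return len(parts) == 4 and all(part != "" for part in parts)


def is_valid_ws_url(url):
    """True if url uses the ws/wss scheme and names a host (assumed rule)."""
    parsed = urlparse(url)
    return parsed.scheme in ("ws", "wss") and parsed.netloc != ""
```

For example, `is_valid_tag_base("mail.proofpoint.pod.maillog")` accepts the tag shown in the table, while a three-part tag such as `mail.proofpoint.pod` is rejected.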
API limits and duplicates
Number of connections
The number of connections that can be opened with one set of credentials is limited, and the credentials cannot pull data from more sources than those defined for them. If the access token is already in use by another session, the API returns a 409 error: Exceeded maximum number of sessions per token.
The collector logs will show this error:
2024-08-09T10:48:10.163 ERROR InputProcess::ProofpointOnDemandWSSPuller(proofpoint_on_demand,12345,message,predefined) -> Handshake status 409 Conflict -+-+- {'date': 'Fri, 09 Aug 2024 08:48:10 GMT', 'content-type': 'text/plain;charset=iso-8859-1', 'content-length': '47'} -+-+- b'Exceeded maximum number of sessions per token\r\n' - goodbye
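When monitoring the collector output, this condition can be picked out automatically. The following sketch scans a log line for the handshake status and the message text shown above; the function name is hypothetical and the sample line is an abbreviated version of the real one.

```python
import re

# Matches the "Handshake status <code>" fragment seen in the collector log.
HANDSHAKE_RE = re.compile(r"Handshake status (\d{3})")
SESSION_LIMIT_TEXT = "Exceeded maximum number of sessions per token"


def is_session_limit_error(log_line):
    """True if the line reports a 409 handshake caused by the session limit."""
    match = HANDSHAKE_RE.search(log_line)
    return (
        match is not None
        and match.group(1) == "409"
        and SESSION_LIMIT_TEXT in log_line
    )


# Abbreviated sample based on the log line above.
sample = (
    "2024-08-09T10:48:10.163 ERROR InputProcess -> Handshake status 409 Conflict "
    "-+-+- b'Exceeded maximum number of sessions per token' - goodbye"
)
print(is_session_limit_error(sample))
```

A match usually means that another session is already using the same token, so that session has to be closed before the collector can connect.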
Duplicated events
The API has been observed to send duplicate events occasionally. The collector filters out duplicate events that arrive within one hour of each other.
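To make the idea of a one-hour duplicate window concrete, here is a minimal sketch of time-window deduplication. It is not the collector's actual implementation: it assumes each event carries some unique identifier (a hypothetical `event_id`), and the class and method names are invented for the example.

```python
class DuplicateFilter:
    """Remember event ids for a fixed time window and reject repeats."""

    def __init__(self, window_seconds=3600):
        self.window_seconds = window_seconds
        self._seen = {}  # event id -> time (in seconds) it was first seen

    def is_new(self, event_id, now):
        """Return True the first time an id is seen within the window."""
        # Forget ids that have aged out of the window to bound memory use.
        cutoff = now - self.window_seconds
        expired = [key for key, seen_at in self._seen.items() if seen_at <= cutoff]
        for key in expired:
            del self._seen[key]
        if event_id in self._seen:
            return False
        self._seen[event_id] = now
        return True
```

A shorter window keeps less state in memory but lets duplicates through if they arrive further apart than the window.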
Change log

v1.2.2
Release type: improvement
Recommendations: Recommended version
Details:
Improvements
- Upgrade DCSDK from v1.12.2 to v1.12.4
- Improve controlled stop when InputProcess is killed
- Change internal queue management to protect against OOMK
- Extract ModuleThread structure from PullerAbstract
- Improve controlled stop when both processes fail to instantiate
- Fix an error caused by a ValueError exception that was not handled properly
- Fix an error related to the loss of some values in internal messages (collector_name, collector_id and job_id)

v1.2.1
Release type: BUG FIXING
Recommendations: Update
Details:
Bug fixing
- Downgrade DCSDK from v1.12.3 to v1.12.2

v1.2.0
Release type: improvement, BUG FIXING
Recommendations: Update
Details:
Improvements
- Upgrade DCSDK from v1.12.2 to v1.12.3
Bug fixing
- Reduce in-memory cached data to avoid memory issues

v1.1.0
Release type: NEW FEATURE, improvement, BUG FIXING
Recommendations: Update
Details:
New features
- New parameters override_tag_base and override_url_base for config.yaml
- Parametrize timestamp_field and datetime_format in collector_definitions.yaml
- Reduce memory usage by changing time_window_hours to 1
- Send messages and flush the ProcessingLayer's cache immediately on connection close
- Optimize ProcessingLayer's performance
- Detect when start_time has been changed and use it instead of persisted data (creates persistence v2)
- Automatic migration of the persistence data structure from v1 to v2
- Adapt unit tests to new functionalities
- Mock web-socket server with Proofpoint POD API specifics for integration tests without credentials:
  - Rounding down sinceTime param to the nearest hour
  - Some events coming unsorted
Bug fixing
- High CPU usage caused by a wait mechanism not working correctly
- Reduce persisted data size, which was causing memory issues (INT-2562, INT-2489)
- Improved duplicate filtering (2509)
Improvements
- Upgrade DCSDK from v1.12.1 to v1.12.2
- Upgrade DevoSDK dependency to version v5.4.0

v1.0.1
Release type: BUG FIXING
Recommendations: Upgrade
Details:
Bug fixing
- Added a reset mechanism for stats counters to prevent them from growing indefinitely