TAXII collector

Overview

Trusted Automated Exchange of Intelligence Information (TAXII™) is an application protocol for exchanging cyber threat intelligence (CTI) over HTTPS. TAXII defines a RESTful API (a set of services and message exchanges) and a set of requirements for TAXII Clients and Servers.

Devo collector features

Feature

Details

Allow parallel downloading (multipod)

not allowed

Running environments

  • collector server

  • on-premise

Populated Devo events

table

Flattening preprocessing

no

Data sources

Data source

Description

API endpoint

Collector service name

Available from release

Objects and Manifests

STIX object data and object metadata (manifests)

collection.get_objects()

collection.get_manifests()

objects_and_manifests

v1.0.0

For more information on how the events are parsed, visit our page.
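The objects and manifests listed in the table above are served by two per-collection endpoints in the TAXII 2.x specification. The following is a minimal sketch of how those URLs are laid out. The helper name and the API root host are hypothetical, and the collection ID is simply reused from the sample puller logs on this page; this is not necessarily how the collector builds its requests.

```python
from urllib.parse import urljoin


def collection_endpoints(api_root: str, collection_id: str) -> dict:
    """Return the objects and manifest URLs for one TAXII 2.x collection.

    The path layout (collections/<id>/objects/ and collections/<id>/manifest/)
    follows the TAXII specification; the helper itself is hypothetical.
    """
    # urljoin only appends to the base when the base ends with a slash.
    base = api_root if api_root.endswith("/") else api_root + "/"
    collection_url = urljoin(base, f"collections/{collection_id}/")
    return {
        "objects": urljoin(collection_url, "objects/"),
        "manifest": urljoin(collection_url, "manifest/"),
    }


# Hypothetical API root, used only for illustration.
urls = collection_endpoints(
    "https://taxii.example.com/api1",
    "8eafbfb4-6213-4ff8-9de4-978aa5fdc59f",
)
print(urls["objects"])
print(urls["manifest"])
```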

Flattening preprocessing

Data source

Collector service

Optional

Flattening details

Objects and Manifests

objects_and_manifests

no

No flattening

Vendor setup

Several vendors offer TAXII servers; this guide uses Cyware as an example. You’ll need:

  • Access to the Cyware TAXII server, or to any TAXII server that supports version 2.0 or 2.1

  • The discovery URL of the TAXII server and the credentials required to retrieve the data

Actions

Screenshots

Create a Cyware account.

-

Click on Cyware threat intel feed from the profile dropdown in the top right.

 

Click on Grant access. You will see the credentials on the next screen. Save the credentials, as they will be shown only once. 

Minimum configuration required for basic pulling

Important

Before using the collector, find out which version of the TAXII protocol the server uses. The collector can pull data from versions 2.0 and 2.1.

Some servers offer data in both versions through different URLs. Check that the URL corresponds to the protocol version that the collector expects. Although this collector supports advanced configuration, the fields required to retrieve data with a basic configuration are defined below.

This minimum configuration refers exclusively to those specific parameters of this integration. There are more required parameters related to the generic behavior of the collector. Check setting sections for details.

Setting

Details

discovery_url

the Discovery URL of the TAXII server

username

credential username

password

credential password

taxii_version

2.0 or 2.1

See the Accepted authentication methods section to verify what settings are required based on the desired authentication method.
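Before launching the collector, it can help to sanity-check the four settings listed above. The sketch below is one possible pre-flight check, not the collector's own validation. It assumes the settings have already been loaded into a plain dictionary; the function name and the example values are hypothetical.

```python
from urllib.parse import urlparse

REQUIRED_SETTINGS = ("discovery_url", "username", "password", "taxii_version")
SUPPORTED_VERSIONS = ("2.0", "2.1")


def check_minimum_config(settings: dict) -> list:
    """Return a list of problems; an empty list means the basic settings look usable."""
    problems = []
    for name in REQUIRED_SETTINGS:
        if not settings.get(name):
            problems.append(f"missing setting: {name}")
    # YAML may load an unquoted 2.1 as a float, so compare on the string form.
    version = str(settings.get("taxii_version") or "")
    if version and version not in SUPPORTED_VERSIONS:
        problems.append(f"unsupported taxii_version: {version}")
    url = settings.get("discovery_url") or ""
    if url and urlparse(url).scheme not in ("http", "https"):
        problems.append("discovery_url must be an HTTP(S) URL")
    return problems


example = {
    "discovery_url": "https://taxii.example.com/taxii2/",  # hypothetical URL
    "username": "my-user",
    "password": "my-password",
    "taxii_version": 2.1,
}
print(check_minimum_config(example))
```

Returning a list of problems rather than raising on the first one makes it possible to report every missing field in a single pass.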

Accepted authentication methods

Depending on how you obtained your credentials, you will have to either fill in or delete the following properties in the JSON/YAML credentials configuration block.

Authentication method

Username

Password

Username/Password

Required

Required
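To illustrate what a username/password pair and a protocol version translate to on the wire, here is a minimal sketch of the request headers a TAXII client might send. It assumes the credentials are sent as HTTP Basic authentication, which is common for TAXII servers but is not confirmed by this page for this collector. The media types are those defined by the TAXII 2.0 and 2.1 specifications; the function name is hypothetical.

```python
import base64

# Media types defined by the TAXII 2.0 and 2.1 specifications.
TAXII_ACCEPT = {
    "2.0": "application/vnd.oasis.taxii+json; version=2.0",
    "2.1": "application/taxii+json;version=2.1",
}


def build_request_headers(username: str, password: str, taxii_version: str) -> dict:
    """Build Authorization and Accept headers for a TAXII request.

    Assumption: the server accepts HTTP Basic authentication.
    Raises KeyError if taxii_version is neither "2.0" nor "2.1".
    """
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return {
        "Authorization": f"Basic {token}",
        "Accept": TAXII_ACCEPT[taxii_version],
    }


headers = build_request_headers("user", "pass", "2.1")
print(headers["Accept"])
```

Because the two protocol versions use different media types, a URL that serves version 2.0 will normally reject a request whose Accept header asks for version 2.1, and vice versa.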

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

Events service

Once the collector has been launched, it is important to check that ingestion is working properly. To do so, go to the collector’s logs console.

This service has the following components:

Component

Description

Setup

The setup module is in charge of authenticating the service and managing the token expiration when needed.

Puller

The puller module is in charge of pulling the data in an organized way and delivering the events via SDK.

Setup output

A successful run has the following output messages for the setup module:

INFO MainThread -> [SETUP] ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) - Starting thread
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Puller Setup Started
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> We do not have a token. Getting a new one from the server.
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Attempting to get OAuth2 token from ThreatQuotient server....
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Attempting to get from client_id ThreatQuotient server....
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Successfully received a client_id token from (...)/assets/js/config.js
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Successfully received JWT token from (...) which expires in 3599 seconds
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Puller Setup Terminated
INFO ThreatQuotientDataPullerSetup(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Setup for module "ThreatQuotientDataPuller" has been successfully executed

Puller output

A successful initial run has the following output messages for the puller module:

INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Pull Started
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Total number of collections: 41
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Polling for collection id: 1ae57e3d-810c-450c-a97f-eb60b63c896c
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Fetching objects data for collection 1ae57e3d-810c-450c-a97f-eb60b63c896c
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Fetched 0 objects for collection 1ae57e3d-810c-450c-a97f-eb60b63c896c
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Polling for collection id: 694bc05f-9568-4738-ac66-7c3fb119ff75
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Fetching objects data for collection 694bc05f-9568-4738-ac66-7c3fb119ff75
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Fetched 0 objects for collection 694bc05f-9568-4738-ac66-7c3fb119ff75
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Polling for collection id: 8eafbfb4-6213-4ff8-9de4-978aa5fdc59f
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Fetching manifests data for collection 8eafbfb4-6213-4ff8-9de4-978aa5fdc59f
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Fetched 30 manifests for collection 8eafbfb4-6213-4ff8-9de4-978aa5fdc59f
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> sent 30 manifests to Devo for collection id 8eafbfb4-6213-4ff8-9de4-978aa5fdc59f
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Statistics for this pull cycle (@devo_pulling_id=1669727732252):Number of requests made: 41; Number of events received: 30; Number of duplicated events filtered out: 30; Number of events generated and sent: 0; Average of events per second: 0.000.
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> The data is up to date!
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Data collection completed. Elapsed time: 29.370 seconds. Waiting for 1170.630 second(s)

After a successful collector execution (that is, with no error logs), you will see the following log messages:

INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> Statistics for this pull cycle (@devo_pulling_id=1669727732252):Number of requests made: 41; Number of events received: 30; Number of duplicated events filtered out: 30; Number of events generated and sent: 0; Average of events per second: 0.000.
INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> The data is up to date!
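When monitoring many pull cycles, it can be convenient to extract the counters from the statistics trace instead of reading them by eye. The following is a rough sketch of one way to do that, assuming the trace keeps exactly the layout shown in the sample above; the function name is hypothetical and this parser is not part of the collector.

```python
def parse_pull_statistics(line: str) -> dict:
    """Extract the counters from a single 'Statistics for this pull cycle' trace."""
    if "Statistics for this pull cycle" not in line:
        raise ValueError("not a pull statistics line")
    # The counters start right after the "(@devo_pulling_id=...):" prefix.
    _, _, counters = line.partition("):")
    stats = {}
    for item in counters.split(";"):
        name, _, value = item.strip().rpartition(": ")
        if name:
            # The last counter ends with a full stop ("0.000."), so strip it.
            stats[name] = float(value.rstrip("."))
    return stats


sample = (
    "INFO InputProcess::TaxiiPuller(taxii_collector,12345,objects_and_manifests,predefined) -> "
    "Statistics for this pull cycle (@devo_pulling_id=1669727732252):"
    "Number of requests made: 41; Number of events received: 30; "
    "Number of duplicated events filtered out: 30; "
    "Number of events generated and sent: 0; Average of events per second: 0.000."
)
stats = parse_pull_statistics(sample)
# All received events were duplicates, so nothing new was sent in this cycle.
print(stats["Number of events received"] - stats["Number of duplicated events filtered out"])
```

In the sample, the 30 received events were all filtered out as duplicates, which is consistent with the "The data is up to date!" message that follows.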

This collector uses persistent storage to download events in an orderly fashion and avoid duplicates. In case you want to re-ingest historical data or recreate the persistence, you can restart the persistence of this collector by following these steps:

  1. Edit the configuration file.

  2. Change the value of the historical_date_utc parameter to a different one.

  3. Save the changes.

  4. Restart the collector.

The collector will detect this change and will restart the persistence using the parameters of the configuration file or the default configuration in case it has not been provided.

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

Error type

Error ID

Error message

Cause

Solution

InitVariableError

1

Invalid value is provided for the historic_date.

historic_date_utc is given in the wrong datetime format.

Write historic_date_utc in the correct format, for example 2022-11-15T14:32:33Z

InitVariableError

2

Time format for historic date must be "%Y-%m-%dT%H:%M:%SZ"

historic_date_utc is given in the wrong datetime format.

Write historic_date_utc in the correct format, for example 2022-11-15T14:32:33Z

InitVariableError

3

historic datetime cannot be greater than the present UTC time

The given historic datetime is in the future.

The historic datetime must always be earlier than the current datetime.

SetupError

100

The remote data is not pullable with the given credentials. Check the error traces for details

The user credentials used in the config do not have permission to fetch logs.

Give the user permission to access the logs.

PullError

301

The username or the password provided in the config.yaml is incorrect

The username or the password provided in the config.yaml is incorrect

Recheck the values that are provided in the config file

PullError

302

The credentials does not have required permissions to fetch data from TAXII server

The credentials do not have the required permissions to fetch the data.

Check if the user credentials have enough permissions to fetch the data

 

PullError

303

The Discovery URL provided in the config.yaml is incorrect.

The discovery URL provided is incorrect.

Check that the correct discovery URL is provided in the config.

PullError

304

Internal server error from TAXII server.

Something went wrong on the server.

Try again after some time.
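The three InitVariableError rows in the table above come down to two rules: the historic date must match "%Y-%m-%dT%H:%M:%SZ", and it must not be in the future. A minimal sketch of an equivalent check that can be run before editing the configuration is shown below. The function name is hypothetical and the collector's internal validation may differ; only the format string and the error messages are taken from the table.

```python
from datetime import datetime, timezone
from typing import Optional

HISTORIC_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def check_historic_date(value: str, now: Optional[datetime] = None) -> datetime:
    """Parse a historic date string and reject malformed or future values."""
    try:
        parsed = datetime.strptime(value, HISTORIC_DATE_FORMAT)
    except ValueError as exc:
        raise ValueError(
            f'Time format for historic date must be "{HISTORIC_DATE_FORMAT}"'
        ) from exc
    # The trailing Z means UTC, so attach the UTC timezone explicitly.
    parsed = parsed.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    if parsed > now:
        raise ValueError("historic datetime cannot be greater than the present UTC time")
    return parsed


print(check_historic_date("2022-11-15T14:32:33Z").isoformat())
```

The optional now argument exists only to make the check easy to test with a fixed reference time.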

Collector operations

This section is intended to explain how to proceed with specific operations of this collector.

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the initializer module:

Sender services

The Integrations Factory Collector SDK has three different sender services, depending on the type of event to deliver (internal, standard, and lookup). This collector uses the following sender services:

Sender services

Description

internal_senders

In charge of delivering internal metrics to Devo such as logging traces or metrics.

standard_senders

In charge of delivering pulled events to Devo.

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Total number of messages sent: 56, messages sent since "2023-02-14 12:31:36.678120+00:00": 56 (elapsed 0.389 seconds)

Displays the number of events sent since the last checkpoint. From the given example, the following conclusions can be drawn:

  • 56 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2023-02-14 12:31:36.678120+00:00.

  • 56 events were sent to Devo between the last UTC checkpoint and now.

  • Those 56 events required 0.389 seconds to be delivered.
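The four conclusions above can also be read programmatically from the trace. The sketch below is one possible parser, assuming the trace keeps exactly the wording of the example; the function name and the dictionary keys are hypothetical.

```python
import re
from datetime import datetime

SENDER_STATS = re.compile(
    r"Total number of messages sent: (?P<total>\d+), "
    r'messages sent since "(?P<since>[^"]+)": (?P<recent>\d+) '
    r"\(elapsed (?P<elapsed>[\d.]+) seconds\)"
)


def parse_sender_statistics(trace: str) -> dict:
    """Split a sender statistics trace into its four values."""
    match = SENDER_STATS.search(trace)
    if match is None:
        raise ValueError("not a sender statistics trace")
    return {
        "total_sent": int(match["total"]),
        "checkpoint": datetime.fromisoformat(match["since"]),
        "sent_since_checkpoint": int(match["recent"]),
        "elapsed_seconds": float(match["elapsed"]),
    }


trace = (
    'Total number of messages sent: 56, messages sent since '
    '"2023-02-14 12:31:36.678120+00:00": 56 (elapsed 0.389 seconds)'
)
sender_stats = parse_sender_statistics(trace)
print(sender_stats["total_sent"], sender_stats["elapsed_seconds"])
```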

To check the memory usage of this collector, look for the following log records in the collector which are displayed every 5 minutes by default, always after running the memory-free process.

  • The used memory is displayed by running processes and the sum of both values will give the total used memory for the collector.

  • The global pressure of the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory, in the form previous -> after.

Change log

Release

Released on

Release type

Details

Recommendations

v1.1.0

Apr 15, 2024

IMPROVEMENTS

BUG FIXES

Upgraded DCSDK from 1.8.0 to 1.11.1:

  • Refactored Collector Definitions to include metadata and some service specific json schemas

  • Refactored Base puller to find the right collector variables

  • Refactored the abstract methods, removing unused ones and updating old ones to match Template1 puller

  • Updated DevoSDK to v5.1.9

  • Fixed some bugs related to development on MacOS

  • Added an extra validation and fix when the DCSDK receives a wrong timestamp format

  • Added an optional config property to use the Syslog timestamp format in a strict way

  • Updated DevoSDK to v5.1.10

  • Fix for SyslogSender related to UTF-8

  • Enhanced troubleshooting: traces have been standardized and some new traces have been introduced.

  • Introduced a mechanism to detect "Out of Memory killer" situation

Bug Fixes

  • Fixed the issue with the historic UTC time, which was causing errors in API calls

Recommended version

v1.0.0

Jun 22, 2023

NEW FEATURE

-

Initial release