Recorded Future collector

Configuration requirements

Before running this collector, review the configuration requirements detailed below.

Configuration

Details

Recorded Future API key

Generate your API token.

Refer to the Vendor setup section to know more about these configurations.

Overview

Recorded Future is a threat intelligence provider that gives you access to known-bad indicators of compromise and entity enrichment capabilities. It has six modules and charges on a per-user basis for access to the product. The six modules are:

  • Security Operations: Providing intel into SIEM / SOAR platforms.

  • Standalone Threat Intelligence: An extension of Security Operations providing context and enrichment of known and emerging threats/incidents.

  • Brand Intelligence: Monitoring an organization’s external exposure.

  • Vulnerability Management: Intelligence into the prioritization of threats.

  • Third party intelligence: Data from third-party sources.

  • Geo-political: More focussed on nation-state attacks and threat indicators.

Recorded Future also charges customers for each integration they use. For example, a mutual customer of Recorded Future and Devo using this integration will pay Recorded Future a subscription fee.

Devo collector features

Feature

Details

Allow parallel downloading (multipod)

  • Not allowed

Running environments

  • Collector server

  • On-premise

Populated Devo events

  • Lookups

Flattening Preprocessing

  • No

Data sources

Data Source

Description

API Endpoint

Collector service name

Devo Table

Available from release

IpAddressLookupPuller

This endpoint provides a list of IPs classified as Threat by Recorded Future.

https://api.recordedfuture.com/v2/ip/risklist?format=csv%2Fsplunk

lookup_puller

type: ip

Lookup

my.lookuplist.Recorded_Future_IPv4_Address_Threat_List

my.lookuplist.Recorded_Future_IPv6_Address_Threat_List

v1.0.0

DomainLookupPuller

This endpoint provides a list of domains classified as Threat by Recorded Future.

https://api.recordedfuture.com/v2/domain/risklist?format=csv%2Fsplunk

lookup_puller

type: domain

Lookup

my.lookuplist.Recorded_Future_Domain_Threat_List

v1.0.0

FileHashLookupPuller

This endpoint returns a list of file hashes classified as Threat by Recorded Future.

https://api.recordedfuture.com/v2/hash/risklist?format=csv%2Fsplunk

lookup_puller

type: hash

Lookup

my.lookuplist.Recorded_Future_File_Hash_Threat_List

v1.0.0

UrlLookupPuller

This endpoint returns a list of URLs classified as Threat by Recorded Future.

https://api.recordedfuture.com/v2/url/risklist?format=csv%2Fsplunk

lookup_puller

type: url

Lookup

my.lookuplist.Recorded_Future_URL_Threat_List

v1.0.0

VulnerabilityLookupPuller

This endpoint returns a list of vulnerabilities classified as Threat by Recorded Future.

https://api.recordedfuture.com/v2/vulnerability/risklist?format=csv%2Fsplunk

lookup_puller

type: vulnerability

Lookup

my.lookuplist.Recorded_Future_Vulnerability_Threat_List

v1.0.0

PublicUkraineRussiaIpsLookupPuller

This endpoint returns a list of IPs related to Russia and Ukraine.

https://api.recordedfuture.com/v2/fusion/files/?path=/public/ukraine/ukraine_russia_ip.csv

lookup_puller

type: PublicUkraineRussiaIps

Lookup

my.lookuplist.Recorded_Future_IPv4_Public_Ukranie_Russia_List

my.lookuplist.Recorded_Future_IPv6_Public_Ukranie_Russia_List

v1.2.0
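The five risk list endpoints in the table above share one URL pattern. As a minimal sketch (not the collector's own code), the following Python builds the request URL for each entity type. The helper name `risklist_url` and the constant names are hypothetical; only the base URL, the entity types, and the `format=csv%2Fsplunk` query string come from the table.

```python
from urllib.parse import urlencode

# Base URL and entity types taken from the data sources table.
BASE_URL = "https://api.recordedfuture.com/v2"
ENTITY_TYPES = ("ip", "domain", "hash", "url", "vulnerability")


def risklist_url(entity_type: str, base_url: str = BASE_URL) -> str:
    """Build the risk list URL for one entity type (hypothetical helper)."""
    if entity_type not in ENTITY_TYPES:
        raise ValueError(f"unknown entity type: {entity_type!r}")
    # urlencode percent-encodes the slash, producing format=csv%2Fsplunk.
    query = urlencode({"format": "csv/splunk"})
    return f"{base_url}/{entity_type}/risklist?{query}"


for entity in ENTITY_TYPES:
    print(risklist_url(entity))
```

The PublicUkraineRussiaIps source is left out on purpose because it uses a different path (`/fusion/files/`) and does not follow this pattern.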

Vendor setup

There are some minimal requirements to enable this collector:

  • Log in to Recorded Future

  • Create a new token

Action

Steps

Log in to Recorded Future

Create a new token

  • Click on Menu and select the option User Settings.

  • Select the API Access tab.

  • To create a new API token, click on Generate New API Token.

  • Enter a name for the token.

  • Select Devo from the integration list.

  • Click on the Generate new API token button.  

  • Make a note of the token value, as this is required for the Ingest Configuration.

Minimum configuration required for basic pulling

Although this collector supports advanced configuration, the fields required to retrieve data with basic configuration are defined below.

This minimum configuration refers exclusively to those specific parameters of this integration. There are more required parameters related to the generic behavior of the collector. Check setting sections for details.

Setting

Details

url_value

This param refers to the endpoint used by the collector to pull data.

api_token_value

This is the access token provided by Recorded Future.

list_of_sources

This configuration allows you to define what data sources will be pulled.
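As an illustration of the three required settings, the sketch below checks that a configuration mapping contains all of them. This is not how the collector validates its configuration; the function name `missing_settings`, the placeholder values, and the assumed shape of `list_of_sources` (a list of source names) are hypothetical.

```python
# Setting names taken from the table above.
REQUIRED_SETTINGS = ("url_value", "api_token_value", "list_of_sources")


def missing_settings(config: dict) -> list:
    """Return the required settings that are absent or empty."""
    return [name for name in REQUIRED_SETTINGS if not config.get(name)]


# Placeholder values only; the real values depend on your deployment.
example = {
    "url_value": "https://api.recordedfuture.com/v2",
    "api_token_value": "<your-api-token>",
    "list_of_sources": ["ip", "domain"],
}

print(missing_settings(example))
print(missing_settings({"url_value": "https://api.recordedfuture.com/v2"}))
```

An empty result means every integration-specific setting is present. The generic collector parameters are not covered by this check.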

Accepted authentication methods

Authentication Method

Token

Token

REQUIRED

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

Lookup puller service

The only service available in this collector is lookup_puller. It contains 6 different pullers:

  • IpAddressLookupPuller

  • FileHashLookupPuller

  • DomainLookupPuller

  • UrlLookupPuller

  • VulnerabilityLookupPuller

  • PublicUkraineRussiaIpsLookupPuller

All events of this service are ingested into these tables:

  • IpAddressLookupPuller:

    • my.lookuplist.Recorded_Future_IPv4_Address_Threat_List

    • my.lookuplist.Recorded_Future_IPv6_Address_Threat_List

  • FileHashLookupPuller:

    • my.lookuplist.Recorded_Future_File_Hash_Threat_List

  • DomainLookupPuller:

    • my.lookuplist.Recorded_Future_Domain_Threat_List

  • UrlLookupPuller:

    • my.lookuplist.Recorded_Future_URL_Threat_List

  • VulnerabilityLookupPuller:

    • my.lookuplist.Recorded_Future_Vulnerability_Threat_List

  • PublicUkraineRussiaIpsLookupPuller:

    • my.lookuplist.Recorded_Future_IPv4_Public_Ukranie_Russia_List

    • my.lookuplist.Recorded_Future_IPv6_Public_Ukranie_Russia_List
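The IP-based pullers each feed two lookup tables, one for IPv4 and one for IPv6. The sketch below shows one way such routing could work, using Python's standard `ipaddress` module. It assumes entries are routed by address family; the helper name `table_for_ip` is hypothetical and the collector's actual logic may differ.

```python
import ipaddress

# Table names taken from the IpAddressLookupPuller entry above.
IPV4_TABLE = "my.lookuplist.Recorded_Future_IPv4_Address_Threat_List"
IPV6_TABLE = "my.lookuplist.Recorded_Future_IPv6_Address_Threat_List"


def table_for_ip(value: str) -> str:
    """Pick the lookup table by address family (assumed routing rule)."""
    address = ipaddress.ip_address(value)  # raises ValueError if not an IP
    return IPV4_TABLE if address.version == 4 else IPV6_TABLE


# Documentation-range addresses, used here only as examples.
print(table_for_ip("192.0.2.10"))
print(table_for_ip("2001:db8::1"))
```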

Once the collector has been launched, check that ingestion is working correctly. To do so, go to the collector’s logs console.

This service has the following components:

Component

Description

Setup

The setup module is in charge of authenticating the service and managing the token expiration when needed.

Puller

The puller module is in charge of pulling the data in an organized way and delivering the events via SDK.

Setup output

A successful run has the following output messages for the setup module:

INFO InputProcess::DataPullerSetup(collector,data_puller#111,issues#predefined) -> Puller Setup Started
INFO InputProcess::DataPullerSetup(collector,data_puller#111,issues#predefined) -> successfully generated new access token
INFO InputProcess::DataPullerSetup(collector,data_puller#111,issues#predefined) -> The credentials provided in the configuration have required permissions to request issues from server
INFO InputProcess::DataPullerSetup(collector,data_puller#111,issues#predefined) -> Puller Setup Terminated
INFO InputProcess::DataPullerSetup(collector,data_puller#111,issues#predefined) -> Setup for module <DataPuller> has been successfully executed

Puller output

A successful initial run has the following output messages for the puller module:

INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> PrePull Started.
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> User has specified 2022-01-01 00:00:00 as the datetime. Historical polling will consider this datetime for creating the default values.
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> No saved state found, initializing with state: {'historic_date_utc': datetime.datetime(2022, 1, 1, 0, 0), 'last_polled_timestamp': datetime.datetime(2022, 1, 1, 0, 0), 'ids_with_same_timestamp': [], 'buffer_timestamp_with_duplication_risk': datetime.datetime(1970, 1, 1, 0, 0), 'buffer_ids_with_duplication_risk': []}
WARNING InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Saved state loaded: {'historic_date_utc': datetime.datetime(2022, 1, 1, 0, 0), 'last_polled_timestamp': datetime.datetime(2022, 1, 1, 0, 0), 'ids_with_same_timestamp': [], 'buffer_timestamp_with_duplication_risk': datetime.datetime(1970, 1, 1, 0, 0), 'buffer_ids_with_duplication_risk': []}
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> PrePull Terminated
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Starting data collection every 60 seconds
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Pull Started
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Fetching for issues from 2022-01-01T00:00:00
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Requesting API for issues
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> successfully retried issues from
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Total number of issues in this poll: 45
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Removing the duplicate issues if present
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Flatten data is set to True. Flattening the data and adding 'devo_pulling_id' to events
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Delivering issues to the SDK
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> 20 issues delivered
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> State has been updated during pagination: {'historic_date_utc': datetime.datetime(2022, 1, 1, 0, 0), 'last_polled_timestamp': datetime.datetime(2022, 1, 1, 0, 0), 'ids_with_same_timestamp': [], 'buffer_timestamp_with_duplication_risk': datetime.datetime(2022, 5, 12, 19, 13, 20, 193191), 'buffer_ids_with_duplication_risk': ['09992ee4-1450-44fa-951c-d5fc4815473a']}.
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> (Partial) Statistics for this pull cycle (@devo_pulling_id=1656602793.044179) so far: Number of requests made: 1; Number of events received: 45; Number of duplicated events filtered out: 0; Number of events generated and sent: 20.
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Requesting API for issues
INFO OutputProcess::SyslogSender(standard_senders,syslog_sender_0) -> syslog_sender_0 -> Created sender: {"client_name": "collector-4ac42f93cffaa59c-9dc9f67c9-cgm84", "url": "sidecar-service-default.integrations-factory-collectors:601", "object_id": "140446617222352"}
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> successfully retried issues from
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Removing the duplicate issues if present
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Flatten data is set to True. Flattening the data and adding 'devo_pulling_id' to events
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Delivering issues to the SDK
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> 20 issues delivered
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> State has been updated during pagination: {'historic_date_utc': datetime.datetime(2022, 1, 1, 0, 0), 'last_polled_timestamp': datetime.datetime(2022, 1, 1, 0, 0), 'ids_with_same_timestamp': [], 'buffer_timestamp_with_duplication_risk': datetime.datetime(2022, 6, 30, 9, 0, 1, 927011), 'buffer_ids_with_duplication_risk': ['87e301c5-d3b7-4c2b-9495-9163772b3517', '7c95e45f-694e-4843-8aa7-d697a66fb14a', '5f3daede-c375-424f-9034-d9f423310b4a', '584ac078-87f2-45a5-b2eb-6e72e0594bd7', '5057cb24-ce5b-405d-bd5d-fd7b3ba70fc0', '22933fcb-ebb0-4a03-bb00-c1cba0b5abca', '1bed50e0-7825-41c9-a9de-8d32e0a35de8', '03a303c8-000c-4544-8f2c-65486a225e15']}.
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> (Partial) Statistics for this pull cycle (@devo_pulling_id=1656602793.044179) so far: Number of requests made: 2; Number of events received: 45; Number of duplicated events filtered out: 0; Number of events generated and sent: 40.
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Requesting API for issues
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> successfully retried issues from
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Removing the duplicate issues if present
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Flatten data is set to True. Flattening the data and adding 'devo_pulling_id' to events
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Delivering issues to the SDK
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> 5 issues delivered
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> State has been updated during pagination: {'historic_date_utc': datetime.datetime(2022, 1, 1, 0, 0), 'last_polled_timestamp': datetime.datetime(2022, 1, 1, 0, 0), 'ids_with_same_timestamp': [], 'buffer_timestamp_with_duplication_risk': datetime.datetime(2022, 6, 30, 13, 14, 40, 673424), 'buffer_ids_with_duplication_risk': ['4d819843-61ef-4e70-a2b6-5834a3f96403']}.
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Updating deduplication buffers content
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Statistics for this pull cycle (@devo_pulling_id=1656602793.044179):Number of requests made: 3; Number of events received: 45; Number of duplicated events filtered out: 0; Number of events generated and sent: 45; Average of events per second: 33.797.
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Pull Terminated
INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Data collection completed. Elapsed time: 1.334 seconds. Waiting for 58.666 second(s)

After a successful collector’s execution (that is, no error logs found), you will see the following log message:

INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> Statistics for this pull cycle (@devo_pulling_id=1656602793.044179):Number of requests made: 3; Number of events received: 45; Number of duplicated events filtered out: 0; Number of events generated and sent: 45; Average of events per second: 33.797.
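To check a pull cycle automatically rather than by eye, the counters in this statistics line can be extracted with a regular expression. The following is a minimal sketch, assuming the line keeps the exact wording shown above; the function name `pull_counters` is hypothetical and not part of the collector.

```python
import re

# The statistics line from the example above, split for readability.
LINE = (
    "INFO InputProcess::DataPuller(data_puller,00011,issues,predefined) -> "
    "Statistics for this pull cycle (@devo_pulling_id=1656602793.044179):"
    "Number of requests made: 3; Number of events received: 45; "
    "Number of duplicated events filtered out: 0; "
    "Number of events generated and sent: 45; "
    "Average of events per second: 33.797."
)

# Each counter appears as "Number of <label>: <integer>".
COUNTER = re.compile(r"(Number of [a-z ]+): (\d+)")


def pull_counters(line: str) -> dict:
    """Extract the integer counters from a pull-cycle statistics line."""
    return {label: int(value) for label, value in COUNTER.findall(line)}


stats = pull_counters(LINE)
received = stats["Number of events received"]
filtered = stats["Number of duplicated events filtered out"]
sent = stats["Number of events generated and sent"]

# In this example, every received event was either filtered out or sent.
print(received == filtered + sent)
```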

This collector does not persist any data.

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

ErrorType

Error Id

Error Message

Cause

Solution

ConnectionError

-

Error retrieving data from API with response code {status_code}. This pull iteration did not produce any results.

The response status code is not 200. The cause depends on the status code received. These are some of the most common status codes:

  • 401: Unauthorized. Invalid credentials.

  • 403: Forbidden. Not allowed to perform this action.

  • 404: Not found. Invalid endpoint URL.

  • 429: Too many requests. The API rate limit has been exceeded.

Depending on the status code, the solutions for the most common errors are:

  • 401: Try valid credentials.

  • 403: Try some credentials with privileges to make these requests.

  • 404: Try a valid endpoint.

  • 429: Set requests_per_second parameter in config file to a lower value.

LookupError

-

All lookups have been rejected. ETL aborted!

The collected messages have an unexpected format.

These errors are expected, as some data sources will not match the expected and accepted format.
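The status codes and solutions listed for ConnectionError can be captured in a small lookup, for example in a script that scans collector logs. This is a sketch only: the function name `hint_for_status` and the hint wording are hypothetical, while the codes and the `requests_per_second` parameter come from the table above.

```python
# Status codes and suggested fixes taken from the troubleshooting table.
HINTS = {
    401: "Unauthorized: try valid credentials.",
    403: "Forbidden: use credentials with privileges for these requests.",
    404: "Not found: try a valid endpoint.",
    429: "Too many requests: lower requests_per_second in the config file.",
}


def hint_for_status(status_code: int) -> str:
    """Return a troubleshooting hint for an HTTP status code."""
    if status_code == 200:
        return "OK"
    # Codes outside the table get a generic message.
    return HINTS.get(status_code, f"Unexpected status code {status_code}.")


for code in (200, 401, 429, 500):
    print(code, hint_for_status(code))
```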

Collector operations

This section is intended to explain how to proceed with specific operations of this collector.

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the event delivery module:

Sender services

The Integrations Factory Collector SDK has three different sender services, depending on the type of event to deliver (internal, standard, and lookup). This collector uses the following sender services:

Sender services

Description

internal_senders

In charge of delivering internal metrics to Devo such as logging traces or metrics.

standard_senders

In charge of delivering pulled events to Devo.

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Total number of messages sent: 44, messages sent since "2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)

Displays the total number of events sent and the number sent since the last checkpoint. From the given example, the following conclusions can be drawn:

  • 44 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00.

  • 21 events were sent to Devo between the last UTC checkpoint and now.

  • Those 21 events required 0.007 seconds to be delivered.
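The same conclusions can be read programmatically from the trace. The sketch below parses the example line with a regular expression, assuming the trace keeps exactly this wording; the function name `parse_sender_trace` and the returned keys are hypothetical.

```python
import re
from datetime import datetime

# The sender statistics trace from the example above.
TRACE = (
    "Total number of messages sent: 44, messages sent since "
    '"2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)'
)

PATTERN = re.compile(
    r"Total number of messages sent: (?P<total>\d+), "
    r'messages sent since "(?P<since>[^"]+)": (?P<recent>\d+) '
    r"\(elapsed (?P<elapsed>[\d.]+) seconds\)"
)


def parse_sender_trace(trace: str) -> dict:
    """Split a sender statistics trace into typed fields."""
    match = PATTERN.search(trace)
    if match is None:
        raise ValueError("not a sender statistics trace")
    return {
        "total": int(match["total"]),  # events sent since start
        "since": datetime.fromisoformat(match["since"]),  # last checkpoint
        "recent": int(match["recent"]),  # events sent since checkpoint
        "elapsed_seconds": float(match["elapsed"]),  # delivery time
    }


parsed = parse_sender_trace(TRACE)
print(parsed["total"], parsed["recent"], parsed["elapsed_seconds"])
```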

Change log

Release

Released on

Release type

Details

Recommendations

v1.5.0

Dec 12, 2023

IMPROVEMENTS

  • Updated DCSDK from 1.10.2 to 1.12.4:

    • Fixed an error caused by a ValueError exception that was not handled properly.

    • Fixed an error that caused the loss of some values in internal messages (collector_name, collector_id and job_id)

    • Improve Controlled stop when InputProcess is killed

    • Change internal queue management for protecting against OOMK

    • Extracted ModuleThread structure from PullerAbstract

    • Improve Controlled stop when both processes fail to instantiate

    • Upgrade DevoSDK dependency to version v5.4.0

    • Fixed error in persistence system

    • Applied changes to make DCSDK compatible with MacOS

    • Added new sender for relay in house + TLS

    • Added persistence functionality for gzip sending buffer

    • Added Automatic activation of gzip sending

    • Improved behaviour when persistence fails

    • Upgraded DevoSDK dependency

    • Fixed console log encoding

    • Restructured python classes

    • Improved behaviour with non-utf8 characters

    • Decreased default size value for internal queues (Redis limitation, from 1GiB to 256MiB)

    • New persistence format/structure (compression in some cases)

    • Removed dmesg execution (It was invalid for docker execution)

    • Added extra check for invalid message timestamps

    • Added extra check to improve the controlled stop

    • Changed default number for connection retries (now 7)

    • Fix for Devo connection retries

    • Updated DevoSDK to v5.1.10

    • Fix for SyslogSender related to UTF-8

    • Enhanced troubleshooting: traces have been standardized and some new traces have been introduced.

    • Introduced a mechanism to detect "Out of Memory killer" situations.

    • Updated DevoSDK to v5.1.9

    • Fixed some bugs related to development on MacOS

    • Added an extra validation and fix when the DCSDK receives a wrong timestamp format

    • Added an optional config property to use the Syslog timestamp format in a strict way

  • Updated the docker image version to 1.3.0.

Recommended version

v1.4.0

Dec 12, 2023

IMPROVEMENTS

BUG FIXING

Upgrade

v1.3.0

Aug 16, 2023

IMPROVEMENTS

Updated DCSDK from 1.1.4 to 1.9.1: https://devoinc.atlassian.net/wiki/spaces/IF/pages/3901620225

Upgrade

v1.2.0-stable

Jun 1, 2022

NEW FEATURES

Added new custom data sources from Recorded Future Threat List service:

  • IPs related to Ukraine and Russia.

Upgrade

v1.1.0

May 24, 2022

NEW FEATURES
VULNERABILITIES

This release includes:

  • Upgrade the base docker-image from Debian to Ubuntu20 to mitigate vulnerabilities.

  • Upgrade the IFC SDK Lookup Factory Service to improve the data model validation.

Upgrade to v1.2.0

v1.0.2

May 20, 2021

NEW FEATURES

Initial release with the following Recorded Future Threat List default data sources:

  • IPs

  • Domains

  • URLs

  • File Hashes

  • Vulnerabilities

Upgrade to v1.2.0