Cloudflare collector

[ 1 Overview ] [ 2 Data sources ] [ 3 Vendor setup ] [ 4 Accepted authentication methods ] [ 5 Run the collector ] [ 6 Structure ]

Overview

Cloudflare is a Content Delivery Network and DDoS mitigation cloud service company. It primarily acts as a reverse proxy between a website's visitor and the Cloudflare customer's hosting provider.

Data sources

Data source	Description	Devo Table	API endpoint	Description

Cloudflare

Audit Logs

cdn.cloudflare.audit.events

GET https://api.cloudflare.com/client/v4/{entity_type}/{entity_id}/audit_logs?since={start_date}&before={end_date}Z&page={page_num}&per_page={page_limit}&direction={direction}, where:

{entity_type} is one of the two allowed entity types: organizations or accounts.
{entity_id} is the account or organization identifier.
{start_date} limits the returned results to logs newer than the specified date, in RFC3339 format (YYYY-MM-DDTHH:mm:ssZ).
{end_date} limits the returned results to logs older than the specified date, in RFC3339 format (YYYY-MM-DDTHH:mm:ssZ).
{page_num} is the page of results to return.
{page_limit} is the number of results to return per page.
{direction} is the direction of the chronological sorting (allowed values are asc or desc; desc is the default).

Gets the audit logs for an account or an organization. Results can be filtered by who made the change, which zone the change was made on, and the timeframe of the change.
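As an illustration of how the placeholders in the audit logs endpoint fit together, the following minimal Python sketch builds the request URL. The helper names (rfc3339, audit_logs_url) and the default page size are assumptions made for this example; they are not part of the collector.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

API_BASE = "https://api.cloudflare.com/client/v4"


def rfc3339(dt: datetime) -> str:
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:mm:ssZ (UTC)."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def audit_logs_url(entity_type, entity_id, start, end,
                   page=1, per_page=100, direction="desc"):
    """Build the audit logs URL (hypothetical helper, illustration only)."""
    if entity_type not in ("organizations", "accounts"):
        raise ValueError("entity_type must be 'organizations' or 'accounts'")
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    # urlencode percent-encodes the colons in the timestamps.
    query = urlencode({
        "since": rfc3339(start),
        "before": rfc3339(end),
        "page": page,
        "per_page": per_page,
        "direction": direction,
    })
    return f"{API_BASE}/{entity_type}/{entity_id}/audit_logs?{query}"
```

Pass timezone-aware datetimes; a naive datetime would be interpreted in the local timezone before conversion to UTC.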

GraphQL Analytics

cdn.cloudflare.firewall.samples

POST https://api.cloudflare.com/client/v4/graphql, where the body of the request uses the following template:

{
  "query": "query { 
              viewer {
                zones (filter: {zoneTag: $zone_tag}) {
                  <DATASET>(
                    filter: {
                      datetime_geq: $start_date,
                      datetime_lt: $end_date
                    },
                    limit: $limit,
                    orderBy: [datetime_ASC]
                  ) {
                    datetime
                    <FIELDS>
                  }
                }
              }
            }",
  "variables": {
    "zoneTag": "<ZONE_TAG>",
    "filter": {
      "zone_tag": "<ZONE_TAG>",
      "start_date": "<START_DATE>",
      "end_date": "<END_DATE>",
      "limit": <LIMIT>
    }
  }
}

where:

<DATASET> is the name of the dataset (product) you want to query against a zone. Currently, the only dataset the collector supports is the Firewall Activity Log: firewallEventsAdaptive. For the datasets available in the API, see Datasets (tables) · Cloudflare Analytics docs.
<FIELDS> is the list of fields you want to fetch. The fields used for the firewallEventsAdaptive dataset are:

- action
- clientAsn
- clientASNDescription
- clientCountryName
- clientIP
- clientIPClass
- clientRefererHost
- clientRefererPath
- clientRefererQuery
- clientRefererScheme 
- clientRequestHTTPHost
- clientRequestHTTPMethodName
- clientRequestHTTPProtocol
- clientRequestPath
- clientRequestQuery
- clientRequestScheme 
- edgeColoName
- edgeResponseStatus
- kind
- matchIndex
- originResponseStatus
- originatorRayName
- rayName
- ruleId
- source
- userAgent

<ZONE_TAG> is the zone tag (or zone key/ID).
<START_DATE> is the initial date for the query (inclusive).
<END_DATE> is the final date for the query (exclusive).
<LIMIT> is the maximum number of results to return per request.

Queries a dataset in a specific zone and timeframe. The only dataset the collector currently supports is the Firewall Activity Log: firewallEventsAdaptive.
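The $name placeholders in the template resemble Python string.Template fields, so one way to read it is as a text template that is rendered before being sent. The sketch below works under that assumption. The function name build_graphql_body and the short default field list are hypothetical, and the collector's real rendering logic may differ.

```python
import json
from string import Template

# Query text modelled on the template above; $-names are substitution fields.
QUERY_TEMPLATE = Template("""
query {
  viewer {
    zones(filter: {zoneTag: $zone_tag}) {
      $dataset(
        filter: {datetime_geq: $start_date, datetime_lt: $end_date},
        limit: $limit,
        orderBy: [datetime_ASC]
      ) {
        datetime
        $fields
      }
    }
  }
}
""")


def build_graphql_body(zone_tag, start_date, end_date, limit,
                       dataset="firewallEventsAdaptive",
                       fields=("action", "clientIP", "rayName")):
    """Render the query and wrap it in a JSON-serializable request body."""
    query = QUERY_TEMPLATE.substitute(
        # json.dumps adds the double quotes GraphQL string literals need.
        zone_tag=json.dumps(zone_tag),
        start_date=json.dumps(start_date),
        end_date=json.dumps(end_date),
        limit=int(limit),
        dataset=dataset,
        fields=" ".join(fields),
    )
    return {"query": query}
```

The returned dictionary can be serialized with json.dumps and sent as the body of the POST request to the GraphQL endpoint.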

The collector uses the limit, orderBy, and datetime filters for pagination. For a timeframe request, if the limit is not reached, no more requests are needed. If the limit is reached, the collector removes all events that share the last datetime value from the result and performs a new request using that last datetime as start_date and the same end_date. Because start_date is inclusive, the events removed from the previous result are returned again by the new request. If all the events returned by a request have the same datetime and the maximum limit per request is reached, the collector keeps all the events and uses the last datetime plus one second as the next start_date. Note that this behavior can lose events within the requested timeframe.
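The pagination rules above can be sketched in Python as follows. This is an illustration of the described behavior, not the collector's code: collect_timeframe, make_fake_fetch, and the in-memory event store are hypothetical, and events are assumed to carry whole-second datetime values.

```python
from datetime import datetime, timedelta


def collect_timeframe(fetch, start_date, end_date, limit):
    """Collect events in [start_date, end_date) with limit-based paging.

    `fetch(start, end, limit)` is a hypothetical callable returning at most
    `limit` events (dicts with a "datetime" key), sorted by datetime ascending.
    """
    collected = []
    while start_date < end_date:
        events = fetch(start_date, end_date, limit)
        if len(events) < limit:
            # Limit not reached: the timeframe is exhausted.
            collected.extend(events)
            break
        last = events[-1]["datetime"]
        if events[0]["datetime"] == last:
            # A full page sharing one datetime: keep everything and skip
            # ahead one second. Further events at `last` are lost.
            collected.extend(events)
            start_date = last + timedelta(seconds=1)
        else:
            # Drop the events at the last datetime; the next request starts
            # there (inclusive), so they are fetched again.
            collected.extend(e for e in events if e["datetime"] != last)
            start_date = last
    return collected


def make_fake_fetch(store):
    """Return an in-memory stand-in for the API, for demonstration only."""
    def fetch(start, end, limit):
        rows = [e for e in store if start <= e["datetime"] < end]
        rows.sort(key=lambda e: e["datetime"])
        return rows[:limit]
    return fetch
```

With a limit of 3 and four events sharing the same second, this sketch returns only three of them, which is the event-loss case the paragraph warns about.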

On service setup, the collector also performs a request to check the allowed limits for each dataset. See Limits · Cloudflare Analytics docs.

In a small number of cases, the analytics provided on the Cloudflare GraphQL Analytics API are based on a sample — a subset of the dataset. In these cases, Cloudflare Analytics returns an estimate derived from the sampled value. For example, suppose that during an attack the sampling rate is 10% and 5,000 events are sampled. Cloudflare will estimate 50,000 total events (5,000 × 10) and report this value in Analytics.

See Sampling · Cloudflare Analytics docs for more details.

For more information on how the events are parsed, visit our page.

Vendor setup

To set up the Cloudflare collector services, configure one of the allowed authentication methods:

API tokens
API keys

Authentication Method	Details	Configuration properties	Link

API Tokens

Cloudflare recommends API Tokens as the preferred way to interact with Cloudflare APIs. You can configure the scope of tokens to limit access to account and zone resources, and you can define the Cloudflare APIs to which the token authorizes access.

The following credentials properties are needed:

credentials:
  api_token: <API_TOKEN>

Create API token · Cloudflare API docs

API Keys

Unique to each Cloudflare user and used only for authentication. API keys do not authorize access to accounts or zones.

Use the Global API Key for authentication. Only use the Origin CA Key when you create origin certificates through the API.

The following credentials properties are needed:

credentials:
  api_key: <API_KEY>
  user_email: <USER_EMAIL>

Get API keys (legacy) · Cloudflare API docs

Accepted authentication methods

Depending on how you obtained your credentials, you will have to either fill in or delete the following properties in the JSON credentials configuration block.

Authentication Method	api_token	api_key	user_email
API Tokens	REQUIRED
API Keys		REQUIRED	REQUIRED
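The table above can be read as a validation rule over the credentials block. The sketch below checks which properties are filled and maps them to request headers. The header names (Authorization: Bearer for API tokens, X-Auth-Email and X-Auth-Key for API keys) are the schemes Cloudflare documents for its API. The function name auth_headers, and the assumption that the collector maps the properties this way, are illustrative only.

```python
def auth_headers(credentials):
    """Map a credentials block to Cloudflare API request headers.

    Exactly one method must be configured: either api_token alone,
    or api_key together with user_email.
    """
    token = credentials.get("api_token")
    key = credentials.get("api_key")
    email = credentials.get("user_email")

    if token and not (key or email):
        # API Tokens: only api_token is required.
        return {"Authorization": f"Bearer {token}"}
    if key and email and not token:
        # API Keys: both api_key and user_email are required.
        return {"X-Auth-Email": email, "X-Auth-Key": key}
    raise ValueError(
        "Fill either api_token, or both api_key and user_email, "
        "and delete the unused properties."
    )
```

Rejecting mixed or partial configurations up front surfaces a misconfigured credentials block before any request is sent.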

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).