Overview

IBM Cloud Flow Logs for VPC enables the collection, storage, and presentation of information about the Internet Protocol (IP) traffic going to and from network interfaces within your Virtual Private Cloud (VPC).

The IBM Cloud Flow Logs for VPC collector collects flow logs from an IBM log collector instance and sends them to Devo.

Devo collector features

Feature

Details

Allow parallel downloading (multipod)

not allowed

Running environments

  • collector server

  • on-premise

Populated Devo events

table

Flattening preprocessing

yes

Allowed source events obfuscation

yes

Data sources

Data source

Description

API endpoint

Collector service name

Devo table

Available from release

IBM Cloud

IBM Cloud flow logs for VPC

/v2/export

flow_log

cloud.ibm.vpc.flow_log

v1.0.0

For more information on how the events are parsed, visit our page.

Flattening preprocessing

The following is an example of a flow log object. Note that the collector receives flow log objects one by one rather than grouped, because of pre-processing performed by IBM.

...

Code Block
{
    "account": "099547bf015144628q4ef599863c5123",
    "key": "ibm_vpc_flowlogs_v1/account=099547bf015144628q4ef599863c5123/region=us-south/vpc-id=crn%3Av1%3Abluemix%3Apublic%3Ais%3Aus-south%3Aa%2F099547bf015144628q4ef599863c5123%3A%3Avpc%3Ar006-a1f9875c-874b-472b-ab4a-8a669fe57be6/subnet-id=crn%3Av1%3Abluemix%3Apublic%3Ais%3Aus-south-1%3Aa%2F099547bf015144628q4ef599863c5123%3A%3Asubnet%3A0717-d622cc30-816e-4b94-935d-70ca4b8f9b8b/endpoint-type=vnics/instance-id=crn%3Av1%3Abluemix%3Apublic%3Ais%3Aus-south-1%3Aa%2F099547bf015144628q4ef599863c5123%3A%3Ainstance%3A0717_2abb8f82-07f8-4d09-aedb-4c02e61fb20d/vnic-id=0717-b36b4d70-3c92-446d-9d5e-f28778ab033f/record-type=egress/year=2023/month=10/day=11/hour=03/stream-id=20231011T035936Z/00000000.gz",
    "version": "0.0.1",
    "collector_crn": "crn:v1:bluemix:public:is:us-south:a/099547bf015144628q4ef599863c5123::flow-log-collector:r006-1b7a1d95-8a01-40ae-b011-9c6b0575a59f",
    "attached_endpoint_type": "vnic",
    "network_interface_id": "0717-b36b4d70-3c92-446d-9d5e-f28778ab033f",
    "instance_crn": "crn:v1:bluemix:public:is:us-south-1:a/099547bf015144628q4ef599863c5123::instance:0717_2abb8f82-07f8-4d09-aedb-4c02e61fb20d",
    "vpc_crn": "crn:v1:bluemix:public:is:us-south:a/099547bf015144628q4ef599863c5123::vpc:r006-a1f9875c-874b-472b-ab4a-8a669fe57be6",
    "capture_end_time": "2023-10-11T03:59:36Z",
    "capture_start_time": "2023-10-11T03:56:06Z",
    "state": "ok",
    "flow_log_start_time": "2023-10-11T03:56:26Z",
    "flow_log_end_time": "2023-10-11T03:59:26Z",
    "flow_log_direction": "O",
    "flow_log_action": "accepted",
    "flow_log_initiator_ip": "10.240.0.4",
    "flow_log_initiator_port": 68,
    "flow_log_target_ip": "10.240.0.1",
    "flow_log_target_port": 67,
    "flow_log_transport_protocol": 17,
    "flow_log_ether_type": "IPv4",
    "flow_log_was_initiated": true,
    "flow_log_was_terminated": false,
    "flow_log_bytes_from_initiator": 2050,
    "flow_log_packets_from_initiator": 6,
    "flow_log_bytes_from_target": 1956,
    "flow_log_packets_from_target": 6,
    "flow_log_cumulative_packets_from_initiator": 6,
    "flow_log_cumulative_packets_from_target": 6,
    "flow_log_cumulative_bytes_from_target": 1956,
    "flow_log_cumulative_bytes_from_initiator": 2050,
    "@devo_environment": "develop",
    "@devo_pulling_id": "1696997002225"
}
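The key field of the example object packs several attributes into a path of name=value segments, with the CRNs percent-encoded. As a minimal sketch (not part of the collector, and assuming every key follows the layout of the example above), the segments could be unpacked like this. The function name parse_flow_log_key is hypothetical.

```python
from urllib.parse import unquote


def parse_flow_log_key(key: str) -> dict:
    """Split a flow log object key into its name=value path segments."""
    fields = {}
    for segment in key.split("/"):
        # Segments without "=" (the leading prefix and the file name) are skipped.
        if "=" in segment:
            name, _, value = segment.partition("=")
            fields[name] = unquote(value)  # CRNs are percent-encoded in the key
    return fields


# Shortened key with the same layout as the example object above.
key = (
    "ibm_vpc_flowlogs_v1/account=099547bf015144628q4ef599863c5123/region=us-south/"
    "endpoint-type=vnics/vnic-id=0717-b36b4d70-3c92-446d-9d5e-f28778ab033f/"
    "record-type=egress/year=2023/month=10/day=11/hour=03/"
    "stream-id=20231011T035936Z/00000000.gz"
)
fields = parse_flow_log_key(key)
print(fields["region"], fields["record-type"], fields["hour"])
```

The same function decodes the vpc-id, subnet-id, and instance-id segments of the full key back into plain CRN strings.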

Minimum configuration required for basic pulling

The collector retrieves IBM Cloud flow logs for VPC from a Log Analysis instance. To achieve this, users must have previously set up logging for VPC to direct log objects to a COS Bucket. Additionally, a cloud function should be in place to read and insert these logs into the Log Analysis instance.

...

Once the Log Analysis instance is created, users will be able to fetch the necessary credentials:

Setting

Details

service_key

The IBM Cloud Log Analysis instance service key.

The service key can be found in the IBM Cloud console via: Observability > Logging > Select the Flow Log log collector instance > Open Dashboard > Settings > Organization > API Keys > Service Keys

base_url

The IBM Cloud Flow Logs for VPC log collector API base URL. Select from the following API endpoints

Note

This minimum configuration refers exclusively to those specific parameters of this integration. There are more required parameters related to the generic behavior of the collector. Check the setting sections for details.
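To illustrate how service_key and base_url fit together, the sketch below builds (but does not send) a request to the /v2/export endpoint listed under Data sources. The query parameters from, to, and prefer are taken from the puller log output shown later on this page. The authentication scheme (HTTP Basic with the service key as user name and an empty password) is an assumption, and the host name and key are placeholders, so check the IBM documentation before relying on it.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_export_request(base_url: str, service_key: str, start: int, end: int) -> Request:
    """Build (but do not send) a GET request for the /v2/export endpoint."""
    # Query parameters mirror those printed in the collector's puller log.
    query = urlencode({"from": start, "to": end, "prefer": "head"})
    url = f"{base_url.rstrip('/')}/v2/export?{query}"
    # Assumption: Basic auth with the service key as user name and no password.
    token = base64.b64encode(f"{service_key}:".encode()).decode()
    return Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder host and key, for illustration only.
request = build_export_request(
    "https://logs.example.invalid", "my-service-key", 1693485435, 1693488602
)
print(request.full_url)
```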

Accepted authentication methods

Authentication method

Service key

Service key

Required

Base URL

Required

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

On-premise collector

This data collector can run on any machine that has the Docker service available, because it is executed as a Docker container. The following sections explain how to prepare the required setup to run the data collector.

Structure

Create the following directory structure to run the collector:

Code Block
<any_directory>
└── devo-collectors/
    └── <product_name>/
        ├── certs/
        │   ├── chain.crt
        │   ├── <your_domain>.key
        │   └── <your_domain>.crt
        ├── state/
        └── config/ 
            └── config.yaml 
Note

Replace <product_name> with the proper value.

Devo credentials

In Devo, go to Administration → Credentials → X.509 Certificates, download the Certificate, Private key and Chain CA and save them in <product_name>/certs/. Learn more about security credentials in Devo here.

Replace <product_name> with the proper value.

Editing the config.yaml file

Code Block
globals:
  debug: false
  id: <collector_id_value>
  name: <collector_name_value>
  persistence:
    type: filesystem
    config:
      directory_name: state
outputs:
  devo_us_1:
    type: devo_platform
    config:
      address: <devo_address>
      port: 443
      type: SSL
      chain: <chain_filename>
      cert: <cert_filename>
      key: <key_filename>
inputs:
  ibm_cloud_flow_log:
    id: <input_id_value>
    enabled: true
    environment: <environment_value>
    base_url: <base_url_value>
    credentials:
      service_key: <service_key_value>
    services:
      flow_log:
        override_tag: <override_tag_value>
        override_fetch_gap_seconds: <override_fetch_gap_seconds_value>
        start_time_in_utc: <start_time_in_utc_value>
        obfuscation_data: [<obfuscation_data_values>]
        request_period_in_seconds: <request_period_in_seconds_value>

Replace the placeholders with your required values following the description table below:

Parameter

Data type

Type

Value Range / Format

Details

collector_id_value

int

mandatory

minimum length: 1

maximum length: 5

Use this parameter to give a unique ID to this collector.

collector_name_value

str

mandatory

minimum length: 1

maximum length: 10

Use this parameter to give a valid name to this collector.

devo_address

str

mandatory

collector-us.devo.io 

collector-eu.devo.io 

Use this parameter to identify the Devo Cloud where the events will be sent.

chain_filename

str

mandatory

minimum length: 4

maximum length: 20

Use this parameter to identify the chain certificate file downloaded from your Devo domain. Usually this file's name is: chain.crt.

cert_filename

str

mandatory

minimum length: 4

maximum length: 20

Use this parameter to identify the certificate file (.crt) downloaded from your Devo domain.

key_filename

str

mandatory

minimum length: 4

maximum length: 20

Use this parameter to identify the private key file (.key) downloaded from your Devo domain.

input_id_value

int

mandatory

minimum length: 5

maximum length: 15

Use this parameter to give a unique ID to this input service.

Note

This parameter is used to build the persistence address. Do not use the same value for multiple collectors. It could cause a collision.

base_url_value

str

mandatory

minimum length: 1

One of: API endpoints

The IBM Cloud Flow Logs for VPC log collector API base URL. Select from the following API endpoints

service_key_value

str

mandatory

minimum length: 1

The IBM Cloud Flow Logs for VPC service key.

The service key can be found in the IBM Cloud console via: Observability > Logging > Select the Flow Log log collector instance > Open Dashboard > Settings > Organization > API Keys > Service Keys

override_tag_value

str

optional

Devo tag-friendly string (no special characters, spaces, etc.) For more information see Devo Tags.

An optional tag that allows users to override the service default tags.

Info

This parameter can be removed or commented.

override_fetch_gap_seconds_value

int

optional

minimum value: 0

An optional value that allows users to specify the query end time for a given poll. For example, specifying a value of 60 indicates that the collector will fetch all logs from the last event time (or start_time_in_utc_value if the first poll) up to NOW() - 60 seconds.

This threshold buffer is utilised to ensure that all log events have had time to properly ingest, index, and become searchable in the IBM Log Analysis instance.

This value will overwrite the default value (300 seconds).

Info

This parameter can be removed or commented.

start_time_in_utc_value

str

optional

UTC datetime string in the format %Y-%m-%d %H:%M:%S (e.g., “2000-01-01 00:00:01”)

This configuration allows you to set a custom date as the beginning of the period to download. This allows downloading historical data (one month back for example) before downloading new events.

Info

This parameter should be removed if it is not used.

obfuscation_data_values

array<object>

optional

The objects in the array look like this:

Code Block
obfuscation_data:
  - name:
    - credentials
    - "*"
    value: "**********"

Each object represents the necessary configuration to obfuscate messages before these are sent to Devo.

Info

This parameter can be removed or commented.
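The interaction between start_time_in_utc and override_fetch_gap_seconds described in the table above can be sketched as follows. This is an illustration under stated assumptions, not the collector's code: the datetime format is inferred from the example value "2000-01-01 00:00:01", the function name query_window is hypothetical, and the fallback used when neither a last event time nor a start time exists is invented for the example.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

DEFAULT_FETCH_GAP_SECONDS = 300  # default stated in the parameter table
START_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the example "2000-01-01 00:00:01"


def query_window(
    now: datetime,
    last_event_time: Optional[datetime],
    start_time_in_utc: Optional[str],
    fetch_gap_seconds: int = DEFAULT_FETCH_GAP_SECONDS,
) -> Tuple[int, int]:
    """Return (from, to) as epoch seconds for one poll."""
    # The window ends fetch_gap_seconds before "now" so that recent logs
    # have time to be ingested and indexed before they are queried.
    end = now - timedelta(seconds=fetch_gap_seconds)
    if last_event_time is not None:
        start = last_event_time  # later polls resume from the last event seen
    elif start_time_in_utc is not None:
        parsed = datetime.strptime(start_time_in_utc, START_TIME_FORMAT)
        start = parsed.replace(tzinfo=timezone.utc)
        if start >= now:
            raise ValueError("start_time_in_utc must be in the past")
    else:
        start = end  # hypothetical fallback when nothing is configured
    return int(start.timestamp()), int(end.timestamp())


now = datetime(2023, 8, 31, 9, 30, 0, tzinfo=timezone.utc)
window = query_window(now, None, "2023-08-31 09:00:00", fetch_gap_seconds=60)
print(window)
```

With a gap of 60 seconds, the window in this example ends at 09:29:00 UTC even though the poll runs at 09:30:00 UTC.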

Download the Docker image

The collector should be deployed as a Docker container. Download the Docker image of the collector as a .tgz file by clicking the link in the following table:

Collector Docker image

SHA-256 hash

collector-ibm_cloud_vpc_flow_if-docker-image-1.0.0

5c552f91b8988ac3e977a5a22515b63b238832e43712cd94b9da0fea9de662d3
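Before loading the image, the downloaded file can be checked against the SHA-256 hash published in the table above. The following is a minimal sketch using Python's standard hashlib module. The function names are illustrative, and the path passed to them is whatever name the downloaded .tgz file has on your machine.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Reading in chunks keeps memory use low for large image archives.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, expected_hex: str) -> bool:
    """Compare a file's digest with a published hash, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Calling matches_published_hash with the path of the downloaded file and the hash from the table returns True only when the download is intact.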

Use the following command to add the Docker image to the system:

Code Block
gunzip -c <image_file>-<version>.tgz | docker load
Note

Once the Docker image is imported, it will show the real name of the Docker image (including version info). Replace <image_file> and <version> with a proper value.

The Docker image can be deployed on the following services:

Docker

Execute the following command on the root directory <any_directory>/devo-collectors/<product_name>/

Code Block
docker run \
--name collector-<product_name> \
--volume $PWD/certs:/devo-collector/certs \
--volume $PWD/config:/devo-collector/config \
--volume $PWD/state:/devo-collector/state \
--env CONFIG_FILE=config.yaml \
--rm \
--interactive \
--tty \
<image_name>:<version>
Note

Replace <product_name>, <image_name> and <version> with the proper values.

Docker Compose

The following Docker Compose file can be used to execute the Docker container. It must be created in the <any_directory>/devo-collectors/<product_name>/ directory.

Code Block
version: '3'
services:
  collector-<product_name>:
    image: <image_name>:${IMAGE_VERSION:-latest}
    container_name: collector-<product_name>
    volumes:
      - ./certs:/devo-collector/certs
      - ./config:/devo-collector/config
      - ./credentials:/devo-collector/credentials
      - ./state:/devo-collector/state
    environment:
      - CONFIG_FILE=${CONFIG_FILE:-config.yaml}

To run the container using docker-compose, execute the following command from the <any_directory>/devo-collectors/<product_name>/ directory:

Code Block
IMAGE_VERSION=<version> docker-compose up -d
Note

Replace <product_name>, <image_name> and <version> with the proper values.

Cloud collector

We use a piece of software called Collector Server to host and manage all our available collectors. If you want us to host this collector for you, get in touch with us and we will guide you through the configuration.

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

flow_log

Verify data collection

Internal process and deduplication method

All flow log records are fetched via the v2 Export API and filtered/ordered by their created timestamp. The collector continually pulls new events since the last recorded timestamp. A unique hash value is computed for each event and used for deduplication purposes to ensure events are not fetched multiple times in subsequent pulls.
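The hash-based deduplication described above can be sketched as follows. The page does not say which hash function or which fields the collector uses, so this example assumes SHA-256 over a canonical JSON rendering of the whole event. The function names are hypothetical.

```python
import hashlib
import json


def event_hash(event: dict) -> str:
    """Hash an event so that key order does not affect the result."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drop_duplicates(events: list, seen_hashes: set) -> list:
    """Return events not seen before, recording their hashes in seen_hashes."""
    fresh = []
    for event in events:
        digest = event_hash(event)
        if digest not in seen_hashes:
            seen_hashes.add(digest)
            fresh.append(event)
    return fresh


seen = set()
first_pull = drop_duplicates([{"id": 1}, {"id": 2}], seen)
second_pull = drop_duplicates([{"id": 2}, {"id": 3}], seen)
print(len(first_pull), len(second_pull))
```

Keeping the set of hashes between pulls is what lets a later pull drop events that overlap with the previous window.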

Please note: the collector fetches logs from a Log Analysis instance. Log Analysis can house many different log types. When fetching logs, the collector will attempt to identify a `vpc_crn` property key in the log to determine if the log is a VPC flow log. If this key does not exist, the collector will skip that log. For the purposes of statistics tracking in the collector log output, non-VPC flow logs are not counted as events received or events filtered.

If your collector logs indicate that the collector is running successfully but processing 0 valid flow logs, ensure that the Log Analysis instance identified by the base URL and service key you provided actually contains flow logs for VPC. For example, if the base URL and service key belong to an IBM Cloud Activity Tracker instance, the collector will run successfully but never fetch valid flow logs for VPC.
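The vpc_crn check described above amounts to a simple key test on each record. A minimal sketch, with hypothetical function names and made-up sample records:

```python
def is_vpc_flow_log(record: dict) -> bool:
    """A record counts as a VPC flow log only if it has a vpc_crn key."""
    return "vpc_crn" in record


def select_flow_logs(records: list) -> list:
    """Keep VPC flow logs and silently skip every other log type."""
    return [record for record in records if is_vpc_flow_log(record)]


records = [
    {"vpc_crn": "crn:v1:example", "flow_log_action": "accepted"},
    {"message": "some other log type stored in the same instance"},
]
flow_logs = select_flow_logs(records)
print(len(flow_logs))
```

An instance that holds only other log types yields an empty list on every pull, which matches the "0 valid flow logs" symptom described above.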

Devo categorization and destination

All events of this service are ingested into the table cloud.ibm.vpc.flow_log.

Setup output

A successful run has the following output messages for the setup module:

Code Block
2023-08-31T09:30:01.135    INFO InputProcess::MainThread -> EventPullerSetup(unknown,ibm_cloud_flow_log#10001,flow_log#predefined) -> Starting thread
2023-08-31T09:30:01.137    INFO InputProcess::EventPullerSetup(unknown,ibm_cloud_flow_log#10001,flow_log#predefined) -> Testing fetch from /v2/export.
2023-08-31T09:30:01.794    INFO InputProcess::EventPullerSetup(unknown,ibm_cloud_flow_log#10001,flow_log#predefined) -> Successfully tested fetch from /v2/export. Source is pullable.
2023-08-31T09:30:01.794    INFO InputProcess::EventPullerSetup(unknown,ibm_cloud_flow_log#10001,flow_log#predefined) -> Setup for module <EventPuller> has been successfully executed

Puller output

Code Block
2023-08-31T09:30:02.142    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Running the persistence upgrade steps
2023-08-31T09:30:02.143    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Running the persistence corrections steps
2023-08-31T09:30:02.143    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Running the persistence corrections steps
2023-08-31T09:30:02.143    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> No changes were detected in the persistence
2023-08-31T09:30:02.144    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) Finalizing the execution of pre_pull()
2023-08-31T09:30:02.144    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Starting data collection every 60 seconds
2023-08-31T09:30:02.144    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Pull Started
2023-08-31T09:30:02.145    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Fetching all event logs via params={'from': 1693485435, 'to': 1693488602, 'prefer': 'head'}
2023-08-31T09:30:02.908    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Sending 183 event(s) to my.app.ibm.cloud.flow_log
2023-08-31T09:30:02.932    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> No more pagination_id values returned. Setting pull_completed to True.
2023-08-31T09:30:02.935    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Updating the persistence
2023-08-31T09:30:02.936    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> (Partial) Statistics for this pull cycle (@devo_pulling_id=1693488602141):Number of requests made: 1; Number of events received: 185; Number of duplicated events filtered out: 2; Number of events generated and sent: 183; Average of events per second: 231.194.
2023-08-31T09:30:02.936    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Statistics for this pull cycle (@devo_pulling_id=1693488602141):Number of requests made: 1; Number of events received: 185; Number of duplicated events filtered out: 2; Number of events generated and sent: 183; Average of events per second: 231.142.

After a successful collector execution (that is, no error logs found), you will see the following log messages:

Code Block
2023-08-31T09:30:02.936    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> The data is up to date!
2023-08-31T09:30:02.936    INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Data collection completed. Elapsed time: 0.795 seconds. Waiting for 59.205 second(s) until the next one
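When checking a pull cycle, the "Statistics for this pull cycle" line in the puller output above can be read programmatically. The sketch below parses the counters with a regular expression written against that one sample line, so the pattern is an assumption and may need adjusting for other collector versions.

```python
import re

# Matches "<counter name>: <number>" pairs as they appear in the sample line.
STAT_PATTERN = re.compile(
    r"(Number of [a-z ]+|Average of events per second): ([0-9.]+)"
)


def parse_pull_statistics(line: str) -> dict:
    """Extract the named counters from a pull-cycle statistics log line."""
    stats = {}
    for name, value in STAT_PATTERN.findall(line):
        # The last value is followed by the sentence's closing period.
        stats[name] = float(value.rstrip("."))
    return stats


line = (
    "Statistics for this pull cycle (@devo_pulling_id=1693488602141):"
    "Number of requests made: 1; Number of events received: 185; "
    "Number of duplicated events filtered out: 2; "
    "Number of events generated and sent: 183; "
    "Average of events per second: 231.142."
)
stats = parse_pull_statistics(line)
print(stats["Number of events received"], stats["Number of events generated and sent"])
```

In the sample, events received minus duplicates filtered out equals events sent, which is a quick consistency check for a healthy pull.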

...

Troubleshooting

This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.

Error Type

Error Id

Error Message

Cause

Solution

InitVariablesError

1

Invalid start_time_in_utc: {ini_start_str}. Must be in parseable datetime format.

The configured start_time_in_utc parameter is a non-parseable format.

Update the start_time_in_utc value to have the recommended format as indicated in the guide.

InitVariablesError

2

Invalid start_time_in_utc: {ini_start_str}. Must be in the past.

The configured start_time_in_utc parameter is a future date.

Update the start_time_in_utc value to a past datetime.

SetupError

101

Failed to fetch OAuth token from {token_endpoint}. Exception: {e}.

The provided credentials, base URL, and/or token endpoint is incorrect.

Revisit the configuration steps and ensure that the correct values were specified in the config file.

SetupError

102

Failed to fetch data from {endpoint}. Source is not pullable.

The provided credentials, base URL, and/or token endpoint is incorrect.

Revisit the configuration steps and ensure that the correct values were specified in the config file.

ApiError

401

Error during API call to [API provider HTML error response here]

The server returned an HTTP 401 response.

Ensure that the provided credentials are correct and provide read access to the targeted data.

ApiError

429

Too many concurrent requests.

IBM Cloud is reporting that too many simultaneous requests are being made against the Log Analysis instance.

This error can happen when a user attempts to manually restart the collector frequently or otherwise query the Log Analysis instance while the collector is running. In practice, this error should naturally correct itself within 15 minutes of the original report so long as simultaneous query requests cease.

If the collector continues to report this error after 15 minutes, please ensure that there is not another script or user also making API requests to the Log Analysis instance.

Log Analysis concurrency limit is determined by your instance configuration and tier.

ApiError

498

Error during API call to [API provider HTML error response here]

The server returned an HTTP 500 response.

If the API returns a 500 but successfully completes subsequent runs then you may ignore this error. If the API repeatedly returns a 500 error, ensure the server is reachable and operational.

Collector operations

This section is intended to explain how to proceed with specific operations of this collector.

Verify collector operations

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

Code Block
2023-01-10T15:22:57.146    INFO MainProcess::MainThread -> Loading configuration using the following files: {"full_config": "config.yaml", "job_config_loc": null, "collector_config_loc": null}
2023-01-10T15:22:57.146    INFO MainProcess::MainThread -> Using the default location for "job_config_loc" file: "/etc/devo/job/job_config.json"
2023-01-10T15:22:57.147    INFO MainProcess::MainThread -> "\etc\devo\job" does not exists
2023-01-10T15:22:57.147    INFO MainProcess::MainThread -> Using the default location for "collector_config_loc" file: "/etc/devo/collector/collector_config.json"
2023-01-10T15:22:57.148    INFO MainProcess::MainThread -> "\etc\devo\collector" does not exists
2023-01-10T15:22:57.148    INFO MainProcess::MainThread -> Results of validation of config files parameters: {"config": "config.yaml", "config_validated": True, "job_config_loc": "/etc/devo/job/job_config.json", "job_config_loc_default": True, "job_config_loc_validated": False, "collector_config_loc": "/etc/devo/collector/collector_config.json", "collector_config_loc_default": True, "collector_config_loc_validated": False}
2023-01-10T15:22:57.171 WARNING MainProcess::MainThread -> [WARNING] Illegal global setting has been ignored -> multiprocessing: False

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the event delivery module:

Code Block
2023-01-10T15:23:00.788    INFO OutputProcess::MainThread -> DevoSender(standard_senders,devo_sender_0) -> Starting thread
2023-01-10T15:23:00.789    INFO OutputProcess::MainThread -> DevoSenderManagerMonitor(standard_senders,devo_1) -> Starting thread (every 300 seconds)
2023-01-10T15:23:00.790    INFO OutputProcess::MainThread -> DevoSenderManager(standard_senders,manager,devo_1) -> Starting thread
2023-01-10T15:23:00.842    INFO OutputProcess::MainThread -> global_status: {"output_process": {"process_id": 18804, "process_status": "running", "thread_counter": 21, "thread_names": ["MainThread", "pydevd.Writer", "pydevd.Reader", "pydevd.CommandThread", "pydevd.CheckAliveThread", "DevoSender(standard_senders,devo_sender_0)", "DevoSenderManagerMonitor(standard_senders,devo_1)", "DevoSenderManager(standard_senders,manager,devo_1)", "OutputStandardConsumer(standard_senders_consumer_0)", 

Sender services

The Integrations Factory Collector SDK has three different sender services, depending on the type of event to deliver (internal, standard, and lookup). This collector uses the following sender services:

Sender services

Description

internal_senders

In charge of delivering internal metrics to Devo such as logging traces or metrics.

standard_senders

In charge of delivering pulled events to Devo.

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Standard - Total number of messages sent: 57, messages sent since "2023-01-10 16:09:16.116750+00:00": 0 (elapsed 0.000 seconds)

Displays the number of events sent since the last checkpoint. Following the given example, the following conclusions can be drawn:

  • 57 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2023-01-10 16:09:16.116750+00:00.

  • 0 events were sent to Devo between the last UTC checkpoint and now.

  • Those 0 events required 0.000 seconds to be delivered.

Check memory usage

To check the memory usage of this collector, look for the following log records in the collector which are displayed every 5 minutes by default, always after running the memory-free process.

  • The used memory is displayed by running processes and the sum of both values will give the total used memory for the collector.

  • The global pressure of the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory, in the form previous -> after.

Code Block
INFO InputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(34.50MiB -> 34.08MiB), VMS(410.52MiB -> 410.02MiB)
INFO OutputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, process: RSS(28.41MiB -> 28.41MiB), VMS(705.28MiB -> 705.28MiB)
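Since the total memory used by the collector is the sum of the per-process values, the two [GC] lines above can be parsed and added up. This is a minimal sketch whose regular expression is written against the two sample lines only. It assumes the values are always reported in MiB, which may not hold for every collector version.

```python
import re

GC_PATTERN = re.compile(
    r"(?P<process>\w+)::\w+ -> \[GC\] "
    r"global: (?P<global_before>[\d.]+)% -> (?P<global_after>[\d.]+)%, "
    r"process: RSS\((?P<rss_before>[\d.]+)MiB -> (?P<rss_after>[\d.]+)MiB\), "
    r"VMS\((?P<vms_before>[\d.]+)MiB -> (?P<vms_after>[\d.]+)MiB\)"
)


def parse_gc_line(line: str):
    """Return the metrics of one [GC] log line, or None if it does not match."""
    match = GC_PATTERN.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    process = fields.pop("process")
    return {"process": process, **{name: float(v) for name, v in fields.items()}}


lines = [
    "INFO InputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, "
    "process: RSS(34.50MiB -> 34.08MiB), VMS(410.52MiB -> 410.02MiB)",
    "INFO OutputProcess::MainThread -> [GC] global: 20.4% -> 20.4%, "
    "process: RSS(28.41MiB -> 28.41MiB), VMS(705.28MiB -> 705.28MiB)",
]
records = [parse_gc_line(line) for line in lines]
# Total resident memory after freeing, summed over both processes.
total_rss_mib = sum(record["rss_after"] for record in records)
print(f"Total RSS after freeing: {total_rss_mib:.2f} MiB")
```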

Change log for v1.x.x

Release

Released on

Release type

Details

Recommendations

v1.0.0

INITIAL RELEASE



Features:

  • Flow log: Enable the collection, storage, and presentation of information about the Internet Protocol (IP) traffic going to and from network interfaces within your Virtual Private Cloud (VPC).

Released with Devo Collector SDK v1.10.0

Recommended version