IBM Cloud Flow Logs for VPC enables the collection, storage, and presentation of information about the Internet Protocol (IP) traffic going to and from network interfaces within your Virtual Private Cloud (VPC).
The IBM Cloud Flow Logs for VPC collector collects flow logs from an IBM log collector instance and sends them to Devo.
Devo collector features
| Feature | Details |
| --- | --- |
| Allow parallel downloading (multipod) | not allowed |
| Running environments | collector server, on-premise |
| Populated Devo events | table |
| Flattening preprocessing | yes |
| Allowed source events obfuscation | yes |
Data sources
| Data source | Description | API endpoint | Collector service name | Devo table | Available from release |
| --- | --- | --- | --- | --- | --- |
| IBM Cloud | IBM Cloud flow logs for VPC | /v2/export | flow_log | cloud.ibm.vpc.flow_log | v1.0.0 |
For more information on how the events are parsed, visit our page.
Flattening preprocessing
See the linked example of a flow log object. Note, however, that the collector receives the flow log objects one by one rather than grouped, because of pre-processing performed by IBM.
The collector retrieves IBM Cloud flow logs for VPC from a Log Analysis instance. To achieve this, users must have previously set up logging for VPC to direct log objects to a COS Bucket. Additionally, a cloud function should be in place to read and insert these logs into the Log Analysis instance.
...
Once the Log Analysis instance is created, users will be able to fetch the necessary credentials:
Setting
Details
service_key
The IBM Cloud Log Analysis instance service key.
The service key can be found in the IBM Cloud console via: Observability > Logging > Select the Flow Log log collector instance > Open Dashboard > Settings > Organization > API Keys > Service Keys
base_url
The IBM Cloud Flow Logs for VPC log collector API base URL. Select from the following API endpoints
Note
This minimum configuration covers only the parameters specific to this integration. Other required parameters relate to the generic behavior of the collector. Check the settings sections for details.
Accepted authentication methods
Authentication method
Service key
Service key
Required
Base URL
Required
Run the collector
Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).
On-premise collector
This data collector can run on any machine that has the Docker service available, because it is executed as a Docker container. The following sections explain how to prepare the setup required to run it.
Structure
Create the following directory structure, which is used when running the collector:
In Devo, go to Administration → Credentials → X.509 Certificates, download the Certificate, Private key and Chain CA and save them in <product_name>/certs/. Learn more about security credentials in Devo here.
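As a rough illustration of preparing the working directory, the sketch below creates a layout under <any_directory>/devo-collectors/<product_name>/. Only the certs/ folder is named in this guide; the config/ and state/ folder names, and the create_collector_layout helper itself, are assumptions made for the example and should be adjusted to the structure your collector version expects.

```python
from pathlib import Path


def create_collector_layout(base_dir: str, product_name: str) -> Path:
    """Create <base_dir>/devo-collectors/<product_name>/ with its subfolders.

    "certs" comes from this guide; "config" and "state" are assumed names.
    """
    root = Path(base_dir) / "devo-collectors" / product_name
    for subfolder in ("certs", "config", "state"):
        # parents=True builds the intermediate folders, exist_ok=True makes reruns safe
        (root / subfolder).mkdir(parents=True, exist_ok=True)
    return root
```

Calling create_collector_layout("/opt", "my_collector") would return the collector root, into which the downloaded certificate, private key and chain CA can then be copied under certs/.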
The IBM Cloud Flow Logs for VPC log collector API base URL. Select from the following API endpoints
service_key_value
str
mandatory
minimum length: 1
The IBM Cloud Flow Logs for VPC service key.
The service key can be found in the IBM Cloud console via: Observability > Logging > Select the Flow Log log collector instance > Open Dashboard > Settings > Organization > API Keys > Service Keys
override_tag_value
str
optional
Devo tag-friendly string (no special characters, spaces, etc.) For more information see Devo Tags.
An optional tag that allows users to override the service default tags.
Info
This parameter can be removed or commented.
override_fetch_gap_seconds_value
int
optional
minimum value: 0
An optional value that allows users to specify the query end time for a given poll. For example, specifying a value of 60 indicates that the collector will fetch all logs from the last event time (or start_time_in_utc_value if the first poll) up to NOW() - 60 seconds.
This threshold buffer is utilised to ensure that all log events have had time to properly ingest, index, and become searchable in the IBM Log Analysis instance.
This value will overwrite the default value (300 seconds).
Info
This parameter can be removed or commented.
start_time_in_utc_value
str
optional
UTC datetime string in the format %Y-%m-%d %H:%M:%S (e.g., "2000-01-01 00:00:01")
This configuration allows you to set a custom date as the beginning of the period to download. This allows downloading historical data (one month back for example) before downloading new events.
Info
This parameter should be removed if it is not used.
Each object represents the necessary configuration to obfuscate messages before these are sent to Devo.
Info
This parameter can be removed or commented.
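To make the interaction between start_time_in_utc_value and override_fetch_gap_seconds_value more concrete, here is a minimal sketch of how a poll window could be derived. It is not the collector's actual code: the function names are hypothetical, the datetime format is inferred from the example value in this guide, and the error texts are borrowed from the troubleshooting table.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

DEFAULT_FETCH_GAP_SECONDS = 300  # default value stated in this guide
START_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed from the example "2000-01-01 00:00:01"


def parse_start_time(value: str, now: datetime) -> datetime:
    """Parse a start_time_in_utc_value string and reject future dates."""
    try:
        parsed = datetime.strptime(value, START_TIME_FORMAT)
    except ValueError as exc:
        raise ValueError(
            f"Invalid start_time_in_utc: {value}. Must be in parseable datetime format."
        ) from exc
    parsed = parsed.replace(tzinfo=timezone.utc)
    if parsed >= now:
        raise ValueError(f"Invalid start_time_in_utc: {value}. Must be in the past.")
    return parsed


def query_window(
    last_event_time: Optional[datetime],
    start_time: datetime,
    now: datetime,
    fetch_gap_seconds: int = DEFAULT_FETCH_GAP_SECONDS,
) -> Tuple[int, int]:
    """Return (from, to) epoch seconds for one poll.

    The window starts at the last event time (or start_time on the first
    poll) and ends at now minus the fetch gap.
    """
    if fetch_gap_seconds < 0:
        raise ValueError("fetch gap must be >= 0")
    window_start = last_event_time if last_event_time is not None else start_time
    window_end = now - timedelta(seconds=fetch_gap_seconds)
    return int(window_start.timestamp()), int(window_end.timestamp())
```

With a gap of 60, for instance, the upper bound of the window is always 60 seconds behind the current time, which gives recent events time to become searchable before they are requested.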
Download the Docker image
The collector should be deployed as a Docker container. Download the Docker image of the collector as a .tgz file by clicking the link in the following table:
Once the Docker image is imported, it will show the real name of the Docker image (including version info). Replace <image_file> and <version> with a proper value.
The Docker image can be deployed on the following services:
Docker
Execute the following command on the root directory <any_directory>/devo-collectors/<product_name>/
Replace <product_name>, <image_name> and <version> with the proper values.
Docker Compose
The following Docker Compose file can be used to execute the Docker container. It must be created in the <any_directory>/devo-collectors/<product_name>/ directory.
To run the container using docker-compose, execute the following command from the <any_directory>/devo-collectors/<product_name>/ directory:
Code Block
IMAGE_VERSION=<version> docker-compose up -d
Note
Replace <product_name>, <image_name> and <version> with the proper values.
Cloud collector
We use a piece of software called Collector Server to host and manage all our available collectors. If you want us to host this collector for you, get in touch with us and we will guide you through the configuration.
Collector services detail
This section is intended to explain how to proceed with specific actions for services.
flow_log
Verify data collection
Internal process and deduplication method
All flow log records are fetched via the v2 Export API and filtered/ordered by their created timestamp. The collector continually pulls new events since the last recorded timestamp. A unique hash value is computed for each event and used for deduplication purposes to ensure events are not fetched multiple times in subsequent pulls.
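The hash-based deduplication described above can be sketched as follows. This guide does not state which hash function or which fields the collector uses, so SHA-256 over a canonical JSON rendering of the whole event is an assumption, and the helper names are hypothetical.

```python
import hashlib
import json
from typing import Iterable, List, Set


def event_hash(event: dict) -> str:
    """Compute a stable digest for an event (assumed: SHA-256 of canonical JSON)."""
    # sort_keys makes the digest independent of key order in the source dict
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":"), default=str)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drop_duplicates(events: Iterable[dict], seen_hashes: Set[str]) -> List[dict]:
    """Return only events whose digest is not yet in seen_hashes, and record them."""
    fresh = []
    for event in events:
        digest = event_hash(event)
        if digest in seen_hashes:
            continue  # already delivered in an earlier pull
        seen_hashes.add(digest)
        fresh.append(event)
    return fresh
```

Keeping seen_hashes between pulls is what allows consecutive windows to overlap at their boundary timestamp without the same event being sent twice.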
Please note: the collector fetches logs from a Log Analysis instance. Log Analysis can house many different log types. When fetching logs, the collector will attempt to identify a `vpc_crn` property key in the log to determine if the log is a VPC flow log. If this key does not exist, the collector will skip that log. For the purposes of statistics tracking in the collector log output, non-VPC flow logs are not counted as events received or events filtered.
If your collector logs indicate that the collector is successfully running but processing 0 valid flow logs, please ensure that the base URL and service key you provided for your Log Analysis instance contains valid flow logs for VPC; for example, if a user indicates a base URL and service key for an IBM Cloud Activity Tracker instance, then the collector will successfully run but never fetch valid flow logs for VPC.
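The vpc_crn check described in this note amounts to a simple key test. The sketch below assumes the key sits at the top level of each parsed record, which this guide does not confirm; the function names are hypothetical.

```python
from typing import Iterable, List


def is_vpc_flow_log(record: dict) -> bool:
    """Treat a record as a VPC flow log only if it carries a vpc_crn key."""
    return "vpc_crn" in record


def select_flow_logs(records: Iterable[dict]) -> List[dict]:
    """Keep VPC flow logs and silently skip every other log type."""
    return [record for record in records if is_vpc_flow_log(record)]
```

If select_flow_logs returns an empty list for every pull, the instance being queried most likely holds other log types, which matches the "0 valid flow logs" symptom described above.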
Devo categorization and destination
All events of this service are ingested into the table cloud.ibm.vpc.flow_log
Setup output
A successful run has the following output messages for the setup module:
Code Block
2023-08-31T09:30:01.135 INFO InputProcess::MainThread -> EventPullerSetup(unknown,ibm_cloud_flow_log#10001,flow_log#predefined) -> Starting thread
2023-08-31T09:30:01.137 INFO InputProcess::EventPullerSetup(unknown,ibm_cloud_flow_log#10001,flow_log#predefined) -> Testing fetch from /v2/export.
2023-08-31T09:30:01.794 INFO InputProcess::EventPullerSetup(unknown,ibm_cloud_flow_log#10001,flow_log#predefined) -> Successfully tested fetch from /v2/export. Source is pullable.
2023-08-31T09:30:01.794 INFO InputProcess::EventPullerSetup(unknown,ibm_cloud_flow_log#10001,flow_log#predefined) -> Setup for module <EventPuller> has been successfully executed
Puller output
Code Block
2023-08-31T09:30:02.142 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Running the persistence upgrade steps
2023-08-31T09:30:02.143 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Running the persistence corrections steps
2023-08-31T09:30:02.143 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Running the persistence corrections steps
2023-08-31T09:30:02.143 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> No changes were detected in the persistence
2023-08-31T09:30:02.144 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) Finalizing the execution of pre_pull()
2023-08-31T09:30:02.144 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Starting data collection every 60 seconds
2023-08-31T09:30:02.144 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Pull Started
2023-08-31T09:30:02.145 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Fetching all event logs via params={'from': 1693485435, 'to': 1693488602, 'prefer': 'head'}
2023-08-31T09:30:02.908 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Sending 183 event(s) to my.app.ibm.cloud.flow_log
2023-08-31T09:30:02.932 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> No more pagination_id values returned. Setting pull_completed to True.
2023-08-31T09:30:02.935 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Updating the persistence
2023-08-31T09:30:02.936 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> (Partial) Statistics for this pull cycle (@devo_pulling_id=1693488602141):Number of requests made: 1; Number of events received: 185; Number of duplicated events filtered out: 2; Number of events generated and sent: 183; Average of events per second: 231.194.
2023-08-31T09:30:02.936 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Statistics for this pull cycle (@devo_pulling_id=1693488602141):Number of requests made: 1; Number of events received: 185; Number of duplicated events filtered out: 2; Number of events generated and sent: 183; Average of events per second: 231.142.
After a successful collector’s execution (that is, no error logs found), you will see the following log message:
Code Block
2023-08-31T09:30:02.936 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> The data is up to date!
2023-08-31T09:30:02.936 INFO InputProcess::EventPuller(ibm_cloud_flow_log,10001,flow_log,predefined) -> Data collection completed. Elapsed time: 0.795 seconds. Waiting for 59.205 second(s) until the next one
...
Troubleshooting
This collector has different security layers that detect both an invalid configuration and abnormal operation. This table will help you detect and resolve the most common errors.
Error Type
Error Id
Error Message
Cause
Solution
InitVariablesError
1
Invalid start_time_in_utc: {ini_start_str}. Must be in parseable datetime format.
The configured start_time_in_utc parameter is a non-parseable format.
Update the start_time_in_utc value to have the recommended format as indicated in the guide.
InitVariablesError
2
Invalid start_time_in_utc: {ini_start_str}. Must be in the past.
The configured start_time_in_utc parameter is a future date.
Update the start_time_in_utc value to a past datetime.
SetupError
101
Failed to fetch OAuth token from {token_endpoint}. Exception: {e}.
The provided credentials, base URL, and/or token endpoint is incorrect.
Revisit the configuration steps and ensure that the correct values were specified in the config file.
SetupError
102
Failed to fetch data from {endpoint}. Source is not pullable.
The provided credentials, base URL, and/or token endpoint is incorrect.
Revisit the configuration steps and ensure that the correct values were specified in the config file.
ApiError
401
Error during API call to [API provider HTML error response here]
The server returned an HTTP 401 response.
Ensure that the provided credentials are correct and provide read access to the targeted data.
ApiError
429
Too many concurrent requests.
IBM Cloud is reporting that too many simultaneous requests are being made against the Log Analysis instance.
This error can happen when a user attempts to manually restart the collector frequently or otherwise query the Log Analysis instance while the collector is running. In practice, this error should naturally correct itself within 15 minutes of the original report so long as simultaneous query requests cease.
If the collector continues to report this error after 15 minutes, please ensure that there is not another script or user also making API requests to the Log Analysis instance.
Log Analysis concurrency limit is determined by your instance configuration and tier.
ApiError
498
Error during API call to [API provider HTML error response here]
The server returned an HTTP 500 response.
If the API returns a 500 but successfully completes subsequent runs then you may ignore this error. If the API repeatedly returns a 500 error, ensure the server is reachable and operational.
Collector operations
This section is intended to explain how to proceed with specific operations of this collector.
Verify collector operations
Initialization
The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services, and of validating the given configuration.
A successful run has the following output messages for the initializer module:
Code Block
2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Loading configuration using the following files: {"full_config": "config.yaml", "job_config_loc": null, "collector_config_loc": null}
2023-01-10T15:22:57.146 INFO MainProcess::MainThread -> Using the default location for "job_config_loc" file: "/etc/devo/job/job_config.json"
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> "\etc\devo\job" does not exists
2023-01-10T15:22:57.147 INFO MainProcess::MainThread -> Using the default location for "collector_config_loc" file: "/etc/devo/collector/collector_config.json"
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> "\etc\devo\collector" does not exists
2023-01-10T15:22:57.148 INFO MainProcess::MainThread -> Results of validation of config files parameters: {"config": "config.yaml", "config_validated": True, "job_config_loc": "/etc/devo/job/job_config.json", "job_config_loc_default": True, "job_config_loc_validated": False, "collector_config_loc": "/etc/devo/collector/collector_config.json", "collector_config_loc_default": True, "collector_config_loc_validated": False}
2023-01-10T15:22:57.171 WARNING MainProcess::MainThread -> [WARNING] Illegal global setting has been ignored -> multiprocessing: False
Events delivery and Devo ingestion
The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method.
A successful run has the following output messages for the initializer module:
The Integrations Factory Collector SDK has three different sender services, depending on the type of event to deliver (internal, standard, and lookup). This collector uses the following sender services:
| Sender services | Description |
| --- | --- |
| internal_senders | In charge of delivering internal metrics to Devo such as logging traces or metrics. |
| standard_senders | In charge of delivering pulled events to Devo. |
Sender statistics
Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:
Logging trace
Description
Number of available senders: 1
Displays the number of concurrent senders available for the given Sender Service.
sender manager internal queue size: 0
Displays the items available in the internal sender queue.
Standard - Total number of messages sent: 57, messages sent since "2023-01-10 16:09:16.116750+00:00": 0 (elapsed 0.000 seconds)
Displays the number of events sent since the collector started and since the last checkpoint. From the given example, the following conclusions can be drawn:
57 events were sent to Devo since the collector started.
The last checkpoint timestamp was 2023-01-10 16:09:16.116750+00:00.
0 events were sent to Devo between the last UTC checkpoint and now.
Those 0 events required 0.000 seconds to be delivered.
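For anyone who wants to extract these figures from the collector output automatically, the following sketch parses the "Total number of messages sent" trace shown above. The regular expression is written against that single example line, so it is an assumption that other collector versions use the same wording; parse_sender_stats is a hypothetical helper.

```python
import re
from datetime import datetime
from typing import Optional

SENDER_STATS_PATTERN = re.compile(
    r'Total number of messages sent: (?P<total>\d+), '
    r'messages sent since "(?P<checkpoint>[^"]+)": (?P<since>\d+) '
    r'\(elapsed (?P<elapsed>[\d.]+) seconds'
)


def parse_sender_stats(line: str) -> Optional[dict]:
    """Extract totals, checkpoint and elapsed time from a sender statistics trace."""
    match = SENDER_STATS_PATTERN.search(line)
    if match is None:
        return None  # not a sender statistics line
    return {
        "total_sent": int(match["total"]),
        "checkpoint": datetime.fromisoformat(match["checkpoint"]),
        "sent_since_checkpoint": int(match["since"]),
        "elapsed_seconds": float(match["elapsed"]),
    }
```

Applied to the example trace, this yields a total of 57 messages, 0 messages since the checkpoint, and an elapsed time of 0.0 seconds.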
Check memory usage
To check the memory usage of this collector, look for the following log records in the collector which are displayed every 5 minutes by default, always after running the memory-free process.
The used memory is displayed by running processes and the sum of both values will give the total used memory for the collector.
The global pressure of the available memory is displayed in the global value.
All metrics (Global, RSS, VMS) show the value before and after freeing memory, in the form previous -> after.
Flow log: Enable the collection, storage, and presentation of information about the Internet Protocol (IP) traffic going to and from network interfaces within your Virtual Private Cloud (VPC).