
Alibaba Cloud collector

Configuration requirements

Before running this collector, review the configuration requirements detailed below.

Configuration

Details

Access Key ID and Access Key Secret

You will need to obtain the Access Key ID and Access Key Secret to configure this collector.

Create a Trail

You will need to create a single account trail.

Log store

You will need to create a log store in the ActionTrail console.

More information

Refer to the Vendor setup section to know more about these configurations.

Overview

ActionTrail is a service that monitors and records the actions of your Alibaba Cloud account, including access to and use of Alibaba Cloud services through the Alibaba Cloud Management Console, API operations, or SDKs.

Devo collector features

Feature

Details

Allow parallel downloading (multipod)

Not allowed

Running environments

  • Collector server

  • On-premise

Populated Devo events

Table

Flattening preprocessing

No

Data sources

Data Source

Description

API Endpoint

Collector service name

Devo Table

Available from release

ActionTrail

ActionTrail events, via API.

Data of the last 90 days available.

ActionTrail core SDK

actiontrail

cloud.alibaba.actiontrail.events

v1.0.0

ActionTrail (Log Service)

ActionTrail events, using LogService.

Ideal for larger data volumes, and for storing (and querying) data for more than 90 days.

Log Service SDK

actiontrail_log_service

cloud.alibaba.log_service.events

v1.0.0

For more information on how the events are parsed, visit our page.

Vendor setup

There are some minimal requirements to set up this collector:

  1. A configured ActionTrail trail to query.

  2. A Log Store that contains ActionTrail events (optional).

Accepted authentication methods

The user must specify an Access Key ID and Access Key Secret for the account or RAM user in order to authenticate with the ActionTrail API or Log Service API.

Authentication Method

Access Key ID

Access Key Secret

Access Key ID / Access Key Secret

REQUIRED

REQUIRED

Run the collector

Once the data source is configured, you can either send us the required information if you want us to host and manage the collector for you (Cloud collector), or deploy and host the collector on your own machine using a Docker image (On-premise collector).

Collector services detail

This section is intended to explain how to proceed with specific actions for services.

Events service

Once the collector has been launched, it is important to check that ingestion is working properly. To do so, go to the collector’s logs console.

This service has the following components:

Component

Description

Setup

The setup module is in charge of authenticating the service and managing the token expiration when needed.

Puller

The puller module is in charge of pulling the data in an organized way and delivering the events via the SDK.

Setup output

A successful run has the following output messages for the setup module:

INFO InputProcess::MainThread -> ActionTrailBasePullerSetup(example_collector,alibaba#alibaba_1,actiontrail#predefined) -> Starting thread
INFO InputProcess::ActionTrailBasePullerSetup(example_collector,alibaba#alibaba_1,actiontrail#predefined) -> Setup for module <ActionTrailStandardPuller> has been successfully executed

Puller output

A successful initial run has the following output messages for the puller module:

Note that the PrePull action is executed only one time before the first run of the Pull action.

INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Detected initial start time in UTC change: {'values_changed': {'root': {'new_value': DateTime(2023, 1, 5, 0, 0, 0, tzinfo=Timezone('UTC')), 'old_value': DateTime(2023, 1, 1, 0, 0, 0, tzinfo=Timezone('UTC'))}}}. Setting last run time to 2023-01-05T00:00:00+00:00
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Updating persisted data with {'initial_start_time_in_utc': '2023-01-05T00:00:00Z'}
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Updating persisted data with {'last_run_datetime': '2023-01-05T00:00:00Z'}
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Updating persisted data with {'last_ids': []}
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Starting data collection every 60 seconds
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Retrieving timestamp (2023-01-10T10:46:05.063698+00:00) is greater than 2023-01-10T10:36:05.078809+00:00 (datetime.now() - 600 seconds). Setting end datetime to 2023-01-10T10:36:05.078809+00:00.
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Setting end_datetime to 2023-01-10T10:36:04.999999+00:00 (end of previous second to account for Alibaba time filtration granularity)
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Retrieving actiontrail logs since 2023-01-05T00:00:00+00:00 to 2023-01-10T10:36:04.999999+00:00 and sending to cloud.alibaba.actiontrail.events
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Sending 33 events to cloud.alibaba.actiontrail.events
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Updating persisted data with {'last_run_datetime': '2023-01-10T09:55:09Z'}
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Updating persisted data with {'last_ids': ['16BE7780-2F03-5C25-BC0F-ASDF1234ASDF']}
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Updating persisted data with {'last_run_datetime': '2023-01-10T10:36:04.999999Z'}
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Retrieved and sent 33 actiontrail event(s) to cloud.alibaba.actiontrail.events since 2023-01-05T00:00:00+00:00 to 2023-01-10T10:36:04.999999+00:00
INFO InputProcess::ActionTrailStandardPuller(alibaba,alibaba_1,actiontrail,predefined) -> Data collection completed. Elapsed time: 0.412 seconds. Waiting for 59.588 second(s) until the next one
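The end-of-window adjustment visible in the puller log can be sketched as follows. This is a minimal illustration and not the collector's actual code: the function name compute_end_datetime and the constant SAFETY_DELAY are hypothetical, and the 600-second delay is simply the value that appears in the log message.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the delay matches the "datetime.now() - 600 seconds" shown in the log.
SAFETY_DELAY = timedelta(seconds=600)


def compute_end_datetime(requested_end: datetime, now: datetime) -> datetime:
    """Cap the end of the pull window and align it to the end of the previous second."""
    latest_allowed = now - SAFETY_DELAY
    end = min(requested_end, latest_allowed)
    # Drop the sub-second part, then step back one microsecond so the window
    # ends at hh:mm:ss.999999 of the previous second.
    return end.replace(microsecond=0) - timedelta(microseconds=1)


# Values reconstructed from the example log lines.
now = datetime(2023, 1, 10, 10, 46, 5, 78809, tzinfo=timezone.utc)
requested = datetime(2023, 1, 10, 10, 46, 5, 63698, tzinfo=timezone.utc)
end = compute_end_datetime(requested, now)
print(end.isoformat())
```

With the timestamps from the example log, the sketch yields the same end datetime that the puller reports before retrieving events.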

After a successful collector execution (that is, no error logs were found), you will see a log message similar to the following:

INFO ThreatQuotientDataPuller(threatquotient_collector,threatquotient_data_puller#111,events#predefined) -> Statistics for this pull cycle (@devo_pulling_id=1655983326.290848): Number of requests performed: 2; Number of events received: 52; Number of duplicated events filtered out: 0; Number of events generated and sent: 52 (from 52 unflattened events); Average of events per second: 92.99414315733.

The value @devo_pulling_id is injected in each event to group all events ingested by the same pull action. You can use it to get the exact events downloaded in that Pull action in Devo’s search window.

This collector uses persistent storage to download events in an orderly fashion and avoid duplicates. In case you want to re-ingest historical data or recreate the persistence, you can restart the persistence of this collector by following these steps:

  1. Edit the configuration file.

  2. Change the value of the initial_start_time_in_utc parameter to a different one.

  3. Save the changes.

  4. Restart the collector.

The collector will detect this change and will restart the persistence using the parameters of the configuration file or the default configuration in case it has not been provided.
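The reset behavior described above can be pictured with a short sketch. It is an assumption-laden illustration rather than the collector's implementation: the function name reconcile_state is hypothetical, and the state is modeled as a plain dictionary holding the three persisted keys that appear in the puller log output.

```python
from typing import Any, Dict


def reconcile_state(configured_start: str, state: Dict[str, Any]) -> Dict[str, Any]:
    """Return the state for the next pull, resetting it if the start time changed."""
    if state.get("initial_start_time_in_utc") == configured_start:
        # Nothing changed: keep collecting from the persisted position.
        return state
    # The configured start time differs from the persisted one: start over from it.
    return {
        "initial_start_time_in_utc": configured_start,
        "last_run_datetime": configured_start,
        "last_ids": [],
    }


persisted = {
    "initial_start_time_in_utc": "2023-01-01T00:00:00Z",
    "last_run_datetime": "2023-01-04T12:00:00Z",
    "last_ids": ["abc"],
}
new_state = reconcile_state("2023-01-05T00:00:00Z", persisted)
print(new_state)
```

Passing the same start time that is already persisted returns the state untouched, which is why the value must actually change for the reset to take effect.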

The actiontrail_log_service service uses the standard Log Service API to fetch all ActionTrail events stored in a given Log Store. Note that all logs in the Log Store are fetched. Log Store events include two additional fields in the event output that describe where the event came from (origin and module).

Log Service is a data logging service that supports the collection, consumption, shipping, search, and analysis of logs. After setting it up, it is able to store data longer than 90 days.

Devo categorization and destination

All events of this service are ingested into the table cloud.alibaba.log_service.events

Deduplication and Persistence Strategy

The state is composed of these parameters:

  • initial_start_time_in_utc: The time specified in the config file, from which we want to start fetching events.

  • last_run_datetime: The date of the newest event(s) retrieved so far.

  • last_ids: An array of the identifiers of the event(s) whose date is exactly last_run_datetime.

To prevent sending duplicate events to Devo, while iterating over pages, the identifiers of those events with exactly the same date as the newest event are saved in the state. The date of the newest event is also saved in the state. On the next page, if any of the events have the same date, we check if the event is present in the identifier’s array. If the identifier is already in the array, the event is filtered out. After processing a page, the state is updated with the new values.
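The deduplication strategy described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the collector's source code: the State class, the process_page function, and the event keys "id" and "time" are hypothetical names, and events are assumed to arrive in ascending time order.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, List, Optional


@dataclass
class State:
    last_run_datetime: Optional[datetime] = None
    last_ids: List[str] = field(default_factory=list)


def process_page(page: List[Dict[str, Any]], state: State) -> List[Dict[str, Any]]:
    """Filter already-sent events out of a page and update the state in place."""
    fresh = [
        e for e in page
        if not (e["time"] == state.last_run_datetime and e["id"] in state.last_ids)
    ]
    if fresh:
        newest = max(e["time"] for e in fresh)
        ids_at_newest = [e["id"] for e in fresh if e["time"] == newest]
        if state.last_run_datetime is None or newest > state.last_run_datetime:
            # A newer date was seen: remember only the identifiers at that date.
            state.last_run_datetime = newest
            state.last_ids = ids_at_newest
        elif newest == state.last_run_datetime:
            # Same date as before: add the new identifiers to the saved ones.
            state.last_ids.extend(ids_at_newest)
    return fresh


t1, t2, t3 = (datetime(2023, 1, 10, 9, 55, s) for s in (7, 8, 9))
state = State()
page_1 = [{"id": "a", "time": t1}, {"id": "b", "time": t2}, {"id": "c", "time": t2}]
page_2 = [{"id": "c", "time": t2}, {"id": "d", "time": t2}, {"id": "e", "time": t3}]
sent_1 = process_page(page_1, state)
sent_2 = process_page(page_2, state)
print([e["id"] for e in sent_2])
```

In this example, event "c" appears on both pages with the same date, so it is filtered out of the second page, while "d" shares that date but has an unseen identifier and is kept.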

Collector operations

This section is intended to explain how to proceed with specific operations of this collector.

Initialization

The initialization module is in charge of setting up and running the input (pulling logic) and output (delivering logic) services and validating the given configuration.

A successful run has the following output messages for the initializer module:

Events delivery and Devo ingestion

The event delivery module is in charge of receiving the events from the internal queues where all events are injected by the pullers and delivering them using the selected compatible delivery method.

A successful run has the following output messages for the initializer module:

Sender services

The Integrations Factory Collector SDK has three different sender services, depending on the type of event to deliver (internal, standard, and lookup). This collector uses the following sender services:

Sender services

Description

internal_senders

In charge of delivering internal metrics to Devo such as logging traces or metrics.

standard_senders

In charge of delivering pulled events to Devo.

Sender statistics

Each service displays its own performance statistics that allow checking how many events have been delivered to Devo by type:

Logging trace

Description

Number of available senders: 1

Displays the number of concurrent senders available for the given Sender Service.

sender manager internal queue size: 0

Displays the items available in the internal sender queue.

Total number of messages sent: 44, messages sent since "2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)

Displays the number of events sent since the last checkpoint. Following the given example, the following conclusions can be drawn:

  • 44 events were sent to Devo since the collector started.

  • The last checkpoint timestamp was 2022-06-28 10:39:22.511671+00:00.

  • 21 events were sent to Devo between the last UTC checkpoint and now.

  • Those 21 events required 0.007 seconds to be delivered.
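If you want to extract these figures from the collector logs programmatically, a small parser along these lines may help. It is a sketch that assumes the trace keeps exactly the format shown in the example above; the names STATS_PATTERN and parse_sender_stats are hypothetical and not part of the collector.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

# Assumption: the trace format matches the example line shown in this section.
STATS_PATTERN = re.compile(
    r'Total number of messages sent: (?P<total>\d+), '
    r'messages sent since "(?P<since>[^"]+)": (?P<recent>\d+) '
    r'\(elapsed (?P<elapsed>[\d.]+) seconds\)'
)


def parse_sender_stats(line: str) -> Optional[Dict[str, Any]]:
    """Extract the counters from a sender statistics trace, or None if it does not match."""
    match = STATS_PATTERN.search(line)
    if match is None:
        return None
    return {
        "total": int(match["total"]),
        "since": datetime.fromisoformat(match["since"]),
        "recent": int(match["recent"]),
        "elapsed_seconds": float(match["elapsed"]),
    }


line = (
    'Total number of messages sent: 44, messages sent since '
    '"2022-06-28 10:39:22.511671+00:00": 21 (elapsed 0.007 seconds)'
)
stats = parse_sender_stats(line)
print(stats)
```

The returned dictionary holds the total since startup, the checkpoint timestamp, the count since that checkpoint, and the elapsed delivery time, matching the four conclusions listed above.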

To check the memory usage of this collector, look for the following log records in the collector which are displayed every 5 minutes by default, always after running the memory-free process.

  • The used memory is displayed by running processes and the sum of both values will give the total used memory for the collector.

  • The global pressure of the available memory is displayed in the global value.

  • All metrics (Global, RSS, VMS) show the value before and after freeing memory, in the form previous -> after.

Change log

Release

Released on

Release type

Details

Recommendations

v1.1.0

Jun 1, 2023

IMPROVEMENTS

Improvements:

  • Improved log retrieval speed by moving from time-based pagination to cursor/shard-based pagination.

  • Updated DCSDK from 1.5.1 to 1.7.2

    • Added a lock to enhance sender object

    • Added new class attrs to the setstate and getstate queue methods

    • Fix sending attribute value to the setstate and getstate queue methods

    • Added log traces when queues are full and have to wait

    • Added log traces of queues time waiting every minute in debug mode

    • Added method to calculate queue size in bytes

    • Block incoming events in queues when there is no space left

    • Send telemetry events to Devo platform

    • Upgraded internal Python dependency Redis to v4.5.4

    • Upgraded internal Python dependency DevoSDK to v5.1.3

    • Fixed obfuscation not working when messages are sent from templates

    • New method to figure out if a puller thread is stopping

    • Upgraded internal Python dependency DevoSDK to v5.0.6

    • Improved logging on messages/bytes sent to Devo platform

    • Fixed wrong bytes size calculation for queues

    • New functionality to count bytes sent to Devo Platform (shown in console log)

    • Upgraded internal Python dependency DevoSDK to v5.0.4

    • Fixed bug in persistence management process, related to persistence reset

    • Aligned source code typing with Python 3.9.x

    • Inject environment property from user config

    • Obfuscation service can now be configured from user config and module definition

    • Obfuscation service can now obfuscate items inside arrays

Recommended version

v1.0.0

Nov 30, 2022

NEW FEATURE

New features:

  • Ingestion of ActionTrail events. ActionTrail is a service that monitors and records the actions of your Alibaba Cloud account, including access to and use of Alibaba Cloud services through the Alibaba Cloud Management Console, API operations, or SDKs. These are the services provided by this collector to read ActionTrail events:

  • actiontrail is a service to fetch all the ActionTrail events from a single account, using the standard API. Events from the last 90 days can be pulled using this method. If longer retention is required, use the Log Service-based service described next.

  • actiontrail_log_service is a service to fetch all the ActionTrail events stored in a Log Store. It is recommended to use this service for higher volumes.

 
