Approximated estimation (estimation)
Description
Computes the approximate estimate of the distinct count held in a value of dc data type.
Use the HyperLogLog++ (hllpp) aggregation operation to transform a field into the dc data type, which this operation requires as input. Keep in mind that you must group your data before applying an aggregation operation.
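To make the two-step idea concrete (first build a distinct count sketch, then turn it into a number), here is a minimal Python sketch of a plain HyperLogLog counter. It is only an illustration: the class name TinyHLL, the SHA-1 hash, and the precision p=12 are assumptions made for this example, and it omits the extra bias corrections of HyperLogLog++, so it should not be read as how hllpp or estimation are implemented in the platform.

```python
import hashlib
import math


class TinyHLL:
    """Toy HyperLogLog sketch (illustrative only, not the platform's hllpp)."""

    def __init__(self, p=12):
        self.p = p                      # number of bits used to pick a register
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m

    def add(self, value):
        # Take a 64-bit hash of the value (hash choice is an assumption).
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                   # top p bits select a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        # Turn the sketch into an approximate distinct count (a float).
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)           # standard constant for m >= 128
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction (linear counting).
            return m * math.log(m / zeros)
        return raw


sketch = TinyHLL()
for i in range(1000):
    ip = f"10.0.{i // 256}.{i % 256}"
    sketch.add(ip)
    sketch.add(ip)      # duplicates do not change the sketch

print(sketch.estimate())  # close to 1000, but not exact
```

In this analogy, the filled sketch plays the role of a dc value and estimate() plays the role of the Approximated estimation operation, which is why the result is a float rather than an integer.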
How does it work in the search window?
Select Create column in the search window toolbar, then select the Approximated estimation operation. You need to specify one argument:
Argument | Data type |
---|---|
Estimate (mandatory) | dc |
The data type of the values in the new column is float.
Example
In the demo.ecommerce.data table, we want to get the approximated estimation of the distinct count values generated from the clientIpAddress column. To do so, we will create a new column using the Approximated estimation operation, but first we need to create the required dc column.
Step 1: Create the distinct count column
The first step is creating a column showing the distinct count of the values in the clientIpAddress column, in dc data type. To do it, we group data every 5 minutes and then use the HyperLogLog++ (hllpp) aggregation operation. Select the clientIpAddress column in the Source argument and enter a name for the new column (clientIpAddress_dc).
Step 2: Create a new column using the Approximated estimation operation
Select Create column on the query toolbar, then select Approximated estimation as the operation. Select the clientIpAddress_dc column as the argument. Let's call the new column clientIpAddress_estimation.
Click Create column and you will see the following result:
How does it work in LINQ?
Use the operator select ... as ... and add the operation syntax to create the new column. This is the syntax for the Approximated estimation operation:
estimation(dc)
Example
You can copy the following LINQ script and try the above example on the demo.ecommerce.data table:
from demo.ecommerce.data
group every 5m
every 5m
select hllpp(clientIpAddress) as clientIpAddress_dc,
estimation(clientIpAddress_dc) as clientIpAddress_estimation
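As a rough reference for what the query above approximates, the following Python sketch computes the exact number of distinct client IP addresses per 5-minute bucket. The function name distinct_per_bucket, the (epoch seconds, IP) event shape, and the sample rows are all hypothetical and invented for illustration; they are not taken from the demo.ecommerce.data table.

```python
from collections import defaultdict

BUCKET_SECONDS = 5 * 60  # mirrors "group every 5m"


def distinct_per_bucket(events):
    """events: iterable of (epoch_seconds, client_ip) pairs (hypothetical shape)."""
    buckets = defaultdict(set)
    for ts, ip in events:
        start = ts - ts % BUCKET_SECONDS   # floor the timestamp to its bucket start
        buckets[start].add(ip)
    # Return floats to mirror the float type of the estimation column.
    return {start: float(len(ips)) for start, ips in sorted(buckets.items())}


sample = [
    (0, "10.0.0.1"),
    (30, "10.0.0.1"),    # repeated IP in the same bucket counts once
    (120, "10.0.0.2"),
    (310, "10.0.0.1"),   # falls in the next 5-minute bucket
]
print(distinct_per_bucket(sample))  # {0: 2.0, 300: 1.0}
```

The exact version keeps every distinct value in memory, whereas the hllpp and estimation pair trades a small error for a compact, fixed-size summary per group.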