
baselineScorerV2

Scores the events in the given table on the specified groupByField and metricField, in the same way as the original BaselineScorer, but with additional adjustable parameters. The higher the score, the more abnormal the event.

Operator usage in easy mode

  1. Click + on the parent node.

  2. Enter the Baseline Scorer operator in the search field and select the operator from the Results to open the operator form.

  3. In the Input Table drop-down, enter or select the table containing the data to run this operator on.

  4. In the Group By Field, enter the name of the column by which to group the rows.

  5. In the Metric Field, enter the column name that contains the metric to be used for scoring.

  6. In the Baseline Table drop-down, enter or select the name of the baseline table.

  7. In the Not In Baseline Score field, enter the score for items not in the baseline. Default is 8.

  8. In the Not Enough Examples Score field, enter the score for items that have fewer examples than the notEnoughExamplesThreshold. Default is 6.

  9. In the Not Enough Examples Threshold field, enter the threshold for similar items. Default is 1.

  10. In the Max Std Dev field, enter the maximum value for standard deviation. Default is 10.0.

  11. In the Std Dev Multiplier field, enter the multiplier for standard deviation. Default is 1.0.

  12. In the Min Ratio Between Std Dev And Avg field, enter the minimum ratio between the standard deviation and the average (mean). Default is 0.3.

  13. Click Run to view the result.

  14. Click Save to add the operator to the playbook.

  15. Click Cancel to discard the operator form.

Usage details

LQL Command

baselineScorerV2(inputTable, groupByField, metricField, baselineTable, notInBaselineScore, notEnoughExamplesScore, notEnoughExamplesThreshold, maxStdDev, stdDevMultiplier, minRatioBetweenStdDevAndAvg)

Input
inputTable (TableReference) - The table containing the data to run this operator on.
groupByField (ColumnReference) - The column by which to group the rows
metricField (ColumnReference) - The column that contains the metric to be used for scoring
baselineTable (TableReference) - The name of the baseline table
notInBaselineScore (Long) - Score for items not in the baseline. Default is 8
notEnoughExamplesScore (Long) - Score for items that have fewer examples than the notEnoughExamplesThreshold. Default is 6
notEnoughExamplesThreshold (Long) - Threshold for similar items. Default is 1
maxStdDev (Double) - Maximum value for standard deviation. Default is 10.0
stdDevMultiplier (Double) - Multiplier for standard deviation. Default is 1.0
minRatioBetweenStdDevAndAvg (Double) - Minimum ratio between the standard deviation and the average (mean). Default is 0.3

Output
The input table with an additional lhub_score column containing the score.
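
The two fallback parameters described above can be illustrated with a minimal Python sketch. It is not the operator's implementation. The function name fallback_score is hypothetical, and the strict "fewer than" comparison against the threshold is an assumption, because this page does not state the exact boundary. The sketch returns None when the baseline has enough examples, since the statistical scoring formula is not documented here.

```python
from typing import Optional


def fallback_score(
    baseline_count: int,
    not_in_baseline_score: int = 8,
    not_enough_examples_score: int = 6,
    not_enough_examples_threshold: int = 1,
) -> Optional[int]:
    """Return a fixed fallback score, or None if the baseline has enough examples.

    baseline_count is the number of baseline rows found for the item's group.
    Hypothetical helper: the real operator's boundary conditions may differ.
    """
    if baseline_count == 0:
        # The group does not appear in the baseline table at all.
        return not_in_baseline_score
    if baseline_count < not_enough_examples_threshold:
        # Assumption: "not enough" means strictly fewer than the threshold.
        return not_enough_examples_score
    # Enough examples: statistical scoring would apply (not sketched here).
    return None
```

Under these assumptions, a group with no baseline rows receives 8 by default, and raising notEnoughExamplesThreshold to 3 would give a group with only two baseline rows the score 6.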

Example

Input
tableA:

id   user    download_count
x1   emil    12
x2   emil    22
x3   monica  32
x4   monica  35

tableB:

id   user    download_count
v1   monica  25
v2   emil    15
v3   emil    50

LQL Command

baselineScorerV2(tableB, "user", "download_count", tableA, 8, 6, 1, 10.0, 1.0, 0.3)

Output

id   user    download_count  lhub_score  lh_baseline  lhub_confidence_score
v3   emil    50              4           12,22        4
v1   monica  25              0           32,35        4
v2   emil    15              0           12,22        4
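
The lh_baseline values in this output match the download_count values that tableA holds for each user. The following Python sketch reproduces only that grouping step, using the example data above. The helper name build_baseline is hypothetical, and the sketch does not attempt to compute lhub_score or lhub_confidence_score, whose formulas are not given on this page.

```python
from collections import defaultdict

# Example data copied from tableA (baseline) and tableB (input) above.
table_a = [
    {"id": "x1", "user": "emil", "download_count": 12},
    {"id": "x2", "user": "emil", "download_count": 22},
    {"id": "x3", "user": "monica", "download_count": 32},
    {"id": "x4", "user": "monica", "download_count": 35},
]
table_b = [
    {"id": "v1", "user": "monica", "download_count": 25},
    {"id": "v2", "user": "emil", "download_count": 15},
    {"id": "v3", "user": "emil", "download_count": 50},
]


def build_baseline(rows, group_by, metric):
    """Collect the metric values of the baseline rows per group (hypothetical helper)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_by]].append(row[metric])
    return dict(groups)


baseline = build_baseline(table_a, "user", "download_count")

# Attach the comma-joined baseline values to each input row,
# mirroring how lh_baseline appears in the example output.
annotated = [
    {**row, "lh_baseline": ",".join(str(v) for v in baseline.get(row["user"], []))}
    for row in table_b
]

for row in annotated:
    print(row["id"], row["user"], row["download_count"], row["lh_baseline"])
```

A user that is missing from the baseline table would get an empty lh_baseline string in this sketch; how the operator itself fills that column in such a case is not shown in the example.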