
approximateLabelLookup

Score an event table by looking at similar events from another event table.

Example:
tableA is a table that has already been processed and scored.
tableB is an event table to be scored.

The two tables have some overlapping content (some similar events) but are not identical.

Instead of processing all of tableB, use the scores (or labels) that are already determined for the parts that are similar to tableA. With this method, only the portions of tableB that are not similar to tableA must be processed and scored. It is not necessary to repeat the processing for the similar portions of the tables.

Operator Usage in Easy Mode

  1. Click + on the parent node.
  2. Enter the Approximate Label Lookup operator in the search field and select the operator from the Results to open the operator form.
  3. In the Reference table drop-down, enter or select a reference table.
  4. Click Add More to add the list of field names in the reference table to measure similarity.
  5. In the Ref Label field, enter the label field name in the reference table.
  6. In the Score Table drop-down, enter or select a table to be scored.
  7. Click Add More to add the list of columns in the table to be scored, in the same order as the reference table fields.
  8. Click Run to view the result.
  9. Click Save to add the operator to the playbook.
  10. Click Cancel to discard the operator form.

Usage Details

``` {text}
approximateLabelLookup(tableA, listOfColumnsFromTableA, scoreColumn, tableB, listOfColumnsFromTableB)
```

**Input**  
`tableA`: reference (lookup) table  
`listOfColumnsFromTableA`: list of column names from `tableA` used as features to measure similarity, e.g. `["bytes_in","bytes_out"]`. Column values should be numeric.  
`scoreColumn`: name of the column in `tableA` that holds the score or label to look up  
`tableB`: event table to be scored (receives the approximate label)  
`listOfColumnsFromTableB`: list of column names from `tableB` used as features to measure similarity between `tableA` and `tableB`. The order of the column names matters: it should match the order in `listOfColumnsFromTableA`.

**Output**  
`tableB` with an additional column, `lhub_lookup_label`
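
The input requirements above (two column lists paired by position, numeric feature values) can be checked before calling the operator. The Python sketch below is illustrative only: `check_lookup_inputs` is a hypothetical helper, not part of LQL, and it assumes each table is available as a list of dictionaries.

```python
from numbers import Number


def check_lookup_inputs(ref_rows, ref_cols, rows, cols):
    """Hypothetical pre-check: return the positional (reference, scored) column pairs."""
    if len(ref_cols) != len(cols):
        raise ValueError("both column lists must have the same length")
    # Columns are matched by position, so the order of both lists matters.
    pairs = list(zip(ref_cols, cols))
    for table, names in ((ref_rows, ref_cols), (rows, cols)):
        for row in table:
            for name in names:
                value = row[name]
                # bool is a Number subclass in Python, so reject it explicitly.
                if isinstance(value, bool) or not isinstance(value, Number):
                    raise TypeError(f"column {name!r} must be numeric, got {value!r}")
    return pairs
```

For example, passing `["bytes_in", "bytes_out"]` and `["in", "out"]` pairs `bytes_in` with `in` and `bytes_out` with `out`; swapping the second list would silently compare the wrong columns, which is why the order must match.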

## Example

We want to score `tableB` by looking up similar events in `tableA`, where "similar" means:

`tableA.col1` is close to `tableB.col1`, and `tableA.col2` is close to `tableB.col2` (close in value, but not necessarily equal).

**tableA:**

<style>
  th {
    border: 1px solid #cccccc;
    background-color: #eeeeee;
    padding: 8px 5px 8px 5px;
    text-align: left
  }
</style>

<div><table>
<thead>
<tr>
  <th>id</th>
  <th>col1</th>
  <th>col2</th>
  <th>score</th>
</tr>
</thead>
<tbody>
  <tr><td>u1</td><td>11</td><td>12</td><td>1.0</td></tr>
  <tr><td>u2</td><td>21</td><td>22</td><td>5.0</td></tr>
  <tr><td>u3</td><td>31</td><td>32</td><td>10.0</td></tr>
</tbody>
</table></div>

**tableB:**


<div><table>
<thead>
<tr>
  <th>id</th>
  <th>col1</th>
  <th>col2</th>
</tr>
</thead>
<tbody>
  <tr><td>x1</td><td>11</td><td>11</td></tr>
  <tr><td>x2</td><td>20</td><td>20</td></tr>
  <tr><td>x3</td><td>50</td><td>50</td></tr>
</tbody>
</table></div>


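LQL command
``` {sql}
approximateLabelLookup(tableA, ["col1","col2"], "score", tableB, ["col1", "col2"])
```

Output

<div><table>
<thead>
<tr>
  <th>id</th>
  <th>col1</th>
  <th>col2</th>
  <th>lhub_lookup_label</th>
</tr>
</thead>
<tbody>
  <tr><td>x1</td><td>11</td><td>11</td><td>1.0</td></tr>
  <tr><td>x2</td><td>20</td><td>20</td><td>5.0</td></tr>
  <tr><td>x3</td><td>50</td><td>50</td><td>null</td></tr>
</tbody>
</table></div>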

"x3" is not scored because no similar event was found in tableA. "u3" is the closest event in tableA, but it is not within the 10% difference range.
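
The lookup in this example can be approximated with a short Python sketch. The exact similarity measure used by the operator is not documented here, so the sketch makes two assumptions: every paired column must be within 10% of the reference value (relative difference), and among qualifying reference rows the one with the smallest total relative difference wins. The function names and the `tolerance` parameter are hypothetical and not part of LQL.

```python
def relative_diff(value, ref_value):
    """Relative difference of value from ref_value (assumed similarity measure)."""
    if ref_value == 0:
        return 0.0 if value == 0 else float("inf")
    return abs(value - ref_value) / abs(ref_value)


def approximate_label_lookup(ref_rows, ref_cols, label_col, rows, cols, tolerance=0.10):
    """Copy the label of the closest reference row within tolerance, else None."""
    scored = []
    for row in rows:
        best_label, best_distance = None, float("inf")
        for ref in ref_rows:
            # Columns are paired by position: ref_cols[i] with cols[i].
            diffs = [relative_diff(row[c], ref[rc]) for rc, c in zip(ref_cols, cols)]
            distance = sum(diffs)
            # Assumption: every paired column must be within the tolerance.
            if max(diffs) <= tolerance and distance < best_distance:
                best_label, best_distance = ref[label_col], distance
        scored.append({**row, "lhub_lookup_label": best_label})
    return scored


table_a = [
    {"id": "u1", "col1": 11, "col2": 12, "score": 1.0},
    {"id": "u2", "col1": 21, "col2": 22, "score": 5.0},
    {"id": "u3", "col1": 31, "col2": 32, "score": 10.0},
]
table_b = [
    {"id": "x1", "col1": 11, "col2": 11},
    {"id": "x2", "col1": 20, "col2": 20},
    {"id": "x3", "col1": 50, "col2": 50},
]

result = approximate_label_lookup(
    table_a, ["col1", "col2"], "score", table_b, ["col1", "col2"]
)
for row in result:
    print(row["id"], row["lhub_lookup_label"])
```

Under these assumptions the sketch reproduces the output table: "x1" takes the label 1.0 from "u1", "x2" takes 5.0 from "u2", and "x3" stays unlabeled (`None`, shown as null above) because even "u3" differs from it by far more than 10%.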