...
There are many cases where we have a handful (10 to 2,000) of past text examples and want to check whether new text is close to these saved examples. Machine learning techniques such as classification are not appropriate because there is not enough data to train an accurate model.
This operator builds a separate model for each group in the dataset, so that during matching the best matches are drawn from the same group.
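As a rough illustration of what "a model for each group" can look like, the sketch below builds one set of TF-IDF term weights per group in plain Python. It is not the operator's implementation: the function names (`tokenize`, `build_corpus_per_group`), the column names `team` and `summary`, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for this sketch.
    return text.lower().split()


def build_corpus_per_group(rows, group_col, text_col):
    """Return {group: {"idf": {term: weight}, "n_docs": int}} (hypothetical layout)."""
    docs_by_group = defaultdict(list)
    for row in rows:
        docs_by_group[row[group_col]].append(tokenize(row[text_col]))

    models = {}
    for group, docs in docs_by_group.items():
        n = len(docs)
        # Document frequency: how many documents in this group contain each term.
        df = Counter(term for doc in docs for term in set(doc))
        # Smoothed IDF (an assumed formula): terms found in every document
        # of the group still keep a positive weight.
        idf = {term: math.log((1 + n) / (1 + count)) + 1.0
               for term, count in df.items()}
        models[group] = {"idf": idf, "n_docs": n}
    return models


# Hypothetical input table with a grouping column and a text column.
rows = [
    {"team": "network", "summary": "firewall blocked outbound traffic"},
    {"team": "network", "summary": "firewall rule updated"},
    {"team": "email", "summary": "phishing email reported"},
]
models = build_corpus_per_group(rows, group_col="team", text_col="summary")
```

Because each group has its own vocabulary and weights, a term that is common in one group (here "firewall" in `network`) is down-weighted only within that group and does not appear in the other group's model at all.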
Operator Usage in Easy Mode
- Click + on the parent node.
- Enter the Build Term Corpus Per Group operator in the search field and select the operator from the Results to open the operator form.
- In the Table drop-down, enter or select the table from which to create the model.
- In the Model Name field, enter a name for the model.
- In the Grouping Column drop-down, enter or select a column.
- In the Column drop-down, enter or select the column that contains the text from which to extract TF-IDF features.
- In the Columns to Keep drop-down, select a column or a list of columns to keep. The drop-down shows the columns of the table selected in the Table drop-down.
- Optional. Click Add More to add values for the minimum and maximum TF parameters.
- Click Run to view the result.
- Click Save to add the operator to the playbook.
- Click Cancel to discard the operator form.
Usage Details
We then want to match an incoming stream of events against these saved examples to determine how close each event is to what we've already observed, and to retrieve some enrichment about the matching past examples.
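The matching step can be sketched as follows: an incoming event is compared only with saved examples from its own group, scored by cosine similarity over TF-IDF vectors, and the kept columns of the closest example are returned as enrichment. This is a minimal sketch under stated assumptions, not the operator's actual algorithm; `best_match`, the column names, the sample rows, and the smoothed IDF formula are hypothetical.

```python
import math
from collections import Counter


def tfidf(tokens, idf):
    # Terms not present in the group's saved examples are dropped.
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(event, saved, group_col, text_col, keep_cols):
    """Return the score and kept columns of the closest same-group example."""
    same_group = [r for r in saved if r[group_col] == event[group_col]]
    if not same_group:
        return None  # no saved examples for this group
    docs = [r[text_col].lower().split() for r in same_group]
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    # Assumed smoothed IDF, computed from this group's examples only.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    query = tfidf(event[text_col].lower().split(), idf)
    scored = [(cosine(query, tfidf(d, idf)), r) for d, r in zip(docs, same_group)]
    score, row = max(scored, key=lambda pair: pair[0])
    return {"score": score, **{c: row[c] for c in keep_cols}}


# Hypothetical saved examples; "resolution" plays the role of a kept column.
saved = [
    {"team": "network", "summary": "firewall blocked outbound traffic",
     "resolution": "allowlist the destination"},
    {"team": "network", "summary": "vpn tunnel dropped",
     "resolution": "restart the vpn gateway"},
    {"team": "email", "summary": "firewall blocked outbound traffic to mail relay",
     "resolution": "contact mail admin"},
]
event = {"team": "network", "summary": "outbound traffic blocked by firewall"}
match = best_match(event, saved, "team", "summary", ["resolution"])
```

Here the `email` example is never considered, even though its text overlaps with the event, because the event belongs to the `network` group.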
...