Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Table of Contents
minLevel1
maxLevel6
outlinefalse
typeflat
printablefalse

There are many cases where we have a handful (10 - 2000) past examples of text data and we want to see if new text is close to these saved examples. Machine learning techniques like classification are not appropriate because we don’t have enough data to train an accurate model.
This operator builds a model for each group in the dataset so during matching we find best matches from the same group.

...

Example

Input
table = github_logs

server

corpus

label

domain

machine1

a b c, d e f, g, h, i, j

x

google

machine1

aa b, c, d ee, ff, gg, hh, i, jj

y

facebook

machine2

k, l, m, n, o, p, q

z

apple

LQL command

Code Block
buildModelFromCorpus(inputTable, "corpusModel", "server", "corpus", ["label", "domain"])
// table = inputTable
// text to train model = corpus
// columns to keep so they will be added after match is found = label and domain
// minDF and minTF are default

Output

RESULT

'Successfully created model and stored into <> file'