Document toolboxDocument toolbox

Subqueries

Overview

Note that this feature is still not available in the Data Search area of the application. You can use it in your Activeboards and through the Devo API.

Devo provides a functionality that suppresses the boundaries between searches and removes certain limitations of a single search: subqueries. It consists of using the result of a query to act as a filter in another query, thus making it possible to use information from different sources in a single consultation.

For example, let's assume we have 2 different queries (query and subquery in the picture below) and we want to use the values in column 2 of the subquery to filter the data in the query. Matching the columns 2 in both queries, whose data belongs to the same type, we can filter the data in our query and get only those events that contain the values in the column 2 of the subquery.

How does it work in LINQ?

You have to insert the full subquery enclosed in brackets () and preceded by the clause where <field(s)> in, functioning as the second argument of a filtering operation in the main query. You have to get rid of any unrelated field in the subquery by grouping your data or creating the required columns. The remaining fields must be only those whose data type will match the fields indicated in the main query.

from <table1>  [clauses] where <field(s)> in (   from <table2>   [clauses]    where time expression < eventdate < time expression    group every - by / select <field(s) matching outer field(s)>) select <field(s)>

Be careful with the allocation of the brackets and always ensure to finish the clause, especially when the subquery is very long and contains a variety of operations.

Guidelines

There are some considerations we need to take into account when working with subqueries:

  • The data types of the resulting fields in the subquery and the fields indicated in the main query must match. You can do this by grouping your data or creating the required columns. For example, if we want to filter a string column in the main query, we need to use a subquery that results in a string column. However, the names of the columns in the subquery and the main query can be different.
    from <table1>
    where <column> in (
    from <table2>
    where now() - 5m < eventdate < now()
    group every - by <column>)
    select *

  • You can match the data of the main query and subquery through more than one field, as long as their data types match and are indicated in the proper order (the first one matches the first one, and so on).
    from <table1>
    where (<column1, column2>) in (
    from <table2>
    where now() - 5m < eventdate < now()
    group every - by <column1, column2>)
    select *

  • The subquery may contain additional fields, as long as they are added after the ones that match the main query.
    from <table1>
    where <column1> in (
    from <table2>
    where now() - 5m < eventdate < now()
    group every - by <column1, column2, column3>)
    select *

  • You must add the fields you want to see in the final query results using the operator select. Include select * to add all the columns in the table, or specify the required columns after the select operator.

  • The subquery may come from the same table as the main query, but also from a different one.

Time range selection

In case you want to explicitly specify a time range for your subquery data, you must include it using the following syntax:

     where time expression < eventdate < time expression.

Following this format, you can specify either times relative to now or absolute dates.

This is an example with relative dates:

from demo.ecommerce.data where statusCode in ( from demo.ecommerce.data where statusCode = "404" where now()- 5m < eventdate < now() group every - by statusCode) select method, statusCode, eventdate

And the following is an example with absolute dates. As you can see, it is necessary to use the Parse date (parsedate) operation to transform a given string representing a date into a timestamp. Check the operation article for more details and valid formats.

from siem.logtrust.web.activity where userid in( from siem.logtrust.web.error where not userid = "" where parsedate("21/04/2020 21:00","DD/MM/YYYY HH:mm") < eventdate < parsedate("21/04/2020 21:40","DD/MM/YYYY HH:mm") group every - by userid) select *

It is possible to specify a time range for the subquery different than the one specified for the main query. Also, note that If you do NOT include a time range in your subquery. it will default to the same time range as the main query.

Including a temporal grouping in your subquery

Note that if you include a temporal grouping in your subquery (for example, group every 30m by <column1>) you may not get any data. This is because a temporal grouping will return an eventdate prior to the time specified (in the example before, 30m). That will be compared to the main query time range and will likely return no results.

For this reason, it is recommended to add no-time based groupings (group every -) to your subqueries.

Query examples

Check below examples of subqueries that cover the different cases explained in the guidelines above. After each query, there is a capture that shows the result of adding them as source queries in an Activeboard widget. Go here to learn more about Activeboards and how to use them.

Subquery from the same table as the main query

Subquery from a table different than the one in the main query

Subquery using two columns to filter

Subquery that includes more fields than the main query

This example works because the first column in the subquery list (userid) matches the type of the column indicated in the main query.

Subquery with the column name different than the one of the main query

Checking that the subquery results are contained in one of the main query fields

All the subquery examples above are used to filter the main query results in order to get only the events that match the subquery data. However, you can also check that your query results contain the resulting values of your subquery. In the example below, we uploaded this CSV file to our Devo domain, which contains several URI parts, in order to get those main query events that contain them in the uri column. To do it, you must use the following syntax:

Subquery using a select clause to expose a field

Expose data from the subquery to the main query

You can correlate specific column values of your subquery with the ones in your main query and show them as a list in a new column.

To do it, you must encapsulate the subquery between another set of parentheses and add a name to it (inner in the example below). Then, you must reference the required subquery fields that we are going to use to perform the match between brackets [ ], preceded by the name we gave to the subquery.

Check below a query example that references the field userid of the subquery (named inner), which is the field used to perform the match (in red). The final query will show the userid column of the main query plus the inner column, which will show the corresponding values of the columns in our subquery separated by commas (in this case, a count and the average of the values in the srcPort column, marked in green in the example).

Dealing with case sensitivity

Subqueries are case sensitive, that is to say, the fields you use to match the main query and the subquery must have the same case.

In case you are working with strings and you're not sure if the fields you use for the match have the same case, you can use the Upper case (upper) or Lower case (lower) operations to transform them to the same type (uppercase or lowercase). In the example below, we transformed the uri field in both queries to lowercase using the Lower case (lower) operation: