Skip to end of metadata
Go to start of metadata

You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 8 Next »

Overview

Note that this feature is still not available in the Data Search area of the application. You can use it in your Activeboards and through the Devo Query API.

Devo provides a functionality that suppresses the boundaries between searches and removes certain limitations of a single search: subqueries. It consists of using the result of a query to act as a filter in another query, thus making it possible to use information from different sources in a single consultation.

For example, let's assume we have 2 different queries (query and subquery in the picture below) and we want to use the values in field 2 of the subquery to filter the data in the query. Matching the fields 2 in both queries, whose data belongs to the same type, we can filter the data in our query and get only those events that contain the values in the field 2 of the subquery.

How does it work in LINQ?

You have to insert the full subquery enclosed in brackets () and preceded by the clause where <field(s)> in, functioning as the second argument of a filtering operation in the main query. You have to get rid of any unrelated field in the subquery by grouping your data or creating the required fields. The remaining fields must be only those whose data type will match the fields indicated in the main query.

from <table1> 
[clauses]
 where <field(s)> in (
  from <table2> 
  [clauses] 
  where time expression < eventdate < time expression 
  group every - by / select <field(s) matching outer field(s)>)
select <field(s)>

Be careful with the allocation of the brackets and always ensure to finish the clause, especially when the subquery is very long and contains a variety of operations.

Guidelines

There are some considerations we need to take into account when working with subqueries:

  • The data types of the resulting fields in the subquery and the fields indicated in the main query must match. You can do this by grouping your data or creating the required fields. For example, if we want to filter a string field in the main query, we need to use a subquery that results in a string field. However, the names of the fields in the subquery and the main query can be different.
    from <table1>
    where <field> in (
    from <table2>
    where now() - 5m < eventdate < now()
    group every - by <field>)
    select *

  • You can match the data of the main query and subquery through more than one field, as long as their data types match and are indicated in the proper order (the first one matches the first one, and so on).
    from <table1>
    where (<field1, field2>) in (
    from <table2>
    where now() - 5m < eventdate < now()
    group every - by <field1, field2>)
    select *

  • The subquery may contain additional fields, as long as they are added after the ones that match the main query.
    from <table1>
    where <field1> in (
    from <table2>
    where now() - 5m < eventdate < now()
    group every - by <field1, field2, field3>)
    select *

  • You must add the fields you want to see in the final query results using the operator select. Include select * to add all the fields in the table, or specify the required fields after the select operator.

  • The subquery may come from the same table as the main query, but also from a different one.

Time range selection

In case you want to explicitly specify a time range for your subquery data, you must include it using the following syntax:

     where time expression < eventdate < time expression.

Following this format, you can specify either times relative to now or absolute dates.

This is an example with relative dates:

from demo.ecommerce.data
where statusCode in (
from demo.ecommerce.data 
where statusCode = "404" 
where now()- 5m < eventdate < now()
group every - by statusCode) 
select method, statusCode, eventdate

And the following is an example with absolute dates. As you can see, it is necessary to use the Parse date (parsedate) operation to transform a given string representing a date into a timestamp. Check the operation article for more details and valid formats.

from siem.logtrust.web.activity
where userid in(
from siem.logtrust.web.error
where not userid = ""
where parsedate("21/04/2020 21:00","DD/MM/YYYY HH:mm") < eventdate < parsedate("21/04/2020 21:40","DD/MM/YYYY HH:mm")
group every - by userid)
select *

It is possible to specify a time range for the subquery different than the one specified for the main query. Also, note that If you do NOT include a time range in your subquery, it will default to the same time range as the main query.

Including a temporal grouping in your subquery

Note that if you include a temporal grouping in your subquery (for example, group every 30m by <field1>) you may not get any data. This is because a temporal grouping will return an eventdate prior to the time specified (in the example before, 30m). That will be compared to the main query time range and will likely return no results.

For this reason, it is recommended to add no-time based groupings (group every -) to your subqueries.

Query examples

Check below examples of subqueries that cover the different cases explained in the guidelines above. After each query, there is a capture that shows the result of adding them as source queries in an Activeboard widget. Go here to learn more about Activeboards and how to use them.

 Subquery from the same table as the main query
from demo.ecommerce.data
where statusCode in (
from demo.ecommerce.data 
where statusCode = "404" 
where now()- 5m < eventdate < now()
group every - by statusCode) 
select method, statusCode, eventdate
 Subquery from a table different than the one in the main query
from demo.ecommerce.data
select mmcountry(clientIpAddress) as Country
where not Country in (
from siem.logtrust.web.activity
where now() - 12h < eventdate < now()
group every - by country)
where isnotnull(Country)
select bytesTransferred, timeTaken
 Subquery using two columns to filter
from siem.logtrust.web.activity 
where (username, domain) in (
from siem.logtrust.web.navigation 
where origin = "redada" 
where now()-2h <eventdate<now() 
group every - by userEmail, domain) 
group every 5m by username, domain 
select count()
 Subquery that includes more fields than the main query

This example works because the first field in the subquery list (userid) matches the type of the field indicated in the main query.

from siem.logtrust.web.activity
where userid in (
from siem.logtrust.web.navigation
where not userid = ""
where now() -1h < eventdate <now()
group every - by userid, domain)
select *
 Subquery with the column name different than the one of the main query
from siem.logtrust.web.activity
where username in (
from siem.logtrust.web.navigation 
where now()-1d <eventdate<now() 
select userEmail) 
select eventdate, username
 Checking that the subquery results are contained in one of the main query fields

All the subquery examples above are used to filter the main query results in order to get only the events that match the subquery data. However, you can also check that your query results contain the resulting values of your subquery. In the example below, we uploaded the CSV below to our Devo domain, which contains several URI parts, in order to get those main query events that contain them in the uri field.

To do it, you must use the following syntax:

from demo.ecommerce.data
where set ((
from my.upload.uri.parts
select message)) in uri
select *
 Subquery using a select clause to expose a field
from siem.logtrust.web.activity
select ((
from siem.logtrust.web.navigation
group every - by userEmail
select count()) as inner)
select inner[username] as nav
group by username, nav

Expose data from the subquery to the main query

You can correlate specific field values of your subquery with the ones in your main query and show them as a list in a new field.

To do it, you must encapsulate the subquery between another set of parentheses and add a name to it (inner in the example below). Then, you must reference the required subquery fields that we are going to use to perform the match between brackets [ ], preceded by the name we gave to the subquery.

Check below a query example that references the field userid of the subquery (named inner), which is the field used to perform the match (in red). The final query will show the userid field of the main query plus the inner field, which will show the corresponding values of the fields in our subquery separated by commas (in this case, a count and the average of the values in the srcPort field, marked in green in the example).

from siem.logtrust.web.activity
where userid in((
from siem.logtrust.web.navigation
where not userid = ""
where now()-1h <eventdate<now()
group every - by userid
select count()
select avg(srcPort)) as inner)
select userid, inner[userid]

Dealing with case sensitivity

Subqueries are case-sensitive, that is to say, the fields you use to match the main query and the subquery must have the same case.

In case you are working with strings and you're not sure if the fields you use for the match have the same case, you can use the Upper case (upper) or Lower case (lower) operations to transform them to the same type (uppercase or lowercase). In the example below, we transformed the uri field in both queries to lowercase using the Lower case (lower) operation:

from demo.ecommerce.data
where lower(uri) in (
from demo.ecommerce.data
where now()- 5m < eventdate < now()
group every - by lower(uri))
select *
  • No labels