Regular expression, regexp (re)

Description

Builds a regular expression from the given string. These regular expressions are based on a specific language that establishes patterns to help you match, locate and manage text. They can be used for several purposes, such as dividing a string into capturing groups.

You can use the regular expressions generated using this operation in the Substitute (subs) and Substitute all (subsall) operations.
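
As a rough illustration of how a compiled pattern feeds a substitution, here is a minimal Java sketch. It assumes that Substitute replaces only the first match and Substitute all replaces every match, as their names suggest. The class and method names are hypothetical and are not part of Devo.

```java
import java.util.regex.Pattern;

class SubstituteSketch {
    // Compiled once and reused, much like a regexp value passed to another operation.
    static final Pattern DIGITS = Pattern.compile("[0-9]+");

    // Replaces only the first match (assumed analogue of Substitute).
    static String first(String text, String replacement) {
        return DIGITS.matcher(text).replaceFirst(replacement);
    }

    // Replaces every match (assumed analogue of Substitute all).
    static String all(String text, String replacement) {
        return DIGITS.matcher(text).replaceAll(replacement);
    }

    public static void main(String[] args) {
        System.out.println(first("id 23456 and 34567", "#")); // id # and 34567
        System.out.println(all("id 23456 and 34567", "#"));   // id # and #
    }
}
```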

This operation uses Java regular expression syntax. See the Java documentation to learn more about that syntax and construct your own regular expressions, or a general introduction to regular expressions for a broader overview of the concept and its uses.

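Take care when using strings containing the \ escape character. To match one literal \ in the text, you must write \\\\ (4 backslashes) in the pattern string. This is because the Java compiler needs \\ to produce a single \, and the regex engine also needs \\ to match a single \. When the backslash is followed by a double quote, the quote needs its own escaping backslash, resulting in a total of \\\\\ (5) before the quote.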
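
The two escaping layers can be checked in plain Java. This is a minimal sketch that assumes Devo unescapes the pattern string the same way a Java string literal is unescaped. The class name is hypothetical.

```java
import java.util.regex.Pattern;

class BackslashLayers {
    // Returns true if the text contains at least one literal backslash.
    static boolean containsBackslash(String text) {
        // The source literal "\\\\" is stored as the two-character string \\,
        // which the regex engine reads as "match one literal backslash".
        return Pattern.compile("\\\\").matcher(text).find();
    }

    public static void main(String[] args) {
        String regexSource = "\\\\";
        System.out.println(regexSource.length());       // 2 characters reach the regex engine
        System.out.println(containsBackslash("a\\b"));  // true: the text a\b holds one backslash
        System.out.println(containsBackslash("ab"));    // false
    }
}
```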

Given messages like these already ingested in Devo:

{\"request\":{\"Id\":23456,\"Email\":\"marketing@devo.com\",\"Company\":\"Devo\",\"Team\":\"Marketing\"}}

{\"request\":{\"Id\":34567,\"Email\":\"sales@devo.com\",\"Company\":\"Devo\",\"Team\":\"Sales\"}}

{\"request\":{\"Id\":12345,\"Email\":\"support@devo.com\",\"Company\":\"Devo\",\"Team\":\"Customer Support\"}}

To retrieve the email address value, you can use this code:

select peek(message, re("\\\\\"Email\\\\\":\\\\\"(.*?)\\\\\""),1) as email
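
To see why the pattern above captures the address, the same pattern string can be tried with java.util.regex. This sketch assumes that peek returns the capture group at the given index, much like Matcher.group does. The class and method names are hypothetical, and the message is an abbreviated copy of the first sample.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailPeek {
    // Same pattern text as in the query: \\\\ matches one literal backslash, \" is a quote.
    static final Pattern EMAIL =
        Pattern.compile("\\\\\"Email\\\\\":\\\\\"(.*?)\\\\\"");

    // Returns capture group 1 of the first match, or null when nothing matches.
    static String extractEmail(String message) {
        Matcher m = EMAIL.matcher(message);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Each \\\" in this literal is a backslash followed by a quote in the message.
        String message = "{\\\"request\\\":{\\\"Id\\\":23456,\\\"Email\\\":\\\"marketing@devo.com\\\"}}";
        System.out.println(extractEmail(message)); // marketing@devo.com
    }
}
```

The lazy group (.*?) stops at the first backslash-quote pair after the value, so only the address is captured.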

How does it work in the search window?

Select Create column in the search window toolbar, then select the Regular expression, regexp operation. You need to specify one argument:

Argument             Data type
Pattern (mandatory)  string

The data type of the values in the new column is regexp.

Example

In the my.upload.sample.data table, we want to convert the strings in the regexStr column, which represent regular expressions, into the regexp data type so we can use them later as the argument of another operation. To do this, we will create a new column using the Regular expression, regexp operation.

The arguments needed to create the new column are:

  • Pattern - regexStr column

Click Create column and you will see the following result:

  • A column in regexp data type that contains a pattern typically used to isolate the different parts of an email address.
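
The exact pattern stored in regexStr is not shown here. As a minimal sketch of what isolating the parts of an email address with capturing groups can look like, the following Java code uses a hypothetical pattern of our own, not the one from the sample data. The class and method names are also hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailParts {
    // Hypothetical pattern: (local part)@(domain).(top-level domain)
    static final Pattern PARTS = Pattern.compile("^([^@]+)@([^@]+)\\.([^.@]+)$");

    // Returns {local part, domain, top-level domain}, or null if the input does not match.
    static String[] split(String email) {
        Matcher m = PARTS.matcher(email);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        System.out.println(String.join(" | ", split("support@devo.com"))); // support | devo | com
    }
}
```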

How does it work in LINQ?

Use the select... as... operator and add the operation syntax to create the new column. This is the syntax for the Regular expression, regexp operation:

  • re(string)

Example

You can copy the following LINQ script and try the previous example on the my.upload.sample.data table.

from my.upload.sample.data select split(message, ";", 9) as regexStr, re(regexStr) as regex