Regular expression, regexp (re)
Description
Builds a regular expression from the given string. These regular expressions are based on a specific language that establishes patterns to help you match, locate and manage text. They can be used for several purposes, such as dividing a string into capturing groups.
You can use the regular expressions generated using this operation in the Substitute (subs) and Substitute all (subsall) operations.
The syntax used by Devo for this operation is Java syntax. To construct your own regular expressions, refer to the Java documentation on regular expression patterns. For a broader overview of the concept and uses of regular expressions, consult a general introduction to the topic.
Take care when using strings containing the \ escape character. For every \ in the string you must add \\\\ (4), resulting in a total of \\\\\ (5). This is because the Java compiler needs \\ and the regex engine also needs \\.
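Because this operation follows Java syntax, the two escaping layers can be checked in plain Java. The following is a minimal sketch, not Devo code. The class name `BackslashLayers` and its members are hypothetical and exist only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, used only to illustrate the escaping layers.
class BackslashLayers {
    // Four backslashes in the source literal: the compiler halves them,
    // leaving the two-character regex \\ that matches one literal backslash.
    public static final String ONE_BACKSLASH_REGEX = "\\\\";

    // A fifth backslash escapes the double quote inside the string literal,
    // so this regex matches a backslash followed by a double quote.
    public static final String BACKSLASH_QUOTE_REGEX = "\\\\\"";

    public static boolean matchesSingleBackslash() {
        // "\\" in source is a one-character string holding a single backslash.
        return Pattern.matches(ONE_BACKSLASH_REGEX, "\\");
    }

    public static boolean matchesBackslashQuote() {
        // "\\\"" in source is the two-character text: backslash, double quote.
        return Pattern.matches(BACKSLASH_QUOTE_REGEX, "\\\"");
    }

    public static void main(String[] args) {
        // Length of the regex once the compiler has processed the literal.
        System.out.println(ONE_BACKSLASH_REGEX.length());
        System.out.println(matchesSingleBackslash());
        System.out.println(matchesBackslashQuote());
    }
}
```

The five-backslash sequence is therefore four backslashes for the regex engine plus one that escapes the quote in the string literal.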
Given messages like these already ingested in Devo:
{\"request\":{\"Id\":23456,\"Email\":\"marketing@devo.com\",\"Company\":\"Devo\",\"Team\":\"Marketing\"}}
{\"request\":{\"Id\":34567,\"Email\":\"sales@devo.com\",\"Company\":\"Devo\",\"Team\":\"Sales\"}}
{\"request\":{\"Id\":12345,\"Email\":\"support@devo.com\",\"Company\":\"Devo\",\"Team\":\"Customer Support\"}}
To retrieve the email address value, you can use this code:
select peek(message, re("\\\\\"Email\\\\\":\\\\\"(.*?)\\\\\""),1) as email
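To see what the pattern captures, the same string literal can be tried in plain Java. This is a sketch under two assumptions: that the ingested messages contain a literal backslash before each double quote, as shown above, and that `peek` with index 1 returns the first capturing group. The class `EmailCapture` and its methods are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Devo.
class EmailCapture {
    // Same literal as the re(...) argument in the query above.
    static final Pattern EMAIL =
        Pattern.compile("\\\\\"Email\\\\\":\\\\\"(.*?)\\\\\"");

    // Returns capturing group 1 of the first match, or null when nothing matches.
    public static String extractEmail(String message) {
        Matcher m = EMAIL.matcher(message);
        return m.find() ? m.group(1) : null;
    }

    // Rebuilds the sample format, in which every double quote
    // is preceded by a literal backslash.
    public static String escapeQuotes(String json) {
        return json.replace("\"", "\\\"");
    }

    public static void main(String[] args) {
        String message = escapeQuotes(
            "{\"request\":{\"Id\":23456,\"Email\":\"marketing@devo.com\","
            + "\"Company\":\"Devo\",\"Team\":\"Marketing\"}}");
        System.out.println(extractEmail(message));
    }
}
```

The lazy quantifier `(.*?)` stops at the first backslash-quote pair after the address, so the group holds only the email value.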
How does it work in the search window?
Select Create column in the search window toolbar, then select the Regular expression, regexp operation. You need to specify one argument:
Argument | Data type |
---|---|
Pattern mandatory | string |
The data type of the values in the new column is regexp.
Example
The first example in this article uses values in a data table generated from the following CSV file.
If you want to try the example for yourself, download the file and upload it to your domain by clicking Data upload in the navigation pane. Name the new table my.upload.sample.data and select Current date as Date parsing type. Learn more about uploading data in Uploading log files.
After receiving the confirmation message, you can access the table from the Finder, selecting my → upload → sample → data. When you upload data from a file, all the information is included in a single column called message. To split the values into different columns, you can use the Split operation. Click Toggle Query Editor in the search window toolbar and paste the following LINQ query to save time:
from my.upload.sample.data
select split(message, ";", 9) as regexStr
In the my.upload.sample.data table, we want to transform the strings representing regular expressions in the regexStr column into regexp data type so we can use them later as the argument of another operation. To do it, we will create a new column using the Regular expression, regexp operation.
The arguments needed to create the new column are:
Pattern - regexStr column
Click Create column and you will see the following result:
A column in regexp data type that contains a pattern typically used to isolate the different parts of an email address.
We can also create a column in the demo.ecommerce.data table that shows the regular expression ([0-9]+)\.* in regexp data type so we can use it later as an argument of another operation. To do it, we will create a new column using the Regular expression operation. Let's call it regex.
The arguments needed to create the new column are:
Pattern - Click the pencil icon and enter ([0-9]+)\.*
Click Create column and you will see the following result:
A column in regexp data type that contains a pattern typically used to isolate the different parts of an IP address.
How does it work in LINQ?
Use the operator select ... as ... and add the operation syntax to create the new column. This is the syntax for the Regular expression, regexp operation:
re(string)
Example
You can copy the following LINQ scripts and try the previous examples on the my.upload.sample.data and demo.ecommerce.data tables:
from my.upload.sample.data
select split(message, ";", 9) as regexStr,
re(regexStr) as regex
from demo.ecommerce.data
select re("([0-9]+)\\.*") as regex
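As a rough illustration of what the second pattern matches, the sketch below applies it in plain Java to a made-up IP address. The class name `IpParts`, its method, and the sample address are hypothetical. How Devo operations consume the resulting regexp column is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of Devo.
class IpParts {
    // "\\." in the literal becomes the regex \. (a literal dot),
    // exactly as in the re(...) argument of the query above.
    static final Pattern PART = Pattern.compile("([0-9]+)\\.*");

    // Collects group 1 (a run of digits) from every successive match.
    public static List<String> parts(String ip) {
        List<String> out = new ArrayList<>();
        Matcher m = PART.matcher(ip);
        while (m.find()) {
            out.add(m.group(1));
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample address chosen for illustration only.
        System.out.println(parts("192.168.0.1")); // prints [192, 168, 0, 1]
    }
}
```

Each match consumes one run of digits plus any dots that follow it, so repeated matching walks through the address one part at a time.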