Document toolboxDocument toolbox

Substitute all (subsall)

Description

In a given string, all sets of characters that match an established regular expression will be replaced by an indicated template. If no occurrence is found, it returns the original expression or an optional specified fail value. Use this operation to replace all the occurrences in the given string. Use the Substitute (subs) operation to replace only the first occurrence. 

You can use the Template (template) and Regular expression, regexp (re) operations to transform the values in a string column into the required template and regexp data types.

How does it work in the search window?

Select Create column in the search window toolbar, then select the Substitute all operation. You need to specify three arguments:

Argument

Data type

More information

Argument

Data type

More information

String to scan

string

You can select a column in the table or enter a value manually.

Regular expression 

regexp

You can select a column in the table or enter a value manually. If you introduce it yourself, you can use the regexp syntax to establish grouping patterns.

Template 

template

You can select a column in the table or enter a value manually. If you introduce it yourself, you can use the capturing group syntax to make reference to specific groups established by the regular expression.

The data type of the values in the new column is string.

Note that Devo automatically changes the strings manually entered in the Regular expression and Template arguments to the required regexp and template data types. If you want to use a column on these arguments, it must be a regexp/template type column. You can use the Regular expression, regexp (re) and Template (template) operations to transform a string column to the required data type.

Example

In the demo.ecommerce.data table, we want to replace all the (:) occurrences in every string of our timestamp column by a (-). To do it, we will create a new column using the Substitute all operation.

  • String to scan- timestamp column

  • Regular expression - Click the pencil icon and enter → :

  • Template - Click the pencil icon and enter → -

Click Create column and you will see the following result:

We can also create a column in the demo.ecommerce.data table that substitutes the dots in IP addresses by spaces. To do it, we will create a new column using the Substitute all operation and we will call it Substituteall. Before that, we need to transform the clientIpAddress column into string type using the to string (str) operation and we will call it IPstring.

Once we have the IPstring column, the arguments needed to create the new Substitute column are:

  • String to scan- IPstring column

  • Regular expression - Click the pencil icon and enter the following syntax to group up to the first dot→ ([0-9]+)\.*

  • Template - Click the pencil icon and make reference to the capturing group specified by the regular expression syntax, followed by a space → \1space

If you are going to use the same regular expression and template several times, it is advisable to create column using the Regular expression, regexp (re) and Template (template) operations and use them as arguments in the substitute operations.

Click Create column and you will see the following result:

  • The dots of the IP addresses have been substituted by spaces.

If you don't want the grouping pattern established by the regular expression to be used throughout the whole string, you can use the Substitute (subs) operation.

How does it work in LINQ?

Use the operator select... as...  and add the operation syntax to create the new column. These are the valid formats for the Substitute all operation:

  • subsall(string, re(string), template(string))

  • subsall(string, regexp, template)

Note that when you enter a string value as a regular expression and template using LINQ, you have to transform them to regexp and template format using the Regular expression, regexp (re) and Template (template) operations, as you can see in the examples. This is not needed if you perform this operation directly from the search window interface, as said above.

Example

You can copy the following LINQ script and try the previous examples on the demo.ecommerce.data table. 

from demo.ecommerce.data select subsall(timestamp, re(":"), template("-")) as substitute_timestamp_all
from demo.ecommerce.data select str(clientIpAddress) as IPstring, subsall(IPstring, re("([0-9]+)\\.*"), template("\\1 ")) as Substitute