Regular Expressions (RegEx)

Basic introduction to Regular Expressions


Regular expressions are a way to identify patterns in data. A Regular Expression (RegEx) uses a sequence of characters to specify a search pattern. It can represent something as simple as "every value containing a zero" or something as complicated as "every value that's between 10 and 12 characters long and doesn't contain a capital A, a lowercase g, or a zero".
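The two descriptions above can be sketched as patterns. This is a minimal illustration that assumes JavaScript-style regular expressions (the guide does not say which RegEx flavour Data-flo uses); the constant names and sample values are hypothetical.

```typescript
// "Every value containing a zero": a literal 0 anywhere in the value.
const containsZero = /0/;

// "10 to 12 characters long, with no capital A, lowercase g, or zero":
// [^Ag0] is any character except A, g, or 0; {10,12} repeats it 10 to 12 times;
// ^ and $ make the pattern cover the whole value.
const lengthAndExclusions = /^[^Ag0]{10,12}$/;

// Hypothetical sample values.
const sampleValues = ["sample-1234", "sample-1034", "short", "Another-1234"];

// Only "sample-1034" contains a zero.
const withZero = sampleValues.filter((v) => containsZero.test(v));

// Only "sample-1234" is 10 to 12 characters long and free of A, g, and 0.
const validLength = sampleValues.filter((v) => lengthAndExclusions.test(v));
```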

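The backslash \ character is an "escape" character, which signals that the character after it should be treated differently from usual: a special character becomes literal, and some ordinary letters take on a special meaning. This means . returns different results than \. (any character versus a literal full stop), and d returns different results than \d, as shown in the structures guide below.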
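A short sketch of the difference the backslash makes, assuming JavaScript-style regular expressions (an assumption, not something this guide states). The constant names and test values are illustrative only.

```typescript
// An unescaped . matches any character; an escaped \. matches only a full stop.
const anyCharacter = /1.5/;
const literalFullStop = /1\.5/;

const dotMatchesFullStop = anyCharacter.test("1.5");        // true
const dotMatchesLetter = anyCharacter.test("1x5");          // true: . accepts any character
const escapedMatchesFullStop = literalFullStop.test("1.5"); // true
const escapedMatchesLetter = literalFullStop.test("1x5");   // false: \. accepts only a full stop

// The backslash also works the other way: d is a plain letter, \d is any digit.
const plainLetterD = /d/.test("7");  // false
const digitClass = /\d/.test("7");   // true
```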

Regular Expressions can be quite overwhelming at first, but most standard needs in Data-flo will be accomplished using a small number of structures. There are numerous resources online for learning RegEx, including a straightforward tutorial at RegexOne.

Basic RegEx structures guide

The following table shows the structure of a piece of RegEx, what that structure represents, and examples of what it might match. See below for specific examples showing real-world use of these structures in Data-flo adaptor arguments.

| RegEx structure | RegEx meaning | Examples |
| --- | --- | --- |
| `123…` | digits (numbers) | |
| `\d` | any digit (number) | |
| `\D` | any non-digit character | B; or _ |
| `.` | any character | 4; or B; or _ |
| `\.` | full stop (period) | |
| `[abc]` | only a, b, or c | [gb]et matches get and bet, but doesn't match let or net |
| `[^abc]` | not a, b, or c | [^ln]et matches get and bet, but doesn't match let or net |
| `[a-z]` | characters a to z | [a-z]101 matches m101 but not 2101 |
| `[0-9]` | numbers 0 to 9 | [0-9]101 matches 2101 but not a101 |
| `\w` | any alphanumeric character (letter, digit, or underscore) | a; or T; or 7 |
| `\W` | any non-alphanumeric character | @; or - |
| `{m}` | m repetitions | a{3} matches aaa; [wxy]{3} can match www, xxx, wyy, etc.; [0-9]{2} matches any two-digit number |
| `{m,n}` | m to n repetitions | a{2,4} matches aa or aaa or aaaa; .{2,3} matches any two- or three-character string |
| `*` | zero or more repetitions | |
| `+` | one or more repetitions | AB+ matches AB or ABB or ABBB, but not A or BA (the + applies only to the B) |
| `?` | optional character | ba?123 matches ba123 or b123 but not a123 |
| `\s` | any whitespace (space, tab, new-line, carriage return) | a\sb matches a b |
| `\S` | any non-whitespace character (anything but space, tab, new-line, carriage return) | a\Sb matches aab but not a b |
| `^…$` | starts and ends (anchors to the beginning and end of a field) (Note: $ in a reference is different from $ in a pattern; in a reference, $ references a specific capture group) | ^123$ matches 123 but not 1123 or 1233 |
| `(…)` | capture group | |
| `(a(bc))` | capture sub-group | |
| `(.*)` | capture all | |
| `(abc\|def)` | matches abc or def | |
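Several of the structures in the guide can be checked with a short script. This is a sketch that assumes JavaScript-style regular expressions, which this guide does not confirm for Data-flo; the names `structureChecks`, `allAsExpected`, and `reordered` are hypothetical.

```typescript
// Each entry pairs a pattern with one value it should match and one it should not.
const structureChecks: Array<[RegExp, string, string]> = [
  [/[gb]et/, "get", "let"],        // only g or b before "et"
  [/[^ln]et/, "bet", "net"],       // anything except l or n before "et"
  [/[a-z]101/, "m101", "2101"],    // a lowercase letter before "101"
  [/[0-9]101/, "2101", "a101"],    // a digit before "101"
  [/^a{2,4}$/, "aaa", "a"],        // two to four repetitions of a
  [/ba?123/, "b123", "a123"],      // the a is optional, the b is not
  [/a\sb/, "a b", "aab"],          // whitespace between a and b
  [/a\Sb/, "aab", "a b"],          // non-whitespace between a and b
  [/^123$/, "123", "1123"],        // anchored to the start and end of the value
  [/^(abc|def)$/, "def", "abd"],   // either abc or def
];

// true when every pattern matches its first value and rejects its second.
const allAsExpected = structureChecks.every(
  ([re, hit, miss]) => re.test(hit) && !re.test(miss)
);

// Capture groups can be referenced in a replacement as $1, $2, and so on.
const reordered = "2024-05".replace(/^(\d{4})-(\d{2})$/, "$2/$1"); // "05/2024"
```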


/^.+-CGPS+-[0-9]{6}/ matches someamountoftext-CGPS-123454, someamountoftext-CGPS-1234540, x-CGPS-0000000000, and x-CGPS-CGPS-000000. In every case, the start of the line ^ is followed by any character . one or more times +, then the specific text -CGP, then the letter S one or more times + (the + applies only to the character immediately before it, so -CGPSS- would also match), then a hyphen, then six {6} digits [0-9]. The pattern doesn't specify what happens after the six digits, so extra digits are allowed. In x-CGPS-CGPS-000000, the first -CGPS is simply part of the text matched by .+ at the start.

If a dollar sign is added at the end, it signifies that the six digits must be followed by the end of the line, so /^.+-CGPS+-[0-9]{6}$/ matches someamountoftext-CGPS-123454 but not someamountoftext-CGPS-1234540.
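The effect of the trailing $ can be checked directly. This sketch assumes JavaScript-style regular expressions; the constant names are hypothetical.

```typescript
const openEnded = /^.+-CGPS+-[0-9]{6}/;   // nothing is required after the six digits
const closedEnd = /^.+-CGPS+-[0-9]{6}$/;  // the six digits must end the value

// Without $, extra trailing digits are tolerated.
const openSixDigits = openEnded.test("someamountoftext-CGPS-123454");    // true
const openSevenDigits = openEnded.test("someamountoftext-CGPS-1234540"); // true
const openRepeatedText = openEnded.test("x-CGPS-CGPS-000000");           // true

// With $, a seventh digit makes the match fail.
const closedSixDigits = closedEnd.test("someamountoftext-CGPS-123454");    // true
const closedSevenDigits = closedEnd.test("someamountoftext-CGPS-1234540"); // false

// .+ needs at least one character before -CGPS, so this value is rejected.
const nothingBefore = openEnded.test("-CGPS-123456"); // false
```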

Converting lat/long to negative numbers

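/^(.+?)\s*(W|w|West|WEST|west|S|s|South|south)$/ matches 100.67W and 90.7 S, and can be used as the pattern when converting western longitudes and southern latitudes to negative numbers: with the replacement value -$1, those values become -100.67 and -90.7 respectively. Here (.+?) captures as few characters as possible, \s* absorbs any space before the direction, and the direction alternatives are listed inside parentheses separated by |. Square brackets would not work for this list, because they define a set of single characters: [W|w|West|WEST|west|S|s|South|south] matches only one character (such as the final t of West), which leaves a trailing space or the rest of the word in the captured value.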
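The sketch below compares a square-bracket form of the direction list with a parenthesised alternation, assuming JavaScript-style patterns and $1 replacement references (an assumption about Data-flo). The helper `toNegative` and the anchored alternation pattern are hypothetical illustrations, not part of Data-flo.

```typescript
// Square brackets define a set of single characters, not a list of words.
const bracketPattern = /(.+)[W|w|West|WEST|west|S|s|South|south]/;

// Parentheses with | list whole alternatives; ^ and $ anchor the field.
const alternationPattern = /^(.+?)\s*(W|w|West|WEST|west|S|s|South|south)$/;

// Replace a matching value with a minus sign followed by the first capture group.
function toNegative(value: string, directionPattern: RegExp): string {
  return value.replace(directionPattern, "-$1");
}

const compactWest = toNegative("100.67W", alternationPattern);     // "-100.67"
const spacedSouth = toNegative("90.7 S", alternationPattern);      // "-90.7"
const spelledWest = toNegative("100.67 West", alternationPattern); // "-100.67"
const northUnchanged = toNegative("45.2N", alternationPattern);    // "45.2N" (no match)

// The bracket form keeps the space or most of the word in the captured value.
const bracketSpacedSouth = toNegative("90.7 S", bracketPattern);      // "-90.7 " (trailing space)
const bracketSpelledWest = toNegative("100.67 West", bracketPattern); // "-100.67 Wes"
```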

Select and reference everything in a field

  • pattern is everything in the field: /^(.+)$/

    • ^ and $ anchor the start and end of the field, () designate a capture group to reference in the replacement, and .+ means any character, one or more times.

  • replacement is #$1 (where $1 means everything in the first capture group, which here is everything you've selected)

  • This example shows the two different uses of the dollar sign $ character. In the pattern, it means the end of the field. In the reference, it signifies a capture group.
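As a sketch of this pattern and replacement, assuming JavaScript-style replacement syntax (an assumption about Data-flo), the pair puts a # in front of whatever the field contains. The constant names and the sample value are hypothetical.

```typescript
// ^ and $ anchor the whole field; (.+) captures everything in it.
const wholeField = /^(.+)$/;

// In the replacement, $1 stands for the captured text, so the result is # plus the field.
const prefixed = "ABC123".replace(wholeField, "#$1"); // "#ABC123"

// .+ needs at least one character, so an empty field does not match and is left as-is.
const emptyStaysEmpty = "".replace(wholeField, "#$1"); // ""
```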

External Resources (unrelated to CGPS)


To get a step-by-step walk-through of how Regular Expressions work, and more information about the structures involved, visit


This resource allows you to write plain English and get RegEx in return, which can be a good way to familiarize yourself with the concepts and get started creating a complicated pattern.


Regex101 is a good place to test and debug RegEx functionality, although some users find the interface unintuitive.


Another place to build, test, and debug RegEx is
