C# Professional - Processing Text
Regular Expressions - Groups
When building a regular expression pattern, you can define a group that matches one value out of several alternatives. This is done by wrapping the alternatives in parentheses and separating them with |: (banana|ananas|apple). This example will match any text containing banana, ananas or apple.
Groups are very handy when working with a regular expression where you need to specify multiple options for a specific word.
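The alternation group above can be tried out with a minimal sketch like the following. It assumes C# 9 or later (top-level statements); the helper name ContainsFruit and the sample sentences are illustrative only.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the text: one group listing three alternatives.
var fruitPattern = new Regex("(banana|ananas|apple)");

// Hypothetical helper: true when the input contains any of the three words.
bool ContainsFruit(string input) => fruitPattern.IsMatch(input);

Console.WriteLine(ContainsFruit("I like apple pie")); // True
Console.WriteLine(ContainsFruit("cherry tart"));      // False
```

Because the pattern is not anchored, the match succeeds as soon as one of the alternatives appears anywhere in the input.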
Quantifiers
As with individual characters, a quantifier can follow a group to specify how many times the group must occur. For example, (na){2} matches nana.
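As a rough illustration of a quantified group, the sketch below uses a hypothetical pattern, ^(ab){2,3}$, which is not taken from the original text. The anchors are added so that the repetition count applies to the whole input.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical pattern: the group (ab) must occur two or three times in a row.
// The anchors ^ and $ force the whole input to consist of those repetitions.
var repeated = new Regex(@"^(ab){2,3}$");

// Hypothetical helper wrapping the match test.
bool IsRepeated(string input) => repeated.IsMatch(input);

Console.WriteLine(IsRepeated("ab"));       // False: only one occurrence
Console.WriteLine(IsRepeated("abab"));     // True: two occurrences
Console.WriteLine(IsRepeated("ababab"));   // True: three occurrences
Console.WriteLine(IsRepeated("abababab")); // False: four occurrences
```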
Capture & Backreference
When using groups in the pattern, by default, the regular expression will capture the value corresponding to that group. This is often used when using regular expressions to extract a specific substring from a larger text.
In .NET, the captured value can be retrieved through the Groups property of a Match returned by a regular expression.
Note: the first element of the Groups collection is the whole match; captured groups start at index 1.
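The indexing of Groups can be seen in a short sketch such as this one. The pattern reuses the user_id fragment that appears later in this article; the sample text and the helper name ExtractUserId are assumptions made for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// One capturing group around the digits that follow "user_id: ".
var userIdPattern = new Regex(@"user_id: (\d+)");

// Hypothetical helper: returns the captured digits, or an empty string when nothing matches.
string ExtractUserId(string text)
{
    Match match = userIdPattern.Match(text);
    return match.Success ? match.Groups[1].Value : string.Empty;
}

Match sample = userIdPattern.Match("user_id: 42 logged in");
Console.WriteLine(sample.Groups[0].Value); // user_id: 42  (the whole match)
Console.WriteLine(sample.Groups[1].Value); // 42           (the first captured group)
Console.WriteLine(ExtractUserId("no id here").Length); // 0
```

Checking Success before reading a group avoids silently treating a failed match as an empty capture.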
Values captured by a group can also be used as a backreference later in the same pattern, which ensures that another part of the text repeats exactly what the group captured.
A backreference is written with the \N syntax, where N is the number of the referenced group in the pattern.
Example: user_id: (\d+) - validating email for user \1
This will match text only when the first user_id is the same as the one at the end of the text.
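The backreference example can be checked with a sketch along these lines. The pattern is the one from the text; the helper name IdsAgree and the sample inputs are illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

// \1 must repeat exactly what the first group (\d+) captured.
var sameUser = new Regex(@"user_id: (\d+) - validating email for user \1");

// Hypothetical helper wrapping the match test.
bool IdsAgree(string text) => sameUser.IsMatch(text);

Console.WriteLine(IdsAgree("user_id: 42 - validating email for user 42")); // True
Console.WriteLine(IdsAgree("user_id: 42 - validating email for user 17")); // False
```

Since the pattern is not anchored at the end, a trailing id such as 421 would still match on its first two digits; appending \b after \1 is one possible way to tighten it.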
Naming groups
Groups can be given a name with a specific syntax in the pattern.
user_name: (?<username>\w+)
Here, the capturing group is named username. This name can be used for backreferences with the \k<username> syntax, and to retrieve the group from a Match object in .NET.
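Both uses of a group name can be sketched as follows. The first pattern is the one from the text; the second pattern, the helper name ExtractUserName and the sample inputs are hypothetical and only serve to show \k<username> in action.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the text: the group is named "username".
var namePattern = new Regex(@"user_name: (?<username>\w+)");

// Hypothetical helper: looks the group up by name instead of by index.
string ExtractUserName(string text)
{
    Match match = namePattern.Match(text);
    return match.Success ? match.Groups["username"].Value : string.Empty;
}

// Hypothetical pattern: \k<username> must repeat what the named group captured.
var repeatedName = new Regex(@"user_name: (?<username>\w+) - welcome back \k<username>");

Console.WriteLine(ExtractUserName("user_name: alice"));                           // alice
Console.WriteLine(repeatedName.IsMatch("user_name: alice - welcome back alice")); // True
Console.WriteLine(repeatedName.IsMatch("user_name: alice - welcome back bob"));   // False
```

Looking groups up by name keeps the extraction code valid if more groups are later added to the pattern and the numeric indexes shift.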