C# Professional - Processing Text
Regular Expressions
Regular expressions are a powerful tool for working with text. They use patterns to apply different operations to text.
With regular expressions, you can:
- Parse text to find specific character patterns
- Edit, replace or delete substrings of a text
- Extract text matching specific character patterns
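The three uses listed above can be sketched with the static helpers of the .NET Regex class. This is a minimal illustration, not a complete program design: the sample sentence and the variable names are assumptions made up for this example, and the code is written with top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical sample text, chosen only for illustration.
string text = "Order 66 shipped on 2024-01-15, order 67 is pending.";

// Parse: check whether the text contains a run of digits.
bool hasNumber = Regex.IsMatch(text, "[0-9]+");

// Edit: replace every run of digits with a placeholder.
string masked = Regex.Replace(text, "[0-9]+", "#");

// Extract: take the first run of digits found in the text.
string firstNumber = Regex.Match(text, "[0-9]+").Value;

Console.WriteLine(hasNumber);   // True
Console.WriteLine(masked);      // Order # shipped on #-#-#, order # is pending.
Console.WriteLine(firstNumber); // 66
```

The pattern syntax used here (character classes and the + quantifier) is described in the rest of this section.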
Pattern definition
In the basic pattern syntax, an ordinary character such as t matches itself, and . matches any single character except a newline.
You can define a class of multiple characters using [ and ]. For example, [aeiou] will match one character that can be any lowercase vowel.
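As a small sketch of a character class in use, the snippet below counts the characters of a word that match [aeiou]. The input word and variable names are assumptions for illustration only.

```csharp
using System;
using System.Text.RegularExpressions;

string word = "regular"; // hypothetical input

// [aeiou] matches exactly one character that is a lowercase vowel,
// so each vowel in the word produces its own match.
MatchCollection vowels = Regex.Matches(word, "[aeiou]");
int vowelCount = vowels.Count;

Console.WriteLine(vowelCount);      // 3
Console.WriteLine(vowels[0].Value); // e
```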
You can use - to include a range of consecutive characters in a class.
Note that by default, regular expressions are case-sensitive.
The .NET Regex class can accept an option, RegexOptions.IgnoreCase, when creating a new Regex to specify that case should be ignored. However, it is better to specify in the pattern itself which cases are accepted.
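The two approaches to case handling can be compared side by side. The following is a minimal sketch; the input string and variable names are assumptions chosen for the example.

```csharp
using System;
using System.Text.RegularExpressions;

string input = "Hello"; // hypothetical input

// Default behaviour: matching is case-sensitive, so "hello" is not found.
bool caseSensitive = Regex.IsMatch(input, "hello");

// Approach 1: ask the Regex class to ignore case for the whole pattern.
var ignoreCase = new Regex("hello", RegexOptions.IgnoreCase);
bool withOption = ignoreCase.IsMatch(input);

// Approach 2: state in the pattern which cases are accepted.
bool withClass = Regex.IsMatch(input, "[Hh]ello");

Console.WriteLine(caseSensitive); // False
Console.WriteLine(withOption);    // True
Console.WriteLine(withClass);     // True
```

Note the difference in scope: the option applies to every character of the pattern, while the character class only relaxes the positions where you write it.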
Here are the most common character patterns for simple regular expressions.
Pattern | Matching characters |
---|---|
t | Single character t |
[aei] | A single character of: a , e or i |
[a-z] | A single character in the range from a to z |
[^a-z] | A single character not in the range from a to z |
\d | A decimal digit; for ASCII text, equivalent to [0-9] |
\w | A word character; for ASCII text, equivalent to [a-zA-Z_0-9] |
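To make the table concrete, the sketch below counts how many characters of one sample string each pattern matches. The sample string and the helper name Count are assumptions introduced for this illustration. Verbatim strings (prefixed with @) are used so that the backslash reaches the regex engine unchanged.

```csharp
using System;
using System.Text.RegularExpressions;

string sample = "Room_42-b!"; // hypothetical input, 10 characters

// Local helper (hypothetical name): number of single-character matches.
int Count(string pattern) => Regex.Matches(sample, pattern).Count;

int lower = Count("[a-z]");     // o, o, m, b
int notLower = Count("[^a-z]"); // R, _, 4, 2, -, !
int digits = Count(@"\d");      // 4, 2
int wordChars = Count(@"\w");   // everything except - and !

Console.WriteLine(lower);     // 4
Console.WriteLine(notLower);  // 6
Console.WriteLine(digits);    // 2
Console.WriteLine(wordChars); // 8
```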
For any character, you can use quantifiers to specify how many repetitions of the character should be matched.
Quantifier | Definition |
---|---|
* | Will match zero or more repetitions |
? | Will match zero or one repetition |
+ | Will match one or more repetitions |
{N} | Will match exactly N repetitions |
{N,} | Will match at least N repetitions |
{M,N} | Will match between M and N repetitions |
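Each quantifier in the table can be tried on a short input. This is a sketch with made-up inputs and variable names. It also relies on quantifiers being greedy by default, meaning they take as many repetitions as the pattern allows.

```csharp
using System;
using System.Text.RegularExpressions;

// * : zero or more repetitions. With no "a" present the match still
// succeeds, but it is empty.
string star = Regex.Match("aaab", "a*").Value;
bool starOnNoA = Regex.Match("bbb", "a*").Success;

// ? : zero or one repetition, so the "u" is optional.
bool american = Regex.IsMatch("color", "colou?r");
bool british = Regex.IsMatch("colour", "colou?r");

// + : one or more repetitions, so at least one "a" is required.
bool plusOnNoA = Regex.IsMatch("bbb", "a+");

// {N}, {N,} and {M,N} applied to a run of seven digits.
string number = "1234567";
string exactlyThree = Regex.Match(number, "[0-9]{3}").Value;
string atLeastThree = Regex.Match(number, "[0-9]{3,}").Value;
string twoToFour = Regex.Match(number, "[0-9]{2,4}").Value;

Console.WriteLine(star);         // aaa
Console.WriteLine(starOnNoA);    // True
Console.WriteLine(american);     // True
Console.WriteLine(british);      // True
Console.WriteLine(plusOnNoA);    // False
Console.WriteLine(exactlyThree); // 123
Console.WriteLine(atLeastThree); // 1234567
Console.WriteLine(twoToFour);    // 1234
```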
You can define anchors to match the beginning or the end of the text or a word.
Anchor | Definition |
---|---|
^ | Will match the beginning of the text |
$ | Will match the end of the text |
\b | Will match the boundary of a word (beginning or end) |
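The anchors above can be illustrated as follows. This is a minimal sketch; the sample strings and variable names are assumptions for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// ^ : the pattern must start at the beginning of the text.
bool startsWithCat = Regex.IsMatch("cat food", "^cat");
bool startsWithCat2 = Regex.IsMatch("a cat", "^cat");

// $ : the pattern must finish at the end of the text.
bool endsWithCat = Regex.IsMatch("a cat", "cat$");

// \b : word boundary. Without it, "cat" is also found inside longer words.
string words = "cat concat category";
int anywhere = Regex.Matches(words, "cat").Count;
int wholeWord = Regex.Matches(words, @"\bcat\b").Count;

Console.WriteLine(startsWithCat);  // True
Console.WriteLine(startsWithCat2); // False
Console.WriteLine(endsWithCat);    // True
Console.WriteLine(anywhere);       // 3
Console.WriteLine(wholeWord);      // 1
```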