Character Sets
Simple character set
In previous lessons, we learned that a regex made from literal characters, like ain, will search for exactly those 3 letters in that exact order.
It's essentially a search for a && i && n. But what if I need an || (OR) instead of an && (AND)?
You can accomplish this by using brackets [ ]. When you create a pattern like [ain], you'll search for a single character that must be either a OR i OR n.
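To make the AND versus OR difference concrete, here is a minimal sketch using Python's built-in re module. The choice of Python and the sample text are assumptions for illustration; the lesson itself does not name a language.

```python
import re

text = "Spain rains"  # sample text, chosen only for this illustration

# Literal pattern: the letters a, i, n must appear together, in that order.
literal_hits = re.findall(r"ain", text)

# Character set: each match is ONE character that is a, i, or n.
set_hits = re.findall(r"[ain]", text)

print(literal_hits)  # ['ain', 'ain']
print(set_hits)      # ['a', 'i', 'n', 'a', 'i', 'n']
```

The literal pattern returns whole three-letter matches, while the character set returns one character per match.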
In this first exercise, you'll need to create a pattern to match vowels.
Range character set
A simple character set can be bothersome to declare when you need to match the whole alphabet or all digits.
For that reason, you can use - inside a character set to declare a range of consecutive characters.
Using the pattern [a-z], you'll match any single character from a to z (a, b, c, d, e ... x, y or z). Likewise, [2-5] will match any digit from 2 to 5.
You can also combine several ranges inside the character set. [B-Ga-v] is a valid regex pattern: it matches any uppercase letter from B to G or any lowercase letter from a to v.
As stated before, regex patterns are case sensitive, so [a-z] and [A-Z] match different characters.
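The range patterns mentioned above can be tried out with a short sketch in Python's re module. The language and the sample strings are assumptions made for illustration only.

```python
import re

# [2-5] matches a single digit between 2 and 5; the 7 is skipped.
digits_2_to_5 = re.findall(r"[2-5]", "Room 42, Bay G7")

# Ranges are case sensitive: [a-z] skips the capital R, [A-Z] finds only it.
lowercase = re.findall(r"[a-z]", "Regex 101")
uppercase = re.findall(r"[A-Z]", "Regex 101")

# Two ranges in one set: uppercase B to G, or lowercase a to v.
combined = re.findall(r"[B-Ga-v]", "ABGHavw")

print(digits_2_to_5)  # ['4', '2']
print(lowercase)      # ['e', 'g', 'e', 'x']
print(uppercase)      # ['R']
print(combined)       # ['B', 'G', 'a', 'v']
```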
The ^ metacharacter is a special case. When used as the first character inside [ ], it creates a negative match. [^2-5] will match any character except 2, 3, 4 and 5. Be cautious: that doesn't mean it matches only the remaining digits 0, 1, 6, 7, 8 and 9. It matches any other character, including letters and symbols.
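A small sketch in Python (an assumed language, with a made-up sample string) shows how broad a negated set is, and how it differs from a set that lists only the remaining digits.

```python
import re

sample = "a1-3 9!"  # sample text, chosen only for this illustration

# Negated set: every character that is NOT 2, 3, 4 or 5,
# which includes the letter, the hyphen, the space and the "!".
not_2_to_5 = re.findall(r"[^2-5]", sample)

# To match only the other digits, list them explicitly: 0-1 and 6-9.
other_digits = re.findall(r"[0-16-9]", sample)

print(not_2_to_5)    # ['a', '1', '-', ' ', '9', '!']
print(other_digits)  # ['1', '9']
```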
Some regex engines (check your language first) support character set subtractions and intersections.
- Subtractions are usually defined as [range-[subrange_to_remove]], like [0-9-[2-7]], indicating a set that matches only 0, 1, 8 or 9.
- Intersections are defined as [range1&&range2]. The character must belong to both ranges to be matched in the search.
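Python's built-in re module is one engine that does not implement either syntax. As a rough workaround sketch, the same results can be approximated with lookahead assertions, a feature this lesson has not covered. The sample strings and the [a-m] and [h-z] ranges below are assumptions for illustration.

```python
import re

# Approximates the subtraction [0-9-[2-7]]:
# "a digit, provided the next character is not in 2-7".
subtraction = re.findall(r"(?![2-7])[0-9]", "0123456789")

# Approximates an intersection of a-m with h-z:
# "a letter in h-z, provided it is also in a-m" (so h through m).
intersection = re.findall(r"(?=[a-m])[h-z]", "ghmnz")

print(subtraction)   # ['0', '1', '8', '9']
print(intersection)  # ['h', 'm']
```

In engines that support the native syntax, the bracket forms described above are the more direct choice.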
Note: Remember \w from the previous lesson? It's shorthand for [a-zA-Z0-9_].
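The equivalence can be checked with a short sketch in Python (an assumed language; the sample string is made up). One caveat specific to Python 3: for text patterns \w is Unicode-aware by default, so it equals [a-zA-Z0-9_] exactly only when the re.ASCII flag is set.

```python
import re

sample = "a_Z9 \u00e9-"  # contains an accented letter (e with acute accent)

# With re.ASCII, \w behaves like the explicit set [a-zA-Z0-9_].
shorthand_ascii = re.findall(r"\w", sample, flags=re.ASCII)
explicit_set = re.findall(r"[a-zA-Z0-9_]", sample)

# Without the flag, Python 3 also treats the accented letter as a word character.
shorthand_unicode = re.findall(r"\w", sample)

print(shorthand_ascii)    # ['a', '_', 'Z', '9']
print(explicit_set)       # ['a', '_', 'Z', '9']
print(shorthand_unicode)  # ['a', '_', 'Z', '9', 'é']
```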
For the next exercise, you need to create a complex pattern set with the following constraints:
- Search for any consonant, search for a lowercase vowel, and search for either the letter 'n' or 's'.
Continue to the next lesson to learn about Repetitions in Regular Expressions.