Regular Expressions Basics

Marchete

Final words

In this course, you learned many concepts of Regular Expressions: metacharacters, repetitions, alternations and groups. I tried to keep the lessons simple but complete, and as language-neutral as possible.

Pros and Cons

Regular Expressions have good characteristics for text searching, but they also have their flaws:

Pros

  1. Mature, well-tested technology. If you think your problem can be solved with a regex pattern, please use one. Don't try to reinvent the wheel and create your own text parser.
  2. Powerful tool. With one line of code you can create amazing searches.
  3. Available in most programming languages.
  4. There are many online regex tools available, where you can quickly test and fix your patterns. These tools greatly simplify debugging and testing regex patterns.
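
To illustrate the "one line of code" point, here is a minimal sketch in Python (any language with a regex library would do). The sample text and the variable names are made up for this example, and the pattern only checks the shape of a date, not whether it is a valid one.

```python
import re

# Hypothetical sample text, invented for this illustration.
log = "Backup started 2023-01-15, finished 2023-01-16; next run 2023-02-01."

# One line: find everything shaped like YYYY-MM-DD.
# \b keeps the match from starting or ending inside a longer run of digits.
dates = re.findall(r"\b\d{4}-\d{2}-\d{2}\b", log)

print(dates)
```

Doing the same search by hand would mean writing a loop that tracks digits and dashes character by character, which is exactly the wheel you don't want to reinvent.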

Cons

  1. Chaotic, evil syntax. Whoever created the regex metacharacter set was high on something. Depending on the situation, the same metacharacter has many different meanings, which makes reading a regex a complicated task. Take ? as an example. First, it's a quantifier for 0 or 1 repetitions, but placed after another quantifier it suddenly becomes a lazy modifier. But wait! As if two different meanings weren't enough, right after an opening parenthesis, (? has more than 10 different meanings: non-capturing groups, named groups, lookahead and lookbehind, conditionals, recursion... And the same thing happens with many other metacharacters.
  2. Regex patterns can perform badly in some cases. Unbounded repetitions can match a string in many different ways, and backtracking regex engines may need a huge number of steps to try all of them, especially when the overall match ends up failing.
  3. Regular Expressions are not suited for very complex, recursive data formats, like XML or HTML. In these cases, it's better to use a dedicated XML or HTML parser.
  4. There are many different regex engines, and each one has a slightly different syntax. Therefore, depending on the language, you need to learn some particular flags and metacharacters.
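
The overloaded ? from the first point can be seen side by side in this short sketch. It uses Python's re module, and the sample strings and variable names are invented for the illustration. Note that the named-group spelling (?P<name>...) is Python's; other engines may write it differently, which is the fourth point in action.

```python
import re

# 1. Quantifier: "u?" matches zero or one "u".
optional = re.findall(r"colou?r", "color colour")

# 2. Lazy modifier: "?" after "+" makes it match as little as possible.
greedy = re.findall(r"<.+>", "<a><b>")   # one long match
lazy = re.findall(r"<.+?>", "<a><b>")    # two short matches

# 3. Non-capturing group: "(?:...)" groups without creating a capture.
non_capturing = re.findall(r"(?:ab)+", "ababab ab")

# 4. Named group: "(?P<year>...)" captures under a name (Python spelling).
match = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})", "2023-01")
year = match.group("year")

# 5. Lookahead: "(?= USD)" checks what follows without consuming it.
lookahead = re.findall(r"\d+(?= USD)", "10 USD, 20 EUR, 30 USD")

print(optional, greedy, lazy, non_capturing, year, lookahead)
```

Five different jobs for one character, and that is without touching lookbehind, conditionals or recursion, which not every engine supports.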

In my opinion, Regular Expressions are a must-have for anybody who works in an IT-related field (programming, databases, OS, etc.). One day or another, you'll face a problem where you need to process text streams and search for data based on patterns. Regular expressions excel at this type of task.

Codingame Puzzles

There are many Codingame puzzles where you can use Regular Expressions:

The list goes on and on.

Other interesting links
