The final piece in the jigsaw is to make the regular expression search for optional elements or to select one of several patterns. We'll look at each of these options separately.

Optional Elements

You can specify that a character is optional by following it with the zero-or-one repetition metacharacter, ?:

>>> re.match('computer?d?', 'computer')
<re.MatchObject instance at 864890>

This will match 'computer' or 'computed'. However, because both the r and the d are optional, it will also match 'compute' and 'computerd', which we don't want.
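As a quick check of which spellings the pattern accepts, here is a minimal sketch. It assumes Python 3.4 or later, where re.fullmatch is available; that function and the names candidates and matched are illustrative and not part of the original example.

```python
import re

# '?' makes the character before it optional (zero or one occurrence).
pattern = 'computer?d?'

# re.fullmatch succeeds only if the whole string matches the pattern,
# which makes it easy to see exactly which spellings are accepted.
candidates = ['compute', 'computer', 'computed', 'computerd', 'computers']
matched = [word for word in candidates if re.fullmatch(pattern, word)]

print(matched)  # ['compute', 'computer', 'computed', 'computerd']
```

Both optional characters can be present or absent independently, which is why the pattern accepts four spellings rather than two.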

By using a character set (a list of alternatives in square brackets) within the expression, we can be more specific. Thus,

>>> re.match('compute[rd]','computer')
<re.MatchObject instance at 874390>

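will match 'computer' and 'computed'. Because re.match anchors the pattern only at the start of the string, the end must be anchored as well, for example with 'compute[rd]$', to reject the unwanted 'computerd'.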
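The effect of anchoring the end of the pattern can be sketched as follows. The '$' anchor and the names prefix_hit, anchored_hit, and results are assumptions added for illustration; they do not appear in the original example.

```python
import re

pattern = 'compute[rd]'

# re.match anchors only at the start of the string, so a longer
# string still matches on its first eight characters.
prefix_hit = re.match(pattern, 'computerd')
print(prefix_hit.group())  # computer

# Appending '$' requires the string to end right after the [rd] character.
anchored_hit = re.match(pattern + '$', 'computerd')
print(anchored_hit)  # None

# With the anchor in place, only the two intended spellings are accepted.
words = ['computer', 'computed', 'computerd', 'compute']
results = {word: bool(re.match(pattern + '$', word)) for word in words}
print(results)
```

Unlike the version with two optional characters, the character set requires exactly one of r or d, so 'compute' is rejected too.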

Optional Expressions

In addition to matching options from a list of characters, we can also match based on a choice of subexpressions. We mentioned earlier that we could group sequences of characters in parentheses, but in fact, we can group any arbitrary regular expression in parentheses and treat it as a unit. In describing the syntax, I will use the notation (RE) to indicate any such regular expression grouping.

The situation that we want to examine here is the case in which we want to match either (RE)xxxx or (RE)yyyy, where xxxx and yyyy are different patterns.

Thus, for example, we want to match both 'premature' and 'preventative'. We can do this by using a selection metacharacter:

>>> regexp = 'pre(mature|ventative)'
>>> re.match(regexp,'premature')
<re.MatchObject instance at 864890>
>>> re.match(regexp,'preventative')
<re.MatchObject instance at 864890>
>>> re.match(regexp,'prelude')

Notice that when defining the regular expression, we had to enclose both options in parentheses. The | metacharacter has the lowest precedence of any regular expression operator, so without the parentheses the pattern 'premature|ventative' would mean 'premature' or 'ventative', and 'preventative' would no longer match. In other words, the parentheses limit the choice to the two groups while keeping 'pre' common to both.
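To see what the parentheses change, the following sketch compares the grouped pattern with an ungrouped variant. It assumes Python 3.4 or later for re.fullmatch; the ungrouped pattern and the variable names are hypothetical additions for comparison only.

```python
import re

grouped = 'pre(mature|ventative)'   # 'pre' followed by one of two options
ungrouped = 'premature|ventative'   # either 'premature' or 'ventative'

words = ['premature', 'preventative', 'ventative', 'prelude']

# re.fullmatch requires the entire word to match the pattern.
grouped_hits = [w for w in words if re.fullmatch(grouped, w)]
ungrouped_hits = [w for w in words if re.fullmatch(ungrouped, w)]

print(grouped_hits)    # ['premature', 'preventative']
print(ungrouped_hits)  # ['premature', 'ventative']
```

The | splits everything inside the enclosing group, or the whole pattern if there is no group, into alternatives, so the parentheses determine how much of the pattern each alternative covers.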

Even in this article we have not plumbed the depths of what regular expressions can achieve. It is possible to use special markers to act like variables or registers. These, combined with the branching and repetition constructs we've seen here, make regular expressions even more powerful. If you are interested in these aspects of regular expression syntax, I recommend reading Jeffrey E. F. Friedl's excellent book on the subject, Mastering Regular Expressions, published by O'Reilly.
