Regular Expressions in Python

Regular expressions are defined using a mini-language that's completely different from Python, but Python includes the re module, through which we can seamlessly create and use regexes. Mark Summerfield explains.
This chapter is from the book

A regular expression is a compact notation for representing a collection of strings. What makes regular expressions so powerful is that a single regular expression can represent an unlimited number of strings—provided that they meet the regular expression's requirements. Regular expressions (we'll refer to them mostly as "regexes" from now on) are defined using a mini-language that's completely different from Python—but Python includes the re module, through which we can seamlessly create and use regexes.

Regexes are used for four main purposes:

  • Validation. Checking whether a piece of text meets some criteria; for example, a currency symbol followed by digits.
  • Searching. Locating substrings that can have more than one form; for example, finding any of pet.png, pet.jpg, pet.jpeg, or pet.svg while avoiding carpet.png and similar.
  • Searching and replacing. Replacing everywhere the regex matches with a string; for example, finding bicycle or human powered vehicle and replacing either with bike.
  • Splitting strings. Splitting a string at each place the regex matches; for example, splitting everywhere a colon (:) or equal sign (=) is encountered.
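As a rough preview of how these four uses map onto Python's re module, here is a minimal sketch. The sample strings and the exact patterns are illustrative assumptions, not examples taken from the text.

```python
import re

# Validation: a currency symbol followed by digits (illustrative pattern).
is_price = re.fullmatch(r"[$£€]\d+", "$25") is not None

# Searching: pet.png, pet.jpg, pet.jpeg, or pet.svg, but not carpet.png.
found = re.findall(r"\bpet\.(?:png|jpe?g|svg)\b", "carpet.png pet.jpeg pet.svg")

# Searching and replacing: either phrase becomes "bike".
replaced = re.sub(r"bicycle|human powered vehicle", "bike",
                  "a bicycle is a human powered vehicle")

# Splitting: split at each colon or equal sign.
parts = re.split(r"[:=]", "key=value:extra")

print(is_price, found, replaced, parts)
```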

At its simplest, a regular expression is an expression (for instance, a literal character), optionally followed by a quantifier. More complex regexes consist of any number of quantified expressions, may include assertions, and may be influenced by flags.

The first section of this article introduces and explains all the key regular expression concepts and shows pure regular expression syntax—it makes minimal reference to Python itself. The second section shows how to use regular expressions in the context of Python programming, drawing on all the material covered earlier. Readers familiar with regular expressions who just want to learn how they work in Python could skip to the second section. The article covers the complete regex language offered by the re module, including all the assertions and flags. Regular expressions are indicated in the text using bold, where they match in green, and captures are shown using highlighting.

Python's Regular Expression Language

This section looks at the regular expression language in four subsections. The first subsection shows how to match individual characters or groups of characters; for example, match a, or match b, or match either a or b. The second subsection shows how to quantify matches; for example, match once, or match at least once, or match as many times as possible. The third subsection shows how to group subexpressions and how to capture matching text, and the final subsection shows how to use the language's assertions and flags to affect how regular expressions work.

Characters and Character Classes

The simplest expressions are just literal characters, such as a or 5, and if no quantifier is explicitly given the expression is taken to be "match one occurrence." For example, the regex tune consists of four expressions, each implicitly quantified to match once, so it matches one t followed by one u followed by one n followed by one e, and hence matches the strings tune and attuned.

Although most characters can be used as literals, some are special characters—symbols in the regex language that must be escaped by preceding them with a backslash (\) to use them as literals. The special characters are \.^$?+*{}[]()|. Most of Python's standard string escapes can also be used within regexes; for example, \n for newline and \t for tab, as well as hexadecimal escapes for characters using the \xHH, \uHHHH, and \UHHHHHHHH syntaxes.

In many cases, rather than matching one particular character we want to match any one of a set of characters. This can be achieved by using a character class—one or more characters enclosed in square brackets. (This has nothing to do with a Python class, and is simply the regex term for "set of characters.") A character class is an expression. Like any other expression, if not explicitly quantified it matches exactly one character (which can be any of the characters in the character class). For example, the regex r[ea]d matches both red and radar, but not read. Similarly, to match a single digit we can use the regex [0123456789]. For convenience we can specify a range of characters using a hyphen, so the regex [0-9] also matches a digit. It's possible to negate the meaning of a character class by following the opening bracket with a caret, so [^0-9] matches any character that is not a digit.
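The character class examples above can be checked with a short sketch like the following. The test strings other than red, radar, and read are made up for illustration.

```python
import re

# r[ea]d matches "red" and the "rad" at the start of "radar", but not "read".
matches = [bool(re.search(r"r[ea]d", s)) for s in ("red", "radar", "read")]

# [0-9] matches one digit; [^0-9] matches one character that is not a digit.
digits = re.findall(r"[0-9]", "a1b22")
non_digits = re.findall(r"[^0-9]", "a1b22")

print(matches, digits, non_digits)
```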

Note that inside a character class, apart from the backslash (\), the special characters lose their special meaning, although the caret (^) acquires a new meaning (negation) if it's the first character in the character class, and otherwise is simply a literal caret. Also, the hyphen (-) signifies a character range unless it's the first or last character, in which case it's a literal hyphen. Since some sets of characters are required so frequently, several have shorthand forms, which are shown in Table 1. With one exception, the shorthands can be used inside character classes; for example, the regex [\dA-Fa-f] matches any hexadecimal digit. The exception is the period (.) which is a shorthand outside a character class but matches a literal period inside a character class.

Table 1 Character Class Shorthands

Symbol   Meaning
.        Matches any character except newline, any character at all with the re.DOTALL flag, or inside a character class matches a literal period.
\d       Matches a Unicode digit, or [0-9] with the re.ASCII flag.
\D       Matches a Unicode nondigit, or [^0-9] with the re.ASCII flag.
\s       Matches a Unicode whitespace, or [ \t\n\r\f\v] with the re.ASCII flag.
\S       Matches a Unicode non-whitespace, or [^ \t\n\r\f\v] with the re.ASCII flag.
\w       Matches a Unicode "word" character, or [a-zA-Z0-9_] with the re.ASCII flag.
\W       Matches a Unicode non-"word" character, or [^a-zA-Z0-9_] with the re.ASCII flag.
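A few of the shorthands can be tried out as follows. This is only a sketch with made-up input strings; it shows a shorthand used inside a character class and the two meanings of the period.

```python
import re

# \d inside a character class: [\dA-Fa-f] matches one hexadecimal digit.
hex_digits = re.findall(r"[\dA-Fa-f]", "0x1F, g")

# Outside a class, . matches any character except newline...
dot_any = re.findall(r"a.c", "abc a.c a\nc")
# ...but inside a class it is a literal period.
dot_literal = re.findall(r"a[.]c", "abc a.c a\nc")

# With re.DOTALL, . also matches a newline.
dot_all = re.findall(r"a.c", "abc a.c a\nc", re.DOTALL)

print(hex_digits, dot_any, dot_literal, dot_all)
```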

Quantifiers

A quantifier has the form {m,n} where m and n are the minimum and maximum times the expression to which the quantifier applies must match. For example, both e{1,1}e{1,1} and e{2,2} match feel, but neither matches felt.

Writing a quantifier after every expression would soon become tedious, and is certainly difficult to read. Fortunately, the regex language supports several convenient shorthands. If only one number is given in the quantifier, it's taken to be both the minimum and the maximum, so e{2} is the same as e{2,2}. As noted in the preceding section, if no quantifier is explicitly given, it's assumed to be 1 (that is, {1,1} or {1}); therefore, ee is the same as e{1,1}e{1,1} and e{1}e{1}, so both e{2} and ee match feel but not felt.

Having a different minimum and maximum is often convenient. For example, to match travelled and traveled (both legitimate spellings), we could use either travel{1,2}ed or travell{0,1}ed. The {0,1} quantification is used so often that it has its own shorthand form, ?, so another way of writing the regex (and the one most likely to be used in practice) is travell?ed.
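The three equivalent spellings of this regex can be compared with a small sketch. The third test word (with three l's) is an added assumption, included to show what the quantifiers reject.

```python
import re

words = ["traveled", "travelled", "travellled"]
patterns = [r"travel{1,2}ed", r"travell{0,1}ed", r"travell?ed"]

# Each pattern should accept the two legitimate spellings and reject three l's.
results = {p: [bool(re.fullmatch(p, w)) for w in words] for p in patterns}

for pattern, outcome in results.items():
    print(pattern, outcome)
```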

Two other quantification shorthands are provided: A plus sign (+) stands for {1,n} ("at least one") and an asterisk (*) stands for {0,n} ("any number of"). In both cases, n is the maximum possible number allowed for a quantifier, usually at least 32767. Table 2 shows all the quantifiers.

The + quantifier is very useful. For example, to match integers, we could use \d+ to match one or more digits. This regex could match in two places in the string 4588.91, for example: 4588 and 91. Sometimes typos are the result of pressing a key too long. We could use the regex bevel+ed to match the legitimate beveled and bevelled, and the incorrect bevellled. If we wanted to standardize on the single-l spelling, and match only occurrences that had two or more l's, we could use bevell+ed to find them.
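A quick way to see the + quantifier at work is a sketch such as this one, which uses the strings mentioned above.

```python
import re

# \d+ finds two separate runs of digits in "4588.91".
integers = re.findall(r"\d+", "4588.91")

text = "beveled bevelled bevellled"
one_or_more_l = re.findall(r"bevel+ed", text)   # every spelling
two_or_more_l = re.findall(r"bevell+ed", text)  # only the doubled-l spellings

print(integers, one_or_more_l, two_or_more_l)
```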

The * quantifier is less useful, simply because it can lead so often to unexpected results. For example, supposing that we want to find lines that contain comments in Python files, we might try searching for #*. But this regex will match any line whatsoever, including blank lines, because the meaning is "match any number of pound signs"—and that includes none. As a rule for those new to regexes, avoid using * at all, and if you do use it (or if you use ?), make sure that at least one other expression in the regex has a nonzero quantifier. Use at least one quantifier other than * or ?, that is, since both of these can match their expression zero times.
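The pitfall with * can be demonstrated in a few lines. The sample lines below are invented for illustration.

```python
import re

lines = ["x = 1  # set x", "y = 2", ""]

# #* can match zero pound signs, so it "finds" something on every line.
star_hits = [re.search(r"#*", line) is not None for line in lines]

# #+ requires at least one pound sign, so only the commented line matches.
plus_hits = [re.search(r"#+", line) is not None for line in lines]

print(star_hits, plus_hits)
```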

Often it's possible to convert * uses to + uses and vice versa. For example, we could match "tasselled" with at least one l using tassell*ed or tassel+ed, and match those with two or more l's using tasselll*ed or tassell+ed.

If we use the regex \d+ it will match 136. But why does it match all the digits, rather than just the first one? By default, all quantifiers are greedy: they match as many characters as they can. We can make any quantifier nongreedy (also called minimal) by following it with a question mark (?) symbol. (The question mark has two different meanings: on its own it's a shorthand for the {0,1} quantifier, and when it follows a quantifier it tells the quantifier to be nongreedy.) For example, \d+? can match the string 136 in three different places: 1, 3, and 6. Here's another example: \d?? matches zero or one digits, but prefers to match none since it's nongreedy; on its own it suffers the same problem as * in that it will match nothing, which means it will match in any text at all.
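The difference between greedy and nongreedy matching can be observed directly. A minimal sketch:

```python
import re

greedy = re.findall(r"\d+", "136")      # one match containing all three digits
nongreedy = re.findall(r"\d+?", "136")  # three matches of one digit each

# \d?? prefers to match nothing, so the match at the start is empty.
lazy_optional = re.match(r"\d??", "136").group()

print(greedy, nongreedy, repr(lazy_optional))
```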

Table 2 Regular Expression Quantifiers

Syntax             Meaning
e? or e{0,1}       Greedily match zero occurrences or one occurrence of expression e.
e?? or e{0,1}?     Nongreedily match zero occurrences or one occurrence of expression e.
e+ or e{1,}        Greedily match one or more occurrences of expression e.
e+? or e{1,}?      Nongreedily match one or more occurrences of expression e.
e* or e{0,}        Greedily match zero or more occurrences of expression e.
e*? or e{0,}?      Nongreedily match zero or more occurrences of expression e.
e{m}               Match exactly m occurrences of expression e.
e{m,}              Greedily match at least m occurrences of expression e.
e{m,}?             Nongreedily match at least m occurrences of expression e.
e{,n}              Greedily match at most n occurrences of expression e.
e{,n}?             Nongreedily match at most n occurrences of expression e.
e{m,n}             Greedily match at least m and at most n occurrences of expression e.
e{m,n}?            Nongreedily match at least m and at most n occurrences of expression e.

Nongreedy quantifiers can be useful for quick-and-dirty XML and HTML parsing. For example, to match all the image tags, writing <img.*> (match one each in order of <, i, m, and g, and then zero or more of any character apart from newline, and then one >) will not work because the .* part is greedy and will match everything including the tag's closing >, and will keep going until it reaches the last > in the entire text.

Three solutions present themselves (apart from using a proper parser):

  • <img[^>]*> matches <img, any number of non-> characters, and then the tag's closing > character.
  • <img.*?> matches <img, any number of characters (but nongreedily, so it will stop immediately before the tag's closing >), and then the >.
  • <img[^>]*?> combines both of the preceding options.
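The greedy attempt and the three alternatives listed above can be compared on a small HTML fragment. The fragment is a made-up example, kept on a single line so that the greedy .* can run to the last > in it.

```python
import re

html = '<img src="a.png"> text <img src="b.png"> <b>bold</b>'

greedy = re.findall(r"<img.*>", html)      # runs on to the final >
negated = re.findall(r"<img[^>]*>", html)  # stops at each tag's own >
nongreedy = re.findall(r"<img.*?>", html)
both = re.findall(r"<img[^>]*?>", html)

print(greedy)
print(negated)
```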

None of these is correct, though, since they can all match <img>, which is not valid. Since we know that an image tag must have a src attribute, a more accurate regex is as follows:

<img\s+[^>]*?src=\w+[^>]*?>

This regex matches the literal characters <img, one or more whitespace characters, nongreedily zero or more of anything except > (to skip any other attributes such as alt), the src attribute (the literal characters src= and then at least one "word" character), and then any other non-> characters (including none) to account for any other attributes, and finally the closing >.
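Here is a sketch of this regex applied to a few hypothetical tags. Note that because src= must be followed by a "word" character, a quoted filename does not match this version.

```python
import re

IMG_RE = re.compile(r"<img\s+[^>]*?src=\w+[^>]*?>")

samples = [
    "<img>",                        # no attributes at all
    "<img alt=photo src=pic.png>",  # unquoted src after another attribute
    "<img alt=photo>",              # no src attribute
    '<img src="pic.png">',          # quoted src: the quote is not a \w character
]
hits = [bool(IMG_RE.search(s)) for s in samples]

print(hits)
```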

Grouping and Capturing

In practical applications, we often need regexes that can match any one of two or more alternatives, and we often need to capture the match or some part of the match for further processing. Also, we sometimes want a quantifier to apply to several expressions. All of these goals can be achieved by grouping with parentheses; and, in the case of alternatives, using alternation with the vertical bar (|).

Alternation is especially useful when we want to match any one of several quite different alternatives. For example, the regex aircraft|airplane|jet will match any text that contains aircraft or airplane or jet. The same objective can be achieved using the regex air(craft|plane)|jet. Here, the parentheses are used to group expressions, so we have two outer expressions, air(craft|plane) and jet. The first of these has an inner expression, craft|plane, and because this is preceded by air, the first outer expression can match only aircraft or airplane.

Parentheses serve two different purposes: grouping expressions and capturing the text that matches an expression. We'll use the term group to refer to a grouped expression whether it captures or not, and capture and capture group to refer to a captured group. If we used the regex (aircraft|airplane|jet), it not only would match any of the three expressions, but would capture whichever one was matched for later reference. Compare this with the regex (air(craft|plane)|jet), which has two captures if the first expression matches (aircraft or airplane as the first capture and craft or plane as the second capture), and one capture if the second expression matches (jet). We can switch off the capturing effect by following an opening parenthesis with ?: like this:

(air(?:craft|plane)|jet)

This will have only one capture if it matches (aircraft or airplane or jet).
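In Python the captures can be inspected with a match object's groups() method. This sketch uses two short sample strings chosen for illustration; a group that took no part in the match is reported as None.

```python
import re

two_groups = re.search(r"(air(craft|plane)|jet)", "an airplane")
one_group = re.search(r"(air(?:craft|plane)|jet)", "an airplane")
jet_only = re.search(r"(air(craft|plane)|jet)", "a jet")

print(two_groups.groups())  # outer and inner captures
print(one_group.groups())   # inner group is noncapturing
print(jet_only.groups())    # inner group did not participate
```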

A grouped expression is an expression, and therefore can be quantified. As with any other expression, the quantity is assumed to be 1 unless explicitly given. For example, if we've read a text file with lines of the form key=value, where each key is alphanumeric, the regex (\w+)=(.+) will match every line that has a nonempty key and a nonempty value. (Recall that . matches anything except newlines.) And for every line that matches, two captures are made, the first being the key and the second being the value.

For example, the key=value regular expression will match the entire line topic= physical geography, with topic as the first capture and " physical geography" (including its leading space) as the second. Notice that the second capture includes some whitespace, and that whitespace before the equal sign is not accepted. We could refine the regex to be more flexible in accepting whitespace, and to strip off unwanted whitespace, by using a somewhat longer version:

[ \t]*(\w+)[ \t]*=[ \t]*(.+)

This example matches the same line as before, as well as lines that have whitespace around the equal sign, but with the first capture having no leading or trailing whitespace, and the second capture having no leading whitespace (for example, topic = physical geography).
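The two key=value regexes can be compared side by side. In this sketch the third input has a trailing space added to show that the second capture can still end with whitespace.

```python
import re

simple = re.compile(r"(\w+)=(.+)")
flexible = re.compile(r"[ \t]*(\w+)[ \t]*=[ \t]*(.+)")

first = simple.match("topic= physical geography").groups()
rejected = simple.match("topic = physical geography")  # space before = is not accepted
second = flexible.match("topic = physical geography ").groups()

print(first, rejected, second)
```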

We've been careful to keep the whitespace-matching parts outside the capturing parentheses, and to allow for lines that have no whitespace at all. We didn't use \s to match whitespace because that matches newlines (\n), which could lead to incorrect matches that span lines (for instance, if the re.MULTILINE flag is used). And for the value we didn't use \S to match non-whitespace because we want to allow for values that contain whitespace (English sentences, for example). To avoid the second capture having trailing whitespace, we would need a more sophisticated regex; we'll see this in the next subsection.

We can refer to captures by using backreferences; that is, by referring back to an earlier capture group. Note that backreferences cannot be used inside character classes; that is, inside brackets ([]).

One syntax for backreferences inside regexes themselves is \i, where i is the capture number. Captures are numbered starting from one and increasing by one going from left to right as each new (capturing) left parenthesis is encountered. For example, to match duplicated words simplistically, we can use the regex (\w+)\s+\1 to match a "word," followed by at least one whitespace, and then the same word as was captured. (Capture number 0 is created automatically without the need for parentheses; it holds the entire match—that is, what we show underlined.) We'll see a more sophisticated way to match duplicate words later.
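The duplicate-word regex can be tried as follows. The second sample sentence is an added assumption that shows why the approach is simplistic: without word boundaries, the end of one word and the next word can be reported as a duplicate.

```python
import re

dup = re.compile(r"(\w+)\s+\1")

m = dup.search("it was the the best")
whole, word = m.group(0), m.group(1)  # group 0 is the entire match

# A false positive: the "is" at the end of "this" followed by "is".
false_positive = dup.search("this is fine").group(0)

print(whole, word, false_positive)
```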

In long or complicated regexes, it's often more convenient to use names rather than numbers for captures. This approach can also make maintenance easier, because adding or removing capturing parentheses may change the numbers but won't affect names. To name a capture, follow the opening parenthesis with ?P<name>. For example,

(?P<key>\w+)=(?P<value>.+)

has two captures called key and value. The syntax for backreferences to named captures inside a regex is (?P=name). For example,

(?P<word>\w+)\s+(?P=word)

matches duplicate words using a capture called word.
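Named captures are retrieved in Python by passing the name to group(), or all at once with groupdict(). A short sketch with invented input:

```python
import re

kv = re.match(r"(?P<key>\w+)=(?P<value>.+)", "topic=geography")
captures = kv.groupdict()

dup = re.search(r"(?P<word>\w+)\s+(?P=word)", "it was the the best")
repeated = dup.group("word")

print(captures, repeated)
```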

Assertions and Flags

One problem that affects many of the regexes we've examined so far is that they can match more or different text than we intended. For example, the regex aircraft|airplane|jet will match waterjet and jetski as well as jet. This kind of problem can be solved by using assertions. An assertion doesn't match any text, but instead says something about the text at the point where the assertion occurs.

One assertion is \b (word boundary), which asserts that the character that precedes it must be a "word" (\w) and the character that follows it must be a non-"word" (\W), or vice versa. For example, although the regex jet can match twice in the following text:

the jet and jetski are noisy

that is, once in the word jet and once at the start of jetski, the regex \bjet\b will match only once, in the standalone word jet. In the context of the original regex, we could write it this way:

\baircraft\b|\bairplane\b|\bjet\b

or more clearly this way:

\b(?:aircraft|airplane|jet)\b

That is, word boundary, noncapturing expression, word boundary.
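The effect of the word boundary assertion can be seen by comparing match positions. The second input string is a made-up example.

```python
import re

text = "the jet and jetski are noisy"

plain = [m.start() for m in re.finditer(r"jet", text)]        # two matches
bounded = [m.start() for m in re.finditer(r"\bjet\b", text)]  # one match

vehicles = re.findall(r"\b(?:aircraft|airplane|jet)\b",
                      "waterjet, jetski, jet, airplane")

print(plain, bounded, vehicles)
```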

Many other assertions are supported, as shown in Table 3. We could use assertions to improve the clarity of a key=value regex, for example, by changing it to ^(\w+)=([^\n]+) and setting the re.MULTILINE flag to ensure that each key=value is taken from a single line with no possibility of spanning lines. (The flags are shown in Table 4; we'll get to flag syntax shortly.) If we also want to strip leading and trailing whitespace and use named captures, the full regex is as follows:

^[ \t]*(?P<key>\w+)[ \t]*=[ \t]*(?P<value>[^\n]+)(?<![ \t])

Even though this regex is designed for a fairly simple task, it looks quite complicated. One way to make it more maintainable is to include comments in it. This can be done by adding inline comments using the syntax (?#the comment), but in practice comments like this can easily make the regex even more difficult to read. A much nicer solution is to use the re.VERBOSE flag—this allows us to use whitespace and normal Python comments freely in regexes, with the one constraint that if we need to match whitespace we must either use \s or a character class such as [ ]. Here's the key=value regex with comments:

^[ \t]*                  # start of line and optional leading whitespace
(?P<key>\w+)             # the key text
[ \t]*=[ \t]*            # the equals with optional surrounding whitespace
(?P<value>[^\n]+)        # the value text
(?<![ \t])               # negative lookbehind to avoid trailing whitespace

In the context of a Python program, we would normally write a regex like this inside a raw triple-quoted string—raw so that we don't have to double up the backslashes, and triple-quoted so that we can spread it over multiple lines.
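As a sketch of what that might look like, the commented regex can be compiled with the re.MULTILINE and re.VERBOSE flags. The name KEY_VALUE_RE and the sample text are assumptions made for this illustration.

```python
import re

KEY_VALUE_RE = re.compile(r"""
    ^[ \t]*              # start of line and optional leading whitespace
    (?P<key>\w+)         # the key text
    [ \t]*=[ \t]*        # the equals with optional surrounding whitespace
    (?P<value>[^\n]+)    # the value text
    (?<![ \t])           # negative lookbehind to avoid trailing whitespace
    """, re.MULTILINE | re.VERBOSE)

text = "  topic = physical geography  \nauthor=Mark\n"
pairs = [(m.group("key"), m.group("value"))
         for m in KEY_VALUE_RE.finditer(text)]

print(pairs)
```

Whitespace inside a character class such as [ \t] is still significant under re.VERBOSE, which is why the regex can be indented freely here.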

In addition to the assertions we've discussed so far, there are additional assertions that look at the text in front of (or behind) the assertion to see whether it matches (or doesn't match) a specified expression. The expressions that can be used in lookbehind assertions must be of fixed length, so the quantifiers ?, +, and * cannot be used, and numeric quantifiers must be of a fixed size; for example, {3}.
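The fixed-length restriction on lookbehind can be observed in Python's re module, which refuses to compile a variable-length lookbehind. The currency example below is an illustrative assumption.

```python
import re

# A fixed-length lookbehind compiles and works: digits preceded by a dollar sign.
fixed = re.findall(r"(?<=\$)\d+", "cost $25, qty 3")

# A lookbehind containing + has no fixed length, so compilation fails.
try:
    re.compile(r"(?<=\$+)\d+")
    variable_ok = True
except re.error:
    variable_ok = False

print(fixed, variable_ok)
```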

Table 3 Regular Expression Assertions

Symbol   Meaning
^        Matches at the start; also matches after each newline with the re.MULTILINE flag.
$        Matches at the end; also matches before each newline with the re.MULTILINE flag.
\A       Matches at the start.
\b       Matches at a "word" boundary, influenced by the re.ASCII flag. Inside a character class, this is the escape for the backspace character.
\B       Matches at a non-"word" boundary, influenced by the re.ASCII flag.
\Z       Matches at the end.
(?=e)    Matches if the expression e matches at this assertion but doesn't advance over it (called lookahead or positive lookahead).
(?!e)    Matches if the expression e doesn't match at this assertion and doesn't advance over it (called negative lookahead).
(?<=e)   Matches if the expression e matches immediately before this assertion (called positive lookbehind).
(?<!e)   Matches if the expression e doesn't match immediately before this assertion (called negative lookbehind).

Table 4 The Regular Expression Module's Flags

Flag                    Meaning
re.A or re.ASCII        Makes \b, \B, \d, \D, \s, \S, \w, and \W assume that strings are ASCII; the default is for these character class shorthands to depend on the Unicode specification.
re.I or re.IGNORECASE   Makes the regex match without regard to case.
re.M or re.MULTILINE    Makes ^ match at the start and after each newline and $ match before each newline and at the end.
re.S or re.DOTALL       Makes . match every character, including newlines.
re.X or re.VERBOSE      Allows whitespace and comments to be included.

In the case of the key=value regex, the negative lookbehind assertion means that at the point it occurs the preceding character must not be a space or a tab. This has the effect of ensuring that the last character captured into the value capture group is not a space or tab (yet without preventing spaces or tabs from appearing inside the captured text).

Let's consider another example. Suppose we're reading a multiline text that contains the names Helen Patricia Sharman, Jim Sharman, Sharman Joshi, Helen Kelly, and so on, and we want to match Helen Patricia, but only when referring to Helen Patricia Sharman. The easiest way is to use this regex:

\b(Helen\s+Patricia)\s+Sharman\b

But we could achieve the same thing using a lookahead assertion:

\b(Helen\s+Patricia)(?=\s+Sharman\b)

This will match Helen Patricia only if the text is preceded by a word boundary and followed by whitespace and Sharman ending at a word boundary.

To capture variations of the forenames (Helen, Helen P., or Helen Patricia), we could make the regex slightly more sophisticated:

\b(Helen(?:\s+(?:P\.|Patricia))?)\s+(?=Sharman\b)

This matches a word boundary followed by one of the forename forms—but only if it's followed by some whitespace, Sharman, and a word boundary.
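The three Sharman regexes can be compared on a sample sentence. The sample text below is an assumption built from the names mentioned above plus two extra variants.

```python
import re

text = ("Helen Patricia Sharman, Jim Sharman, Helen P. Sharman, "
        "Helen Sharman, Helen Kelly")

# The surname is part of the match here...
consuming = re.search(r"\b(Helen\s+Patricia)\s+Sharman\b", text)
# ...but with a lookahead it is only checked, not consumed.
lookahead = re.search(r"\b(Helen\s+Patricia)(?=\s+Sharman\b)", text)

forenames = re.findall(
    r"\b(Helen(?:\s+(?:P\.|Patricia))?)\s+(?=Sharman\b)", text)

print(consuming.group(0), "|", lookahead.group(0), "|", forenames)
```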

Note that only two syntaxes perform capturing—(e) and (?P<name>e). None of the other parenthesized forms capture. This makes perfect sense for the lookahead and lookbehind assertions, since they only make a statement about what follows or precedes them—they're not part of the match, but rather affect whether a match is made. It also makes sense for the last two parenthesized forms that we'll consider next.

Earlier we saw how we can backreference a capture inside a regex either by number, such as \1, or by name, such as (?P=name). It's also possible to match conditionally depending on whether an earlier match occurred. The syntaxes are (?(id)yes_exp) and (?(id)yes_exp|no_exp). The id is the name or number of an earlier capture to which we're referring. If the capture succeeded, the yes_exp will be matched here. If the capture failed, the no_exp will be matched, if given.

Let's consider an example. Suppose we want to extract the filenames referenced by the src attribute in HTML img tags. We'll begin by trying to match just the src attribute. Unlike our earlier attempt, however, we'll account for the three forms that the attribute's value can take: single-quoted, double-quoted, and unquoted. Here's an initial attempt:

src=(["'])([^"'>]+)\1

The ([^"'>]+) part captures a greedy match of at least one character that isn't a quote or >. This regex works fine for quoted filenames, and thanks to the \1 matches only when the opening and closing quotes are the same. But it doesn't allow for unquoted filenames. To fix this problem, we must make the opening quote optional and therefore match it only if it's present. Here's the revised regex:

src=(["'])?([^"'>]+)(?(1)\1)

We didn't provide a no_exp since there's nothing to match if no quote is given. Now we're ready to put the regex in context. Here's the complete img tag regex, using named groups and comments:

<img\s+                         # start of the tag
[^>]*?                          # any attributes that precede the src
src=                            # start of the src attribute
(?P<quote>["'])?                # optional opening quote
(?P<image>[^"'>]+)              # image filename
(?(quote)(?P=quote))            # closing quote (matches opening quote if given)
[^>]*?                          # any attributes that follow the src
>                               # end of the tag

The filename capture is called image (which happens to be capture number 2).
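A sketch of how this regex might be used from Python follows. The name IMG_RE and the HTML fragment are assumptions for illustration; only the re.VERBOSE flag is needed for the comments.

```python
import re

IMG_RE = re.compile(r"""
    <img\s+                  # start of the tag
    [^>]*?                   # any attributes that precede the src
    src=                     # start of the src attribute
    (?P<quote>["'])?         # optional opening quote
    (?P<image>[^"'>]+)       # image filename
    (?(quote)(?P=quote))     # closing quote (matches opening quote if given)
    [^>]*?                   # any attributes that follow the src
    >                        # end of the tag
    """, re.VERBOSE)

html = ('<img src="a.png"> <img alt=x src=\'b.jpg\' width=1> '
        '<img src=c.gif>')
images = [m.group("image") for m in IMG_RE.finditer(html)]

# Mismatched quotes do not match.
mismatched = IMG_RE.search("<img src=\"a.png'>")

print(images, mismatched)
```

One limitation worth noting: for an unquoted filename followed by other attributes, the image capture would also take in the text after it, since [^"'>]+ accepts spaces.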

Of course, there's a simpler but subtler alternative:

src=(["']?)([^"'>]+)\1

Here, if there's a starting quote character, it's captured into capture group 1 and matched after the nonquote characters. And if there's no starting quote character, group 1 will still match—an empty string, since it's completely optional (its quantifier is 0 or 1), in which case the backreference will also match an empty string.
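This simpler alternative can be checked against the three forms of the attribute. In the following sketch, the name SRC_RE and the sample attributes are illustrative.

```python
import re

SRC_RE = re.compile(r"""src=(["']?)([^"'>]+)\1""")

results = []
for attr in ('src="a.png"', "src='b.jpg'", "src=c.gif"):
    m = SRC_RE.search(attr)
    # Group 1 is the quote (possibly empty); group 2 is the filename.
    results.append((m.group(1), m.group(2)))

print(results)
```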

The final piece of regex syntax that Python's regular expression engine offers is a means of setting the flags. Usually the flags are set by passing them as additional parameters when calling the re.compile() function, but sometimes it's more convenient to set them as part of the regex itself. The syntax is simply (?flags) where flags is one or more of the following:

  • a (the same as passing re.ASCII)
  • i (re.IGNORECASE)
  • m (re.MULTILINE)
  • s (re.DOTALL)
  • x (re.VERBOSE)

If the flags are set this way, they should be put at the start of the regex; they match nothing, so their effect on the regex is only to set the flags.
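The two ways of setting flags can be compared in a short sketch. The sample strings are made up for illustration.

```python
import re

# Flag set inside the regex itself, at the start.
inline = re.findall(r"(?i)jet", "Jet JET jet")

# The same flag passed as an additional argument.
flagged = re.findall(r"jet", "Jet JET jet", re.IGNORECASE)

# Several flags can be combined: ignore case and multiline.
combined = re.findall(r"(?im)^topic", "Topic one\ntopic two")

print(inline, flagged, combined)
```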

InformIT Promotional Mailings & Special Offers

I would like to receive exclusive offers and hear about products from InformIT and its family of brands. I can unsubscribe at any time.

Overview


Pearson Education, Inc., 221 River Street, Hoboken, New Jersey 07030, (Pearson) presents this site to provide information about products and services that can be purchased through this site.

This privacy notice provides an overview of our commitment to privacy and describes how we collect, protect, use and share personal information collected through this site. Please note that other Pearson websites and online products and services have their own separate privacy policies.

Collection and Use of Information


To conduct business and deliver products and services, Pearson collects and uses personal information in several ways in connection with this site, including:

Questions and Inquiries

For inquiries and questions, we collect the inquiry or question, together with name, contact details (email address, phone number and mailing address) and any other additional information voluntarily submitted to us through a Contact Us form or an email. We use this information to address the inquiry and respond to the question.

Online Store

For orders and purchases placed through our online store on this site, we collect order details, name, institution name and address (if applicable), email address, phone number, shipping and billing addresses, credit/debit card information, shipping options and any instructions. We use this information to complete transactions, fulfill orders, communicate with individuals placing orders or visiting the online store, and for related purposes.

Surveys

Pearson may offer opportunities to provide feedback or participate in surveys, including surveys evaluating Pearson products, services or sites. Participation is voluntary. Pearson collects information requested in the survey questions and uses the information to evaluate, support, maintain and improve products, services or sites, develop new products and services, conduct educational research and for other purposes specified in the survey.

Contests and Drawings

Occasionally, we may sponsor a contest or drawing. Participation is optional. Pearson collects name, contact information and other information specified on the entry form for the contest or drawing to conduct the contest or drawing. Pearson may collect additional personal information from the winners of a contest or drawing in order to award the prize and for tax reporting purposes, as required by law.

Newsletters

If you have elected to receive email newsletters or promotional mailings and special offers but want to unsubscribe, simply email information@informit.com.

Service Announcements

On rare occasions it is necessary to send out a strictly service related announcement. For instance, if our service is temporarily suspended for maintenance we might send users an email. Generally, users may not opt-out of these communications, though they can deactivate their account information. However, these communications are not promotional in nature.

Customer Service

We communicate with users on a regular basis to provide requested services and in regard to issues relating to their account we reply via email or phone in accordance with the users' wishes when a user submits their information through our Contact Us form.
