ZetCode

Python re.compile() Function

last modified April 20, 2025

Introduction to re.compile

The re.compile function is a fundamental part of Python's re module. It transforms a regular expression pattern into a regex object that can be reused multiple times.

Compiling patterns offers performance benefits when the same regex is used repeatedly. It also makes code more readable by separating pattern definition from usage.

The function takes a pattern string and optional flags, returning a Pattern object with methods for various matching operations.

Basic Syntax

The syntax for re.compile is straightforward:

re.compile(pattern, flags=0)

The pattern is the regular expression string. The optional flags modify how the pattern matches text.

Basic Pattern Compilation

Let's start with a simple example of compiling and using a basic pattern.

basic_compile.py
#!/usr/bin/python

import re

text = "The quick brown fox jumps over the lazy dog"
pattern = re.compile(r'fox')

match = pattern.search(text)
if match:
    print(f"Found '{match.group()}' at position {match.start()}")

This example demonstrates the basic workflow: compile a pattern, then use it to search text. The raw string (r'') prevents Python from interpreting backslashes.

pattern = re.compile(r'fox')

Here we compile a simple pattern that matches the literal string 'fox'. The compiled pattern is stored for reuse.

match = pattern.search(text)

We use the compiled pattern's search method to look for matches in the text. This returns a match object if found.

Using Flags

Flags modify how the pattern matches text. Let's see how to use them.

flags_example.py
#!/usr/bin/python

import re

text = "Python is FUN!"
pattern = re.compile(r'fun', re.IGNORECASE)

match = pattern.search(text)
if match:
    print(f"Found '{match.group()}' ignoring case")

The re.IGNORECASE flag makes the match case-insensitive. Other useful flags include re.MULTILINE and re.DOTALL.

Precompiled Patterns for Performance

Compiling patterns improves performance when reused. Here's a benchmark.

performance.py
#!/usr/bin/python

import re
import timeit

text = "a" * 1000 + "b"
setup = "import re; text = '" + text + "'; pattern = re.compile(r'a+b')"
stmt_compiled = "pattern.search(text)"
stmt_uncompiled = "re.search(r'a+b', text)"

compiled_time = timeit.timeit(stmt_compiled, setup, number=10000)
uncompiled_time = timeit.timeit(stmt_uncompiled, setup, number=10000)

print(f"Compiled: {compiled_time:.4f}s")
print(f"Uncompiled: {uncompiled_time:.4f}s")

This shows the performance difference between compiled and uncompiled patterns when used repeatedly.

Using Compiled Patterns with findall

Compiled patterns work with all regex functions. Here's findall.

findall_example.py
#!/usr/bin/python

import re

text = "10 apples, 20 oranges, 15 bananas, 5 pineapples"
pattern = re.compile(r'\d+')

numbers = pattern.findall(text)
print("Numbers found:", numbers)

The compiled pattern extracts all sequences of digits from the text. The findall method returns all matches as a list.

Splitting Text with Compiled Patterns

Compiled patterns can also split text efficiently.

split_example.py
#!/usr/bin/python

import re

text = "first,second;;third|fourth\nfifth"
pattern = re.compile(r'[,;|\n]+')

parts = pattern.split(text)
print("Split result:", parts)

This splits text on any combination of commas, semicolons, pipes, or newlines. The + quantifier handles consecutive delimiters.

Pattern Substitution

Compiled patterns enable efficient text substitution.

sub_example.py
#!/usr/bin/python

import re

text = "Contact us at support@example.com or sales@example.org"
pattern = re.compile(r'([a-z]+)@([a-z]+)\.([a-z]{2,})')

# Replace emails with [REDACTED]
redacted = pattern.sub('[REDACTED]', text)
print(redacted)

This finds email patterns and replaces them with placeholder text. The substitution is performed efficiently using the compiled pattern.

Advanced Pattern with Groups

Compiled patterns can capture groups for complex matching.

groups_example.py
#!/usr/bin/python

import re

text = "Date: 2023-12-25, Time: 23:59"
date_pattern = re.compile(r'Date: (\d{4})-(\d{2})-(\d{2})')
time_pattern = re.compile(r'Time: (\d{2}):(\d{2})')

date_match = date_pattern.search(text)
time_match = time_pattern.search(text)

if date_match:
    print(f"Year: {date_match.group(1)}, Month: {date_match.group(2)}")
if time_match:
    print(f"Hour: {time_match.group(1)}, Minute: {time_match.group(2)}")

This demonstrates extracting structured data using named groups. The compiled patterns make the extraction efficient and readable.

Best Practices

When using re.compile, follow these best practices:

Performance Considerations

While re.compile offers performance benefits, they're most noticeable in tight loops. For one-off matches, the difference is negligible.

Python internally caches recently used patterns, so even uncompiled regexes have some optimization. However, explicit compilation gives more control and clarity.

Source

Python re.compile() documentation

This tutorial covered the essential aspects of Python's re.compile function. Mastering pattern compilation will make your regex code more efficient and maintainable.

Author

My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I possess more than ten years of experience in teaching programming.

List all Python tutorials.