Python re.purge() Function
last modified April 20, 2025
Introduction to re.purge
The re.purge function is part of Python's re module.
It clears the regular expression cache used by the module to store compiled
patterns.
Python caches recently used regular expressions to improve performance.
The re.purge function allows manual clearing of this cache.
This function is particularly useful in memory-sensitive applications or when working with many unique patterns that shouldn't be cached long-term.
Basic Syntax
The syntax for re.purge is simple as it takes no arguments:
re.purge()
Calling this function clears the internal cache of compiled regex patterns. Subsequent regex operations will need to recompile patterns if used again.
Basic Cache Purge Example
Here's a simple demonstration of clearing the regex cache.
#!/usr/bin/python import re # Compile some patterns to fill the cache re.compile(r'pattern1') re.compile(r'pattern2') # Clear the cache re.purge() # These will need to be compiled fresh re.compile(r'pattern1') re.compile(r'pattern2')
This example shows the basic usage of re.purge to clear the
cache between compilations.
re.compile(r'pattern1')
This first compilation adds the pattern to Python's internal regex cache.
re.purge()
This call clears all cached compiled patterns, freeing memory immediately.
Measuring Cache Impact
Let's measure how the cache affects compilation performance.
#!/usr/bin/python
import re
import time
def time_compilation(pattern):
start = time.perf_counter()
re.compile(pattern)
return time.perf_counter() - start
# First compilation (uncached)
t1 = time_compilation(r'\d+')
# Second compilation (cached)
t2 = time_compilation(r'\d+')
# After purge
re.purge()
t3 = time_compilation(r'\d+')
print(f"First: {t1:.6f}s, Second: {t2:.6f}s, After purge: {t3:.6f}s")
This demonstrates the performance difference between cached and uncached compilations.
Memory Management with Purge
Here's how re.purge can help manage memory usage.
#!/usr/bin/python
import re
import sys
def show_cache_size():
return sys.getsizeof(re._cache)
# Fill cache with many patterns
for i in range(1000):
re.compile(f'pattern_{i}')
print(f"Cache size before purge: {show_cache_size()} bytes")
# Clear the cache
re.purge()
print(f"Cache size after purge: {show_cache_size()} bytes")
This example shows how re.purge can reduce memory usage by
clearing the regex cache.
Cache Behavior with Identical Patterns
The cache stores identical patterns efficiently.
#!/usr/bin/python
import re
# These compile to the same pattern object
pattern1 = re.compile(r'\d+')
pattern2 = re.compile(r'\d+')
print(f"Same object: {pattern1 is pattern2}")
# After purge, new objects are created
re.purge()
pattern3 = re.compile(r'\d+')
pattern4 = re.compile(r'\d+')
print(f"After purge: {pattern3 is pattern4}")
This demonstrates how identical patterns share cached objects until the cache is cleared.
Cache Limits and Purge Timing
Python's regex cache has a default limit we can observe.
#!/usr/bin/python
import re
import sys
# Check default cache size
print(f"Default cache limit: {re._MAXCACHE}")
# Fill cache beyond limit
for i in range(re._MAXCACHE + 10):
re.compile(f'pattern_{i}')
# Verify cache size
print(f"Actual cache size: {len(re._cache)}")
# Purge resets everything
re.purge()
print(f"After purge: {len(re._cache)}")
This shows how the cache automatically manages its size and how
re.purge provides manual control.
Testing Cache-Dependent Code
re.purge is useful for testing cache-dependent behavior.
#!/usr/bin/python
import re
def test_pattern_compilation():
# Ensure fresh start
re.purge()
# Test first compilation
start = time.perf_counter()
re.compile(r'complex_pattern')
first_time = time.perf_counter() - start
# Test cached compilation
start = time.perf_counter()
re.compile(r'complex_pattern')
cached_time = time.perf_counter() - start
assert cached_time < first_time, "Cache not working"
print("Test passed")
test_pattern_compilation()
This demonstrates using re.purge to ensure consistent
testing conditions.
Best Practices
When using re.purge, consider these best practices:
- Use sparingly - the cache exists to improve performance
- Call before memory-intensive operations if using many unique patterns
- Use in testing to ensure consistent behavior
- Combine with custom caching for specialized applications
- Monitor actual memory impact before optimizing
Performance Considerations
re.purge should be used judiciously as clearing the cache
forces recompilation of subsequent patterns.
The performance impact depends on your pattern complexity and usage frequency. Measure before optimizing cache behavior.
Source
Python re.purge() documentation
This tutorial covered Python's re.purge function and its
role in managing the regex cache. Use it wisely for memory management.
Author
List all Python tutorials.