ZetCode

Python Match.groupdict() Method

last modified April 20, 2025

Introduction to Match.groupdict

The Match.groupdict method is part of Python's re module. It returns a dictionary containing all named subgroups of a match.

This method is available on match objects returned by regex operations. It only works with patterns that contain named capturing groups.

The dictionary keys are the group names, and values are the matched strings. Unmatched named groups return None as their value.

Basic Syntax

The syntax for Match.groupdict is simple:

match.groupdict(default=None)

The optional default parameter specifies a value for unmatched groups. If not provided, unmatched groups will be None in the dictionary.

Basic Named Group Matching

Let's start with a simple example of using named groups and groupdict.

basic_groupdict.py
#!/usr/bin/python

import re

text = "Date: 2023-12-25"
pattern = re.compile(r'Date: (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')

match = pattern.search(text)
if match:
    print(match.groupdict())

This example demonstrates extracting date components using named groups. The groupdict method returns them as a dictionary.

pattern = re.compile(r'Date: (?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})')

Here we define named groups for year, month, and day using the (?P<name>...) syntax. Each group captures a specific date component.

print(match.groupdict())

The groupdict call returns a dictionary with our named groups as keys and their matched values. The output would be {'year': '2023', 'month': '12', 'day': '25'}.

Handling Optional Groups

When some named groups don't match, groupdict handles them gracefully.

optional_groups.py
#!/usr/bin/python

import re

text = "Name: John"
pattern = re.compile(r'Name: (?P<first>\w+)(?: (?P<middle>\w+))?(?: (?P<last>\w+))?')

match = pattern.search(text)
if match:
    print(match.groupdict())

This pattern has optional middle and last name groups. When they don't match, they appear in the dictionary with None values.

Using Default Values

We can specify default values for unmatched groups using the default parameter.

default_values.py
#!/usr/bin/python

import re

text = "Temperature: 25C"
pattern = re.compile(r'Temperature: (?P<value>\d+)(?P<unit>[CF])?')

match = pattern.search(text)
if match:
    print("Without default:", match.groupdict())
    print("With default:", match.groupdict(default='F'))

Here the unit group is optional. Without a default, it's None. With default='F', unmatched units default to Fahrenheit.

Combining Named and Unnamed Groups

groupdict only includes named groups, ignoring unnamed capturing groups.

mixed_groups.py
#!/usr/bin/python

import re

text = "Version 3.9.1 released"
pattern = re.compile(r'Version (\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)')

match = pattern.search(text)
if match:
    print("All groups:", match.groups())
    print("Named groups:", match.groupdict())

The first group is unnamed, so it appears in groups but not in groupdict. Only minor and patch appear in the dictionary.

Nested Group Dictionaries

For complex patterns, groupdict can create nested data structures.

nested_groups.py
#!/usr/bin/python

import re

text = "Coordinates: (40.7128, -74.0060)"
pattern = re.compile(
    r'Coordinates: \((?P<lat>-?\d+\.\d+), (?P<lon>-?\d+\.\d+)\)'
)

match = pattern.search(text)
if match:
    coords = match.groupdict()
    print(f"Latitude: {coords['lat']}, Longitude: {coords['lon']}")
    print("Full dictionary:", coords)

This extracts latitude and longitude into a dictionary, making the values easily accessible by name. The output shows both individual access and the full dictionary.

Advanced Pattern with Multiple Named Groups

Here's a more complex example with multiple named groups in a log line.

log_parser.py
#!/usr/bin/python

import re

log_line = "2023-04-20 14:35:22 ERROR [main] App crashed: NullPointerException"
pattern = re.compile(
    r'(?P<date>\d{4}-\d{2}-\d{2}) '
    r'(?P<time>\d{2}:\d{2}:\d{2}) '
    r'(?P<level>\w+) '
    r'\[(?P<thread>\w+)\] '
    r'(?P<message>.*)'
)

match = pattern.search(log_line)
if match:
    log_data = match.groupdict()
    print("Log entry components:")
    for key, value in log_data.items():
        print(f"{key:>8}: {value}")

This pattern breaks down a log line into its components using named groups. The groupdict method creates a structured representation of the log entry.

Best Practices

When using Match.groupdict, follow these best practices:

Performance Considerations

The groupdict method has minimal overhead since the groups are already captured during matching. However, creating the dictionary does require some memory allocation.

For performance-critical code that only needs specific groups, accessing them directly with group(name) might be slightly faster than using groupdict.

Source

Python Match.groupdict() documentation

This tutorial covered the essential aspects of Python's Match.groupdict method. Using named groups and their dictionary representation can make your regex code more readable and maintainable.

Author

My name is Jan Bodnar, and I am a passionate programmer with extensive programming experience. I have been writing programming articles since 2007. To date, I have authored over 1,400 articles and 8 e-books. I possess more than ten years of experience in teaching programming.

List all Python tutorials.