Golang Regexp.FindAllSubmatchIndex
last modified April 20, 2025
This tutorial explains how to use the Regexp.FindAllSubmatchIndex method in Go. We'll cover its functionality and provide practical examples.
A regular expression is a sequence of characters that defines a search pattern. It's used for pattern matching within strings.
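The Regexp.FindAllSubmatchIndex method returns a slice of integer slices, one per match. In each inner slice, elements 0 and 1 hold the start and end byte offsets of the full match, and elements 2n and 2n+1 hold the offsets of the nth capture group. A group that did not participate in the match is reported as -1, and the method returns nil when nothing matches.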
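To make the layout of the result concrete, here is a minimal sketch that regroups one inner slice into start/end pairs. The helper name spans and the sample pattern are assumptions made for illustration; they are not part of the regexp package.

```go
package main

import (
    "fmt"
    "regexp"
)

// spans converts one inner slice from FindAllSubmatchIndex into
// [start, end] pairs: pair 0 is the full match, pair i is group i.
// (spans is a hypothetical helper, not part of the regexp package.)
func spans(match []int) [][2]int {
    pairs := make([][2]int, 0, len(match)/2)
    for i := 0; i+1 < len(match); i += 2 {
        pairs = append(pairs, [2]int{match[i], match[i+1]})
    }
    return pairs
}

func main() {
    re := regexp.MustCompile(`(\w+)@(\w+)`)
    text := []byte("ann@home bob@work")

    for _, match := range re.FindAllSubmatchIndex(text, -1) {
        for i, p := range spans(match) {
            fmt.Printf("group %d: [%d, %d) %q\n", i, p[0], p[1], text[p[0]:p[1]])
        }
    }
}
```

The end offset is exclusive, so text[start:end] yields exactly the matched bytes.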
Basic FindAllSubmatchIndex Example
The simplest use of FindAllSubmatchIndex finds all matches and their positions in a string. Here we locate simple word matches.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`go`)
    text := "go golang go go"

    indices := re.FindAllSubmatchIndex([]byte(text), -1)
    for _, match := range indices {
        fmt.Printf("Found at %d-%d: %s\n",
            match[0], match[1], text[match[0]:match[1]])
    }
}
We compile the pattern "go" and find all occurrences in the text. Each match returns start and end positions in the input byte slice.
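When the input is already a string, the []byte conversion can be avoided. The sketch below uses FindAllStringSubmatchIndex, the string counterpart from the same package, which returns offsets in the same layout. The function name wordSpans is an assumption chosen for this example.

```go
package main

import (
    "fmt"
    "regexp"
)

// wordSpans returns the [start, end) byte offsets of every "go" in s.
// It calls FindAllStringSubmatchIndex, so no []byte conversion is needed.
// (wordSpans is a hypothetical helper.)
func wordSpans(s string) [][]int {
    re := regexp.MustCompile(`go`)
    return re.FindAllStringSubmatchIndex(s, -1)
}

func main() {
    text := "go golang go go"
    for _, m := range wordSpans(text) {
        fmt.Printf("Found at %d-%d: %s\n", m[0], m[1], text[m[0]:m[1]])
    }
}
```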
Finding Submatch Positions
This example demonstrates capturing group positions. We'll extract date components with their exact locations.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`(\d{4})-(\d{2})-(\d{2})`)
    text := "Date: 2025-04-20, Another: 2026-05-21"

    matches := re.FindAllSubmatchIndex([]byte(text), -1)
    for _, match := range matches {
        fmt.Println("Full match:", text[match[0]:match[1]])
        fmt.Println("Year:", text[match[2]:match[3]])
        fmt.Println("Month:", text[match[4]:match[5]])
        fmt.Println("Day:", text[match[6]:match[7]])
    }
}
The pattern has three capture groups. Each match returns eight integers: pairs for full match and each group's start/end positions.
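Hard-coded offsets such as match[4] become fragile once a pattern grows. One possible alternative, sketched here, is to name the groups and look up their numbers with SubexpIndex (available since Go 1.15). The group names and the parseDates helper are assumptions made for this example.

```go
package main

import (
    "fmt"
    "regexp"
)

var dateRe = regexp.MustCompile(`(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})`)

// parseDates returns the year, month, and day of every date in text.
// (parseDates is a hypothetical helper.)
func parseDates(text string) [][3]string {
    var dates [][3]string
    names := [3]string{"year", "month", "day"}
    for _, m := range dateRe.FindAllSubmatchIndex([]byte(text), -1) {
        var d [3]string
        for k, name := range names {
            i := dateRe.SubexpIndex(name) // group number for this name
            d[k] = text[m[2*i]:m[2*i+1]]  // offsets of group i
        }
        dates = append(dates, d)
    }
    return dates
}

func main() {
    for _, d := range parseDates("Date: 2025-04-20, Another: 2026-05-21") {
        fmt.Printf("year=%s month=%s day=%s\n", d[0], d[1], d[2])
    }
}
```

All three groups are mandatory in this pattern, so the sketch does not check for -1 offsets.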
Finding Overlapping Matches
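FindAllSubmatchIndex returns only non-overlapping matches, and Go's RE2-based engine does not support lookahead assertions, so a pattern such as (?=(\d{3})) fails to compile. To find overlapping matches, restart the search one byte after the start of each match. This shows all possible 3-digit sequences.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`(\d{3})`)
    text := "12345"
    data := []byte(text)

    for pos := 0; pos < len(data); {
        // Ask for at most one match in the remaining input.
        matches := re.FindAllSubmatchIndex(data[pos:], 1)
        if len(matches) == 0 {
            break
        }
        // Offsets are relative to data[pos:], so shift them by pos.
        start, end := pos+matches[0][2], pos+matches[0][3]
        fmt.Printf("Found at %d-%d: %s\n", start, end, text[start:end])
        pos = start + 1
    }
}
Each call is limited to a single match. The returned offsets are relative to the slice that was searched, so we add pos to obtain positions in the original text, then resume one byte after the match start so the next search can overlap the previous match.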
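Go's regexp package follows RE2 syntax, which leaves out lookaround assertions. A minimal sketch for checking whether a pattern is accepted is shown below; it uses regexp.Compile rather than MustCompile so an unsupported pattern yields an error instead of a panic. The helper name compiles is an assumption for illustration.

```go
package main

import (
    "fmt"
    "regexp"
)

// compiles reports whether Go's regexp package accepts the pattern.
// (compiles is a hypothetical helper.)
func compiles(pattern string) bool {
    _, err := regexp.Compile(pattern)
    return err == nil
}

func main() {
    fmt.Println(compiles(`(\d{3})`))     // plain capture group
    fmt.Println(compiles(`(?=(\d{3}))`)) // Perl-style lookahead
}
```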
Extracting Key-Value Pairs
This example shows how to parse key-value pairs and get their exact positions. We'll process a simple configuration string.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`(\w+)=("[^"]*"|\S+)`)
    text := `name="John Doe" age=30 city="New York"`

    matches := re.FindAllSubmatchIndex([]byte(text), -1)
    for _, match := range matches {
        key := text[match[2]:match[3]]
        value := text[match[4]:match[5]]
        fmt.Printf("Key: %s (at %d-%d), Value: %s (at %d-%d)\n",
            key, match[2], match[3], value, match[4], match[5])
    }
}
The pattern matches keys and values, with quoted values supported. We extract both the content and positions of each component.
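Positions are most useful when the input has to be edited in place. As one possible use, the sketch below overwrites every value with asterisks of the same length, so all other offsets in the text stay valid. The maskValues helper is an assumption made for this example.

```go
package main

import (
    "fmt"
    "regexp"
)

var pairRe = regexp.MustCompile(`(\w+)=("[^"]*"|\S+)`)

// maskValues replaces each value (capture group 2) with asterisks of
// the same length, leaving keys and spacing untouched.
// (maskValues is a hypothetical helper.)
func maskValues(text string) string {
    out := []byte(text)
    // All matches are computed before the loop body modifies out.
    for _, m := range pairRe.FindAllSubmatchIndex(out, -1) {
        for i := m[4]; i < m[5]; i++ {
            out[i] = '*'
        }
    }
    return string(out)
}

func main() {
    fmt.Println(maskValues(`name="John Doe" age=30 city="New York"`))
}
```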
Finding Nested Structures
This advanced example uses nested capture groups, one of which is optional. We'll extract opening HTML tags with their attributes and positions.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`<(\w+)(\s+([^>]*))?>`)
    text := `<div class="header"><p align="center">Hello</p></div>`

    matches := re.FindAllSubmatchIndex([]byte(text), -1)
    for _, match := range matches {
        fmt.Println("Full tag:", text[match[0]:match[1]])
        fmt.Println("Tag name:", text[match[2]:match[3]])
        if match[4] != -1 {
            fmt.Println("Attributes:", text[match[6]:match[7]])
        }
    }
}
The pattern matches HTML tags with optional attributes. We check for -1 to handle optional groups that didn't participate in the match.
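The -1 check can be wrapped in a small helper so that slicing never happens with a negative offset. This is only a sketch; the group function below is a hypothetical helper, not part of the regexp package.

```go
package main

import (
    "fmt"
    "regexp"
)

// group returns the text of capture group n for one match and reports
// whether that group participated in the match.
// (group is a hypothetical helper.)
func group(text string, match []int, n int) (string, bool) {
    if n < 0 || 2*n+1 >= len(match) || match[2*n] < 0 {
        return "", false
    }
    return text[match[2*n]:match[2*n+1]], true
}

func main() {
    re := regexp.MustCompile(`<(\w+)(\s+([^>]*))?>`)
    text := `<div class="header"><p>Hello</p></div>`

    for _, m := range re.FindAllSubmatchIndex([]byte(text), -1) {
        name, _ := group(text, m, 1)
        if attrs, ok := group(text, m, 3); ok {
            fmt.Printf("%s has attributes %q\n", name, attrs)
        } else {
            fmt.Printf("%s has no attributes\n", name)
        }
    }
}
```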
Handling Empty Matches
This example shows how FindAllSubmatchIndex handles empty matches. We'll process a string with potential zero-length matches.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`a*`)
    text := "baaab"

    matches := re.FindAllSubmatchIndex([]byte(text), -1)
    for i, match := range matches {
        if match[0] == match[1] {
            fmt.Printf("Match %d: empty at position %d\n", i, match[0])
        } else {
            fmt.Printf("Match %d: %q at %d-%d\n",
                i, text[match[0]:match[1]], match[0], match[1])
        }
    }
}
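The pattern matches zero or more 'a's. Empty matches are returned with equal start and end positions, and an empty match that directly follows a preceding match is ignored. For "baaab" the result is an empty match at position 0, "aaa" at 1-4, and an empty match at position 5; no empty match is reported at position 4 because it abuts "aaa".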
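The package documentation states that empty matches abutting a preceding match are ignored. The following sketch counts total and empty matches so the effect can be observed for any pattern; the countMatches helper is an assumption made for this example.

```go
package main

import (
    "fmt"
    "regexp"
)

// countMatches returns the total number of matches and how many of
// them are empty (start offset equal to end offset).
// (countMatches is a hypothetical helper.)
func countMatches(pattern, text string) (total, empty int) {
    re := regexp.MustCompile(pattern)
    for _, m := range re.FindAllSubmatchIndex([]byte(text), -1) {
        total++
        if m[0] == m[1] {
            empty++
        }
    }
    return total, empty
}

func main() {
    total, empty := countMatches(`a*`, "baaab")
    fmt.Printf("total=%d empty=%d\n", total, empty)
}
```

Code that only cares about real content can skip the matches where both offsets are equal.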
Performance Considerations
When processing large texts, consider limiting the number of matches. This example shows how to control result size.
package main

import (
    "fmt"
    "regexp"
)

func main() {
    re := regexp.MustCompile(`\d+`)
    text := "1 22 333 4444 55555 666666"

    // Find first 3 matches
    matches := re.FindAllSubmatchIndex([]byte(text), 3)
    for _, match := range matches {
        fmt.Println(text[match[0]:match[1]])
    }
}
The second parameter limits the number of matches returned; a negative value returns all of them. Use a positive limit to prevent excessive memory usage with patterns that might match many times.
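To compare how different limits behave, here is a small sketch that reports the number of matches for a given n. The countWithLimit helper is an assumption made for illustration.

```go
package main

import (
    "fmt"
    "regexp"
)

// countWithLimit returns how many matches FindAllSubmatchIndex reports
// for the given limit n.
// (countWithLimit is a hypothetical helper.)
func countWithLimit(text string, n int) int {
    re := regexp.MustCompile(`\d+`)
    return len(re.FindAllSubmatchIndex([]byte(text), n))
}

func main() {
    text := "1 22 333 4444"
    for _, n := range []int{-1, 0, 2, 10} {
        fmt.Printf("n=%d -> %d matches\n", n, countWithLimit(text, n))
    }
}
```

A limit larger than the number of available matches is harmless: the method simply returns every match it finds.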
Source
Go regexp package documentation
This tutorial covered the Regexp.FindAllSubmatchIndex method in Go with practical examples of finding matches and their positions.