Golang Regexp.Split
last modified April 20, 2025
This tutorial explains how to use the Regexp.Split
method in Go.
We'll cover regex splitting basics and provide practical examples.
A regular expression is a sequence of characters that defines a search pattern. It's used for pattern matching within strings.
The Regexp.Split method splits a string into substrings separated by matches of the regular expression. It's more powerful than strings.Split.
Basic Regexp.Split Example
The simplest use of Regexp.Split
splits on a fixed delimiter.
Here we split a comma-separated string.
package main import ( "fmt" "regexp" ) func main() { re := regexp.MustCompile(`,`) text := "apple,banana,cherry,date" parts := re.Split(text, -1) for _, part := range parts { fmt.Println(part) } }
We compile a simple comma pattern and split the input string. The -1 means return all substrings, including empty ones.
Splitting on Multiple Delimiters
Regex splitting shines when you need to split on multiple possible delimiters. This example splits on commas, semicolons, or whitespace.
package main import ( "fmt" "regexp" ) func main() { re := regexp.MustCompile(`[,;\s]+`) text := "apple,banana; cherry date;elderberry" parts := re.Split(text, -1) for _, part := range parts { fmt.Println(part) } }
The pattern matches one or more commas, semicolons, or whitespace characters. This handles irregular spacing between items elegantly.
Limiting the Number of Splits
You can control how many splits occur by specifying a positive integer as the second parameter. This limits the number of returned substrings.
package main import ( "fmt" "regexp" ) func main() { re := regexp.MustCompile(`\s+`) text := "one two three four five" // Split into max 3 parts parts := re.Split(text, 3) for i, part := range parts { fmt.Printf("Part %d: %s\n", i+1, part) } }
The output will have at most 3 substrings. The remaining delimiters are left in the last substring.
Splitting with Capture Groups
When the regex contains capture groups, the matched groups are included in the result. This example demonstrates capturing delimiters.
package main import ( "fmt" "regexp" ) func main() { re := regexp.MustCompile(`(\s*)([;,]\s*)`) text := "apple; banana,cherry;date" parts := re.Split(text, -1) for i, part := range parts { fmt.Printf("Part %d: %q\n", i, part) } }
The output includes empty strings where capture groups matched. This behavior is important to understand when processing results.
Splitting Lines with Different Endings
Different operating systems use different line endings. Regex can handle all variations when splitting text into lines.
package main import ( "fmt" "regexp" ) func main() { re := regexp.MustCompile(`\r?\n|\r`) text := "line1\nline2\r\nline3\rline4" lines := re.Split(text, -1) for _, line := range lines { fmt.Println(line) } }
The pattern matches \n, \r\n, or \r. This handles Unix, Windows, and old Mac line endings consistently.
Splitting Words with Complex Boundaries
For natural language processing, you might need to split on word boundaries considering punctuation. This example shows how.
package main import ( "fmt" "regexp" ) func main() { re := regexp.MustCompile(`[\s,.!?;:]+`) text := "Hello, there! How are you today?" words := re.Split(text, -1) for _, word := range words { fmt.Println(word) } }
The pattern matches one or more whitespace or punctuation characters. This produces clean word tokens from text.
Splitting While Keeping Delimiters
Sometimes you need to split a string but preserve the delimiters as part of the
output. This requires a creative approach since the Split
method of
Go's regexp
package does not retain delimiters in its results.
Instead, we use FindAllStringIndex
combined with string slicing to
achieve this functionality.
package main import ( "fmt" "regexp" ) func main() { re := regexp.MustCompile(`[,;]`) text := "apple,banana;cherry" // Split the string while keeping the delimiters parts := re.FindAllStringIndex(text, -1) result := []string{} lastIndex := 0 for _, indices := range parts { start, end := indices[0], indices[1] result = append(result, text[lastIndex:start]) // Add part before the delimiter result = append(result, text[start:end]) // Add the delimiter lastIndex = end } result = append(result, text[lastIndex:]) // Add the final part after the last delimiter for _, part := range result { fmt.Println(part) } }
This example demonstrates how to split a string into substrings while retaining
the delimiters in the output. The regular expression [,;]
matches
the delimiters, and the FindAllStringIndex
method retrieves their
positions in the string. Using these positions, the program slices the original
string and constructs an array that includes both substrings and delimiters.
The output shows the delimiters embedded within the split parts, making this approach useful for cases where the delimiters are essential for further processing or formatting. This method can be adapted for various patterns and delimiters, offering flexibility when working with structured text.
Source
Go regexp package documentation
This tutorial covered the Regexp.Split
method in Go with
practical examples of string splitting using regular expressions.
Author
List all Go tutorials.