Python split string
last modified January 29, 2024
Python split string tutorial shows how to split strings in Python.
We can split strings in Python with the following methods:
- str.split, str.rsplit
- re.split
- str.partition, str.rpartition
Python split/rsplit methods
The split methods cut a string into parts based on the given separator parameter. With the optional second parameter we can control how many times the string is cut.
str.split([sep[, maxsplit]])
The str.split
method returns a list of the words in the string,
separated by the delimiter string.
The parameters are:
- sep − Optional. Character dividing the string into split groups; default is space.
- maxsplit − Optional. Number of splits to do; default is -1 which splits all the items.
str.rsplit([sep[, maxsplit]])
The str.rsplit
returns a list of the words in the string, separated
by the delimiter string (starting from right).
Python split examples
In the following examples, we cut strings into parts with the previously mentioned methods.
#!/usr/bin/python line = "sky, club, cpu, cloud, war, pot, rock, water" words = line.split(',') print(words) words2 = line.split(', ') print(words2) words3 = line.split(',') words4 = [e.strip() for e in words3] print(words4)
In the example, we cut the line of words delimited with a comma into a list of words.
words = line.split(',')
The string is cut by the comma character; however, the words have spaces.
words2 = line.split(', ')
One way to get rid of the spaces is to include a space character in the separator parameter.
words3 = line.split(',') words4 = [e.strip() for e in words3]
Another solution is to use the strip
method.
$ ./splitting.py ['sky', ' club', ' cpu', ' cloud', ' war', ' pot', ' rock', ' water'] ['sky', 'club', 'cpu', 'cloud', 'war', 'pot', 'rock', 'water'] ['sky', 'club', 'cpu', 'cloud', 'war', 'pot', 'rock', 'water']
With the maxsplit
parameter we can set how many splits will be
done.
#!/usr/bin/python line = "sky, club, cpu, cloud, war, pot, rock, water" words = line.split(', ', 3) for word in words: print(word) print('-------------------------') words2 = line.split(', ', 4) for word in words2: print(word)
The rest of the words forms one string.
$ ./maxsplit.py sky club cpu cloud, war, pot, rock, water ------------------------- sky club cpu cloud war, pot, rock, water
In the next example, we get words from the end of the string.
#!/usr/bin/python line = "sky, club, cpu, cloud, war, pot, rock, water" words = line.rsplit(', ', 3)[-3:] print(words)
Using the rsplit
method, we get the last three words.
$ ./split_right.py ['pot', 'rock', 'water']
Python splitlines
The str.splitlines
method returns a list of the lines in the string,
breaking at line boundaries. Line breaks are not included in the resulting list
unless keepends
is set to True
.
The line boundaries are characters including line feed \n
, carriage
return \r
, and carriage return/line feed \r\n
.
str.splitlines([keepends])
It is a convenience method to quickly split lines into lists from files.
#!/usr/bin/python line = "sky\nclub\ncpu\rcloud\r\nwar\npot\nrock\nwater" words = line.splitlines() print(words)
The example turns the string into a list of words.
$ ./split_lines.py ['sky', 'club', 'cpu', 'cloud', 'war', 'pot', 'rock', 'water']
In the next example, we read words from a file.
sky cup blue bear rock pen chair lamp bowl rock falcon
We have a file of words.
#!/usr/bin/python filename = 'words.txt' with open(filename, 'r') as f: data = f.read() words = data.splitlines() print(words)
The read
method reads the whole file into a string. The string
is then split into lines with split_lines
.
$ ./split_lines2.py ['sky', 'cup', 'blue', 'bear', 'rock', 'pen', 'chair', 'lamp', 'bowl', 'rock', 'falcon']
Python re.split
With re.split
, we can split strings by using regular expressions.
re.split(pattern, string, maxsplit=0, flags=0)
The method gives us more powerful options to cut strings.
#!/usr/bin/python import re line = "sky, \nclub, \tcpu; cloud, \n\n\nwar; pot, rock, water" words = re.split("[;,]\s+", line) print(words)
In the example, we spit the string into a list of words with
re.spit
. The words can be separated a comma or a semicolon and
multiple white spaces.
$ ./reg_split.py ['sky', 'club', 'cpu', 'cloud', 'war', 'pot', 'rock', 'water']
Python word frequency
In the following example, we count the word frequency.
$ wget https://raw.githubusercontent.com/janbodnar/data/main/the-king-james-bible.txt
We use the King James Bible.
#!/usr/bin/python import collections import re filename = 'the-king-james-bible.txt' def get_words(): words = [] with open(filename) as f: for line in f: fields = re.split("\W+", line) for w in fields: if w and not w.isdigit(): words.append(w) return words words = get_words() c = collections.Counter(words) common = c.most_common(10) for e, i in common: print(f'{e}: {i}')
The example prints the ten most common words from the
the-king-james-bible.txt
file.
fields = re.split("\W+", line)
We split the line into words. The \W
character class matches
any character which is not a word character.
for w in fields: if w and not w.isdigit(): words.append(w)
We skip empty fields and verse notations (they contain digits).
c = collections.Counter(words) common = c.most_common(10)
We count the occurences and print the top ten frequent words.
$ ./word_freq.py the: 62103 and: 38848 of: 34478 to: 13400 And: 12846 that: 12576 in: 12331 shall: 9760 he: 9665 unto: 8942
Python string partition
The partition
method splits the sequence at the first occurrence of
the given separator and returns a 3-tuple containing the part before the
separator, the separator itself, and the part after the separator.
The rpartition
method splits the sequence at the last occurrence of
the given separator and returns a 3-tuple containing the part before the
separator, the separator itself, and the part after the separator.
#!/usr/bin/python import os files = os.listdir('.') for file in files: data = file.partition('.') print(f'{data[0]} has extension {data[2]}')
The example lists the current working directory and cuts each file into its name
and extension. It uses partition
.
$ ./partition.py words has extension txt split_lines2 has extension py splitting has extension py split_lines has extension py word_freq2 has extension py split_right has extension py the-king-james-bible has extension txt reg_split has extension py word_freq has extension py partition has extension py maxsplit has extension py
Source
In this article we have showed how to split strings in Python.
Author
List all Python tutorials.