Python readlines Function
Last modified March 26, 2025
This comprehensive guide explores Python's readlines function, a powerful method for reading files line by line. We'll cover basic usage, memory considerations, context managers, encoding handling, and best practices. Through practical examples, you'll master line-based file reading in Python.
Basic Definitions
The readlines function reads all lines from a file object and returns them as a list of strings. Each string represents one line from the file, including the newline character at the end. This function is particularly useful when you need to process each line of a file individually.
Unlike read, which returns the entire content as a single string, readlines splits the content at line boundaries. In the default text mode, Python's universal newlines handling translates platform-specific line endings (Windows, Unix, classic Mac) to \n.
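To make the difference concrete, here is a minimal sketch contrasting the two calls; the filename 'notes.txt' is a placeholder and the file is assumed to exist.

# A minimal sketch contrasting read and readlines; 'notes.txt' is a placeholder filename
with open('notes.txt', 'r') as file:
    whole_text = file.read()        # one string with the entire file contents

with open('notes.txt', 'r') as file:
    lines = file.readlines()        # list of strings, one per line, newlines included

print(type(whole_text), len(whole_text))
print(type(lines), len(lines))
if lines:
    print(repr(lines[0]))           # e.g. 'first line\n'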
Basic readlines Usage
The simplest use of readlines reads all lines from a file into a list. Each element in the list corresponds to one line from the file.
# Open a file and read all lines
with open('example.txt', 'r') as file:
    lines = file.readlines()

for line in lines:
    print(line.strip())  # Remove newline character
This example opens 'example.txt' in read mode, reads all lines into a list using readlines, then processes each line. The strip method removes leading and trailing whitespace, including the newline character, from each line.
The with statement ensures proper file closure. Each line in the resulting list ends with a newline character (except possibly the last line, if the file does not end with one), which is why we use strip when printing. This behavior matches how the lines appear in the actual file.
Reading Specific Number of Lines
The readlines function can accept a size hint parameter to limit the amount of data read. This helps when working with very large files.
# Read approximately 1000 bytes worth of lines
with open('large_file.txt', 'r') as file:
    lines = file.readlines(1000)
    print(f"Read {len(lines)} lines")
    for line in lines:
        print(line.strip())
This code attempts to read about 1000 bytes worth of lines from the file. The size hint is not an exact limit: readlines keeps reading complete lines until the total size reaches the hint, so the number of lines returned varies and no line is ever broken in the middle. This makes the hint useful for processing large files in chunks.
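One way to use the hint is to call readlines repeatedly until it returns an empty list, processing one batch at a time. The sketch below counts non-empty lines this way; the filename and the 64 KB hint are illustrative assumptions.

# A minimal sketch: count non-empty lines of a large file in batches;
# 'large_file.txt' and the 64 KB hint are illustrative assumptions
total = 0
with open('large_file.txt', 'r') as file:
    while True:
        batch = file.readlines(65536)   # roughly 64 KB of complete lines per call
        if not batch:                   # an empty list signals end of file
            break
        total += sum(1 for line in batch if line.strip())

print(f"Non-empty lines: {total}")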
Processing Lines with readlines
The list returned by readlines can be processed like any Python list. This example demonstrates filtering and transforming lines.
# Process lines from a file
with open('data.txt', 'r') as file:
    lines = file.readlines()

# Filter empty lines and comments
cleaned_lines = [line.strip() for line in lines
                 if line.strip() and not line.startswith('#')]

# Convert valid lines to uppercase
upper_lines = [line.upper() for line in cleaned_lines]

print("Processed lines:")
for line in upper_lines:
    print(line)
This code reads all lines, then filters out empty lines and comments (lines starting with #). The remaining lines are converted to uppercase. List comprehensions provide a concise way to process the lines.
The example shows how readlines integrates with Python's list processing capabilities. You can chain multiple transformations and filters to create powerful data processing pipelines directly from file contents.
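As a further illustration of such a pipeline, this hypothetical sketch parses 'key=value' lines into a dictionary; the filename 'settings.txt' and the file format are assumptions.

# A hypothetical sketch: parse 'key=value' lines into a dict; 'settings.txt' is assumed
with open('settings.txt', 'r') as file:
    lines = file.readlines()

pairs = (line.split('=', 1) for line in lines
         if '=' in line and not line.lstrip().startswith('#'))
settings = {key.strip(): value.strip() for key, value in pairs}
print(settings)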
Comparing readlines with Iteration
While readlines reads all lines into memory at once, directly iterating over the file object is more memory efficient for large files.
# Memory-efficient line reading
print("Using readlines (all lines in memory):")
with open('large_file.txt', 'r') as file:
    lines = file.readlines()
    for line in lines[:5]:  # Only show first 5 lines
        print(line.strip())

print("\nUsing iteration (memory efficient):")
with open('large_file.txt', 'r') as file:
    line_count = 0
    for line in file:  # Reads one line at a time
        print(line.strip())
        line_count += 1
        if line_count >= 5:
            break
The first approach loads all lines into memory, which can be problematic for very large files. The second approach reads one line at a time, using minimal memory. The file object itself is iterable in Python.
For small files, either approach works well. For large files (several GB),
iteration is preferred. readlines is convenient when you need all lines in memory for random access or multiple passes through the data.
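The following minimal sketch shows why having the list in memory can be handy: it indexes an arbitrary line and then makes two passes over the same data. The filename 'config.txt' is a placeholder assumption.

# A minimal sketch of random access and multiple passes; 'config.txt' is a placeholder
with open('config.txt', 'r') as file:
    lines = file.readlines()

if len(lines) > 2:
    print("Third line:", lines[2].strip())                  # random access by index

blank_count = sum(1 for line in lines if not line.strip())  # first pass
longest = max((line.rstrip('\n') for line in lines), key=len, default='')  # second pass
print(f"Blank lines: {blank_count}")
print(f"Longest line has {len(longest)} characters")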
Handling Different Encodings
The readlines function works with different file encodings when specified during file opening. This is crucial for international text files.
# Reading a UTF-8 encoded file
try:
    with open('multilingual.txt', 'r', encoding='utf-8') as file:
        lines = file.readlines()
        for line in lines:
            print(line.strip())
except UnicodeDecodeError:
    print("Error: Could not decode the file with UTF-8 encoding")

# Reading a file with fallback encoding
with open('legacy.txt', 'r', encoding='latin-1', errors='replace') as file:
    lines = file.readlines()
    for line in lines:
        print(line.strip())
The first example attempts to read a UTF-8 file, which supports most languages. If the file contains invalid UTF-8 sequences, it raises an exception. The second example uses Latin-1 encoding with error replacement for legacy files.
When working with text files, always consider the encoding. In Python 3, open defaults to the platform's locale encoding, which is often UTF-8 on Linux and macOS but not on every system, so it is safest to specify the encoding explicitly. The errors parameter controls how decoding errors are handled ('strict', 'ignore', 'replace', etc.).
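The same error handlers that open accepts can be demonstrated directly on a byte string, as in this minimal sketch; the byte string is purely illustrative (Latin-1 bytes that are invalid UTF-8).

# A minimal sketch of the error handlers used by the errors parameter
data = b'caf\xe9 au lait'                        # Latin-1 bytes, invalid as UTF-8

print(data.decode('utf-8', errors='replace'))    # bad byte becomes the U+FFFD marker
print(data.decode('utf-8', errors='ignore'))     # bad byte is silently dropped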
Best Practices
- Use with statements: Always use context managers for file handling
- Consider memory usage: For large files, iterate directly over the file object
- Handle encodings: Always specify the correct file encoding
- Clean line endings: Remember to strip newline characters when needed
- Handle exceptions: Catch OSError (of which IOError is an alias in Python 3) for file operations and UnicodeDecodeError for encoding issues, as in the sketch below
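The sketch below pulls these practices together in one helper; the filename 'report.txt' and the UTF-8 default are assumptions for illustration.

# A minimal sketch combining the practices above; 'report.txt' and UTF-8 are assumptions
def read_clean_lines(path, encoding='utf-8'):
    """Return a list of stripped, non-empty lines, or [] on failure."""
    try:
        with open(path, 'r', encoding=encoding) as file:   # context manager closes the file
            return [line.strip() for line in file if line.strip()]  # iterate, don't load all at once
    except OSError as exc:
        print(f"File error: {exc}")
    except UnicodeDecodeError as exc:
        print(f"Encoding error: {exc}")
    return []

print(read_clean_lines('report.txt'))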