Working with Files in Python
Every program eventually needs to save data or load information from disk. Whether you’re building a log system, processing user uploads, or saving configuration settings, file handling is essential. Python makes this straightforward with its built-in file handling capabilities. This guide covers everything you need to read and write files confidently.
Opening Files with open()
The open() function is the foundation of file handling in Python. It takes at least one argument: the file path. You can also specify the mode to determine what operations are permitted.
# Open a file for reading
file = open("example.txt", "r")
The second argument is the mode, which determines what you can do with the file:
| Mode | Meaning |
|---|---|
r | Read (default) |
w | Write (overwrites existing content) |
a | Append (adds to end of file) |
rb | Read binary |
wb | Write binary |
r+ | Read and write |
w+ | Read and write (overwrites) |
a+ | Read and append |
When opening files, always consider the encoding. Python 3 uses UTF-8 by default, which handles most characters correctly.
# Explicitly specify encoding
with open("example.txt", "r", encoding="utf-8") as file:
content = file.read()
Reading File Contents
Python provides several ways to read file contents, each suited for different situations.
Reading the Entire File
# Read entire file into a string
with open("example.txt", "r") as file:
content = file.read()
print(content)
The with statement automatically closes the file when you’re done. This is the recommended approach for most use cases.
Reading Line by Line
# Read one line at a time
with open("example.txt", "r") as file:
for line in file:
print(line.strip())
This is memory-efficient for large files since it doesn’t load everything at once. The file object is iterable, which makes processing large logs or data files straightforward.
Reading All Lines into a List
# Get all lines as a list
with open("example.txt", "r") as file:
lines = file.readlines()
# Or more simply
with open("example.txt", "r") as file:
lines = list(file)
The readlines() method includes the newline characters, so use strip() when you don’t need them.
Reading a Specific Number of Characters
# Read first 100 characters
with open("example.txt", "r") as file:
first_100 = file.read(100)
This is useful when processing files in chunks or when you only need partial content.
Writing to Files
Writing a Single String
# Write text to a file (overwrites existing content)
with open("output.txt", "w") as file:
file.write("Hello, World!")
The write() method returns the number of characters written.
Writing Multiple Lines
# Write multiple lines
lines = ["First line\n", "Second line\n", "Third line\n"]
with open("output.txt", "w") as file:
file.writelines(lines)
Appending to Files
# Add to the end of existing file
with open("output.txt", "a") as file:
file.write("This gets appended\n")
Append mode is perfect for logging, where you want to add timestamped entries without losing previous data.
Why Use Context Managers?
The with statement handles cleanup automatically:
# Without context manager (not recommended)
file = open("example.txt", "r")
content = file.read()
file.close() # Must remember to close!
# With context manager (recommended)
with open("example.txt", "r") as file:
content = file.read()
# File automatically closed when exiting the with block
If you forget to close a file, you might lose data or run into permission issues. The context manager guarantees the file closes even if an exception occurs. This is crucial for reliable programs.
Handling Binary Files
Images, audio files, and other binary data require different modes:
# Reading binary data
with open("image.png", "rb") as file:
data = file.read()
# Writing binary data
with open("copy.png", "wb") as file:
file.write(data)
Binary modes are essential when working with non-text files. Reading a PNG in text mode corrupts the data because it tries to decode bytes as text.
Working with File Paths
Python’s os module and the newer pathlib provide tools for path manipulation:
import os
from pathlib import Path
# Using pathlib (recommended)
file_path = Path("data") / "output.txt"
with open(file_path, "w") as file:
file.write("Hello!")
# Check if file exists
if file_path.exists():
print("File exists")
# Get file size
print(f"Size: {file_path.stat().st_size} bytes")
Pathlib makes code more readable and works across operating systems.
Common Patterns
Copying a File
with open("source.txt", "r") as source:
with open("destination.txt", "w") as dest:
dest.write(source.read())
Counting Lines in a File
with open("example.txt", "r") as file:
line_count = sum(1 for line in file)
print(f"Lines: {line_count}")
Reading CSV Files
For CSV files, use the csv module:
import csv
with open("data.csv", "r") as file:
reader = csv.reader(file)
for row in reader:
print(row)
Processing Large Files Line by Line
# Process a large log file without loading into memory
total_errors = 0
with open("server.log", "r") as file:
for line in file:
if "ERROR" in line:
total_errors += 1
print(f"Found {total_errors} errors")
This pattern can handle files of any size without running out of memory.
Error Handling
File operations can fail for many reasons. Always handle potential errors:
try:
with open("example.txt", "r") as file:
content = file.read()
except FileNotFoundError:
print("File does not exist")
except PermissionError:
print("Permission denied")
except IOError as e:
print(f"IO error: {e}")
This makes your programs more robust and provides helpful feedback instead of crashes.
When to Use These Patterns
Use read() when you need the entire file contents in memory for processing.
Use line-by-line iteration when working with large files to avoid memory issues.
Use w mode when creating new files or replacing existing content.
Use a mode when logging or appending data to an existing file.
Use binary modes (rb, wb) for non-text files like images, PDFs, or when working with encoded data.
When Not to Use These Patterns
Don’t use file handling for data that needs to be queried or updated frequently. Databases are better suited for that use case.
Don’t read entire large files into memory when you can process them line by line.
Don’t forget to handle exceptions—file operations can fail due to missing files, permissions issues, or disk problems. Use try/except blocks for robust code.