gzip

import gzip
Updated March 13, 2026 · Modules

The gzip module provides a simple interface for compressing and decompressing files, modeled after the GNU programs gzip and gunzip. It uses the zlib module internally to handle the actual data compression.

The module provides the GzipFile class for reading and writing gzip-format files, along with convenience functions open(), compress(), and decompress() for simpler use cases. Gzip-compressed files are widely used across Unix systems, web servers (for HTTP compression), and data pipelines for reducing storage and transfer costs.

Syntax

import gzip

Functions

open()

Opens a gzip-compressed file in binary or text mode, returning a file object.

Signature: gzip.open(filename, mode='rb', compresslevel=9, encoding=None, errors=None, newline=None)

Parameters:

Parameter | Type | Default | Description
filename | str, bytes, path-like, or file object | (required) | Path to the file, or an existing file object to read from or write to
mode | str | 'rb' | Binary: 'r', 'rb', 'a', 'ab', 'w', 'wb', 'x', 'xb'. Text: 'rt', 'at', 'wt', 'xt'
compresslevel | int | 9 | Compression level from 0 to 9 (0 = no compression, 9 = maximum)
encoding | str | None | Text encoding (text mode only)
errors | str | None | Error handling (text mode only)
newline | str | None | Line ending handling (text mode only)

Returns: A file object (GzipFile or TextIOWrapper).

Example:

import gzip

# Writing to a compressed file
with gzip.open('data.txt.gz', 'wb') as f:
    f.write(b'Hello, World!')

# Reading from a compressed file
with gzip.open('data.txt.gz', 'rb') as f:
    content = f.read()
    print(content)
# b'Hello, World!'
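
The example above covers binary mode only. As a minimal sketch of the text modes listed in the parameter table, the following writes and reads str values with an explicit encoding. The file name notes.txt.gz and the temporary directory are assumptions made for the demo.

```python
import gzip
import os
import tempfile

# Hypothetical file name inside a temporary directory, used only for this demo.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'notes.txt.gz')

    # 'wt' wraps the GzipFile in a TextIOWrapper, so str is accepted.
    with gzip.open(path, 'wt', encoding='utf-8') as f:
        f.write('first line\nsecond line\n')

    # 'rt' decodes the decompressed bytes back to str.
    with gzip.open(path, 'rt', encoding='utf-8') as f:
        lines = f.read().splitlines()

print(lines)
```

Passing encoding explicitly avoids depending on the platform default, which can differ between machines.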

compress()

Compresses data in memory and returns a bytes object containing the compressed data.

Signature: gzip.compress(data, compresslevel=9, *, mtime=0)

Parameters:

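Parameter | Type | Default | Description
data | bytes-like | (required) | The data to compress
compresslevel | int | 9 | Compression level from 0 to 9
mtime | int or None | 0 | Modification time for the gzip header. Use 0 for reproducible output, None for the current time. The default is 0 as of Python 3.14; earlier versions default to None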

Returns: bytes — The compressed data.

Example:

import gzip

data = b'This is a much longer piece of text that we want to compress for storage efficiency.'
compressed = gzip.compress(data)

print(f'Original size: {len(data)} bytes')
print(f'Compressed size: {len(compressed)} bytes')
print(f'Decompressed: {gzip.decompress(compressed)}')
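# Original size: 84 bytes
# Compressed size: depends on the zlib build; the gzip header and trailer
# add at least 18 bytes, so an input this short may not shrink at all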
# Decompressed: b'This is a much longer piece of text...'
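
To illustrate the compresslevel parameter, here is a rough sketch that compresses the same input at several levels and compares the sizes. The sample log line is invented for the demo, and the exact sizes depend on the data and the zlib build, so no output is shown.

```python
import gzip

# Repetitive sample input (an assumption for the demo); real data will differ.
sample = b'timestamp=2026-03-13 level=INFO message=request handled\n' * 500

# Map each compression level to the size of its output.
sizes = {
    level: len(gzip.compress(sample, compresslevel=level))
    for level in (0, 1, 6, 9)
}

print(f'uncompressed: {len(sample)} bytes')
for level, size in sizes.items():
    print(f'level {level}: {size} bytes')
```

Level 0 stores the data without compressing it, so its output is slightly larger than the input. Higher levels trade CPU time for smaller output.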

decompress()

Decompresses gzip-compressed data and returns the original uncompressed bytes.

Signature: gzip.decompress(data)

Parameters:

Parameter | Type | Default | Description
data | bytes-like | (required) | The compressed data to decompress

Returns: bytes — The uncompressed data.

Example:

import gzip

# Compress then decompress
original = b'Binary data here'
compressed = gzip.compress(original)
decompressed = gzip.decompress(compressed)

print(decompressed == original)
# True

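# To decompress a whole file in memory, read its raw bytes with the built-in open().
# gzip.open() already returns decompressed data, so its output must not be passed to decompress().
with open('data.txt.gz', 'rb') as f:
    raw_data = f.read()
    decompressed = gzip.decompress(raw_data)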

Classes

GzipFile

The GzipFile class provides a file-like interface for reading and writing gzip-compressed files. It simulates most file object methods.

Signature: gzip.GzipFile(filename=None, mode=None, compresslevel=9, fileobj=None, mtime=None)

Parameters:

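Parameter | Type | Default | Description
filename | str, bytes, or path-like | None | Path of the file to open when fileobj is None. When fileobj is given, used only as the name stored in the gzip header
mode | str | None | Mode: 'rb', 'wb', 'ab', etc.
compresslevel | int | 9 | Compression level from 0 to 9
fileobj | file-like | None | Existing file object to wrap
mtime | int | None | Timestamp for the gzip header (Unix epoch seconds). None uses the current time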

Attributes:

Attribute | Type | Description
mtime | int or None | Timestamp from the gzip header when decompressing
name | str or bytes | Path to the gzip file on disk
mode | str | 'rb' for reading, 'wb' for writing

Example:

import gzip
from io import BytesIO

# Using GzipFile with BytesIO for in-memory compression
buffer = BytesIO()
with gzip.GzipFile(fileobj=buffer, mode='wb') as f:
    f.write(b'In-memory compressed data')

# Read back
buffer.seek(0)
with gzip.GzipFile(fileobj=buffer, mode='rb') as f:
    print(f.read())
# b'In-memory compressed data'
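
The mtime parameter and the mtime attribute are two sides of the same header field. The sketch below writes a fixed timestamp and reads it back. The timestamp value is arbitrary and chosen only for illustration.

```python
import gzip
from io import BytesIO

fixed_mtime = 1_700_000_000  # arbitrary example timestamp (Unix epoch seconds)

# Write a gzip stream whose header carries the fixed timestamp.
buffer = BytesIO()
with gzip.GzipFile(fileobj=buffer, mode='wb', mtime=fixed_mtime) as f:
    f.write(b'payload')

# Read it back and inspect the attribute before and after reading.
buffer.seek(0)
with gzip.GzipFile(fileobj=buffer, mode='rb') as f:
    mtime_before = f.mtime  # None until a header has been read
    payload = f.read()
    mtime_after = f.mtime   # timestamp taken from the header

print(mtime_before, mtime_after)
```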

GzipFile.peek()

Reads uncompressed bytes without advancing the file position. The number of bytes returned may be more or less than requested.

Signature: GzipFile.peek(n)

Parameters:

Parameter | Type | Default | Description
n | int | (required) | Number of bytes to peek at

Returns: bytes — The peeked data.

Example:

import gzip

with gzip.open('data.txt.gz', 'rb') as f:
    # Peek at the beginning of the file
    header = f.peek(10)
    print(f'First 10 bytes: {header[:10]}')
    
    # Read normally after peeking
    content = f.read()
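
A self-contained variant of the same idea, using an in-memory stream so that no file on disk is needed. It is a sketch meant to show that a later read() starts at the position where peek() looked. The payload is invented for the demo.

```python
import gzip
from io import BytesIO

buffer = BytesIO(gzip.compress(b'HEADER:rest of the payload'))

with gzip.GzipFile(fileobj=buffer, mode='rb') as f:
    peeked = f.peek(7)      # may hold more or fewer than 7 bytes
    prefix = peeked[:7]     # slice to the length actually needed
    first_read = f.read(7)  # starts at the same position the peek did

print(prefix == first_read)
```

Slicing the result guards against peek() returning more bytes than requested.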

Common Patterns

Compressing an existing file

import gzip
import shutil

# Compress a file using copyfileobj
with open('large_file.txt', 'rb') as f_in:
    with gzip.open('large_file.txt.gz', 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)
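
The reverse direction works the same way. As a sketch under the assumption that the archive holds a single file, the following decompresses to disk in chunks. The paths live in a temporary directory and are created by the example itself.

```python
import gzip
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = os.path.join(tmp_dir, 'large_file.txt.gz')
    out_path = os.path.join(tmp_dir, 'large_file.txt')

    # Create a small archive so the example is self-contained.
    with gzip.open(gz_path, 'wb') as f:
        f.write(b'line\n' * 1000)

    # Stream-decompress in chunks instead of loading everything into memory.
    with gzip.open(gz_path, 'rb') as f_in, open(out_path, 'wb') as f_out:
        shutil.copyfileobj(f_in, f_out)

    restored_size = os.path.getsize(out_path)

print(restored_size)
```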

Reading a large compressed file line by line

import gzip

# Process a large gzipped log file line by line
with gzip.open('access.log.gz', 'rt') as f:
    for line in f:
        if 'ERROR' in line:
            print(line.strip())

Creating a reproducible compressed snapshot

import gzip

data = b'Same data produces same output'

# Using mtime=0 ensures reproducible output (no timestamp in header)
compressed1 = gzip.compress(data, mtime=0)
compressed2 = gzip.compress(data, mtime=0)

print(compressed1 == compressed2)
# True

# Using current time produces different output each time
compressed_now = gzip.compress(data, mtime=None)
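
To see where the timestamp lives, the fixed part of the gzip header can be inspected directly. This sketch relies on the header layout from RFC 1952 (magic bytes, compression method, flags, then a 4-byte little-endian MTIME). The timestamp value is arbitrary.

```python
import gzip
import struct

data = b'Same data produces same output'
stamped = gzip.compress(data, mtime=1_700_000_000)

magic = stamped[:2]  # b'\x1f\x8b' identifies a gzip stream
method = stamped[2]  # 8 means DEFLATE
(header_mtime,) = struct.unpack('<I', stamped[4:8])  # little-endian MTIME field

# With mtime=0 the same field is all zero bytes.
(zero_mtime,) = struct.unpack('<I', gzip.compress(data, mtime=0)[4:8])

print(magic, method, header_mtime, zero_mtime)
```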

Working with web API responses

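import gzip
import urllib.request

# urllib does not decode Content-Encoding automatically, so decompress by hand.
# (Libraries such as requests already decompress response.content for you.)
request = urllib.request.Request(
    'https://api.example.com/data',
    headers={'Accept-Encoding': 'gzip'},
)
with urllib.request.urlopen(request) as response:
    body = response.read()
    if response.headers.get('Content-Encoding') == 'gzip':
        body = gzip.decompress(body)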

Errors

  • gzip.BadGzipFile — Raised for invalid gzip files (inherits from OSError). Added in Python 3.8.
  • EOFError — Raised when the file ends unexpectedly.
  • zlib.error — Raised for compression/decompression errors.
  • TypeError — Raised when the input is not bytes-like.
  • FileNotFoundError — Raised when the specified file doesn’t exist.
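
As an illustration of the first two errors in the list, here is a small sketch that feeds invalid and truncated input to gzip.decompress(). The helper safe_decompress is hypothetical and not part of the gzip module.

```python
import gzip


def safe_decompress(blob):
    """Hypothetical helper: return (data, error_name); one of the two is None."""
    try:
        return gzip.decompress(blob), None
    except gzip.BadGzipFile:
        # Input does not start with a valid gzip header.
        return None, 'BadGzipFile'
    except EOFError:
        # Stream ended before the end-of-stream marker.
        return None, 'EOFError'


good = gzip.compress(b'payload ' * 100)

valid_result = safe_decompress(good)
invalid_result = safe_decompress(b'definitely not gzip')
truncated_result = safe_decompress(good[:len(good) // 2])

print(valid_result[1], invalid_result[1], truncated_result[1])
```

Because BadGzipFile inherits from OSError, it should be caught before any broader OSError handler.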
