Working with JSON in Python
JSON (JavaScript Object Notation) is the backbone of data exchange in modern applications. Whether you are reading configuration files, consuming REST APIs, or storing structured data, JSON is everywhere. Python’s standard library makes working with JSON straightforward.
What is JSON
JSON represents data in a format that humans can read and machines can parse. It supports several data types:
- Objects: Curly braces containing key-value pairs, e.g. {"key": "value"}
- Arrays: Ordered lists, e.g. [1, 2, 3]
- Strings: Double-quoted text, e.g. "hello"
- Numbers: Integers and floats, e.g. 42, 3.14
- Booleans: true or false
- Null: null
Python maps JSON types to native types:
| JSON Type | Python Type |
|---|---|
| object | dict |
| array | list |
| string | str |
| number (int) | int |
| number (float) | float |
| true/false | True/False |
| null | None |
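The mapping in the table can be checked directly. The following is a minimal sketch; the `sample` document and its keys are made up for illustration:

```python
import json

# Hypothetical sample document that contains every JSON type once
sample = '{"obj": {"k": "v"}, "arr": [1, 2], "s": "hi", "i": 42, "f": 3.14, "t": true, "n": null}'

parsed = json.loads(sample)

# Record the Python type name that each JSON value was converted to
type_names = {key: type(value).__name__ for key, value in parsed.items()}
print(type_names)
```

Each key should report the Python type from the table, with `true` arriving as `bool` and `null` as `NoneType`.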
Reading JSON from a File
The most common task is reading JSON data from a file:
import json
# Read JSON from a file
with open("config.json", "r") as f:
    data = json.load(f)
print(data)
# {'debug': True, 'database': {'host': 'localhost', 'port': 5432}}
The json.load() function reads directly from a file object. It parses the JSON content and returns the corresponding Python object.
If the file does not exist, open() raises FileNotFoundError. If its contents are not valid JSON, json.load() raises json.JSONDecodeError. Catch both to fail gracefully:
import json
try:
    with open("missing.json", "r") as f:
        data = json.load(f)
except FileNotFoundError:
    print("File does not exist")
except json.JSONDecodeError as e:
    print(f"Invalid JSON: {e}")
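When the JSON is invalid, the exception itself can help locate the problem. The sketch below uses a deliberately malformed string (`bad_json` is a made-up input) and reads the `msg`, `lineno`, `colno`, and `pos` attributes that `json.JSONDecodeError` documents:

```python
import json

bad_json = '{"name": "Alice", "age": }'  # hypothetical malformed input: value missing after "age"

location = None
try:
    json.loads(bad_json)
except json.JSONDecodeError as e:
    # lineno and colno are 1-based; pos is the 0-based character offset
    location = (e.lineno, e.colno, e.pos)
    print(f"{e.msg} at line {e.lineno}, column {e.colno} (char {e.pos})")
```

Logging the position this way is usually enough to find a stray comma or missing value in a hand-edited config file.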
Reading JSON from a String
Sometimes you have JSON data as a string rather than a file. This is common when working with API responses:
import json
json_string = '{"name": "Alice", "age": 30, "active": true}'
# Parse JSON string to Python object
data = json.loads(json_string)
print(data)
# {'name': 'Alice', 'age': 30, 'active': True}
# Access the values like a normal dictionary
print(data["name"]) # Alice
print(data["age"]) # 30
The function name loads (with an “s”) stands for “load string”. It takes a string and returns a Python object.
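Despite the name, json.loads() also accepts `bytes` and `bytearray` input encoded as UTF-8, UTF-16, or UTF-32, which is handy when the data comes straight from a socket or a binary file. A small sketch, with a made-up payload:

```python
import json

# Hypothetical payload as raw UTF-8 bytes, e.g. read from a socket
raw_bytes = '{"city": "Zürich"}'.encode("utf-8")

# No manual .decode() call is needed before parsing
data_from_bytes = json.loads(raw_bytes)
print(data_from_bytes["city"])
```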
What if the JSON is nested or contains lists?
import json
api_response = '''
{
    "user": {
        "id": 123,
        "name": "Alice",
        "roles": ["admin", "editor"]
    },
    "status": "success"
}
'''
data = json.loads(api_response)
# Navigate nested structures
user_name = data["user"]["name"] # Alice
first_role = data["user"]["roles"][0] # admin
print(f"User: {user_name}, Role: {first_role}")
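Indexing with square brackets raises KeyError when a key is missing, which is common with real API responses. One possible approach, sketched here with a trimmed-down response that lacks the `roles` key, is to navigate with dict.get() and supply fallbacks:

```python
import json

# Hypothetical response in which "roles" and "email" are absent
api_response = '{"user": {"id": 123, "name": "Alice"}, "status": "success"}'
data = json.loads(api_response)

# .get() returns the fallback instead of raising KeyError
user = data.get("user", {})
roles = user.get("roles", [])
email = user.get("email")  # None when the key is missing

# Guard the list index as well, since roles may be empty
first_role = roles[0] if roles else None
print(first_role, email)
```

Whether a silent fallback or a loud KeyError is the better behavior depends on the application; for required fields, letting the error surface is often preferable.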
Writing JSON to a File
Saving data as JSON is just as easy. Use json.dump() to write directly to a file:
import json
data = {
    "name": "Bob",
    "age": 25,
    "skills": ["Python", "JavaScript"],
    "active": True
}
with open("output.json", "w") as f:
    json.dump(data, f, indent=2)
This creates a file called output.json with formatted content:
{
  "name": "Bob",
  "age": 25,
  "skills": [
    "Python",
    "JavaScript"
  ],
  "active": true
}
The indent parameter makes the output readable. Without it, json.dump() writes the whole document on a single line.
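To confirm that what was written can be read back unchanged, a round trip is a quick check. This sketch writes to a temporary directory so it leaves no files behind; the path handling is an illustrative choice, not a requirement:

```python
import json
import tempfile
from pathlib import Path

original = {"name": "Bob", "age": 25, "skills": ["Python", "JavaScript"], "active": True}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "output.json"

    # Write the data, then read it back from the same file
    with open(path, "w", encoding="utf-8") as f:
        json.dump(original, f, indent=2)
    with open(path, "r", encoding="utf-8") as f:
        restored = json.load(f)

print(restored == original)
```

The round trip is exact here because the data uses only types JSON can represent. A tuple would come back as a list, and integer dictionary keys would come back as strings.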
Writing JSON to a String
To convert a Python object to a JSON string, use json.dumps():
import json
data = {"name": "Charlie", "score": 95.5}
# Convert to JSON string
json_string = json.dumps(data)
print(json_string)
# {"name": "Charlie", "score": 95.5}
# Pretty-printed version
pretty = json.dumps(data, indent=4)
print(pretty)
# {
#     "name": "Charlie",
#     "score": 95.5
# }
This is useful when sending JSON data over HTTP or including it in a message queue.
Pretty Printing and Compact Output
For debugging, pretty printing helps visualize nested structures:
import json
data = {
    "users": [
        {"id": 1, "name": "Alice"},
        {"id": 2, "name": "Bob"}
    ],
    "count": 2
}
# Human-readable format
print(json.dumps(data, indent=2))
Output:
{
  "users": [
    {
      "id": 1,
      "name": "Alice"
    },
    {
      "id": 2,
      "name": "Bob"
    }
  ],
  "count": 2
}
For production environments where file size matters, use compact formatting:
import json
data = {"a": 1, "b": 2}
# Compact output (no whitespace)
compact = json.dumps(data, separators=(",", ":"))
print(compact)
# {"a":1,"b":2}
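To get a feel for how much the formatting choice affects size, the three styles can be compared on the same data. A rough sketch that measures string length in characters:

```python
import json

data = {"users": [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}], "count": 2}

# Same data, three formatting choices
default_len = len(json.dumps(data))
pretty_len = len(json.dumps(data, indent=2))
compact_len = len(json.dumps(data, separators=(",", ":")))

print(default_len, pretty_len, compact_len)
```

All three strings parse back to the same object; only the whitespace differs. For large payloads, transport-level compression such as gzip typically matters more than removing spaces.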
Working with Complex Objects
JSON has limited types. Not every Python object can be serialized directly:
import json
from datetime import datetime
# This will fail
data = {"created": datetime.now()}
try:
    json.dumps(data)
except TypeError as e:
    print(f"Error: {e}")
# Error: Object of type datetime is not JSON serializable
You have several options to handle this:
Option 1: Convert manually
import json
from datetime import datetime
data = {"created": datetime.now()}
# Convert datetime to string before serializing
data["created"] = data["created"].isoformat()
json_string = json.dumps(data)
print(json_string)
# {"created": "2024-01-15T10:30:45.123456"}
Option 2: Use the default parameter
import json
from datetime import datetime
def serialize_datetime(obj):
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Type {type(obj)} not serializable")
data = {"created": datetime.now()}
json_string = json.dumps(data, default=serialize_datetime)
Option 3: Use a custom encoder class
import json
from datetime import datetime
class CustomEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        return super().default(obj)
data = {"created": datetime.now()}
json_string = json.dumps(data, cls=CustomEncoder)
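The same default hook can cover several unsupported types at once. The following sketch is one possible arrangement; the function name `to_jsonable` and the choice to encode Decimal as a string and sets as sorted lists are assumptions for illustration, not conventions of the json module:

```python
import json
from datetime import date, datetime
from decimal import Decimal

def to_jsonable(obj):
    """Hypothetical fallback for default=; json calls it only for unsupported types."""
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return str(obj)  # a string keeps the exact digits
    if isinstance(obj, set):
        return sorted(obj)  # sets are unordered, so sort for stable output
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

record = {
    "created": datetime(2024, 1, 15, 10, 30, 45),
    "price": Decimal("19.99"),
    "tags": {"b", "a"},
}
encoded = json.dumps(record, default=to_jsonable)
print(encoded)
# {"created": "2024-01-15T10:30:45", "price": "19.99", "tags": ["a", "b"]}
```

Note that this conversion is one-way: loading the result gives back plain strings and lists, so the reader has to know which fields to convert back.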
Reading JSON from APIs
When working with HTTP APIs, you often receive JSON responses. The third-party requests library (installed separately with pip install requests) makes this seamless:
import requests
import json
# Fetch data from an API
response = requests.get("https://api.github.com/users/octocat")
# Method 1: Use response.json() (most common)
data = response.json()
# Method 2: Parse manually
data = json.loads(response.text)
print(f"Name: {data['name']}")
print(f"Repos: {data['public_repos']}")
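For most responses, response.json() returns the same result as json.loads(response.text) and saves you the json import. Neither approach checks the HTTP status code, so call response.raise_for_status() first if an error response should raise an exception.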
For POST requests with JSON payloads:
import requests
import json
payload = {"username": "alice", "password": "secret123"}
# Send JSON data
response = requests.post(
    "https://api.example.com/login",
    data=json.dumps(payload),  # Convert dict to JSON string
    headers={"Content-Type": "application/json"}
)
# Or use the json parameter (requests does the conversion)
response = requests.post(
    "https://api.example.com/login",
    json=payload
)
Common Pitfalls
Forgetting to parse
import requests
response = requests.get("https://api.example.com/data")
# Wrong: response is a Response object, not the data
print(response["name"]) # TypeError
# Correct: parse the JSON first
data = response.json()
print(data["name"]) # Works
Unicode and encoding
By default, json.dumps() escapes non-ASCII characters:
import json
data = {"name": "日本語"}
print(json.dumps(data))
# {"name": "\u65e5\u672c\u8a9e"}
print(json.dumps(data, ensure_ascii=False))
# {"name": "日本語"}
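When writing unescaped text to a file, the file's encoding matters as well. A minimal sketch, assuming UTF-8 is the desired encoding and using a temporary directory with a made-up file name:

```python
import json
import tempfile
from pathlib import Path

data = {"name": "日本語"}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "names.json"  # hypothetical file name

    # Pass the encoding explicitly; the platform default may not be UTF-8
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False)

    raw_text = path.read_text(encoding="utf-8")
    with open(path, "r", encoding="utf-8") as f:
        reloaded = json.load(f)

print(raw_text)
```

With the default ensure_ascii=True the file would contain only ASCII escapes, which is safe in any encoding but harder to read and larger for non-Latin text.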
Float precision
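JSON has a single number type with no size limit, and Python's json module round-trips integers of any size exactly. The risk lies with other consumers: JavaScript and many other parsers store every number as a 64-bit float, so integers beyond 2**53 - 1 (JavaScript's Number.MAX_SAFE_INTEGER) can silently change:
import json
# Python int can be arbitrarily large
data = {"big": 9007199254740993}  # Larger than JavaScript's MAX_SAFE_INTEGER
json_string = json.dumps(data)
parsed = json.loads(json_string)
print(data["big"] == parsed["big"])  # True - Python keeps the exact value
# A consumer that stores numbers as 64-bit floats sees a different value
print(int(float(data["big"])))  # 9007199254740992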
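Fractional numbers have a related issue: json.loads() turns them into Python floats, which are binary approximations. Where exact decimal values matter, such as money, one option is the documented parse_float parameter. A sketch with a made-up payload:

```python
import json
from decimal import Decimal

payload = '{"price": 0.1, "tax": 0.2}'  # hypothetical payload

# Default behavior: fractional numbers become binary floats
as_float = json.loads(payload)

# parse_float receives each float literal as a string, so Decimal sees the exact digits
as_decimal = json.loads(payload, parse_float=Decimal)

float_total = as_float["price"] + as_float["tax"]
decimal_total = as_decimal["price"] + as_decimal["tax"]

print(float_total == 0.3)               # False
print(decimal_total == Decimal("0.3"))  # True
```

Decimal values are not serializable by json.dumps() out of the box, so writing them back requires a default function or manual conversion.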
Quick Reference
| Task | Code |
|---|---|
| Read JSON file | json.load(f) |
| Read JSON string | json.loads(string) |
| Write JSON file | json.dump(data, f) |
| Write JSON string | json.dumps(data) |
| Pretty print | json.dumps(data, indent=2) |
| Compact output | json.dumps(data, separators=(",", ":")) |
| Custom serialization | json.dumps(data, default=func) |
See Also
- json-module — Complete reference for the json module
- csv-module — Working with CSV files
- urllib-parse-module — URL parsing utilities