Python for Finance: Getting Started

· 4 min read · Updated March 7, 2026 · beginner
finance pandas yfinance data-analysis stocks

Python is one of the most widely used languages for financial analysis. Banks, hedge funds, and fintech companies use it for everything from pricing derivatives to building trading algorithms. This tutorial covers the foundations: setting up your environment, understanding financial data structures, and performing your first analysis.

Setting Up Your Environment

Before analyzing financial data, you need the right tools. The Python finance ecosystem relies on a few key libraries.

Install them with pip:

pip install pandas numpy matplotlib yfinance

Here’s what each library does:

  • pandas — handles tabular data and time series
  • numpy — numerical computing foundation
  • matplotlib — creates charts and visualizations
  • yfinance — downloads free stock data from Yahoo Finance

Create a file called analysis.py and import these libraries:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import yfinance as yf

print("Libraries imported successfully")

Run the script (for example, with python analysis.py) to verify everything works. If you see the success message, you’re ready to proceed.
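If one of the imports fails, it helps to see which packages are actually installed. The following is a minimal sketch using the standard library's importlib.metadata; the helper name installed_version is hypothetical and not part of any of the libraries above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str) -> str:
    """Return the installed version of a package, or a placeholder if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return "not installed"


# Report the status of each library used in this tutorial
for name in ["pandas", "numpy", "matplotlib", "yfinance"]:
    print(f"{name}: {installed_version(name)}")
```

Any package reported as "not installed" can be added with pip before continuing.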

Understanding Financial Data Structures

Financial data usually arrives as a time series: observations recorded at specific dates. pandas provides two structures suited to this kind of data: Series and DataFrame.

A Series is a single column of data with an index of dates:

prices = pd.Series(
    [100, 102, 101, 105],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"])
)
print(prices)
# 2024-01-01    100
# 2024-01-02    102
# 2024-01-03    101
# 2024-01-04    105
# dtype: int64
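Because the index holds dates, you can look up prices by date rather than by position. A short sketch, reusing the same four prices, of how date-based selection might look:

```python
import pandas as pd

prices = pd.Series(
    [100, 102, 101, 105],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]),
)

# Look up a single date by its label
jan_3 = prices.loc[pd.Timestamp("2024-01-03")]

# Slice a date range; label-based slices include both endpoints
window = prices.loc["2024-01-02":"2024-01-03"]

print(jan_3)        # 101
print(len(window))  # 2
```

Unlike positional slices, a label-based slice includes its end date, which is usually what you want when asking for "prices from the 2nd through the 3rd".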

A DataFrame is multiple columns side by side — like a spreadsheet:

# Each day's high must be at least its open and close; each low at most
data = {
    "open": [100, 101, 100, 103],
    "high": [102, 104, 106, 107],
    "low": [99, 100, 99, 102],
    "close": [102, 101, 105, 106]
}
df = pd.DataFrame(data, index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]))
print(df)
#             open  high  low  close
# 2024-01-01   100   102   99    102
# 2024-01-02   101   104  100    101
# 2024-01-03   100   106   99    105
# 2024-01-04   103   107  102    106

This OHLC (open, high, low, close) format is the standard for daily stock data. You’ll use it constantly in financial analysis.
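An OHLC bar has a built-in consistency rule: the high cannot be below the open or close, and the low cannot be above them. The sketch below checks that rule and derives the daily range. The frame name bars and its values are made up for illustration, and the third bar is deliberately inconsistent so the check has something to catch.

```python
import pandas as pd

# Hypothetical bars; the third row has a close above its high on purpose
bars = pd.DataFrame(
    {
        "open": [100, 101, 100],
        "high": [102, 104, 103],
        "low": [99, 100, 99],
        "close": [102, 101, 105],
    },
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
)

# A bar is consistent when high >= max(open, close) and low <= min(open, close)
consistent = (bars["high"] >= bars[["open", "close"]].max(axis=1)) & (
    bars["low"] <= bars[["open", "close"]].min(axis=1)
)

# Daily trading range: distance between the high and the low
bars["range"] = bars["high"] - bars["low"]

print(consistent.tolist())     # [True, True, False]
print(bars["range"].tolist())  # [3, 4, 4]
```

A check like this is a cheap way to spot bad rows before they distort later calculations.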

Fetching Real Stock Data

Now let’s grab real data. The yfinance library provides free access to Yahoo Finance data.

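# Download Apple stock data
# The end date is exclusive, so 2025-01-01 includes the last trading day of 2024
aapl = yf.download("AAPL", start="2024-01-01", end="2025-01-01")
print(aapl.head())
print(f"\nShape: {aapl.shape}")

This downloads daily OHLC data for Apple for 2024. The DataFrame contains Open, High, Low, Close, and Volume columns. Depending on your yfinance version and the auto_adjust setting, an Adj Close column may also appear, and the columns may carry a second level holding the ticker symbol.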

Let’s examine the data:

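# Get just the closing prices
# squeeze() turns a one-column DataFrame into a Series; some yfinance
# versions return one here, and it leaves an existing Series unchanged
closes = aapl["Close"].squeeze()
print(closes.head())
# Illustrative output (values and the Name label vary by yfinance version):
# Date
# 2024-01-02    185.64
# 2024-01-03    185.56
# 2024-01-04    185.83
# Name: Close, dtype: float64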

# Calculate basic statistics
print(f"Mean: ${closes.mean():.2f}")
print(f"Min: ${closes.min():.2f}")
print(f"Max: ${closes.max():.2f}")
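The statistics above assume closes is a Series. If your download comes back with two-level columns (field, then ticker), selecting "Close" yields a one-column DataFrame instead, and the f-string formatting fails. The sketch below reproduces that situation offline with a made-up frame; the values are hypothetical, and whether your yfinance version behaves this way is an assumption worth checking with print(aapl.columns).

```python
import pandas as pd

dates = pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"])

# Hypothetical frame mimicking a two-level (field, ticker) column layout
columns = pd.MultiIndex.from_product([["Close", "Volume"], ["AAPL"]])
frame = pd.DataFrame(
    [[10.0, 1000], [11.0, 1200], [12.0, 900]],
    index=dates,
    columns=columns,
)

# Selecting the top level leaves a one-column DataFrame, not a Series
close_frame = frame["Close"]

# squeeze() collapses a single-column DataFrame into a Series
close_series = close_frame.squeeze()

print(type(close_frame).__name__)           # DataFrame
print(type(close_series).__name__)          # Series
print(f"Mean: ${close_series.mean():.2f}")  # Mean: $11.00
```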

Calculating Returns

Returns measure how much a stock’s price changed over time. They’re the foundation of financial analysis.

Simple returns calculate the percentage change from one period to the next:

# Calculate daily returns
daily_returns = closes.pct_change()
print(daily_returns.head())
# Date
# 2024-01-02         NaN
# 2024-01-03   -0.000431
# 2024-01-04    0.001455
# Name: Close, dtype: float64

The first value is NaN because there’s no previous day to compare against.
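To see what pct_change() computes, you can reproduce it by hand: divide each price by the previous one and subtract 1. A small sketch using the four sample prices from earlier:

```python
import pandas as pd

prices = pd.Series(
    [100.0, 102.0, 101.0, 105.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]),
)

# Simple return: today's price divided by yesterday's price, minus 1
manual = prices / prices.shift(1) - 1

# pct_change() computes the same quantity
builtin = prices.pct_change()

print(manual.round(4).tolist())  # [nan, 0.02, -0.0098, 0.0396]
```

The shift(1) call moves every price down one row, which is why the first row has nothing to divide by and ends up as NaN.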

Cumulative returns show the total return from the start of the period:

cumulative_returns = (1 + daily_returns).cumprod() - 1
print(f"Total return: {cumulative_returns.iloc[-1]:.2%}")

This tells you the percentage gain or loss over the entire period.
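A quick way to check the compounding formula is to compare it with the direct calculation, last price divided by first price minus 1. The sketch below does that on the four sample prices and also shows that simply adding the daily returns gives a slightly different answer.

```python
import pandas as pd

prices = pd.Series([100.0, 102.0, 101.0, 105.0])
daily_returns = prices.pct_change()

# Compound the daily returns; cumprod skips the leading NaN
cumulative = (1 + daily_returns).cumprod() - 1

# The compounded total should equal last price / first price - 1
direct = prices.iloc[-1] / prices.iloc[0] - 1

# Summing simple returns ignores compounding and is only an approximation
summed = daily_returns.sum()

print(f"Compounded: {cumulative.iloc[-1]:.2%}")  # Compounded: 5.00%
print(f"Direct:     {direct:.2%}")               # Direct:     5.00%
print(f"Summed:     {summed:.2%}")               # Summed:     4.98%
```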

Visualizing Stock Performance

Matplotlib can plot the price history and the distribution of daily returns side by side:

plt.figure(figsize=(12, 5))

# Plot closing prices
plt.subplot(1, 2, 1)
closes.plot(title="AAPL Closing Prices")
plt.ylabel("Price ($)")

# Plot daily returns distribution
plt.subplot(1, 2, 2)
daily_returns.dropna().hist(bins=50)
plt.title("Daily Returns Distribution")
plt.xlabel("Return")
plt.ylabel("Frequency")

plt.tight_layout()
plt.savefig("aapl_analysis.png")
plt.show()

The histogram shows how returns are distributed — most days have small moves, with occasional larger jumps.
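The shape of the histogram can also be summarized in numbers. The sketch below uses synthetic, randomly generated returns (not real market data) so it runs without a download; the 2% threshold is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic daily returns for illustration only (not real market data)
rng = np.random.default_rng(seed=0)
returns = pd.Series(rng.normal(loc=0.0005, scale=0.015, size=250))

# Quantiles describe the same shape the histogram shows
summary = returns.quantile([0.05, 0.25, 0.50, 0.75, 0.95])

# Share of days whose move exceeded 2% in either direction
large_move_share = (returns.abs() > 0.02).mean()

print(summary)
print(f"Days with moves above 2%: {large_move_share:.1%}")
```

With real data, you would replace returns with daily_returns.dropna().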

Comparing Multiple Stocks

You can download and compare multiple stocks at once:

tickers = ["AAPL", "GOOGL", "MSFT"]
# The end date is exclusive, so 2025-01-01 includes the last trading day of 2024
data = yf.download(tickers, start="2024-01-01", end="2025-01-01")["Close"]

# Rebase each price series so it starts at 100
normalized = (data / data.iloc[0]) * 100
normalized.plot(title="Normalized Price Performance (Base=100)")
plt.ylabel("Indexed price")
plt.show()

This shows how $100 invested in each stock at the start of the period would have grown, which puts stocks with very different share prices on the same scale.
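The rebasing step is easy to verify on a tiny example. The sketch below uses two made-up tickers, AAA and BBB, with hypothetical prices, so it runs without a download.

```python
import pandas as pd

# Hypothetical closing prices for two made-up tickers
data = pd.DataFrame(
    {"AAA": [50.0, 55.0, 60.0], "BBB": [200.0, 190.0, 210.0]},
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)

# Divide every row by the first row, then scale so each series starts at 100
normalized = data / data.iloc[0] * 100

print(normalized)
```

AAA ends near 120 (a 20% gain) and BBB near 105 (a 5% gain), even though BBB has the higher share price throughout.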

Next Steps

You now have the foundation for financial analysis in Python. The next tutorial in this series covers fetching specific types of data with yfinance and handling common data issues.

From here, you can explore:

  • Calculating volatility and risk metrics
  • Building a simple portfolio analyzer
  • Backtesting trading strategies

The skills you’ve learned — loading data, calculating returns, and visualizing results — apply to every type of financial analysis you’ll do in Python.
