Getting Started with NumPy

4 min read · Updated March 12, 2026 · beginner

NumPy is the foundation of numerical computing in Python. If you’re doing anything with data, science, or machine learning, you’ll encounter it early and often. It provides support for large, multi-dimensional arrays and matrices, along with a collection of mathematical functions to operate on them.

What is NumPy?

NumPy stands for “Numerical Python.” At its core is the ndarray, an n-dimensional array object that handles numerical data far more efficiently than standard Python lists. On large datasets the difference is striking: vectorized NumPy operations can be orders of magnitude faster than equivalent Python loops.

The secret lies in how NumPy works. It stores elements of a single data type in contiguous blocks of memory, and most operations are implemented in C. This means you get the ease of writing Python code while NumPy handles the heavy lifting efficiently. Many popular libraries like Pandas, SciPy, and scikit-learn build on top of NumPy, making it essential knowledge for anyone pursuing data science or scientific computing.
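
A minimal sketch of that difference, assuming an illustrative array of one million integers (the names values, loop_result, and vectorized_result are made up for this example). Both approaches compute the same squares, but the vectorized expression runs in compiled code instead of stepping through the Python interpreter. Timings depend on hardware, so none are shown here.

```python
import numpy as np

# Illustrative data: the integers 0..999_999, stored as 64-bit integers
values = np.arange(1_000_000, dtype=np.int64)

# Pure-Python approach: the interpreter handles one element at a time
loop_result = [v * v for v in values.tolist()]

# Vectorized approach: one expression, executed in compiled code
vectorized_result = values * values

# Both approaches produce the same numbers
print(np.array_equal(vectorized_result, np.array(loop_result)))
```

To compare speed on your own machine, you could wrap each approach in the standard library's timeit module.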

Installing NumPy

Getting NumPy set up is straightforward. The most common way is through pip:

pip install numpy

If you’re using conda, you can install it through Anaconda’s package manager:

conda install numpy

Once installed, you import it using the conventional alias:

import numpy as np

The np alias is so widespread in the Python data science community that you’ll see it in virtually every tutorial, documentation page, and codebase. Stick with it, and reading other people’s code will be much easier.
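
A quick way to confirm the installation worked is to import the package and print its version. This is only a sanity check; the version string you see depends on what pip or conda installed.

```python
import numpy as np

# If this import succeeds, NumPy is installed in the active environment
print(np.__version__)
```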

Creating Arrays

There are several ways to create NumPy arrays, and knowing the right method for your situation saves time.

From Python Lists

The simplest way to create an array is from an existing Python list:

import numpy as np

# One-dimensional array
arr = np.array([1, 2, 3, 4, 5])

# Two-dimensional array
matrix = np.array([[1, 2, 3], [4, 5, 6]])

print(arr)
print(matrix)

Using Built-in Functions

NumPy provides convenient functions for common array patterns:

# Create an array with a range of values
range_arr = np.arange(0, 10, 2)  # [0, 2, 4, 6, 8]

# Arrays filled with zeros
zeros = np.zeros(5)        # 1D array of zeros
zeros_2d = np.zeros((3, 4))  # 3x4 matrix of zeros

# Arrays filled with ones
ones = np.ones((2, 3))

# Create evenly spaced numbers
evenly_spaced = np.linspace(0, 1, 5)  # [0., 0.25, 0.5, 0.75, 1.]

The arange function works like Python’s built-in range (the stop value is excluded), but returns an array. The linspace function is useful when you need a specific number of evenly spaced values between two endpoints, both of which are included by default.
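
The endpoint behavior of the two functions is easy to mix up, so here is a small side-by-side sketch. The variable names are illustrative, and tolist() is used only to make the printed output unambiguous.

```python
import numpy as np

# arange takes a step and excludes the stop value, like range
stepped = np.arange(0, 10, 2)
print(stepped.tolist())    # [0, 2, 4, 6, 8]

# linspace takes a count and includes both endpoints by default
spaced = np.linspace(0, 10, 5)
print(spaced.tolist())     # [0.0, 2.5, 5.0, 7.5, 10.0]

# endpoint=False drops the stop value from linspace as well
half_open = np.linspace(0, 10, 5, endpoint=False)
print(half_open.tolist())  # [0.0, 2.0, 4.0, 6.0, 8.0]
```

As a rule of thumb, reach for arange when you know the step size and for linspace when you know how many points you need.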

Array Attributes

Once you have an array, you’ll want to inspect its properties. NumPy arrays have several useful attributes:

arr = np.array([[1, 2, 3], [4, 5, 6]])

print(arr.shape)    # (2, 3) - size along each dimension (rows, columns)
print(arr.dtype)    # int64 on most platforms - data type
print(arr.ndim)     # 2 - number of dimensions
print(arr.size)     # 6 - total elements
print(arr.itemsize) # 8 for int64 - bytes per element

Understanding these attributes helps you debug shape mismatches and optimize memory usage. The dtype is particularly important because it determines what operations you can perform and how much memory the array consumes.
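
To make the memory point concrete, the sketch below stores the same six values with two explicitly chosen dtypes and compares their footprint. The names wide and narrow are illustrative, and nbytes is the array attribute that reports total bytes used by the elements.

```python
import numpy as np

# The same six values, first as 64-bit integers
wide = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

# astype returns a new array converted to the requested dtype
narrow = wide.astype(np.int8)

print(wide.itemsize, wide.nbytes)      # 8 48
print(narrow.itemsize, narrow.nbytes)  # 1 6

# nbytes is always size * itemsize
print(wide.nbytes == wide.size * wide.itemsize)  # True
```

A smaller dtype saves memory but narrows the range of values it can hold: int8 only covers -128 to 127, so this conversion is safe here only because the values are small.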

Basic Operations

Indexing

Accessing individual elements works similarly to Python lists, with extended syntax for multi-dimensional arrays:

arr = np.array([1, 2, 3, 4, 5])

print(arr[0])   # 1 - first element
print(arr[-1])  # 5 - last element

matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix[0, 0])  # 1 - first row, first column
print(matrix[1, 2])  # 6 - second row, third column

Slicing

Slicing lets you extract portions of an array:

arr = np.arange(10)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print(arr[2:7])    # [2, 3, 4, 5, 6]
print(arr[:5])      # [0, 1, 2, 3, 4]
print(arr[5:])      # [5, 6, 7, 8, 9]
print(arr[::2])     # [0, 2, 4, 6, 8] - every other element
print(arr[::-1])    # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0] - reversed

For 2D arrays, you can slice both dimensions:

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(matrix[:2, :2])  # [[1, 2], [4, 5]] - top-left 2x2
print(matrix[1:, :])   # [[4, 5, 6], [7, 8, 9]] - last two rows
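
The same syntax also picks out whole rows and columns. A short sketch, with illustrative variable names, showing how a plain integer index drops a dimension while a one-element slice keeps it:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# An integer index on one axis removes that axis from the result
second_column = matrix[:, 1]
print(second_column.tolist())  # [2, 5, 8]
print(second_column.shape)     # (3,)

# A one-element slice keeps the result two-dimensional
column_2d = matrix[:, 1:2]
print(column_2d.tolist())      # [[2], [5], [8]]
print(column_2d.shape)         # (3, 1)
```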

One thing to note: basic slicing returns a view, not a copy, so modifying a slice modifies the original array. If you need an independent copy, call .copy() on the slice, for example arr[2:7].copy().
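
The view behavior is easiest to see by writing through a slice. In this sketch the names view and independent are illustrative; np.shares_memory is used to check whether two arrays are backed by the same data.

```python
import numpy as np

arr = np.arange(10)

# A slice is a view: writing to it changes the original array
view = arr[2:5]
view[0] = 99
print(arr[2])  # 99

# A copy owns its data: writing to it leaves the original alone
independent = arr[5:8].copy()
independent[0] = -1
print(arr[5])  # 5

print(np.shares_memory(arr, view))         # True
print(np.shares_memory(arr, independent))  # False
```

Views avoid duplicating data, which matters for large arrays, but they are also a common source of surprising bugs when a function modifies a slice it was handed.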

Conclusion

NumPy is an essential tool in the Python data science community. Its efficient array structures and mathematical functions form the backbone of most scientific computing workflows. Start with the basics (creating arrays, understanding their attributes, and performing simple indexing and slicing) and you’ll build a foundation that serves you well as you tackle more advanced topics.
