Array Broadcasting in NumPy

What happens when you add a vector of shape (3,) to a matrix of shape (4, 3)? In regular math, this is undefined. In NumPy, broadcasting virtually stretches the smaller array along its missing or length-1 dimensions, without copying data, so the shapes match and the operation is applied element by element. This removes many explicit loops from AI code.


The Broadcasting Rules

NumPy compares the shapes of two arrays from right to left, one dimension at a time. Two dimensions are compatible when they are equal, or when one of them is 1. When a dimension is 1, it gets logically repeated to match the other. If any pair of dimensions is incompatible, NumPy raises a ValueError.

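If the arrays have different numbers of dimensions, the shorter shape is first padded with 1s on the left until both shapes have the same length. For example, (3,) is treated as (1, 3) when combined with (4, 3).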
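The two rules above can be sketched as a small function. This is only an illustration: `broadcast_shape` is a hypothetical helper written for this article, not part of NumPy. The final line compares it against `np.broadcast_shapes`, which assumes NumPy 1.20 or newer.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules by hand."""
    ndim = max(len(shape_a), len(shape_b))
    # Rule 1: pad the shorter shape with 1s on the left.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    # Rule 2: each pair of dimensions must be equal, or one must be 1.
    for dim_a, dim_b in zip(a, b):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(f"incompatible shapes {shape_a} and {shape_b}")
    return tuple(result)

print(broadcast_shape((4, 3), (3,)))      # (4, 3)
print(broadcast_shape((4, 1), (1, 3)))    # (4, 3)
# Cross-check against NumPy itself (assumes NumPy >= 1.20).
print(np.broadcast_shapes((4, 3), (3,)))  # (4, 3)
```

Because both shapes are padded to the same length first, walking them left to right in pairs gives the same result as the right-to-left comparison described above.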

Shape Compatibility Examples

<pre><code class="language-python"># Shape (4, 3) + Shape (3,) → (4, 3) ✅
# The (3,) row is broadcast across all 4 rows.

# Shape (4, 1) + Shape (1, 3) → (4, 3) ✅
# Both dimensions stretch to 4 and 3 respectively.

# Shape (4, 3) + Shape (4,) → ERROR ❌
# Rightmost dims: 3 vs 4, neither is 1.
</code></pre>
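The three cases can be checked directly with small arrays. The following is a minimal sketch; the variable names are placeholders chosen for illustration. The last step shows one possible way to make the failing case work, assuming the intent is to add one value per row: give the vector an explicit trailing axis so its shape becomes (4, 1).

```python
import numpy as np

a = np.zeros((4, 3))

# (4, 3) + (3,) works: the vector is applied to every row.
row = np.ones(3)
print((a + row).shape)        # (4, 3)

# (4, 1) + (1, 3) works: both arrays are stretched.
col = np.ones((4, 1))
row2d = np.ones((1, 3))
print((col + row2d).shape)    # (4, 3)

# (4, 3) + (4,) fails: the rightmost dimensions are 3 and 4.
per_row = np.ones(4)
failed = False
try:
    a + per_row
except ValueError as exc:
    failed = True
    print("broadcast failed:", exc)

# Reshaping the vector to (4, 1) makes the shapes compatible.
fixed = a + per_row[:, np.newaxis]
print(fixed.shape)            # (4, 3)
```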

Real-World Example: Adding a Bias Vector

One of the most common uses of broadcasting is adding a bias vector to the output matrix of a neural network layer. The layer output is a matrix of shape (batch, features) and the bias is a 1D vector of shape (features,), so broadcasting adds the bias to every row.

Bias Addition in a Layer

<pre><code class="language-python">import numpy as np

# Batch of 5 samples, each with 4 features
outputs = np.random.rand(5, 4)   # shape (5, 4)
bias = np.array([1, 2, 3, 4])    # shape (4,)

# Broadcasting adds the bias to every row
result = outputs + bias          # shape (5, 4)
print(result.shape)              # (5, 4)
</code></pre>

Without broadcasting you'd need a for loop over all 5 rows. NumPy avoids that entirely.
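For comparison, here is a sketch of what that explicit loop might look like, next to the broadcast version. The seed and the names `looped` and `broadcasted` are arbitrary choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
outputs = rng.random((5, 4))     # shape (5, 4)
bias = np.array([1, 2, 3, 4])    # shape (4,)

# Explicit version: add the bias one row at a time.
looped = np.empty_like(outputs)
for i in range(outputs.shape[0]):
    looped[i] = outputs[i] + bias

# Broadcast version: one expression, no Python-level loop.
broadcasted = outputs + bias

print(np.allclose(looped, broadcasted))  # True
```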

Normalizing Data Across a Batch

Another common use is mean subtraction: centering a dataset by subtracting each column's mean. The mean has shape (features,) and the dataset has shape (samples, features), so broadcasting subtracts the right mean from every sample automatically. Dividing by the per-column standard deviation broadcasts the same way.

Zero-centering a Dataset

<pre><code class="language-python">import numpy as np

data = np.random.rand(100, 10)   # 100 samples, 10 features
col_mean = data.mean(axis=0)     # shape (10,)
col_std = data.std(axis=0)       # shape (10,)

# Normalize: subtract mean, divide by std
normalized = (data - col_mean) / col_std   # shape (100, 10)
</code></pre>
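A quick way to confirm that the statistics were applied per column is to recompute them on the result. This is a minimal check under the same setup as above, with an arbitrary seed added so the run is reproducible; each column should end up with a mean of about 0 and a standard deviation of about 1.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed, for reproducibility only
data = rng.random((100, 10))     # 100 samples, 10 features

col_mean = data.mean(axis=0)     # shape (10,)
col_std = data.std(axis=0)       # shape (10,)
normalized = (data - col_mean) / col_std

# Each of the 10 columns is now centered and scaled.
print(normalized.shape)                            # (100, 10)
print(np.allclose(normalized.mean(axis=0), 0.0))   # True
print(np.allclose(normalized.std(axis=0), 1.0))    # True
```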