75.6 Linear Algebra: dot, matmul, linalg
Right, so you’ve got your arrays all lined up and you’re feeling good. Now you want to do some real math with them. Welcome to the main event: linear algebra. This is where NumPy stops being a fancy list organizer and starts being the engine for pretty much every scientific computing or data science task you can think of.
Let’s get one thing straight from the start: np.dot, np.matmul, and the @ operator are the holy trinity of array multiplication, and they will absolutely trip you up if you don’t know their weird little family drama. And np.linalg is the toolbox that contains everything else you’d need to, you know, do linear algebra.
The dot product vs. matrix multiplication (it’s complicated)
This is the source of 90% of the confusion, so let’s burn through it. In pure Python, or for 1D arrays, np.dot(a, b) is the dot product. It’s the sum of the element-wise products. It returns a scalar.
import numpy as np
# For 1D arrays, it's the classic dot product.
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
result = np.dot(a, b) # (1*4 + 2*5 + 3*6) = 32
print(result) # Output: 32
Now, for 2D arrays, np.dot(a, b) is actually matrix multiplication. This is a historical artifact that causes no end of grief. It’s not wrong, but it’s not exactly obvious either.
# For 2D arrays, it's matrix multiplication.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
result_dot = np.dot(A, B)
print("np.dot result:\n", result_dot)
# Output:
# [[19 22]
# [43 50]]
# Which is correct matrix multiplication.
To clear up the confusion, Python 3.5+ gave us a dedicated operator: @. This only does matrix multiplication and is way more readable. This is what you should use 99% of the time.
# The modern, clear way to do matrix multiplication
result_matmul = A @ B
print("A @ B result:\n", result_matmul) # Identical to np.dot above
So why does np.matmul exist? It’s essentially the function version of the @ operator. It behaves almost identically to dot for 2D arrays, but its true power (and the reason it was created) is in how it handles higher-dimensional arrays, which we’ll get to in a second.
The key difference to remember for 2D arrays is almost non-existent. For 1D arrays, dot treats them as vectors and gives a scalar, while matmul and @ treat them as matrices and will promote them to 2D to do proper matrix multiplication, which can lead to surprises.
# The 1D quirk:
print(np.dot(a, b)) # 32, a scalar
print(np.matmul(a, b)) # 32, also a scalar... wait, I thought you said...
# Okay, fine, it's a special case. They couldn't make it *too* simple.
# The real difference is here:
print((a @ b)) # 32, same as above.
# But look at this promotion trickery for matmul/@ with 1D:
print(a.shape, b.shape) # (3,), (3,)
result = a @ b # This is fine, gives scalar
# Now try to get a matrix out of it:
print((a[:, np.newaxis] @ b[np.newaxis, :])) # This creates a 3x3 matrix outer product!
Broadcasting in matrix operations (where it gets weird)
This is where matmul and @ truly earn their keep. When you have stacks of matrices (3D or higher arrays), matmul and @ treat the arrays as batches of matrices held in the last two dimensions. This is broadcasting, but for linear algebra. It’s incredibly powerful for vectorizing operations.
# Let's say you have a batch of 10, 2x2 matrices.
batch_A = np.random.rand(10, 2, 2)
batch_B = np.random.rand(10, 2, 2)
# You want to matrix-multiply each corresponding pair.
# Using a for loop would be sad and slow.
# Using @ is glorious and vectorized.
batch_result = batch_A @ batch_B
print(batch_result.shape) # (10, 2, 2)
# What if the batches are broadcastable?
# Let's say you have one single magic matrix that you want to multiply with every matrix in the batch.
single_B = np.random.rand(2, 2)
batch_result_2 = batch_A @ single_B
print(batch_result_2.shape) # (10, 2, 2) - it broadcasts the (2,2) across the batch of 10.
np.dot, on the other hand, has a much weirder and more archaic approach to higher dimensions. It does a sum product over the last axis of a and the second-to-last axis of b. Just typing that sentence made me sigh. Avoid it for n-dimensional arrays. Use @.
The linalg module: your Swiss Army knife
The np.linalg module is a beautifully organized toolbox. Need to solve a system of equations? np.linalg.solve. Need eigenvalues? np.linalg.eig. Norm? np.linalg.norm. It’s all there and it’s all battle-tested.
A critical best practice: NEVER use np.linalg.inv(A) @ b to solve the linear system A x = b. It’s less precise, slower, and less numerically stable than using np.linalg.solve(A, b). solve uses better algorithms (like LU decomposition with pivoting) that avoid explicitly calculating the inverse matrix, which is often a computational nightmare.
# Solving A x = b
A = np.array([[3, 1], [1, 2]])
b = np.array([9, 8])
# The bad, slow, imprecise way:
A_inv = np.linalg.inv(A)
x_bad = A_inv @ b
print("Using inv:", x_bad)
# The correct, good, professional way:
x_good = np.linalg.solve(A, b)
print("Using solve:", x_good) # Output: [2. 3.]
# They might look the same, but trust me, for large or ill-conditioned matrices, 'solve' is the only right answer.
Another pro tip: always check the condition number (np.linalg.cond) if you’re working with real-world data. A high condition number means your matrix is close to singular, and your results are likely to be numerical garbage. np.linalg will often throw a LinAlgError in these cases, but it’s good to know why it’s happening.
# A famously nasty matrix, the Hilbert matrix, is nearly singular for even small n.
from scipy.linalg import hilbert
H = hilbert(5)
print("Condition number:", np.linalg.cond(H)) # This will be huge, warning you of impending doom.
# Trying to solve with this matrix will give you meaningless results.
So there you have it. Use @ for multiplication. Use linalg.solve instead of inverses. And always be aware of the shape of your arrays. Do that, and you’ll be wielding one of the most powerful computational tools ever built.