Matrix Primer
What is a Matrix?
A matrix is a rectangular grid of numbers arranged in rows and columns. In AI, matrices represent data, weights, and transformations.
A = [[1, 2], [3, 4]]
Matrix in NumPy
import numpy as np
A = np.array([[1, 2], [3, 4]])
print(A)
Matrix Operations (with AI relevance)
1. Matrix Addition
C = A + B
Used to update weights during training.
2. Matrix Subtraction
C = A - B
Used to calculate prediction errors.
3. Scalar Multiplication
C = 2 * A
Used for scaling weights or learning rates.
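Addition, subtraction, and scalar multiplication combine in a single gradient-descent step. The following is a minimal sketch, assuming a hypothetical weight matrix `W`, gradient `grad`, and learning rate `lr` (these names and values are illustrative and not taken from the text above):

```python
import numpy as np

# Hypothetical values chosen only for illustration.
W = np.array([[1.0, 2.0], [3.0, 4.0]])      # current weights
grad = np.array([[0.5, 0.5], [1.0, 1.0]])   # gradient of the loss with respect to W
lr = 0.1                                    # learning rate (a scalar)

# Scalar multiplication scales the gradient; subtraction applies the update.
W_new = W - lr * grad
print(W_new)
```

Every entry of `W` moves by `lr` times the matching entry of `grad`, so the shapes of `W` and `grad` must agree.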
4. Matrix Multiplication
C = np.dot(A, B) # or A @ B
Core operation in neural network layers.
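Matrix multiplication is easy to confuse with element-wise multiplication. A small sketch of the difference, assuming a hypothetical second matrix `B` (not defined in the text above):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])   # hypothetical second matrix

matmul = A @ B        # rows of A dotted with columns of B
elementwise = A * B   # multiplies matching entries; not matrix multiplication

print(matmul)         # [[19 22] [43 50]]
print(elementwise)    # [[ 5 12] [21 32]]
```

For `A @ B` to be defined, the number of columns of `A` must equal the number of rows of `B`.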
5. Transpose
A_T = A.T
Aligns matrices for multiplication.
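As a rough illustration of how a transpose aligns shapes, the sketch below uses a hypothetical 3x2 matrix `X`. Multiplying `X` by itself is undefined, but multiplying its transpose by `X` works:

```python
import numpy as np

X = np.arange(6).reshape(3, 2)   # hypothetical data, shape (3, 2)

# X @ X would fail: the inner dimensions (2 and 3) do not match.
# Transposing the left operand makes the inner dimensions agree.
gram = X.T @ X                   # (2, 3) @ (3, 2) -> (2, 2)

print(X.T.shape)   # (2, 3)
print(gram)        # [[20 26] [26 35]]
```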
6. Identity Matrix
I = np.eye(3)
Neutral element in matrix multiplication.
7. Matrix Inversion
A_inv = np.linalg.inv(A)
Used in solving systems of equations (rare in deep learning).
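To make the "solving systems of equations" use concrete, here is a minimal sketch with a hypothetical right-hand side `b`. It solves A x = b both through the explicit inverse and through `np.linalg.solve`, which avoids forming the inverse and is generally the preferred route:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
b = np.array([5.0, 6.0])          # hypothetical right-hand side

x_inv = np.linalg.inv(A) @ b      # via the explicit inverse
x_solve = np.linalg.solve(A, b)   # solves A x = b directly

print(x_solve)                    # approximately [-4.   4.5]
print(np.allclose(A @ x_solve, b))
```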
8. Determinant
det = np.linalg.det(A)
Used to check whether a matrix is invertible: a nonzero determinant means the inverse exists.
9. Rank
rank = np.linalg.matrix_rank(A)
Helps understand data dimensionality.
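The determinant and rank checks can be compared side by side. The sketch below assumes a hypothetical singular matrix `S`, built so that its second row is twice its first:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])   # invertible
S = np.array([[1.0, 2.0], [2.0, 4.0]])   # hypothetical: row 2 = 2 * row 1

det_A, rank_A = np.linalg.det(A), np.linalg.matrix_rank(A)
det_S, rank_S = np.linalg.det(S), np.linalg.matrix_rank(S)

print(det_A, rank_A)   # determinant close to -2, rank 2 (full rank)
print(det_S, rank_S)   # determinant close to 0, rank 1 (rank-deficient)
```

Because determinants are computed in floating point, comparing against zero with a tolerance (for example `np.isclose`) is safer than an exact equality test.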
10. Eigenvalues & Eigenvectors
eig_vals, eig_vecs = np.linalg.eig(A)
Used in PCA and data transformation.
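A quick way to see what `np.linalg.eig` returns is to check the defining relation M v = λ v for each eigenpair. This sketch assumes a hypothetical symmetric matrix `M`, similar in form to the covariance matrices that PCA works with:

```python
import numpy as np

M = np.array([[2.0, 1.0], [1.0, 2.0]])   # hypothetical symmetric matrix

eig_vals, eig_vecs = np.linalg.eig(M)

# Each COLUMN of eig_vecs is an eigenvector paired with eig_vals[i].
for i in range(len(eig_vals)):
    v = eig_vecs[:, i]
    print(eig_vals[i], np.allclose(M @ v, eig_vals[i] * v))
```

The eigenvalues of this particular matrix are 1 and 3; the order in which NumPy returns them is not guaranteed to be sorted.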
11. Singular Value Decomposition (SVD)
U, S, Vt = np.linalg.svd(A)
Used in compression, recommendation systems.
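The compression use case rests on keeping only the largest singular values. A minimal sketch, reusing the 2x2 matrix from earlier (a real application would use a much larger matrix):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
U, S, Vt = np.linalg.svd(A)

# S holds the singular values as a 1D array, largest first.
A_rebuilt = U @ np.diag(S) @ Vt                 # exact reconstruction
A_rank1 = S[0] * np.outer(U[:, 0], Vt[0, :])    # keep only the top component

print(np.allclose(A_rebuilt, A))
print(np.linalg.norm(A - A_rank1))   # error equals the dropped singular value S[1]
```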
12. Norm
norm = np.linalg.norm(A)
Measures the size of a vector or matrix (for a 2D array, np.linalg.norm returns the Frobenius norm by default); used in regularization.
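The sketch below shows the default behaviour of `np.linalg.norm` for a vector and a matrix, followed by an L2-style penalty. The vector `v` and the coefficient `lam` are hypothetical values chosen for illustration:

```python
import numpy as np

v = np.array([3.0, 4.0])                 # hypothetical vector
A = np.array([[1.0, 2.0], [3.0, 4.0]])

vec_norm = np.linalg.norm(v)   # Euclidean length: sqrt(9 + 16) = 5.0
mat_norm = np.linalg.norm(A)   # Frobenius norm: sqrt(1 + 4 + 9 + 16) = sqrt(30)

lam = 0.01                          # hypothetical regularization strength
l2_penalty = lam * np.sum(A ** 2)   # squared Frobenius norm, scaled by lam

print(vec_norm, mat_norm, l2_penalty)
```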
13. Reshape
B = A.reshape(4, 1)
Reshapes data for model input.
14. Broadcasting
b = np.array([1, 2])
C = A + b  # b is added to every row of A
Used in batch data operations.
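Broadcasting only works when the trailing dimensions are compatible. A short sketch of one shape that broadcasts and one that does not (the three-element vector `bad` is a hypothetical example of a mismatch):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([1, 2])             # shape (2,)

C = A + b                        # b is stretched across both rows
print(C)                         # [[2 4] [4 6]]

bad = np.array([1, 2, 3])        # shape (3,), incompatible with (2, 2)
try:
    A + bad
    broadcast_failed = False
except ValueError:
    broadcast_failed = True
print(broadcast_failed)          # True
```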
Real-World AI Example: Neural Net Forward Pass
# x: input, W: weights, b: bias
x = np.array([[1, 2, 3]])
W = np.array([[0.2, 0.4], [0.3, 0.1], [0.5, 0.6]])
b = np.array([[0.1, 0.2]])
y = np.dot(x, W) + b
print("Output:", y)
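This is the forward pass of a single fully connected layer, before any activation function is applied: a (1, 3) input times a (3, 2) weight matrix, plus a (1, 2) bias, gives a (1, 2) output.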
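The same computation extends to a batch of inputs without any change to the formula. In this sketch the second input row in `X_batch` is a hypothetical sample added for illustration; the weights and bias are the ones from the example above:

```python
import numpy as np

W = np.array([[0.2, 0.4], [0.3, 0.1], [0.5, 0.6]])   # shape (3, 2)
b = np.array([[0.1, 0.2]])                           # shape (1, 2)

# Two samples, three features each; the second row is hypothetical.
X_batch = np.array([[1, 2, 3], [0, 1, 0]])           # shape (2, 3)

# (2, 3) @ (3, 2) -> (2, 2); b broadcasts across the batch dimension.
Y_batch = X_batch @ W + b
print(Y_batch)

# First row by hand: 1*0.2 + 2*0.3 + 3*0.5 + 0.1 = 2.4
#                    1*0.4 + 2*0.1 + 3*0.6 + 0.2 = 2.6
```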
What is a Tensor?
A tensor is a generalization of vectors and matrices to potentially higher dimensions.
- Scalar: 0D tensor (e.g., 5)
- Vector: 1D tensor (e.g., [1, 2, 3])
- Matrix: 2D tensor (e.g., [[1, 2], [3, 4]])
- 3D Tensor: e.g., a stack of matrices
import numpy as np
tensor_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(tensor_3d.shape)  # (2, 2, 2)
Tensors are the core data structure in deep learning (used in PyTorch, TensorFlow).
Tensor Operations
1. Tensor Addition
A + B
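Adds tensors element by element; the shapes must match or be broadcast-compatible. Used, for example, when adding a bias or combining the outputs of two layers.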
2. Tensor Reshaping
tensor.reshape(new_shape)
Used to flatten images, or adjust shapes for model input.
3. Axis-Based Operations
np.sum(tensor, axis=0)
Useful to reduce or aggregate information along an axis (like pooling).
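The axis argument decides which dimension disappears from the result. A small sketch using a 2x2x2 tensor:

```python
import numpy as np

tensor = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])   # shape (2, 2, 2)

sum_axis0 = np.sum(tensor, axis=0)     # adds the two matrices together
sum_last = np.sum(tensor, axis=-1)     # sums within each innermost row
mean_each = tensor.mean(axis=(1, 2))   # one average per matrix

print(sum_axis0)   # [[ 6  8] [10 12]]
print(sum_last)    # [[ 3  7] [11 15]]
print(mean_each)   # [2.5 6.5]
```

Reducing over axes 1 and 2 while keeping axis 0 is the same pattern as averaging over spatial positions for each item in a batch.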
4. Tensor Dot Product
np.tensordot(A, B, axes=1)
Generalized inner product used in attention mechanisms and matrix projections.
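With `axes=1`, `np.tensordot` contracts the last axis of the first argument with the first axis of the second. The sketch below, using hypothetical arrays, checks that this matches ordinary matrix multiplication for 2D inputs and shows the resulting shape for a 3D input:

```python
import numpy as np

A = np.arange(6).reshape(2, 3)    # hypothetical (2, 3) matrix
B = np.arange(12).reshape(3, 4)   # hypothetical (3, 4) matrix

same_as_matmul = np.array_equal(np.tensordot(A, B, axes=1), A @ B)
print(same_as_matmul)             # True

T = np.ones((2, 3, 4))            # e.g. (batch, sequence, features)
W = np.ones((4, 5))               # projection from 4 features to 5
projected = np.tensordot(T, W, axes=1)
print(projected.shape)            # (2, 3, 5)
```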
5. Tensor Broadcasting
A + np.array([1])
Allows computation across different shapes without looping.
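A typical case is subtracting a per-channel value from every pixel of every image in a batch. This is a sketch with a small hypothetical batch and made-up channel means:

```python
import numpy as np

images = np.ones((2, 4, 4, 3))                # hypothetical batch: 2 images, 4x4, 3 channels
channel_mean = np.array([0.5, 0.25, 0.0])     # hypothetical per-channel means, shape (3,)

# (2, 4, 4, 3) - (3,): the last axes match, so channel_mean is
# applied to every pixel of every image without an explicit loop.
centered = images - channel_mean

print(centered.shape)       # (2, 4, 4, 3)
print(centered[0, 0, 0])    # [0.5  0.75 1.  ]
```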
Real AI Example: Tensor Shape in Deep Learning
# Example: Image tensor [batch, height, width, channels]
image_batch = np.random.rand(32, 64, 64, 3)
print(image_batch.shape)
Each input image has shape (64, 64, 3), and a batch of 32 images makes a 4D tensor of shape (32, 64, 64, 3).
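A common next step for a batch like this is flattening each image into a single vector, as mentioned under Tensor Reshaping. A minimal sketch, assuming the same [batch, height, width, channels] layout:

```python
import numpy as np

image_batch = np.random.rand(32, 64, 64, 3)

# Keep the batch axis and let -1 infer the rest: 64 * 64 * 3 = 12288.
flat = image_batch.reshape(image_batch.shape[0], -1)
print(flat.shape)   # (32, 12288)

# Reshaping changes the layout, not the values.
restored = flat.reshape(32, 64, 64, 3)
print(np.array_equal(restored, image_batch))   # True
```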
Recommended Books
- Linear Algebra and Its Applications by Gilbert Strang
- Matrix Algebra Useful for Statistics by Shayle Searle
- Mathematics for Machine Learning by Deisenroth, Faisal, Ong (hands-on linear algebra and tensor insights)
- Introduction to Linear Algebra by Gilbert Strang
- Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (Chapter 2, Linear Algebra; covers tensor operations in detail)
- Linear Algebra and Learning from Data by Gilbert Strang