Matrix Primer - Little Bits of Artificial Intelligence

What is a Matrix?

A matrix is a rectangular grid of numbers arranged in rows and columns. In AI, matrices represent data, weights, and transformations.

A = [[1, 2],
     [3, 4]]

Matrix in NumPy

import numpy as np
A = np.array([[1, 2], [3, 4]])
print(A)

Matrix Operations (with AI relevance)

1. Matrix Addition

B = np.array([[5, 6], [7, 8]])  # example matrix with the same shape as A
C = A + B                       # element-wise sum

Used when updating weights during training, where an update term is added to the current weights.

2. Matrix Subtraction

C = A - B

Used to compute prediction errors, for example the difference between predicted and target values.
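
As a rough illustration of the error computation, the sketch below subtracts targets from predictions and summarizes the result as a mean squared error. The values and the names y_true and y_pred are made up for this example.

```python
import numpy as np

# Hypothetical targets and model predictions for three samples.
y_true = np.array([[1.0], [0.0], [1.0]])
y_pred = np.array([[0.8], [0.3], [0.6]])

# Element-wise subtraction gives one error value per sample.
error = y_pred - y_true
print(error)

# Mean squared error: average of the squared errors.
mse = np.mean(error ** 2)
print(mse)
```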

3. Scalar Multiplication

C = 2 * A

Used to scale weights or gradients, for example multiplying a gradient by the learning rate.
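
Scalar multiplication and subtraction come together in a plain gradient-descent step. This is a minimal sketch, assuming a gradient has already been computed; the names grad and learning_rate and all values are hypothetical.

```python
import numpy as np

W = np.array([[0.5, -0.2],
              [0.1,  0.3]])     # current weights (made-up values)
grad = np.array([[0.1,  0.0],
                 [-0.2, 0.4]])  # gradient of the loss w.r.t. W (made-up values)
learning_rate = 0.1

# Scale the gradient by a scalar, then subtract it element-wise.
W_new = W - learning_rate * grad
print(W_new)
```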

4. Matrix Multiplication

C = np.dot(A, B)  # or A @ B

Core operation in neural network layers.
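
The shape rule is the part that most often causes trouble: the inner dimensions must match. The sketch below uses a made-up batch X and weight matrix W to show the rule, and contrasts @ with the element-wise * operator.

```python
import numpy as np

# Hypothetical batch of 2 samples with 3 features each, and a 3-to-2 weight matrix.
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])   # shape (2, 3)
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # shape (3, 2)

# Inner dimensions must match: (2, 3) @ (3, 2) -> (2, 2).
Y = X @ W
print(Y.shape)  # (2, 2)
print(Y)        # rows: [4, 5] and [10, 11]

# The * operator is element-wise, not matrix multiplication.
A = np.array([[1, 2], [3, 4]])
elementwise = A * A   # rows: [1, 4] and [9, 16]
matmul = A @ A        # rows: [7, 10] and [15, 22]
print(elementwise)
print(matmul)
```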

5. Transpose

A_T = A.T

Swaps rows and columns, which is often needed to make the inner dimensions of two matrices match before multiplying.
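
One common case, sketched here under the assumption that weights are stored as (outputs, inputs): the batch cannot be multiplied by W directly, but it can be multiplied by W.T. The shapes and values are illustrative only.

```python
import numpy as np

X = np.ones((4, 3))   # 4 samples, 3 features
W = np.ones((2, 3))   # assumed layout: 2 outputs, 3 inputs

# X @ W would fail: (4, 3) @ (2, 3) has mismatched inner dimensions.
# Transposing W gives (3, 2), so the product is (4, 2).
Y = X @ W.T
print(Y.shape)  # (4, 2)

# Transposing a product reverses the order of the factors.
same = np.allclose((X @ W.T).T, W @ X.T)
print(same)  # True
```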

6. Identity Matrix

I = np.eye(3)

The neutral element of matrix multiplication: multiplying a compatible matrix by I leaves it unchanged.

7. Matrix Inversion

A_inv = np.linalg.inv(A)

Used to solve systems of linear equations (rare in deep learning). Only square, non-singular matrices have an inverse.
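
To make the link to linear systems concrete, here is a small sketch that solves A x = b two ways. The right-hand side b is made up; np.linalg.solve is generally preferred over forming the inverse explicitly.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
b = np.array([5.0, 6.0])   # made-up right-hand side

# Option 1: multiply by the explicit inverse.
x_inv = np.linalg.inv(A) @ b

# Option 2: solve the system directly, without forming the inverse.
x_solve = np.linalg.solve(A, b)

print(x_inv)    # approximately [-4.   4.5]
print(x_solve)  # approximately [-4.   4.5]

# Multiplying a matrix by its inverse gives the identity (up to rounding).
print(np.allclose(A @ np.linalg.inv(A), np.eye(2)))  # True
```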

8. Determinant

det = np.linalg.det(A)

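Used to check whether a matrix is invertible: a nonzero determinant means an inverse exists. With floating-point values, compare against a small tolerance rather than exactly zero.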

9. Rank

rank = np.linalg.matrix_rank(A)

The number of linearly independent rows or columns, which indicates the effective dimensionality of the data.
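
Determinant and rank tell a consistent story, which the following sketch shows with one invertible matrix and one made-up singular matrix S whose second row is twice the first.

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])   # rows are independent
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])   # second row = 2 * first row

det_A = np.linalg.det(A)
det_S = np.linalg.det(S)
rank_A = np.linalg.matrix_rank(A)
rank_S = np.linalg.matrix_rank(S)

print(det_A, rank_A)  # about -2.0, rank 2: invertible
print(det_S, rank_S)  # about 0.0, rank 1: not invertible

# Compare against a tolerance instead of testing det == 0 exactly.
print(np.isclose(det_S, 0.0))  # True
```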

10. Eigenvalues & Eigenvectors

eig_vals, eig_vecs = np.linalg.eig(A)

Used in PCA and data transformation.
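
A minimal PCA sketch, assuming randomly generated 2D data. It swaps in np.linalg.eigh rather than np.linalg.eig because a covariance matrix is symmetric; eigh returns real eigenvalues in ascending order, so they are re-sorted here. All variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up correlated 2D data: 200 samples, 2 features.
X = rng.normal(size=(200, 2)) @ np.array([[3.0, 0.0],
                                          [1.0, 0.5]])

# Center the data and compute the 2x2 covariance matrix.
X_centered = X - X.mean(axis=0)
cov = np.cov(X_centered, rowvar=False)

# Eigen-decomposition of the symmetric covariance matrix.
eig_vals, eig_vecs = np.linalg.eigh(cov)

# Sort components from largest to smallest variance.
order = np.argsort(eig_vals)[::-1]
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]

# Project onto the first principal component: (200, 2) -> (200, 1).
projected = X_centered @ eig_vecs[:, :1]
print(projected.shape)  # (200, 1)
```

The variance of the projected data equals the largest eigenvalue, which is why eigenvalues are read as "variance explained" by each component.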

11. Singular Value Decomposition (SVD)

U, S, Vt = np.linalg.svd(A)

Used in compression and recommendation systems.
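
The compression use case comes from keeping only the largest singular values. The sketch below rebuilds a made-up matrix exactly from its SVD and then forms a rank-1 approximation; the choice k = 1 is arbitrary.

```python
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])   # made-up 2x3 matrix

# S is returned as a 1D array of singular values, largest first.
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Full reconstruction: U @ diag(S) @ Vt recovers A.
A_rebuilt = U @ np.diag(S) @ Vt
print(np.allclose(A_rebuilt, A))  # True

# Low-rank approximation: keep only the top k singular values.
k = 1
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]
print(A_k.shape)                   # (2, 3)
print(np.linalg.matrix_rank(A_k))  # 1
```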

12. Norm

norm = np.linalg.norm(A)

Measures the size of a vector or matrix (for a 2D array, np.linalg.norm defaults to the Frobenius norm); used in regularization.
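
A short sketch of the default behaviour for a vector and a matrix, followed by an L2 penalty of the kind added to a loss during regularization. The coefficient name lam and its value are assumptions for illustration.

```python
import numpy as np

v = np.array([3.0, 4.0])
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# For a 1D array the default is the Euclidean (L2) length.
v_norm = np.linalg.norm(v)
print(v_norm)  # 5.0

# For a 2D array the default is the Frobenius norm:
# the square root of the sum of squared entries, here sqrt(30).
A_norm = np.linalg.norm(A)
print(A_norm)

# L2 regularization term: a coefficient times the squared norm of the weights.
lam = 0.01  # hypothetical regularization strength
l2_penalty = lam * np.sum(A ** 2)
print(l2_penalty)
```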

13. Reshape

B = A.reshape(4, 1)

Reshapes data for model input.
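
Reshaping changes only how the same elements are arranged, so the total element count must stay the same. The sketch below also shows -1, which asks NumPy to infer one dimension.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

# Explicit shape: 4 elements arranged as a column.
column = A.reshape(4, 1)
print(column.shape)  # (4, 1)

# -1 lets NumPy infer the size of that dimension.
flat = A.reshape(-1)
print(flat)          # [1 2 3 4]

grid = np.arange(12).reshape(3, -1)
print(grid.shape)    # (3, 4)

# Reshaping back recovers the original matrix.
restored = column.reshape(2, 2)
print(np.array_equal(restored, A))  # True
```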

14. Broadcasting

b = np.array([1, 2])
C = A + b  # b is added to every row of A

Used in batch operations, such as adding one bias vector to every row of a batch.
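
The sketch below spells out what broadcasting does for a row vector and for a column vector. The column values are made up for the example.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])

# Shape (2,) is treated as a row and repeated down the rows of A.
b = np.array([1, 2])
row_result = A + b
print(row_result)   # rows: [2, 4] and [4, 6]

# Shape (2, 1) is a column and is repeated across the columns of A.
col = np.array([[10],
                [20]])
col_result = A + col
print(col_result)   # rows: [11, 12] and [23, 24]
```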

Real-World AI Example: Neural Net Forward Pass

# x: input, W: weights, b: bias
x = np.array([[1, 2, 3]])
W = np.array([[0.2, 0.4], [0.3, 0.1], [0.5, 0.6]])
b = np.array([[0.1, 0.2]])
y = np.dot(x, W) + b
print("Output:", y)

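This is the affine (linear) step of a forward pass through one dense layer: each output is a weighted sum of the inputs plus a bias. In practice, an activation function is usually applied to y afterwards.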
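
A dense layer usually follows this computation with an activation function. The sketch below wraps the same computation in a hypothetical helper, dense_forward, and applies ReLU; both function names are illustrative and not from any library.

```python
import numpy as np

def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def dense_forward(x, W, b, activation=None):
    """Compute x @ W + b, optionally followed by an activation."""
    z = x @ W + b
    return activation(z) if activation is not None else z

x = np.array([[1, 2, 3]])                            # shape (1, 3)
W = np.array([[0.2, 0.4], [0.3, 0.1], [0.5, 0.6]])   # shape (3, 2)
b = np.array([[0.1, 0.2]])                           # shape (1, 2)

y = dense_forward(x, W, b)
print("Linear output:", y)            # approximately [[2.4 2.6]]

# With negated weights the pre-activations are negative, so ReLU clips them to 0.
hidden = dense_forward(x, -W, b, activation=relu)
print("ReLU output:", hidden)         # [[0. 0.]]
```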

What is a Tensor?

A tensor is a generalization of vectors and matrices to potentially higher dimensions.

Scalar: 0D tensor (e.g., 5)
Vector: 1D tensor (e.g., [1, 2, 3])
Matrix: 2D tensor (e.g., [[1, 2], [3, 4]])
3D Tensor: e.g., a stack of matrices

import numpy as np
tensor_3d = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(tensor_3d.shape)  # (2, 2, 2)

Tensors are the core data structure in deep learning (used in PyTorch, TensorFlow).

Tensor Operations

1. Tensor Addition

A + B

Element-wise addition of tensors whose shapes match or can be broadcast together, used for example when adding a bias or combining the outputs of two layers.

2. Tensor Reshaping

tensor.reshape(new_shape)

Used to flatten images, or adjust shapes for model input.

3. Axis-Based Operations

np.sum(tensor, axis=0)

Useful to reduce or aggregate information along an axis (like pooling).
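
Which axis gets removed is easiest to see on a small example. This sketch reuses the (2, 2, 2) tensor from earlier and reduces it along different axes.

```python
import numpy as np

tensor_3d = np.array([[[1, 2], [3, 4]],
                      [[5, 6], [7, 8]]])   # shape (2, 2, 2)

# axis=0 collapses the first axis: the two 2x2 matrices are added together.
summed = np.sum(tensor_3d, axis=0)
print(summed)        # rows: [6, 8] and [10, 12]

# Reducing over axes 1 and 2 leaves one value per matrix in the stack.
per_matrix_max = np.max(tensor_3d, axis=(1, 2))
print(per_matrix_max)    # [4 8]

per_matrix_mean = np.mean(tensor_3d, axis=(1, 2))
print(per_matrix_mean)   # [2.5 6.5]

# keepdims=True keeps the reduced axis with size 1, which helps broadcasting later.
kept = np.sum(tensor_3d, axis=0, keepdims=True)
print(kept.shape)        # (1, 2, 2)
```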

4. Tensor Dot Product

np.tensordot(A, B, axes=1)

With axes=1, contracts the last axis of A with the first axis of B. This generalized inner product underlies the projections used in attention mechanisms.
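
To show what axes=1 contracts, the sketch below projects a made-up (batch, sequence, features) tensor with a weight matrix. The shapes are chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((2, 3, 4))   # e.g. batch of 2, sequence of 3, 4 features
B = rng.random((4, 5))      # projects 4 features to 5

# axes=1 sums over the last axis of A and the first axis of B.
projected = np.tensordot(A, B, axes=1)
print(projected.shape)  # (2, 3, 5)

# For this case the result matches the @ operator, which applies B to each row.
print(np.allclose(projected, A @ B))  # True
```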

5. Tensor Broadcasting

A + np.array([1])

Allows computation across different shapes without looping.
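
A typical use is per-channel normalization of an image batch. The sketch below subtracts a per-channel mean of shape (3,) from a small, randomly generated batch without any explicit loop; the batch size and image size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
images = rng.random((8, 4, 4, 3))   # [batch, height, width, channels]

# One mean per channel, computed over batch, height, and width.
channel_mean = images.mean(axis=(0, 1, 2))
print(channel_mean.shape)  # (3,)

# (8, 4, 4, 3) - (3,): the mean is broadcast along the last axis.
centered = images - channel_mean
print(centered.shape)      # (8, 4, 4, 3)
```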

Real AI Example: Tensor Shape in Deep Learning

# Example: Image tensor [batch, height, width, channels]
image_batch = np.random.rand(32, 64, 64, 3)
print(image_batch.shape)

Each input image has shape (64, 64, 3), and a batch of 32 images makes a 4D tensor of shape (32, 64, 64, 3).
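
Two shape manipulations come up often with such a batch: flattening each image into a vector, and moving the channel axis. The sketch below shows both; the channels-first layout [batch, channels, height, width] is the convention used by some frameworks and is included here as an illustration.

```python
import numpy as np

image_batch = np.random.rand(32, 64, 64, 3)   # [batch, height, width, channels]

# Flatten each image into one vector: 64 * 64 * 3 = 12288 values per image.
flat_batch = image_batch.reshape(32, -1)
print(flat_batch.shape)      # (32, 12288)

# Reorder axes to [batch, channels, height, width].
channels_first = np.transpose(image_batch, (0, 3, 1, 2))
print(channels_first.shape)  # (32, 3, 64, 64)
```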

Recommended Books

Linear Algebra and Its Applications by Gilbert Strang
Introduction to Linear Algebra by Gilbert Strang
Linear Algebra and Learning from Data by Gilbert Strang
Matrix Algebra Useful for Statistics by Shayle R. Searle
Mathematics for Machine Learning by Deisenroth, Faisal, and Ong: hands-on linear algebra and tensor insights
Deep Learning by Goodfellow, Bengio, and Courville (Chapter 2, Linear Algebra): covers matrix and tensor operations
