Linear Algebra Primer

Objective

To make you AI-ready by covering the essential linear algebra concepts, with hands-on NumPy implementations and intuitive connections to machine learning and deep learning.

Section 1: Introduction to Linear Algebra

Linear Algebra is the branch of mathematics concerning linear equations, linear functions, and their representations through vectors and matrices.

In AI, linear algebra is the foundation for data representation, model transformations, and optimization.

Section 2: Key Building Blocks

import numpy as np

scalar = 5                                   # 0D: a single number
vector = np.array([1, 2, 3])                 # 1D: shape (3,)
matrix = np.array([[1, 2], [3, 4]])          # 2D: shape (2, 2)
tensor = np.array([[[1], [2]], [[3], [4]]])  # 3D: shape (2, 2, 1)
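
Every NumPy array reports its dimensionality and shape, which makes the hierarchy above easy to verify:

print(np.ndim(scalar), np.shape(scalar))  # 0 ()
print(vector.ndim, vector.shape)          # 1 (3,)
print(matrix.ndim, matrix.shape)          # 2 (2, 2)
print(tensor.ndim, tensor.shape)          # 3 (2, 2, 1)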

Section 3: Vector Operations (with AI perspective)

Operation              NumPy Code                 AI Use
---------------------  -------------------------  -------------------------
Addition               v1 + v2                    Combine gradients/updates
Scalar Multiplication  3 * v1                     Scale features/weights
Dot Product            np.dot(v1, v2)             Similarity, projection
Norm                   np.linalg.norm(v1)         Gradient magnitude
Unit Vector            v1 / np.linalg.norm(v1)    Direction only
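
The snippet below defines two illustrative vectors, v1 and v2, and runs each operation from the table; the cosine similarity example that follows reuses the same vectors.

v1 = np.array([1, 2, 3])
v2 = np.array([4, 5, 6])

print(v1 + v2)                  # [5 7 9]
print(3 * v1)                   # [3 6 9]
print(np.dot(v1, v2))           # 32
print(np.linalg.norm(v1))       # 3.741...
print(v1 / np.linalg.norm(v1))  # unit vector in the direction of v1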

Example: Cosine Similarity

cos_sim = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))  # 1 = same direction, 0 = orthogonal, -1 = opposite

Section 4: Matrix Operations (with AI perspective)

  • A.T – Transpose: swap rows and columns, e.g. to make shapes compatible for multiplication
  • np.dot(A, B) – Matrix Multiplication: combine weights with inputs
  • np.eye(n) – Identity Matrix: the multiplicative identity (I @ A == A)
  • np.linalg.inv(A) – Inverse Matrix: solve Ax = b as x = A⁻¹b
  • np.linalg.det(A) – Determinant: nonzero exactly when A is invertible
  • np.linalg.eig(A) – Eigenvalues & Eigenvectors: the directions A only stretches, central to PCA
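
A quick sketch exercising each operation above on a small invertible matrix:

A = np.array([[2, 1], [1, 3]])
B = np.array([[1, 0], [2, 1]])

print(A.T)               # transpose: rows become columns
print(np.dot(A, B))      # matrix product
print(np.eye(2))         # 2x2 identity
print(np.linalg.det(A))  # 5.0 -> nonzero, so A is invertible
print(np.linalg.inv(A))  # inverse of A
w, v = np.linalg.eig(A)  # eigenvalues in w, eigenvectors in the columns of v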

Section 5: Systems of Linear Equations

# Solve the system  2x + y = 8,  x + 3y = 13
A = np.array([[2, 1], [1, 3]])
b = np.array([8, 13])
x = np.linalg.solve(A, b)  # [2.2, 3.6]
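
np.linalg.solve is both faster and numerically safer than forming the inverse explicitly; a residual check confirms the solution:

print(x)                                     # [2.2 3.6]
print(np.allclose(A @ x, b))                 # True: Ax reproduces b
print(np.allclose(np.linalg.inv(A) @ b, x))  # True: same answer via the inverse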

Section 6: Decompositions

# LU decomposition: A = P @ L @ U (permutation, lower-, and upper-triangular)
from scipy.linalg import lu
P, L, U = lu(A)

# Singular Value Decomposition: A = U_svd @ np.diag(S) @ Vt
U_svd, S, Vt = np.linalg.svd(A)
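
SVD is the workhorse behind PCA and low-rank compression. Using the factors computed above, a minimal sketch of a rank-1 approximation (the choice k = 1 is illustrative):

k = 1
A_rank1 = U_svd[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]  # best rank-1 approximation of A
print(A_rank1)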

Section 7: Application in AI/ML

AI Concept       Linear Algebra Usage
---------------  --------------------------------------------------
Neural Networks  Matrix multiplication in the forward/backward pass
PCA              Eigenvalues/eigenvectors of the covariance matrix
Embeddings       Vectors, dot products, cosine similarity
Optimization     Gradient descent in vector spaces
Transformers     High-dimensional tensor operations
CNNs             Kernel (filter) matrices sliding over tensors
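
To make the first row concrete, here is a minimal sketch of one dense layer's forward pass (all sizes are illustrative):

X = np.random.rand(32, 100)   # batch of 32 samples, 100 features each
W = np.random.rand(100, 10)   # weight matrix: 100 inputs -> 10 units
b_layer = np.random.rand(10)  # one bias per unit

Z = X @ W + b_layer           # matrix multiplication plus broadcast bias
A_out = np.maximum(0, Z)      # ReLU activation, shape (32, 10)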

Section 8: Tensor Operations in AI

A tensor generalizes scalars (0D), vectors (1D), and matrices (2D) to any number of dimensions. Tensors are the core data structure in deep learning frameworks like TensorFlow and PyTorch.

Creating Tensors


tensor_3d = np.array([
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]]
])  # shape (2, 2, 2)

Indexing and Slicing

print(tensor_3d[0, 1, 1])  # 4: first block, second row, second column
print(tensor_3d[0, :, 1])  # [2 4]: second column of the first block

Broadcasting

Broadcasting lets NumPy combine tensors of different shapes by virtually “stretching” the smaller one to match:

tensor = np.ones((2, 3, 4))
bias = np.array([1, 2, 3, 4])  # shape (4,) matches the trailing axis
result = tensor + bias         # bias is broadcast up to shape (2, 3, 4)
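
This is exactly how bias addition works in a dense layer: one bias per unit, broadcast across the whole batch (shapes are illustrative):

activations = np.random.rand(32, 10)  # (batch, units)
layer_bias = np.random.rand(10)       # (units,)
out = activations + layer_bias        # broadcast over all 32 samples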

Reshaping

reshaped = tensor.reshape((3, 2, 4))  # (2, 3, 4) -> (3, 2, 4); the element count (24) must not change

Transposing Tensors

transposed = np.transpose(tensor, (0, 2, 1))  # permute axes: (2, 3, 4) -> (2, 4, 3)

Tensor Dot Product

Batched matrix multiplication drives deep learning layer transformations; np.matmul multiplies the last two axes and maps over any leading batch axes:

A = np.random.rand(2, 3, 4)
B = np.random.rand(2, 4, 5)
result = np.matmul(A, B)  # batched product: (2, 3, 4) @ (2, 4, 5) -> (2, 3, 5)

Practical Example in Deep Learning

Batch of images (shape: batch, height, width, channels):

images = np.random.rand(32, 64, 64, 3)  # 32 RGB images of 64x64
kernel = np.random.rand(3, 3, 3, 16)    # 16 convolutional filters

Each kernel slides over the image tensor to extract local patterns, as sketched below.
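
A minimal loop-based sketch of a stride-1, unpadded (“valid”) convolution over this batch; real frameworks use optimized kernels, but the arithmetic is the same:

out_h = out_w = 64 - 3 + 1                  # 62 valid positions per axis
output = np.zeros((32, out_h, out_w, 16))
for i in range(out_h):
    for j in range(out_w):
        patch = images[:, i:i+3, j:j+3, :]  # (32, 3, 3, 3) window
        # contract each 3x3x3 window against all 16 filters at once
        output[:, i, j, :] = np.tensordot(patch, kernel, axes=([1, 2, 3], [0, 1, 2]))
print(output.shape)                         # (32, 62, 62, 16)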

In AI:

  • Tensors represent inputs, outputs, weights, and activations.
  • Used in RNNs (3D: batch, time, features), CNNs (4D: batch, height, width, channels), and Transformer attention (4D and higher).
  • Batched tensor operations enable efficient GPU computation.

Recommended Books

  • Linear Algebra and Its Applications – Gilbert Strang
  • Introduction to Linear Algebra – Gilbert Strang
  • Matrix Analysis and Applied Linear Algebra – Carl D. Meyer
  • Mathematics for Machine Learning – Deisenroth, Faisal, Ong
  • Deep Learning (MIT Press) – Goodfellow, Bengio, Courville
