Minimization Objective in Neural Networks

1. What is a Minimization Objective in a Neural Network?

From a neural network perspective, a minimization objective means:

“We are trying to make the error between the model’s predictions and the actual correct answers as small as possible.”

This “error” is captured using a loss function (like Mean Squared Error or Cross Entropy). The training process revolves around minimizing this loss.
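To make the idea concrete, here is a minimal sketch in plain Python (only the standard math module; the numbers are purely illustrative) of how these two losses score predictions against targets:

    import math

    def mse(targets, predictions):
        # Mean Squared Error: average of squared differences
        return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

    def binary_cross_entropy(targets, predictions):
        # Cross entropy for binary labels: targets are 0 or 1,
        # predictions are probabilities strictly between 0 and 1
        return -sum(t * math.log(p) + (1 - t) * math.log(1 - p)
                    for t, p in zip(targets, predictions)) / len(targets)

    print(mse([200000], [180000]))                   # 400000000.0
    print(binary_cross_entropy([1, 0], [0.9, 0.2]))  # ~0.164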

Simple Story Example: “The Arrow and the Bullseye”

Characters:

  • Arjun, a young archer (represents the neural network).
  • Target, a bullseye (represents the correct output).
  • Coach, his trainer (represents the optimizer like gradient descent).

The Situation:

Arjun is practicing archery. He stands at a fixed spot and keeps shooting arrows toward the bullseye.

  • Each time he shoots, he sees how far off his arrow landed from the center.
  • The distance from the bullseye is like the loss.
  • His goal? Minimize this distance with each new arrow.

So:

  • The bullseye = actual label.
  • The arrow’s position = predicted output.
  • The gap between arrow and bullseye = loss.
  • The minimization = trying to shoot better by learning from previous errors.

Iterative Learning:

After each round, the coach gives feedback: “Arjun, your shot was 5 cm too high and 3 cm too far to the right.”

This feedback helps Arjun adjust his aim slightly. Repeat this process 1000 times… Arjun becomes a great archer.
In neural networks, that’s what happens during backpropagation + gradient descent.
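That feedback loop can be sketched in a few lines of Python. This is only an illustration, assuming the “aim” is a 2D offset from the bullseye, the loss is the squared distance, and the coach’s feedback is the gradient of that loss:

    # Illustrative sketch: Arjun's aim, adjusted by gradient descent.
    aim = [5.0, 3.0]          # first arrow: 5 cm too high, 3 cm too far right
    learning_rate = 0.1

    for shot in range(1000):
        # Loss = aim_x**2 + aim_y**2 (squared distance from the bullseye).
        # The coach's feedback is the gradient of that loss: (2*aim_x, 2*aim_y).
        grad = [2 * aim[0], 2 * aim[1]]
        # Adjust the aim slightly against the gradient.
        aim = [aim[0] - learning_rate * grad[0],
               aim[1] - learning_rate * grad[1]]

    print(aim)  # essentially [0.0, 0.0]: arrows land on the bullseye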

Mini Numerical Example:

Let’s say we have this small neural net predicting house prices.

Input (size)   Target Price   Predicted Price   Loss = (Target – Predicted)²
1000 sqft      200,000        180,000           (200,000 – 180,000)² = 400,000,000 (400M)

Now the optimizer tries to change the weights in the network to make the prediction closer to 200,000 next time.

  • Objective: Minimize the total loss (e.g., the sum of squared errors across all examples).
  • Once the loss is very small → model has learned well.
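Here is a minimal sketch of that weight update, assuming the “network” is a single weight w with predicted price = w × size (a simplification of whatever the real network computes):

    # One gradient descent step on the house-price example above.
    size, target = 1000.0, 200000.0
    w = 180.0                       # current weight -> prediction of 180,000
    learning_rate = 1e-7            # small, because the inputs are large

    prediction = w * size
    loss = (target - prediction) ** 2           # 400,000,000 (the "400M" above)
    grad_w = -2 * (target - prediction) * size  # d(loss)/dw

    w -= learning_rate * grad_w                 # nudge the weight downhill
    print(w * size)                             # 184,000 -> closer to 200,000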

2. What is a Convex Function? (Plain English)

A convex function is a special type of curve that always looks like a smile. Imagine a bowl — no matter where we drop a ball inside it, it will always roll down to the lowest point in the middle.

Simple Definition (No Math):

A function is convex if any straight line between two points on the curve lies on or above the curve. That means the curve never rises above the straight line between any two of its points.

Real-World Analogy:

  • Imagine we’re hiking on a mountain trail shaped like a big U.
  • No matter where we start, as long as we keep going downhill, we’ll always reach the valley (the lowest point).
  • That valley is the minimum — and for convex functions, there is only one such lowest point (called the global minimum).

Simple Example of a Convex Function:

f(x)=x²

This is a U-shaped curve:

  • f(0)=0 → minimum
  • f(1)=1 , f(2)=4 , f(–1)=1
  • The function always increases as we go away from zero.

Here’s how it looks: [plot of the U-shaped parabola f(x) = x², lowest at x = 0]

Any line between two points on this curve (say at x=−2 and x=2) will always lie above the curve itself.
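A quick numeric check of that claim, sketched in Python: sample points between x = −2 and x = 2 and compare the curve with the straight line (chord) joining the two endpoints:

    # Verify the chord condition for f(x) = x**2 between x = -2 and x = 2.
    f = lambda x: x ** 2
    x1, x2 = -2.0, 2.0

    for t in [0.0, 0.25, 0.5, 0.75, 1.0]:
        x = (1 - t) * x1 + t * x2              # a point between x1 and x2
        chord = (1 - t) * f(x1) + t * f(x2)    # the straight line at that point
        assert f(x) <= chord                   # the curve never rises above the chord
        print(f"x = {x:+.2f}   curve = {f(x):.2f}   chord = {chord:.2f}")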

Why Does Convexity Matter in Neural Networks?

  • When training a model, we try to minimize a loss function.
  • If the loss function is convex (like a simple MSE for linear models), then gradient descent will always find the best solution — because there’s only one lowest point.
  • In contrast, non-convex functions (like those in deep neural networks) have many hills and valleys → multiple local minima → optimization is harder, as the sketch below shows.
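The contrast is easy to see numerically. In this sketch, gradient descent on the convex f(x) = x² reaches the single minimum from any start, while on an assumed non-convex toy function, f(x) = x⁴ − 3x² + x, the same procedure ends up in a different valley depending on where it starts:

    def descend(grad, x, lr=0.01, steps=5000):
        # Plain gradient descent: repeatedly step against the gradient.
        for _ in range(steps):
            x -= lr * grad(x)
        return x

    # Convex: f(x) = x**2, gradient 2x -> always reaches the one minimum at 0.
    print(descend(lambda x: 2 * x, x=5.0))     # ~0.0
    print(descend(lambda x: 2 * x, x=-7.0))    # ~0.0

    # Non-convex toy example: f(x) = x**4 - 3x**2 + x, gradient 4x**3 - 6x + 1.
    g = lambda x: 4 * x ** 3 - 6 * x + 1
    print(descend(g, x=2.0))    # ~ +1.13 (a local minimum)
    print(descend(g, x=-2.0))   # ~ -1.30 (the deeper, global minimum)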

Summary:

Concept            In Plain Words
Convex function    A function shaped like a bowl; it always curves up
Why it matters     Easy to minimize; there is always one best (lowest) point
Classic example    f(x) = x²
Gradient descent   Works reliably when the loss is convex

[Plot: a simple convex function (one bowl) vs a non-convex function (many hills and valleys)]

In Simple Terms:

  • Convex Function
    • Always has one unique global minimum
    • Looks like a bowl or U-shape
    • Gradient descent always finds the best solution
  • Non-Convex Function
    • Has multiple minima (local and global)
    • Looks wavy or hilly
    • Gradient descent might get stuck in a local minimum, not the best one

Minimize Objective Example with Simple Python
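Below is a hedged end-to-end sketch of the whole idea in plain Python: the simplest possible “network” (one weight and one bias) trained to minimize the squared-error objective on the house-price row from the table above. A real implementation would use a library such as PyTorch or TensorFlow, but the minimization objective works the same way:

    # Minimal sketch: minimize a squared-error objective with gradient descent.
    # Assumed toy model: price = w * size + b, trained on a single example.
    size, target = 1.0, 200000.0   # size rescaled (1000 sqft -> 1.0) for stability
    w, b = 0.0, 0.0
    learning_rate = 0.1

    for step in range(200):
        prediction = w * size + b
        loss = (target - prediction) ** 2
        # Gradients of the loss with respect to each parameter
        grad_w = -2 * (target - prediction) * size
        grad_b = -2 * (target - prediction)
        # Step both parameters downhill
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b

    print(prediction, loss)  # prediction -> ~200,000 and loss -> ~0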