Activation Function Examples with Simple Python

Basic Python Code (No Libraries)

# Define activation functions (pure Python, no libraries)
E = 2.718281828459045  # Euler's number, hard-coded since we avoid the math module

def relu(x):
    # ReLU: pass positives through unchanged, clamp negatives to zero
    return x if x > 0 else 0

def sigmoid(x):
    # Sigmoid: squash any real number into the range (0, 1)
    return 1 / (1 + pow(E, -x))

def tanh(x):
    # Tanh: squash any real number into the range (-1, 1)
    e_pos = pow(E, x)
    e_neg = pow(E, -x)
    return (e_pos - e_neg) / (e_pos + e_neg)

# Sample inputs
inputs = [-2, -1, 0, 1, 2]

# Show how each activation behaves
print("Input\tReLU\tSigmoid\t\tTanh")
print("---------------------------------------")
for x in inputs:
    r = relu(x)
    s = sigmoid(x)
    t = tanh(x)
    print(f"{x}\t{r:.2f}\t{s:.4f}\t\t{t:.4f}")

Expected Output

Input   ReLU    Sigmoid     Tanh
---------------------------------------
-2      0.00    0.1192      -0.9640
-1      0.00    0.2689      -0.7616
 0      0.00    0.5000       0.0000
 1      1.00    0.7311       0.7616
 2      2.00    0.8808       0.9640

Explanation:

  • ReLU gives zero for all negatives, passes positives as-is.
  • Sigmoid smoothly squashes values between 0 and 1.
  • Tanh squashes values between -1 and 1, symmetric around zero.

Next, when and why to use each activation function, with real-life-inspired use cases and their strengths and weaknesses.

2. Sigmoid

Use When:

  • We need probability-like output (e.g., binary classification).
  • We’re working on the final layer of a network that decides “yes/no” or “true/false”.

Real Use Case Example:

Email Spam Detection:
If our model should say “Spam” or “Not Spam”, sigmoid is a natural fit: an output close to 1 means Spam, and an output close to 0 means Not Spam.
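
A minimal sketch of this idea, using hypothetical raw scores and a hypothetical 0.5 threshold (nothing here comes from a real model); it uses the standard math module instead of the hand-coded constant above:

import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# Hypothetical raw scores from the last layer of a spam classifier
raw_scores = [-3.2, -0.4, 0.1, 2.7]

for score in raw_scores:
    p_spam = sigmoid(score)                        # squash to a probability-like value
    label = "Spam" if p_spam >= 0.5 else "Not Spam"
    print(f"score={score:+.1f} -> P(spam)={p_spam:.3f} -> {label}")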

Caution:

  • In deep networks, it can cause the vanishing gradient problem, which slows learning (see the sketch after this list).
  • Use it only in the output layer, and only when the classification is binary.
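
To see where the vanishing gradient problem comes from, here is a small sketch of sigmoid's derivative, sigmoid(x) * (1 - sigmoid(x)), which peaks at 0.25 and shrinks toward zero as the input moves away from zero:

import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_grad(x):
    # Derivative of sigmoid: s * (1 - s); its maximum is 0.25 at x = 0
    s = sigmoid(x)
    return s * (1 - s)

for x in [0, 2, 5, 10]:
    print(f"x = {x:>2}  ->  sigmoid'(x) = {sigmoid_grad(x):.6f}")

In a deep network these small factors are multiplied layer after layer during backpropagation, so the gradients reaching the early layers become tiny and learning slows down.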

3. Tanh (Hyperbolic Tangent)

Use When:

  • Our data is centered around zero (positive & negative values).
  • We need smoother gradients than ReLU.
  • We’re dealing with hidden layers in a relatively shallow network.

Real Use Case Example:

Sentiment Analysis:

If the input text can carry both positive and negative sentiment (e.g., movie reviews), tanh is useful for expressing that full range, from strongly negative (-1) to strongly positive (+1).

Why?

  • Tanh gives a balanced output range, from negative to positive.
  • It helps capture opposing signals, unlike sigmoid, whose output is always positive (see the sketch after this list).
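
A small sketch of that contrast, using made-up reviews and hypothetical raw sentiment scores:

import math

# Hypothetical raw sentiment scores for a few movie reviews
reviews = {
    "Terrible plot, awful acting": -2.4,
    "It was okay, nothing special": 0.2,
    "Loved every minute of it": 3.1,
}

for text, score in reviews.items():
    t = math.tanh(score)               # keeps the sign: ranges from -1 to +1
    s = 1 / (1 + math.exp(-score))     # always positive: ranges from 0 to 1
    print(f"{text!r}: tanh = {t:+.3f}, sigmoid = {s:.3f}")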

Quick Comparison Table

Activation | Output Range | Best For                              | Commonly Used In                     | Weakness
ReLU       | 0 to ∞       | Hidden layers                         | Image recognition, deep networks     | Can “die” for negatives (output = 0)
Sigmoid    | 0 to 1       | Output layer (binary classification)  | Logistic regression, spam detection  | Vanishing gradients
Tanh       | -1 to 1      | Hidden layers (balanced input)        | Sentiment analysis, text signals     | Can still suffer vanishing gradients

Rule of Thumb

  • Use ReLU for most hidden layers in deep networks.
  • Use Sigmoid only in the output layer for binary classification.
  • Use Tanh when your inputs are centered around zero or involve both positive and negative signals (the sketch after this list puts all three rules together).
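
Putting the three rules together, here is a minimal sketch of one forward pass through a tiny network: ReLU in the hidden layer, sigmoid in the output layer. All inputs, weights, and biases are hypothetical.

import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

inputs = [0.5, -1.2, 3.0]              # one example with 3 features (hypothetical)

hidden_w = [[0.2, -0.4, 0.1],          # 2 hidden neurons x 3 inputs
            [-0.3, 0.8, 0.5]]
hidden_b = [0.1, -0.2]

out_w = [1.5, -0.7]                    # 1 output neuron x 2 hidden neurons
out_b = 0.05

# Hidden layer: weighted sum + ReLU
hidden = [relu(sum(w * x for w, x in zip(ws, inputs)) + b)
          for ws, b in zip(hidden_w, hidden_b)]

# Output layer: weighted sum + sigmoid -> probability-like value
output = sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)

print("hidden activations:", [round(h, 3) for h in hidden])
print("output probability:", round(output, 3))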

Activation Function Relevance in Neural Networks – Visual Roadmap