Normalization with Simple Python


Here is a plain-Python example (no external libraries) in which we:

  • Encode a category (like city)
  • Normalize numerical features (age and salary)
  • Prepare input for a neural network
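
For reference, the normalization step in this example is z-score standardization. The formulas below are a restatement of what the normalize function computes, using the population standard deviation (dividing by n rather than n - 1); the symbols are notation chosen here, not taken from the original text.

```latex
\mu = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\sigma = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2}, \qquad
z_i = \frac{x_i - \mu}{\sigma}
```

For the four ages in the sample data (25, 30, 35, 28), this gives \mu = 29.5 and \sigma = \sqrt{13.25} \approx 3.64, so an age of 25 maps to (25 - 29.5) / 3.64 \approx -1.24.
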
# Step 1: Sample Data (2 features: Age, Salary; 1 categorical: City)
data = [
    {"city": "New York", "age": 25, "salary": 50000},
    {"city": "London", "age": 30, "salary": 60000},
    {"city": "New York", "age": 35, "salary": 120000},
    {"city": "Tokyo", "age": 28, "salary": 75000},
]

# Step 2: One-Hot Encode the 'city'
def one_hot_encode(data, key):
    """Replace row[key] with one 0/1 column per category (modifies rows in place)."""
    categories = sorted({row[key] for row in data})
    for row in data:
        for cat in categories:
            row[f"{key}_{cat}"] = 1 if row[key] == cat else 0
        del row[key]
    return categories  # Sorted category names, for reference

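# Step 3: Normalize numerical columns (z-score: zero mean, unit variance)
def normalize(data, key):
    """Z-score the column in place, using the population standard deviation."""
    values = [row[key] for row in data]
    mean = sum(values) / len(values)
    std = (sum((x - mean) ** 2 for x in values) / len(values)) ** 0.5
    for row in data:
        # A constant column has std == 0; map it to 0.0 instead of dividing by zero
        row[key] = (row[key] - mean) / std if std > 0 else 0.0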

# Step 4: Apply encoding and normalization
one_hot_encode(data, "city")
normalize(data, "age")
normalize(data, "salary")

# Step 5: Print the final processed data
for row in data:
    print(row)

Output (values rounded to two decimals)

{'age': -1.24, 'salary': -0.98, 'city_London': 0, 'city_New York': 1, 'city_Tokyo': 0}
{'age': 0.14, 'salary': -0.61, 'city_London': 1, 'city_New York': 0, 'city_Tokyo': 0}
{'age': 1.51, 'salary': 1.63, 'city_London': 0, 'city_New York': 1, 'city_Tokyo': 0}
{'age': -0.41, 'salary': -0.05, 'city_London': 0, 'city_New York': 0, 'city_Tokyo': 1}
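
A neural network expects each sample as a fixed-length vector of numbers, not a dictionary. The sketch below shows one way to take that last step, assuming rows shaped like the processed output (z-scores rounded to two decimals for readability). The helper name to_matrix and the explicit feature_names list are illustrative choices, not part of the original code.

```python
# Processed rows for the sample data (z-scores rounded to two decimals).
processed = [
    {"age": -1.24, "salary": -0.98, "city_London": 0, "city_New York": 1, "city_Tokyo": 0},
    {"age": 0.14, "salary": -0.61, "city_London": 1, "city_New York": 0, "city_Tokyo": 0},
    {"age": 1.51, "salary": 1.63, "city_London": 0, "city_New York": 1, "city_Tokyo": 0},
    {"age": -0.41, "salary": -0.05, "city_London": 0, "city_New York": 0, "city_Tokyo": 1},
]

# Fix the column order explicitly so every row maps to the same input positions.
feature_names = ["age", "salary", "city_London", "city_New York", "city_Tokyo"]


def to_matrix(rows, columns):
    """Convert a list of dicts into a list of float vectors in the given column order."""
    return [[float(row[col]) for col in columns] for row in rows]


X = to_matrix(processed, feature_names)

print(len(X), "samples x", len(X[0]), "features")
for vector in X:
    print(vector)
```

Listing the columns explicitly, rather than relying on dictionary key order, keeps training and inference inputs aligned even if rows are built in a different order later.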

How This Helps in a Neural Network

When we train a neural network:

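  • Inputs like age and salary are now on a comparable scale: each has mean 0 and standard deviation 1, so most values fall within a few units of 0.
  • No feature dominates the weight updates simply because its raw values are large (salary in the tens of thousands versus age in the tens).
  • Gradient descent typically converges faster and more stably, because the loss surface is less stretched along any one input.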
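
The effect on gradient descent can be illustrated with a small experiment. This is a minimal sketch, not a benchmark: a linear model stands in for a neural network, the target values are a made-up linear function of age and salary, and the two learning rates are assumptions chosen so that each run stays stable (raw salaries force a very small step size).

```python
# Sample inputs from the example; the target is a made-up linear function
# of them, used only so the best achievable loss is known to be 0.
ages = [25, 30, 35, 28]
salaries = [50000, 60000, 120000, 75000]
targets = [0.1 * a + 0.00002 * s for a, s in zip(ages, salaries)]


def standardize(values):
    """Return z-scores using the population standard deviation."""
    mean = sum(values) / len(values)
    std = (sum((x - mean) ** 2 for x in values) / len(values)) ** 0.5
    return [(x - mean) / std for x in values]


def train(rows, targets, lr, steps):
    """Fit y = w . x + b by full-batch gradient descent; return the final MSE."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(steps):
        errors = [sum(wi * xi for wi, xi in zip(w, x)) + b - y
                  for x, y in zip(rows, targets)]
        grad_w = [2 * sum(e * x[j] for e, x in zip(errors, rows)) / n
                  for j in range(d)]
        grad_b = 2 * sum(errors) / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    errors = [sum(wi * xi for wi, xi in zip(w, x)) + b - y
              for x, y in zip(rows, targets)]
    return sum(e * e for e in errors) / n


raw_rows = [list(r) for r in zip(ages, salaries)]
std_rows = [list(r) for r in zip(standardize(ages), standardize(salaries))]

# Raw salaries are around 1e5, so the step size must be tiny to avoid divergence.
loss_raw = train(raw_rows, targets, lr=1e-10, steps=500)
# Standardized inputs tolerate a much larger step size.
loss_std = train(std_rows, targets, lr=0.1, steps=500)

print(f"raw inputs:          MSE = {loss_raw:.4f}")
print(f"standardized inputs: MSE = {loss_std:.6f}")
```

With the same number of steps, the standardized run ends at a much lower error. The raw run is held back because the one learning rate small enough for the salary column is far too small to make progress on the age weight and the bias.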

Normalization in Neural Networks – Summary