From Basic Concepts to Advanced Applications
Logistic Regression is a fundamental classification algorithm that helps us predict categorical outcomes. Think of it as a smart decision-maker that answers "yes" or "no" questions with probabilities.
It classifies data into two distinct categories (0 or 1, Yes or No, True or False), which makes it well suited to spam detection, medical diagnosis, and credit scoring.
Instead of returning only a class label, it provides a probability score between 0 and 1 that indicates how confident the prediction is.
It creates a boundary that separates the classes in feature space, typically where the predicted probability equals 0.5.
σ(z) = 1 / (1 + e^(-z))
This magical function converts any number into a probability between 0 and 1!
First, we calculate a linear combination of the inputs: z = w₁x₁ + w₂x₂ + ... + b, where w are the weights, x are the features, and b is the bias.
The linear result is passed through the sigmoid function to squash it between 0 and 1.
We use Log Loss (Binary Cross-Entropy) to measure how wrong our predictions are.
J(θ) = -(1/m) Σ[y·log(ŷ) + (1-y)·log(1-ŷ)]
Measures the error between predicted and actual values.
import numpy as np

def sigmoid(z):
    # Squash any real number into the (0, 1) range
    return 1 / (1 + np.exp(-z))

def predict(features, weights, bias):
    # Linear combination of the inputs, then the sigmoid
    z = np.dot(features, weights) + bias
    return sigmoid(z)
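To turn the Log Loss formula above into code, here is a minimal sketch assuming NumPy arrays of true labels y and predicted probabilities y_hat; the helper name log_loss and the clipping constant are illustrative choices, not part of the original:

def log_loss(y, y_hat, eps=1e-15):
    # Binary cross-entropy; clip predictions to avoid log(0)
    y_hat = np.clip(y_hat, eps, 1 - eps)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

y = np.array([1, 0, 1])
y_hat = np.array([0.9, 0.2, 0.7])
print(log_loss(y, y_hat))   # roughly 0.23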
Explore how different parameters affect the decision boundary. With two features, the boundary is the line where w₁x₁ + w₂x₂ + b = 0, i.e. where the predicted probability is exactly 0.5.
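A minimal sketch that solves for that boundary line; the helper name boundary_x2 and the weight values are illustrative:

def boundary_x2(x1, w1, w2, b):
    # Solve w1*x1 + w2*x2 + b = 0 for x2 (the 0.5-probability line)
    return -(w1 * x1 + b) / w2

x1 = np.linspace(-3, 3, 5)
print(boundary_x2(x1, w1=1.0, w2=1.0, b=0.0))    # boundary passes through the origin
print(boundary_x2(x1, w1=2.0, w2=1.0, b=-1.0))   # a larger w1 and nonzero b tilt and shift the line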
L1 (Lasso) and L2 (Ridge) regularization prevent overfitting by adding penalty terms to the cost function, controlling model complexity.
Logistic regression extends to multiple classes using One-vs-Rest (OvR), or softmax (multinomial) regression when the classes are mutually exclusive.
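For the softmax case, a minimal sketch that turns one linear score per class into a probability distribution; the score values are made up for illustration:

def softmax(scores):
    # Convert a vector of class scores into probabilities that sum to 1
    shifted = scores - np.max(scores)      # subtract the max for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # roughly [0.66, 0.24, 0.10]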
Beyond gradient descent: Adam, RMSprop, and L-BFGS optimizers can offer faster convergence and better performance.
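As one example, a minimal sketch of a single Adam update step that could replace the plain gradient-descent update in the implementation further below; the hyperparameter defaults follow common conventions and are not from the original:

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    # Running averages of the gradient (m) and squared gradient (v)
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias-corrected estimates (t is the 1-based step count)
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v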
J(θ) = -(1/m) Σ[y·log(ŷ) + (1-y)·log(1-ŷ)] + (λ/(2m)) Σ θⱼ²
λ controls the strength of regularization.
class LogisticRegression:
    def __init__(self, lr=0.01, epochs=1000, reg_lambda=0.1):
        self.lr = lr                    # learning rate
        self.epochs = epochs            # number of gradient-descent passes
        self.reg_lambda = reg_lambda    # L2 regularization strength (lambda)

    def fit(self, X, y):
        # Initialize parameters: one weight per feature, scalar bias
        m, n = X.shape
        self.weights = np.zeros(n)
        self.bias = 0.0
        # Gradient descent with L2 regularization
        for epoch in range(self.epochs):
            # Forward pass
            z = np.dot(X, self.weights) + self.bias
            predictions = sigmoid(z)
            # Backward pass with regularization (the bias is not regularized)
            dw = (1 / m) * np.dot(X.T, (predictions - y)) + \
                 (self.reg_lambda / m) * self.weights
            db = (1 / m) * np.sum(predictions - y)
            # Update parameters
            self.weights -= self.lr * dw
            self.bias -= self.lr * db

    def predict_proba(self, X):
        # Probability of class 1 for each row of X
        return sigmoid(np.dot(X, self.weights) + self.bias)
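A quick usage sketch on a tiny made-up dataset, just to show the fit and predict cycle:

X = np.array([[0.5, 1.2], [1.5, 0.3], [3.0, 2.5], [2.2, 3.1]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression(lr=0.1, epochs=5000, reg_lambda=0.01)
model.fit(X, y)
print(model.predict_proba(X))   # higher probabilities for the class-1 rows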
The Receiver Operating Characteristic (ROC) curve plots the true positive rate against the false positive rate at various threshold settings.
The Area Under the Curve (AUC) measures the model's ability to distinguish between classes: AUC = 0.5 is random guessing, AUC = 1 is perfect separation.
Useful for imbalanced datasets where accuracy can be misleading.
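A small sketch, assuming scikit-learn is available, that computes the AUC and ROC curve from predicted probabilities; the labels and scores below are made up:

from sklearn.metrics import roc_auc_score, roc_curve

y_true = [0, 0, 1, 1, 0, 1]                      # actual labels
y_scores = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]       # predicted probabilities for class 1

print(roc_auc_score(y_true, y_scores))           # AUC, roughly 0.89 for these values
fpr, tpr, thresholds = roc_curve(y_true, y_scores)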
Try adjusting the weights and bias and observe how they affect the classification probability:
Adjust parameters to achieve 95%+ probability for Class 1. Hint: Increase weights and adjust bias positively.
Find parameter settings that keep probability exactly at 50%. This represents maximum uncertainty.
Give Weight 1 a large positive value and Weight 2 an equally large negative value. Observe how conflicting features affect the decision.
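To reproduce these scenarios in code, a small sketch using the predict function defined earlier; the feature values and parameter settings are illustrative:

x = np.array([1.0, 1.0])   # a made-up two-feature input

print(predict(x, weights=np.array([3.0, 3.0]), bias=1.0))    # ~0.999: confidently class 1
print(predict(x, weights=np.array([0.0, 0.0]), bias=0.0))    # 0.5: maximum uncertainty
print(predict(x, weights=np.array([3.0, -3.0]), bias=0.0))   # 0.5: conflicting features cancel out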