Sigmoid Function Derivation
The sigmoid function is a common activation function used in machine learning and statistics, especially in binary classification and logistic regression.
✅ 1. Definition of Sigmoid Function
The sigmoid function is defined as:

\[
\sigma(x) = \frac{1}{1 + e^{-x}}
\]
It maps any real-valued number into the range \( (0, 1) \), making it useful for probability-based tasks.
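As a concrete illustration, here is a minimal NumPy sketch of the sigmoid; the function name `sigmoid` and the sign-based split (used for numerical stability) are our own choices, not part of the definition above.

```python
import numpy as np

def sigmoid(x):
    """Compute sigma(x) = 1 / (1 + exp(-x)) in a numerically stable way."""
    x = np.asarray(x, dtype=float)
    out = np.empty_like(x)
    pos = x >= 0
    # For x >= 0, exp(-x) <= 1, so the direct formula is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-x[pos]))
    # For x < 0, use the equivalent form exp(x) / (1 + exp(x)) to avoid overflow.
    ex = np.exp(x[~pos])
    out[~pos] = ex / (1.0 + ex)
    return out

print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # ~0.0000454, 0.5, ~0.99995
```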
✏️ 2. Why This Form?
The sigmoid function arises naturally in logistic regression, where we model the log-odds of a binary outcome as a linear function of the input.
We want a function that:

- Outputs values between 0 and 1 (interpreted as probabilities),
- Is smooth and differentiable,
- Is monotonic (increasing),
- Approaches 1 as \( x \to \infty \), and 0 as \( x \to -\infty \).
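As a quick check of the last requirement, the limits follow directly from the definition (a standard computation, nothing new is assumed):

\[
\lim_{x \to \infty} \frac{1}{1 + e^{-x}} = \frac{1}{1 + 0} = 1,
\qquad
\lim_{x \to -\infty} \frac{1}{1 + e^{-x}} = 0
\quad \text{since } e^{-x} \to \infty.
\]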
💡 3. Derivation from Logistic Model
Start with the log-odds (logit) expression, where \( p \) is the probability of the positive class and \( x \) is the linear function of the input:

\[
\log\left(\frac{p}{1 - p}\right) = x
\]

Solve for \( p \) by exponentiating both sides:

\[
\frac{p}{1 - p} = e^{x}
\quad\Longrightarrow\quad
p = e^{x}(1 - p)
\quad\Longrightarrow\quad
p\left(1 + e^{x}\right) = e^{x}
\]

Rewriting:

\[
p = \frac{e^{x}}{1 + e^{x}} = \frac{1}{1 + e^{-x}} = \sigma(x)
\]
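As a sanity check on this algebra, the short sketch below (the helper names `logit` and `sigmoid` are ours) verifies numerically that the sigmoid inverts the log-odds:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def logit(p):
    return np.log(p / (1.0 - p))

p = np.array([0.1, 0.25, 0.5, 0.9])
# Applying the sigmoid to the log-odds should recover the original probabilities.
print(np.allclose(sigmoid(logit(p)), p))  # True
```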
📐 Derivative of the Sigmoid Function
Let:

\[
\sigma(x) = \frac{1}{1 + e^{-x}}
\]

We want to find:

\[
\sigma'(x) = \frac{d}{dx}\,\sigma(x)
\]
✏️ Use the Chain Rule
It's easier to rewrite the expression as:

\[
\sigma(x) = \left(1 + e^{-x}\right)^{-1}
\]

Differentiate using the chain rule:

\[
\sigma'(x) = -\left(1 + e^{-x}\right)^{-2} \cdot \frac{d}{dx}\left(1 + e^{-x}\right)
= -\left(1 + e^{-x}\right)^{-2} \cdot \left(-e^{-x}\right)
= \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}}
\]
💡 Express in Terms of \( \sigma(x) \)
Recall:

\[
\sigma(x) = \frac{1}{1 + e^{-x}}
\qquad\text{and}\qquad
1 - \sigma(x) = \frac{e^{-x}}{1 + e^{-x}}
\]

So:

\[
\sigma'(x) = \frac{e^{-x}}{\left(1 + e^{-x}\right)^{2}}
= \frac{1}{1 + e^{-x}} \cdot \frac{e^{-x}}{1 + e^{-x}}
= \sigma(x)\left(1 - \sigma(x)\right)
\]
✅ Final Result

\[
\sigma'(x) = \sigma(x)\left(1 - \sigma(x)\right)
\]
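A small numerical sketch of this identity, comparing the closed form against a central finite difference; the step size `h` and the test points are arbitrary illustrative choices:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-5.0, 5.0, 11)
h = 1e-6

# Central finite-difference approximation of sigma'(x).
numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2.0 * h)
# Closed-form derivative from the derivation above.
analytic = sigmoid(x) * (1.0 - sigmoid(x))

print(np.max(np.abs(numeric - analytic)))  # roughly 1e-10 or smaller
```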
🔁 Summary
| Property | Formula |
|---|---|
| Sigmoid Function | \( \sigma(x) = \frac{1}{1 + e^{-x}} \) |
| Derivative | \( \sigma'(x) = \sigma(x)(1 - \sigma(x)) \) |
| Range | \( (0, 1) \) |
| Applications | Logistic Regression, Neural Networks |
💡 This function is widely used for binary classification tasks and as an activation function in shallow neural networks.
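To connect this to the applications above, here is a minimal sketch of a logistic-regression-style prediction; the weights, bias, inputs, and the 0.5 decision threshold are illustrative assumptions, not values from the text:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Assumed parameters for a 2-feature model (purely illustrative).
w = np.array([1.5, -0.8])
b = 0.2
X = np.array([[0.5, 1.0],
              [2.0, -1.0]])  # two example inputs

# The linear predictor (log-odds) is mapped to a probability by the sigmoid.
probs = sigmoid(X @ w + b)
labels = (probs >= 0.5).astype(int)  # hard decision at the assumed 0.5 threshold
print(probs, labels)
```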