Logistic Regression
Supervised learning · Classification · Sigmoid · Metrics

A supervised ML algorithm for classification: the output is categorical (usually binary). Unlike Linear Regression, it predicts the probability of a class label. The Sigmoid function maps a linear combination of the inputs to a probability, and a threshold turns that probability into a discrete output. Coefficients are estimated using Maximum Likelihood Estimation (MLE). Input = independent variables (X) · Output = categorical label Y ∈ {0, 1}.

Types of logistic regression
Binary
Binary Logistic Regression
Output: Two possible outcomes (0 or 1)
e.g. Pass/Fail, Spam/Not Spam, Yes/No
Y ∈ {0, 1}
Most common form; single decision boundary
Multinomial
Multinomial Logistic Regression
Output: 3+ classes, no ordering
e.g. Red / Blue / Green category labels
Y ∈ {C₁, C₂, …, Cₙ}
Uses softmax (one joint model over all classes); one-vs-rest (OvR), with one binary model per class, is an alternative
Ordinal
Ordinal Logistic Regression
Output: 3+ ordered/ranked classes
e.g. Low / Medium / High severity
Y : Low < Med < High
Order matters; uses cumulative logits
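
To make the multinomial case concrete, here is a minimal sketch of softmax in plain Python. The class scores are hypothetical values chosen for illustration, not the output of a fitted model; in practice each score would be a linear combination of the features for that class.

```python
import math

def softmax(scores):
    """Turn a list of per-class linear scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three unordered classes
labels = ["Red", "Blue", "Green"]
probs = softmax([2.0, 1.0, 0.1])

# The predicted class is the one with the highest probability
predicted = labels[probs.index(max(probs))]

# With two classes and one score fixed at 0, softmax reduces to the sigmoid
two_class = softmax([1.2, 0.0])[0]
```

The last line illustrates why binary logistic regression can be seen as a special case of the multinomial model.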
Mathematics — sigmoid & logit
[Figure: S-curve of σ(z) plotted against z; the vertical axis runs from 0 to 1, with 0.5 marked]
Sigmoid (Logistic) Function
σ(z) = 1 / (1 + e⁻ᶻ)
z = β₀ + β₁X₁ + β₂X₂ + … + βₚXₚ (linear combination)
σ(z): Output ∈ (0, 1), interpreted as probability
β₀: Intercept (bias term)
β₁…βₚ: Feature coefficients
Full Model
P(Y=1) = 1 / (1 + e^−(β₀ + β₁X₁ + … + βₚXₚ))
P(Y=1): Probability that the output belongs to class 1
Threshold: ≥ 0.5 → predict class 1; < 0.5 → class 0
Shape: S-curve (not a straight line as in Linear Regression)
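
The model and threshold rule above can be sketched in a few lines of Python. The coefficient values and the "hours studied" feature are hypothetical, picked only to make the arithmetic easy to follow; a real model would learn them from data.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(x, intercept, coefs):
    """P(Y=1) for one sample x (a list of feature values)."""
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    return sigmoid(z)

def predict(x, intercept, coefs, threshold=0.5):
    """Map the probability to a class label using the threshold."""
    return 1 if predict_proba(x, intercept, coefs) >= threshold else 0

# Hypothetical coefficients: beta0 = -4, beta1 = 1 (one feature, hours studied)
beta0, betas = -4.0, [1.0]

p_five_hours = predict_proba([5.0], beta0, betas)   # z = 1, so p is about 0.73
label_five_hours = predict([5.0], beta0, betas)     # 1 (p >= 0.5)
label_three_hours = predict([3.0], beta0, betas)    # 0 (z = -1, p < 0.5)
```

The 0.5 threshold corresponds to z = 0, which is why the decision boundary is linear in the features.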

Log-Odds (Logit Function) — log[ P(Y=1) / P(Y=0) ] = β₀ + β₁X₁ + … + βₚXₚ  ·  This transformation maps probabilities (0–1) onto the full real line (−∞, +∞), creating a linear relationship between input features and the log-odds of the outcome. Logistic Regression is linear in log-odds space.
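
As a quick check of the log-odds relationship, the sketch below verifies that the logit undoes the sigmoid, and that a one-unit increase in a feature multiplies the odds by e^β. The coefficients are hypothetical values used only for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Log-odds of a probability p in (0, 1)."""
    return math.log(p / (1.0 - p))

# logit is the inverse of sigmoid: it recovers the linear score z
z = 1.5
recovered_z = logit(sigmoid(z))

# Hypothetical single-feature model
beta0, beta1 = -1.0, 0.7

def odds(x):
    p = sigmoid(beta0 + beta1 * x)
    return p / (1.0 - p)

# Raising x by one unit multiplies the odds by exp(beta1)
odds_ratio = odds(3.0) / odds(2.0)
expected_ratio = math.exp(beta1)
```

This is the usual way coefficients are read: a positive β raises the odds of class 1, a negative β lowers them.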

Evaluation metrics & confusion matrix
1. Accuracy
% of all correctly predicted instances.
(TP + TN) / (TP + TN + FP + FN)
Misleading on imbalanced data.
2. Precision
Of all predicted positives, how many were correct?
TP / (TP + FP)
Use when false positives are costly.
3. Recall
Of all actual positives, how many were caught?
TP / (TP + FN)
Use when false negatives are costly.
4. F1 Score
Harmonic mean of Precision & Recall.
2 × P × R / (P + R)
Useful on imbalanced datasets, where accuracy alone can mislead.
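
The four formulas can be computed directly from the confusion-matrix counts. The sketch below uses hypothetical counts for an imbalanced dataset (10 positives out of 1000 samples) to show how accuracy can look strong while precision, recall, and F1 stay low. Returning 0.0 when a denominator is zero is a convention assumed here, not the only possible choice.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from raw counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Hypothetical counts: 1000 samples, only 10 actual positives
metrics = classification_metrics(tp=2, tn=985, fp=5, fn=8)
# accuracy is 0.987, yet recall is only 0.2 (2 of 10 positives caught)
```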
Confusion Matrix
                | Predicted Positive          | Predicted Negative
Actual Positive | TP: True Positive           | FN: False Negative (Type II)
Actual Negative | FP: False Positive (Type I) | TN: True Negative
Correct predictions: TP, TN
Incorrect predictions: FP, FN
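
A minimal sketch of how the four cells are counted from actual and predicted labels. The label lists are made-up examples, and the `positive=1` default is an assumption about how the positive class is encoded.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FN, FP, TN) for binary labels."""
    tp = fn = fp = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            tp += 1   # correctly flagged positive
        elif actual == positive:
            fn += 1   # missed positive (Type II error)
        elif predicted == positive:
            fp += 1   # false alarm (Type I error)
        else:
            tn += 1   # correctly rejected negative
    return tp, fn, fp, tn

# Hypothetical labels for eight samples
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
```

The return order follows the matrix layout row by row: actual-positive row first, then actual-negative row.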
Linear regression vs logistic regression
Aspect | Linear Regression | Logistic Regression | Key Difference
Output | Continuous (e.g., price) | Probability → class label | LR predicts a number; LogR predicts a category
Function | Y = β₀ + β₁X (line) | σ(z) = 1/(1+e⁻ᶻ) (curve) | Straight line vs S-shaped sigmoid curve
Range | −∞ to +∞ | 0 to 1 | LogR output is bounded, so it can be read as a probability
Loss Function | MSE / Least Squares | Log Loss / Cross-Entropy | Least squares has a closed-form solution; log loss is minimized iteratively
Use Case | Regression tasks | Classification tasks | Choose based on whether target is continuous or categorical
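
To illustrate the loss-function row, the sketch below fits a one-feature logistic regression by plain batch gradient descent on log loss, which is equivalent to maximizing the likelihood. The toy data, learning rate, and epoch count are assumptions chosen for illustration; libraries typically use faster solvers.

```python
import math

def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def log_loss(y_true, probs, eps=1e-12):
    """Mean cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, probs):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def fit(xs, ys, lr=0.1, epochs=2000):
    """Batch gradient descent on log loss for the model sigmoid(b0 + b1 * x)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y   # gradient of log loss w.r.t. z
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Hypothetical toy data: hours studied vs pass (1) / fail (0).
# The classes overlap, so the maximum-likelihood estimate is finite.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 1, 0, 1, 1, 1]

initial_loss = log_loss(passed, [0.5] * len(hours))   # all coefficients at 0
b0, b1 = fit(hours, passed)
final_loss = log_loss(passed, [sigmoid(b0 + b1 * x) for x in hours])
```

Unlike least squares, there is no closed-form formula for the coefficients here, so the loop keeps stepping downhill on the loss.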
