Logistic Regression Calculator | Log-Odds, Odds Ratios & ROC AUC
Fit a binary logistic regression model using iterative gradient descent. Computes log-likelihood, coefficients, standard errors, odds ratios, predicted probabilities, and ROC AUC. Includes confusion matrix with accuracy, precision, recall, and F1 score.
What Is the Logistic Regression Calculator | Log-Odds, Odds Ratios & ROC AUC?
Logistic regression models the probability of a binary outcome using the sigmoid function to constrain predictions to [0,1]. Instead of minimising squared errors like OLS, it maximises the log-likelihood of the observed binary labels via gradient descent.
Coefficients are expressed as log-odds; exp(β) gives the odds ratio, representing how the odds of outcome=1 multiply for a one-unit increase in the predictor. The ROC AUC measures the model's ability to discriminate between classes across all thresholds (0.5 = random, 1.0 = perfect).
Formula
Sigmoid: σ(z) = 1 / (1 + e⁻ᶻ) where z = β₀ + β₁x₁ + β₂x₂
Log-likelihood: ℓ(β) = Σ[yᵢ·log(σ(zᵢ)) + (1−yᵢ)·log(1−σ(zᵢ))]
Odds ratio = exp(β) | SE via Fisher information: I = XᵀWX, W = diag(p̂(1−p̂))
AUC (trapezoidal rule over sorted probabilities)
How to Use
- 1
Choose the number of predictors (1 or 2) from the dropdown.
- 2
Enter data in the textarea — one row per observation with y (0 or 1) first.
- 3
Use the preset (study hours vs pass/fail) as a reference format.
- 4
Click Fit Model to run gradient descent optimisation.
- 5
Read the classification metrics: accuracy, precision, recall, F1, and AUC.
- 6
Check the Coefficient Table for β (log-odds), exp(β) (odds ratio), and p-values.
- 7
Review the Confusion Matrix to see true/false positive and negative counts at threshold 0.5.
Example Calculation
Example 1 — Pass/fail prediction: With y=pass(1)/fail(0), x₁=hours studied, x₂=prior GPA: β₁=0.8 means each extra study hour multiplies the odds of passing by e^0.8 ≈ 2.23. AUC=0.91 indicates excellent discrimination.
Example 2 — Medical diagnosis: y=disease(1)/healthy(0), x₁=biomarker level. β₁=1.2, exp(β₁)=3.32: each unit increase in the biomarker triples the odds of disease. Precision=0.80 means 80% of predicted-positive patients truly have the disease.
Understanding Logistic Regression | Log-Odds, Odds Ratios & ROC AUC
Classification Metrics Reference
| Metric | Formula | Interpretation |
|---|---|---|
| Accuracy | (TP + TN) / N | Overall fraction correct; misleading for imbalanced classes |
| Precision | TP / (TP + FP) | Of predicted positives, what fraction are truly positive |
| Recall (Sensitivity) | TP / (TP + FN) | Of actual positives, what fraction were correctly identified |
| F1 Score | 2·P·R / (P + R) | Harmonic mean of precision and recall; best for imbalanced data |
| Specificity | TN / (TN + FP) | Of actual negatives, what fraction were correctly identified |
AUC Benchmark Table
| AUC Range | Discrimination | Typical Use |
|---|---|---|
| 0.50 | No discrimination (random) | Baseline |
| 0.50 – 0.70 | Poor | Model needs improvement |
| 0.70 – 0.80 | Acceptable | Useful in many applications |
| 0.80 – 0.90 | Excellent | Good for clinical / business use |
| 0.90 – 1.00 | Outstanding | Exceptional, check for data leakage |
Key Logistic Regression Concepts
- ▸Log-odds (logit): log(p/(1-p)) = β₀ + β₁x₁ + … — the linear predictor on the logit scale.
- ▸Separation problem: if classes are perfectly separable, MLE coefficients diverge to ±∞; gradient descent will converge very slowly or not at all.
- ▸Class imbalance: with highly unequal class proportions, accuracy is misleading — prefer F1, AUC, or precision-recall curves.
- ▸Regularisation (L1/L2) is not included in this calculator but is crucial for high-dimensional data to prevent overfitting.
- ▸McFadden pseudo-R² = 1 − ℓ(full)/ℓ(null) is a common goodness-of-fit measure for logistic models.
- ▸The decision threshold of 0.5 is not always optimal — lower it to increase recall; raise it to increase precision.
Frequently Asked Questions
What is an odds ratio and how do I interpret it?
What is ROC AUC and what values are good?
Why use gradient descent instead of the closed-form solution?
What does the confusion matrix show?
When should I use logistic regression vs linear regression?
You Might Also Like
Explore 360+ Free Calculators
From math and science to finance and everyday life — all free, no account needed.