DigitHelm
Statistics & Probability

Logistic Regression Calculator | Log-Odds, Odds Ratios & ROC AUC

Fit a binary logistic regression model using iterative gradient descent. Computes log-likelihood, coefficients, standard errors, odds ratios, predicted probabilities, and ROC AUC. Includes confusion matrix with accuracy, precision, recall, and F1 score.

Instant Results100% FreeAny DeviceNo Sign-up
Each row: y(0/1) x₁ x₂

What Is the Logistic Regression Calculator | Log-Odds, Odds Ratios & ROC AUC?

Logistic regression models the probability of a binary outcome using the sigmoid function to constrain predictions to [0,1]. Instead of minimising squared errors like OLS, it maximises the log-likelihood of the observed binary labels via gradient descent.

Coefficients are expressed as log-odds; exp(β) gives the odds ratio, representing how the odds of outcome=1 multiply for a one-unit increase in the predictor. The ROC AUC measures the model's ability to discriminate between classes across all thresholds (0.5 = random, 1.0 = perfect).

Formula

Sigmoid: σ(z) = 1 / (1 + e⁻ᶻ) where z = β₀ + β₁x₁ + β₂x₂

Log-likelihood: ℓ(β) = Σ[yᵢ·log(σ(zᵢ)) + (1−yᵢ)·log(1−σ(zᵢ))]

Odds ratio = exp(β)  |  SE via Fisher information: I = XᵀWX, W = diag(p̂(1−p̂))

AUC (trapezoidal rule over sorted probabilities)

How to Use

  1. 1

    Choose the number of predictors (1 or 2) from the dropdown.

  2. 2

    Enter data in the textarea — one row per observation with y (0 or 1) first.

  3. 3

    Use the preset (study hours vs pass/fail) as a reference format.

  4. 4

    Click Fit Model to run gradient descent optimisation.

  5. 5

    Read the classification metrics: accuracy, precision, recall, F1, and AUC.

  6. 6

    Check the Coefficient Table for β (log-odds), exp(β) (odds ratio), and p-values.

  7. 7

    Review the Confusion Matrix to see true/false positive and negative counts at threshold 0.5.

Enter one observation per row: first column is y (0 or 1), followed by 1–2 predictor values. Select predictor count, then click Fit Model.

Example Calculation

Example 1 — Pass/fail prediction: With y=pass(1)/fail(0), x₁=hours studied, x₂=prior GPA: β₁=0.8 means each extra study hour multiplies the odds of passing by e^0.8 ≈ 2.23. AUC=0.91 indicates excellent discrimination.

Example 2 — Medical diagnosis: y=disease(1)/healthy(0), x₁=biomarker level. β₁=1.2, exp(β₁)=3.32: each unit increase in the biomarker triples the odds of disease. Precision=0.80 means 80% of predicted-positive patients truly have the disease.

Understanding Logistic Regression | Log-Odds, Odds Ratios & ROC AUC

Classification Metrics Reference

MetricFormulaInterpretation
Accuracy(TP + TN) / NOverall fraction correct; misleading for imbalanced classes
PrecisionTP / (TP + FP)Of predicted positives, what fraction are truly positive
Recall (Sensitivity)TP / (TP + FN)Of actual positives, what fraction were correctly identified
F1 Score2·P·R / (P + R)Harmonic mean of precision and recall; best for imbalanced data
SpecificityTN / (TN + FP)Of actual negatives, what fraction were correctly identified

AUC Benchmark Table

AUC RangeDiscriminationTypical Use
0.50No discrimination (random)Baseline
0.50 – 0.70PoorModel needs improvement
0.70 – 0.80AcceptableUseful in many applications
0.80 – 0.90ExcellentGood for clinical / business use
0.90 – 1.00OutstandingExceptional, check for data leakage

Key Logistic Regression Concepts

  • Log-odds (logit): log(p/(1-p)) = β₀ + β₁x₁ + … — the linear predictor on the logit scale.
  • Separation problem: if classes are perfectly separable, MLE coefficients diverge to ±∞; gradient descent will converge very slowly or not at all.
  • Class imbalance: with highly unequal class proportions, accuracy is misleading — prefer F1, AUC, or precision-recall curves.
  • Regularisation (L1/L2) is not included in this calculator but is crucial for high-dimensional data to prevent overfitting.
  • McFadden pseudo-R² = 1 − ℓ(full)/ℓ(null) is a common goodness-of-fit measure for logistic models.
  • The decision threshold of 0.5 is not always optimal — lower it to increase recall; raise it to increase precision.

Frequently Asked Questions

What is an odds ratio and how do I interpret it?

The odds ratio exp(β) describes how the odds of the outcome being 1 change for a one-unit increase in a predictor. OR > 1 means increased odds; OR < 1 means decreased odds. For example, OR=2.5 means the outcome is 2.5 times as likely (in odds terms) per unit increase in that predictor, holding others constant.

What is ROC AUC and what values are good?

AUC (Area Under the ROC Curve) measures classification ability across all thresholds. AUC=0.5 is random chance; 0.7–0.8 is acceptable; 0.8–0.9 is excellent; above 0.9 is outstanding. High AUC does not mean the model is well-calibrated — it only measures discrimination.

Why use gradient descent instead of the closed-form solution?

Logistic regression has no closed-form solution for maximum likelihood estimation. Gradient descent iteratively updates the coefficient vector in the direction that increases the log-likelihood. With lr=0.1 and up to 1000 iterations, convergence is typically achieved within 50–200 steps for well-scaled data.

What does the confusion matrix show?

The 2×2 confusion matrix shows counts of: True Positives (TP, correctly predicted 1), True Negatives (TN), False Positives (FP, predicted 1 but actually 0), and False Negatives (FN, predicted 0 but actually 1). Precision = TP/(TP+FP) and Recall = TP/(TP+FN) reveal trade-offs important for imbalanced classes.

When should I use logistic regression vs linear regression?

Use logistic regression when your outcome is binary (0/1, yes/no, pass/fail). Linear regression on a binary outcome can produce probabilities outside [0,1] and violates homoscedasticity. Logistic regression is the standard baseline for binary classification before exploring more complex models like decision trees or neural networks.

You Might Also Like

Explore 360+ Free Calculators

From math and science to finance and everyday life — all free, no account needed.