
Mann-Whitney U Test Calculator | Non-Parametric Two-Sample Test

Perform the Mann-Whitney U test (Wilcoxon rank-sum test) on two independent samples. Computes the U statistic, a z-score approximation with continuity correction, an exact p-value for small samples, and the effect size r = |z|/√N, where N = n₁+n₂.


What Is the Mann-Whitney U Test?

The Mann-Whitney U test (also called the Wilcoxon rank-sum test) is a non-parametric alternative to the independent-samples t-test. It tests whether one group tends to have larger values than the other without assuming normality. It is based entirely on ranks, making it robust to outliers and skewed distributions.

The U statistic counts the number of times an observation from group 1 exceeds an observation from group 2 across all n₁×n₂ pairs, with ties counted as 0.5. For larger samples (more than 8 values per group), a z-approximation with continuity correction and tie adjustment is used. The effect size r = |z|/√N is a correlation-type measure, interpreted with Cohen's benchmarks for r.

Formula

Rank all N = n₁+n₂ values together (average ranks for ties)

U₁ = R₁ − n₁(n₁+1)/2  |  U₂ = n₁n₂ − U₁  |  U = min(U₁, U₂)

z = (U − n₁n₂/2 + 0.5) / √Var(U)  |  Var with tie correction: n₁n₂/12 · [(N+1) − Σ(t³−t)/(N(N−1))]

Effect size: r = |z| / √N
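
The formulas above can be sketched in Python as follows. This is a minimal illustration, not the calculator's actual implementation: the function name mann_whitney_u, the returned dictionary keys, and the sample data are hypothetical, and only the normal-approximation p-value is computed (no exact p-value for small samples). It assumes the values are not all identical, since the variance would then be zero.

```python
import math
from collections import Counter


def mann_whitney_u(group1, group2):
    """Return U1, U2, U, z, a two-sided p-value and r (normal approximation)."""
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    combined = sorted(list(group1) + list(group2))

    # Average rank per distinct value: a tie group that starts at 1-based
    # position p and has c members occupies ranks p .. p+c-1.
    counts = Counter(combined)
    first_pos = {}
    for position, value in enumerate(combined, start=1):
        first_pos.setdefault(value, position)
    avg_rank = {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}

    r1 = sum(avg_rank[v] for v in group1)   # rank sum of group 1
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    # Variance of U with the tie correction (t = size of each tie group).
    tie_sum = sum(t**3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_sum / (n * (n - 1)))

    # Continuity correction of +0.5; clamped at 0 so that z is never
    # positive when U sits exactly at its mean (an assumption of this sketch).
    z = min(0.0, u - n1 * n2 / 2 + 0.5) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    r = abs(z) / math.sqrt(n)
    return {"U1": u1, "U2": u2, "U": u, "z": z, "p": p_two_sided, "r": r}


# Hypothetical data, chosen only for illustration.
result = mann_whitney_u([3, 5, 6, 8], [7, 9, 10, 12, 14])
print(result)
```

For this hypothetical data, group 1 holds ranks 1, 2, 3 and 5, so R₁ = 11, U₁ = 11 − 10 = 1 and U₂ = 20 − 1 = 19.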

How to Use

  1. Enter values for Group 1 in the left textarea (comma or newline separated).
  2. Enter values for Group 2 in the right textarea.
  3. Each group needs at least 2 values; more than 8 values per group activates the z-approximation.
  4. Click Run U Test to compute all statistics.
  5. Read U, U₁, U₂, z-score, and p-value from the result cards.
  6. Check the effect size r: values above 0.3 indicate a medium effect.
  7. Read the interpretation summary for a plain-language conclusion.

Enter values for each group in the text areas — one value per line or comma-separated. Click Run U Test to compute the statistic, p-value, and effect size.

Example Calculation

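Example 1: Treatment vs control (n=8 each). Group 1: 12, 18, 23, 15, 27, 19, 14, 22. Group 2: 25, 31, 28, 35, 22, 29, 33, 30. After ranking all 16 values (the two values of 22 share the average rank 6.5), R₁ = 39.5, so U₁ = 39.5 − 36 = 3.5, U₂ = 64 − 3.5 = 60.5, and U = 3.5. Using the normal approximation with continuity and tie corrections, z ≈ −2.94 and the two-sided p ≈ 0.003. Group 2 has significantly higher values.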

Example 2: Tied values. Group 1: 5, 7, 7, 9. Group 2: 6, 7, 8, 10. The three values of 7 occupy rank positions 3, 4, and 5, so each receives the average rank (3+4+5)/3 = 4. The tie correction reduces the variance slightly (from 12 to about 11.43 here), which makes |z| slightly larger and the p-value slightly smaller than with the uncorrected formula.
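
The average-rank step in the tied-values example can be checked with a short Python sketch. The helper name average_ranks is hypothetical; the data are the two groups from Example 2.

```python
def average_ranks(values):
    """Map each distinct value to its 1-based rank, averaging within ties."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        # Advance j past every value equal to ordered[i].
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # The tie group occupies 1-based positions i+1 .. j; use their mean.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks


group1 = [5, 7, 7, 9]
group2 = [6, 7, 8, 10]
ranks = average_ranks(group1 + group2)
r1 = sum(ranks[v] for v in group1)          # rank sum of Group 1
u1 = r1 - len(group1) * (len(group1) + 1) / 2
print(ranks[7], r1, u1)
```

Here every 7 receives rank 4, so R₁ = 1 + 4 + 4 + 7 = 16 and U₁ = 16 − 10 = 6.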

Understanding the Mann-Whitney U Test

Mann-Whitney vs T-Test Comparison

Property | Mann-Whitney U | Independent t-test
Distribution assumption | None (non-parametric) | Normal (parametric)
Scale of measurement | Ordinal or continuous | Continuous (interval/ratio)
Sensitive to outliers | Resistant | Sensitive
What it tests | Stochastic dominance / median shift | Difference in means
Relative efficiency | ~95% of t-test power (large n) | 100% under normality
Handles ties | Yes (with correction) | N/A

Effect Size r Reference

r value | Cohen classification | Practical meaning
< 0.10 | Negligible | Trivial, likely no practical importance
0.10 – 0.30 | Small | Detectable but modest group difference
0.30 – 0.50 | Medium | Noticeable and practically meaningful
> 0.50 | Large | Substantial, easily observed difference

Reporting Guidelines

  • Report both U statistic and sample sizes: "U(n₁=10, n₂=12) = 34, p = 0.023."
  • Always report effect size r alongside p-value for meaningful interpretation.
  • Note whether exact or approximate p-values were used, especially for small samples.
  • A significant p-value does not confirm a meaningful difference — small p with negligible r is common in large samples.
  • To report a location-shift estimate with a confidence interval, use the Hodges-Lehmann estimator (median of all pairwise differences between the two groups).
  • When both groups are approximately normal with similar variance, prefer the t-test for higher statistical power.
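
As a rough illustration of the Hodges-Lehmann estimator mentioned in the guidelines above, the following Python sketch computes the point estimate only (no confidence interval). The function name hodges_lehmann, the direction of the differences (Group 1 minus Group 2), and the sample data are assumptions made for this example.

```python
from statistics import median


def hodges_lehmann(group1, group2):
    """Median of all n1*n2 pairwise differences x - y (x in group1, y in group2)."""
    diffs = [x - y for x in group1 for y in group2]
    return median(diffs)


# Hypothetical data, chosen only for illustration.
shift = hodges_lehmann([1, 4, 6], [2, 3, 5])
print(shift)
```

For this data the nine sorted differences are −4, −2, −1, −1, 1, 1, 2, 3, 4, so the estimated shift is 1.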

Frequently Asked Questions

When should I use Mann-Whitney instead of a t-test?

Use Mann-Whitney when your data violates the normality assumption required by the t-test, or when you have ordinal data, heavy outliers, or small samples where normality cannot be verified. With large samples from normal distributions, both tests give virtually identical results. Mann-Whitney tests whether values in one group tend to be larger; it does not test equality of means.

What does the U statistic actually mean?

U₁ counts the number of times a Group 1 observation exceeds a Group 2 observation across all n₁×n₂ pairs (counting ties as 0.5). The reported U = min(U₁, U₂) ranges from 0 (one group dominates entirely) to n₁n₂/2 (complete overlap). U = 0 indicates perfect separation; n₁n₂/2 is the expected value of both U₁ and U₂ under the null hypothesis.
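
The pair-counting definition can be written directly in Python. This is a small sketch with a hypothetical function name, using the data from the tied-values example (Group 1: 5, 7, 7, 9; Group 2: 6, 7, 8, 10); it gives the same U₁ as the rank-sum formula U₁ = R₁ − n₁(n₁+1)/2.

```python
def u1_by_pair_counting(group1, group2):
    """Count pairs (x, y) with x > y, scoring each tie as 0.5."""
    u1 = 0.0
    for x in group1:
        for y in group2:
            if x > y:
                u1 += 1.0
            elif x == y:
                u1 += 0.5
    return u1


group1 = [5, 7, 7, 9]
group2 = [6, 7, 8, 10]
u1 = u1_by_pair_counting(group1, group2)
u2 = len(group1) * len(group2) - u1
print(u1, u2)
```

The value 5 beats nothing, each 7 beats 6 and ties one 7 (1.5 each), and 9 beats 6, 7, and 8, so U₁ = 0 + 1.5 + 1.5 + 3 = 6 and U₂ = 16 − 6 = 10.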

Is the Mann-Whitney test the same as testing medians?

Not exactly. Mann-Whitney tests whether P(X₁ > X₂) = 0.5, i.e., stochastic equality. It is a test of medians only if both distributions have the same shape and spread. If the groups differ in shape or variance, a significant U test may reflect a difference in spread rather than centre. The test is more accurately described as testing stochastic dominance.

What is effect size r in this context?

r = |z|/√N is an effect size measure bounded roughly between 0 and 1. Using Cohen's guidelines adapted for rank tests: r < 0.1 = negligible; 0.1–0.3 = small; 0.3–0.5 = medium; > 0.5 = large. It is directly comparable to the Pearson correlation coefficient in magnitude, making cross-study comparison straightforward.
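
A minimal Python sketch of the effect-size calculation and the classification bands quoted above. The function names are hypothetical, and the handling of values that fall exactly on a boundary (0.10, 0.30, 0.50) is an assumption, since the guidelines do not say which band they belong to.

```python
import math


def effect_size_r(z, n_total):
    """r = |z| / sqrt(N), where N is the combined sample size."""
    return abs(z) / math.sqrt(n_total)


def classify_r(r):
    """Label r using the bands above; boundary values go to the higher band."""
    if r < 0.10:
        return "negligible"
    if r < 0.30:
        return "small"
    if r < 0.50:
        return "medium"
    return "large"


# Hypothetical inputs: z = -2.0 from a test with N = 25 observations in total.
r = effect_size_r(-2.0, 25)
print(r, classify_r(r))
```

With z = −2.0 and N = 25, r = 2.0/5 = 0.4, which falls in the medium band.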

How does the tie correction work?

Ties reduce the variance of U because tied observations share the same average rank, making the distribution of U more concentrated. The correction subtracts n₁n₂·Σ(t³−t)/(12·N(N−1)) from the untied variance n₁n₂(N+1)/12, where t is the size of each tie group. This gives a smaller z-statistic denominator, so |z| is slightly larger and the p-value slightly smaller than if ties were ignored; ignoring ties is the more conservative choice.
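
The size of the tie adjustment can be seen by computing the variance of U with and without it. The sketch below uses the variance formula from the Formula section; the function name variance_of_u and its tie_correction flag are hypothetical, and the data are the two groups from the tied-values example.

```python
from collections import Counter


def variance_of_u(group1, group2, tie_correction=True):
    """Var(U) = n1*n2/12 * [(N+1) - sum(t^3 - t) / (N*(N-1))]."""
    n1, n2 = len(group1), len(group2)
    n = n1 + n2
    tie_sum = 0
    if tie_correction:
        # t is the number of observations sharing each distinct value;
        # untied values have t = 1 and contribute 0.
        tie_counts = Counter(list(group1) + list(group2)).values()
        tie_sum = sum(t**3 - t for t in tie_counts)
    return n1 * n2 / 12 * ((n + 1) - tie_sum / (n * (n - 1)))


group1 = [5, 7, 7, 9]
group2 = [6, 7, 8, 10]
uncorrected = variance_of_u(group1, group2, tie_correction=False)
corrected = variance_of_u(group1, group2)
print(uncorrected, corrected)
```

With one tie group of size 3, Σ(t³−t) = 24, so the variance drops from 16·9/12 = 12 to 16/12·(9 − 24/56) = 80/7 ≈ 11.43.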
