The Kruskal-Wallis test extends the Wilcoxon rank-sum test to more than two groups. When data are not normally distributed or sample sizes are small, traditional ANOVA assumptions break down. This nonparametric test ranks the pooled observations and evaluates whether the group rank sums differ more than would be expected by chance. It is particularly helpful when comparing medians across three or more independent samples.
Suppose you have $k$ groups with $n_i$ observations each. Combine all observations and assign ranks from $1$ to $N$, where $N = \sum_{i=1}^{k} n_i$ is the total sample size. If there are ties, assign average ranks to the tied values. Next, compute $R_i$, the sum of ranks within each group. The test statistic is then

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1).$$
Under the null hypothesis that all groups share the same median, $H$ approximately follows a chi-squared distribution with $k - 1$ degrees of freedom when sample sizes are moderate.
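To make the formula concrete, here is a minimal Python sketch (an illustration, not any particular package's implementation) that computes $H$ once the per-group rank sums are known:

```python
def kruskal_wallis_h(rank_sums, group_sizes):
    """H statistic from per-group rank sums R_i and group sizes n_i."""
    N = sum(group_sizes)  # total sample size
    sum_term = sum(R * R / n for R, n in zip(rank_sums, group_sizes))
    return 12.0 / (N * (N + 1)) * sum_term - 3 * (N + 1)
```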
The Kruskal-Wallis test assumes independent samples and that the observations are at least ordinal. It does not require normality or equal variances. Researchers often turn to it when dealing with small sample sizes or skewed distributions, such as comparing reaction times under different conditions or measuring the effectiveness of multiple treatments. Although it tests for differences in central tendency, it does not specify which groups differ; post hoc comparisons are needed for that.
Our calculator first converts each line of input into numeric arrays. It then concatenates all groups, sorts them, and assigns ranks while averaging ties. After summing ranks within each group, it computes $H$ using the formula above. The p-value is obtained from the chi-squared distribution with $k - 1$ degrees of freedom. A small p-value indicates at least one group median differs significantly from the others.
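A sketch of that pipeline might look like the following. This is an illustrative reimplementation under the assumptions described above (rank averaging for ties, no tie correction), not the calculator's actual source; it uses scipy only for the chi-squared tail probability:

```python
from scipy.stats import chi2

def kruskal_wallis(groups):
    """Kruskal-Wallis test on a list of numeric sequences; returns (H, p).

    Ties receive the average of the ranks they span. No tie correction
    is applied, matching the description in the text.
    """
    # Pool all observations, remembering which group each came from.
    pooled = sorted((x, g) for g, grp in enumerate(groups) for x in grp)
    N = len(pooled)

    # Assign ranks 1..N, averaging over each run of tied values.
    rank_sums = [0.0] * len(groups)
    i = 0
    while i < N:
        j = i
        while j < N and pooled[j][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        for _, g in pooled[i:j]:
            rank_sums[g] += avg_rank
        i = j

    H = 12.0 / (N * (N + 1)) * sum(
        R * R / len(grp) for R, grp in zip(rank_sums, groups)
    ) - 3 * (N + 1)
    p = chi2.sf(H, df=len(groups) - 1)  # upper-tail chi-squared probability
    return H, p
```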
If the p-value falls below your chosen significance level, you reject the null hypothesis that all group medians are equal. However, the test does not reveal which particular groups are different. For that, you might perform pairwise Mann-Whitney tests with a correction for multiple comparisons. Remember that nonparametric tests often have less power than their parametric counterparts, so consider the context and your data characteristics carefully.
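One common follow-up is sketched below with a Bonferroni correction; this is just one reasonable choice, and corrections such as Holm's are often preferred for power:

```python
from itertools import combinations
from scipy.stats import mannwhitneyu

def pairwise_mann_whitney(groups, alpha=0.05):
    """All pairwise two-sided Mann-Whitney tests, Bonferroni-corrected."""
    pairs = list(combinations(range(len(groups)), 2))
    adjusted_alpha = alpha / len(pairs)  # divide alpha by the number of comparisons
    results = []
    for i, j in pairs:
        _, p = mannwhitneyu(groups[i], groups[j], alternative="two-sided")
        results.append((i, j, p, p < adjusted_alpha))
    return results
```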
The presence of ties slightly alters the distribution of $H$. Many software packages apply a tie correction factor. Our simple implementation ignores this adjustment for clarity, but you should be aware that large numbers of ties may affect the p-value. In practice, the impact is usually minor unless ties are pervasive.
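For reference, the standard correction divides $H$ by $C = 1 - \sum_j (t_j^3 - t_j) / (N^3 - N)$, where $t_j$ is the size of the $j$-th run of tied values. A minimal sketch of that factor:

```python
from collections import Counter

def tie_correction(pooled_values):
    """Tie correction factor C; the corrected statistic is H / C."""
    N = len(pooled_values)
    tie_counts = Counter(pooled_values).values()  # t_j for each distinct value
    return 1.0 - sum(t**3 - t for t in tie_counts) / (N**3 - N)
```

When all values are distinct, every $t_j = 1$ and $C = 1$, so the correction has no effect.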
Imagine testing the effectiveness of three diets on weight loss. You record the weight change for participants in each group. After ranking all observations and summing ranks per diet, you compute $H$ and a corresponding p-value. If the p-value is below 0.05, you conclude that at least one diet leads to a different median weight change. This approach is robust even if weight changes are not normally distributed or if sample sizes differ slightly among groups.
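Using the kruskal_wallis sketch above, such an analysis might run as follows; the weight changes here are purely illustrative made-up numbers:

```python
# Hypothetical weight changes (kg) for three diets; values are illustrative only.
diet_a = [-2.1, -3.4, -1.0, -2.8, -0.5]
diet_b = [-4.0, -3.9, -5.1, -2.2, -4.4]
diet_c = [-1.2, -0.3, -2.0, -1.8, 0.4]

H, p = kruskal_wallis([diet_a, diet_b, diet_c])
if p < 0.05:
    print(f"H = {H:.2f}, p = {p:.4f}: at least one diet's median differs")
else:
    print(f"H = {H:.2f}, p = {p:.4f}: no significant difference detected")
```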
William Kruskal and W. Allen Wallis introduced this test in 1952 as a nonparametric alternative to one-way ANOVA. It quickly became a staple in statistics due to its simplicity and minimal assumptions. Understanding its derivation helps illustrate the power of ranking methods when traditional parametric techniques are unsuitable.
Try varying the number of groups or sample sizes to see how the test statistic changes. Investigate the effect of outliers by adding extreme values. Because the procedure is based on ranks, a single outlier does not heavily influence the result, which can be advantageous in messy real-world data.
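A quick way to see this rank-based robustness is to compare the statistic before and after injecting an outlier, here using scipy.stats.kruskal on arbitrary simulated data:

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(42)
groups = [list(rng.normal(loc, 1.0, size=10)) for loc in (0.0, 0.5, 1.0)]

h_before, p_before = kruskal(*groups)
groups[0].append(50.0)  # add one extreme outlier to the first group
h_after, p_after = kruskal(*groups)

# The outlier occupies only one rank (the largest), so H shifts modestly
# instead of being dominated by the extreme value.
print(f"before: H={h_before:.2f}, p={p_before:.4f}")
print(f"after:  H={h_after:.2f}, p={p_after:.4f}")
```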