Kolmogorov-Smirnov Test Examples

Explore practical examples of the Kolmogorov-Smirnov Test to understand how it's applied in various contexts.
By Jamie

Understanding the Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov Test (K-S Test) is a non-parametric statistical test that compares distributions. In its one-sample form it assesses whether a sample follows a fully specified distribution; in its two-sample form it assesses whether two samples come from the same distribution. In both cases the test statistic is the largest vertical gap between the cumulative distribution functions being compared. This test is particularly useful when the data do not meet the assumptions required for parametric tests, such as normality. Below are three practical examples of the Kolmogorov-Smirnov Test in action.

Example 1: Comparing Two Sample Distributions in Clinical Trials

Context

In clinical trials, researchers often need to compare the efficacy of two different treatments. The Kolmogorov-Smirnov Test can be used to determine if the outcomes from two treatment groups differ significantly.

Two groups of patients were treated with Drug A and Drug B, and their recovery times (in days) were recorded. The objective is to check if the recovery times for both treatments come from the same distribution.

Actual Example

  • Group A Recovery Times: [5, 6, 7, 8, 9, 10]
  • Group B Recovery Times: [4, 5, 6, 7, 8, 10]

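Using the K-S Test, we calculate the empirical cumulative distribution function (ECDF) of each group. Each group has six observations, so each ECDF rises in steps of 1/6, and the maximum difference between the two ECDFs is 1/6 ≈ 0.167. The critical value for a significance level of 0.05 is determined from K-S distribution tables; the large-sample approximation 1.36 × sqrt((n + m) / (n × m)) gives roughly 0.78 for n = m = 6. If the maximum difference exceeds this critical value, we reject the null hypothesis that both groups come from the same distribution. Here 0.167 is far below the critical value, so we fail to reject it.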
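The calculation above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not a reference implementation: the helper names `ecdf` and `ks_two_sample` are made up for this example, and the critical value uses the large-sample approximation with the usual 5% coefficient of 1.358, which is only rough for groups this small. In practice an exact table or a library routine such as `scipy.stats.ks_2samp` would normally be used.

```python
from bisect import bisect_right
from math import sqrt


def ecdf(sample):
    """Return a function giving the fraction of `sample` that is <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def ks_two_sample(a, b):
    """Two-sample K-S statistic: the largest gap between the two ECDFs."""
    f_a, f_b = ecdf(a), ecdf(b)
    # Both ECDFs are step functions that only change at observed values,
    # so it is enough to check the gap at every observed value.
    return max(abs(f_a(x) - f_b(x)) for x in set(a) | set(b))


group_a = [5, 6, 7, 8, 9, 10]  # recovery times in days, Drug A
group_b = [4, 5, 6, 7, 8, 10]  # recovery times in days, Drug B

d_stat = ks_two_sample(group_a, group_b)

# Assumption: large-sample approximation to the 5% critical value.
n, m = len(group_a), len(group_b)
critical_05 = 1.358 * sqrt((n + m) / (n * m))

print(f"D = {d_stat:.3f}, approximate 5% critical value = {critical_05:.3f}")
print("reject H0" if d_stat > critical_05 else "fail to reject H0")
```

For these two groups the statistic is 1/6, well under the approximate critical value, so the sketch reports that the null hypothesis is not rejected.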

Notes

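  • This test is particularly beneficial when the underlying distribution is unknown, because it makes no assumption about its shape. With samples as small as these, however, it has little power to detect a real difference.
  • The one-sample variant of the K-S Test compares a single sample against a fully specified theoretical distribution.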

Example 2: Testing Normality of Data in Quality Control

Context

In a manufacturing setting, quality control often requires assessing whether a set of measurements (e.g., lengths of components) follows a specified normal distribution. The Kolmogorov-Smirnov Test can help determine if the observed data fit that distribution.

A company measures the length of 30 widgets produced in a day and wishes to determine if the lengths follow a normal distribution with a mean of 100 mm and a standard deviation of 5 mm.

Actual Example

  • Measured Lengths: [99, 101, 98, 100, 102, 97, 99, 103, 100, 98, 101, 102, 99, 100, 101, 102, 98, 97, 100, 99, 100, 101, 100, 99, 102, 98, 97, 100, 101, 99]

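After calculating the ECDF of the measured lengths and the cumulative distribution function of the hypothesized normal distribution (mean 100 mm, standard deviation 5 mm), the Kolmogorov-Smirnov statistic is computed as the largest absolute gap between the two. If this statistic is less than the critical value of the K-S distribution at the chosen significance level, we fail to reject the hypothesis that the lengths follow the specified distribution; this is an absence of evidence against it, not proof of normality. The standard critical values apply here because the mean and standard deviation are fixed in advance. If they were estimated from the same data, a corrected procedure such as the Lilliefors test would be needed.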
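As a rough sketch of that computation, the snippet below evaluates the one-sample statistic for the 30 measured lengths against a normal distribution with mean 100 and standard deviation 5, using only the Python standard library. The function name `ks_one_sample` is invented for this illustration, and the 5% critical value again relies on the large-sample approximation 1.358 / sqrt(n), which is an assumption rather than a table lookup; `scipy.stats.kstest` is the usual library alternative.

```python
from math import sqrt
from statistics import NormalDist

lengths = [99, 101, 98, 100, 102, 97, 99, 103, 100, 98,
           101, 102, 99, 100, 101, 102, 98, 97, 100, 99,
           100, 101, 100, 99, 102, 98, 97, 100, 101, 99]


def ks_one_sample(sample, cdf):
    """One-sample K-S statistic against a fully specified CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    # Gap just after each observation (ECDF above the reference CDF).
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(ordered))
    # Gap just before each observation (reference CDF above the ECDF).
    d_minus = max(cdf(x) - i / n for i, x in enumerate(ordered))
    return max(d_plus, d_minus)


# Hypothesized distribution from the example: mean 100 mm, SD 5 mm.
hypothesized = NormalDist(mu=100, sigma=5)
d_stat = ks_one_sample(lengths, hypothesized.cdf)

# Assumption: large-sample approximation to the 5% critical value.
critical_05 = 1.358 / sqrt(len(lengths))

print(f"D = {d_stat:.3f}, approximate 5% critical value = {critical_05:.3f}")
print("reject H0" if d_stat > critical_05 else "fail to reject H0")
```

With these measurements the statistic comes out near 0.31, above the approximate cutoff of about 0.25, so the sketch rejects the specified distribution. The lengths are centered close to 100 mm but are spread much more tightly than a standard deviation of 5 mm would imply, which is a reminder that the test checks the whole specified distribution, not just its bell shape.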

Notes

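  • This application of the K-S Test helps confirm that a manufacturing process still produces the distribution it was designed for. It complements control limits rather than replacing them.
  • Care is needed if the data contain outliers or many tied values (as rounded measurements do), since the test assumes a continuous distribution and such features can affect the results.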

Example 3: Evaluating Survey Responses Across Regions

Context

Market researchers often conduct surveys to understand consumer preferences across different geographical regions. The Kolmogorov-Smirnov Test can be used to compare survey responses from two regions to see if their preferences are statistically different.

Consider a survey conducted in Region X and Region Y, where respondents rated their satisfaction with a product on a scale from 1 to 10.

Actual Example

  • Region X Responses: [7, 8, 9, 6, 5, 7, 8]
  • Region Y Responses: [5, 6, 7, 5, 6, 8, 9]

The K-S Test is applied to compare the satisfaction ratings. For these responses the maximum difference between the ECDFs of the two regions is 2/7 ≈ 0.29, reached at a rating of 6. Comparing this value with the critical value at the chosen significance level reveals whether the distributions are significantly different. If the result shows a significant difference, the marketing team may decide to tailor their strategies based on regional preferences.
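Ratings on a 1-to-10 scale produce many tied values, so table-based critical values are only approximate for data like these. One alternative, which is not part of the original example and is shown here purely as an illustration, is an exact permutation version of the test: pool the 14 responses, try every way of splitting them into two groups of seven, and count how often the K-S statistic is at least as large as the observed one. The helper name `ks_two_sample` and the variable names are hypothetical.

```python
from bisect import bisect_right
from itertools import combinations


def ks_two_sample(a, b):
    """Largest gap between the ECDFs of samples a and b."""
    sa, sb = sorted(a), sorted(b)
    return max(
        abs(bisect_right(sa, x) / len(sa) - bisect_right(sb, x) / len(sb))
        for x in set(a) | set(b)
    )


region_x = [7, 8, 9, 6, 5, 7, 8]
region_y = [5, 6, 7, 5, 6, 8, 9]

observed = ks_two_sample(region_x, region_y)

# Exact permutation test: enumerate every split of the pooled responses
# into two groups of the original sizes (3432 splits for 7 and 7).
pooled = region_x + region_y
indices = range(len(pooled))
extreme = total = 0
for chosen in combinations(indices, len(region_x)):
    picked = set(chosen)
    a = [pooled[i] for i in picked]
    b = [pooled[i] for i in indices if i not in picked]
    total += 1
    # Small tolerance guards against floating-point noise in the comparison.
    if ks_two_sample(a, b) >= observed - 1e-12:
        extreme += 1

p_value = extreme / total
print(f"D = {observed:.3f}, permutation p-value = {p_value:.3f}")
```

A large p-value here would mean that a gap of 2/7 is entirely ordinary for random splits of these responses, so the two regions would not be declared different on this evidence.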

Notes

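  • The K-S Test makes no assumption about the shape of the distributions, but it does assume they are continuous. Ordinal scales commonly used in surveys produce many ties, which makes the standard test conservative, so results on such data should be read with caution.
  • Researchers should be mindful of sample size: with only seven responses per region the test has little power, so a non-significant result is weak evidence that the regions agree.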