UNIT S2 STUDY GUIDE
Randomization Tests
Hypothesis testing with classical distributions such as the t and z distributions requires knowledge about the population distribution. In many situations, however, it is difficult to obtain the sampling distribution of a sample statistic. For data sampled from unknown distributions, randomization tests are more useful, especially when the sample sizes are small. Randomization tests are performed by resampling from the original sample without replacement. The resampling procedure relies on the assumption that the null hypothesis is true. By contrast, for the bootstrap, another resampling method we looked at in Unit S1, the resamples were taken with replacement from the sample without any assumptions about the population parameter.
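The contrast between the two resampling schemes can be sketched in a few lines of Python. This is only an illustration for the two-sample case, not StatKey's own code; the function names `reallocate` and `bootstrap_resample` and the small data sets are made up for the example.

```python
import random


def reallocate(group_a, group_b, rng):
    """Randomization resample: pool both groups, shuffle, and deal the values
    back into groups of the original sizes (sampling WITHOUT replacement).
    Pooling reflects the null hypothesis that group labels do not matter."""
    pooled = list(group_a) + list(group_b)
    shuffled = rng.sample(pooled, k=len(pooled))  # a random permutation
    return shuffled[:len(group_a)], shuffled[len(group_a):]


def bootstrap_resample(sample, rng):
    """Bootstrap resample: draw n values WITH replacement from one sample,
    with no assumption about the population parameter."""
    return rng.choices(sample, k=len(sample))


rng = random.Random(1)            # seeded so the run is repeatable
a, b = [12, 15, 11, 19], [9, 8, 14]  # hypothetical data
new_a, new_b = reallocate(a, b, rng)
boot_a = bootstrap_resample(a, rng)
print(new_a, new_b)  # every original value appears exactly once
print(boot_a)        # values may repeat or be missing
```

In the reallocated groups each observed value is used exactly once, while a bootstrap resample may repeat some values and omit others.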
Practice
Since simulations are taxing to do by hand, we use the computational power of modern devices, such as our own computers or mobile devices, to run them. StatKey has a dedicated column for the different randomization hypothesis tests right on its main page. Using StatKey, we will create a randomization distribution for the statistic we’re using to test our hypothesis. The process is fairly straightforward:
Determine the type of test (Single Mean, Single Proportion, Difference in Means, or Difference in Proportions). For all of the randomization tests, leave the randomization method as the default, Reallocation. Since we’re going to use resampling, make sure that you have the raw data from the sample, and not just sample statistics, when testing means. For proportions, you’ll need to enter the number of successes and the sample size for each sample.
- STEP 1: Enter your null hypothesis value. For two-sample cases, the null will already be listed as [latex]\mu_1=\mu_2[/latex] or [latex]p_1=p_2[/latex]. Check to make sure that your null hypothesis is reflected on the page for the randomization test you choose.
- STEP 2: Generate several thousand samples (say, 10,000 samples) by clicking on the Generate 1000 Samples button several times.
- STEP 3: Select the tail type of your test.
- STEP 4: Change the value in the box below the horizontal axis to match your sample mean [latex]\bar x[/latex] or sample proportion [latex]\hat p[/latex] as appropriate. For two-sample tests, match the value below the axis with the difference in means, [latex]\bar x_1-\bar x_2[/latex], or the difference in proportions, [latex]\hat p_1-\hat p_2[/latex], displayed in the Original Sample section (top-right).
NOTE: For two-tailed tests, there will be two values under the horizontal axis. Change the value in the box that has the same sign as the original sample statistic.
- STEP 5: The area(s) representing the p-value will be shaded in red. For a one-tailed test, the tail area is the p-value. For a two-tailed test, the p-value equals the total of the two tail areas.
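For readers who want to see what the steps above amount to computationally, here is a minimal Python sketch of a randomization test for a difference in means using reallocation. It is an illustration under stated assumptions, not StatKey's actual implementation: the function name `randomization_test_diff_means`, its parameters, and the sample data are all hypothetical.

```python
import random


def randomization_test_diff_means(group_a, group_b, tail="two",
                                  n_resamples=10_000, seed=0):
    """Return (observed difference in means, estimated p-value).

    Null hypothesis: mu_1 = mu_2, so group labels are exchangeable.
    tail is "left", "right", or "two".
    """
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = sum(group_a) / n_a - sum(group_b) / len(group_b)

    # STEP 2: build the randomization distribution by reallocation.
    pooled = list(group_a) + list(group_b)
    diffs = []
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassign the values to groups at random
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        diffs.append(mean_a - mean_b)

    # STEPS 3-5: the proportion of resampled statistics in the tail(s).
    if tail == "right":
        p_value = sum(d >= observed for d in diffs) / n_resamples
    elif tail == "left":
        p_value = sum(d <= observed for d in diffs) / n_resamples
    else:
        # Two-tailed: total the areas beyond +|observed| and -|observed|.
        cutoff = abs(observed)
        upper = sum(d >= cutoff for d in diffs) / n_resamples
        lower = sum(d <= -cutoff for d in diffs) / n_resamples
        p_value = min(upper + lower, 1.0)  # cap in case observed is 0
    return observed, p_value


# Hypothetical raw data for two groups.
treatment = [10, 11, 12, 13, 14]
control = [1, 2, 3, 4, 5]
diff, p = randomization_test_diff_means(treatment, control, tail="two")
print(f"observed difference = {diff}, p-value = {p:.4f}")
```

Because the randomization distribution is simulated, the p-value will vary slightly from run to run unless the seed is fixed, just as it does between StatKey sessions.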
Please watch the videos below showcasing StatKey usage for randomization tests in different scenarios: