To compare two samples, you need two columns. The first column holds the values from both samples. The second column holds a label for each value, so that the samples can be split apart again. For example, say we have running times for two groups with different levels of performance-enhancing drugs in their blood; we could use the drug levels as the labels...
Data | Label |
---|---|
23 | 0.5 |
25 | 0.5 |
18 | 0.5 |
36 | 0.5 |
27 | 0.5 |
26 | 0.5 |
23 | 0.5 |
33 | 0.5 |
25 | 0.5 |
24 | 0.5 |
36 | 1.7 |
40 | 1.7 |
42 | 1.7 |
44 | 1.7 |
33 | 1.7 |
47 | 1.7 |
50 | 1.7 |
49 | 1.7 |
43 | 1.7 |
44 | 1.7 |
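To make the two-column layout concrete, here is a minimal Python sketch (standard library only) that holds the table above as value/label pairs and splits it by label. The names `rows` and `groups` are illustrative and not part of the statistics package described here.

```python
from statistics import mean, stdev

# (data value, label) pairs, copied from the table above.
rows = [
    (23, 0.5), (25, 0.5), (18, 0.5), (36, 0.5), (27, 0.5),
    (26, 0.5), (23, 0.5), (33, 0.5), (25, 0.5), (24, 0.5),
    (36, 1.7), (40, 1.7), (42, 1.7), (44, 1.7), (33, 1.7),
    (47, 1.7), (50, 1.7), (49, 1.7), (43, 1.7), (44, 1.7),
]

# Split the single data column into one list per label value.
groups = {}
for value, label in rows:
    groups.setdefault(label, []).append(value)

# Basic per-group statistics: size, mean and sample standard deviation.
for label, values in sorted(groups.items()):
    print(f"label={label}: n={len(values)}, "
          f"mean={mean(values):.2f}, sd={stdev(values):.2f}")
```

This is the same split that the grouping variable performs in the dialog boxes described next.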
Once the data is in, you need to pick the following menu item...
This will bring up the following dialog box...
This allows you to pick the column you want to test and the grouping variable (the label column). To pick columns, select them in the left side of the dialog box (example highlighted in red above), and click the arrow button in the middle of the dialog box to move them into the relevant boxes. Once you've defined your columns, click "Define Groups..." to bring up the following dialog box. If the button is greyed out, you may need to highlight the grouping variable to enable it.
In the boxes for each group, type the relevant label value. As you can see, you can instead define a threshold value ("Cut point") at which to split a continuous range of labels. Once you've grouped your data, press "Continue". Back at the main dialog, the "Options..." button lets you set the confidence interval used for the test and what to do with missing values. Once you're happy with the test, press "OK" on the main dialog box. This will bring up the viewer with the results, for example...
The first table simply gives basic statistics about the two samples. The second gives the results of the analysis. In the results table, the first two boxes are a test of whether the variances of the two samples are equal or not. Levene's statistic has an "F" distribution. If the "Sig" value for Levene's test is above 0.05, we cannot reject the null hypothesis (no difference between the variances) at the 95 percent level, so the variances can be treated as equal. The rest of the boxes give the results for each case (equal variances assumed or not assumed), and the appropriate row should be picked according to the outcome of Levene's test.
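As a rough cross-check of the variance test, Levene's statistic can be computed by hand. The sketch below assumes the mean-centred form of Levene's test (absolute deviations from each group's mean); `levene_w` is a hypothetical helper name, and the critical value 4.41 for F(1, 18) at the 5 percent level is taken from standard F tables rather than from the package's output.

```python
from statistics import mean

group_a = [23, 25, 18, 36, 27, 26, 23, 33, 25, 24]  # label 0.5
group_b = [36, 40, 42, 44, 33, 47, 50, 49, 43, 44]  # label 1.7


def levene_w(*samples):
    """Levene's W, using absolute deviations from each group's mean."""
    # Replace every value by its absolute distance from its own group mean.
    z = []
    for sample in samples:
        centre = mean(sample)
        z.append([abs(x - centre) for x in sample])

    n_total = sum(len(zi) for zi in z)
    k = len(z)
    grand_mean = sum(sum(zi) for zi in z) / n_total

    # One-way ANOVA on the deviations: between-group vs within-group spread.
    between = 0.0
    within = 0.0
    for zi in z:
        zi_mean = mean(zi)
        between += len(zi) * (zi_mean - grand_mean) ** 2
        within += sum((v - zi_mean) ** 2 for v in zi)

    return (n_total - k) / (k - 1) * between / within


F_CRIT_1_18 = 4.41  # 5% critical value of F(1, 18), from standard tables

w = levene_w(group_a, group_b)
equal_variances = w < F_CRIT_1_18
print(f"Levene W = {w:.4f}, treat variances as equal: {equal_variances}")
```

For this data W is far below the critical value, so the "equal variances assumed" row would be the one to read.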
The most important of the remaining statistics are the "t" statistic, the degrees of freedom (df) and the test's significance (Sig). In this case the significance is shown as 0.000, i.e. it rounds to zero at three decimal places. A difference this large would be very unlikely if the two samples were drawn from the same population, so the null hypothesis (that they're from the same population) should be rejected. If the value were 0.02, for example, we could reject the null hypothesis at the 95 percent level (below 0.05), but not at the 99 percent level (below 0.01).
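The "t" statistic and degrees of freedom can also be reproduced by hand from the example data. The following sketch computes both the pooled (equal variances assumed) and Welch (equal variances not assumed) versions; the function names are illustrative, and the two-tailed critical values for 18 degrees of freedom (2.101 at the 95 percent level, 2.878 at the 99 percent level) come from standard t tables, not from the package.

```python
from math import sqrt
from statistics import mean, variance

group_a = [23, 25, 18, 36, 27, 26, 23, 33, 25, 24]  # label 0.5
group_b = [36, 40, 42, 44, 33, 47, 50, 49, 43, 44]  # label 1.7


def pooled_t(a, b):
    """Student's t and df, assuming the two variances are equal."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    std_err = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / std_err, df


def welch_t(a, b):
    """Welch's t and df, not assuming equal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


T_CRIT_95 = 2.101  # two-tailed, df = 18, from standard tables
T_CRIT_99 = 2.878  # two-tailed, df = 18, from standard tables

t_equal, df_equal = pooled_t(group_a, group_b)
t_welch, df_welch = welch_t(group_a, group_b)

print(f"equal variances assumed:     t = {t_equal:.3f}, df = {df_equal}")
print(f"equal variances not assumed: t = {t_welch:.3f}, df = {df_welch:.2f}")
print("reject at 95 percent level:", abs(t_equal) > T_CRIT_95)
print("reject at 99 percent level:", abs(t_equal) > T_CRIT_99)
```

Because the two groups are the same size, both versions give the same t value here and differ only slightly in their degrees of freedom.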