# 2.10 Small sample sizes

There is one necessary restriction on the use of PERMDISP: the number of replicate samples per group must exceed *n* = 2. The reason is that, if a group has only two replicates, then the distances from those two samples to the group centroid must, by definition, be equal to one another. Consider a single variable and a group with two samples having values of 4 and 6. The centroid (average) in Euclidean space for this group is therefore 5. The distance from sample 1 to the centroid is 1 and the distance from sample 2 to the centroid is also 1. These two values of *z* are necessarily equal. The same holds for any other group having only 2 replicate samples, so when *n* = 2 for all groups, the within-group variance of the *z*’s is equal to zero. If the within-group variance is equal to zero, then the *F* statistic will be infinite, so the test loses all meaning.

Clearly, the test is also meaningless for a group with *n* = 1, which will have only a single *z* value of zero. Thus, if the sample size for any of the groups is *n* ≤ 2, the PERMDISP routine will issue a warning accordingly. Although test results are meaningless in such cases, the individual deviations (the *z*’s) can still be examined and their values compared across the different groups, if desired.

More generally, the issue here is the degree of correlation among values of *z*, which increases as the sample size decreases.
Levene (1960) showed the degree of correlation is of order *n*<sup>−2</sup> which, he suggested, will probably not have a serious effect on the distribution of the *F* statistic. We suggest that formal tests using PERMDISP having within-group sample sizes less than *n* = 10 should be viewed with some caution and those having sample sizes less than *n* = 5 should probably be avoided, though (as elsewhere) further simulation studies for realistic multivariate cases would be helpful in refining such rules-of-thumb.
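
The *n* = 2 argument can be checked numerically. The following is a minimal sketch in Python with NumPy, not the PERMDISP routine itself: it assumes Euclidean distances to centroids, the helper name `distances_to_centroid` is hypothetical, and the second group is made-up data added only to show that the result does not depend on the number of variables.

```python
import numpy as np


def distances_to_centroid(group):
    """Euclidean distance (z) from each sample (row) to the group centroid."""
    group = np.asarray(group, dtype=float)
    if group.ndim == 1:
        # A single variable: treat each value as a one-column sample.
        group = group[:, None]
    centroid = group.mean(axis=0)
    return np.linalg.norm(group - centroid, axis=1)


# Example from the text: one variable, two samples with values 4 and 6.
z_a = distances_to_centroid([4.0, 6.0])

# Hypothetical second group: two samples measured on three variables.
z_b = distances_to_centroid([[0.0, 1.0, 2.0], [3.0, 5.0, 2.0]])

# Within-group sum of squares of the z's, pooled over both groups.
# This is the quantity behind the denominator of the F statistic.
ss_within = sum(((z - z.mean()) ** 2).sum() for z in (z_a, z_b))

print(z_a)        # [1. 1.]
print(z_b)        # [2.5 2.5]
print(ss_within)  # 0.0
```

With two replicates the centroid is the midpoint of the two samples, so the two *z* values in each group coincide and the pooled within-group sum of squares is zero, whatever the data.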