2.10 Small sample sizes

There is one necessary restriction on the use of PERMDISP: the number of replicate samples per group must exceed n = 2. The reason is that, if there are only two replicates, then, by definition, the distances to the centroid for those two samples must be equal. Consider a single variable and a group with two samples having values of 4 and 6. The centroid (average) in Euclidean space for this group is therefore 5. The distance from sample 1 to the centroid is 1 and the distance from sample 2 to the centroid is also 1. These two values of z are necessarily equal. The same holds for any other group having only 2 replicate samples, so the within-group variance of the z’s will be zero when n = 2 for all groups. If the within-group variance is zero, then the denominator of the F statistic is zero, F is infinite, and the test loses all meaning. Clearly, the test is also meaningless for a group with n = 1, which will have only a single z value of zero. Thus, if the sample size for any of the groups is n ≤ 2, the PERMDISP routine will issue a warning accordingly. Although test results are meaningless in such cases, the individual deviations (the z’s) can still be examined and their values compared across the different groups, if desired.

More generally, the issue here is the degree of correlation among values of z within a group, which increases as the sample size decreases. Levene (1960) showed that the degree of correlation is of order $n^{-2}$ which, he suggested, will probably not have a serious effect on the distribution of the F statistic.

We suggest that formal tests using PERMDISP with within-group sample sizes less than n = 10 should be viewed with some caution, and those with sample sizes less than n = 5 should probably be avoided, though (as elsewhere) further simulation studies for realistic multivariate cases would be helpful in refining such rules-of-thumb.
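The n = 2 argument above can be checked numerically. The following is a minimal sketch in Python, not the PERMDISP implementation: it assumes Euclidean distance to the group centroid, and the function names (`distances_to_centroid`, `within_group_ss`) and the two-variable and three-sample data are hypothetical, introduced only for illustration. Only the values 4 and 6 come from the text.

```python
import math

def distances_to_centroid(samples):
    """Return z: the Euclidean distance from each sample to its group centroid.

    `samples` is a list of equal-length tuples, one tuple per replicate.
    """
    n = len(samples)
    p = len(samples[0])
    centroid = [sum(s[j] for s in samples) / n for j in range(p)]
    return [
        math.sqrt(sum((s[j] - centroid[j]) ** 2 for j in range(p)))
        for s in samples
    ]

def within_group_ss(groups):
    """Pooled sum of squared deviations of the z's from their group means."""
    ss = 0.0
    for samples in groups:
        z = distances_to_centroid(samples)
        z_bar = sum(z) / len(z)
        ss += sum((zi - z_bar) ** 2 for zi in z)
    return ss

# Example from the text: one variable, two samples with values 4 and 6.
z_pair = distances_to_centroid([(4.0,), (6.0,)])

# Hypothetical two-variable data with n = 2 in every group.
pairs = [
    [(4.0, 1.0), (6.0, 3.0)],
    [(0.0, 2.0), (10.0, 5.0)],
]
ss_pairs = within_group_ss(pairs)

# Hypothetical single-variable group with n = 3, for contrast.
ss_triple = within_group_ss([[(1.0,), (2.0,), (6.0,)]])

print("z for the pair (4, 6):", z_pair)
print("within-group SS of z, all groups n = 2:", ss_pairs)
print("within-group SS of z, one group n = 3:", ss_triple)
```

With two replicates per group the two z values coincide in every group, so the pooled within-group sum of squares of the z's is zero however different the groups are. With three replicates the z's can differ, and the within-group term is no longer forced to zero.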
