4.9 Cautionary notes

Before proceeding, a few cautionary notes are appropriate with respect to building models. First, the procedures of forward selection, backward elimination and step-wise selection are in no way guaranteed to find the best overall model. Second, even if a search for the “best” overall model is done, the result will depend on which selection criterion is used (adjusted R$^2$, AIC, AIC$_c$ or BIC). Third, DISTLM fits a linear combination of the X variables, which may or may not be appropriate in a given situation (e.g., see the section on Linkage trees in chapter 11 of Clarke & Gorley (2006)). As a consequence, it is always appropriate, as a preliminary step, to spend some time with the X variables, drawing diagnostic plots and examining their distributions and their relationships with one another. Fourth, the particular predictor variables that are chosen in a model should not be interpreted as being necessarily causative⁸⁵. The variables chosen may be acting as proxies for some other important variables that either were not measured or were omitted from the model for reasons of parsimony. Finally, it is not appropriate to use a model selection procedure and then to take the resulting model and test for its significance in explaining variability. This approach uses circular and therefore invalid logic, because the selection procedure has already chosen the explanatory variables precisely for their ability to explain that variability. To create a valid test, the inherent bias associated with model selection would need to be taken into account by performing the selection procedure anew with each permutation. Such a test would require a great deal of computational time and is not currently available in DISTLM⁸⁶. In sum, model-building using the DISTLM tool should generally be viewed as an exploratory hypothesis-generating activity, rather than a definitive method for finding the one “true” model.
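The idea of repeating the selection procedure with each permutation can be illustrated with a minimal sketch in Python. This is not DISTLM functionality, and every function name below is hypothetical. The sketch assumes a simplified setting: forward selection of a fixed number of variables, with R$^2$ as the only criterion, where R$^2$ is computed from the Gower-centred distance matrix as the trace of the fitted part divided by the total trace. The point it shows is that the observed R$^2$ of the selected model is compared with the R$^2$ obtained by re-running the whole selection on each permuted data set, rather than with permuted fits of the already-selected variables.

```python
import numpy as np

def gower_centre(D):
    """Gower-centred matrix G from a square distance matrix D."""
    n = D.shape[0]
    A = -0.5 * D ** 2
    C = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    return C @ A @ C

def r_squared(G, X):
    """Proportion of tr(G) explained by the columns of X (plus intercept)."""
    n = G.shape[0]
    X1 = np.column_stack([np.ones(n), X])
    H = X1 @ np.linalg.pinv(X1)           # projection ("hat") matrix
    return np.trace(H @ G) / np.trace(G)

def forward_select(G, X, n_keep):
    """Greedy forward selection of n_keep columns of X, maximising R^2."""
    chosen = []
    for _ in range(n_keep):
        remaining = [j for j in range(X.shape[1]) if j not in chosen]
        scores = [r_squared(G, X[:, chosen + [j]]) for j in remaining]
        chosen.append(remaining[int(np.argmax(scores))])
    return chosen, r_squared(G, X[:, chosen])

def selection_aware_p(D, X, n_keep=2, n_perm=199, seed=0):
    """Permutation P-value that repeats the selection for every permutation."""
    rng = np.random.default_rng(seed)
    G = gower_centre(D)
    chosen, r2_obs = forward_select(G, X, n_keep)
    n = G.shape[0]
    count = 1                             # the observed value counts as one
    for _ in range(n_perm):
        p = rng.permutation(n)
        G_perm = G[np.ix_(p, p)]          # permute samples (rows and columns)
        _, r2_perm = forward_select(G_perm, X, n_keep)  # select anew
        if r2_perm >= r2_obs - 1e-12:
            count += 1
    return chosen, r2_obs, count / (n_perm + 1)

# Simulated illustration (not real data): 30 samples, 5 candidate predictors,
# with the response depending strongly on the first predictor only.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 5))
Y = np.column_stack([3 * X[:, 0] + rng.normal(size=30), rng.normal(size=30)])
D = np.linalg.norm(Y[:, None, :] - Y[None, :, :], axis=2)  # Euclidean distances
chosen, r2, p_value = selection_aware_p(D, X, n_keep=2, n_perm=99)
print(chosen, round(r2, 3), p_value)
```

Because the selection is repeated inside the loop, the cost is roughly the number of permutations multiplied by the cost of one full selection run, which is why such a test is computationally demanding. A more realistic version would also need to reproduce the stopping rule and the selection criterion actually used.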

⁸⁵ Unless a predictor variable has been expressly manipulated experimentally in a structured and controlled way to allow causative inferences, that is.

⁸⁶ The approach of including the selection procedure as part of the test is, however, available for examining non-parametric relationships between resemblance matrices as part of PRIMER’s BEST routine. See pp. 124-125 in chapter 11 of Clarke & Gorley (2006) for details.
