# PERMANOVA+ for PRIMER: Guide to Software and Statistical Methods

M J Anderson, R N Gorley & K R Clarke (2008)

#### Introduction and overview

#### 0.1 Title page

#### 0.2 Contact details and installation of the PERMANOVA+ software

Getting in touch with us: PERMANOVA+ for PRIMER was produced as a collaborative effort between Pro...

#### 0.3 Introduction to the methods of PERMANOVA+

Rationale: PERMANOVA+ is an add-on package which extends the resemblance-based methods of PRIMER t...

#### 0.4 Changes from DOS to PERMANOVA+ for PRIMER

The new Windows interface: All of the original DOS routines have been fully re-written, translat...

#### 0.5 Using this manual

Typographic conventions: The typographic conventions for this manual follow those used for PRI...

#### Chapter 1: Permutational ANOVA and MANOVA (PERMANOVA)

Key references: Method: Anderson (2001a), McArdle & Anderson (2001) Permutation techniques:...

#### 1.1 General description

Key references: Method: Anderson (2001a), McArdle & Anderson (2001). PERMANOVA is a routine for testing the...

#### 1.2 Partitioning

We shall begin by considering the balanced one-way (single factor) ANOVA experimental design. A f...

#### 1.3 Huygens’ theorem

This partitioning is fine and perfectly valid for Euclidean distances. But what happens if we wis...

#### 1.4 Sums of squares from a distance matrix

We can now consider the structure of a distance/dissimilarity matrix and how sums of squares for ...
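
As a rough illustration of how sums of squares can be read directly off a distance matrix, the sketch below uses the standard identity that a sum of squared deviations from a centroid equals the sum of squared inter-point distances divided by the number of points. The function name `sums_of_squares` and the toy data are assumptions made for this example and are not part of the PERMANOVA+ software.

```python
import numpy as np

def sums_of_squares(d, groups):
    """Partition total SS into within- and among-group parts from distances.

    d      : square symmetric matrix of distances/dissimilarities
    groups : group label for each sample (one-way design)
    """
    d = np.asarray(d, dtype=float)
    groups = np.asarray(groups)
    n_total = d.shape[0]

    # Total SS: squared distances over all pairs (i < j), divided by N.
    upper = np.triu_indices(n_total, k=1)
    ss_total = (d[upper] ** 2).sum() / n_total

    # Within-group SS: the same calculation inside each group, divided by n.
    ss_within = 0.0
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        sub = d[np.ix_(idx, idx)]
        sub_upper = np.triu_indices(len(idx), k=1)
        ss_within += (sub[sub_upper] ** 2).sum() / len(idx)

    # Among-group SS is obtained by subtraction.
    return ss_total, ss_within, ss_total - ss_within

# Toy univariate data with Euclidean distance, so the usual ANOVA
# sums of squares (58, 4 and 54) are recovered from distances alone.
y = np.array([1.0, 2.0, 3.0, 7.0, 8.0, 9.0])
groups = np.array([0, 0, 0, 1, 1, 1])
d = np.abs(y[:, None] - y[None, :])
ss_t, ss_w, ss_a = sums_of_squares(d, groups)
print(ss_t, ss_w, ss_a)
```

Because only inter-point distances are used, the same function can be applied to a non-Euclidean dissimilarity matrix, where centroids cannot be calculated directly.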

#### 1.5 The pseudo-F statistic

Once the partitioning has been done we are ready to calculate a test statistic associated with th...
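
For the balanced one-way case, the statistic can be written out as follows. The symbols are assumptions chosen for this sketch: $SS_A$ and $SS_W$ are the among-group and within-group sums of squares, $a$ is the number of groups and $N$ is the total number of samples.

$$
F = \frac{SS_A / (a - 1)}{SS_W / (N - a)}
$$

With a single response variable and Euclidean distance, this reduces to the traditional univariate $F$ ratio.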

#### 1.6 Test by permutation

An appropriate distribution for the pseudo-F statistic under a true null hypothesis is obtained b...
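
A minimal sketch of such a test for a one-way design is given below, assuming unrestricted random permutation of group labels and a P-value that counts the observed value as one member of the distribution. The function names, the default of 999 permutations and the toy data are choices made for this example, not a description of the PERMANOVA+ implementation.

```python
import numpy as np

def pseudo_f(d2, groups):
    """One-way pseudo-F from a matrix of squared distances."""
    n = d2.shape[0]
    labels = np.unique(groups)
    a = len(labels)
    # The full symmetric matrix counts each pair twice, hence the factor 2.
    ss_total = d2.sum() / (2.0 * n)
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        ss_within += d2[np.ix_(idx, idx)].sum() / (2.0 * len(idx))
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))

def permutation_test(d, groups, n_perm=999, seed=0):
    """Permutation P-value for the one-way pseudo-F statistic."""
    rng = np.random.default_rng(seed)
    d2 = np.asarray(d, dtype=float) ** 2
    groups = np.asarray(groups)
    f_obs = pseudo_f(d2, groups)
    # Count permuted values at least as large as the observed value.
    count = 0
    for _ in range(n_perm):
        if pseudo_f(d2, rng.permutation(groups)) >= f_obs:
            count += 1
    # The observed value is included as one member of the distribution.
    p_value = (count + 1) / (n_perm + 1)
    return f_obs, p_value

# Toy univariate data with Euclidean distance.
y = np.array([1.0, 2.0, 3.0, 7.0, 8.0, 9.0])
groups = np.array([0, 0, 0, 1, 1, 1])
d = np.abs(y[:, None] - y[None, :])
f_obs, p_value = permutation_test(d, groups, n_perm=999, seed=1)
print(f_obs, p_value)
```

Permuting the labels rather than the distance matrix itself means the squared distances need only be computed once.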

#### 1.7 Assumptions

Recall that for traditional one-way ANOVA, the assumptions are that the errors are independent, t...

#### 1.8 One-way example (Ekofisk oil-field macrofauna)

Our first real example comes from a study by , who studied changes in community structure of soft...

#### 1.9 Creating a design file

We shall formally test the hypothesis of no differences in community structure among the four gro...

#### 1.10 Running PERMANOVA

To run PERMANOVA on the Ekofisk data, click on the resemblance matrix and select PERMANOVA+ > PER...

#### 1.11 Pair-wise comparisons

Pair-wise comparisons among all pairs of levels of a given factor of interest are obtained by doi...

#### 1.12 Monte Carlo P-values (Victorian avifauna)

In some situations, there are not enough possible permutations to get a reasonable test. Consider...

#### 1.13 PERMANOVA versus ANOSIM

The analysis of similarities (ANOSIM), described by is also available within PRIMER and can be u...

#### 1.14 Two-way crossed design (Subtidal epibiota)

The primary advantage of PERMANOVA is its ability to analyse complex experimental designs. The pa...

#### 1.15 Interpreting interactions

What do we mean by an “interaction” between two factors in multivariate space? Recall that for a ...

#### 1.16 Additivity

Central to an understanding of what an interaction means for linear models is the idea of addit...

#### 1.17 Methods of permutations

As for the one-way case, the distribution of each of the pseudo-F ratios in a multi-way design is...

#### 1.18 Additional assumptions

Recall that we assume for the analysis of a one-way design by PERMANOVA that the multivariate obs...

#### 1.19 Contrasts

In some cases, what is of interest in a particular experimental design is not necessarily the com...

#### 1.20 Fixed vs random factors (Tasmanian meiofauna)

All of the factors considered so far have been fixed, but factors can be either fixed or random. ...

#### 1.21 Components of variation

For any given ANOVA design, PERMANOVA identifies a component of variation for each term in the mo...

#### 1.22 Expected mean squares (EMS)

An important consequence of the choice made for each factor as to whether it be fixed or random i...

#### 1.23 Constructing $F$ from EMS

The determination of the EMS’s gives a direct indication of how the pseudo-F ratio should be cons...

#### 1.24 Exchangeable units

The denominator mean square of the pseudo-F ratio for any particular term in the analysis is impo...

#### 1.25 Inference space and power

It is worthwhile pausing to consider how the above tests correspond to meaningful hypotheses for ...

#### 1.26 Testing the design

Given the fact that so many important aspects of the results (pseudo-F ratios, P-values, power, t...

#### 1.27 Nested design (Holdfast invertebrates)

We have seen how a crossed design is identifiable by virtue of every level of one factor being pr...

#### 1.28 Estimating components of variation

The EMS’s also yield another important insight: they provide a direct method to get unbiased esti...

#### 1.29 Pooling or excluding terms

For a given design file, PERMANOVA, by default, will do a partitioning according to all terms tha...

#### 1.30 Designs that lack replication (Plankton net study)

A topic related to the issue of pooling is the issue of designs that lack replication. Familiar e...

#### 1.31 Split-plot designs (Woodstock plants)

Another special case of a design lacking appropriate replication is known as a split-plot design....

#### 1.32 Repeated measures (Victorian avifauna, revisited)

While randomised blocks, Latin squares and split-plot designs lack spatial replication, a special...

#### 1.33 Unbalanced designs

Virtually all of the examples thus far have involved the analysis of what are known as *balanced*...

#### 1.34 Types of sums of squares (Birds from Borneo)

When the design is unbalanced, there will be a number of different ways to do the partitioning, w...

#### 1.35 Designs with covariates (Holdfast invertebrates, revisited)

A topic that is related (perhaps surprisingly) to the topic of unbalanced designs is the analysis...

#### 1.36 Linear combinations of mean squares (NZ fish assemblages)

Several aspects of the above analysis demonstrate its affinity with unbalanced designs. Note that...

#### 1.37 Asymmetrical designs (Mediterranean molluscs)

Although a previous section has been devoted to the analysis of unbalanced designs, there are som...

#### 1.38 Environmental impacts

Some further comments are appropriate here regarding experimental designs to detect environmental...

#### Chapter 2: Tests of homogeneity of dispersions (PERMDISP)

Key reference: Method: Anderson (2006)

#### 2.1 General description

Key reference: Method: Anderson (2006). PERMDISP is a routine for testing the homogeneity of multivariate dis...

#### 2.2 Rationale

There are various reasons why one might wish to perform an explicit test of the null hypothesis o...

#### 2.3 Multivariate Levene’s test (Bumpus’ sparrows)

Levene (1960) proposed doing an analysis of variance (ANOVA) on the absolute values of deviations of observati...

#### 2.4 Generalisation to dissimilarities

Of course, in many applications that we will encounter (especially in the case of community data)...

#### 2.5 $P$-values by permutation

The other hurdle that must be cleared is to recognise that, in line with the philosophy of all of...

#### 2.6 Test based on medians

Levene’s test (for univariate data) can be made more robust (i.e. less affected by outliers) by u...

#### 2.7 Ecological example (Tikus Island corals)

An ecological example of the test for homogeneity is provided by considering a study by on coral...

#### 2.8 Choice of measure

An extremely important point is that the test of dispersion is going to be critically affected by...

#### 2.9 Dispersion as beta diversity (Norwegian macrofauna)

When used on species composition (presence/absence) data in conjunction with certain resemblance ...

#### 2.10 Small sample sizes

There is one necessary restriction on the use of PERMDISP, which is that the number of replicate ...

#### 2.11 Dispersion in nested designs (Okura macrofauna)

In many situations, the experimental design is not as simple as a one-way analysis among groups. ...

#### 2.12 Dispersion in crossed designs (Cryptic fish)

When two factors are crossed with one another, there may be several possible hypotheses concernin...

#### 2.13 Concluding remarks

PERMDISP is designed to test the null hypothesis of no differences in dispersions among a priori ...

#### Chapter 3: Principal coordinates analysis (PCO)

Key references: Method: Torgerson (1958), Gower (1966)

#### 3.1 General description

Key references: Method: Torgerson (1958), Gower (1966). PCO is a routine for performing principal coordinates analysis ...

#### 3.2 Rationale

It is difficult to visualise patterns in the responses of whole sets of variables simultaneously....

#### 3.3 Mechanics of PCO

To construct axes that maximise fitted variation (or minimise residual variation) in the cloud of...
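
The usual steps of a principal coordinates analysis can be sketched as follows: form the matrix with elements $a_{ij} = -\frac{1}{2} d_{ij}^2$, centre it by rows and columns, and take its eigen-decomposition. The function name `pco`, the tolerance argument and the decision to return only axes with positive eigenvalues are assumptions made for this example.

```python
import numpy as np

def pco(d, tol=1e-10):
    """Principal coordinates from a symmetric distance matrix.

    Returns all eigenvalues (largest first) and the sample coordinates
    on the axes whose eigenvalues exceed `tol`.
    """
    d = np.asarray(d, dtype=float)
    n = d.shape[0]

    # Elements a_ij = -1/2 * d_ij^2.
    a = -0.5 * d ** 2

    # Centre rows and columns (Gower's centred matrix).
    centring = np.eye(n) - np.ones((n, n)) / n
    g = centring @ a @ centring

    # eigh is appropriate because g is symmetric; sort largest first.
    eigvals, eigvecs = np.linalg.eigh(g)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    # Scale each retained eigenvector by the square root of its eigenvalue.
    keep = eigvals > tol
    coords = eigvecs[:, keep] * np.sqrt(eigvals[keep])
    return eigvals, coords

# Four points on a 3 x 4 rectangle: Euclidean distances are reproduced
# exactly by the first two PCO axes.
pts = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 4.0], [3.0, 4.0]])
d = np.sqrt(((pts[:, None, :] - pts[None, :, :]) ** 2).sum(axis=-1))
eigvals, coords = pco(d)
d_back = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1))
print(np.allclose(d, d_back))
```

For non-Euclidean dissimilarities some eigenvalues may be negative; this sketch simply omits those axes from the returned coordinates while still reporting their eigenvalues.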

#### 3.4 Example: Victorian avifauna

As an example, consider the data on Victorian avifauna at the level of individual surveys, in the...

#### 3.5 Negative eigenvalues

The sharp-sighted will have noticed a conundrum in the output given for the Victorian avifauna sh...

#### 3.6 Vector overlays

A new feature of the PERMANOVA+ add-on package is the ability to add vector overlays onto graphic...

#### 3.7 PCO versus PCA (Clyde environmental data)

Principal components analysis (PCA) is described in detail in chapter 4 of . As stated earlier, P...

#### 3.8 Distances among centroids (Okura macrofauna)

In chapter 1, the difficulty in calculating centroids for non-Euclidean resemblance measures was ...

#### 3.9 PCO versus MDS

We recommend that, for routine ordination to visualise multivariate data on the basis of a chosen...

#### Chapter 4: Distance-based linear models (DISTLM) and distance-based redundancy analysis (dbRDA)

Key references: Method: Legendre & Anderson (1999), McArdle & Anderson (2001) Permutatio...

#### 4.1 General description

Key references: Method: Legendre & Anderson (1999), McArdle & Anderson (2001). DISTLM is a routine for analysing and ...

#### 4.2 Rationale

Just as PERMANOVA does a partitioning of variation in a data cloud described by a resemblance mat...

#### 4.3 Partitioning

Consider an (N × p) matrix of response variables Y, where N = the number of samples and p = the n...
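
One way to sketch a distance-based partitioning of this kind is through the projection ("hat") matrix of the predictors applied to Gower's centred matrix, following the general approach of McArdle & Anderson (2001). The function name `distlm_partition`, the use of a pseudo-inverse and the toy data are assumptions made for this example, and the predictor columns are assumed to be linearly independent.

```python
import numpy as np

def distlm_partition(d, x):
    """Partition variation in a distance matrix according to predictors x."""
    d = np.asarray(d, dtype=float)
    n = d.shape[0]
    x = np.asarray(x, dtype=float).reshape(n, -1)
    q = x.shape[1]  # number of predictors (assumed linearly independent)

    # Gower's centred matrix from a_ij = -1/2 * d_ij^2.
    centring = np.eye(n) - np.ones((n, n)) / n
    g = centring @ (-0.5 * d ** 2) @ centring

    # Projection matrix onto the predictors plus an intercept column.
    x1 = np.column_stack([np.ones(n), x])
    h = x1 @ np.linalg.pinv(x1)
    resid = np.eye(n) - h

    ss_reg = np.trace(h @ g @ h)          # variation explained by x
    ss_res = np.trace(resid @ g @ resid)  # residual variation
    pseudo_f = (ss_reg / q) / (ss_res / (n - q - 1))
    r2 = ss_reg / (ss_reg + ss_res)
    return ss_reg, ss_res, pseudo_f, r2

# One response variable with Euclidean distance and one predictor:
# the result matches ordinary simple linear regression (R^2 = 0.64).
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
d = np.abs(y[:, None] - y[None, :])
ss_reg, ss_res, pseudo_f, r2 = distlm_partition(d, x)
print(ss_reg, ss_res, pseudo_f, r2)
```

A P-value for the pseudo-F value would still have to be obtained by permutation; that step is omitted here.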

#### 4.4 Simple linear regression (Clyde macrofauna)

In our first example of DISTLM, we will examine the relationship between the Shannon diversity (H...

#### 4.5 Conditional tests

More generally, when X contains more than one variable, we may also be interested in conditional ...

#### 4.6 (Holdfast invertebrates)

To demonstrate conditional tests in DISTLM, we will consider the number of species inhabiting hol...

#### 4.7 Assumptions & diagnostics

Thus far, we have only done examples for a univariate response variable in Euclidean space, using...

#### 4.8 Building models

In many situations, a scientist may have measured a large number of predictor variables that coul...

#### 4.9 Cautionary notes

Before proceeding, a few cautionary notes are appropriate with respect to building models. First,...

#### 4.10 (Ekofisk macrofauna)

We shall now use the DISTLM tool to identify potential parsimonious models for benthic macrofauna...

#### 4.11 Visualising models: dbRDA

We may wish to visualise a given model in the multivariate space of our chosen resemblance matrix...

#### 4.12 Vector overlays in dbRDA

Something which certainly should come as no surprise is to see the X variables playing an importa...

#### 4.13 dbRDA plot for Ekofisk

Let us examine the constrained dbRDA ordination for the parsimonious model obtained earlier using...

#### 4.14 Analysing variables in sets (Thau lagoon bacteria)

In some situations, it is useful to be able to partition variability in the data cloud according ...

#### 4.15 Categorical predictor variables (Oribatid mites)

Sometimes the predictor variables of interest are not quantitative, continuous variables, but rat...

#### 4.16 DISTLM versus BEST/BIOENV

On the face of it, the DISTLM routine might be thought of as playing a similar role to PRIMER’s B...

#### Chapter 5: Canonical analysis of principal coordinates (CAP)

Key references: Method: Anderson & Robinson (2003), Anderson & Willis (2003)

#### 5.1 General description

Key references: Method: Anderson & Robinson (2003), Anderson & Willis (2003). CAP is a routine for performing canonical analysis of principal co...

#### 5.2 Rationale (Flea-beetles)

In some cases, we may know that there are differences among some pre-defined groups (for example,...

#### 5.3 Mechanics of CAP

Details of CAP and how it is related to other methods are provided by Anderson & Robinson (2003) and Anderson & Willis (2003). In brief, a classica...

#### 5.4 Discriminant analysis (Poor Knights Islands fish)

We will begin with an example provided by Trevor Willis and Chris Denny (; ), examining temperate...

#### 5.5 Diagnostics

How did the CAP routine choose an appropriate number of PCO axes to use for the above discriminan...

#### 5.6 Cross-validation

The procedure of pulling out one sample at a time and checking the ability of the model to correc...

#### 5.7 Test by permutation (Anderson’s irises)

CAP can be used to test for significant differences among the groups in multivariate space. The t...

#### 5.8 CAP versus PERMANOVA

It might seem confusing that both CAP and PERMANOVA can be used to test for differences among gro...

#### 5.9 Caveats on using CAP (Tikus Island corals)

When using the CAP routine, it should come as no surprise that the hypothesis (usually) is eviden...

#### 5.10 Adding new samples

A new utility of the Windows-based version of the CAP routine in PERMANOVA+ is the ability to pla...

#### 5.11 Canonical correlation: single gradient (Fal estuary biota)

So far, the focus has been on hypotheses concerning groups and the use of CAP for discriminant an...

#### 5.12 Canonical correlation: multiple X’s

In some cases, interest lies in finding axes through the cloud of points so as to maximise correl...

#### 5.13 Sphericising variables

It was previously stated that CAP effectively “sphericises” the data clouds as part of the proces...

#### 5.14 CAP versus dbRDA

So, how does CAP differ from dbRDA for relating two sets of variables? First, dbRDA is directiona...

#### 5.15 Comparison of methods using SVD

The relationship between dbRDA and CAP can also be seen if we consider their formulation using si...

#### 5.16 (Hunting spiders)

A study by explored the relationships between two sets of variables: the abundances of hunting s...

#### Appendices

#### A1 Acknowledgements

We wish to thank our many colleagues, whose ongoing research has supported this work by providing...

#### A2 References

Akaike (1973) Akaike H. 1973. ‘Information theory as an extension of the maximum l...

#### A3 Index to mathematical notation and symbols

Matrices and vectors: A = matrix containing elements $a_{ij} = -\frac{1}{2} d_{ij}^2$; B =...

#### A4 Index to data sets used in examples

Below is an index to the data sets used in examples, listed in order of appearance in the text. W...