5.1 Other ordination methods

Principal Co-ordinates Analysis

The two main weaknesses of PCA, identified at the end of Chapter 4, are its inflexibility of dissimilarity measure and its poor distance-preservation. The first problem is addressed in an important paper by Gower (1966), describing an extension to PCA termed Principal Co-ordinates Analysis (PCO), also sometimes referred to as classical scaling. This allows a wider definition of distance than simple Euclidean distance in the species space (the basis of PCA), but was initially restricted to a specific class of resemblance measures for which the samples could be represented by points in some reconfigured high-dimensional (real) space, in which the Euclidean distance between two points equals the (non-Euclidean) resemblance between those samples. Effectively none of the most useful biological resemblance coefficients fall into this class, because the high-d space representing those dissimilarities has both real and imaginary axes, but it has become clearer in the intervening decades that much useful inference can still be performed in this complex space, e.g. McArdle & Anderson (2001), Anderson (2001a), Anderson (2001b). (This is essentially the space in which the PERMANOVA+ add-on routines to the PRIMER software carry out their core analyses.) PCO can thus be applied completely generally to any resemblance measure, but the final step is again a projection onto a low-dimensional ordination space (e.g. a 2-dimensional plane), as in ordinary PCA. It follows that PCA is just a special case of PCO, when the original dissimilarity is Euclidean distance, but note that PCO is still subject to the second criticism of PCA: its lack of emphasis on distance-preservation when the information is difficult to represent in a low number of dimensions.
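The PCO construction described above can be illustrated with a minimal sketch. It assumes NumPy and uses the standard double-centring form of classical scaling. The function names `pco` and `bray_curtis` are hypothetical, chosen for illustration, and this is not the PRIMER or PERMANOVA+ implementation. Negative eigenvalues of the centred matrix correspond to the imaginary axes mentioned in the text, and here they are simply excluded from the plotted coordinates.

```python
import numpy as np

def pco(D, n_axes=2):
    """Classical scaling of a symmetric dissimilarity matrix D (illustrative sketch)."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    A = -0.5 * D ** 2                        # Gower's transformation of the dissimilarities
    J = np.eye(n) - np.ones((n, n)) / n      # centring matrix
    G = J @ A @ J                            # double-centred matrix
    eigvals, eigvecs = np.linalg.eigh(G)     # G is symmetric, so eigh is appropriate
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Negative eigenvalues (imaginary axes) are clipped to zero for plotting
    scale = np.sqrt(np.clip(eigvals[:n_axes], 0.0, None))
    coords = eigvecs[:, :n_axes] * scale
    return coords, eigvals

def bray_curtis(X):
    """Bray-Curtis dissimilarity between rows (samples) of a non-negative matrix X."""
    X = np.asarray(X, dtype=float)
    num = np.abs(X[:, None, :] - X[None, :, :]).sum(axis=2)
    den = (X[:, None, :] + X[None, :, :]).sum(axis=2)
    return num / den

# Made-up abundance data: 6 samples by 4 species, for illustration only
rng = np.random.default_rng(1)
X = rng.integers(1, 30, size=(6, 4))
coords, eigvals = pco(bray_curtis(X), n_axes=2)
print(coords.shape)   # one row of 2-d ordination co-ordinates per sample
print(eigvals)        # any negative values indicate imaginary axes
```

When `D` holds Euclidean distances, all eigenvalues are non-negative (up to rounding) and retaining every axis reproduces the distances exactly, which is the sense in which PCA is a special case of PCO.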

Detrended Correspondence Analysis

Correspondence analyses are a class of ordination methods originally featuring strongly in French data-analysis literature (for an early review in English see Greenacre (1984)). Key papers in ecology are Hill (1973a) and Hill & Gauch (1980), who introduced detrended correspondence analysis (DECORANA). The methods start from the data matrix, rather than a resemblance measure, so are rather inflexible in their definition of sample dissimilarity; in effect, multinomial assumptions generate an implicit dissimilarity measure of chi-squared distance (Chapter 16). Correspondence analysis (CA) has its genesis in a particular model of unimodal species response to underlying (unmeasured) environmental gradients. A full description is outside the scope of this manual, but good accounts of CA can be found in the works of Cajo ter Braak (e.g. in Jongman, ter Braak & Tongeren (1987)), who has contributed a great deal in this area, not least CCA, Canonical Correspondence Analysis (ter Braak (1986)).^¶
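The link between CA and chi-squared distance can be checked numerically. The sketch below assumes NumPy, a samples-by-species count matrix with no empty rows or columns, and the usual textbook formulation of CA as a singular value decomposition of standardised residuals. The function names are hypothetical, and the chi-squared distance is taken in its standard form between row profiles, which may differ in detail from the definition used in Chapter 16.

```python
import numpy as np

def correspondence_analysis(N, n_axes=2):
    """Simple CA of a count matrix N (rows = samples); returns row scores and inertias."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                          # correspondence matrix
    r = P.sum(axis=1)                        # row (sample) masses
    c = P.sum(axis=0)                        # column (species) masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardised residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal co-ordinates of the rows (samples)
    row_scores = (U[:, :n_axes] * sv[:n_axes]) / np.sqrt(r)[:, None]
    return row_scores, sv ** 2

def chi_squared_distance(N):
    """Chi-squared distance between the row profiles of N (standard form, assumed)."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()
    r = P.sum(axis=1)
    c = P.sum(axis=0)
    profiles = P / r[:, None]                # each row rescaled to sum to 1
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2 / c).sum(axis=2))

# Made-up counts: 5 samples by 4 species, for illustration only
rng = np.random.default_rng(2)
N = rng.integers(1, 20, size=(5, 4))
scores, inertias = correspondence_analysis(N, n_axes=4)   # keep all axes
d_scores = np.sqrt(((scores[:, None, :] - scores[None, :, :]) ** 2).sum(axis=2))
print(np.allclose(d_scores, chi_squared_distance(N)))
```

With all axes retained, Euclidean distances among the sample scores match the chi-squared distances among the samples, so the dissimilarity measure is fixed by the method rather than chosen by the user.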

The DECORANA version of CA, widely used in earlier decades, has a primary motivation of straightening out an arch effect in a CA ordination, which is expected on theoretical grounds if species abundances have unimodal (Gaussian) responses along a single strong environmental gradient. Where such models are not appropriate, it is unclear what artefacts the algorithms may introduce into the final picture. In the Hill & Gauch (1980) procedure, the detrending is essentially carried out by first splitting the ordination space into segments, stretching or shrinking the scale in each segment and then realigning the segments to remove wide-scale curvature. For some people, this is uncomfortably close to attacking the data with scissors and glue and, though the method is not as subjective as this would imply, some arbitrary decisions about where and how the segmentation and rescaling are defined are hidden from the user in the software code. Thus Pielou (1984) and others criticized DECORANA for its ‘overzealous’ manipulation of the data. It is also unfortunate that the multivariate methods which were historically applied in ecology were often either poorly suited to the data or were based on conceptually complex algorithms (e.g. DECORANA and TWINSPAN, Hill (1979a) and Hill (1979b)), erecting a communication barrier between data analyst and ecologist.

The ordination technique which is adopted in this manual’s strategy, non-metric MDS, is itself a complex numerical algorithm but it will be argued that it is conceptually simple. It makes few (if any) model assumptions about the form of the data, and the link between the final picture and the user’s original data is relatively transparent and easy to explain. Importantly, it addresses both the major criticisms of PCA made earlier: it has great flexibility both in the definition and conversion of dissimilarity to distance and its rationale is the preservation of these relationships in the low-dimensional ordination space.

^¶ A convenient way of carrying out CA-related routines is to use the excellent CANOCO package, ter Braak & Smilauer (2002).
