5.3 Mechanics of CAP

Details of CAP and how it is related to other methods are provided by Anderson & Robinson (2003) and Anderson & Willis (2003). In brief, a classical canonical analysis is simply done on a subset of the PCO axes. Here, we provide a thumbnail sketch to outline the main features of the analysis. The important issues to keep in mind are the conceptual ones, but a little matrix algebra is included here for completeness.

Let D be an N × N matrix of dissimilarities (or distances) among samples⁹⁹. Let X be an N × q matrix that contains either codes for groups (for discriminant analysis) or one or more quantitative variables of interest (for canonical correlation analysis). Conceptually, we can consider X to contain the hypothesis of interest.

When using CAP for discriminant analysis in the PERMANOVA+ add-on package for PRIMER, groups are identified by a factor associated with the resemblance matrix. The CAP routine will internally generate an appropriate X matrix specifying the group structure from this factor information, so no additional data are required. However, if a canonical correlation-type analysis is to be done, then a separate data sheet containing one or more X variables (and having the same number and labels for the samples as the resemblance matrix) needs to be identified. The mechanics of performing a CAP analysis are described by the following steps (Fig. 5.2):

First, principal coordinates are obtained from the resemblance matrix to describe the cloud of multivariate data in Euclidean space (see Fig. 3.1 for details). The individual PCO axes are not, however, standardised by their eigenvalues. Instead, they are left in their raw (orthonormal¹⁰⁰) form, which we will denote by Q$^0$. This means that not only is each PCO axis independent of all other PCO axes, but also each axis has a sum-of-squares (or length) equal to 1. So, the PCO data cloud is effectively “sphericised” (see the section entitled Sphericising variables).
From matrix X, calculate the “hat” matrix H = X[X′X]$^{-1}$X′. This is the matrix derived from the solutions to the normal equations ordinarily used in multiple regression (e.g., Johnson & Wichern (1992), Neter, Kutner, Nachtsheim et al. (1996))¹⁰¹. Its purpose here is to orthonormalise (“sphericise”) the data cloud corresponding to matrix X as well.
If the resemblance matrix is N × N, then there will be, at most, (N – 1) non-zero PCO axes. If we did the canonical analysis using all of these axes, it would be like trying to fit a model to N points using (N – 1) parameters, and the fit would be perfect, even if the points were completely random and the hypothesis were false! So, only a subset of m < (N – 1) PCO axes should be used, denoted by Q$^0 _ m$. The value of m is chosen using appropriate diagnostics (see the section Diagnostics).
A classical canonical analysis is done to relate the subset of m orthonormal PCO axes to X. This is done by constructing the matrix Q$^0 _ m$′HQ$^0 _ m$. Eigenvalue decomposition of this matrix yields canonical eigenvalues $\delta_1^2, \delta_2^2, \ldots, \delta_s^2$ and their associated eigenvectors. The trace of matrix Q$^0 _ m$′HQ$^0 _ m$ is equal to the sum of the canonical eigenvalues. These canonical eigenvalues are also the squared canonical correlations. They indicate the strength of the association between the data cloud and the hypothesis of interest.
The canonical coordinate axis scores C, a matrix of dimension (N × s), are used to produce the CAP plot. These are made by pre-multiplying the eigenvectors by Q$^0 _ m$ and then scaling (multiplying) each of these by the square root of their corresponding eigenvalue. Thus, the CAP axes are linear combinations of the orthonormal PCO axes.
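
The steps above can be sketched in a few lines of NumPy. This is only an illustrative outline under stated assumptions, not the code used by the CAP routine in PERMANOVA+. The function name `cap_sketch` and the tolerance `tol` are hypothetical. The sketch assumes that only PCO axes with positive eigenvalues are candidates for the subset of m axes, that X is centred before H is formed (as in footnote 101), and that the "corresponding eigenvalue" used to scale the scores is the canonical eigenvalue. H is built from an orthonormal basis of the centred X, which gives the same projection matrix as X[X′X]$^{-1}$X′ when X has full column rank and also tolerates redundant columns.

```python
import numpy as np


def cap_sketch(D, X, m, tol=1e-10):
    """Illustrative CAP outline: returns (canonical eigenvalues, scores C)."""
    D = np.asarray(D, dtype=float)
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    N = D.shape[0]

    # Step 1: PCO. Gower-centre -0.5 * D^2 and keep the raw eigenvectors,
    # which are orthonormal (each has sum-of-squares equal to 1).
    J = np.eye(N) - np.ones((N, N)) / N
    G = -0.5 * J @ (D ** 2) @ J
    G = (G + G.T) / 2  # remove tiny asymmetries from rounding
    evals, evecs = np.linalg.eigh(G)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Step 3: use only the first m axes (assumption: positive eigenvalues only).
    n_pos = int(np.sum(evals > tol * evals[0]))
    if not 1 <= m <= n_pos:
        raise ValueError(f"m must be between 1 and {n_pos}")
    Qm = evecs[:, :m]

    # Step 2: hat matrix of the centred X, formed as B B' where B is an
    # orthonormal basis for the column space of the centred X.
    Xc = X - X.mean(axis=0)
    B, sv, _ = np.linalg.svd(Xc, full_matrices=False)
    rank_x = int(np.sum(sv > tol * sv[0]))
    B = B[:, :rank_x]
    H = B @ B.T

    # Step 4: eigen-decomposition of Qm' H Qm gives the squared
    # canonical correlations and their eigenvectors.
    M = Qm.T @ H @ Qm
    M = (M + M.T) / 2
    d2, U = np.linalg.eigh(M)
    order = np.argsort(d2)[::-1]
    d2, U = d2[order], U[:, order]

    # Keep the s leading canonical axes.
    s = min(m, rank_x, N - 1)
    d2 = np.clip(d2[:s], 0.0, 1.0)

    # Step 5: scores are Qm times the eigenvectors, each column scaled by
    # the square root of its canonical eigenvalue (assumed interpretation).
    C = (Qm @ U[:, :s]) * np.sqrt(d2)
    return d2, C
```

Calling `cap_sketch(D, X, m)` with a square distance matrix `D`, a vector or matrix `X` and a chosen `m` returns the squared canonical correlations and the N × s matrix of scores. The value of m still has to be chosen by the user; the sketch does not implement any diagnostics.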

Fig. 5.2. Schematic diagram of the steps involved in performing a CAP analysis.

The number of canonical axes produced by the analysis (= s) will be the minimum of (m, q, (N – 1)). For a canonical correlation-type analysis, q is the number of variables in X. For a discriminant analysis, q = (g – 1), where g is the number of groups. If the analysis is based on Euclidean distances to begin with, then CAP is equivalent to classical canonical discriminant analysis (CDA) or canonical correlation analysis (CCorA). In such cases, the number for m should be chosen to be the same as the number of original variables (p) in data matrix Y, except in the event that p is larger than (N – 1), in which case the usual diagnostics should be used to choose m.
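
As a small illustration of the counting rule, the sketch below builds one possible X matrix for a discriminant analysis and computes s. The function names `group_codes` and `n_canonical_axes` are hypothetical, and the indicator coding with the last group dropped is an assumption chosen so that X has q = (g – 1) columns; the coding that the CAP routine generates internally is not described here.

```python
import numpy as np


def group_codes(labels):
    """Indicator (0/1) codes for g groups, dropping the last group so that
    the matrix has g - 1 columns (an assumed coding, for illustration)."""
    labels = np.asarray(labels)
    levels = np.unique(labels)
    return (labels[:, None] == levels[None, :-1]).astype(float)


def n_canonical_axes(N, m, q):
    """Number of canonical axes: s = min(m, q, N - 1)."""
    return min(m, q, N - 1)


labels = ["a", "a", "b", "b", "c", "c"]   # N = 6 samples in g = 3 groups
X = group_codes(labels)                   # q = g - 1 = 2 columns
s = n_canonical_axes(N=len(labels), m=4, q=X.shape[1])
```

With three groups, at most two canonical axes are produced, however many PCO axes (m) are used.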

⁹⁹ If a resemblance matrix of similarities is available instead, then the CAP routine in PERMANOVA+ will automatically transform these into dissimilarities before proceeding; the user need not do this as a separate step.

¹⁰⁰ Orthonormal axes are uncorrelated and have a sum-of-squares and cross-products matrix (SSCP) equal to the identity matrix I (all sums of squares = 1 and all cross-products = 0).

¹⁰¹ CAP automatically centres the X data cloud.
