5.1 General description

Key references

CAP is a routine for performing canonical analysis of principal coordinates. The purpose of CAP is to find axes through the multivariate cloud of points that either: (i) are the best at discriminating among a priori groups (discriminant analysis) or (ii) have the strongest correlation with some other set of q variables (canonical correlation). The analysis can be based on any resemblance matrix of choice.

The routine begins by calculating principal coordinates (PCO axes) from the resemblance matrix among N samples and then uses these to do one of the following: (i) predict group membership; (ii) predict positions of samples along some other single continuous variable (q = 1); or (iii) find axes having maximum correlations with some other set of variables (q > 1).

There is a potential problem of overparameterisation here, because if we have (N – 1) PCO axes, they will clearly be able to do a perfect job of modelling N points. To avoid this problem, diagnostics are required in order to choose an appropriate subset of PCO axes (i.e., m < (N – 1)) to use for the analysis. The value of m is chosen either: (i) by maximising a leave-one-out allocation success to groups or (ii) by minimising a leave-one-out residual sum of squares in a canonical correlation. These diagnostics and an appropriate choice for m are provided by the routine.

The routine can also perform a permutation test for the significance of the canonical relationship, using either the trace (sum of canonical eigenvalues) or the first canonical eigenvalue as the test statistic. Also provided is a plot of the canonical axis scores. A new feature is the capacity to place new observations into the canonical space, based only on their resemblances with prior observations. For a discriminant-type analysis, this includes the allocation of new observations to existing groups.
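
The choice of m for the single-variable case (q = 1) can be illustrated with a minimal Python sketch. It is not the routine's own implementation. The function names (`pco`, `loo_residual_ss`, `choose_m`), the simulated data, the use of Euclidean distance as the resemblance measure and the cap of 15 axes are all assumptions made for illustration. As a further simplification, the PCO axes are computed once from all N samples rather than being handled separately for each left-out sample.

```python
import numpy as np

def pco(D):
    """Principal coordinates from a symmetric N x N distance matrix D."""
    N = D.shape[0]
    J = np.eye(N) - np.ones((N, N)) / N        # centring matrix
    G = J @ (-0.5 * D ** 2) @ J                # Gower-centred matrix
    evals, evecs = np.linalg.eigh(G)
    order = np.argsort(evals)[::-1]            # largest eigenvalue first
    evals, evecs = evals[order], evecs[:, order]
    keep = evals > 1e-10 * evals[0]            # drop zero/negative eigenvalues
    return evecs[:, keep] * np.sqrt(evals[keep])

def loo_residual_ss(Q, y, m):
    """Leave-one-out residual SS when predicting y from the first m PCO axes."""
    N = len(y)
    X = np.column_stack([np.ones(N), Q[:, :m]])   # intercept + m axes
    ss = 0.0
    for i in range(N):
        train = np.arange(N) != i                 # all samples except i
        beta = np.linalg.lstsq(X[train], y[train], rcond=None)[0]
        ss += (y[i] - X[i] @ beta) ** 2           # squared prediction error
    return ss

def choose_m(Q, y, max_m):
    """Return the m in 1..max_m with the smallest leave-one-out residual SS."""
    scores = {m: loo_residual_ss(Q, y, m) for m in range(1, max_m + 1)}
    return min(scores, key=scores.get), scores

# Simulated example (assumed data): N samples, 40 variables, and a
# continuous variable y related to the first three of them.
rng = np.random.default_rng(0)
N = 30
Y = rng.normal(size=(N, 40))
y = Y[:, :3].sum(axis=1) + 0.5 * rng.normal(size=N)

# Euclidean distance stands in for the resemblance measure of choice.
D = np.sqrt(((Y[:, None, :] - Y[None, :, :]) ** 2).sum(axis=-1))
Q = pco(D)
best_m, scores = choose_m(Q, y, max_m=15)
print("chosen m:", best_m)
```

In this sketch an intercept plus all (N – 1) PCO axes gives N parameters for N points, so the in-sample fit is exact. That is the overparameterisation described above, and it is why the leave-one-out criterion is used to choose m instead.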