Rationale for 2nd stage MDS
As seen above, the $\rho$ statistic, which rank correlates the elements of two similarity matrices, can provide a very useful and succinct summary of the extent of agreement between two ordinations (or, to be more precise, of agreement in the high-dimensional multivariate data underlying these low-dimensional plots). Often, many such pairwise comparisons are made; for example, a single set of data may first be aggregated to a range of taxonomic levels (species, genus, family, …), then analysed under a range of pre-treatments: standardisations (none, by species or samples, and by maximum or total); other taxon weightings (e.g. dispersion weighting); then transformations (none, square root, 4th root, log, pres/abs), etc. Many ordination plots result, and it is reasonable to ask how much the multivariate pattern changes as a result of these various decisions. What are the important choices? Does it matter whether the data are only identified to family rather than species level, or is the difference this makes completely dwarfed by the changes resulting from choosing to look at common to mid-abundance species (none or square root transform) or concentrating more on the less-common species (4th root or presence/absence)? Or is it the choice of a resemblance coefficient (from the 40 or so in Section 5) that really dictates the conclusions? It can be difficult, and arbitrary, to assess this just by looking at the range of different ordinations produced, though at least we can exploit the $\rho$ statistic to quantify the agreement in multivariate pattern for any pair of choices. But when there are many choices, even a set of $\rho$ values between pairs is not a succinct enough description (considering only two types of choice, there are 20 different ordinations from 5 transformations and 4 taxonomic levels, thus 190 $\rho$ values between them!).
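The count of 190 is simply the number of unordered pairs that can be formed from the 20 ordinations. Writing $n$ for the number of ordinations (a symbol introduced here only for illustration), the number of pairwise $\rho$ values grows roughly as the square of $n$:

$$\binom{n}{2} = \frac{n(n-1)}{2}, \qquad \binom{20}{2} = \frac{20 \times 19}{2} = 190.$$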
The key step here is to realise that $\rho$ itself can be regarded as a similarity measure, taking values near 1 if two multivariate patterns are highly similar and near zero if they bear no relation to each other. So, the triangular matrix of $\rho$ coefficients between all pairs of ordinations can be entered into the MDS routine, to obtain what PRIMER calls a 2nd stage MDS plot (an MDS of MDS's, if you like!). The $\rho$ coefficient is not a distance-like measure (it can take small negative values and has a fixed upper limit), so a straight line through the origin on a Shepard plot is unlikely to turn it into an ordination distance; again, nMDS rather than mMDS seems appropriate. This is based on the rank orders of the $\rho$ values, therefore catering naturally for the potential for small negative $\rho$ values: these just become patterns that are even less like each other than random re-arrangements, and in practice large negative values are not observed. The resulting second-stage nMDS plot thus gives a succinct summary in a 2-d picture, often with small stress, of the relationship between the multivariate sample patterns under the various choices. The 2STAGE idea was introduced in this context by Somerfield PJ & Clarke KR 1995 Mar Ecol Prog Ser 127:113-119 and further explored by Olsgard F, Somerfield PJ, Carr MR 1997 & 1998 Mar Ecol Prog Ser 149: 173-181 & 172: 25-26, and is also covered extensively in Chapter 16 of CiMC, including the examples below.
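To make the construction concrete, here is a minimal Python sketch of how a second-stage matrix could be assembled. It is not PRIMER's 2STAGE routine. The abundance table, the four transformations, the choice of Bray-Curtis as the resemblance coefficient and all function names are assumptions made purely for illustration. Each choice of transformation gives one first-stage resemblance matrix; $\rho$ is then computed between every pair of these, and the $\rho$ values are converted to the rank dissimilarities that an nMDS would work from.

```python
from itertools import combinations
from math import sqrt


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two samples (lists of abundances)."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


def resemblance(samples, transform):
    """Triangular resemblance matrix, unravelled in a fixed order of sample pairs."""
    t = [[transform(x) for x in s] for s in samples]
    return [bray_curtis(t[i], t[j]) for i, j in combinations(range(len(t)), 2)]


def ranks(values):
    """Ranks starting at 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman_rho(u, v):
    """Spearman rank correlation: the Pearson correlation of the two rank vectors."""
    ru, rv = ranks(u), ranks(v)
    mu, mv = sum(ru) / len(ru), sum(rv) / len(rv)
    cov = sum((a - mu) * (b - mv) for a, b in zip(ru, rv))
    su = sqrt(sum((a - mu) ** 2 for a in ru))
    sv = sqrt(sum((b - mv) ** 2 for b in rv))
    if su == 0 or sv == 0:
        raise ValueError("rho is undefined when all resemblances are equal")
    return cov / (su * sv)


# Hypothetical abundance table: 5 samples (rows) by 4 species (columns).
samples = [
    [10, 0, 3, 1],
    [8, 1, 4, 0],
    [0, 12, 1, 5],
    [1, 9, 0, 7],
    [4, 4, 2, 2],
]

# Hypothetical analysis choices: four transformations of the same data.
treatments = {
    "none": lambda x: x,
    "sqrt": sqrt,
    "4th root": lambda x: x ** 0.25,
    "pres/abs": lambda x: 1.0 if x > 0 else 0.0,
}

# First stage: one resemblance matrix per choice.
first_stage = {name: resemblance(samples, f) for name, f in treatments.items()}

# Second stage: rho between every pair of first-stage matrices.
second_stage = {
    (p, q): spearman_rho(first_stage[p], first_stage[q])
    for p, q in combinations(list(first_stage), 2)
}

# nMDS only uses rank order, so small negative rho values need no special
# handling: rank 1 is the most similar pair of choices, the largest rank
# the least similar.
pairs = list(second_stage)
dissim_ranks = dict(zip(pairs, ranks([-second_stage[k] for k in pairs])))

for pair in sorted(pairs, key=dissim_ranks.get):
    print(pair, round(second_stage[pair], 3), dissim_ranks[pair])
```

The `second_stage` values (or the corresponding rank dissimilarities) would then be passed to whichever nMDS routine is available, with one point per analysis choice rather than one point per sample.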