
SIMPROF direct run

SIMPROF can be run directly using Analyse>SIMPROF, rather than as part of another analysis such as CLUSTER (above), UNCTREE or kRCLUSTER (later this section) or LINKTREE (see Section 13). In that case, the active window must be the data sheet, the rectangular matrix whose variables are permuted randomly and independently across the samples. SIMPROF must always have such an underlying data matrix available – it cannot work solely on a triangular resemblance sheet. Thus when the SIMPROF option is taken in CLUSTER – which is run when the active window is a triangular matrix – PRIMER uses its internal knowledge of how that resemblance matrix was calculated to specify the correct data matrix, as a default for the (Data sheet:) box under SIMPROF options. Change this default at your peril! – its main purpose is simply to remind you that SIMPROF always works on the underlying rectangular array not the triangular matrix.
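To make concrete what 'permuted randomly and independently across the samples' means, the following minimal Python sketch shuffles the values of each variable across the samples, using a separate shuffle for every variable. It is an illustration only and not PRIMER's own code: the layout (one row per variable, one column per sample), the function name `permute_variables` and the small data matrix are all assumptions made for the example.

```python
import random


def permute_variables(data, rng):
    """Return a copy of `data` with each variable shuffled across samples.

    `data` is a list of rows, one row per variable, each row holding one
    value per sample (an assumed layout for this sketch).
    """
    permuted = []
    for row in data:
        shuffled = list(row)   # copy, so the original matrix is untouched
        rng.shuffle(shuffled)  # a separate, independent shuffle per variable
        permuted.append(shuffled)
    return permuted


# Hypothetical matrix: 3 variables (rows) by 4 samples (columns)
data = [
    [0, 2, 5, 1],
    [3, 3, 0, 7],
    [1, 0, 0, 4],
]
rng = random.Random(1)  # fixed seed only so the example is repeatable
perm = permute_variables(data, rng)
```

Each variable keeps exactly the same set of values, but any association between variables within a sample is broken up. Such a shuffle needs the rectangular array, which is why a triangular resemblance sheet alone is not enough.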

Direct runs of SIMPROF are used to test for evidence of internal group structure in the full set of samples that are submitted to it, i.e. a single test rather than the (usually large) series of subset tests in the CLUSTER option. The advantage of doing a single test at a time is that more information can be output, as seen in the plot windows shown above under the SIMPROF method heading, for a preliminary test of any structure in the full set of 57 samples for the Bristol Channel zooplankton.

Another output option for Analyse>SIMPROF, selected by checking ✓Stats to worksheet, writes out the data used to plot the similarity profile itself. This worksheet has one row for each entry in the resemblance matrix, containing as ‘variables’: the real ranked similarities; the mean similarities from the permutations; the lowest and the highest similarities obtained, at each rank, over all permutations (not shown on the plot); and the lower and upper 99% limits (or whatever % is specified) of the permuted values at that rank.
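The columns of such a worksheet can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not a reproduction of PRIMER's calculations: Bray-Curtis similarity is used only as an example resemblance measure, similarities are ranked in ascending order, the % limits are taken as simple empirical quantiles of the permuted values, and the function names, dictionary keys and data matrix are hypothetical.

```python
import random


def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity (0-100) between two samples (example measure)."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # convention chosen for this sketch when both samples are empty
    diff = sum(abs(x - y) for x, y in zip(a, b))
    return 100.0 * (1.0 - diff / total)


def ranked_similarities(data):
    """Similarities for all sample pairs, sorted in ascending order.

    `data` has one row per variable and one column per sample (assumed layout).
    """
    samples = list(zip(*data))  # transpose so each item is one sample
    n = len(samples)
    sims = [bray_curtis_similarity(samples[i], samples[j])
            for i in range(n) for j in range(i + 1, n)]
    return sorted(sims)


def profile_stats(data, n_perm=999, pct=99.0, seed=0):
    """Build one worksheet-like row per rank of the similarity profile."""
    rng = random.Random(seed)
    real = ranked_similarities(data)
    profiles = []
    for _ in range(n_perm):
        # shuffle each variable across samples, independently of the others
        permuted = [rng.sample(row, len(row)) for row in data]
        profiles.append(ranked_similarities(permuted))
    tail = (100.0 - pct) / 200.0       # fraction cut from each end
    lo_idx = int(tail * (n_perm - 1))  # simple empirical quantile index
    hi_idx = (n_perm - 1) - lo_idx
    rows = []
    for rank, real_sim in enumerate(real):
        at_rank = sorted(p[rank] for p in profiles)
        rows.append({
            "rank": rank + 1,
            "real": real_sim,
            "mean": sum(at_rank) / n_perm,
            "min": at_rank[0],
            "max": at_rank[-1],
            "lower": at_rank[lo_idx],
            "upper": at_rank[hi_idx],
        })
    return rows


# Hypothetical matrix: 3 variables (rows) by 4 samples (columns)
data = [
    [0, 2, 5, 1],
    [3, 3, 0, 7],
    [1, 0, 0, 4],
]
rows = profile_stats(data, n_perm=50, seed=1)
```

With 4 samples there are 4 × 3 / 2 = 6 pairwise similarities, so `rows` has 6 entries, matching the rule that the worksheet has as many rows as the resemblance matrix has entries.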