Correlation as similarity

Using a correlation matrix between all pairs of variables as input to a multivariate ordination (say), in which points denote variables rather than samples (so that highly correlated variables are placed close together), requires either one of the absolute coefficients or a simple shift $S = 50(1+\rho)$ of one of the three standard coefficients, so that values are defined over the range (0, 100) rather than (–1, 1). The two approaches differ in an important way: should highly negatively correlated variables be considered highly similar (use an absolute measure) or highly dissimilar (shift the scale upwards)? The practical context should usually make clear which is the right choice.
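The contrast between the two options can be illustrated outside PRIMER with a minimal Python sketch. The function names are hypothetical, and scaling the absolute coefficient by 100 is an assumption made here only so that both measures share the same range; this is not PRIMER's own code.

```python
def shifted_similarity(rho):
    """Shift a correlation in (-1, 1) to a similarity in (0, 100): S = 50(1 + rho)."""
    return 50.0 * (1.0 + rho)


def absolute_similarity(rho):
    """Absolute correlation, scaled by 100 (assumed here for comparability)."""
    return 100.0 * abs(rho)


# Compare the two measures across a few illustrative correlation values.
for rho in (-1.0, -0.8, 0.0, 0.8, 1.0):
    print(f"rho={rho:+.1f}  shifted={shifted_similarity(rho):6.1f}  "
          f"absolute={absolute_similarity(rho):6.1f}")
```

Under the shift, a correlation of –0.8 gives a low similarity (about 10), so the two variables would be placed far apart; under the absolute measure the same pair scores about 80 and would be placed close together.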

Save and close the current Europe groundfish workspace (as Groundfish ws), and open the N Sea biomarkers workspace N Sea ws, created towards the end of Section 4 (see there for a description of the variables). (If it is not available, just open N Sea flounder biomarkers(.pri) from directory C:\Examples v7\N Sea biomarkers.) The previous pre-treatment of these (transformed) biomarkers by variability weighting was designed for calculating standard sample similarities (which you may now wish to do by Analyse>Resemblance>(Measure•Euclidean distance) & (Analyse between•Samples)), but the reason for re-opening this workspace now is to calculate similarities among variables, via correlation.

The choice is between standard (Pearson) correlation and a rank-based correlation (Spearman, say); if the analysis includes the categorical as well as the continuous variables, the rank option may be preferred. Note that any variability weighting previously carried out, to weight the biomarkers against each other in calculating sample similarities, is irrelevant to the correlation-based computation of variable similarities, because variables are renormalised (under Pearson) or ranked (under Spearman). For Spearman, even the square root transform applied to the EROD and Lipid variables is irrelevant, since it does not change the rank order of variable values across samples. Note also that low lysosomal stability (AO or NRR) is associated with high EROD etc. (both indicate contaminant impact), so an absolute correlation measure is used to capture biomarker similarities.

Analyse>Resemblance>(Measure•Other>✓Correlation: Absolute Spearman rank correlation) & (Analyse between•Variables) on N Sea flounder biomarkers will produce values in the range (0,1). These could be scaled to (0,100) using Tools>Transform>(Expression:100*V) (see the box headed Transform on resemblances in Section 11) and the Type changed from Correlation to Similarity with Edit>Properties>(Resemblance type•Similarity), but this is not necessary in practice for most routines in PRIMER, such as nMDS ordination, since only the ranks of the resemblances are used.
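The rank-invariance argument can be checked with a small, self-contained Python sketch. The data values below are invented for illustration (they are not taken from the N Sea flounder biomarkers file), and the helper functions are hypothetical stand-ins rather than PRIMER routines; Spearman correlation is computed here as Pearson correlation on average ranks.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over any run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Standard product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented values for two biomarkers across six samples (illustration only).
erod = [4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
nrr = [90.0, 70.0, 75.0, 40.0, 30.0, 10.0]

rho_raw = spearman(erod, nrr)
rho_sqrt = spearman([math.sqrt(v) for v in erod], nrr)  # square-root transformed
abs_rho = abs(rho_raw)  # absolute Spearman correlation, in the range (0, 1)

print(rho_raw, rho_sqrt, abs_rho)
```

The square root is a monotonic transform, so the ranks (and hence the Spearman coefficient) are the same before and after it. The strongly negative correlation between the two invented variables becomes a high similarity once the absolute value is taken, which is the behaviour wanted when low values of one biomarker and high values of another both indicate impact.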
