12.6 Additional notes on implementing control charts

We offer here a few additional notes on implementing control charts in real applications. The control-chart dialog in PRIMER 8 offers many options. It is especially important to pay close attention to every choice that can affect the null hypothesis and/or the decision criterion, i.e., the upper control-chart limit ($U_{CL}$). Comments on each of these topics follow.

Start with a decent sample size

Control charts arose historically in industrial settings, where sample sizes, particularly for establishing baseline information, are typically very large. It is important to recognise that we are trying here to characterise the shape of the entire distribution (for the in-control samples), and not just to estimate a centroid. Therefore, we should always apply the control-chart tool with a view to including as many 'in-control' (reference) samples as we possibly can. Mathematically, there is a lower limit on the number of in-control points needed to run the analysis ($n_c$ = 4 points), but as a general rule we should aim to run the control-chart routine on no fewer than $n_c$ = 10 sample points, and having more ($n_c$ = 20 or 30) would certainly be preferable.

If your total sample size is $N \ge$ 11, then the default for the Control Chart routine in P8 for the minimum number of in-control samples is $n_c$ = 10. If $N \lt$ 11, then the default is $n_c = N-1$, but with a strict lower bound of $n_c$ = 4.
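The default rule above can be written out as a short function. This is only a sketch of the rule as stated in the text, not PRIMER 8 code; the function name `default_min_in_control` is hypothetical, and raising an error when $N-1$ falls below 4 is our reading of the 'strict lower bound'.

```python
def default_min_in_control(total_n: int) -> int:
    """Default minimum number of in-control samples (n_c) for a total of N samples.

    Follows the rule stated in the text: n_c = 10 when N >= 11,
    otherwise n_c = N - 1, never going below 4.
    """
    if total_n >= 11:
        return 10
    n_c = total_n - 1
    if n_c < 4:
        # Assumption: below the lower bound of 4 the analysis is treated
        # as not runnable, so we raise rather than return a value.
        raise ValueError("at least 5 samples are needed so that n_c >= 4")
    return n_c


for n in (5, 8, 11, 40):
    print(n, default_min_in_control(n))
```

For instance, $N$ = 8 gives $n_c$ = 7, whereas any $N \ge$ 11 gives $n_c$ = 10.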

A further practical point is that the Control Chart routine in P8 cannot handle missing values, so these will need to be removed prior to running the routine.
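One way to prepare data outside PRIMER before import is to drop every sample that has a missing value. The sketch below assumes the data are held as a dictionary mapping sample names to lists of values, with missing entries coded as `None` or NaN; the function name and data layout are hypothetical. Dropping whole samples is only one option; removing an affected variable instead may suit some datasets better.

```python
import math


def drop_incomplete_samples(samples):
    """Return only the samples (rows) that have no missing values.

    A value counts as missing if it is None or a float NaN.
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    return {
        name: values
        for name, values in samples.items()
        if not any(is_missing(v) for v in values)
    }


# Hypothetical data: three samples, two variables each.
data = {
    "t1": [3.0, 1.0],
    "t2": [None, 2.0],          # missing entry coded as None
    "t3": [4.0, float("nan")],  # missing entry coded as NaN
}
complete = drop_incomplete_samples(data)
print(sorted(complete))
```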

Be aware of H₀ for different types of control chart

The null hypothesis (H₀) for the specific test done at each time point in a given control chart depends critically on the type of control chart you are running: progressive, fixed baseline or moving window. You need to carefully consider which type of control chart is appropriate for your particular application (there may be more than one).

The default in PRIMER 8 is to run the control-chart by reference to a fixed baseline set of $n_c$ = 10 samples. However, the number of 'in-control' samples clearly needs to be thought about carefully and set to something appropriate for each specific dataset, driven by the null hypothesis of interest.

It is also important to consider how each type of control chart plays out in the specific tests it performs through time. For example, the 'progressive' type of control chart may not produce output that 'makes sense' after an 'out-of-control' sample has been identified. Suppose you are looking at a progressive control chart and an 'out-of-control' point has been identified at time-point $t$. The progressive chart will subsequently include that point as part of the 'in-control' distribution of samples when it goes on to test subsequent time-points $(t+1)$, $(t+2)$, etc. This might not be appropriate. One might consider removing the out-of-control sample before proceeding with the subsequent tests. These sorts of decisions will depend on the specific hypotheses to be examined for any particular dataset.
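To make the difference between chart types concrete, the sketch below lists which earlier samples would form the 'in-control' set when testing a given time-point. It rests on our own working assumptions, not on PRIMER 8's internals: 'fixed' means the first $n_c$ samples, 'progressive' means every earlier sample, and 'moving' means the $n_c$ most recent samples. The function name `reference_set` and the `flagged` argument are hypothetical.

```python
def reference_set(kind, t, n_c, flagged=()):
    """Indices (0-based) of samples treated as 'in control' when testing time-point t.

    kind    : "fixed", "progressive" or "moving" (assumed definitions, see lead-in)
    flagged : indices of earlier out-of-control samples to leave out
    """
    if t < n_c:
        raise ValueError("need at least n_c earlier samples before testing")
    if kind == "fixed":
        candidates = range(n_c)          # first n_c samples, never updated
    elif kind == "progressive":
        candidates = range(t)            # every earlier sample
    elif kind == "moving":
        candidates = range(t - n_c, t)   # the n_c most recent samples
    else:
        raise ValueError(f"unknown chart type: {kind!r}")
    return [i for i in candidates if i not in flagged]


# Sample 11 was flagged as out of control; now test sample 12 with n_c = 10.
print(reference_set("progressive", 12, 10))                # includes 11
print(reference_set("progressive", 12, 10, flagged={11}))  # leaves 11 out
print(reference_set("fixed", 12, 10))                      # unaffected by 11
```

Note that leaving out a flagged sample shrinks a moving window below $n_c$ in this sketch; whether to widen the window to compensate is a separate decision.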

Be aware of important settings affecting $U_{CL}$

The upper control-chart limit $U_{CL}$, and hence the assessment of whether a point is in control or out of control, is directly affected by the following choices:

choice of parametric vs non-parametric approach
choice of $\alpha$-level (e.g., 0.05)
choice to apply shrinkage (or not) in estimating the variance-covariance matrix
choice of ordination method (PCO, mMDS or tmMDS)
choice of $m$, the dimensionality of the ordination

The defaults for the Control Chart routine in PRIMER 8 will be sensible for a wide variety of cases. These defaults are:

non-parametric
$\alpha$ = 0.05
apply shrinkage
use threshold metric MDS (tmMDS)
choose $m$ so that the matrix correlation is $r_{e,d}$ = 0.99.

However, thinking carefully about each of these choices is almost always warranted. For example, the default choice of 'non-parametric' may not be sensible if the sample size $n_c$ is small (less than 10).
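One way to see why a small $n_c$ is a problem for a distribution-free limit is to look at how coarse an empirical quantile becomes. The sketch below is a generic illustration and not a description of how PRIMER 8 computes its non-parametric $U_{CL}$: it assumes the limit is simply the smallest of $n_c$ ordered reference values whose rank $k$ satisfies $k / n_c \ge 1 - \alpha$. The function name is hypothetical.

```python
import math


def upper_quantile_rank(n_c, alpha=0.05):
    """Rank k (1-based, ascending order) of the reference value used as the
    empirical (1 - alpha) quantile, i.e. the smallest k with k / n_c >= 1 - alpha.
    """
    # round() guards against floating-point noise before taking the ceiling.
    return math.ceil(round((1 - alpha) * n_c, 9))


for n_c in (4, 8, 10, 30):
    k = upper_quantile_rank(n_c)
    print(f"n_c = {n_c}: limit is order statistic {k} of {n_c}")
```

Under this assumption, with $n_c$ = 10 and $\alpha$ = 0.05 the rank is 10, so the limit is simply the largest reference value. The tail of the in-control distribution is then represented by a single point.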
