12.3 Classical multivariate control chart

A suitable criterion for a control chart designed to detect shifts in the population mean vector for multivariate normal data is Hotelling's $T^2$, the (normalised) deviation of a sample vector (or a sample mean vector) measured at time $t$ from some known (or hypothesised) target population mean vector (Hotelling (1947), Seber (1984), Quesenberry (2007)). Usually, an upper bound is set as a limit on the acceptable values for the proposed charting criterion, and any value of the criterion that exceeds this limit indicates that the process is 'out of control' at that time point. Multivariate control charts can be used not only to detect shifts in the mean vector of a process, but also to detect multivariate observations (individual samples) that are outliers ('out of control').

In industrial or manufacturing settings, the desired target mean and variance of the process are often known a priori, or else there is a substantial set of sample values measured from the process when it is known to be 'in control' (called 'Phase I') from which target values can be estimated (Jensen et al. (2006)), and against which values obtained from subsequent samples (in 'Phase II') can be measured. This information, together with the types of variables that are often being monitored (quantitative, continuous and normally distributed), provides a straightforward basis for constructing a suitable control chart using classical statistical techniques (e.g., Seber (1984), Quesenberry (2007), Montgomery (2020)). The upper bound is typically derived from statistical results and can be stated rather easily, e.g., as the 0.95-quantile of a known probability distribution for the chosen charting criterion under classical assumptions. Rapid, successful detection of an 'out-of-control' situation (if present) is the primary goal.

Control chart using Hotelling's $T^2$

Let matrix ${\bm Y}= \lbrace y_{ij} \rbrace$ consist of simultaneous measurements on each of $j = 1, \ldots, p$ variables (columns) obtained at each of $i = 1, \ldots, N$ sequential time points (rows). Also, let the $p$-length vector of measurements at any particular time-point $t$ be denoted by ${\bm y}_ t$. Furthermore, let the $p$-length vector of arithmetic averages calculated from the observed values for each of the variables for a designated subset of the $i = 1, \ldots, n_c$ sampling points (where $n_c < N$), which are all deemed to have been sampled when the system is 'in control', be denoted by $\bar{{\bm y}}_ c$, with elements:

$$ \lbrace \bar{y}_ {cj} \rbrace = {\Bigg \lbrace} \frac{1}{n_c} \sum_{i=1}^{n_c} y_{ij} {\Bigg \rbrace} $$

Now, consider the null hypothesis (H0) that the system remains in control at time $t$. If we assume that, when the system is in control, the variables arise jointly from a multivariate normal distribution with population mean vector ${\bm \mu}_ c$ and population covariance matrix ${\bm \Sigma }_ c$, then under H0 we have ${\bm y}_ t \sim N_p({\bm \mu}_ c, {\bm \Sigma }_ c)$. A suitable control-chart test-statistic (Hotelling (1947), Seber (1984)) is given by:

$$ T^2 = ({\bm y}_ t - \bar{{\bm y}}_ c)^{\text T} {\bm S}_ c^{-1} ({\bm y}_ t - \bar{{\bm y}}_ c) $$

where superscript '$\text{T}$' indicates the transpose, superscript '$-1$' indicates the matrix inverse, and ${\bm S}_ c$ is the unbiased $(p \times p)$ sample variance-covariance matrix calculated on the set of in-control data points, with elements:

$$ \lbrace s_{jj'} \rbrace = \frac{1}{(n_c - 1)} \sum_{i=1}^{n_c} (y_{ij} - \bar{y}_ {cj})(y_{ij'} - \bar{y}_ {cj'}) $$ for every pair of variables $j = 1, \ldots, p$ and $j' = 1, \ldots, p$.

If H0 is true, then the control-chart test-statistic is distributed as a scalar multiple of a classical $F$-distribution (e.g., Seber (1984)), namely:

$$ T^2 \sim \frac{p(n_c + 1)(n_c - 1)}{n_c(n_c-p)}F_{p,(n_c-p)} $$

The upper control-chart limit at a chosen significance level, $\alpha$, is therefore given by

$$ U_{CL} = \frac{p(n_c + 1)(n_c - 1)}{n_c(n_c-p)}Q_{(1-\alpha)}[F_{p,(n_c-p)}] $$

where $Q_{(1-\alpha)}[f]$ is the $(1-\alpha)$-quantile of the distribution with probability density (or mass) function $f$. If the true values of the parameters in ${\bm \mu}_ c$ and ${\bm \Sigma }_ c$ are known and are used in place of their sample estimates, then $T^2 \sim \chi^2_p$, a chi-square distribution (Seber (1984)), and we may use, more simply, $U_{CL} = Q_{(1-\alpha)}[\chi^2_p]$.

Note that, if the $n_c$ sampling units in the reference (in-control) set remain the same (e.g., there is a baseline, or 'Phase I', set of sampling units) and $p$ remains constant, then the classical upper control-chart limit $U_{CL}$ (whether it relies on $F$ or $\chi^2$) remains constant over time.
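
The statistic and its $F$-based limit can be sketched in a few lines of Python. This is a minimal illustration under the classical assumptions above, not a reference implementation: the function names `hotelling_t2` and `upper_control_limit` are hypothetical, the data are simulated purely for demonstration, and NumPy and SciPy are assumed to be available.

```python
import numpy as np
from scipy import stats


def hotelling_t2(y_t, Y_c):
    """T^2 of a new observation y_t relative to in-control rows Y_c (n_c x p)."""
    Y_c = np.asarray(Y_c, dtype=float)
    y_t = np.asarray(y_t, dtype=float)
    y_bar = Y_c.mean(axis=0)  # in-control mean vector
    # Unbiased sample covariance matrix (divisor n_c - 1)
    S_c = np.atleast_2d(np.cov(Y_c, rowvar=False, ddof=1))
    d = y_t - y_bar
    # Solve S_c x = d instead of forming the explicit inverse
    return float(d @ np.linalg.solve(S_c, d))


def upper_control_limit(n_c, p, alpha=0.05):
    """Scaled (1 - alpha) quantile of F with (p, n_c - p) degrees of freedom."""
    if n_c <= p:
        raise ValueError("n_c must exceed p for the classical limit")
    scale = p * (n_c + 1) * (n_c - 1) / (n_c * (n_c - p))
    return scale * stats.f.ppf(1 - alpha, p, n_c - p)


# Illustrative use with simulated (hypothetical) in-control data
rng = np.random.default_rng(1)
Y_c = rng.normal(size=(30, 3))
y_new = rng.normal(size=3)
t2 = hotelling_t2(y_new, Y_c)
ucl = upper_control_limit(n_c=30, p=3, alpha=0.05)
print(f"T2 = {t2:.3f}, UCL = {ucl:.3f}, signal = {t2 > ucl}")
```

Solving the linear system rather than inverting ${\bm S}_ c$ explicitly is a numerical convenience only; the value of $T^2$ is the same.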

Progressive change-point control chart

At any particular time-point $t$, we may wish to assess the extent to which the multivariate observation vector ${\bm y}_ t$ is unusual, given what has been observed up to and including time $(t-1)$. Thus, at time $t$, there are $n_c = (t-1)$ in-control sampling units, and the classical control-chart test-statistic is distributed as

$$ T^2_t \sim \frac{pt(t - 2)}{(t-1)(t-p-1)}F_{p,(t-p-1)} $$
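
As a check, the scalar multiplier for the progressive chart follows directly from substituting $n_c = t-1$ into the general multiplier given earlier:

$$ \frac{p(n_c + 1)(n_c - 1)}{n_c(n_c-p)} = \frac{p\,[(t-1) + 1]\,[(t-1) - 1]}{(t-1)\,[(t-1)-p]} = \frac{pt(t - 2)}{(t-1)(t-p-1)} $$

with denominator degrees of freedom $n_c - p = t-p-1$.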

In this case, information about the characteristics of the system when it is 'in control' increases incrementally over time. Thus, the value of the test-statistic, $T_t^2$, its distribution and hence the upper limit of the control chart all change progressively over time with changes in the value of $t$. Note that the progressive chart relying on the above classical result cannot commence until $t \geq p+2$, so that ${\bm S}_ c$ is based on $n_c = (t-1) > p$ observations and the denominator degrees of freedom, $(t-p-1)$, are at least 1.
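
A progressive chart of this kind might be computed along the following lines. This is a sketch only: the function name `progressive_chart` is hypothetical, and it assumes that every earlier observation is retained in the reference set (the text above does not say how flagged time points should be handled, so that choice is an assumption made here for simplicity).

```python
import numpy as np
from scipy import stats


def progressive_chart(Y, alpha=0.05):
    """Return a list of (t, T2_t, UCL_t) for each t at which the chart is defined.

    Assumption: all observations before time t are treated as in control,
    i.e. flagged points are not removed from the reference set.
    """
    Y = np.asarray(Y, dtype=float)
    N, p = Y.shape
    results = []
    for t in range(p + 2, N + 1):  # 1-based time index; requires t - 1 > p
        Y_c = Y[: t - 1]           # observations at times 1, ..., t - 1
        d = Y[t - 1] - Y_c.mean(axis=0)
        S_c = np.atleast_2d(np.cov(Y_c, rowvar=False, ddof=1))
        t2 = float(d @ np.linalg.solve(S_c, d))
        # Time-varying multiplier and F quantile with (p, t - p - 1) df
        scale = p * t * (t - 2) / ((t - 1) * (t - p - 1))
        ucl = scale * stats.f.ppf(1 - alpha, p, t - p - 1)
        results.append((t, t2, ucl))
    return results


# Illustrative use with simulated (hypothetical) data
rng = np.random.default_rng(2)
chart = progressive_chart(rng.normal(size=(30, 3)))
for t, t2, ucl in chart[:3]:
    print(t, round(t2, 3), round(ucl, 3), t2 > ucl)
```

The limit is very wide at the earliest admissible time points, because the $F$ quantile with only one or two denominator degrees of freedom is large, and it tightens as $t$ grows.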

High-dimensionality and shrinkage

Problems can arise using the proposed progressive control chart in a high-dimensional system, where $p$ exceeds $n_c$. For example, values of $n_c$ will inevitably be relatively small in the early stages of monitoring, when there are as yet few in-control time points available. In such cases, the empirical estimate of the covariance matrix (${\bm S}_ c$) will be unsuitable; specifically, it loses full rank, is no longer positive definite, becomes singular and can no longer be inverted (e.g., Schäfer & Strimmer (2005)).

To improve the control-chart performance for high-dimensional data and allow commencement of the monitoring scheme even for relatively small numbers of in-control samples, a shrinkage estimate of the covariance matrix can be obtained from a given set of in-control multivariate data, as follows:

$$ {\bm W}_ c = \lambda{\bm T} + (1 - \lambda){\bm S}_ c $$

where ${\bm T}$ is a so-called 'target' matrix and $\lambda \in [0,1]$ denotes the shrinkage intensity. Thus, ${\bm W}_ c$ is a weighted average of ${\bm S}_ c$ and ${\bm T}$, where $\lambda = 0$ gives ${\bm W}_ c = {\bm S}_ c$ and $\lambda = 1$ gives ${\bm W}_ c = {\bm T}$.

Two important questions immediately arise: (i) how shall ${\bm T}$ be constructed? and (ii) what value shall be chosen for $\lambda$? Here, we suggest using a target matrix that shrinks the diagonal elements (i.e., the sample variances) of the empirical estimate of the covariance matrix towards their median and shrinks the off-diagonal entries to zero (Opgen-Rhein & Strimmer (2007), Ullah et al. (2017)). This has the effect of reducing larger eigenvalues and increasing smaller ones, thereby counteracting known biases inherent in sample-based estimation (Friedman (1989), Opgen-Rhein & Strimmer (2007)). To estimate an optimal value for $\lambda$, we also use here the direct analytical approach of Opgen-Rhein & Strimmer (2007) and Schäfer & Strimmer (2005), which is easy, fast and has good empirical and statistical properties.
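
The weighted average ${\bm W}_ c = \lambda{\bm T} + (1 - \lambda){\bm S}_ c$ with a median-variance diagonal target can be illustrated as follows. This is a simplified sketch, not the cited estimator: the function name `shrinkage_covariance` is hypothetical, a single shrinkage intensity `lam` is supplied by the user, and the analytical estimate of $\lambda$ from the cited papers is not reproduced here.

```python
import numpy as np


def shrinkage_covariance(Y_c, lam):
    """W_c = lam * T + (1 - lam) * S_c, with a median-variance diagonal target T.

    Assumption: lam is chosen by the user; it is NOT the analytically
    estimated shrinkage intensity described in the cited references.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    Y_c = np.asarray(Y_c, dtype=float)
    S_c = np.atleast_2d(np.cov(Y_c, rowvar=False, ddof=1))
    p = S_c.shape[0]
    # Target: median of the sample variances on the diagonal, zeros elsewhere
    T = np.median(np.diag(S_c)) * np.eye(p)
    return lam * T + (1.0 - lam) * S_c


# Illustrative high-dimensional case: n_c = 5 observations on p = 10 variables
rng = np.random.default_rng(3)
Y_c = rng.normal(size=(5, 10))
S_c = np.cov(Y_c, rowvar=False, ddof=1)
W_c = shrinkage_covariance(Y_c, lam=0.3)
print("rank of S_c:", np.linalg.matrix_rank(S_c))
print("smallest eigenvalue of W_c:", np.linalg.eigvalsh(W_c).min())
```

For any $\lambda > 0$ and a positive median variance, every eigenvalue of ${\bm W}_ c$ is at least $\lambda$ times that median, so ${\bm W}_ c$ can be inverted even when ${\bm S}_ c$ is singular.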

The shrinkage estimate ${\bm W}_ c$ of the covariance matrix has been shown to be well-conditioned for small samples and does not make any distributional assumptions, so is not restricted to being used only with multivariate normal data (Ullah et al. (2017)). For further details regarding shrinkage estimators, including a variety of choices for target matrices and intensity parameters, see Friedman (1989), Ledoit & Wolf (2003), Ledoit & Wolf (2004), Schäfer & Strimmer (2005), Ullah et al. (2017), Adegoke et al. (2018) and references therein.