6.4 Multivariate Behrens-Fisher problem

Overview

In a multivariate context, there are many ways that groups of sampling units can differ from one another. Consider just three important ways that groups (i.e., sets of sampling units in a multivariate space) can differ (see Fig. 6.1); there are, of course, more. They can differ in the position of their central location (centroids), in the overall variability of their sampling units (dispersion or spread), and/or in their degree of correlation among pairs of variables (shape).

Fig. 6.1. Schematic diagram of bivariate data in each of two groups (triangles vs circles) where the groups have: (a) similar centroids, spread and shape; (b) different centroids (a shift in the central location of the points); (c) different spread (triangles are more dispersed); (d) different shapes (triangles show a pattern of negative correlation, while circles show a pattern of positive correlation); and (e) different centroids, different overall spread and different shapes.

The multivariate Behrens-Fisher problem in classical statistics is typically stated as the problem of testing for the equality of mean vectors (centroids) from two or more multivariate normal distributions (groups or populations), when their covariance matrices (describing the shape and dispersion of the samples within each group) are possibly not equal.
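In symbols, the classical statement can be written roughly as follows. The notation is introduced here only for illustration (it does not appear in the text above): $a$ groups, $p$ variables, $n_i$ sampling units in group $i$, mean vectors $\boldsymbol{\mu}_i$ and covariance matrices $\boldsymbol{\Sigma}_i$.

$$
\mathbf{x}_{ij} \sim N_p(\boldsymbol{\mu}_i, \boldsymbol{\Sigma}_i), \qquad i = 1, \dots, a, \quad j = 1, \dots, n_i
$$

$$
H_0 : \boldsymbol{\mu}_1 = \boldsymbol{\mu}_2 = \dots = \boldsymbol{\mu}_a, \qquad \text{without assuming } \boldsymbol{\Sigma}_1 = \boldsymbol{\Sigma}_2 = \dots = \boldsymbol{\Sigma}_a
$$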

The majority of solutions to the multivariate BFP (e.g., see Johnson & Weerahandi (1988), Coombs & Algina (1996), Christensen & Rencher (1997), Gamage et al. (2004), Belloni & Didier (2008), Krishnamoorthy & Lu (2010)) assume that the variables are multivariate normal and do not handle high-dimensional data, where the number of variables can exceed the sample sizes (but see Ahmad et al. (2012) and Ahmad (2014) for some proposed non-parametric solutions to the multivariate BFP based on U statistics).

However, we would really like a solution to the multivariate BFP for dissimilarity-based approaches (such as ANOSIM or PERMANOVA). In this context, the somewhat more general multivariate BFP would be stated as:

How can we test for differences in central location (in the multivariate space defined by a given resemblance measure) when there are differences in dispersion (spread) among the groups?

We may begin by testing for homogeneity of multivariate dispersions using the PERMDISP routine in PRIMER (see Anderson (2006) and Anderson et al. (2006)). If we find significant differences in spread among the groups, then we may consider how this might affect any test we wish to perform using either ANOSIM or PERMANOVA.
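To make the idea of a dispersion test concrete, here is a minimal sketch in Python. It is not the PERMDISP routine itself. It assumes plain Euclidean distance (so that centroids are simple averages), and the function names (`distances_to_centroids`, `one_way_f`, `dispersion_test`) and the choice to permute the distances among groups are illustrative assumptions; PRIMER's routine handles general resemblance measures and may use a different permutation scheme.

```python
import math
import random


def distances_to_centroids(points, labels):
    """Map each group label to the Euclidean distances of its points
    from that group's centroid (assumption: Euclidean space only)."""
    groups = {}
    for row, g in zip(points, labels):
        groups.setdefault(g, []).append(row)
    out = {}
    for g, rows in groups.items():
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        out[g] = [math.dist(r, centroid) for r in rows]
    return out


def one_way_f(groups):
    """Classical one-way ANOVA F ratio for a list of groups of numbers."""
    values = [v for grp in groups for v in grp]
    grand = sum(values) / len(values)
    means = [sum(grp) / len(grp) for grp in groups]
    ss_between = sum(len(grp) * (m - grand) ** 2 for grp, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for grp, m in zip(groups, means) for v in grp)
    if ss_within == 0:
        return math.inf
    df_between, df_within = len(groups) - 1, len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def dispersion_test(points, labels, n_perm=999, seed=0):
    """F ratio comparing mean distance-to-centroid among groups, with a
    p-value from re-allocating the distances to groups at random."""
    z = distances_to_centroids(points, labels)
    sizes = [len(v) for v in z.values()]
    pooled = [d for v in z.values() for d in v]
    observed = one_way_f(list(z.values()))
    rng = random.Random(seed)
    hits = 1  # the observed allocation counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(pooled)
        start, regrouped = 0, []
        for n in sizes:
            regrouped.append(pooled[start:start + n])
            start += n
        if one_way_f(regrouped) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)
```

A small p-value from `dispersion_test(points, labels)` would suggest that the groups differ in spread, which is the situation examined in the rest of this section.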

Effects of heterogeneous dispersions on dissimilarity-based tests

ANOSIM

Anderson & Walsh (2013) did a simulation study to investigate how ANOSIM and PERMANOVA would be affected by variation in multivariate dispersions. They found that ANOSIM was very strongly affected by heterogeneity. Specifically, the ANOSIM test is sensitive to:

differences in location (centroids);
differences in dispersion; and/or
differences in shape.

ANOSIM's null hypothesis may be put simply as 'there are no differences among the groups', so any of these types of differences (individually or collectively) might trigger a significant result in an ANOSIM test. Although ANOSIM is more likely to reject the null hypothesis for changes in location (centroid), it does not set out to be a test for differences in location only - it is a test of any differences between groups that might render them 'distinctive'. Indeed, the R statistic in ANOSIM might best be regarded as a measure of the distinctiveness of the groups (see Clarke (1993) and Warwick & Clarke (1993)).
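The following minimal Python sketch shows why R responds to any kind of distinctiveness: it is built purely from the ranks of the dissimilarities, comparing the average rank between groups with the average rank within groups. It uses the usual definition of R (the difference in mean ranks divided by N(N-1)/4, for N samples). The function names `average_ranks`, `anosim_r` and `anosim_test` are hypothetical, and this is an illustration rather than PRIMER's implementation.

```python
import random
from itertools import combinations


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def anosim_r(dist, labels):
    """ANOSIM R from a square dissimilarity matrix and one label per sample."""
    pairs = list(combinations(range(len(labels)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if labels[i] == labels[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if labels[i] != labels[j]]
    n = len(labels)
    mean_between = sum(between) / len(between)
    mean_within = sum(within) / len(within)
    return (mean_between - mean_within) / (n * (n - 1) / 4)


def anosim_test(dist, labels, n_perm=999, seed=0):
    """R and a permutation p-value obtained by shuffling the group labels."""
    rng = random.Random(seed)
    observed = anosim_r(dist, labels)
    shuffled = list(labels)
    hits = 1  # the observed labelling counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if anosim_r(dist, shuffled) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Two well-separated groups on a line: every between-group dissimilarity
# exceeds every within-group one, so R reaches its maximum.
points = [0, 1, 2, 10, 11, 12]
dist = [[abs(a - b) for b in points] for a in points]
print(anosim_r(dist, ["A", "A", "A", "B", "B", "B"]))  # 1.0
```

Nothing in this calculation refers to centroids, so a difference in location, dispersion or shape that changes the rank order of within-group versus between-group dissimilarities can move R away from zero.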

PERMANOVA

In contrast, PERMANOVA is much more akin to classical ANOVA. It performs a partitioning of the variability in the space of the resemblance measure, and therefore is focused much more strongly on detecting shifts in location. PERMANOVA tests the more specific null hypothesis: 'there are no differences among the group centroids' in that space. The behaviour of PERMANOVA in the face of heterogeneous dispersions also mirrors what has been found for the classical univariate $F$ test. Specifically, Anderson & Walsh (2013) found that PERMANOVA was not affected by heterogeneous dispersions if the design was balanced (equal sample sizes per group). However, if the design was unbalanced (unequal sample sizes per group), then, precisely as in a univariate $F$ test, PERMANOVA was:

conservative (yielding an inflated Type II error rate) if a group (or groups) with a large sample size also had large variation relative to other groups; and
liberal (yielding an inflated Type I error rate) if a group (or groups) with a small sample size also had large variation relative to other groups.

In other words, if a group with a large sample size is greatly dispersed, then it will be very difficult to detect a true shift in the centroids; the large within-group dispersion of that group will dominate the analysis (Fig. 6.2a). On the other hand, if a group with a small sample size has large dispersion, then even small differences in the sample centroids that are entirely due to random sampling might look relatively large (and be detected as significant) relative to the small within-group dispersion seen in other groups (Fig. 6.2b).

Fig. 6.2. Schematic diagram showing how imbalance can affect tests for differences in centroid in PERMANOVA: (a) large dispersion in a group with a large sample size will increase Type II error, hence decrease the power of the test; (b) large dispersion in a group with a small sample size will increase Type I error.
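The behaviour summarised in Fig. 6.2 can be explored with a small simulation. The sketch below is a hedged illustration in Python, not the PERMANOVA routine in PRIMER and not the simulation design of Anderson & Walsh (2013). It computes a pseudo-F ratio by partitioning sums of squared dissimilarities and obtains p-values by permuting group labels. The function names (`pseudo_f`, `type_i_error`), the use of Euclidean distance on bivariate normal data, and the particular sample sizes and standard deviations are all assumptions chosen for illustration. Every group is simulated with the same true centroid, so any rejection is a Type I error.

```python
import random
from itertools import combinations


def pseudo_f(d2, labels):
    """Pseudo-F from squared dissimilarities d2[i][j] and one label per sample.

    Total SS    = (sum of all squared dissimilarities) / N
    Within SS   = for each group, (sum of its squared dissimilarities) / n_group
    Among SS    = Total SS - Within SS
    """
    n_total = len(labels)
    groups = {}
    for i, g in enumerate(labels):
        groups.setdefault(g, []).append(i)
    ss_total = sum(d2[i][j] for i, j in combinations(range(n_total), 2)) / n_total
    ss_within = sum(
        sum(d2[i][j] for i, j in combinations(idx, 2)) / len(idx)
        for idx in groups.values()
    )
    a = len(groups)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n_total - a))


def type_i_error(sizes, sds, n_sims=100, n_perm=99, alpha=0.05, n_vars=2, seed=1):
    """Proportion of simulated data sets (no true centroid difference) in which
    the permutation test rejects at level alpha."""
    rng = random.Random(seed)
    labels = [g for g, n in enumerate(sizes) for _ in range(n)]
    rejections = 0
    for _ in range(n_sims):
        # All groups share the same true centroid (the origin); only spread differs.
        pts = [[rng.gauss(0.0, sds[g]) for _ in range(n_vars)] for g in labels]
        d2 = [[sum((u - v) ** 2 for u, v in zip(p, q)) for q in pts] for p in pts]
        observed = pseudo_f(d2, labels)
        shuffled = labels[:]
        hits = 1  # the observed labelling counts as one permutation
        for _ in range(n_perm):
            rng.shuffle(shuffled)
            if pseudo_f(d2, shuffled) >= observed:
                hits += 1
        if hits / (n_perm + 1) <= alpha:
            rejections += 1
    return rejections / n_sims
```

For example, `type_i_error((15, 15), (4.0, 1.0))` gives the balanced case, `type_i_error((25, 5), (4.0, 1.0))` pairs the large group with the large dispersion, and `type_i_error((5, 25), (4.0, 1.0))` pairs the small group with the large dispersion. Comparing the three rejection rates with the nominal 0.05 shows the conservative and liberal behaviour described above; the exact values depend on the seed and on the number of simulations.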
