6.3 The Behrens-Fisher problem (BFP)

Overview

The Behrens-Fisher problem (BFP) is one of the oldest puzzles in statistics (Behrens (1929); Fisher (1935); Welch (1938)). The essence of the problem is how to make a valid comparison of the means of two or more populations (groups) when their variances differ. The assumption of a common variance is built right into the ANOVA $F$ statistic itself. In the one-way case, for example, the $F$ ratio uses a single pooled estimate of the error variance (i.e., the residual mean square) as its denominator.
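
To make this concrete, the classical one-way $F$ ratio can be written as below. This is the standard textbook form rather than a formula quoted from this section, and the notation is an assumption made here for illustration: $a$ groups, group sizes $n_i$ with $N = \sum_{i=1}^a n_i$, group means $\bar{y}_ {i \cdot}$, overall mean $\bar{y}_ {\cdot\cdot}$, and group sample variances $s_i^2$.

$$ F = \frac{ \sum_{i=1}^a n_i (\bar{y}_ {i \cdot} - \bar{y}_ {\cdot\cdot})^2 / (a-1) } { \sum_{i=1}^a (n_i - 1) s_i^2 / (N-a) } $$

The denominator pools the $a$ separate sample variances into one value, weighting each by its degrees of freedom $(n_i - 1)$. That pooled value estimates a single error variance only if all groups actually share one.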

There are quite a few solutions to the Behrens-Fisher problem for univariate data (e.g., see Wang (1971), Brown & Forsythe (1974), Clinch & Keselman (1982), Weerahandi (1993) and Ghosh & Kim (2001)), yet all generally assume normality of errors.

Below we shall outline a solution to the univariate BFP proposed by Brown & Forsythe (1974) , as it points the way towards a more generalised solution to the BFP for multivariate data in dissimilarity-based analyses, using PERMANOVA.

The Brown & Forsythe (1974) solution to the BFP

Brown & Forsythe (1974) proposed a modification of the classical univariate $F$ ratio in which the numerator remains the usual among-group sum of squares (group means weighted by $n_i$), while the denominator is replaced by a weighted sum of the separate group variances $s_i^2$. The weights are chosen to ensure that numerator and denominator have the same expectation under a true null hypothesis, even when the variances differ.

The resulting modified test statistic is given by them as:

$$ F_{\tiny{BF}} = \frac{ \sum_{i=1}^a n_i (\bar{y}_ {i \cdot} - \bar{y}_ {\cdot\cdot})^2 } { \sum_{i=1}^a (1 - n_i / N) s_i^2 } $$

Here $\bar{y}_ {i \cdot}$ and $s_i^2$ are the sample mean and sample variance of group $i$, $n_i$ is its sample size, $\bar{y}_ {\cdot\cdot}$ is the overall mean, and $N = \sum_{i=1}^a n_i$. Under the other classical ANOVA assumptions (independent, normally distributed errors), a $p$-value can be obtained by comparing this modified test statistic to an $F$ distribution having $(a-1)$ and $f$ degrees of freedom (defined implicitly by the Satterthwaite (1941) approximation), where:

$$ f = \frac{1} { \sum_{i=1}^a c_i^2 / (n_i - 1)} $$

and

$$ c_i = \frac { (1-n_i/N)s_i^2 } { \sum_{i = 1}^a (1 - n_i/N)s_i^2 } $$
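
The calculation of $F_{\tiny{BF}}$ and its denominator degrees of freedom $f$ can be sketched in a few lines of Python. This is a minimal illustration of the formulas above, not the PERMANOVA implementation; the function names `brown_forsythe` and `brown_forsythe_pvalue` are hypothetical, and the $p$-value helper assumes that SciPy is installed.

```python
from statistics import mean, variance


def brown_forsythe(groups):
    """Return (F_BF, a - 1, f) for a list of groups of univariate values.

    Illustrative sketch only: each group needs at least two values so
    that its sample variance s_i^2 is defined.
    """
    a = len(groups)
    n = [len(g) for g in groups]
    N = sum(n)
    group_means = [mean(g) for g in groups]
    s2 = [variance(g) for g in groups]  # unbiased sample variances s_i^2
    grand_mean = sum(sum(g) for g in groups) / N

    # Numerator: the usual among-group sum of squares (weights n_i).
    numerator = sum(ni * (m - grand_mean) ** 2
                    for ni, m in zip(n, group_means))

    # Denominator: sum of (1 - n_i / N) * s_i^2 over the groups.
    terms = [(1 - ni / N) * v for ni, v in zip(n, s2)]
    denominator = sum(terms)
    f_bf = numerator / denominator

    # Satterthwaite-type denominator degrees of freedom f.
    c = [t / denominator for t in terms]
    f = 1.0 / sum(ci ** 2 / (ni - 1) for ci, ni in zip(c, n))
    return f_bf, a - 1, f


def brown_forsythe_pvalue(groups):
    """Upper-tail p-value from an F distribution with (a - 1, f) df."""
    from scipy.stats import f as f_dist  # assumption: SciPy is available
    f_bf, df1, df2 = brown_forsythe(groups)
    return f_dist.sf(f_bf, df1, df2)
```

For a balanced design the denominator reduces to $(a-1)$ times the residual mean square, so $F_{\tiny{BF}}$ coincides with the classical $F$ ratio. For instance, `brown_forsythe([[1, 2, 3], [2, 3, 4], [4, 5, 6]])` gives a statistic of 7 on 2 and 6 degrees of freedom (up to floating-point rounding), which matches the classical one-way ANOVA for those data. The two statistics differ when group sizes are unequal.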

Next, we shall see how a similar modification to the PERMANOVA pseudo-$F$ statistic can be constructed to allow for heterogeneous dispersions in dissimilarity-based settings as well.
