7.1 Overview - Finite factors

ANOVA is one of the most widely used statistical techniques. It partitions the measured variation of a random variable in response to one or more factors in complex experimental designs and sampling programmes. A factor is a categorical variable that identifies groups or levels that are either of special interest to the researcher (e.g., treatments vs controls) or that contribute a potentially important source of variation in the study design (e.g., sites). To make rigorous inferences in multi-factorial ANOVA settings, we need to ascertain, for every factor in a given experiment or sampling protocol, whether that factor is fixed or random. Classically, the levels of a fixed factor are viewed as finite, while those of a random factor are viewed as being drawn randomly from an infinite (or, at least, uncountably large) population of possible levels. The choice of whether any given factor is fixed or random is therefore viewed as a dichotomy. Making an appropriate decision about this for every factor in the design, before embarking on any statistical analysis, is essential for dissimilarity-based PERMANOVA, just as it is for univariate ANOVA. These choices have important consequences for the results and for the inferences that can be drawn from them.

What happens if we have a factor that would typically be treated as random, but whose population of possible levels is finite? If we can sample all of the levels, we might simply treat the factor as fixed. But what if we cannot sample all of them, yet can sample a substantial fraction? Anderson et al. (2025) describe how the dichotomy of fixed vs random can instead be viewed as a progression, which depends on how much of the population of possible levels of a given factor has actually been sampled (i.e., the sampling fraction).

Finite factors tend to occur at large spatial scales. For example, suppose there is a cluster of 20 islands in a given region, and suppose 10 of these have undergone some intensive restoration of habitat. We may not be able to sample all of the islands, but perhaps we can sample 4 restored and 4 unrestored islands (in each case, out of a possible 10). By treating the factor 'Islands' as 'finite' and specifying the size of the population, and hence the sampling fraction (here, 4/10 within each restoration group), we obtain stronger and more powerful inferences regarding the effectiveness of the restoration than we would if we were to treat 'Islands' as random.
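
The arithmetic behind the sampling fraction in the island example can be sketched in a few lines of Python. This is an illustration only: it is not taken from Anderson et al. (2025) or from the PERMANOVA software. It assumes the classical finite-population correction from survey sampling, 1 - n/N, as a simple stand-in for the idea that a factor behaves less like a random factor as more of its possible levels are sampled. The function names are hypothetical.

```python
from fractions import Fraction


def sampling_fraction(n_sampled: int, n_population: int) -> Fraction:
    """Fraction of the finite population of levels that was sampled (n/N)."""
    if n_population <= 0:
        raise ValueError("population size must be positive")
    if not 0 <= n_sampled <= n_population:
        raise ValueError("sampled levels must lie between 0 and the population size")
    return Fraction(n_sampled, n_population)


def finite_population_correction(n_sampled: int, n_population: int) -> Fraction:
    """Classical survey-sampling correction 1 - n/N (an assumption of this sketch)."""
    return 1 - sampling_fraction(n_sampled, n_population)


# Island example from the text: 4 islands sampled out of 10 in each group.
f = sampling_fraction(4, 10)
correction = finite_population_correction(4, 10)
print(f"sampling fraction = {f}, correction = {correction}")

# The two ends of the progression, for comparison.
print("all 10 levels sampled:", finite_population_correction(10, 10))
print("no levels sampled:", finite_population_correction(0, 10))
```

Under this assumption, the correction equals 1 when the sampling fraction is negligible (the factor behaves as random) and 0 when every level is sampled (the factor behaves as fixed). The island example, with a sampling fraction of 4/10, sits between these two extremes.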
