1.2 Definitions of statistics

Given a set of values $\{ y_1, y_2, ..., y_n \}$ for any individual variable $Y$, the following summary statistics can be calculated by clicking on Tools > Summary Stats... in PRIMER 8:

Average: $\hspace{1mm}$ $\bar{y} = \sum_{i=1}^n{y_i} / n $, the average (or mean)
Median: $\hspace{1mm}$ $m$, the median value
Sum: $\hspace{1mm}$ $\sum_{i=1}^n{y_i}$, the sum of the values
Minimum: $\hspace{1mm}$ $\min(y_i)$, the minimum value
Maximum: $\hspace{1mm}$ $\max(y_i)$, the maximum value
Quantiles: $\hspace{1mm}$ $q_\alpha$, the value corresponding to a given $\alpha$-quantile in the empirical distribution of values. The quantile levels ($\alpha$) are chosen by the end-user and must each lie in the range (0, 1).
Range: $\hspace{1mm}$ the range; i.e., $(\max(y_i) - \min(y_i))$, the difference between the maximum and minimum values
IQR: $\hspace{1mm}$ the inter-quartile range; i.e., $(q_{0.75} - q_{0.25})$, the difference between the upper and lower quartiles.
Standard deviation: $\hspace{1mm}$ $s$, the standard deviation; i.e., the square root of the variance.
Variance: $\hspace{1mm}$ $s^2=\sum_{i=1}^n{(y_i - \bar{y})^2} / (n-1)$, an unbiased estimate of the variance.
Sample size: $\hspace{1mm}$ $n$, the number of values
Standard error: $\hspace{1mm}$ $\sqrt{s^2/n}$, the standard error of the mean
Symmetry: $\hspace{1mm}$ the $\alpha$-symmetry statistic, with $\alpha$ chosen by the end-user (default $\alpha = 0.05$), defined as $(m-q_{\alpha}) / (q_{1-\alpha} - q_{\alpha})$. For symmetric data, the median ($m$) is equidistant from the $\alpha$-quantile and the $(1-\alpha)$-quantile, so a value close to 0.5 indicates symmetry. A value < 0.5 indicates right-skewness, and a value > 0.5 indicates left-skewness.
Skewness: $\hspace{1mm}$ $k_3$, the skewness coefficient; i.e., $$ k_3 = \frac{ n \sum_{i=1}^n (y_i - \bar{y})^3 } { (n-1)(n-2) \cdot s^3 } $$ A value close to zero indicates symmetry. A positive value indicates right-skewness; a negative value indicates left-skewness. See Sheskin (2011).
Kurtosis: $\hspace{1mm}$ $k_4$, the kurtosis coefficient; i.e., $$ k_4 = \frac{ \frac{n(n+1)}{n-1} \sum_{i=1}^n (y_i - \bar{y})^4 - 3 \left[ \sum_{i=1}^n (y_i - \bar{y})^2 \right]^2 } { (n-2)(n-3) \cdot s^4 } $$ A value close to zero indicates a mesokurtic distribution. A positive value indicates a leptokurtic distribution (sharply peaked, with heavy tails). A negative value indicates a platykurtic distribution (flat-topped, with light tails). See Sheskin (2011).
Number of zeros: $\hspace{1mm}$ the number of zeros.
Singletons: $\hspace{1mm}$ the number of ones (useful for count data).
Doubletons: $\hspace{1mm}$ the number of twos (useful for count data).
Number of nonzeros (frequency): $\hspace{1mm}$ the number of non-zero values; e.g., if the variable consisted of counts of an organism, this would be the frequency of occurrences of that organism across the set of values (samples).
Smallest number above threshold: $\hspace{1mm}$ the smallest value in the set that occurs above a specified threshold value ($y_t$), chosen by the end-user. For example, to obtain the smallest non-zero value in a set of non-negative values, specify $y_t=0$. As another example, suppose a variable consists of lead (Pb) concentrations measured in sediment. It may be useful to identify the smallest concentration recorded above the detection limit of the instrument. Knowing the smallest non-zero (or detected) value can help in choosing an appropriate constant ($c$) to add for a transformation such as $\log(y+c)$ when the variable contains zero values.
Largest number below threshold: $\hspace{1mm}$ the largest value in the set that occurs below a specified threshold value ($y_t$), chosen by the end-user. This option has similar uses to the previous one, but for non-positive data.
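
The definitions above can be sketched in a few lines of Python. This is an illustration only, not PRIMER 8's implementation: the function names `quantile` and `summary_stats`, the dictionary keys, and the example counts are hypothetical, and the quantile rule (linear interpolation between order statistics) is an assumption, so quantile-based results may differ slightly from those reported by the software.

```python
import math
import statistics


def quantile(values, alpha):
    """Empirical alpha-quantile by linear interpolation between order statistics.

    Assumption: this is one common convention; PRIMER 8 may use another.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie in (0, 1)")
    ordered = sorted(values)
    pos = alpha * (len(ordered) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def summary_stats(values, alpha=0.05, threshold=0.0):
    """Return the summary statistics defined in this section as a dict."""
    n = len(values)
    s2 = statistics.variance(values) if n > 1 else 0.0  # divisor n - 1
    if n < 4 or s2 == 0:
        raise ValueError("need at least 4 values that are not all equal")
    mean = statistics.mean(values)
    median = statistics.median(values)
    s = math.sqrt(s2)

    # Sums of powers of deviations from the mean.
    dev2 = sum((y - mean) ** 2 for y in values)
    dev3 = sum((y - mean) ** 3 for y in values)
    dev4 = sum((y - mean) ** 4 for y in values)

    q_lo, q_hi = quantile(values, alpha), quantile(values, 1 - alpha)
    above = [y for y in values if y > threshold]
    below = [y for y in values if y < threshold]

    return {
        "average": mean,
        "median": median,
        "sum": sum(values),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "iqr": quantile(values, 0.75) - quantile(values, 0.25),
        "standard_deviation": s,
        "variance": s2,
        "sample_size": n,
        "standard_error": math.sqrt(s2 / n),
        # Undefined (None) when the two quantiles coincide.
        "symmetry": (median - q_lo) / (q_hi - q_lo) if q_hi != q_lo else None,
        "skewness": n * dev3 / ((n - 1) * (n - 2) * s ** 3),
        "kurtosis": (dev4 * n * (n + 1) / (n - 1) - 3 * dev2 ** 2)
        / ((n - 2) * (n - 3) * s ** 4),
        "zeros": sum(1 for y in values if y == 0),
        "singletons": sum(1 for y in values if y == 1),
        "doubletons": sum(1 for y in values if y == 2),
        "nonzeros": sum(1 for y in values if y != 0),
        # None when no value lies on the requested side of the threshold.
        "smallest_above_threshold": min(above) if above else None,
        "largest_below_threshold": max(below) if below else None,
    }


# Hypothetical counts of one organism across nine samples.
counts = [0, 0, 1, 2, 2, 3, 5, 8, 13]
stats = summary_stats(counts)
for name, value in stats.items():
    print(f"{name}: {value}")
```

For these made-up counts, the single large value (13) pulls the upper tail out, so the symmetry statistic falls below 0.5 and the skewness coefficient is positive, both pointing to right-skewness. With `threshold=0.0`, the smallest value above the threshold is the smallest non-zero count, which is the quantity suggested above for choosing $c$ in $\log(y+c)$.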
