Statistical modelling

By integrating analyses at the genetic and species levels, we expect to discover emergent patterns that could shed light on the processes that drive biodiversity.

Studying sequence data from bulk samples permits a focus on communities in addition to species-based analyses.

Our general model of biodiversity for interpreting site-based sequence data is based on stochastic (random) dispersal.

Species-genetic diversity correlation (SGDC)

For a single site, neutral processes predict that total species diversity (a function of the size, connectivity and age of the habitat patch) is positively correlated with the genetic diversity of each local species (a function of population size and age).

SGDC provides a framework for testing diversity patterns at multiple hierarchical levels.  

Diversity at both species and genetic levels can be assessed based on sequence data from bulk samples, using established procedures of sequence-based species delimitation and single nucleotide polymorphisms (SNPs) to assess variation within each of these entities.
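As a minimal sketch of how an SGDC could be computed, the example below assumes that the bulk-sample sequences have already been assigned to species and haplotypes. The site names, the haplotype labels, and the choice of "mean number of haplotypes per species" as the genetic diversity measure are illustrative assumptions, not the procedure used in the project.

```python
from math import sqrt

def site_diversity(site):
    """Return (species richness, mean number of distinct haplotypes per species)."""
    richness = len(site)
    mean_haplotypes = sum(len(set(h)) for h in site.values()) / richness
    return richness, mean_haplotypes

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: site -> species -> haplotypes observed at that site.
sites = {
    "site_1": {"A": ["h1", "h2", "h3"], "B": ["h4", "h5", "h6"], "C": ["h7", "h8", "h9"]},
    "site_2": {"C": ["h7", "h10"], "D": ["h11", "h12"]},
    "site_3": {"A": ["h1", "h2"]},
}

richness, genetic = zip(*(site_diversity(s) for s in sites.values()))
sgdc = pearson(richness, genetic)  # positive when both levels covary across sites
print(f"SGDC (Pearson r) = {sgdc:.3f}")
```

With real data, nucleotide diversity estimated from SNPs could replace the haplotype count, and a rank correlation may be preferable when the relationship is not linear.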

While highly suited to the site-based approach to the study of alpha diversity at multiple levels, the framework can be extended to patterns of turnover (beta diversity). 

If continuous over time, stochastic dispersal is expected to result in a continuum of divergence among sites:

  • individuals are close to their parents at birth and follow a trajectory of limited dispersal
  • patterns of clumping and nestedness arise over time with the accumulation of mutations

As a consequence, similarity of communities decreases with spatial distance, and it does so in concert across hierarchical levels. Hence, spatial patterns evident at the lowest hierarchical (haplotype) level should mirror those at higher (species) levels, but at larger scales.
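The decay of similarity with distance can be compared across levels by fitting a decay rate to each. The sketch below assumes an exponential decay, so that ln(similarity) falls linearly with distance; that functional form, the distances, and the decay rates are hypothetical values chosen for illustration only.

```python
from math import exp, log

def decay_rate(distances, similarities):
    """Least-squares slope of ln(similarity) against distance.

    Similarities must be greater than zero for the logarithm to be defined.
    """
    logs = [log(s) for s in similarities]
    n = len(distances)
    mean_d = sum(distances) / n
    mean_l = sum(logs) / n
    num = sum((d - mean_d) * (l - mean_l) for d, l in zip(distances, logs))
    den = sum((d - mean_d) ** 2 for d in distances)
    return num / den

# Hypothetical pairwise distances (km) and similarities at two levels.
distances = [1, 5, 10, 50, 100]
species_similarity = [exp(-0.005 * d) for d in distances]
haplotype_similarity = [exp(-0.05 * d) for d in distances]

species_rate = decay_rate(distances, species_similarity)
haplotype_rate = decay_rate(distances, haplotype_similarity)

# Ratio of the two rates: how much further apart sites must be for species
# similarity to drop as far as haplotype similarity does.
scale_ratio = haplotype_rate / species_rate
print(species_rate, haplotype_rate, scale_ratio)
```

If the two levels follow the same process, the two fitted curves should have the same shape and differ only in the spatial scale, which is what the ratio summarises.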

Diagram of SGDC

The correlation of species diversity and intraspecific genetic diversity

Panel A. Local communities, shown as white rectangles, are composed of multiple species (large coloured rectangles A-D), each with multiple haplotypes (small rectangles 1-12). Note that higher species numbers are correlated with higher numbers of genotypes per species. Species may not be shared between the two communities, resulting in β-diversity; these species (A, B, D) reduce the similarity between communities. Likewise, haplotypes of shared species may contribute to turnover at the haplotype level (i.e. 7, 8, 9). The metacommunity is connected among sampling sites by migration m, speciation and extinction, whose rates depend on the abundance of a species and on connectivity.

Panel B. Plots of pairwise community similarity (derived from the species and haplotypes shared, as shown in Panel A) as a function of geographic distance between these communities. The example shows greater community similarity at the species level than at the haplotype level, i.e. the β-diversity is greater for haplotypes. The predicted self-similarity at both hierarchical (temporal) levels over different spatial scales is shown by the arrows. Note that one species out of 4 is shared between the two communities (S=1/4) but only 1 haplotype out of 12 (S=1/12).
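The similarity values quoted for the two communities can be reproduced with a short calculation. The sketch below assumes that S is the number of shared entities divided by the total number present in either community (the Jaccard index). The assignment of haplotypes to species and communities is a hypothetical reconstruction chosen to match the counts in the caption, since the figure itself is not reproduced here.

```python
def jaccard(a, b):
    """Shared entities divided by all entities present in either community."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Hypothetical layout: species -> haplotypes found in each community.
# Species C is shared; of its haplotypes only number 10 occurs in both.
community_1 = {"A": {1, 2, 3}, "B": {4, 5, 6}, "C": {7, 8, 10}}
community_2 = {"C": {9, 10}, "D": {11, 12}}

# Species level: compare the species names (the dictionary keys).
species_similarity = jaccard(community_1, community_2)

# Haplotype level: pool the haplotypes of all species in each community.
haplotypes_1 = set().union(*community_1.values())
haplotypes_2 = set().union(*community_2.values())
haplotype_similarity = jaccard(haplotypes_1, haplotypes_2)

print(species_similarity, haplotype_similarity)
```

Other similarity indices (for example Sørensen) would give different values from the same data; the Jaccard form is used here only because it matches the fractions given in the caption.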