class: center, middle, inverse, title-slide .title[ # Introduction to model-based distribution estimation & random fields ] .subtitle[ ## DFO DSAF workshop ] .author[ ### ] .date[ ### January 12–16 2026 ] --- <!-- Build with: xaringan::inf_mr() --> # Why study distribution shifts? Species distributions are changing in response to climate, fishing pressure, and habitat changes -- Management and conservation are aided by understanding how distributions are changing and may change in the future -- Cross-border coordination requires understanding range-wide distributions -- Shifting distributions can cross management boundaries, affecting quota allocations --- ### We have wonderful scientific surveys, but... Surveys don't always cover the same areas every year -- Sampling effort varies spatially and temporally -- Even faithfully implemented random or random stratified designs introduce spatial sampling noise -- We often want to combine multiple surveys -- Different surveys may use different gear, vessels, or protocols -- Raw survey catch data can't reliably tell us about true distribution changes vs. sampling artifacts --- ### Raw sampling data can confuse sampling changes with distribution changes <img src="images/velocity-fig.png" width="75%" /> .xtiny[ Pinsky, M.L., Worm, B., Fogarty, M.J., Sarmiento, J.L., and Levin, S.A. 2013. Marine taxa track local climate velocities. Science 341(6151): 1239–1242. doi:10.1126/science.1239352. ] --- ### Species ranges often cover multiple surveys .pull-left[ <img src="images/survey-join-map.png" width="55%" /> ] .pull-right[ <img src="images/dogfish-svc-simple.png" width="55%" /> ] .xtiny[ Davidson, L.N.K., English, P.A., King, J., Grant, P.B.C., Taylor, I.G., Barnett, L.A.K., Gertseva, V., Tribuzio, C.A., and Anderson, S.C. 2026. Mystery of the disappearing dogfish: transboundary analyses reveal steep population declines across the northeast pacific with little evidence for regional redistribution. Fish and Fisheries 27(1): 1–12. doi:10.1111/faf.70028. ] --- # The solution: model-based metrics .large[ 1. Fit a statistical spatiotemporal ("dynamic") species distribution model 2. Predict to a constant spatial grid (standardizes for spatial, gear, and effort variation in sampling) 3. Calculate distribution metrics from those predictions with uncertainty ] .tiny[ Thorson, J.T., Pinsky, M.L., and Ward, E.J. 2016. Model-based inference for estimating shifts in species distribution, area occupied and centre of gravity. Methods Ecol Evol 7(8): 990–1002. doi:10.1111/2041-210X.12567. ] --- # Metrics we'll focus on .large[ Center of gravity Density-weighted habitat variables Range edges Effective area occupied Spatially varying trends Subregional population indices ] --- class: center, middle, inverse # The first and most complicated part:<br><br>Fitting a useful model --- # Fitting a useful model To calculate reliable distribution metrics, we need to understand how to fit models to large spatiotemporal survey datasets -- Spatial data has challenges to modelling -- Temporal data has challenges to modelling -- Spatiotemporal data has *a lot* to think to think about when modelling -- That's going to be the focus of much of this workshop --- class: middle, inverse # To make reliable inference on species distribution changes, we first need a good model-based representation of species distribution --- class: middle # sdmTMB is an R package that (hopefully) helps you fit a reasonable spatiotemporal model to species distribution data. # Among many uses, it also helps you summarize distribution metrics from the model output. --- class: center, middle, inverse # A few example uses related to distribution or distribution change --- # Groundfish climate velocities .center[ <img src="images/english-velocity.png" width="90%" /> ] .tiny[ English, P.A., Ward, E.J., Rooper, C.N., Forrest, R.E., Rogers, L.A., Hunter, K.L., Edwards, A.M., Connors, B.M., and Anderson, S.C. 2021. Contrasting climate velocity impacts in warm and cool locations show that effects of marine warming are worse in already warmer temperate waters. Fish and Fisheries 23(1): 239–255. doi:10.1111/faf.12613. ] --- ##### Integrating dive, trawl, and trap surveys for Dungeness crab .center[ <img src="images/nephin.png" width="50%" /> ] .xtiny[ Thompson, P.L., Anderson, S.C., Nephin, J., Robb, C.K., Proudfoot, B., Park, A.E., Haggarty, D.R., and Rubidge, E.M. 2023. Integrating trawl and longline surveys across British Columbia improves groundfish distribution predictions. Can. J. Fish. Aquat. Sci. 80(1): 195–210. doi:10.1139/cjfas-2022-0108. ] --- ##### Quantifying expected bycatch ratios based from spatiotemporal distribution to inform legal expansion of Indigenous fishing rights .pull-left[ <img src="images/halibut-ye.jpg" width="100%" /> ] .pull-right[ <br> <br> .tiny[ English, P.A., Picco, C.M., Edwards, J.C., Haggarty, D.R., Forrest, R.E., and Anderson, S.C. 2023. Spatial restrictions hinder avoidance of choke species in an Indigenous rights-based fishery. People and Nature 6: 75–90. <https://doi.org/10.1002/pan3.10554> ] ] --- #### Thermal niche changes for Pacific groundfish .center[ <img src="images/thermal-niche.png" width="58%" /> ] .xtiny[ Ward, E.J., et al. 2024. Win, lose, or draw: Evaluating dynamic thermal niches of northeast Pacific groundfish. PLOS Climate 3(11): e0000454. Public Library of Science. doi:10.1371/journal.pclm.0000454. ] --- ##### Biomass and condition distribution changes for Baltic Sea cod .center[ <img src="images/condition-sweden.png" width="65%" /> ] .xtiny[ Lindmark, M., Anderson, S.C., Gogina, M., and Casini, M. 2023. Evaluating drivers of spatiotemporal variability in individual condition of a bottom-associated marine fish, Atlantic cod (*Gadus morhua*). ICES Journal of Marine Science 80(5): 1539–1550. <https://doi.org/10.1093/icesjms/fsad084>. ] --- #### Oxygen limitation effects on fish distribution .center[ <img src="images/indivero3.png" width="50%" /> ] .xtiny[ Indivero, J., Anderson, S.C., Barnett, L.A.K., Thorson, J.T., Siedlecki, S., Ward, E.J., Essington, T.E. Oxygen constrains local densities across diverse marine bottom-associated marine fishes. Preprint posted soon. Indivero, J., Anderson, S.C., Barnett, L.A.K., Essington, T.E., and Ward, E.J. 2025. Estimating a physiological threshold to oxygen and temperature from marine monitoring data reveals challenges and opportunities for forecasting distribution shifts. Ecography 2025(e07413). doi:10.1111/ecog.07413. ] --- #### More than just fish!<br>52,000 years of woolly rhinoceros population dynamics .center[ <img src="images/woolly.png" width="52%" /> ] .xtiny[ Fordham, D.A., Brown, S.C., Canteri, E., Austin, J.J., Lomolino, M.V., Haythorne, S., Armstrong, E., Bocherens, H., Manica, A., Rey-Iglesia, A., Rahbek, C., Nogués-Bravo, D., and Lorenzen, E.D. 2024. 52,000 years of woolly rhinoceros population dynamics reveal extinction mechanisms. Proc. Natl. Acad. Sci. U.S.A. 121(24): e2316419121. doi:10.1073/pnas.2316419121. ] --- #### Spatiotemporal dynamics of Swedish marine garbage .center[ <img src="images/garbage2.jpg" width="100%" /> ] .xtiny[ Norén, K., Svensson, F., and Lindmark, M. 2025. Evaluating the potential of underwater television to contribute to marine litter assessments alongside bottom trawling. PLoS One 20(6): e0324900. <https://10.1371/journal.pone.0324900>. ] --- ### Increasingly used by agencies for distribution modelling and index standardization * DFO * NOAA: NWFSC, AFSC, ... * Pacific Hake U.S.-Canada assessment * Norwegian and Swedish governments * NAFO; STACREC .tiny[Standing Committee on Research Coordination] * ICES * Inter-American Tropical Tuna Commission * International Scientific Committee (ISC) for tuna and billfish assessments --- class: center, middle, inverse # Workshop plan --- # Plan for the 3-day workshop Day 1: Intro to random fields, intro to sdmTMB, spatial models, spatiotemporal models, model comparison/validation, residuals -- Day 2: Families, delta models, time-varying effects, spatially varying effects, index standardization -- Day 3: Integrated models, forecasting, simulating, **metrics of distribution change**, future directions, closing thoughts --- # Plan for the workshop Each day: a mix of lectures and exercises, breaks as needed, talks on applied sdmTMB uses Have questions? Please ask any time or in the [Google Doc](https://docs.google.com/document/d/19zveT73HVWa73UhxsAcDntAhrBDX_F327otqve7RQ4Y/edit?tab=t.0). Thanks! --- class: center, middle, inverse # Modelling spatial/spatiotemporal data --- # Geostatistical data We're focused on **geostatistical** data: observations of a continuous spatial process at georeferenced locations -- I.e., we pick spatial coordinates to sample at and observe something, like the number or weight of fish (as opposed to "point process" data or "areal" data) <img src="01-intro_files/figure-html/field-samples-1.png" width="700px" style="display: block; margin: auto;" /> --- # Modelling spatial data Data often have spatial attributes Ideal world: 1. Plug spatial covariates into a regression model 2. Residuals are uncorrelated <img src="01-intro_files/figure-html/sim-rf-intro-1.png" width="700px" style="display: block; margin: auto;" /> --- # Reality Residual spatial autocorrelation is pervasive. -- Alternatively, there are unmodelled, missing, or "latent" spatially structured covariates <img src="01-intro_files/figure-html/sim-rf-intro-cor-1.png" width="700px" style="display: block; margin: auto;" /> --- # Why worry about these latent spatial variables? 1. Often they are of ecological interest -- 2. Ignoring them hurts our predictions -- 3. Ignoring them messes up our quantification of uncertainty --- # Modeling latent spatial variables Need a 'wiggly' surface for approximating all missing spatially structured variables -- Several approaches exist. For example: * 2D smoothers with GAMs (e.g., mgcv) * Gaussian random fields * Random forests etc. .tiny[ Miller, D.L., Glennie, R., and Seaton, A.E. 2019. Understanding the Stochastic Partial Differential Equation approach to smoothing. JABES. Stock, B.C., Ward, E.J., Eguchi, T., Jannot, J.E., Thorson, J.T., Feist, B.E., and Semmens, B.X. 2019. Comparing predictions of fisheries bycatch using multiple spatiotemporal species distribution model frameworks. CJFAS. ] --- # Why focus on Gaussian random fields? Random fields are nice because -- 1. They are very flexible in the shapes they can fit -- 2. They can be fit efficiently -- 3. They estimate ecologically interpretable parameters --- # What is a random field? <img src="01-intro_files/figure-html/random-field-demo-1.png" width="700px" style="display: block; margin: auto;" /> --- background-image: url("images/eagle.png") background-position: bottom right background-size: 35% # Random field <img src="images/rf-wikipedia.png" width="550px" /> --- background-image: url("images/beaker.png") background-position: bottom right background-size: 35% # Random field A 2 dimensional "Gaussian Process". A realization from a multivariate normal distribution constrained by some covariance function. --- background-image: url("images/elmo.png") background-position: bottom right background-size: 30% # Random fields A way of estimating a wiggly surface to account for spatial and/or spatiotemporal correlation in data. -- Alternatively, a way of estimating a wiggly surface to account for unobserved spatially structured variables. -- As a bonus, they provide useful parameter estimates: spatial variance (magnitude of wiggles) and the rate at which data become uncorrelated with distance (how smooth are the wiggles). --- # Correlation/covariance functions - These constrain the random field MVN distribution. -- - They define how correlation decays with distance. -- - Can be isotropic or anisotropic. -- - Common choices: exponential, Gaussian, Matérn. --- # Exponential correlation $$ \mathcal{C}\left(d\right) = e^{ -d/\rho } $$ - `\(d\)` is distance - `\(\rho\)` controls the rate of correlation decay with distance <img src="01-intro_files/figure-html/exp-cov-1.png" width="700px" style="display: block; margin: auto;" /> --- # Gaussian or squared exponential correlation $$ \mathcal{C}\left(d\right) = e^{ -\left(d/\rho\right)^{2} } $$ - `\(d\)` is distance - `\(\rho\)` controls the rate of correlation decay with distance <img src="01-intro_files/figure-html/gauss-cov-1.png" width="700px" style="display: block; margin: auto;" /> --- # Matérn covariance Flexible, can be exponential or Gaussian shape or other shapes <img src="01-intro_files/figure-html/matern-plot-1.png" width="700px" style="display: block; margin: auto;" /> --- # Matérn covariance .xsmall[ $$ \mathcal{C} \left(d \right) = \frac{\sigma^2}{2^{\nu - 1}\Gamma(\nu)} (\kappa d)^\nu K_\nu \kappa d \quad 😱 $$ ] .xsmall[ - `\(d\)` = distance - `\(\Gamma\)` = the gamma function - `\(K_\nu\)` = modified Bessel function of the 2nd kind(!) - `\(\kappa\)` = decorrelation rate - `\(\sigma^2\)` = variance - `\(\nu\)` is fixed at 1 to take advantage of SPDE ] --- # Matérn covariance range Range = distance where correlation has decayed<br>to `\(\sim\)` 0.13. `\(\mathrm{range} = \sqrt{8 \nu} / \kappa\)`, so if `\(\nu = 1\)`, `\(\mathrm{range} = \sqrt{8} / \kappa\)` <img src="01-intro_files/figure-html/matern-range-1.png" width="700px" style="display: block; margin: auto;" /> .tiny[ Lindgren, F., Rue, H., and Lindström, J. 2011. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73(4): 423–498. ] --- # Effects of changing variance and range <img src="01-intro_files/figure-html/sim-rf-grid-1.png" width="700px" style="display: block; margin: auto;" /> --- # Effects of adding noise * Large observation error looks like noise * `\(\sigma_{obs}\)` >> `\(\sigma_{O}\)` <img src="01-intro_files/figure-html/sim-rf-large_phi-1.png" width="700px" style="display: block; margin: auto;" /> --- # Small observation errors * `\(\sigma_{obs}\)` << `\(\sigma_{O}\)` <img src="01-intro_files/figure-html/sim-rf-small_phi-1.png" width="700px" style="display: block; margin: auto;" /> --- class: center, middle, inverse # SPDE approach: what you need to know<br>(*and a bit extra in case you find this stuff thrilling* 🤓) --- # Estimating Gaussian random fields .small[ Georeferenced data often involve 1000s or more points We need to approximate the spatial field Options include nearest neighbor methods, covariance tapering, predictive process models, binning, etc. sdmTMB uses the SPDE approach from INLA * for VAST users, this is the same * INLA books: <https://www.r-inla.org/learnmore/books> ] --- # INLA, fmesher, and the SPDE approach .xsmall[ * SPDE: stochastic partial differential equation ] -- .xsmall[ * A solution to a specific SPDE allows computing a precision matrix of a Gaussian *Markov* random field (GMRF) that is a good approximation to a Gaussian random field with **Matérn** covariance. ] -- .xsmall[ * We can estimate this sparse precision matrix efficiently. ] -- .xsmall[ * In short: the SPDE approach lets us efficiently estimate approximate Gaussian random fields by letting us work with GMRFs and their precision matrix. ] -- .xsmall[ * INLA/fmesher performs data wrangling for SPDE estimation * INLA also performs approximate Bayesian estimation * sdmTMB uses fmesher to wrangle matrices, but uses TMB for maximum marginal likelihood estimation ] .tiny[ Lindgren, F., Rue, H., and Lindström, J. 2011. An explicit link between Gaussian fields and Gaussian Markov random fields: the stochastic partial differential equation approach. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 73(4): 423–498. Miller, D.L., Glennie, R., and Seaton, A.E. 2019. Understanding the Stochastic Partial Differential Equation approach to smoothing. JABES. doi:10.1007/s13253-019-00377-z. ] --- # Introducing meshes The SPDE approach involves a triangulation over the area of interest; we call this a "mesh" or more accurately a "finite element mesh" <img src="01-intro_files/figure-html/mesh-example-1.png" width="700px" style="display: block; margin: auto;" /> --- # SPDE matrices (*extra*) With the mesh, we can calculate 3 sparse matrices needed to calculate the precision matrix `\(\boldsymbol{Q}\)` $$ \boldsymbol{Q} = \tau^2 \left(\kappa^4\boldsymbol{C_0} + 2\kappa^2\boldsymbol{G}_1 + \boldsymbol{G}_2 \right), $$ - `\(\boldsymbol{C_0}\)`, `\(\boldsymbol{G_1}\)`, and `\(\boldsymbol{G_2}\)` are these sparse matrices - `\(\tau\)` scales the precision matrix - `\(\kappa\)` is the decorrelation rate --- # Getting back from `\(\tau\)` to variance (*extra*) .xsmall[ With this precision matrix representation of the Matérn, we need a way of converting from `\(\tau\)` back to variance `\(\sigma^2\)`: ] .small[ `\(\sigma^2 = \frac{1}{\left (4 \pi \tau^2 \kappa^2 \right)}\)` ] .xsmall[ `\(\tau\)` and `\(\kappa\)` are parameters from the Matérn precision we will estimate. Note that `\(\tau\)` is inverse to `\(\sigma\)`. ] <img src="01-intro_files/figure-html/marg-variance-1.png" width="700px" style="display: block; margin: auto;" /> --- # Interpolation from mesh to data (*extra*) .small[ We usually interpolate from the mesh vertices (the random effects) to any observation or prediction locations using an interpolation (or "projection") matrix. ] This is often called the "A matrix". .xsmall[ ``` r loc <- as.matrix(sdmTMB::pcod[,c("X", "Y")]) mesh <- fmesher::fm_mesh_2d(loc = loc, cutoff = 5) A <- fmesher::fm_basis(mesh, loc = loc) nrow(loc) # number of data locations #> [1] 2143 mesh$n # number of mesh vertices #> [1] 563 dim(A) # interpolation matrix dimensions #> [1] 2143 563 ``` ] --- # Putting it all together (*extra*) .xsmall[ **Before fitting:** 1. Create a mesh 2. Calculate SPDE matrices 3. Calculate interpolation matrices ] -- .xsmall[ **Within the model:** 1. Use your SPDE matrices to construct the precision matrix 2. Evaluate the likelihood of the GMRF 3. Project from random effect vertices to the data locations 4. Add in anything else you want (e.g., other parameter effects) 4. Evaluate the data likelihood 5. Project from random effects vertices to any new prediction locations 6. Calculate derived quantities such as range and marginal variance ] --- class: center, middle, inverse background-image: url("images/flickr-9fTfup.jpg") background-position: center background-size: 100% <div class="footer">CC BY-ND 2.0: https://flic.kr/p/9fTfup</div> --- # The good news <br> <br> .xlarge[sdmTMB does those steps for you 🤓]