Skip to contents

Introduction

This vignette describes the statistical model underlying sdmTMB. Further details are available in the sdmTMB paper (Anderson et al. 2024), which should be referenced and cited when using sdmTMB in a publication.

Other vignettes available

If this vignette is being viewed on CRAN, note that many other vignettes describing how to use sdmTMB are available on the documentation site under Articles.

Notation conventions

This appendix uses the following notation conventions, which generally follow the guidance in Edwards & Auger-Méthé (2019):

  • Greek symbols for parameters,

  • the Latin/Roman alphabet for data except for cases where another symbol is used by convention,

  • bold symbols for vectors or matrices (e.g., 𝛚\boldsymbol{\omega} is a vector and ω𝐬\omega_{\mathbf{s}} is the value of 𝛚\boldsymbol{\omega} at point in space 𝐬\mathbf{s}), with Latin/Roman symbols in non-italics (e.g., 𝐗\mathbf{X}, 𝐐\mathbf{Q}),

  • ϕ\phi for all distribution dispersion parameters for consistency with the code,

  • 𝔼[y]\mathbb{E}[y] to define the expected value (mean) of variable yy,

  • Var[y]\mathrm{Var}[y] to define the expected variance of the variable yy,

  • a *^* superscript represents interpolated or projected values as opposed to values at knot locations (e.g., 𝛚\boldsymbol{\omega} vs. 𝛚*\boldsymbol{\omega}^*), and

  • where possible, notation has been chosen to match VAST (Thorson 2019) to maintain consistency (e.g., 𝛚\boldsymbol{\omega} for spatial fields and 𝛜t\boldsymbol{\epsilon}_t for spatiotemporal fields).

Tables of indices and symbols are summarized in notation reference tables at the end of this vignette.

sdmTMB model structure

The complete sdmTMB model can be written as

η𝐬,t=𝐗𝐬,tmain𝛃+O𝐬,t+𝐙g𝐛g+𝐗𝐬,ttvc𝛄t+𝐗𝐬,tsvc𝛇𝐬+ω𝐬+ϵ𝐬,t,μ𝐬,t=f1(η𝐬,t),𝔼[y𝐬,t]=μ𝐬,t, \begin{aligned} \eta_{\mathbf{s},t} &= \mathbf{X}^{\mathrm{main}}_{\mathbf{s},t} \boldsymbol{\beta}+ O_{\mathbf{s},t} + \mathbf{Z}_{g} \mathbf{b}_{g} + \mathbf{X}^{\mathrm{tvc}}_{\mathbf{s},t} \boldsymbol{\gamma}_{t} + \mathbf{X}^{\mathrm{svc}}_{\mathbf{s},t} \boldsymbol{\zeta}_{\mathbf{s}} + \omega_{\mathbf{s}} + \epsilon_{\mathbf{s},t},\\ \mu_{\mathbf{s},t} &= f^{-1} \left( \eta_{\mathbf{s},t} \right),\\ \mathbb{E}[y_{\mathbf{s},t}] &= \mu_{\mathbf{s},t}, \end{aligned}

where

  • y𝐬,ty_{\mathbf{s},t} represents the response data at point 𝐬\mathbf{s} and time tt;
  • η\eta represents the linear predictor (i.e., in link space)
  • μ\mu represents the conditional mean on the response/data scale (after applying the inverse link);
  • ff represents a link function (e.g., log or logit) and f1f^{-1} represents its inverse, and η\eta represents the linear predictor before applying f1f^{-1};
  • 𝐗main\mathbf{X}^{\mathrm{main}}, 𝐗tvc\mathbf{X}^{\mathrm{tvc}}, and 𝐗svc\mathbf{X}^{\mathrm{svc}} represent design matrices (the superscript identifiers ‘main’ = main effects, ‘tvc’ = time varying coefficients, and ‘svc’ = spatially varying coefficients);
  • 𝛃\boldsymbol{\beta} represents a vector of fixed-effect coefficients;
  • O𝐬,tO_{\mathbf{s},t} represents an offset: a covariate (usually log transformed) with a coefficient fixed at one that enters on the linear predictor scale (before the inverse link);
  • 𝐙g\mathbf{Z}_{g} represents the random-effect design row(s) for group gg (a subset of the matrix 𝐙\mathbf{Z}) corresponding to 𝐛g\mathbf{b}_{g}; these random effects enter on the linear predictor scale; the scalar special case is a random intercept αgN(0,σα2)\alpha_{g}\sim \mathrm{N}(0,\sigma_\alpha^2), and more generally random slopes and intercepts are vector-valued with full covariance as described below;
  • 𝛄t\boldsymbol{\gamma}_{t} represents a vector of time-varying coefficients, where each coefficient γp,t\gamma_{p,t} can follow a random walk or AR(1) process;
  • 𝛇𝐬\boldsymbol{\zeta}_{\mathbf{s}} represents a vector of spatially varying coefficients (random fields), where each coefficient ζl,𝐬MVN(𝟎,𝚺ζ,l)\zeta_{l,\mathbf{s}} \sim \mathrm{MVN}(\boldsymbol{0},\boldsymbol{\Sigma}_{\zeta,l});
  • ω𝐬\omega_{\mathbf{s}} represents a spatial component (a Gaussian Markov random field, GMRF), ω𝐬MVN(𝟎,𝚺ω)\omega_{\mathbf{s}} \sim \mathrm{MVN}(\boldsymbol{0},\boldsymbol{\Sigma}_\omega) with Matérn covariance built from the mesh; and
  • ϵ𝐬,t\epsilon_{\mathbf{s},t} represents a spatiotemporal component (a GMRF), ϵ𝐬,tMVN(𝟎,𝚺ϵ)\epsilon_{\mathbf{s},t} \sim \mathrm{MVN}(\boldsymbol{0},\boldsymbol{\Sigma}_{\epsilon}) with Matérn covariance over space and temporal evolution as specified.

A single sdmTMB model will rarely, if ever, contain all of the above components. Next, we will split the model to describe the various parts in more detail using ‘\ldots’ to represent the other optional components.

Main effects

μ𝐬,t=f1(𝐗𝐬,tmain𝛃+) \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \mathbf{X}^{\mathrm{main}}_{\mathbf{s},t} \boldsymbol{\beta}+ \ldots \right) \end{aligned}

Within sdmTMB(), 𝐗𝐬,tmain𝛃\mathbf{X}^{\mathrm{main}}_{\mathbf{s},t} \boldsymbol{\beta} is defined by the formula argument and represents the main-effect model matrix and a corresponding vector of coefficients. This main effect formula can contain optional penalized smoothers or non-linear functions as defined below.

Smoothers

Smoothers in sdmTMB are implemented with the same formula syntax familiar to mgcv (Wood 2017) users fitting GAMs (generalized additive models). Smooths are implemented in the formula using + s(x), which implements a smooth from mgcv::s(). Within these smooths, the same syntax commonly used in mgcv::s() can be applied, e.g. 2-dimensional smooths may be constructed with + s(x, y); smooths can be specific to various factor levels, + s(x, by = group); smooths can vary according to a continuous variable, + s(x, by = x2); the basis function dimensions may be specified, e.g. + s(x, k = 4) (see ?mgcv::choose.k); and various types of splines may be constructed such as cyclic splines to model seasonality, e.g. + s(month, bs = "cc", k = 12).

While mgcv can fit unpenalized (e.g., B-splines) or penalized splines (P-splines), sdmTMB only implements penalized splines. The penalized splines are constructed in sdmTMB using the function mgcv::smooth2random(), which transforms splines into random effects (and associated design matrices) that are estimable in a mixed-effects modelling framework. This is the same approach as is implemented in the R packages gamm4 (Wood & Scheipl 2020) and brms (Bürkner 2017).

Linear break-point threshold models

The linear break-point or “hockey stick” model can be used to describe threshold or asymptotic responses. This function consists of two pieces, so that for x<b1x < b_{1}, s(x)=xb0s(x) = x \cdot b_{0}, and for x>b1x > b_{1}, s(x)=b1b0s(x) = b_{1} \cdot b_{0}. In both cases, b0b_{0} represents the slope of the function up to a threshold, and the product b1b0b_{1} \cdot b_{0} represents the value at the asymptote. No constraints are placed on parameters b0b_{0} or b1b_{1}.

These models can be fit by including + breakpt(x) in the model formula, where x is a covariate. The formula can contain a single break-point covariate.

Logistic threshold models

Models with logistic threshold relationships between a predictor and the response can be fit with the form

s(x)=τ+ψ[1+eln(19)(xs50)/(s95s50)]1, s(x)=\tau + \psi\ { \left[ 1+{ e }^{ -\ln \left(19\right) \cdot \left( x-s50 \right) / \left(s95 - s50 \right) } \right] }^{-1},

where ss represents the logistic function, ψ\psi is a scaling parameter (controlling the height of the y-axis for the response; unconstrained), τ\tau is an intercept, s50s50 is a parameter controlling the point at which the function reaches 50% of the maximum (ψ\psi), and s95s95 is a parameter controlling the point at which the function reaches 95% of the maximum. The parameter s50s50 is unconstrained but s95s95 is constrained to be larger than s50s50.

These models can be fit by including + logistic(x) in the model formula, where x is a covariate. The formula can contain a single logistic covariate.

Spatial random fields

Spatial random fields, ω𝐬\omega_{\mathbf{s}}, are included if spatial = 'on' (or TRUE) and omitted if spatial = 'off' (or FALSE).

μ𝐬,t=f1(+ω𝐬+),𝛚MVNormal(𝟎,𝚺ω), \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \ldots + \omega_{\mathbf{s}} + \ldots \right),\\ \boldsymbol{\omega}&\sim \mathrm{MVNormal} \left( \boldsymbol{0}, \boldsymbol{\Sigma}_\omega \right),\\ \end{aligned} The marginal standard deviation of 𝛚\boldsymbol{\omega} is indicated by Spatial SD in the printed model output or as sigma_O in the output of sdmTMB::tidy(fit, "ran_pars"). The ‘O’ is for ‘omega’ (ω\omega).

Internally, the random fields follow a Gaussian Markov random field (GMRF)

𝛚MVNormal(𝟎,σω2𝐐ω1), \boldsymbol{\omega}\sim \mathrm{MVNormal}\left(\boldsymbol{0}, \sigma_\omega^2 \mathbf{Q}^{-1}_\omega\right), where 𝐐ω\mathbf{Q}_\omega is a sparse precision matrix and σω2\sigma_\omega^2 is the marginal variance.

Spatiotemporal random fields

Spatiotemporal random fields are included by default if there are multiple time elements (time argument is not NULL) and can be set to IID (independent and identically distributed, 'iid'; default), AR(1) ('ar1'), random walk ('rw'), or off ('off') via the spatiotemporal argument. These text values are case insensitive.

Spatiotemporal random fields are represented by 𝛜t\boldsymbol{\epsilon}_t within sdmTMB. This has been chosen to match the representation in VAST (Thorson 2019). The marginal standard deviation of 𝛜t\boldsymbol{\epsilon}_t is indicated by Spatiotemporal SD in the printed model output or as sigma_E in the output of sdmTMB::tidy(fit, "ran_pars"). The ‘E’ is for ‘epsilon’ (ϵ\epsilon).

IID spatiotemporal random fields

IID spatiotemporal random fields (spatiotemporal = 'iid') can be represented as

μ𝐬,t=f1(+ϵ𝐬,t+),𝛜tMVNormal(𝟎,𝚺ϵ). \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \ldots + \epsilon_{\mathbf{s},t} + \ldots \right),\\ \boldsymbol{\epsilon}_{t} &\sim \mathrm{MVNormal} \left( \boldsymbol{0}, \boldsymbol{\Sigma}_{\epsilon} \right). \end{aligned}

where ϵ𝐬,t\epsilon_{\mathbf{s},t} represent random field deviations at point 𝐬\mathbf{s} and time tt. The random fields are assumed independent across time steps.

Similarly to the spatial random fields, these spatiotemporal random fields (including all versions described below) are parameterized internally with a sparse precision matrix (𝐐ϵ\mathbf{Q}_\epsilon)

𝛜tMVNormal(𝟎,σϵ2𝐐ϵ1). \boldsymbol{\epsilon}_{t} \sim \mathrm{MVNormal}\left(\boldsymbol{0}, \sigma_\epsilon^2 \mathbf{Q}^{-1}_\epsilon\right).

AR(1) spatiotemporal random fields

First-order auto regressive, AR(1), spatiotemporal random fields (spatiotemporal = 'ar1') add a parameter defining the correlation between random field deviations from one time step to the next. They are defined as

μ𝐬,t=f1(+δ𝐬,t+),𝛅t=1MVNormal(𝟎,𝚺ϵ),𝛅t>1=ρ𝛅t1+1ρ2𝛜t,𝛜tMVNormal(𝟎,𝚺ϵ), \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \ldots + \delta_{\mathbf{s},t} + \ldots \right),\\ \boldsymbol{\delta}_{t=1} &\sim \mathrm{MVNormal} (\boldsymbol{0}, \boldsymbol{\Sigma}_{\epsilon}),\\ \boldsymbol{\delta}_{t>1} &= \rho \boldsymbol{\delta}_{t-1} + \sqrt{1 - \rho^2} \boldsymbol{\epsilon}_{t}, \: \boldsymbol{\epsilon}_{t} \sim \mathrm{MVNormal} \left(\boldsymbol{0}, \boldsymbol{\Sigma}_{\epsilon} \right), \end{aligned} where ρ\rho is the correlation between subsequent spatiotemporal random fields. The ρ𝛅t1+1ρ2\rho \boldsymbol{\delta}_{t-1} + \sqrt{1 - \rho^2} term scales the spatiotemporal variance by the correlation such that it represents the steady-state marginal variance. The correlation ρ\rho allows for mean-reverting spatiotemporal fields, and is constrained to be 1<ρ<1-1 < \rho < 1. Internally, the parameter is estimated as ar1_phi, which is unconstrained. The parameter ar1_phi is transformed to ρ\rho with ρ=2(logit1(𝚊𝚛𝟷_𝚙𝚑𝚒))1\rho = 2 \left( \mathrm{logit}^{-1}(\texttt{ar1\_phi}) \right) - 1, mapping ar1_phi to the range (1,1)(-1, 1).

Random walk spatiotemporal random fields (RW)

Random walk spatiotemporal random fields (spatiotemporal = 'rw') represent a model where the difference in spatiotemporal deviations from one time step to the next are IID. They are defined as

μ𝐬,t=f1(+δ𝐬,t+),𝛅t=1MVNormal(𝟎,𝚺ϵ),𝛅t>1=𝛅t1+𝛜t,𝛜tMVNormal(𝟎,𝚺ϵ), \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \ldots + \delta_{\mathbf{s},t} + \ldots \right),\\ \boldsymbol{\delta}_{t=1} &\sim \mathrm{MVNormal} (\boldsymbol{0}, \boldsymbol{\Sigma}_{\epsilon}),\\ \boldsymbol{\delta}_{t>1} &= \boldsymbol{\delta}_{t-1} + \boldsymbol{\epsilon}_{t}, \: \boldsymbol{\epsilon}_{t} \sim \mathrm{MVNormal} \left(\boldsymbol{0}, \boldsymbol{\Sigma}_{\epsilon} \right), \end{aligned}

where the distribution of the spatiotemporal field in the initial time step is the same as for the AR(1) model, but the absence of the ρ\rho parameter allows the spatiotemporal field to be non-stationary in time. Note that, in contrast to the AR(1) parametrization, the variance is no longer the steady-state marginal variance.

Time-varying regression parameters

Parameters can be modelled as time-varying according to a random walk or first-order autoregressive, AR(1), process. The time-series model is defined by time_varying_type. For all types:

μ𝐬,t=f1(+𝐗𝐬,ttvc𝛄t+), \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \ldots + \mathbf{X}^{\mathrm{tvc}}_{\mathbf{s},t} \boldsymbol{\gamma}_{t} + \ldots \right), \end{aligned} where 𝛄t\boldsymbol{\gamma}_{t} is an optional vector of time-varying regression parameters and 𝐗𝐬,ttvc\mathbf{X}^{\mathrm{tvc}}_{\mathbf{s},t} is the corresponding model matrix with covariate values. This is defined via the time_varying argument, assuming that the time argument is also supplied a column name. time_varying takes a one-sided formula. ~ 1 implies a time-varying intercept.

The following equations describe the time-series process for a single time-varying coefficient γp,t\gamma_{p,t}, where pp indexes the time-varying coefficient. When multiple time-varying coefficients are specified, each follows its time-series process independently with its own variance σγ,p2\sigma_{\gamma,p}^2 and (for AR1) correlation parameter ργ,p\rho_{\gamma,p}.

For time_varying_type = 'rw', the first time step is estimated independently:

γp,t=1Uniform(,) (treated as an unconstrained parameter),γp,t>1Normal(γp,t1,σγ,p2). \begin{aligned} \gamma_{p,t=1} &\sim \mathrm{Uniform} \left(-\infty, \infty \right) \text{ (treated as an unconstrained parameter)},\\ \gamma_{p,t>1} &\sim \mathrm{Normal} \left(\gamma_{p,t-1}, \sigma_{\gamma,p}^2 \right). \end{aligned}

In this case, the first time-step value is given an implicit uniform prior. I.e., the same variable should not appear in the fixed effect formula since the initial value is estimated as part of the time-varying formula. The formula time_varying = ~ 1 implicitly represents a time-varying intercept (assuming the time argument has been supplied) and, this case, the intercept should be omitted from the main effects (formula ~ + 0 + ... or formula ~ -1 + ...).

For time_varying_type = 'rw0', the first time step is estimated from a mean-zero prior:

γp,t=1Normal(0,σγ,p2),γp,t>1Normal(γp,t1,σγ,p2). \begin{aligned} \gamma_{p,t=1} &\sim \mathrm{Normal} \left(0, \sigma_{\gamma,p}^2 \right),\\ \gamma_{p,t>1} &\sim \mathrm{Normal} \left(\gamma_{p,t-1}, \sigma_{\gamma,p}^2 \right). \end{aligned} In this case, the time-varying variable (including the intercept) should be included in the main effects.

For time_varying_type = 'ar1':

γp,t=1Normal(0,σγ,p2),γp,t>1Normal(ργ,pγp,t1,(1ργ,p2)σγ,p2), \begin{aligned} \gamma_{p,t=1} &\sim \mathrm{Normal} \left(0, \sigma_{\gamma,p}^2 \right),\\ \gamma_{p,t>1} &\sim \mathrm{Normal} \left(\rho_{\gamma,p}\gamma_{p,t-1}, (1 - \rho_{\gamma,p}^2) \sigma_{\gamma,p}^2 \right), \end{aligned} where ργ,p\rho_{\gamma,p} is the correlation between subsequent time steps. The first time step is given a mean-zero prior.

Spatially varying coefficients (SVC)

Spatially varying coefficient models are defined as

μ𝐬,t=f1(+𝐗𝐬,tsvc𝛇𝐬+),ζl,𝐬MVNormal(𝟎,𝚺ζ,l), \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \ldots + \mathbf{X}^{\mathrm{svc}}_{\mathbf{s}, t} \boldsymbol{\zeta}_{\mathbf{s}} + \ldots \right),\\ \zeta_{l,\mathbf{s}} &\sim \mathrm{MVNormal} \left( \boldsymbol{0}, \boldsymbol{\Sigma}_{\zeta,l} \right), \end{aligned}

where ζl,𝐬\zeta_{l,\mathbf{s}} represents the ll-th spatially varying coefficient field at location 𝐬\mathbf{s}. When multiple spatially varying coefficients are specified, each is estimated as an independent spatial random field with its own marginal variance σζ,l2\sigma_{\zeta,l}^2. Usually, 𝐗𝐬,tsvc\mathbf{X}^{\mathrm{svc}}_{\mathbf{s}, t} would represent a prediction matrix that is constant spatially for a given time tt as defined by a one-sided formula supplied to spatial_varying. For example spatial_varying = ~ 0 + x, where 0 omits the intercept.

The random fields are parameterized internally with a sparse precision matrix (𝐐ζ\mathbf{Q}_\zeta)

ζl,𝐬MVNormal(𝟎,σζ,l2𝐐ζ1). \zeta_{l,\mathbf{s}} \sim \mathrm{MVNormal}\left(\boldsymbol{0}, \sigma_{\zeta,l}^2 \mathbf{Q}^{-1}_\zeta\right).

Random intercepts and slopes

Multilevel (hierarchical) intercepts and slopes follow the familiar lme4/glmmTMB formulation. A single grouping factor can carry an intercept-only term or a vector of random effects (intercept plus slopes) with a full covariance:

μ𝐬,t=f1(+𝐙g𝐛g+),𝐛gMVNormal(𝟎,𝚺g), \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \ldots + \mathbf{Z}_{g} \mathbf{b}_{g} + \ldots \right),\\ \mathbf{b}_{g} &\sim \mathrm{MVNormal} \left( \boldsymbol{0}, \boldsymbol{\Sigma}_{g} \right), \end{aligned}

where 𝐙g\mathbf{Z}_{g} is the design row(s) for group gg (e.g., intercept and covariate values) and 𝐛g\mathbf{b}_{g} is the corresponding vector of random effects for that group. The scalar special case (1|g)(1|g) corresponds to n=1n=1 with αgNormal(0,σα2)\alpha_{g} \sim \mathrm{Normal}(0, \sigma_{\alpha}^2).

Use lme4-style syntax in formula, e.g. (1 | g) for random intercepts, (1 + x | g) for correlated intercepts and slopes, or (1 | g1) + (1 + x | g2) to give each grouping factor its own (co)variance.

Internally, standard deviations are estimated on the log scale, and correlations are represented by unconstrained Cholesky parameters (via UNSTRUCTURED_CORR), yielding a positive-definite 𝚺g\boldsymbol{\Sigma}_{g} per grouping factor.

Offset terms

Offset terms can be included through the offset argument in sdmTMB(). These are included in the linear predictor as

μ𝐬,t=f1(+O𝐬,t+), \begin{aligned} \mu_{\mathbf{s},t} &= f^{-1} \left( \ldots + O_{\mathbf{s},t} + \ldots \right), \end{aligned}

where O𝐬,tO_{\mathbf{s},t} is an offset term—a log transformed variable without a coefficient (assuming a log link). The offset is not included in the prediction. Therefore, if offset represents a measure of effort, for example, the prediction is for one unit of effort (log(1) = 0).

Observation model families

Here we describe the main observation families that are available in sdmTMB and comment on their parametrization, statistical properties, utility, and code representation in sdmTMB. Families are grouped by outcome type to make it easier to locate an appropriate observation model.

Bounded or binary outcomes (0–1, proportions)

Binomial

Binomial(N,μ) \operatorname{Binomial} \left(N, \mu \right) where NN is the size or number of trials, and μ\mu is the probability of success for each trial. If N=1N = 1, the distribution becomes the Bernoulli distribution. Internally, the distribution is parameterized as the robust version in TMB, which is numerically stable when probabilities approach 0 or 1. Following the structure of stats::glm(), lme4, and glmmTMB, a binomial family can be specified in one of 4 ways:

  1. the response may be a factor (and the model classifies the first level versus all others)
  2. the response may be binomial (0/1)
  3. the response can be a matrix of form cbind(success, failure), or
  4. the response may be the observed proportions, and the weights argument is used to specify the Binomial size (NN) parameter (probability ~ ..., weights = N).

Code defined within TMB.

Example: family = binomial(link = "logit")

Beta-binomial

BetaBinomial(N,μ,ϕ) \operatorname{BetaBinomial} \left(N, \mu, \phi \right) where NN is the number of trials, μ\mu is the mean success probability, and ϕ\phi is a precision parameter that controls overdispersion relative to a Binomial. The implied Beta parameters are α=μϕ\alpha = \mu \phi and β=(1μ)ϕ\beta = (1 - \mu)\phi, and the variance is $ [y] = N (1 - ), , $ which exceeds the Binomial variance unless ϕ\phi \to \infty. Available links are logit and cloglog. This family is useful for overdispersed counts of successes/failures (e.g., aggregated Bernoulli data, proportions with extra-binomial variation).

Code defined within sdmTMB and implemented in the TMB likelihood at src/sdmTMB.cpp (e.g., lines 974–982).

Example: family = betabinomial(link = "logit")

Beta

Beta(μϕ,(1μ)ϕ) \operatorname{Beta} \left(\mu \phi, (1 - \mu) \phi \right) where μ\mu is the mean and ϕ\phi is a precision parameter. This parametrization follows Ferrari & Cribari-Neto (2004) and the betareg R package (Cribari-Neto & Zeileis 2010). The variance is μ(1μ)/(ϕ+1)\mu (1 - \mu) / (\phi + 1).

Code defined within TMB.

Example: family = Beta(link = "logit")

Count data

Poisson

Poisson(μ) \operatorname{Poisson} \left( \mu \right) where μ\mu represents the mean and Var[y]=μ\mathrm{Var}[y] = \mu.

Code defined within TMB.

Example: family = poisson(link = "log")

Negative Binomial 2 (NB2)

NB2(μ,ϕ) \operatorname{NB2} \left( \mu, \phi \right)

where μ\mu is the mean and ϕ\phi is the dispersion parameter. The variance scales quadratically with the mean Var[y]=μ+μ2/ϕ\mathrm{Var}[y] = \mu + \mu^2 / \phi(Hilbe 2011). The NB2 parametrization is more commonly seen in ecology than the NB1. Internally, the distribution is parameterized as the robust version in TMB.

Code defined within TMB.

Example: family = nbinom2(link = "log")

Negative Binomial 1 (NB1)

NB1(μ,ϕ) \operatorname{NB1} \left( \mu, \phi \right)

where μ\mu is the mean and ϕ\phi is the dispersion parameter. The variance scales linearly with the mean Var[y]=μ+μ/ϕ\mathrm{Var}[y] = \mu + \mu / \phi(Hilbe 2011). Internally, the distribution is parameterized as the robust version in TMB.

Code defined within sdmTMB based on NB2 and borrowed from glmmTMB.

Example: family = nbinom1(link = "log")

Negative binomial 2 mixture

This is a 2 component mixture that extends the NB2 distribution, following mixture-distribution approaches for shoaling/aggregation in survey data (Thorson et al. 2011).

(1p)NB2(μ1,ϕ)+pNB2(μ2,ϕ) (1 - p) \cdot \operatorname{NB2} \left( \mu_1, \phi \right) + p \cdot \operatorname{NB2} \left( \mu_2, \phi \right)

where μ1\mu_1 is the mean of the first (smaller component) of the distribution, μ2\mu_2 is the mean of the larger component, and pp controls the contribution of each component to the mixture.

Example: family = nbinom2_mix(link = "log")

Positive continuous outcomes

Gamma

Gamma(ϕ,μϕ) \operatorname{Gamma} \left( \phi, \frac{\mu}{\phi} \right) where ϕ\phi represents the Gamma shape and μ/ϕ\mu / \phi represents the scale. The mean is μ\mu and variance is μ2/ϕ\mu^2 / \phi.

Code defined within TMB.

Example: family = Gamma(link = "log")

Lognormal

sdmTMB uses the “bias-corrected” lognormal distribution where ϕ\phi represents the standard deviation in log-space:

Lognormal(logμϕ22,ϕ2). \operatorname{Lognormal} \left( \log \mu - \frac{\phi^2}{2}, \phi^2 \right). Because of the bias correction, 𝔼[y]=μ\mathbb{E}[y] = \mu and Var[logy]=ϕ2\mathrm{Var}[\log y] = \phi^2.

Code defined within sdmTMB based on the TMB dnorm() normal density.

Example: family = lognormal(link = "log")

Generalized gamma

GenGamma(μ,ϕ,Qgg) \operatorname{GenGamma} \left( \mu, \phi, Q^{\mathrm{gg}}\right)

sdmTMB implements the Prentice (1974) parameterization introduced for spatiotemporal models and index standardization by Dunic et al. (2025), with parameters mean μ\mu, scale ϕ\phi (standard deviation on the log scale), and a shape parameter QggQ^{\mathrm{gg}} (reported as Generalized gamma Q).

Here, μ\mu is the mean on the data scale, ϕ\phi is the log-scale standard deviation, and QggQ^{\mathrm{gg}} controls the shape (Qgg0Q^{\mathrm{gg}}\to 0 yields the lognormal; Qgg=ϕQ^{\mathrm{gg}}= \phi yields the gamma). See Dunic et al. (2025) for the full PDF in the Prentice formulation as implemented. Links available: identity, log, inverse. This flexibility is useful for right-skewed positive responses with tails heavier or lighter than gamma/lognormal, and is often paired in a hurdle/mixture as delta_gengamma() for zero-inflated biomass or catch data (see Dunic et al. (2025)).

Code defined within sdmTMB.

Example: family = gengamma(link = "log"); delta/hurdle: family = delta_gengamma(link1 = "logit", link2 = "log").

Tweedie

Tweedie(μ,p,ϕ),1<p<2 \operatorname{Tweedie} \left(\mu, p, \phi \right), \: 1 < p < 2

where μ\mu is the mean, pp is the power parameter constrained between 1 and 2, and ϕ\phi is the dispersion parameter. The Tweedie distribution can be helpful for modelling data that are positive and continuous but also contain zeros.

Internally, pp is transformed from logit1(𝚝𝚑𝚎𝚝𝚊𝚏)+1\mathrm{logit}^{-1} (\texttt{thetaf}) + 1 to constrain it between 1 and 2 and is estimated as an unconstrained variable.

The source code is implemented as in the cplm package (Zhang 2013) and is based on Dunn & Smyth (2005). The TMB version is defined here.

Example: family = tweedie(link = "log")

Gamma mixture

This is a 2 component mixture that extends the Gamma distribution, motivated by mixture-distribution treatments of aggregation in survey data (Thorson et al. 2011),

(1p)Gamma(ϕ,μ1ϕ)+pGamma(ϕ,μ2ϕ), (1 - p) \cdot \operatorname{Gamma} \left( \phi, \frac{\mu_{1}}{\phi} \right) + p \cdot \operatorname{Gamma} \left( \phi, \frac{\mu_{2}}{\phi} \right), where ϕ\phi represents the Gamma shape, μ1/ϕ\mu_{1} / \phi represents the scale for the first (smaller component) of the distribution, μ2/ϕ\mu_{2} / \phi represents the scale for the second (larger component) of the distribution, and pp controls the contribution of each component to the mixture (also interpreted as the probability of larger events).

The mean is (1p)μ1+pμ2(1-p) \cdot \mu_{1} + p \cdot \mu_{2}. The variance follows the usual mixture formula: (1p)(μ12/ϕ+μ12)+p(μ22/ϕ+μ22)[(1p)μ1+pμ2]2(1-p)\left(\mu_1^2 / \phi + \mu_1^2\right) + p\left(\mu_2^2 / \phi + \mu_2^2\right) - \left[(1-p)\mu_1 + p\mu_2\right]^2.

Here, and for the other mixture distributions, the probability of the larger mean can be obtained from plogis(fit$model$par[["logit_p_extreme"]]) and the ratio of the larger mean to the smaller mean can be obtained from 1 + exp(fit$model$par[["log_ratio_mix"]]). The standard errors are available in the TMB sdreport: fit$sd_report.

If you wish to fix the probability of a large (i.e., extreme) mean, which can be hard to estimate, you can fix this value and pass this to the family:

sdmTMB(...,
  family = gamma_mix(link = "log", p_extreme = 0.01)
)

See also family = delta_gamma_mix() for an extension incorporating this distribution with delta models.

Lognormal mixture

This is a 2 component mixture that extends the lognormal distribution, again in the spirit of mixture approaches for aggregating/ shoaling data (Thorson et al. 2011),

(1p)Lognormal(logμ1ϕ22,ϕ2)+pLognormal(logμ2ϕ22,ϕ2). (1 - p) \cdot \operatorname{Lognormal} \left( \log \mu_{1} - \frac{\phi^2}{2}, \phi^2 \right) + p \cdot \operatorname{Lognormal} \left( \log \mu_{2} - \frac{\phi^2}{2}, \phi^2 \right).

Because of the bias correction, 𝔼[y]=(1p)μ1+pμ2\mathbb{E}[y] = (1-p) \cdot \mu_{1} + p \cdot \mu_{2}. The log-scale variance of the mixture is not simply ϕ2\phi^2; it can be obtained with the standard mixture-variance formula using component log-means logμiϕ2/2\log \mu_i - \phi^2/2 and log-variance ϕ2\phi^2.

As with the Gamma mixture, pp controls the contribution of each component to the mixture (also interpreted as the probability of larger events).

Example: family = lognormal_mix(link = "log"). See also family = delta_lognormal_mix() for an extension incorporating this distribution with delta models. Like with the gamma mixture, fixed probabilities of extreme events (pp in notation above) can be passed in, e.g.

sdmTMB(...,
  family = delta_lognormal_mix(p_extreme = 0.01)
)

Continuous real-valued outcomes

Gaussian

Normal(μ,ϕ2) \operatorname{Normal} \left( \mu, \phi^2 \right) where μ\mu is the mean and ϕ\phi is the standard deviation. The variance is ϕ2\phi^2.

Example: family = Gaussian(link = "identity")

Code defined within TMB.

Student-t

Student-t(μ,ϕ,ν) \operatorname{Student-t} \left( \mu, \phi, \nu \right)

where ν\nu, the degrees of freedom (df), is a user-supplied fixed parameter. Lower values of ν\nu result in heavier tails compared to the Gaussian distribution. Above approximately df = 20, the distribution becomes very similar to the Gaussian. The Student-t distribution with a low degrees of freedom (e.g., ν7\nu \le 7) can be helpful for modelling data that would otherwise be suitable for Gaussian but needs an approach that is robust to outliers (e.g., Anderson et al. 2017).

Code defined within sdmTMB based on the dt() distribution in TMB.

Example: family = student(link = "log", df = 7)

Zero-inflated and hurdle structures

Delta models

sdmTMB allows for several different kinds of delta (also known as hurdle) models. These families are implemented by specifying the family as a delta distribution. For example:

sdmTMB(
  ...,
  family = delta_gamma()
)

The list of supported families is included in the documentation on additional families, delta models, and Poisson-link delta models. By default, the delta_* families don’t use the Poisson link, but this structure can be specified with delta_gamma(type = "poisson-link"), which follows the formulation in Thorson (2018).

In the “standard” delta model implementation, sdmTMB constructs two internal models (formally, two “linear predictors”), with the first model representing presence-absence and the second model representing the positive component (such as catch rates in fisheries applications). Default links associated with each family can be inspected with delta_lognormal() and equivalent functions. For the standard delta models, the default first linear predictor link is logit. For the Poisson-link type, the first linear predictor link is log.

The formula, spatial, spatiotemporal, and share_range arguments of sdmTMB() can be specified independently as a 2-element list. For example, the spatial random field might be estimated for only the first linear predictor with:

sdmTMB(
  ...,
  family = delta_gamma(),
  spatial = list("on", "off")
)

Or for we may want separate main-effects formulas. For example:

sdmTMB(
  formula = list(
    y ~ depth + I(depth^2), 
    y ~ 1),
  family = delta_gamma()
)

All other arguments are shared between the linear predictors.

Currently if delta models contain smoothers, both components must have the same main-effects formula.

Gaussian random fields

Matérn parameterization

The Matérn defines the covariance Φ(sj,sk)\Phi \left( s_j, s_k \right) between spatial locations sjs_j and sks_k as

Φ(sj,sk)=τ2[2ν1Γ(ν)]1(κdjk)νKν(κdjk), \Phi\left( s_j,s_k \right) = \tau^2 \left[ 2^{\nu - 1}\Gamma(\nu) \right]^{-1} (\kappa d_{jk})^\nu K_\nu \left( \kappa d_{jk} \right),

where τ2\tau^2 is a scale/precision factor (higher τ\tau means lower marginal variance), ν\nu controls the smoothness, Γ\Gamma represents the Gamma function, djkd_{jk} represents the distance between locations sjs_j and sks_k, KνK_\nu represents the modified Bessel function of the second kind, and κ\kappa represents the decorrelation rate. The parameter ν\nu is set to 1 because the SPDE approach that gives us sparse matrices (and therefore major computational speedups) requires α=ν+d/2\alpha = \nu + d/2 to be an integer when working in 2D (so ν=1\nu = 1) (Lindgren et al. 2011). Internally, the parameters κ\kappa and τ\tau are converted to range and marginal standard deviation σ\sigma as range=8/κ\textrm{range} = \sqrt{8} / \kappa and σ=1/4πexp(2log(τ)+2log(κ))\sigma = 1 / \sqrt{4 \pi \exp \left(2 \log(\tau) + 2 \log(\kappa) \right) }, so increasing κ\kappa or τ\tau decreases σ\sigma.

In the case of a spatiotemporal model with both spatial and spatiotemporal fields, if share_range = TRUE in sdmTMB() (the default), then a single κ\kappa and range are estimated with separate σω\sigma_\omega and σϵ\sigma_\epsilon. This often makes sense since data are often only weakly informative about κ\kappa. If share_range = FALSE, then separate κω\kappa_\omega and κϵ\kappa_\epsilon are estimated. The spatially varying coefficient field always shares κ\kappa with the spatial random field.

Projection 𝐀\mathbf{A} matrix

The values of the spatial variables at the knots are multiplied by a projection matrix 𝐀\mathbf{A} that bilinearly interpolates from the knot locations to the values at the locations of the observed or predicted data (Lindgren & Rue 2015)

𝛚*=𝐀𝛚, \boldsymbol{\omega}^* = \mathbf{A}\boldsymbol{\omega}, where 𝛚*\boldsymbol{\omega}^* represents the values of the spatial random fields at the observed locations or predicted data locations. The matrix 𝐀\mathbf{A} has a row for each data point or prediction point and a column for each knot. Three non-zero elements on each row define the weight of the neighbouring 3 knot locations for location 𝐬\mathbf{s}. The same bilinear interpolation happens for any spatiotemporal random fields

𝛜t*=𝐀𝛜t. \boldsymbol{\epsilon}_t^* = \mathbf{A}\boldsymbol{\epsilon}_t.

Anisotropy

TMB allows for anisotropy, where spatial covariance may be asymmetric with respect to latitude and longitude (full details). Anisotropy can be turned on or off with the logical anisotropy argument to sdmTMB(). There are a number of ways to implement anisotropic covariance (Fuglstad et al. 2015), and we adopt a 2-parameter rotation matrix 𝐇\mathbf{H}. The elements of 𝐇\mathbf{H} are defined by the parameter vector 𝐱\boldsymbol{x} so that H1,1=x1H_{1,1} = x_{1}, H1,2=H2,1=x2H_{1,2} = H_{2,1} = x_{2} and H2,2=(1+x22)/x1H_{2,2} = (1 + x_{2}^2) / x_{1}.

Once a model is fitted with sdmTMB(), the anisotropy relationships may be plotted using the plot_anisotropy() function, which takes the fitted object as an argument. If a barrier mesh is used, anisotropy is disabled.

Incorporating physical barriers into the SPDE

In some cases the spatial domain of interest may be complex and bounded by some barrier such as by land or water (e.g., coastlines, islands, lakes). SPDE models allow for physical barriers to be incorporated into the modelling (Bakka et al. 2019). With sdmTMB() models, the mesh construction occurs in two steps: the user (1) constructs a mesh with a call to sdmTMBextra::make_mesh(), and (2) passes the mesh to sdmTMBextra::add_barrier_mesh(). The barriers must be constructed as sf objects (Pebesma 2018) with polygons defining the barriers. See ?sdmTMBextra::add_barrier_mesh for an example.

The barrier implementation requires the user to select a fraction value (range_fraction argument) that defines the fraction of the usual spatial range when crossing the barrier (Bakka et al. 2019). For example, if the range was estimated at 10 km, range_fraction = 0.2 would assign a 2 km range to the triangles marked as barrier. That makes correlation decay much faster through the barrier than around it, which effectively discourages spatial smoothing from “jumping” across landmasses. From experimentation, values around 0.1 or 0.2 seem to work well but values much lower than 0.1 can result in convergence issues.

This website by Francesco Serafini and Haakon Bakka provides an illustration with INLA. The implementation within TMB was borrowed from code written by Olav Nikolai Breivik and Hans Skaug at the TMB Case Studies Github site.

Optimization

Optimization details

The sdmTMB model is fit by maximum marginal likelihood. Internally, a TMB (Kristensen et al. 2016) model template calculates the marginal log likelihood and its gradient, and the negative log likelihood is minimized via the non-linear optimization routine stats::nlminb() in R (Gay 1990; R Core Team 2021). Random effects are represented by the values that maximize the log likelihood conditional on the fixed effects (their conditional modes), and the Laplace approximation integrates over them when evaluating the likelihood (Kristensen et al. 2016).

Like AD Model Builder (Fournier et al. 2012), TMB allows for parameters to be fit in phases and we include the multiphase argument in sdmTMB::sdmTMBcontrol() to allow this. For high-dimensional models (many fixed and random effects), phased estimation may be faster, but this is not always the case because it requires an extra call to TMB::MakeADFun(). In sdmTMB, phased estimation proceeds by first estimating all fixed-effect parameters contributing to the likelihood (holding random effects constant at initial values). In the second phase, the random-effect parameters (and their variances) are also estimated. Fixed-effect parameters are also estimated in the second phase and are initialized at their estimates from the first phase.

In some cases, a single call to stats::nlminb() may not be result in convergence (e.g., the maximum gradient of the marginal likelihood with respect to fixed-effect parameters is not small enough yet), and the algorithm may need to be run multiple times. In the sdmTMB::sdmTMBcontrol() function, we include an argument nlminb_loops that will restart the optimization at the previous best values. The number of nlminb_loops should generally be small (e.g., 2 or 3), and defaults to 1. For some sdmTMB models, the Hessian may also be unstable and need to be re-evaluated. We do this optionally with the stats::optimHess() routine after the call to stats::nlminb(). The stats::optimHess() function implements a Newton optimization routine to find the Hessian, and we include the argument newton_loops in sdmTMB::sdmTMBcontrol() to allow for multiple function evaluations (each starting at the previous best value). By default, this is set to 1 evaluation. The updated parameters are only accepted if they result in a lower negative marginal log likelihood. If a model is already fit, the function sdmTMB::run_extra_optimization() can run additional optimization loops with either routine to further reduce the maximum gradient.

Assessing convergence

Much of the guidance around diagnostics and glmmTMB also applies to sdmTMB, e.g. the glmmTMB vignette on troubleshooting. Optimization with stats::nlminb() involves specifying the number of iterations and evaluations (eval.max and iter.max) and the tolerances (abs.tol, rel.tol, x.tol, xf.tol)—a greater number of iterations and smaller tolerance thresholds increase the chance that the optimal solution is found, but more evaluations translates into longer computation time. Warnings of non-positive-definite Hessian matrices (accompanied by parameters with NAs for standard errors) often mean models are improperly specified given the data. Standard errors can be observed in the output of print.sdmTMB() or by checking fit$sd_report. The maximum gradient of the marginal likelihood with respect to fixed-effect parameters can be checked by inspecting (fit$gradients). Guidance varies, but gradients should be very close to zero (e.g., on the order of 10310^{-3} or smaller after reasonable parameter scaling) before assuming the fitting routine is consistent with convergence. If maximum gradients are already relatively small, they can sometimes be reduced further with additional optimization calls beginning at the previous best parameter vector as described above with sdmTMB::run_extra_optimization().

Notation reference tables

Tables of all major indices (Table 1) and symbols (Table 2) are provided here for quick reference.

Table 1: Subscript notation
Symbol Description
𝐬\mathbf{s} Index for space; a vector of x and y coordinates
tt Index for time
gg Group
pp Index for time-varying coefficient
ll Index for spatially varying coefficient (SVC)
Table 2: Symbol notation, code representation (in model output or in model template code), and descriptions.
Symbol Code Description
yy y_i Observed response data
μ\mu mu_i Mean
η\eta eta_i Linear predictor before applying the inverse link (f1f^{-1})
ϕ\phi phi A dispersion parameter for a distribution
ff fit$family$link Link function
f1f^{-1} fit$family$linkinv Inverse link function
𝛃\boldsymbol{\beta} b_j Parameter vector
𝐗\mathbf{X} X_ij A predictor model matrix
𝐙\mathbf{Z} Zt_list Random effect design matrix (list of sparse blocks; rows correspond to observations; subset by group gg)
O𝐬,tO_{\mathbf{s}, t} offset An offset variable at point 𝐬\mathbf{s} and time tt
ω𝐬\omega_{\mathbf{s}} omega_s Spatial random field at point 𝐬\mathbf{s} (knot)
ω𝐬*\omega_{\mathbf{s}}^* omega_s_A Spatial random field at point 𝐬\mathbf{s} (interpolated)
ζ𝐬\zeta_{\mathbf{s}} zeta_s Spatially varying coefficient random field at point 𝐬\mathbf{s} (knot)
ζ𝐬*\zeta_{\mathbf{s}}^* zeta_s_A Spatially varying coefficient random field at point 𝐬\mathbf{s} (interpolated)
ϵ𝐬,t\epsilon_{\mathbf{s}, t} epsilon_st Spatiotemporal random field at point 𝐬\mathbf{s} and time tt (knot)
ϵ𝐬,t*\epsilon_{\mathbf{s}, t}^* epsilon_st_A Spatiotemporal random field at point 𝐬\mathbf{s} and time tt (interpolated)
δ𝐬,t\delta_{\mathbf{s},t} b_t AR(1) or random walk spatiotemporal deviations (knot)
αg\alpha_g RE IID random intercept deviation for group gg
𝐛g\mathbf{b}_{g} re_b_pars Vector of random intercepts/slopes for group gg
𝚺ω\boldsymbol{\Sigma}_\omega - Spatial random field covariance matrix
𝚺ζ\boldsymbol{\Sigma}_\zeta - Spatially varying coefficient random field covariance matrix
𝚺ϵ\boldsymbol{\Sigma}_\epsilon - Spatiotemporal random field covariance matrix
𝐐ω\mathbf{Q}_\omega Q_s Spatial random field precision matrix
𝐐ζ\mathbf{Q}_\zeta Q_s Spatially varying coefficient random field precision matrix
𝐐ϵ\mathbf{Q}_\epsilon Q_st Spatiotemporal random field precision matrix
𝚺g\boldsymbol{\Sigma}_{g} re_cov_pars Random effect covariance per grouping factor (Cholesky SD + correlation params)
σα2\sigma_\alpha^2 sigma_G IID random intercept variance
σϵ2\sigma_\epsilon^2 sigma_E Spatiotemporal random field marginal variance
σω2\sigma_\omega^2 sigma_O Spatial random field marginal variance
σζ2\sigma_\zeta^2 sigma_Z Spatially varying coefficient random field marginal variance
κω\kappa_\omega kappa(0) Spatial decorrelation rate
κϵ\kappa_\epsilon kappa(1) Spatiotemporal decorrelation rate
ρ\rho ar1_rho Correlation between random fields in subsequent time steps
ργ\rho_{\gamma} rho_time Correlation between time-varying coefficients in subsequent time steps
QggQ^{\mathrm{gg}} gengamma_Q Generalized gamma shape parameter (Q0Q \rightarrow 0 gives lognormal; Q=ϕQ=\phi gives gamma)
𝐀\mathbf{A} A Sparse projection matrix to interpolate between knot and data locations
𝐇\mathbf{H} H 2-parameter rotation matrix used to define anisotropy

References

Anderson, S.C., Branch, T.A., Cooper, A.B. & Dulvy, N.K. (2017). Black-swan events in animal populations. Proceedings of the National Academy of Sciences, 114, 3252–3257.
Anderson, S.C., Ward, E.J., English, P.A., Barnett, L.A.K. & Thorson, J.T. (2024). sdmTMB: An R package for fast, flexible, and user-friendly generalized linear mixed effects models with spatial and spatiotemporal random fields. bioRxiv, 2022.03.24.485545.
Bakka, H., Vanhatalo, J., Illian, J., Simpson, D. & Rue, H. (2019). Non-stationary Gaussian models with physical barriers. arXiv:1608.03787 [stat]. Retrieved from https://arxiv.org/abs/1608.03787
Bürkner, P.-C. (2017). brms: An R package for Bayesian multilevel models using Stan. Journal of Statistical Software, 80, 1–28.
Cribari-Neto, F. & Zeileis, A. (2010). Beta regression in R. Journal of Statistical Software, 34, 1–24.
Dunic, J.C., Conner, J., Anderson, S.C. & Thorson, J.T. (2025). The generalized gamma is a flexible distribution that outperforms alternatives when modelling catch rate data. ICES Journal of Marine Science, 82, fsaf040.
Dunn, P.K. & Smyth, G.K. (2005). Series evaluation of Tweedie exponential dispersion model densities. Statistics and Computing, 15, 267–280.
Edwards, A.M. & Auger-Méthé, M. (2019). Some guidance on using mathematical notation in ecology. Methods in Ecology and Evolution, 10, 92–99.
Ferrari, S. & Cribari-Neto, F. (2004). Beta regression for modelling rates and proportions. Journal of Applied Statistics, 31, 799–815.
Fournier, D.A., Skaug, H.J., Ancheta, J., Ianelli, J., Magnusson, A., Maunder, M.N., Nielsen, A. & Sibert, J. (2012). AD model builder: Using automatic differentiation for statistical inference of highly parameterized complex nonlinear models. Optimization Methods and Software, 27, 233–249.
Fuglstad, G.-A., Lindgren, F., Simpson, D. & Rue, H. (2015). Exploring a new class of non-stationary spatial Gaussian random fields with varying local anisotropy. Statistica Sinica, 25, 115–133.
Gay, D.M. (1990). Usage summary for selected optimization routines. Computing Science Technical Report, 153, 1–21.
Hilbe, J.M. (2011). Negative Binomial Regression. Cambridge University Press.
Kristensen, K., Nielsen, A., Berg, C.W., Skaug, H. & Bell, B.M. (2016). TMB: Automatic differentiation and Laplace approximation. Journal of Statistical Software, 70, 1–21.
Lindgren, F. & Rue, H. (2015). Bayesian spatial modelling with R-INLA. Journal of Statistical Software, 63, 1–25.
Lindgren, F., Rue, H. & Lindström, J. (2011). An explicit link between Gaussian fields and Gaussian Markov random fields: The stochastic partial differential equation approach. J. R. Stat. Soc. Ser. B Stat. Methodol., 73, 423–498.
Pebesma, E. (2018). Simple Features for R: Standardized Support for Spatial Vector Data. The R Journal, 10, 439–446. Retrieved from https://doi.org/10.32614/RJ-2018-009
Prentice, R.L. (1974). A log gamma model and its maximum likelihood estimation. Biometrika, 61, 539–544.
R Core Team. (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. Retrieved from https://www.R-project.org/
Thorson, J.T. (2019). Guidance for decisions using the Vector Autoregressive Spatio-Temporal (VAST) package in stock, ecosystem, habitat and climate assessments. Fisheries Research, 210, 143–161.
Thorson, J.T. (2018). Three problems with the conventional delta-model for biomass sampling data, and a computationally efficient alternative. Canadian Journal of Fisheries and Aquatic Sciences, 75, 1369–1382.
Thorson, J.T., Stewart, I.J. & Punt, A.E. (2011). Accounting for fish shoals in single- and multi-species survey data using mixture distribution models. Canadian Journal of Fisheries and Aquatic Sciences, 68, 1681–1693.
Wood, S.N. (2017). Generalized additive models: An introduction with R, 2nd edn. Chapman and Hall/CRC.
Wood, S. & Scheipl, F. (2020). gamm4: Generalized Additive Mixed Models using ’mgcv’ and ’lme4’. Retrieved from https://CRAN.R-project.org/package=gamm4
Zhang, Y. (2013). Likelihood-based and Bayesian methods for Tweedie compound Poisson linear mixed models. Statistics and Computing, 23, 743–757.