This page was last updated on March 20, 2024.
Linear mixed effects models provide a powerful tool for accounting for non-independence in nested data. In matrix notation a linear mixed effects model can be represented as
\(\boldsymbol{y_i}\) is the vector of observations (\(N \times 1\) vector);
\(X_i\) is a matrix of our `fixed’ predictor variables (\(N \times p\) matrix);
\(\boldsymbol{\beta}\) is a vector of fixed effects (\(p \times 1\) vector);
\(Z_i\) is a matrix of our random predictor variables (\(N \times qJ\) matrix for \(q\) random effects and \(J\) groups);
\(\boldsymbol{b_i}\) is a vector of random effects \(\sim \mathcal{N}(0,G_i)\) (\(qJ \times 1\) vector);
\(G_i\) is the variance-covariance matrix of \(\boldsymbol{b_i}\);
\(\boldsymbol{\varepsilon_i}\) is our distribution of errors \(\sim \mathcal{N}(0,\sigma_i)\).
In this Practical you will use the model to:
These data will be used again in Practical 05 when we explore model fitting, selection, and model averaging. Our goal right now is dive into these data to get a feel for their structure so we know how to approach the modelling process. You are asked to complete the following exercises and submit to Canvas before the deadline. In addition to the points detailed below, 5 points are assigned to the cleanliness of the code and resulting pdf document. Only knit documents (.pdf, .doc, or .html) will be accepted. Unknit .Rmd files will not be graded.
We will continue our exploration into mixed effects regression using data that were collected by Roulin and Bersier (2007) to explore whether the sex of the parent influenced how much nestling barn owls begged for food. To collect these data they set up cameras in 27 barn owl (Tyto alba) nests and studied the begging behaviour of nestlings in response to the presence of the father and of the mother. The main response variable we will be looking at is ‘NegPerChick’, which is the number of counted calls coming from a nest in the 30-s interval before the arrival of the parent, divided by the number of nestlings in that nest. The explanatory variables in these data are:
Over the course of this practical you will follow a step-wise procedure aimed at identifying the best possible model for these data. Remember that the ultimate goal of Roulin and Bersier was to determine whether the sex of the parent influenced how much nestling barn owls begged for food. With that in mind the first step is to import the data and visually inspect them to get a feel for their structure.
data <- read.csv("https://noonanm.github.io/Biol520C/Practical_Datasets/Owls.csv")
As a first step in any model fitting process, it is a good idea to fit a basic multiple linear regression model to your data as ‘quick and dirty’ approach to getting a feel for what direction you need to go.
*
notation for specifying the interactions). – 1
point(s)After getting a general feel for the features that you expect the need to account for in your data, the next step (for these data anyway) is to identify the most appropriate random effect structure for the data and model. Starting with a multiple linear regression model that contains the full range of fixed effects that we will be interested in testing, we compare 3 models:
These models are then compared using likelihood ratio tests and AIC values.
gls()
function to fit a multiple linear
regression model to predict log transformed NegPerChick as a function of
the interaction between the sex of the parent and the food treatment,
and the sex of the parent and arrival time. – 1 point(s)lme()
function to fit the same model, but with
a random intercept around data from the same nest. – 1 point(s)anova()
function to compare the models.
Describe the results and decide which random effect structure you should
use for these data. In particular what are the results of the likelihood
ratio tests, and what model does AIC model selection favour? – 2
point(s)method = "ML"
. Models fit by a mix
of ML and REML can not be compared using likelihood ratio tests or AIC
values. We will not be covering REML in this course, but this is
something you should be aware of.
A likelihood ratio test indicates that a mixed effects model with a random intercept is a significant improvement over the model that contains only fixed effects (\(\mathcal{L} = 27.34\), df = 1, \(p < 0.0001\)). A mixed effects model with a random slope and intercept offers an even further improvement of the random intercept only model (\(\mathcal{L} = 13.35\), df = 1, \(p = 0.0013\)). The random slope and intercept model is also favoured by AIC based model selection with an AIC value of -10.1, compared to AIC = -0.8 for the random intercept model, and AIC = 24.6 for the fixed effects only model. Collectively, this information suggests that a random slope and intercept is the most appropriate random effects structure for these data.
After identifying the correct random effects structure, we should next perform some diagnostic checks to ensure there are no glaring issues remaining.
After identifying the correct random effects structure, and having
confirmed that there are no glaring issues remaining, the next step is
to identify which of the fixed effects we should keep in our model.
There are two ways to do this. Manually, using likelihood ratio tests
and AIC values, or automatically using the dredge()
function from the MuMin
package. We will explore both
options.
summary()
function to obtain the parameter
estimates for the model you identified above. Looking at the parameter
estimates, the standard errors, and p-values, which terms do you think
might be worth dropping? – 1 point(s)dredge()
function from the MuMin
package. – 1 point(s)After identifying the correct fixed and random effects structure, we should perform a (hopefully) final round of diagnostic checks to ensure there are no issues remaining with our model
Because a number of the top models are all within \(\Delta\)AICc of 2 of the top model, this may be a situation that could benefit from model averaging.
help(subset.model.selection)
). – 1 point(s)model.avg()
function. – 1 point(s)The whole point of fitting this model was to identify the factors influencing begging behaviour of nestling owls.