Introductory course on spatial statistics focused on combining data with models to generate mechanistic descriptions of spatial patterns. Beginning from a discussion about the importance spatially referenced data, the course teaches methods for modelling spatial data for two routinely occurring situations 1) when the locations of points are the variable of interest (i.e., point pattern analysis), and 2) when the locations are an arbitrary artefact of the sampling design (i.e., Kriging and spatial interpolation). Emphasis is placed on understanding and interpreting the analyses in an applied context.
This course combines lectures on theory and concepts with time
practicing statistical tools in R
based software packages.
The course is designed to equip students with the tools and knowledge
they need to perform a variety of analyses that can model common
features of spatial data. For each topic, there will be a core lecture
module and a lab based practical module. With the lectures, students
will learn the theory behind core modelling tools including point
process models, variograms, Kriging, co-Kriging- regression-Kriging, and
regression with correlated errors, while the lab modules are intended to
provide a pathway that students can follow over the longer term as their
skills develop.
Learning Outcomes:
The course is divided into core lectures and lab-based practicals.
Lectures: Lectures will cover the core concepts of the course. Lecture slides will be posted on GitHub and Canvas prior to the lecture. Students are encouraged to take notes, and to ask questions in the lectures. Lectures will also be recorded and a link to the lecture will be made available via Canvas (subject to any unforeseen technical issues).
Practicals: The computer-labs use structured tutorials designed to train students on the use of the open-source software program R for applying the methods learned in the lectures to data. The lectures will cover the material required for students to complete the practical assignments, however, they are designed to be complementary and not all the material in the practicals will be covered in the lectures and vice versa. These also provide training on communicating the results of statistical analyses, with a focus on reproducibility.
Assessment of student learning will be based on submitted assignments and a practical final exam (details below).
spatstat
, maptools
, sp
,
sf
, viridis
, gstat
,
rgdal
, nlme
, raster
, and
terra
. It is recommended that you install these packages
before the course.There is no textbook for this course, but if students are interested in expanding their knowledge beyond what is covered in the course, the following textbook is recommended:
The assignments and evaluation for this course are structured along two distinct, but complimentary lines: i) guided practical assignments; and ii) a core research project. The practical assignments are designed to provide students with the opportunity to learn how to apply the methods learned in the lectures to real data using the statistical software R. The core research project is designed to train students in data analysis and presentation using best practices in open science. The core project is divided into 3 components aimed at replicating the typical analytical process: 1) initial hypothesis and expected outcome(s); 2) a verbal presentation of the analyses and findings; and 3) the submission of a paper describing the work.
Practical assignments | 39% | Due on a ca. a weekly basis |
Group Project | 30% | In-class during the final week |
Final Exam | 31% | In-class during the final week |
Total | 100% |
During the lab sessions, students will be guided through applied
practicals. They will then be asked to complete in-class assignments
based on the day’s lab. There will be a total of 3 practicals to be
completed throughout the course (i.e., one due on the Monday following
the lab). Each is designed to provide students with the opportunity to
learn how to apply the methods learned in the lectures and labs to novel
datasets using tools implemented in the statistical software
R
. The purpose of these lab assignments is to ensure that
students have the skills necessary to complete effective statistical
analyses that are central to problems involving spatial data. The skills
gained by completing each week’s assignment will also help students
prepare for the final exam and project. GitHub and Canvas will host the
labs, assignments, and the various datasets.
Grading: Each lab assignment is worth a total of 13% of your total grade with some points coming from answering the questions correctly, and some from the cleanliness and documentation of your code. Late assignments will be penalised at a rate of 5% per day.
Working in groups of 3-4, students will be required to complete a data analysis assignment. This assignment is designed to provide students with an opportunity to apply what they have learned in the lectures and labs in an unguided format as they would if they were analysing spatial point data outside of a classroom setting. For this project, students will apply the modelling tools covered in the lectures on an empirical dataset (details below) and give a short presentation about their work on these data that summarises their findings. This presentation should be comprised of five sections: Introduction, Methods, Results, Discussion, References.
Introduction: Provide a brief description of the study system from which the data come and an outline of what questions you intend on exploring with the data, citing any relevant literature.
Methods: Briefly describe the data and what variables are included. Provide a detailed description of the analytical workflow that was applied to the data, citing any relevant literature and statistical packages employed. There should be enough information that anyone can reproduce the workflow if they had access to the data.
Results: Describe your statistical findings. Tables and figures should be used throughout.
Discussion: Provide a brief summary of your findings.
References: Include references to all necessary literature.
In addition, students are to create a GitHub repository for their project where all relevant code and files are to be stored.
Datasets: To complete these assignments, each group much select a species of their choosing and download the occurrence records from the Global Biodiversity Information Facility (GBIF) data repository (https://www.gbif.org/Links to an external site.). The species must be found in British Columbia. Groups will be provided provided with a set of environmental variables, and must use these to describe trends in the observed point data.
Grading: Presentations will be graded out of 100, with 75% based on the pre-provided rubric, and the GitHub repository worth an additional 25%.
The final exam will consist of a set of question designed to assess students’ understanding of the core concepts covered in the course. The exam will be comprehensive, and any material covered in the lectures and labs will be testable.
Grading: The final is worth a total of 30% of your total grade.
Note: Final grades may be subject to scaling based on departmental policies.
(Approximate schedule of topics covered in lectures)
Week | Lecture Topics | Practical Assignment |
---|---|---|
1 | Course introduction; Spatial Intensity | Lab 01: Introduction to the spatstat package |
2 | Spatial Correlations; Poisson Point Process Models | Lab 02: Exploratory Data Analysis |
3 | Point Process Model Validation | Lab 03: Fitting and Validating Poisson Point Process Models |
4 | Spatial Autocorrelation; Variograms; Kriging | No Lab |
5 | Kriging with Covariates | Lab time used for presentations |