Spatial Data and Analyses in R

Chapter 1 Introduction

1.1 Data types

Spatial phenomena usually consist of either (a) discrete entities with clear boundaries (e.g. roads, countries, research stations), or (b) continuous phenomena with values everywhere (e.g. elevation, rainfall). Discrete objects are normally stored as vector data, which can consist of points, lines or polygons, possibly with associated information. Data on continuous phenomena are generally stored as raster data: essentially a grid of (often equally sized, square) cells (or pixels), each containing the value of the variable(s) being mapped.
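
As a minimal sketch of the two data types, the code below builds a small vector object (two points with an attribute) and a small raster (a 3 x 4 grid) with the terra package. The coordinates are approximate and only illustrative, and the object names (xy, pts, r) are made up for this example.

```r
library(terra)

# Vector data: two points given as (longitude, latitude), each with a name
# attribute. The coordinates are approximate, illustrative values.
xy <- cbind(lon = c(5.6654, 13.4050), lat = c(51.9692, 52.5200))
pts <- vect(xy, type = "points",
            atts = data.frame(name = c("Wageningen", "Berlin")),
            crs = "EPSG:4326")

# Raster data: a grid of 3 rows x 4 columns in which every cell holds a value
r <- rast(nrows = 3, ncols = 4, xmin = 0, xmax = 4, ymin = 0, ymax = 3,
          crs = "EPSG:4326")
values(r) <- 1:12

geomtype(pts)  # geometry type of the vector object
pts$name       # attribute values stored with the points
ncell(r)       # number of grid cells in the raster
res(r)         # cell size in x and y direction
```

Printing pts or r gives a summary of the object (geometry or grid dimensions, extent and CRS), which is a quick way to check what kind of data you are dealing with.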

1.2 Coordinate systems

Coordinate systems are fundamental to spatial data: they provide the information needed to know where on Earth’s surface (or on a map) something is located. There are generally two types of coordinate reference systems (CRS): (1) geographic coordinate systems, and (2) projected coordinate systems.

Geographic coordinate systems define locations by latitude and longitude: latitude is the angle between a point and the equator, and longitude is the angle between a point and the prime meridian. A geographic coordinate system can therefore reference any location on Earth, which makes it a global coordinate system. The angular units are usually degrees.

Projected coordinate systems (also known as Cartesian coordinate systems) define locations on a flat surface using Cartesian coordinates (x and y coordinates), which give the horizontal and vertical positions of locations. Coordinates in projected systems are usually defined in meters (or another linear unit). A projected coordinate system is always based on a geographic coordinate system that has been flattened using a map projection, which defines how the Earth’s surface is distorted in order to go from 3D to 2D. There are many different projections, and each one is best suited to a particular part of the Earth or a particular purpose.
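
To illustrate the difference, the sketch below stores one point in a geographic CRS (degrees) and transforms it to a projected CRS (meters) with terra's project() function. The coordinates are approximate values for Wageningen, the object names are made up for this example, and the two CRSs are identified by their EPSG codes (explained in the next subsection).

```r
library(terra)

# One point in a geographic CRS: longitude and latitude in degrees
# (approximate, illustrative coordinates for Wageningen)
p_geo <- vect(cbind(5.6654, 51.9692), type = "points", crs = "EPSG:4326")

# The same location in a projected CRS for the Netherlands: x and y in meters
p_rd <- project(p_geo, "EPSG:28992")

crds(p_geo)       # coordinates in degrees
crds(p_rd)        # coordinates in meters
is.lonlat(p_geo)  # TRUE: geographic (longitude/latitude)
is.lonlat(p_rd)   # FALSE: projected
```

Both objects refer to the same place on Earth; only the numbers used to describe that place differ.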

1.2.1 EPSG codes

EPSG codes are used to identify coordinate reference systems: each code consists of 4-5 digits and is linked to a definition that uniquely describes a specific CRS. Handy websites to look up information about a CRS (e.g. using an EPSG code) include epsg.io and spatialreference.org. Important coordinate systems and their EPSG codes include:

  • WGS84 (EPSG:4326): a global geographic CRS based on the Earth’s centre of mass. WGS84 is used by GPS and other global navigation satellite systems (GNSS);
  • Web Mercator (EPSG:3857): a global projected CRS used by many web-based mapping tools (e.g. Google Maps or OpenStreetMap);
  • Amersfoort / RD New (EPSG:28992): a local projected CRS for the Netherlands;
  • ETRS89-extended / LAEA Europe (EPSG:3035): a local projected CRS for Europe.
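
Besides the websites mentioned above, a CRS can also be inspected from within R. The sketch below assumes a recent version of terra and loops over the four EPSG codes listed above, printing the name of each CRS and whether it uses longitude/latitude coordinates. The dummy point at (0, 0) only serves as a carrier for the CRS, and the names codes, lonlat, v and d are made up for this example.

```r
library(terra)

codes <- c(4326, 3857, 28992, 3035)
lonlat <- logical(length(codes))

for (i in seq_along(codes)) {
  # A dummy point that only serves to carry the CRS definition
  v <- vect(cbind(0, 0), type = "points", crs = paste0("EPSG:", codes[i]))
  d <- crs(v, describe = TRUE)  # data frame with, among others, the CRS name
  lonlat[i] <- is.lonlat(v)     # TRUE for a geographic (lon/lat) CRS
  cat("EPSG:", codes[i], " | ", d$name, " | lon/lat: ", lonlat[i], "\n",
      sep = "")
}
```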

1.3 Helpful resources

There is a very long list of packages in R that deal in some way with spatial data and/or analyses. This tutorial focuses on the terra package. Other good packages include sf (Simple Features for R; a tidyverse-friendly way of dealing with spatial data) and stars (for dealing with spatiotemporal data).

For more resources, see the CRAN Task View: Analysis of Spatial Data, or Appendix D for suggestions for further reading.

1.4 Case study

As a simple case study to practice some skills in working with spatial data in R, we are going to retrieve long-term climate data for some of the most important cities in our part of Europe: Berlin, Brussels, London, Paris, Vienna, and (obviously) Wageningen :-)

1.5 Preliminaries

Before working on the case study in R, you will have to prepare some files and (sub)folders on your computer, so that you can later store (and load) data efficiently without your project becoming a mess (when working with spatial data, the number of files increases rapidly, so proper project management is needed!). So, on your computer, create the following files and (sub)folders. Note that in the structure shown below the main folder (i.e. the project root) is set to C:/WEC/, but on your computer you can of course choose a different folder:

C:/WEC/
|-- script_day3.r
|
|-- data/
    |-- cities/
    |
    |-- countries/
    |
    |-- worldclim/
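
The structure above can be created by hand in a file manager, but as a minimal sketch it can also be created from R with base functions only. The variable names (root, subdirs, script) are made up for this example, and root should be changed to your own project folder.

```r
# Project root; change this to your own folder
root <- "C:/WEC"

# The three data subfolders shown in the structure above
subdirs <- file.path(root, "data", c("cities", "countries", "worldclim"))

# Create the folders, including the parent folders where needed;
# folders that already exist are left untouched
for (d in subdirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
}

# Create an empty script file, but only if it does not exist yet,
# so that an existing script is never overwritten
script <- file.path(root, "script_day3.r")
if (!file.exists(script)) {
  file.create(script)
}
```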

Then, we need to start a new script (e.g. ‘script_day3.r’ as shown above), set the working directory, and load the needed packages (install them first if needed). For example:

# Set working directory
setwd("C:/WEC") # update to your own working directory

# Load libraries
library(terra)
library(leaflet)
library(geodata)

In the next pages, we mainly use the terra package, but also use the leaflet package for dynamic plotting and the geodata package for retrieving data from online sources.
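
If you are not sure whether these packages are already installed, a small check such as the sketch below can help. It only reports which packages are missing and does not install anything itself; the variable names (needed, installed, missing_pkgs) are made up for this example.

```r
# Packages used in this tutorial
needed <- c("terra", "leaflet", "geodata")

# Check for each package whether it can be found on this computer
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- needed[!installed]

if (length(missing_pkgs) > 0) {
  # Report what is missing; install these with install.packages()
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All needed packages are installed.")
}
```

Any package reported as missing can then be installed with install.packages(), for example install.packages("terra").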