Spatial Data and Analyses in R

Chapter 4 Processing raster data

4.1 Cropping rasters

As we saw earlier, the different raster objects that we have loaded (bio, water, and mask) have different extents:

ext(bio)
## SpatExtent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
ext(water)
## SpatExtent : -3, 19, 46, 55 (xmin, xmax, ymin, ymax)
ext(mask)
## SpatExtent : -3, 19, 46, 55 (xmin, xmax, ymin, ymax)

Namely, the bio raster has a global extent, whereas the other two layers have a west European extent.

In order to focus our analyses to the area of interest, we can crop the object bio from a global extent to a smaller extent using the crop function, while supplying it with an extent to crop the raster to (here the output of the ext(mask) function):

bio <- crop(bio, ext(mask))

Alternatively, we could also crop rasters manually if we know the coordinates to which we want to crop the raster: the extent function can also take as argument a vector of values in the order of: xmin, xmax, ymin, ymax. Thus, we could also crop the bio raster to longitude ranging from -3 to 19 degrees, and latitude ranging from 46 to 55 degrees:

bio <- crop(bio, ext(-3, 19, 46, 55)) # xmin, xmax, ymin, ymax

Let’s check whether bio now has a smaller extent:

ext(bio)
## SpatExtent : -3, 19, 46, 55 (xmin, xmax, ymin, ymax)
plot(bio[[12]])
points(cities, pch=16, col="black")

As we can see, the bio raster now has a much smaller extent!

4.2 Subsetting layers

Instead of focussing on all 19 bioclimatic variables, let’s focus on 4 variables: Annual Mean Temperature (bio1), Temperature Seasonality (bio4), Max Temperature of Warmest Month (bio5) and Annual Precipitation (bio12). We can use the subset function to subset layers from a raster. Let’s assign the subsetted raster to an object called bio_sub:

bio_sub <- subset(bio, c("bio1","bio4","bio5","bio12"))

Let’s check the number of layers in bio_sub, their names, and plot it:

nlyr(bio_sub)
## [1] 4
names(bio_sub)
## [1] "bio1"  "bio4"  "bio5"  "bio12"
plot(bio_sub)

As we can see, the plot function will automatically plot all (here: 4) layers in the raster.

4.3 Calculating with rasters

We can easily see a quick summary of the values in each layer in the raster object using the summary function:

summary(bio_sub)
##       bio1             bio4            bio5            bio12       
##  Min.   :-5.922   Min.   :341.4   Min.   : 3.789   Min.   : 253.0  
##  1st Qu.: 8.183   1st Qu.:563.2   1st Qu.:21.458   1st Qu.: 624.0  
##  Median : 8.965   Median :630.8   Median :22.749   Median : 741.0  
##  Mean   : 8.791   Mean   :623.8   Mean   :22.455   Mean   : 785.8  
##  3rd Qu.: 9.951   3rd Qu.:697.8   3rd Qu.:23.768   3rd Qu.: 850.0  
##  Max.   :13.288   Max.   :840.9   Max.   :28.928   Max.   :2406.0  
##  NA's   :4586     NA's   :4586    NA's   :4586     NA's   :4586

To retrieve global statistics (i.e. statistics that summarize values of an entire raster), we can use the global function. For example, to compute the maximum value of the raster layers in bio_sub:

global(bio_sub, fun = "max", na.rm = TRUE)
##             max
## bio1    13.2875
## bio4   840.9453
## bio5    28.9280
## bio12 2406.0000

Here we use na.rm = TRUE to remove missing values (NA) before computing the maximum.

We can apply basic mathematical operiations (multiplication, addition etc) directly on raster objects. For example, the BIO3 variable (Isothermality) is computed as BIO2 / BIO7 * 100. Thus, we could compute it ourselves and verify by comparing the result with bio3:

bio[["bio2"]] / bio[["bio7"]] * 100
bio[["bio3"]]

We see that the range in values is equal (we could also plot them to verify that they are identical).

4.4 Working with classes

We saw earlier that the values of the water object are only 1 (denoting water) or 0 (denoting land). Suppose that we want to know for each city in cities how far it is from the sea or ocean, we may want to create a distance to water raster. A handy way of getting such raster is by first creating a raster that the value 1 for grid cells that are water, and value NA elsewhere. Let’s create such a layer and call it dwater:

### Get raster with 1 = water and NA else
dwater <- water            # first we copy the water raster
names(dwater) <- "dwater"  # assign it a more meaningful name
dwater[dwater == 0] <- NA  # replace the value 0 with the value NA

# check by plotting and creating a frequency table
freq(dwater)
##   layer value count
## 1     1     1  4586
plot(dwater)

4.5 Computing distances

Given that we have created the dwater raster above, with values 1 (water) and NA (elsewhere), we can now conveniently compute the distance to water using the distance function from the terra package. This function computes, for each grid cell, the Euclidean distance (thus ‘as the crow flies’) to the nearest grid cell that is not NA (thus, as per the creation of dwater: those grid cells that are water!):

### Compute distance to water
dwater <- distance(dwater) # note that this may take several seconds to compute
## 
|---------|---------|---------|---------|
=========================================
                                          
names(dwater) <- "dwater"  # assign it a more meaningful name
dwater <- dwater / 1000    # since the distance is returned in meters, convert km
plot(dwater)

4.6 Merging raster layers

We now have multiple objects with raster data: bio (with all 19 bioclimatic variables), bio_sub (with our 4 selected layers), water, dwater, as well as a raster with masking information: mask.

We can combine different layers into a single raster (assuming that these objects are in the same CRS, extent and resolution) using the c function. Let’s merge bio_sub and dwater into a single raster object called rstack:

### Combine the different raster objects into a single raster
rstack <- c(bio_sub, dwater)

# inspect
rstack
## class       : SpatRaster 
## dimensions  : 108, 264, 5  (nrow, ncol, nlyr)
## resolution  : 0.08333333, 0.08333333  (x, y)
## extent      : -3, 19, 46, 55  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326) 
## source(s)   : memory
## names       :      bio1,     bio4,   bio5, bio12,   dwater 
## min values  : -5.921834, 341.4172,  3.789,   253,   0.0000 
## max values  : 13.287500, 840.9453, 28.928,  2406, 920.5839
names(rstack)
## [1] "bio1"   "bio4"   "bio5"   "bio12"  "dwater"

4.7 Masking rasters

Now that we have combined all rasters with environmental data into a single object rstack, we can mask the object using the mask: this creates a new raster object that has the same values as the input object, except for the cells that are NA in a ‘mask’ object:

### Mask the values in the rstack object using the mask layer
rstack <- mask(rstack, mask)

# Check by plotting
plot(rstack)

We now have a single raster object rstack with 5 layers: 4 selected bioclimatic variables (with adjusted values), and a layer with the distance to the sea or ocean (in km). All values in the resulting raster are masked using the mask grid, so that grid cells that contain water are masked to value NA.

4.8 Resampling rasters

Especially when working with many different raster objects, it often happens that the raster objects do not match (in terms of origin, extent and resolution), even when they are in the same CRS (see section 5 for [re]-projecting) raster layers into a different CRS). In such cases, it could help to crop or extend a/the raster (using the crop or extend functions), or to (dis)aggregate rasters (using the aggregate or disagg functions; where aggregate lowers the resolution by a specified factor and with specified function, whereas disagg creates a raster with a higher resolution - thus smaller grid cells - than the original raster by a specified factor using a specified function).

Also, it may be needed to resample rasters, using the resample function, which transfers values between non matching raster objects (in terms of origin and resolution) and uses bilinear interpolation or nearest neighbour assignment to assign values to resampled grid cells.

Note that aggregating, disaggregating and resampling rasters with categorical values (classes) should only be done with great care: see Appendix C.

4.9 Other processing methods

For more information on processing spatial data, see Appendix D. For a list with the most important methods of processing spatial data using the terra package, grouped by theme, see:

?terra