Spatial Data and Analyses in R

Chapter 2 Vector data

2.1 Loading data from local files

Let’s load the locations of these cities into R: download the file cities.zip into your “data” folder, and unzip it to the “cities” subfolder, so that the folder “data/cities” contains the 4 unzipped files that together make a shapefile. Then, we can use the vect function from the terra package to load the shapefile into R (see Appendix A for other ways of loading vector data into R):

### Load the cities shapefile into a vector layer
cities <- vect("data/cities/cities.shp")

Now, we can explore the properties of the cities object using e.g.:

# overall overview
cities
##  class       : SpatVector 
##  geometry    : points 
##  dimensions  : 6, 1  (geometries, attributes)
##  extent      : -0.1249526, 16.38053, 48.19038, 52.5132  (xmin, xmax, ymin, ymax)
##  source      : cities.shp
##  coord. ref. : lon/lat WGS 84 (EPSG:4326) 
##  names       :       city
##  type        :      <chr>
##  values      : Wageningen
##                     Paris
##                    London
# class
class(cities)
## [1] "SpatVector"
## attr(,"package")
## [1] "terra"
# spatial extent/bounding box
ext(cities)
## SpatExtent : -0.124952562617393, 16.3805349488525, 48.1903797097879, 52.5132034875388 (xmin, xmax, ymin, ymax)
# retrieve information on coordinate system
crs(cities, proj = TRUE) # crs in PROJ-string notation
## [1] "+proj=longlat +datum=WGS84 +no_defs"
crs(cities, describe = TRUE) # EPSG code and other info
##     name authority code area         extent
## 1 WGS 84      EPSG 4326 <NA> NA, NA, NA, NA
crs(cities, proj = TRUE, describe = TRUE) # combined
##     name authority code area         extent                                proj
## 1 WGS 84      EPSG 4326 <NA> NA, NA, NA, NA +proj=longlat +datum=WGS84 +no_defs
# columns with properties per location
names(cities)
## [1] "city"
# header of all properties
head(cities)
##         city
## 1 Wageningen
## 2      Paris
## 3     London
## 4     Berlin
## 5   Brussels
## 6     Vienna
# retrieve inforamtion on geometry (e.g. coordinates)
geom(cities)
##      geom part          x        y hole
## [1,]    1    1  5.6634428 51.98553    0
## [2,]    2    1  2.2917644 48.85816    0
## [3,]    3    1 -0.1249526 51.49865    0
## [4,]    4    1 13.3766394 52.51320    0
## [5,]    5    1  4.3420292 50.89498    0
## [6,]    6    1 16.3805349 48.19038    0

Above, we can see that the loaded shapefile has coordinates that are stored in the ‘WGS84’ geographical coordinate system (EPSG: 4326).

We can use a dynamic mapping library like LeafletJS (see Leaflet for R) to plot geographical data in an interactive map. We can map the locations stored in cities as basic markers on a Leaflet map, with a popup that shows the city name, using a very simple syntax:

### Plot on a dynamic leaflet map
plet(cities)

2.2 Loading data from online sources

We can use the gadm function from the geodata package to (down)load the outline of administrative boundaries for any country in the world; they are read in as a SpatVector polygon object. As input it needs ISO3 country code data (see function country_codes), e.g. “NLD” for the Netherlands.

Suppose that we want to load the outline of 13 European countries that are in the vicinity of our selected cities, we could use:

### ISO3 codes for selected European countries
euroISO3 <- c("AUT", "BEL", "CHE", "CZE", "DEU", "DNK", "FRA",
              "GBR", "HUN", "ITA", "LUX", "NLD", "POL")

### Download country outline data
countries <- gadm(country = euroISO3, # vector with ISO3 names of the countries
                  level = 0, # 0: country, 1: first level of subdivision
                  path = "data/countries/") # folder where to save data
                                            # to save in temporary folder use: 
                                            # path = tempdir()

If we check the folder “data/countries”, we see that a folder “gadm” is added, with 13 files; one for each country.

When the data has succesfully been downloaded, running this code again will not download the files again, but rather simply load the downloaded files instead of downloading the files again.

Let’s see what we now have loaded:

# explore
countries
##  class       : SpatVector 
##  geometry    : polygons 
##  dimensions  : 13, 2  (geometries, attributes)
##  extent      : -8.649996, 24.14578, 35.49292, 60.84548  (xmin, xmax, ymin, ymax)
##  coord. ref. : lon/lat WGS 84 (EPSG:4326) 
##  names       : GID_0     COUNTRY
##  type        : <chr>       <chr>
##  values      :   AUT     Austria
##                  BEL     Belgium
##                  CHE Switzerland

We can see that the countries object contains 13 geometries (the 13 selected countries) and 2 attributes (the attributes “GID_0” and “COUNTRY”).

We could save the countries object as a shapefile, so that we could open it in another GIS software, or so we can efficiently load it later into R. To save vector files to your computer, we could use the writeVector function:

writeVector(countries,
            filename = "data/countries/countries.shp",
            filetype = "ESRI Shapefile")

After saving this file, we could easily load it using the vect function:

countries <- vect("data/countries/countries.shp")

Note that if we want to subset the countries object further for a specific set of countries, we could use the subset function. For example, suppose that we want to retrieve the subset with only the Netherlands and Belgium, we could use:

subset(countries, countries$GID_0 %in% c("NLD","BEL"))

In order to do this efficiently, we can specify a vector with countries to keep using the c() function, and use the %in% logical operator to keep all of those features in the countries object that has a value for the “GID_0” column that is present in the specified vector.

We can check the object countries by plotting it (note that it can take some time, especially if you have a slow computer or when your computer is buisy with other tasks) and adding the cities as (red) points:

plot(countries)
points(cities, col = "red", pch = 16) #