
This function designates observations to the cells of a given grid, sampling within each observation's uncertainty circle, to create an aggregated data cube.

Usage

grid_designation(
  observations,
  grid,
  id_col = "row_names",
  seed = NA,
  aggregate = TRUE,
  randomisation = c("uniform", "normal"),
  p_norm = ifelse(tolower(randomisation[1]) == "uniform", NA, 0.95)
)

Arguments

observations

An sf object with POINT geometry and, optionally, time_point and coordinateUncertaintyInMeters columns. If time_point is not present, the function assumes a single time point. If coordinateUncertaintyInMeters is not present, the function assumes no uncertainty (zero meters) around the observation points.

grid

An sf object with POLYGON geometry (usually a grid) to which observations should be designated.

id_col

The name of the column in grid containing unique IDs for each grid cell. If "row_names" (the default), a new column cell_code is created in which the row names serve as the unique IDs.

seed

A positive numeric value setting the seed for random number generation to ensure reproducibility. If NA (default), then set.seed() is not called at all. If not NA, then the random number generator state is reset (to the state before calling this function) upon exiting this function.
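The seed behaviour described above can be pictured with a small base R sketch. The helper `with_local_seed()` is hypothetical and written only for illustration; it is not part of the package, and the package's internal mechanism may differ.

```r
# Hypothetical helper: evaluate `code` under a temporary seed, then restore
# the caller's random number generator state when the function exits.
with_local_seed <- function(seed, code) {
  if (!is.na(seed)) {
    has_state <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
    if (has_state) {
      old_state <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
      on.exit(assign(".Random.seed", old_state, envir = globalenv()), add = TRUE)
    } else {
      on.exit(rm(".Random.seed", envir = globalenv()), add = TRUE)
    }
    set.seed(seed)
  }
  code  # lazily evaluated here, after the seed (if any) has been set
}

set.seed(1)
a <- runif(1)                          # first draw from the caller's stream

set.seed(1)
x1 <- with_local_seed(123, runif(3))   # seeded draws
b <- runif(1)                          # caller's stream is unaffected
x2 <- with_local_seed(123, runif(3))   # same seed, same draws

identical(x1, x2)
identical(a, b)
```

Both comparisons hold because the seeded call reproduces its own draws and leaves the surrounding random number stream where it was.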

aggregate

Logical. If TRUE (default), returns the data cube in aggregated form (the grid with the number of observations per grid cell). Otherwise, returns the sampled points within the uncertainty circles.

randomisation

Character. Method used for sampling within the uncertainty circle around each observation. "uniform" (default) means each point in the uncertainty circle has an equal probability of being selected. The other option is "normal", where a point is sampled from a bivariate Normal distribution with means equal to the observation point and variance such that a proportion p_norm of all possible samples from this Normal distribution falls within the uncertainty circle. See sample_from_binormal_circle().
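As an illustration of what "uniform" sampling within a circle involves, here is a minimal base R sketch. The function `sample_uniform_circle()` and its arguments are hypothetical names chosen for this example; the package's own sampling code may be written differently.

```r
# Hypothetical helper: draw n points uniformly from a circle of radius r
# centred on (x0, y0).
# The square root on the radial draw makes the density uniform over the area;
# without it, points would cluster near the centre.
sample_uniform_circle <- function(n, x0, y0, r) {
  angle <- runif(n, min = 0, max = 2 * pi)
  dist <- r * sqrt(runif(n))
  cbind(x = x0 + dist * cos(angle), y = y0 + dist * sin(angle))
}

set.seed(42)
pts <- sample_uniform_circle(10000, x0 = 100, y0 = 200, r = 50)
d <- sqrt((pts[, "x"] - 100)^2 + (pts[, "y"] - 200)^2)

max(d)                    # never exceeds the radius (up to rounding)
mean(d <= 50 / sqrt(2))   # about half: the inner disc has half the area
```

The second check uses the fact that a disc of radius r / sqrt(2) covers half the area of the full circle, so roughly half of uniformly sampled points should fall inside it.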

p_norm

A numeric value between 0 and 1, used only if randomisation = "normal". The proportion of all possible samples from a bivariate Normal distribution that fall within the uncertainty circle. Default is 0.95 if randomisation = "normal", and NA otherwise.
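The link between p_norm and the spread of the Normal distribution can be sketched in base R. This sketch assumes an isotropic, uncorrelated bivariate Normal (equal standard deviation in both directions), in which case the squared distance from the mean divided by the variance follows a chi-squared distribution with 2 degrees of freedom. The helper name `sigma_for_circle()` is hypothetical; see sample_from_binormal_circle() for the package's actual behaviour.

```r
# Hypothetical helper: standard deviation such that a proportion p_norm of an
# isotropic bivariate Normal distribution lies within distance r of its mean.
sigma_for_circle <- function(r, p_norm = 0.95) {
  r / sqrt(qchisq(p_norm, df = 2))
}

set.seed(1)
r <- 50
sigma <- sigma_for_circle(r, p_norm = 0.95)

# Empirical check: simulate points around the origin and measure the
# proportion that falls inside the circle of radius r.
x <- rnorm(1e5, mean = 0, sd = sigma)
y <- rnorm(1e5, mean = 0, sd = sigma)
prop_inside <- mean(sqrt(x^2 + y^2) <= r)
prop_inside   # close to 0.95
```

Under these assumptions, the remaining proportion (1 - p_norm) of sampled points lands outside the uncertainty circle, so a smaller p_norm spreads the samples more widely.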

Value

If aggregate = TRUE, an sf object with POLYGON geometry containing the grid cells, an n column with the number of observations per grid cell, and a min_coord_uncertainty column with the minimum coordinate uncertainty per grid cell. If aggregate = FALSE, an sf object with POINT geometry containing the sampled observations within the uncertainty circles, and a coordinateUncertaintyInMeters column with the coordinate uncertainty for each observation.
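To make the two aggregated columns concrete, the following base R sketch computes n and min_coord_uncertainty from a toy table of sampled points that have already been matched to grid cells. All values and the six-cell grid are hypothetical, and the sketch ignores geometry and time_point; it only mirrors the shape of the result, not the package's implementation.

```r
# Hypothetical sampled points, each already assigned to a grid cell
sampled <- data.frame(
  cell_code = c("1", "1", "4", "5"),
  coordinateUncertaintyInMeters = c(60, 72, 38, 83)
)
all_cells <- as.character(1:6)   # hypothetical grid with six cells

# Treat cell codes as a factor over all cells so empty cells are kept
cell_factor <- factor(sampled$cell_code, levels = all_cells)

# Number of points per cell (0 for empty cells)
n <- as.integer(table(cell_factor))

# Minimum uncertainty per cell (NA for empty cells)
min_unc <- tapply(sampled$coordinateUncertaintyInMeters, cell_factor, min)

cube <- data.frame(
  cell_code = all_cells,
  n = n,
  min_coord_uncertainty = as.numeric(min_unc)
)
cube
```

Cells without any sampled point keep a count of zero and a missing minimum uncertainty, which matches the rows with n equal to 0 and NA in the example output of this page.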

Examples

library(sf)
library(dplyr)

# Create four random points with gamma-distributed coordinate uncertainty
n_points <- 4
xlim <- c(3841000, 3842000)
ylim <- c(3110000, 3112000)
coordinate_uncertainty <- rgamma(n_points, shape = 5, rate = 0.1)

observations_sf <- data.frame(
  lat = runif(n_points, ylim[1], ylim[2]),
  long = runif(n_points, xlim[1], xlim[2]),
  time_point = 1,
  coordinateUncertaintyInMeters = coordinate_uncertainty
) %>%
  st_as_sf(coords = c("long", "lat"), crs = 3035)

# Buffer each point by its coordinate uncertainty (in meters)
observations_buffered <- observations_sf %>%
  st_buffer(observations_sf$coordinateUncertaintyInMeters)

# Create a grid of 200 m x 200 m square cells covering the buffered points
grid_df <- st_make_grid(
  observations_buffered,
  square = TRUE,
  cellsize = c(200, 200)
) %>%
  st_sf()

# Create occurrence cube
grid_designation(
  observations = observations_sf,
  grid = grid_df,
  seed = 123
)
#> Simple feature collection with 40 features and 4 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 3841029 ymin: 3110005 xmax: 3841829 ymax: 3112005
#> Projected CRS: ETRS89-extended / LAEA Europe
#> # A tibble: 40 × 5
#>    time_point cell_code     n min_coord_uncertainty                     geometry
#>  *      <dbl> <chr>     <int>                 <dbl>                <POLYGON [m]>
#>  1          1 1             1                  59.9 ((3841029 3110005, 3841229 …
#>  2          1 35            1                  83.1 ((3841429 3111605, 3841629 …
#>  3          1 37            1                  26.1 ((3841029 3111805, 3841229 …
#>  4          1 4             1                  38.5 ((3841629 3110005, 3841829 …
#>  5          1 2             0                  NA   ((3841229 3110005, 3841429 …
#>  6          1 3             0                  NA   ((3841429 3110005, 3841629 …
#>  7          1 5             0                  NA   ((3841029 3110205, 3841229 …
#>  8          1 6             0                  NA   ((3841229 3110205, 3841429 …
#>  9          1 7             0                  NA   ((3841429 3110205, 3841629 …
#> 10          1 8             0                  NA   ((3841629 3110205, 3841829 …
#> # ℹ 30 more rows