Generate \(n_{\mathrm{inv}}\) trait rows for hypothetical invaders by resampling the empirical distribution of resident traits. By default, traits are sampled independently per column (“columnwise”), creating novel combinations across traits. Alternatively, entire rows can be bootstrapped (“rowwise”) to preserve the resident covariance structure. Row names of the returned object are set to the invader IDs.

Usage

simulate_invaders(
  resident_traits,
  n_inv = 10,
  species_col = "species",
  trait_cols = NULL,
  mode = c("columnwise", "rowwise"),
  numeric_method = c("bootstrap", "normal", "uniform"),
  keep_bounds = TRUE,
  inv_prefix = "inv",
  keep_species_column = TRUE,
  seed = NULL
)

Arguments

resident_traits

A data.frame containing a species ID column (specified by species_col) or species IDs as row names, plus one or more trait columns (numeric, factor, character supported).

n_inv

Integer; number of invaders to simulate (default 10).

species_col

NULL or character; name of the species ID column in resident_traits (default "species"). If NULL, species IDs are taken from the row names.

trait_cols

NULL or character vector; which trait columns to use. If NULL (default), all columns except species_col (when present) are used.

mode

Either "columnwise" (new trait combinations; default) or "rowwise" (preserve covariance by resampling rows).

numeric_method

For columnwise numeric traits: one of "bootstrap" (default; sample from observed values), "normal" (draws from a normal distribution with the empirical mean \(\bar{x}\) and standard deviation \(s\); truncated to the observed [min, max] if keep_bounds = TRUE), or "uniform" (draws from Uniform[min, max]).

keep_bounds

Logical (default TRUE); if TRUE, "normal" or "uniform" draws are constrained to the observed [min, max]. Has no effect when numeric_method = "bootstrap".

inv_prefix

Character; prefix used to construct invader IDs (default "inv").

keep_species_column

Logical (default TRUE); when species_col is not NULL, keep the species ID column in the output after the row names have been set. Ignored if species_col = NULL.

seed

NULL or integer; optional RNG seed for reproducibility.

Value

A data.frame of simulated invaders. Row names are the invader IDs. If species_col is not NULL and keep_species_column = TRUE, that column will contain the same IDs as the row names.

Details

Species identifiers. Supply species IDs in a dedicated column via species_col, or set species_col = NULL to use row.names(resident_traits) as the species IDs. In both cases, newly simulated invaders receive fresh, unique IDs formed by appending 1, 2, … to the value of inv_prefix (for example inv1, inv2, … with the default prefix), and these become the row names of the returned data.frame. When species_col is not NULL, the same IDs are also stored in that column (unless keep_species_column = FALSE).
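
As a rough illustration of this ID scheme (a base-R sketch, not the function's internal code), IDs can be built by pasting a prefix onto a running index and then used as row names. The variable names and the trait values below are hypothetical placeholders.

```r
# Hypothetical stand-ins for the inv_prefix and n_inv arguments
inv_prefix <- "inv"
n_inv <- 3

# Prefix followed by 1, 2, ..., n_inv
ids <- paste0(inv_prefix, seq_len(n_inv))
print(ids)
#> [1] "inv1" "inv2" "inv3"

# IDs used as row names and mirrored in a species column
# (placeholder trait values, for illustration only)
invaders <- data.frame(species = ids, height = c(10.2, 9.8, 11.1))
rownames(invaders) <- ids
```

Dropping the species column afterwards would mimic what keep_species_column = FALSE is documented to do.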

Trait selection. If trait_cols is NULL, all columns except species_col (when present) are considered traits. Otherwise, only the intersection of trait_cols and existing column names is used.
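
The selection rule can be mimicked with base R set operations. This is only a sketch of the documented behavior; the data frame and the non-existent column name "seed_mass" are made up for illustration.

```r
# Small illustrative table (placeholder values)
residents <- data.frame(
  species = c("sp1", "sp2"),
  height  = c(10.2, 9.8),
  SLA     = c(15.0, 15.2)
)
species_col <- "species"

# trait_cols = NULL: every column except the species ID column
default_traits <- setdiff(names(residents), species_col)
print(default_traits)
#> [1] "height" "SLA"

# trait_cols given: keep only names that exist in the data
requested   <- c("height", "seed_mass")  # "seed_mass" is not a column
used_traits <- intersect(requested, names(residents))
print(used_traits)
#> [1] "height"
```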

Sampling modes.

  • mode = "columnwise": Each trait is generated independently. Numeric traits can be drawn by bootstrap (empirical sampling), uniform within observed bounds, or from a normal distribution parameterized by the empirical mean and SD (optionally truncated to observed min–max if keep_bounds = TRUE). Factor and character traits are sampled from their observed values/levels.

  • mode = "rowwise": Entire rows are resampled with replacement from resident_traits, preserving the joint structure. In this mode, if species_col is provided, its values are overwritten with the new invader IDs. If species_col = NULL, only the trait columns are returned and the row names are replaced by the invader IDs.
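
The difference between the two modes, and between the numeric methods, can be sketched in base R as follows. This is a simplified illustration under stated assumptions, not the function's actual implementation: in particular, bounding the normal draws by clamping with pmin()/pmax() is an assumption made here for brevity, and the function may truncate differently (for example by redrawing).

```r
set.seed(1)
residents <- data.frame(
  height   = c(10.2, 9.8, 11.1, 10.5, 9.9),
  lifeform = factor(c("tree", "shrub", "shrub", "tree", "herb"))
)
n_inv <- 4

# Columnwise bootstrap: each column is resampled on its own,
# so height/lifeform pairings need not occur among residents
columnwise <- as.data.frame(lapply(residents, function(x) {
  x[sample.int(length(x), n_inv, replace = TRUE)]
}))

# Rowwise bootstrap: one index vector picks whole rows,
# so observed trait combinations are kept intact
idx <- sample.int(nrow(residents), n_inv, replace = TRUE)
rowwise <- residents[idx, , drop = FALSE]
rownames(rowwise) <- paste0("inv", seq_len(n_inv))

# Alternative columnwise draws for a numeric trait
x <- residents$height
normal_draws  <- rnorm(n_inv, mean = mean(x), sd = sd(x))
# Assumption: bounds enforced by clamping to [min, max]
normal_bounded <- pmin(pmax(normal_draws, min(x)), max(x))
uniform_draws  <- runif(n_inv, min = min(x), max = max(x))
```

Indexing with sample.int() rather than calling sample() on the values avoids the well-known surprise where sample() treats a single number as an upper limit.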

ID collisions. If any proposed invader IDs would collide with existing resident IDs, they are made unique using make.unique.
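
For reference, base R's make.unique() appends a numeric suffix to repeated strings. How the function combines resident and invader IDs before calling it is not stated here, so the arrangement below (residents first, proposed IDs after) is an assumption for illustration, with hypothetical IDs.

```r
# A resident that happens to be called "inv1" (hypothetical)
resident_ids <- c("inv1", "sp2")
proposed_ids <- paste0("inv", 1:3)

# Deduplicate across both sets, then keep the invader part
all_ids <- make.unique(c(resident_ids, proposed_ids))
new_ids <- tail(all_ids, length(proposed_ids))
print(new_ids)
#> [1] "inv1.1" "inv2"   "inv3"
```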

Examples

## ---------------------------
## Example 1: species IDs in a column
## ---------------------------
set.seed(1)
residents_col = data.frame(
  species = paste0("sp", 1:5),
  height  = c(10.2, 9.8, 11.1, 10.5, 9.9),
  SLA     = c(15.0, 15.2, 14.7, 15.5, 15.1),
  lifeform = factor(c("tree", "shrub", "shrub", "tree", "herb"))
)

# Columnwise bootstrap for numeric traits; factor sampled from levels.
inv1 = simulate_invaders(
  residents_col, n_inv = 4, species_col = "species",
  mode = "columnwise", numeric_method = "bootstrap",
  inv_prefix = "inv", keep_species_column = TRUE, seed = 42
)
head(inv1)
#>      species height  SLA lifeform
#> inv1    inv1   10.2 15.2     herb
#> inv2    inv2    9.9 15.5     tree
#> inv3    inv3   10.2 15.2     tree
#> inv4    inv4   10.2 15.2     herb

# Rowwise resampling preserves covariance (entire rows).
inv2 = simulate_invaders(
  residents_col, n_inv = 3, species_col = "species",
  mode = "rowwise", inv_prefix = "inv", seed = 123
)
head(inv2)
#>      species height  SLA lifeform
#> inv1    inv1   11.1 14.7    shrub
#> inv2    inv2   11.1 14.7    shrub
#> inv3    inv3    9.8 15.2    shrub

## ---------------------------
## Example 2: species IDs in row names (species_col = NULL)
## ---------------------------
set.seed(2)
residents_rn = data.frame(
  height  = c(10.2, 9.8, 11.1, 10.5, 9.9),
  SLA     = c(15.0, 15.2, 14.7, 15.5, 15.1),
  lifeform = factor(c("tree", "shrub", "shrub", "tree", "herb"))
)
rownames(residents_rn) = paste0("sp", 1:5)

# Columnwise with normal draws truncated to observed min–max
inv3 = simulate_invaders(
  residents_rn, n_inv = 5, species_col = NULL,
  mode = "columnwise", numeric_method = "normal",
  keep_bounds = TRUE, inv_prefix = "x", seed = 99
)
head(inv3)  # row names are the invader IDs; no species column present
#>      height      SLA lifeform
#> x1 10.41220 15.13577    shrub
#> x2 10.55153 14.84815    shrub
#> x3 10.34606 15.24275     herb
#> x4 10.53276 14.99384    shrub
#> x5 10.10973 14.72267    shrub

# Rowwise with species_col = NULL: entire rows resampled; row names replaced
inv4 = simulate_invaders(
  residents_rn, n_inv = 3, species_col = NULL,
  mode = "rowwise", inv_prefix = "new", seed = 77
)
head(inv4)
#>      height  SLA lifeform
#> new1    9.8 15.2    shrub
#> new2    9.9 15.1     herb
#> new3    9.9 15.1     herb