‘SongEvo’ package

Introduction

SongEvo simulates the cultural evolution of quantitative traits of bird song. SongEvo is an individual- (agent-) based model. SongEvo is spatially-explicit and can be parameterized with, and tested against, measured song data. Functions are available for model implementation, sensitivity analyses, parameter optimization, model validation, and hypothesis testing.

Overview of Functions

  1. SongEvo implements the model
  2. par.sens allows sensitivity analyses
  3. par.opt allows parameter optimization
  4. mod.val allows model validation
  5. h.test allows hypothesis testing

Getting Started

Load and attach SongEvo package

library(SongEvo)

Functions

SongEvo implements the model par.sens allows sensitivity analyses par.opt allows parameter optimization mod.val allows model validation h.test allows hypothesis testing

Examples

EXAMPLE 1

Load the example data: song.data and global parameters

To explore the SongEvo package, we will use a database of songs from Nuttall’s white-crowned sparrow (Zonotrichia leucophrys nuttalli) recorded at three locations in 1969 and 2005.

data("song.data")

Examine global parameters. Global parameters describe our understanding of the system and may be measured or hypothesized. They are called “global” because they are used by many many functions and subroutines within functions. For descriptions of all adjustable parameters, see ?song.data.

data("glo.parms")
glo.parms$mortality.a.m <- glo.parms$mortality.a.f <- glo.parms$mortality.a
glo.parms$mortality.j.m <- glo.parms$mortality.j.f <- glo.parms$mortality.j
glo.parms$male.fledge.n.mean <- glo.parms$male.fledge.n.mean*2
glo.parms$male.fledge.n.sd <- glo.parms$male.fledge.n.sd*2
glo.parms <- glo.parms[!names(glo.parms) %in% c("mortality.a","mortality.j")]
str(glo.parms)
#> List of 17
#>  $ learning.error.d  : num 0
#>  $ learning.error.sd : num 430
#>  $ n.territories     : num 40
#>  $ lifespan          : num 2.08
#>  $ phys.lim.min      : num 1559
#>  $ phys.lim.max      : num 4364
#>  $ male.fledge.n.mean: num 2.7
#>  $ male.fledge.n.sd  : num 1
#>  $ disp.age          : num 2
#>  $ disp.distance.mean: num 110
#>  $ disp.distance.sd  : num 100
#>  $ terr.turnover     : num 0.5
#>  $ male.fledge.n     : num [1:40] 1 1 2 1 0 2 2 2 2 1 ...
#>  $ mortality.a.f     : num 0.468
#>  $ mortality.a.m     : num 0.468
#>  $ mortality.j.f     : num 0.5
#>  $ mortality.j.m     : num 0.5

Share global parameters with the global environment. We make these parameters available in the global environment so that we can access them with minimal code.

list2env(glo.parms, globalenv())
#> <environment: R_GlobalEnv>

Examine song data

Data include the population name (Bear Valley, PRBO, or Schooner), year of song recording (1969 or 2005), and the frequency bandwidth of the trill.

str(song.data)
#> 'data.frame':    89 obs. of  3 variables:
#>  $ Population: Factor w/ 3 levels "Bear Valley",..: 3 3 3 3 3 3 3 3 3 3 ...
#>  $ Year      : int  1969 1969 1969 1969 1969 1969 1969 1969 1969 1969 ...
#>  $ Trill.FBW : num  3261 2494 2806 2878 2758 ...

Simulate bird song evolution with SongEvo()

Define initial individuals

In this example, we use songs from individual birds recorded in one population (PRBO) in the year 1969, which we will call starting.trait.

starting.trait <- subset(song.data, Population=="PRBO" & Year==1969)$Trill.FBW

We want a starting population of 40 individuals, so we generate additional trait values to complement those from the existing 30 individuals. Then we create a data frame that includes a row for each individual; we add identification numbers, ages, and geographical coordinates for each individual.

starting.trait2 <- c(starting.trait, rnorm(n.territories-length(starting.trait), 
    mean=mean(starting.trait), sd=sd(starting.trait)))
init.inds <- data.frame(id = seq(1:n.territories), age = 2, trait = starting.trait2)
init.inds$x1 <-  round(runif(n.territories, min=-122.481858, max=-122.447270), digits=8)
init.inds$y1 <-  round(runif(n.territories, min=37.787768, max=37.805645), digits=8)

Specify and call the SongEvo model

SongEvo() includes several settings, which we specify before running the model. For this example, we run the model for 10 iterations, over 36 years (i.e. 1969–2005). When conducting research with SongEvo(), users will want to increase the number iterations (e.g. to 100 or 1000). Each timestep is one year in this model (i.e. individuals complete all components of the model in 1 year). We specify territory turnover rate here as an example of how to adjust parameter values. We could adjust any other parameter value here also. The learning method specifies that individuals integrate songs heard from adults within the specified integration distance (intigrate.dist, in kilometers). In this example, we do not includ a lifespan, so we assign it NA. In this example, we do not model competition for mates, so specify it as FALSE. Last, specify all as TRUE in order to save data for every single simulated individual because we will use those data later for mapping. If we do not need data for each individual, we set all to FALSE because the all.inds data.frame becomes very large!

iteration <- 10
years <- 36
timestep <- 1
terr.turnover <- 0.5
integrate.dist <- 0.1
lifespan <- NA
mate.comp <- FALSE
prin <- FALSE
all <- TRUE

Now we call SongEvo with our specifications and save it in an object called SongEvo1.

SongEvo1 <- SongEvo(init.inds = init.inds, females = 1.0, iteration = iteration, steps = years,  
    timestep = timestep, n.territories = n.territories, terr.turnover = terr.turnover, 
    integrate.dist = integrate.dist, 
    learning.error.d = learning.error.d, learning.error.sd = learning.error.sd, 
    mortality.a.m = mortality.a.m, mortality.a.f = mortality.a.f,
    mortality.j.m = mortality.j.m, mortality.j.f = mortality.j.f, lifespan = lifespan, 
    phys.lim.min = phys.lim.min, phys.lim.max = phys.lim.max, 
    male.fledge.n.mean = male.fledge.n.mean, male.fledge.n.sd = male.fledge.n.sd, male.fledge.n = male.fledge.n,
    disp.age = disp.age, disp.distance.mean = disp.distance.mean, disp.distance.sd = disp.distance.sd, 
    mate.comp = mate.comp, prin = prin, all = TRUE)

Examine results from SongEvo model

The model required the following time to run on your computer:

SongEvo1$time
#>    user  system elapsed 
#>  21.503   0.049  21.554

Three main objects hold data regarding the SongEvo model. Additional objects are used temporarily within modules of the model.

First, currently alive individuals are stored in a data frame called “inds.” Values within “inds” are updated throughout each of the iterations of the model, and “inds” can be viewed after the model is completed.

head(SongEvo1$inds, min(5,nrow(SongEvo1$inds)))
#>                 coordinates   id age    trait        x1       y1
#> M1528 (-122.4476, 37.79043) 1528   9 2789.953 -122.4476 37.79043
#> M1729 (-122.4846, 37.78023) 1729   6 3470.873 -122.4846 37.78023
#> M1763 (-122.4493, 37.79083) 1763   5 2763.244 -122.4493 37.79083
#> M1765 (-122.4867, 37.78252) 1765   5 3100.426 -122.4867 37.78252
#> M1773 (-122.4827, 37.78406) 1773   5 3252.713 -122.4827 37.78406
#>       male.fledglings female.fledglings territory father sex fitness learn.dir
#> M1528               2                 1         1   1359   M       1         0
#> M1729               0                 0         0   1655   M       1         0
#> M1763               0                 0         0   1647   M       1         0
#> M1765               1                 1         1   1655   M       1         0
#> M1773               0                 0         1   1708   M       1         0
#>              x0       y0
#> M1528 -122.4459 37.79239
#> M1729 -122.4865 37.78322
#> M1763 -122.4497 37.78890
#> M1765 -122.4865 37.78322
#> M1773 -122.4819 37.78442

Second, an array (i.e. a multi-dimensional table) entitled “summary.results” includes population summary values for each time step (dimension 1) in each iteration (dimension 2) of the model. Population summary values are contained in five additional dimensions: population size for each time step of each iteration (“sample.n”), the population mean and variance of the song feature studied (“trait.pop.mean” and “trait.pop.variance”), with associated lower (“lci”) and upper (“uci”) confidence intervals.

dimnames(SongEvo1$summary.results)
#> $iteration
#>  [1] "iteration 1"  "iteration 2"  "iteration 3"  "iteration 4"  "iteration 5" 
#>  [6] "iteration 6"  "iteration 7"  "iteration 8"  "iteration 9"  "iteration 10"
#> 
#> $step
#>  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
#> [16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"
#> [31] "31" "32" "33" "34" "35" "36"
#> 
#> $feature
#> [1] "sample.n"           "trait.pop.mean"     "trait.pop.variance"
#> [4] "lci"                "uci"

Third, individual values may optionally be concatenated and saved to one data frame entitled “all.inds.” all.inds can become quite large, and is therefore only recommended if additional data analyses are desired.

head(SongEvo1$all.inds,  min(5,nrow(SongEvo1$all.inds)))
#>                   coordinates id age  trait        x1       y1 male.fledglings
#> I1.T1.1 (-122.4602, 37.78831)  1   2 4004.8 -122.4602 37.78831               1
#> I1.T1.2 (-122.4747, 37.78894)  2   2 3765.0 -122.4747 37.78894               0
#> I1.T1.3 (-122.4643, 37.80173)  3   2 3237.4 -122.4643 37.80173               1
#> I1.T1.4  (-122.455, 37.78949)  4   2 3621.1 -122.4550 37.78949               0
#> I1.T1.5 (-122.4522, 37.80261)  5   2 3285.4 -122.4522 37.80261               0
#>         female.fledglings territory father sex fitness learn.dir x0 y0 timestep
#> I1.T1.1                 0         1      0   M       1         0  0  0        1
#> I1.T1.2                 1         1      0   M       1         0  0  0        1
#> I1.T1.3                 1         1      0   M       1         0  0  0        1
#> I1.T1.4                 1         1      0   M       1         0  0  0        1
#> I1.T1.5                 0         1      0   M       1         0  0  0        1
#>         iteration
#> I1.T1.1         1
#> I1.T1.2         1
#> I1.T1.3         1
#> I1.T1.4         1
#> I1.T1.5         1

Simulated population size

We see that the simulated population size remains relatively stable over the course of 36 years. This code uses the summary.results array.

plot(SongEvo1$summary.results[1, , "sample.n"], xlab="Year", ylab="Abundance", type="n", 
    xaxt="n", ylim=c(0, max(SongEvo1$summary.results[, , "sample.n"], na.rm=TRUE)))
axis(side=1, at=seq(0, 40, by=5), labels=seq(1970, 2010, by=5))
    for(p in 1:iteration){
        lines(SongEvo1$summary.results[p, , "sample.n"], col="light gray")
        }
n.mean <- apply(SongEvo1$summary.results[, , "sample.n"], 2, mean, na.rm=TRUE)
lines(n.mean, col="red")

#Plot 95% quantiles
quant.means <- apply (SongEvo1$summary.results[, , "sample.n"], MARGIN=2, quantile, 
    probs=c(0.975, 0.025), R=600, na.rm=TRUE)
lines(quant.means[1,], col="red", lty=2)
lines(quant.means[2,], col="red", lty=2)

Load Hmisc package for plotting functions.

library("Hmisc")

Simulated trait values

We see that the mean trait values per iteration varied widely, though mean trait values over all iterations remained relatively stable. This code uses the summary.results array.

plot(SongEvo1$summary.results[1, , "trait.pop.mean"], xlab="Year", ylab="Bandwidth (Hz)", 
    xaxt="n", type="n", xlim=c(-0.5, 36), 
    ylim=c(min(SongEvo1$summary.results[, , "trait.pop.mean"], na.rm=TRUE), 
    max(SongEvo1$summary.results[, , "trait.pop.mean"], na.rm=TRUE)))
    for(p in 1:iteration){
        lines(SongEvo1$summary.results[p, , "trait.pop.mean"], col="light gray")
        }
freq.mean <- apply(SongEvo1$summary.results[, , "trait.pop.mean"], 2, mean, na.rm=TRUE)
lines(freq.mean, col="blue")
axis(side=1, at=seq(0, 35, by=5), labels=seq(1970, 2005, by=5))#, tcl=-0.25, mgp=c(2,0.5,0))

#Plot 95% quantiles
quant.means <- apply (SongEvo1$summary.results[, , "trait.pop.mean"], MARGIN=2, quantile, 
    probs=c(0.95, 0.05), R=600, na.rm=TRUE)
lines(quant.means[1,], col="blue", lty=2)
lines(quant.means[2,], col="blue", lty=2)

#plot mean and CI for historic songs.  
 #plot original song values
library("boot")
sample.mean <- function(d, x) {
    mean(d[x])
}
boot_hist <- boot(starting.trait, statistic=sample.mean, R=100)#, strata=mn.res$iteration)  
ci.hist <- boot.ci(boot_hist, conf=0.95, type="basic")
low <- ci.hist$basic[4]
high <- ci.hist$basic[5]
points(0, mean(starting.trait), pch=20, cex=0.6, col="black")
errbar(x=0, y=mean(starting.trait), high, low, add=TRUE)
 #text and arrows
text(x=5, y=2720, labels="Historical songs", pos=1)
arrows(x0=5, y0=2750, x1=0.4, y1=mean(starting.trait), length=0.1)

Trait variance

We see that variance for each iteration per year increased in the first few years and then stabilized. This code uses the summary.results array.

 #plot variance for each iteration per year
plot(SongEvo1$summary.results[1, , "trait.pop.variance"], xlab="Year", 
    ylab="Bandwidth Variance (Hz)", type="n", xaxt="n", 
    ylim=c(min(SongEvo1$summary.results[, , "trait.pop.variance"], na.rm=TRUE), 
    max(SongEvo1$summary.results[, , "trait.pop.variance"], na.rm=TRUE)))
axis(side=1, at=seq(0, 40, by=5), labels=seq(1970, 2010, by=5))
    for(p in 1:iteration){
        lines(SongEvo1$summary.results[p, , "trait.pop.variance"], col="light gray")
        }
n.mean <- apply(SongEvo1$summary.results[, , "trait.pop.variance"], 2, mean, na.rm=TRUE)
lines(n.mean, col="green")

#Plot 95% quantiles
quant.means <- apply (SongEvo1$summary.results[, , "trait.pop.variance"], MARGIN=2, quantile, 
    probs=c(0.975, 0.025), R=600, na.rm=TRUE)
lines(quant.means[1,], col="green", lty=2)
lines(quant.means[2,], col="green", lty=2)

Maps

The simulation results include geographical coordinates and are in a standard spatial data format, thus allowing calculation of a wide variety of spatial statistics.

Load packages for making maps.

library("sp")
library("reshape2")
library("lattice")

Convert data frame from long to wide format. This is necessary for making a multi-panel plot.

all.inds1 <- subset(SongEvo1$all.inds, SongEvo1$all.inds$iteration==1)
w <- dcast(as.data.frame(all.inds1), id ~ timestep, value.var="trait", fill=0)
all.inds1w <- merge(all.inds1, w, by="id")
years.SongEvo1 <- (dim(w)[2]-1 )
names(all.inds1w@data)[-(1:length(all.inds1@data))] <-paste("Ts", 1:(dim(w)[2]-1 ), sep="")

Create a function to generate a continuous color palette–we will use the palette in the next call to make color ramp to represent the trait value.

rbPal <- colorRampPalette(c('blue','red')) #Create a function to generate a continuous color palette

Plot maps, including a separate panel for each timestep (each of 36 years). Our example shows that individuals move across the landscape and that regional dialects evolve and move. The x-axis is longitude, the y-axis is latitude, and the color ramp indicates trill bandwidth in Hz.

spplot(all.inds1w[,-c(1:ncol(all.inds1))], as.table=TRUE, 
    cuts=c(0, seq(from=1500, to=4500, by=10)), ylab="", 
    col.regions=c("transparent", rbPal(1000)), 
    #cuts specifies that the first level (e.g. <1500) is transparent.
colorkey=list(
    right=list(
          fun=draw.colorkey,
          args=list( 
                key=list(
                at=seq(1500, 4500, 10),
                col=rbPal(1000),
                labels=list(
                at=c(1500, 2000, 2500, 3000, 3500, 4000, 4500),
                labels=c("1500", "2000", "2500", "3000", "3500", "4000", "4500")
                )
                )
                )
            )
    )
)

In addition, you can plot simpler multi-panel maps that do not take advantage of the spatial data class.

 #Lattice plot (not as a spatial frame)
it1 <- subset(SongEvo1$all.inds, iteration==1)
rbPal <- colorRampPalette(c('blue','red')) #Create a function to generate a continuous color palette
it1$Col <- rbPal(10)[as.numeric(cut(it1$trait, breaks = 10))]
xyplot(it1$y1~it1$x1 | it1$timestep, groups=it1$trait, asp="iso", col=it1$Col, 
    xlab="Longitude", ylab="Latitude")

Test model sensitivity with par.sens()

This function allows testing the sensitivity of SongEvo to different parameter values.

Specify and call par.sens()

Here we test the sensitivity of the Acquire a Territory submodel to variation in territory turnover rates, ranging from 0.8–1.2 times the published rate (40–60% of territories turned over). The call for the par.sens function has a format similar to SongEvo. The user specifies the parameter to test and the range of values for that parameter. The function currently allows examination of only one parameter at a time and requires at least two iterations.

parm <- "terr.turnover"
par.range = seq(from=0.4, to=0.6, by=0.025)
sens.results <- NULL

Now we call the par.sens function with our specifications.

extra_parms <- list(init.inds = init.inds, 
                    females = 1,  # New in SongEvo v2
                    timestep = 1, 
                    n.territories = nrow(init.inds),
                    integrate.dist = 0.1,
                    lifespan = NA, 
                    terr.turnover = 0.5, 
                    mate.comp = FALSE, 
                    prin = FALSE,
                    all = TRUE,
                    # New in SongEvo v2
                    selectivity = 3,
                    content.bias = FALSE,
                    n.content.bias.loc = "all",
                    content.bias.loc = FALSE,
                    content.bias.loc.ranges = FALSE,
                    affected.traits = FALSE,
                    conformity.bias = FALSE,
                    prestige.bias=FALSE,
                    learn.m="default",
                    learn.f="default",
                    learning.error.d=0,
                    learning.error.sd=200)
global_parms_key <- which(!names(glo.parms) %in% names(extra_parms))
extra_parms[names(glo.parms[global_parms_key])]=glo.parms[global_parms_key]
par.sens1 <- par.sens(parm = parm, par.range = par.range, 
                      iteration = iteration, steps = years, mate.comp = FALSE, 
                      fixed_parms=extra_parms[names(extra_parms)!=parm], all = TRUE)
#> [1] "terr.turnover =  0.4"
#> [1] "terr.turnover =  0.425"
#> [1] "terr.turnover =  0.45"
#> [1] "terr.turnover =  0.475"
#> [1] "terr.turnover =  0.5"
#> [1] "terr.turnover =  0.525"
#> [1] "terr.turnover =  0.55"
#> [1] "terr.turnover =  0.575"
#> [1] "terr.turnover =  0.6"

Examine par.sens results

Examine results objects, which include two arrays:

The first array, sens.results, contains the SongEvo model results for each parameter. It has the following dimensions:

dimnames(par.sens1$sens.results)
#> [[1]]
#>  [1] "iteration 1"  "iteration 2"  "iteration 3"  "iteration 4"  "iteration 5" 
#>  [6] "iteration 6"  "iteration 7"  "iteration 8"  "iteration 9"  "iteration 10"
#> 
#> [[2]]
#>  [1] "1"  "2"  "3"  "4"  "5"  "6"  "7"  "8"  "9"  "10" "11" "12" "13" "14" "15"
#> [16] "16" "17" "18" "19" "20" "21" "22" "23" "24" "25" "26" "27" "28" "29" "30"
#> [31] "31" "32" "33" "34" "35" "36"
#> 
#> [[3]]
#> [1] "sample.n"           "trait.pop.mean"     "trait.pop.variance"
#> [4] "lci"                "uci"               
#> 
#> [[4]]
#> [1] "par.val 0.4"   "par.val 0.425" "par.val 0.45"  "par.val 0.475"
#> [5] "par.val 0.5"   "par.val 0.525" "par.val 0.55"  "par.val 0.575"
#> [9] "par.val 0.6"

The second array, sens.results.diff contains the quantile range of trait values across iterations within a parameter value. It has the following dimensions:

dimnames(par.sens1$sens.results.diff)
#> [[1]]
#> [1] "par.val 0.4"   "par.val 0.425" "par.val 0.45"  "par.val 0.475"
#> [5] "par.val 0.5"   "par.val 0.525" "par.val 0.55"  "par.val 0.575"
#> [9] "par.val 0.6"  
#> 
#> [[2]]
#>  [1] "Quantile diff 1"  "Quantile diff 2"  "Quantile diff 3"  "Quantile diff 4" 
#>  [5] "Quantile diff 5"  "Quantile diff 6"  "Quantile diff 7"  "Quantile diff 8" 
#>  [9] "Quantile diff 9"  "Quantile diff 10" "Quantile diff 11" "Quantile diff 12"
#> [13] "Quantile diff 13" "Quantile diff 14" "Quantile diff 15" "Quantile diff 16"
#> [17] "Quantile diff 17" "Quantile diff 18" "Quantile diff 19" "Quantile diff 20"
#> [21] "Quantile diff 21" "Quantile diff 22" "Quantile diff 23" "Quantile diff 24"
#> [25] "Quantile diff 25" "Quantile diff 26" "Quantile diff 27" "Quantile diff 28"
#> [29] "Quantile diff 29" "Quantile diff 30" "Quantile diff 31" "Quantile diff 32"
#> [33] "Quantile diff 33" "Quantile diff 34" "Quantile diff 35" "Quantile diff 36"

To assess sensitivity of SongEvo to a range of parameter values, plot the range in trait quantiles per year by the parameter value. We see that territory turnover values of 0.4–0.6 provided means and quantile ranges of trill bandwidths that are similar to those obtained with the published estimate of 0.5, indicating that the Acquire a Territory submodel is robust to realistic variation in those parameter values.

In the figure, solid gray and black lines show the quantile range of song frequency per year over all iterations as parameterized with the published territory turnover rate (0.5; thick black line) and a range of values from 0.4 to 0.6 (in steps of 0.05, light to dark gray). Orange lines show the mean and 2.5th and 97.5th quantiles of all quantile ranges.

 #plot of range in trait quantiles by year by parameter value
plot(1:years, par.sens1$sens.results.diff[1,], ylim=c(min(par.sens1$sens.results.diff, 
    na.rm=TRUE), max(par.sens1$sens.results.diff, na.rm=TRUE)), type="l", 
    ylab="Quantile range (Hz)", xlab="Year", col="transparent", xaxt="n")
axis(side=1, at=seq(0, 35, by=5), labels=seq(1970, 2005, by=5))

  #Make a continuous color ramp from gray to black
grbkPal <- colorRampPalette(c('gray','black'))
  
  #Plot a line for each parameter value
for(i in 1:length(par.range)){
lines(1:years, par.sens1$sens.results.diff[i,], type="l", 
    col=grbkPal(length(par.range))[i])
}

  #Plot values from published parameter values
lines(1:years, par.sens1$sens.results.diff[2,], type="l", col="black", lwd=4)

  #Calculate and plot mean and quantiles
quant.mean <- apply(par.sens1$sens.results.diff, 2, mean, na.rm=TRUE)
lines(quant.mean, col="orange")

#Plot 95% quantiles (which are similar to credible intervals)
  #95% quantiles of population means (narrower)
quant.means <- apply (par.sens1$sens.results.diff, MARGIN=2, quantile, 
    probs=c(0.975, 0.025), R=600, na.rm=TRUE)
lines(quant.means[1,], col="orange", lty=2)
lines(quant.means[2,], col="orange", lty=2)

Optimize parameter values with par.opt()

This function follows par.sens to help users optimize values for imperfectly known parameters for SongEvo. The goals are to maximize accuracy and precision of model prediction. Accuracy is quantified by three different approaches: i) the mean of absolute residuals of the predicted population mean values in relation to target data (e.g. observed or hypothetical values (smaller absolute residuals indicate a more accurate model)), ii) the difference between the bootstrapped mean of predicted population means and the mean of the target data, and iii) the proportion of simulated population trait means that fall within (i.e. are “contained by”) the confidence intervals of the target data (a higher proportion indicates greater accuracy). Precision is measured with the residuals of the predicted population variance to the variance of target data (smaller residuals indicate a more precise model).

Prepare current song values

target.data <- subset(song.data, Population=="PRBO" & Year==2005)$Trill.FBW

Specify and call par.opt()

Users specify the timestep (“ts”) at which to compare simulated trait values to target trait data (“target.data”) and save the results in an object (called par.opt1 here).

ts <- years
par.opt1 <- par.opt(sens.results=par.sens1$sens.results, ts=ts, 
    target.data=target.data, par.range=par.range)

Examine results objects (residuals and target match).

par.opt1$Residuals
#> , , Residuals of mean
#> 
#>               Iteration 1 Iteration 2 Iteration 3 Iteration 4 Iteration 5
#> par.val 0.4    423.832929    9.072603   220.01363    157.5111   327.92873
#> par.val 0.425   17.414232  179.799740   198.27723    324.6101   309.26890
#> par.val 0.45   360.922983  208.739367   107.36239    467.0985    23.33257
#> par.val 0.475    3.002444  338.446644   329.96314    311.6260   315.47123
#> par.val 0.5    341.488030  201.403363   537.93143    150.5028   447.10296
#> par.val 0.525   48.990549    4.716653    84.59006    248.0517    34.04253
#> par.val 0.55   200.512859  405.570797   407.15971    132.2251   220.14958
#> par.val 0.575   67.261911   42.186316   162.43372    206.6027   239.75274
#> par.val 0.6    150.811345  381.295382   171.25321    208.1720    38.38828
#>               Iteration 6 Iteration 7 Iteration 8 Iteration 9 Iteration 10
#> par.val 0.4   188.0418805  259.409848   101.07989   306.98292    268.24544
#> par.val 0.425  76.1139092  243.176975   262.11944    16.48896     76.02736
#> par.val 0.45  383.0761086  375.910563   252.16602   503.62474    323.32619
#> par.val 0.475 153.2608278   28.727254   141.17831   309.86060    221.06166
#> par.val 0.5    99.4218135    3.951095   272.47274   320.36094     17.62708
#> par.val 0.525 312.9364931  250.340011    24.43407   327.93556    342.19469
#> par.val 0.55    0.5390983  263.765862   382.95820   211.45151    506.97215
#> par.val 0.575 318.7519610  163.869580   153.41187   284.50814    226.04219
#> par.val 0.6   123.6396871  102.360795   408.96077   175.26647    122.70636
#> 
#> , , Residuals of variance
#> 
#>               Iteration 1 Iteration 2 Iteration 3 Iteration 4 Iteration 5
#> par.val 0.4      2515.863     174.677     308.593   7942.1285   12035.771
#> par.val 0.425   10017.516    6831.015   15853.671   7922.4563   13192.871
#> par.val 0.45     1593.826    3603.625   10946.393   9493.4430    5722.825
#> par.val 0.475   29326.766   14049.239    6683.842   6075.8750    7398.454
#> par.val 0.5      2252.965    5008.545    8220.947   3904.8427   15041.659
#> par.val 0.525   16246.213    4985.730    2583.530   1494.4725    1153.909
#> par.val 0.55    12505.949    3790.839    3751.787   3554.9732    8349.303
#> par.val 0.575   10729.690    5888.646    4653.696  18446.0970    2108.804
#> par.val 0.6      2678.177   10986.184   12812.079    381.5835    6387.349
#>               Iteration 6 Iteration 7 Iteration 8 Iteration 9 Iteration 10
#> par.val 0.4      5660.699    745.1856   2455.0148   10604.891    6587.2905
#> par.val 0.425    8724.015   6092.3161   8211.8084   10462.553    8426.1971
#> par.val 0.45    15458.684  11808.1302   3453.4660   16422.832    1918.9196
#> par.val 0.475   11394.772  11449.7729   2677.1872    5761.245    7357.4517
#> par.val 0.5      3504.654  42385.3601  21120.2672   17558.363   21909.3151
#> par.val 0.525    8086.326   1175.7755    633.1128    6163.405    9007.2104
#> par.val 0.55    22058.056   8688.4754   5270.3232   14535.844    7420.0998
#> par.val 0.575    6963.664  10890.7865   1026.5676    1593.142     965.6395
#> par.val 0.6      1461.935   4864.1629  25620.6497    2764.837    6627.8493
par.opt1$Target.match
#>               Difference in means Proportion contained
#> par.val 0.4              174.3299                  0.1
#> par.val 0.425            166.8468                  0.2
#> par.val 0.45             300.5559                  0.1
#> par.val 0.475            215.2598                  0.2
#> par.val 0.5              239.2262                  0.2
#> par.val 0.525            162.9364                  0.4
#> par.val 0.55             273.1305                  0.1
#> par.val 0.575            173.0297                  0.2
#> par.val 0.6              167.8133                  0.1

Plot results of par.opt()

Accuracy of par.opt()

  1. Difference in means.
plot(par.range, par.opt1$Target.match[,1], type="l", xlab="Parameter range", 
    ylab="Difference in means (Hz)")

  1. Plot proportion contained.
plot(par.range, par.opt1$Prop.contained, type="l", xlab="Parameter range", 
    ylab="Proportion contained")

  1. Calculate and plot mean and quantiles of residuals of mean trait values.
res.mean.means <- apply(par.opt1$Residuals[, , 1], MARGIN=1, mean, na.rm=TRUE)
res.mean.quants <- apply (par.opt1$Residuals[, , 1], MARGIN=1, quantile, 
    probs=c(0.975, 0.025), R=600, na.rm=TRUE)
plot(par.range, res.mean.means, col="orange", ylim=c(min(par.opt1$Residuals[,,1], 
    na.rm=TRUE), max(par.opt1$Residuals[,,1], na.rm=TRUE)), type="b", 
    xlab="Parameter value (territory turnover rate)", 
    ylab="Residual of trait mean (trill bandwidth, Hz)")
points(par.range, res.mean.quants[1,], col="orange")
points(par.range, res.mean.quants[2,], col="orange")
lines(par.range, res.mean.quants[1,], col="orange", lty=2)
lines(par.range, res.mean.quants[2,], col="orange", lty=2)

Precision of par.opt()

#Calculate and plot mean and quantiles of residuals of variance of trait values
res.var.mean <- apply(par.opt1$Residuals[, , 2], MARGIN=1, mean, na.rm=TRUE)
res.var.quants <- apply (par.opt1$Residuals[, , 2], MARGIN=1, quantile, 
    probs=c(0.975, 0.025), R=600, na.rm=TRUE)
plot(par.range, res.var.mean, col="purple", 
    ylim=c(min(par.opt1$Residuals[,,2], na.rm=TRUE), 
    max(par.opt1$Residuals[,,2], na.rm=TRUE)), type="b", 
    xlab="Parameter value (territory turnover rate)", 
    ylab="Residual of trait variance (trill bandwidth, Hz)")
points(par.range, res.var.quants[1,], col="purple")
points(par.range, res.var.quants[2,], col="purple")
lines(par.range, res.var.quants[1,], col="purple", lty=2)
lines(par.range, res.var.quants[2,], col="purple", lty=2)

Visual inspection of accuracy and precision of par.opt(): plot trait values for range of parameters

par(mfcol=c(3,2),
    mar=c(2.1, 2.1, 0.1, 0.1),
    cex=0.8)
for(i in 1:length(par.range)){
plot(par.sens1$sens.results[ , , "trait.pop.mean", ], xlab="Year", ylab="Bandwidth (Hz)", 
    xaxt="n", type="n", xlim=c(-0.5, years), 
    ylim=c(min(par.sens1$sens.results[ , , "trait.pop.mean", ], na.rm=TRUE), 
    max(par.sens1$sens.results[ , , "trait.pop.mean", ], na.rm=TRUE)))
    for(p in 1:iteration){
        lines(par.sens1$sens.results[p, , "trait.pop.mean", i], col="light gray")
        }
freq.mean <- apply(par.sens1$sens.results[, , "trait.pop.mean", i], 2, mean, na.rm=TRUE)
lines(freq.mean, col="blue")
axis(side=1, at=seq(0, 35, by=5), labels=seq(1970, 2005, by=5))

#Plot 95% quantiles
quant.means <- apply (par.sens1$sens.results[, , "trait.pop.mean", i], MARGIN=2, quantile, 
    probs=c(0.95, 0.05), R=600, na.rm=TRUE)
lines(quant.means[1,], col="blue", lty=2)
lines(quant.means[2,], col="blue", lty=2)

#plot mean and CI for historic songs.  
 #plot original song values
library("boot")
sample.mean <- function(d, x) {
    mean(d[x])
}
boot_hist <- boot(starting.trait, statistic=sample.mean, R=100)#, strata=mn.res$iteration)  
ci.hist <- boot.ci(boot_hist, conf=0.95, type="basic")
low <- ci.hist$basic[4]
high <- ci.hist$basic[5]
points(0, mean(starting.trait), pch=20, cex=0.6, col="black")
library("Hmisc")
errbar(x=0, y=mean(starting.trait), high, low, add=TRUE)
 
 #plot current song values
library("boot")
sample.mean <- function(d, x) {
    mean(d[x])
}
boot_curr <- boot(target.data, statistic=sample.mean, R=100)#, strata=mn.res$iteration) 
ci.curr <- boot.ci(boot_curr, conf=0.95, type="basic")
low <- ci.curr$basic[4]
high <- ci.curr$basic[5]
points(years, mean(target.data), pch=20, cex=0.6, col="black")
library("Hmisc")
errbar(x=years, y=mean(target.data), high, low, add=TRUE)

  #plot panel title
text(x=3, y=max(par.sens1$sens.results[ , , "trait.pop.mean", ], na.rm=TRUE)-100, 
    labels=paste("Par = ", par.range[i], sep=""))  
}

Model validation with mod.val()

This function allows users to assess the validity of the specified model by testing model performance with a population different from the population used to build the model. The user first runs SongEvo with initial trait values from the validation population. mod.val() uses the summary.results array from SongEvo, along with target values from a specified timestep, to calculate the same three measures of accuracy and one measure of precision that are calculated in par.opt.

We parameterized SongEvo with initial song data from Schooner Bay, CA in 1969, and then compared simulated data to target (i.e. observed) data in 2005.

Prepare initial song data for Schooner Bay.

starting.trait <- subset(song.data, Population=="Schooner" & Year==1969)$Trill.FBW
starting.trait2 <- c(starting.trait, rnorm(n.territories-length(starting.trait), 
    mean=mean(starting.trait), sd=sd(starting.trait)))

init.inds <- data.frame(id = seq(1:n.territories), age = 2, trait = starting.trait2)
init.inds$x1 <-  round(runif(n.territories, min=-122.481858, max=-122.447270), digits=8)
init.inds$y1 <-  round(runif(n.territories, min=37.787768, max=37.805645), digits=8)

Specify and call SongEvo() with validation data

iteration <- 10
years <- 36
timestep <- 1
terr.turnover <- 0.5

SongEvo2 <- SongEvo(init.inds = init.inds, females = 1.0, iteration = iteration, steps = years,  
    timestep = timestep, n.territories = n.territories, terr.turnover = terr.turnover, 
    integrate.dist = integrate.dist, 
    learning.error.d = learning.error.d, learning.error.sd = learning.error.sd, 
    mortality.a.m = mortality.a.m, mortality.a.f = mortality.a.f,
    mortality.j.m = mortality.j.m, mortality.j.f = mortality.j.f, lifespan = lifespan, 
    phys.lim.min = phys.lim.min, phys.lim.max = phys.lim.max, 
    male.fledge.n.mean = male.fledge.n.mean, male.fledge.n.sd = male.fledge.n.sd, male.fledge.n = male.fledge.n,
    disp.age = disp.age, disp.distance.mean = disp.distance.mean, disp.distance.sd = disp.distance.sd, 
    mate.comp = mate.comp, prin = prin, all = TRUE)

Specify and call mod.val()

ts <- 36
target.data <- subset(song.data, Population=="Schooner" & Year==2005)$Trill.FBW
mod.val1 <- mod.val(summary.results=SongEvo2$summary.results, ts=ts, 
    target.data=target.data)

Plot results from mod.val()

plot(SongEvo2$summary.results[1, , "trait.pop.mean"], xlab="Year", ylab="Bandwidth (Hz)", 
    xaxt="n", type="n", xlim=c(-0.5, 36.5), 
    ylim=c(min(SongEvo2$summary.results[, , "trait.pop.mean"], na.rm=TRUE), 
    max(SongEvo2$summary.results[, , "trait.pop.mean"], na.rm=TRUE)))
    for(p in 1:iteration){
        lines(SongEvo2$summary.results[p, , "trait.pop.mean"], col="light gray")
        }
freq.mean <- apply(SongEvo2$summary.results[, , "trait.pop.mean"], 2, mean, na.rm=TRUE)
lines(freq.mean, col="blue")
axis(side=1, at=seq(0, 35, by=5), labels=seq(1970, 2005, by=5))

#Plot 95% quantiles 
quant.means <- apply (SongEvo2$summary.results[, , "trait.pop.mean"], MARGIN=2, quantile, 
    probs=c(0.95, 0.05), R=600, na.rm=TRUE)
lines(quant.means[1,], col="blue", lty=2)
lines(quant.means[2,], col="blue", lty=2)

#plot mean and CI for historic songs.  
 #plot original song values
library("boot")
sample.mean <- function(d, x) {
    mean(d[x])
}
boot_hist <- boot(starting.trait, statistic=sample.mean, R=100)
ci.hist <- boot.ci(boot_hist, conf=0.95, type="basic")
low <- ci.hist$basic[4]
high <- ci.hist$basic[5]
points(0, mean(starting.trait), pch=20, cex=0.6, col="black")
library("Hmisc")
errbar(x=0, y=mean(starting.trait), high, low, add=TRUE)

 #text and arrows
text(x=5, y=2720, labels="Historical songs", pos=1)
arrows(x0=5, y0=2750, x1=0.4, y1=mean(starting.trait), length=0.1)

 #plot current song values
library("boot")
sample.mean <- function(d, x) {
    mean(d[x])
}
boot_curr <- boot(target.data, statistic=sample.mean, R=100)
ci.curr <- boot.ci(boot_curr, conf=0.95, type="basic")
low <- ci.curr$basic[4]
high <- ci.curr$basic[5]
points(years, mean(target.data), pch=20, cex=0.6, col="black")
library("Hmisc")
errbar(x=years, y=mean(target.data), high, low, add=TRUE)

 #text and arrows
text(x=25, y=3100, labels="Current songs", pos=3)
arrows(x0=25, y0=3300, x1=36, y1=mean(target.data), length=0.1)

The model did reasonably well predicting trait evolution in the validation population, suggesting that it is valid for our purposes: the mean bandwidth was abs(mean(target.data)-freq.mean)Hz from the observed values, ~21% of predicted population means fell within the 95% confidence intervals of the observed data, and residuals of means (~545 Hz) and variances (~415181 Hz) were similar to those produced by the training data set.

Hypothesis testing with h.test()

This function allows hypothesis testing with SongEvo. To test if measured songs from two time points evolved through mechanisms described in the model (e.g. drift or selection), users initialize the model with historical data, parameterize the model based on their understanding of the mechanisms, and test if subsequently observed or predicted data match the simulated data. The output data list includes two measures of accuracy: the proportion of observed points that fall within the confidence intervals of the simulated data and the residuals between simulated and observed population trait means. Precision is measured as the residuals between simulated and observed population trait variances. We tested the hypothesis that songs of Z. l. nuttalli in Bear Valley, CA evolved through cultural drift from 1969 to 2005.

Prepare initial song data for Bear Valley.

starting.trait <- subset(song.data, Population=="Bear Valley" & Year==1969)$Trill.FBW
starting.trait2 <- c(starting.trait, rnorm(n.territories-length(starting.trait), 
    mean=mean(starting.trait), sd=sd(starting.trait)))

init.inds <- data.frame(id = seq(1:n.territories), age = 2, trait = starting.trait2)
init.inds$x1 <-  round(runif(n.territories, min=-122.481858, max=-122.447270), digits=8)
init.inds$y1 <-  round(runif(n.territories, min=37.787768, max=37.805645), digits=8)

Specify and call SongEvo() with test data

SongEvo3 <- SongEvo(init.inds = init.inds, females = 1.0, iteration = iteration, steps = years,  
    timestep = timestep, n.territories = n.territories, terr.turnover = terr.turnover, 
    integrate.dist = integrate.dist, 
    learning.error.d = learning.error.d, learning.error.sd = learning.error.sd, 
    mortality.a.m = mortality.a.m, mortality.a.f = mortality.a.f,
    mortality.j.m = mortality.j.m, mortality.j.f = mortality.j.f, lifespan = lifespan, 
    phys.lim.min = phys.lim.min, phys.lim.max = phys.lim.max, 
    male.fledge.n.mean = male.fledge.n.mean, male.fledge.n.sd = male.fledge.n.sd, male.fledge.n = male.fledge.n,
    disp.age = disp.age, disp.distance.mean = disp.distance.mean, disp.distance.sd = disp.distance.sd, 
    mate.comp = mate.comp, prin = prin, all = TRUE)

Specify and call h.test()

target.data <- subset(song.data, Population=="Bear Valley" & Year==2005)$Trill.FBW
h.test1 <- h.test(summary.results=SongEvo3$summary.results, ts=ts, 
    target.data=target.data)

The output data list includes two measures of accuracy: the proportion of observed points that fall within the confidence intervals of the simulated data and the residuals between simulated and observed population trait means. Precision is measured as the residuals between simulated and observed population trait variances.

Eighty percent of the observed data fell within the central 95% of the simulated values, providing support for the hypothesis that cultural drift as described in this model is sufficient to describe the evolution of trill frequency bandwidth in this population.

h.test1
#> $Residuals
#>              Residuals of mean Residuals of variance
#> Iteration 1           570.5283              1766.702
#> Iteration 2           897.4296             27905.781
#> Iteration 3           363.7918             34670.479
#> Iteration 4           842.5410             33374.746
#> Iteration 5           714.8821             29906.828
#> Iteration 6           521.8058             43487.935
#> Iteration 7           563.2237             59290.204
#> Iteration 8           428.2495             44896.769
#> Iteration 9           438.2434             15331.006
#> Iteration 10          117.8777             61258.778
#> 
#> $Prop.contained
#> [1] 0.4

We can plot simulated data in relation to measured data.

#Plot
plot(SongEvo3$summary.results[1, , "trait.pop.mean"], xlab="Year", ylab="Bandwidth (Hz)", 
    xaxt="n", type="n", xlim=c(-0.5, 35.5), 
    ylim=c(min(SongEvo3$summary.results[, , "trait.pop.mean"], na.rm=TRUE), 
    max(SongEvo3$summary.results[, , "trait.pop.mean"], na.rm=TRUE)))
    for(p in 1:iteration){
        lines(SongEvo3$summary.results[p, , "trait.pop.mean"], col="light gray")
        }
freq.mean <- apply(SongEvo3$summary.results[, , "trait.pop.mean"], 2, mean, na.rm=TRUE)
lines(freq.mean, col="blue")
axis(side=1, at=seq(0, 35, by=5), labels=seq(1970, 2005, by=5))#, tcl=-0.25, mgp=c(2,0.5,0))

#Plot 95% quantiles (which are similar to credible intervals)
quant.means <- apply (SongEvo3$summary.results[, , "trait.pop.mean"], MARGIN=2, quantile, 
    probs=c(0.95, 0.05), R=600, na.rm=TRUE)
lines(quant.means[1,], col="blue", lty=2)
lines(quant.means[2,], col="blue", lty=2)

 #plot original song values
library("boot")
sample.mean <- function(d, x) {
    mean(d[x])
}
boot_hist <- boot(starting.trait, statistic=sample.mean, R=100)#, strata=mn.res$iteration)  
ci.hist <- boot.ci(boot_hist, conf=0.95, type="basic")
low <- ci.hist$basic[4]
high <- ci.hist$basic[5]
points(0, mean(starting.trait), pch=20, cex=0.6, col="black")
library("Hmisc")
errbar(x=0, y=mean(starting.trait), high, low, add=TRUE)

 #plot current song values
points(rep(ts, length(target.data)), target.data)

library("boot")
sample.mean <- function(d, x) {
    mean(d[x])
}
boot_curr <- boot(target.data, statistic=sample.mean, R=100)#, strata=mn.res$iteration) 
ci.curr <- boot.ci(boot_curr, conf=0.95, type="basic")
low <- ci.curr$basic[4]
high <- ci.curr$basic[5]
points(years, mean(target.data), pch=20, cex=0.6, col="black")
library("Hmisc")
errbar(x=years, y=mean(target.data), high, low, add=TRUE)

 #text and arrows
text(x=11, y=2850, labels="Historical songs", pos=1)
arrows(x0=5, y0=2750, x1=0.4, y1=mean(starting.trait), length=0.1)
text(x=25, y=2900, labels="Current songs", pos=1)
arrows(x0=25, y0=2920, x1=years, y1=mean(target.data), length=0.1)