monta_verde_succession_analysis.Rmd · monta-verde-succession

---
title: "Monta Verde Succession"
author: "Scott Yanco"
date: "December 12, 2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Introduction
This document describes analyses and preliminary results of avian point count surveys across a successional gradient at the Monta Verde Cloud Forest in Costa Rica.

# Required Libraries and Custom Functions

```{r message=FALSE, warning=FALSE}
#### REQUIRED LIBRARIES ####
library(dplyr)
library(ggplot2)
library(lme4)
library(AICcmodavg)
library(ggeffects)
library(viridis)
library(tidyr)
library(betareg)
library(tibble)
library(boot)
library(tidyverse)
library(vegan)
library(ggthemes)
library(gridExtra)
library(ggrepel)
library(hms)
library(lubridate)
```

Load a custom function that will help us combine initial migratory coding into bigger bins.

```{r}
combineMigCats <- Vectorize(vectorize.args = "x",
                               FUN = function(x) {
                                 switch(as.character(x), 
                            "altitudinal" = "mig",
                            "long-distance" = "mig",
                            "short-distance" = "mig",
                            "nomadic" = "mig",
                            "resident" = "res",
                            "NA" = NA)})

combineGuildCats <- Vectorize(vectorize.args = "x",
                               FUN = function(x) {
                                 switch(as.character(x), 
                            "carnivore" = "specialist",
                            "frugivore" = "specialist",
                            "granivore" = "specialist",
                            "herbivore" = "specialist",
                            "insectivore" = "specialist",
                            "nectarivore" = "specialist",
                            "nectarivore (omnivore?)" = "omnivore",
                            "omnivore" = "omnivore",
                            "scavenger" = "omnivore")})
```


# Import and Process Data

## Load data

### Bird Data

First load all the point count data and combine into a single data frame.

```{r}
#load site data
nf4_2011 <- read.csv("../data_by_site/NF4-2011.csv", stringsAsFactors = T)
nf1_2003 <- read.csv("../data_by_site/NF1-2003.csv", stringsAsFactors = T)
nf2 <- read.csv("../data_by_site/NF2.csv", stringsAsFactors = T)
mf1 <- read.csv("../data_by_site/MF1.csv", stringsAsFactors = T)
mf2 <- read.csv("../data_by_site/MF2.csv", stringsAsFactors = T)
nf6_2011 <- read.csv("../data_by_site/NF6-2011.csv", stringsAsFactors = T)
p1 <- read.csv("../data_by_site/P1.csv", stringsAsFactors = T)
p2 <- read.csv("../data_by_site/P2.csv", stringsAsFactors = T)
nf3nf5_2008 <- read.csv("../data_by_site/NF3NF5-2008.csv", stringsAsFactors = T)

#excluded Finca site from analysis
#nf7ab_2008 <- read.csv("../data_by_site/NF7AB-2008.csv") 

#combine into single data object
dat <- rbind(nf4_2011, nf1_2003, nf2, mf1, mf2, nf6_2011, p1, p2, nf3nf5_2008)
```

Load the datasheet containing key to codes.

```{r}
codes <- read.csv("../data_by_site/CODES.csv")
```

### Vegetation Data

Load in the tree data:

```{r}
trees <- read.csv("../trees_nofinca_121219.csv")

shrubs <- read.csv("../shrubs.csv")
shrubs$year_planted <- factor(shrubs$year_planted, levels = c("0","2011", "2008",
                                                              "2003", "old"))

#set factor order
trees$treatment <- factor(trees$treatment, levels = c("0", "2011", "2008", "2003", "old"))
```

Load in densiometer data (values represent open "dots"):

```{r}
dens <- read.csv("../dens_nofinca.csv")

#set factor order
dens$treatment <- factor(dens$treatment, levels = c("0", "2011", "2008", "2003", "old"))
```

## Process data

Initial basic cleaning actions:

```{r}
#convert date to R date format
dat$date <- as.Date(dat$date, format = "%d-%b-%y")

dat$start_time <- as_hms(as.POSIXlt(dat$start_time, format = "%H:%M"))
dat$obs_time <- as_hms(as.POSIXlt(dat$obs_time, format = "%H:%M"))


#convert year planted from numeric to factor
dat$year_planted <- as.factor(dat$year_planted) 

#fix 2 NA values for year in LC NF2
dat$year_planted[is.na(dat$year_planted)] <- "2003"

#reorder the levels of the year factor variable to make plots nicer
levels(dat$year_planted) <- c("0", "2011", "2008", "2003", "old")

#fix weird transect coding
dat$transect[dat$transect == "P1(N)"] <- "P1"
dat$transect[dat$transect == "P1 (N)"] <- "P1"
dat$transect <- droplevels(dat$transect)
levels(dat$transect) 
```

Subset the dataset to only include those individuals that were observed within 25m of the sample point.

```{r}
dat_close <- dat[dat$dist == "<25",]
```

Further subset to exclude those species whose behavior was coded as "pass through" since they were'nt truly using the habitat.

```{r}
dat_close_use <- dat_close[dat_close$beh != "pt",]
```

Subset to first 15 minutes only

```{r}
dat_close_use <- dat_close_use %>% 
  filter(abs(start_time-obs_time) <= 900)
```


Now we need to join the species coding information with the database so that we have feeding guild and migratory status associated with each observation.

```{r}
dat_join <- dat_close_use %>%
  inner_join(., codes, by = "species")
```

Finally, we'll simplify the migratory guild data to only include 3 groups: "resident", "migratory", or "nomadic"; and simplify feeding guilds into "omnovore" or "specialist". 

```{r}
dat_join$mig_code <- unlist(combineMigCats(x = dat_join$mig_guild))

dat_join$guild_code <- unlist(combineGuildCats(x = dat_join$feed_guild))
```

# Analyisis

## Abundance

Now we're ready for some anlyeses.  Let's first build some really basic models and plots to see if there is an effect of treatment on the total number of birds seen (having excluded pass throughs, and those observed >25m from the observer).

This next code chunk counts up the number of observations grouped by transect and date (so for each transect-day, we get the total number of birds)

```{r}
mod1_counts <- dat_join %>%
  group_by(transect, year_planted) %>%
  count(date)
```


We can fit two pretty simple generalized linear models to this.  One is the simple intercept model, the other is modeling the abundance during each survey as a function of successional stage (measured as a factor variable).  Each model includes date as a random effect on the intercept - basically saying that there may be more or less birds on a given day). WE don't have enough observations to allow for day to modify how succession effects the count (e.g., cant' do random effect on slope). We'll first just look at the AICc-based selection table.

```{r}
# f0 <- lmer(n ~ 0 + (1|date), data = mod1_counts, REML = F)
# #succession effect model
# f1 <- lmer(n ~ 0 + year_planted + (1 |date), data = mod1_counts, REML = F)

#intercept ("dot") model
f0 <- glmer(n ~ 0 + (1|date), data = mod1_counts, family = "poisson")
#succession effect model
f1 <- glmer(n ~ 0 + year_planted + (1 |date), data = mod1_counts, , family = "poisson")

aictab(list("dot" = f0, "succession" = f1))

```

Clear information theoretic support for the succession model.  Let's now evaluate the model. By looking at the summary and a residuals plot. First though we need to refit the models using REML instead of ML.  ML is needed to use AICc but produces biased coefficient estimates.

```{r}
#intercept ("dot") model
f0 <- glmer(n ~ 1 + (1|date), data = mod1_counts, family = "poisson")
#succession effect model
f1 <- glmer(n ~ 1 + year_planted + (1 |date), data = mod1_counts, family = "poisson")
```


```{r}
summary(f1)
plot(f1)
qqnorm(resid(f1))
qqline(resid(f1))

(f1.ci <- exp(confint(f1, oldNames = F)))

(f1.coef <- exp(f1@beta))


```

The residuals don't look great, but not horrible either - need to think more about what might be going on, but there seems like there might be some structure where we see residual bias at the highest predicted values. It might also be because we're technically using a model that assumes normality of residuals with count data which we could also model as a Poisson.  That said, think about Gary White's talk at RRF and his paper that showed sometimes the normal model is actually more robust than the Poisson even for known-Poisson-generated data.  I think the residuals look OK enough to keep going.

The actual fixed effect coefficient estimates are also a bit confusing in their trend, but at least they are larger than their standard error (except for old) - let's make some plots to look at these graphically.

Make a version forced through intercept for graphing
```{r}
f1.0 <- glmer(n ~ 0 + year_planted + (1 |date), data = mod1_counts, family = "poisson")
(f1.0.coef <- exp(f1.0@beta))
(f1.0.ci <- exp(confint(f1.0, oldNames = F)))
f1est <- data.frame(coef = f1.0.coef, ci.low = f1.0.ci[2:6,1], ci.hi = f1.0.ci[2:6,2],
                    year_planted = 
                      ordered(c("Pasture", "2011", "2008", "2003", ")ld"),
                              levels = c("Pasture", "2011", "2008", "2003", 
                                         "Old")))
```


```{r fig.height=4, fig.width=4}
#make some model fited values
f1_pred <- ggpredict(f1, terms = c("year_planted"))

(abund_plot <- ggplot() +
  geom_point(data = f1est, aes(x = year_planted, y = coef), alpha = .6) +
  geom_errorbar(data = f1est, 
                aes(x = year_planted, ymax = ci.hi, ymin = ci.low),
                width = .2) +
  scale_x_discrete(labels = c("Pasture", "T-7", "T-10", "T-15", "Old")) +
  ylab(expression(paste("Abundance birds count "^-1))) +
  xlab("Successional Stage") +
  theme_few())

ggsave("../figures/abund_plot.pdf")
```

The above plot shows the raw count data (in the blue points) colored by date. The black squares represent the model estimate for each succession class (think of this as the mean corrected for day effects).  The error bars ar 95% CIs.  Hard to see a trend I'd be confident in here, other than maybe a boost in abundance at the earliest successional stage.  I think the model is overfit, so the parameter estimates themselves are difficult to have much confidence in (the model, in the end, includes 7 parameters for ~65 observations, that's not great...  I trust the graph more.)

## Migratory status

Instead of raw abundance, let's look at the proportion of migrants versus residents. Let's just make the simplest version of the dataset and plot.

```{r}

#break down mean counts per mig guild
mig_counts <- dat_join %>%
  group_by(year_planted, transect, date, mig_code) %>%
  count(start_time)  %>%
  ungroup() %>% 
  complete(nesting(year_planted, transect, date, start_time), mig_code, 
           fill = list("n" = 0)) 

mig_prop <- mig_counts %>%
  group_by(year_planted, mig_code) %>% 
  summarise(m_count = mean(n), se = sd(n)/length(n))

mig_date <- mig_counts %>%
  group_by(year_planted, date, mig_code) %>% 
  summarise(m_count = mean(n), se = sd(n)/length(n))

mig_beta <- dat_join %>% 
  group_by(year_planted, transect, date, start_time) %>% 
  summarise(mig = sum(mig_code == "mig"), res = sum(mig_code == "res")) %>% 
  mutate(prop = mig/(mig+res), trials = (mig+res)) 
  
# ggplot() +
#   geom_bar(data = mig_prop, aes(x=year_planted, y = m_count, fill = mig_code), 
#            position = "fill", stat = "identity") +
#   geom_errorbar(data = mig.df, aes(x=year_planted, ymin=1-ci.hi, ymax = 1-ci.low),
#                 width = .25) +
#   ylab("Proportion Birds Observed") +
#   xlab("Treatment") +
#   scale_x_discrete(labels = c("Pasture", "2011", "2008", "2003", "Old")) +
#   scale_fill_discrete(labels = c("Migrant", "Resident")) +
#   labs(fill = "Migratory Status") +
#   scale_fill_discrete(name = "Migratory Status", 
#                       labels = c("Migrant", "Resident"))
```

First a couple plots by date separated by mig code and succession treatment:

```{r}
ggplot(mig_counts) +
  geom_point(aes(x = date, y = n, group = mig_code, 
                color = mig_code)) +
  facet_wrap(~year_planted)

ggplot(mig_beta) +
  geom_point(aes(x = date, y = prop)) +
  ylab("Proportion Migrants") +
  facet_wrap(~year_planted, nrow =1)
```

```{r}
mean.fun <- function(dat, idx) mean(dat[idx], na.rm = TRUE)

boot.prop <- sapply(unique(mig_beta$year_planted), FUN = function(x){
 out <- boot(mig_beta$prop[mig_beta$year_planted == x], mean.fun, R=10000, 
              sim = "ordinary")
  }, simplify = F)

            
cis <- sapply(boot.prop, FUN = function(x){boot.ci(boot.out = x, type="bca")},
              simplify = F)
unique(mig_beta$year_planted)
cis

mig.df <- data.frame(year_planted = unique(mig_beta$year_planted),
                     mean = sapply(cis, function(x){x$t0}),
                     ci.low = sapply(cis, function(x){x$bca[4]}),
                     ci.hi = sapply(cis, function(x){x$bca[5]}))
```
 Plot:
```{r fig.width=4, fig.height=4}
(mig_plot <- ggplot(mig.df) +
  geom_point(aes(x=year_planted, y = mean)) +
  geom_errorbar(aes(x=year_planted, ymin = ci.low, ymax = ci.hi), width =.1) +
  scale_x_discrete(labels = c("Pasture", "T-7", "T-10", "T-15", "Old")) +
  ylab(expression(paste("Proportion migrants point count "^-1))) +
  xlab("Successional Stage") +
   ylim(0,1)+
  theme_few())

# ggplot(mig.df) +
#   geom_bar(aes(x = year_planted, y = mean, fill = mig_code), 
#            position = "fill", stat = "identity")
```


## Foraging Guild

Similar analysis, but for collapsed feeding guild.

```{r}

#break down mean counts per mig guild
guild_counts <- dat_join %>%
  group_by(year_planted, transect, guild_code) %>%
  count(date)  %>%
  ungroup() %>% 
  complete(nesting(year_planted, transect, date), guild_code, 
           fill = list("n" = 0)) 

guild_beta <- dat_join %>% 
  group_by(year_planted, transect, date, start_time) %>% 
  summarise(omni = sum(guild_code == "omnivore"), 
            spec = sum(guild_code == "specialist")) %>% 
  mutate(prop = spec/(spec+omni), trials = (spec+omni)) 

guild_prop <- guild_counts %>%
  group_by(year_planted, guild_code) %>% 
  summarise(m_count = mean(n), se = sd(n)/length(n))

guild_date <- guild_counts %>%
  group_by(year_planted, date, guild_code) %>% 
  summarise(m_count = mean(n), se = sd(n)/length(n))

ggplot(guild_prop) +
  geom_bar(aes(x=year_planted, y = m_count, fill = guild_code), 
           position = "fill", stat = "identity") +
  ylab("Proportion Birds Observed") +
  xlab("Treatment") +
  theme_minimal() +
  scale_fill_discrete(name = "Feeding Guild",
                      labels = c("Omnivore", "Specialist"))
```

```{r}
ggplot(guild_counts) +
  geom_point(aes(x = date, y = n, group = guild_code, 
                color = guild_code)) +
  facet_wrap(~year_planted)

ggplot(guild_beta) +
  geom_point(aes(x = date, y = prop)) +
  ylab("Proportion Specialists") +
  facet_wrap(~year_planted, nrow =1)
```
```{r}
mean.fun <- function(dat, idx) mean(dat[idx], na.rm = TRUE)

boot.prop <- sapply(unique(guild_beta$year_planted), FUN = function(x){
 out <- boot(guild_beta$prop[guild_beta$year_planted == x], mean.fun, R=10000, 
              sim = "ordinary")
  }, simplify = F)

            
cis.guild <- sapply(boot.prop, FUN = function(x){boot.ci(boot.out = x, type="bca")},
              simplify = F)
unique(guild_beta$year_planted)
cis.guild

guild.df <- data.frame(year_planted = unique(guild_beta$year_planted),
                     mean = sapply(cis.guild, function(x){x$t0}),
                     ci.low = sapply(cis.guild, function(x){x$bca[4]}),
                     ci.hi = sapply(cis.guild, function(x){x$bca[5]}))
```
 Plot:
```{r fig.height=4, fig.width=4}
(spec_plot <- ggplot(guild.df) +
  geom_point(aes(x=year_planted, y = mean)) +
  geom_errorbar(aes(x=year_planted, ymin = ci.low, ymax = ci.hi), width =.1) +
  scale_x_discrete(labels = c("Pasture", "T-7", "T-10", "T-15", "Old")) +
  ylab(expression(paste("Proportion specialists point count "^-1))) +
  xlab("Successional Stage") +
  ylim(0,1) +
  theme_few())
```
Combine the plots for mig and foraging guild

```{r}
gl_guild <- list(mig_plot, spec_plot)

guild <- arrangeGrob(grobs = gl_guild,
                    heights = c(4),
                    widths = c(4,4),
                    layout_matrix = matrix(c(1,2), nrow = 1))

ggsave("../figures/guild_plot.pdf", guild)
ggsave("../figures/guild_plot.png", guild)
```


## Vegetation

Again, given sample sizes I don't think it makes a ton of sense to model the effect of veg parameters directly (though I'm certainly open to conversation about that...).  But some visualizations to help with the "storytelling" might be useful - let's see what's in the veg data...

```{r}
library(moments)
trees %>% 
  group_by(treatment) %>% 
  summarise(m = mean(ht), se = sd(ht)/length(ht), v = var(ht), 
            skew = skewness(ht), kurt = kurtosis(ht),
            range.low = range(ht)[1], range.hi = range(ht)[2])
```


Let's look at the distribution of tree heights by treatement.

```{r fig.height=4, fig.width=4}
(tr_ht <- ggplot() +
  geom_violin(data = trees, aes(x=treatment, y = ht), fill = "gray") +
  geom_point(data = trees[trees$treatment == 0,], aes(x = treatment, y = ht),
             size = 3) +
  #scale_fill_grey(start = .9, end = 0) +
  scale_x_discrete(labels = c("Pasture", "T-7", "T-10", "T-15", "Old")) +
  theme_minimal() +
  ylab("Tree Ht. (cm)") +
  xlab("Treatment") + 
   theme_few() +
  theme(legend.position = "none"))
```

That's pretty interesting, again, the only real standout is the 2011 treatment - 2003 and 2008 look more similar to old than 2011 - just missing the far end of the tail.

We can do the same thig with DBH data too :

```{r fig.height=4, fig.width=4}
(tr_dbh <- ggplot() +
  geom_violin(data = trees, aes(x=treatment, y = dbh), fill = "gray") +
  geom_point(data = trees[trees$treatment == 0,], aes(x = treatment, y = dbh),
             size = 3) +
  theme_minimal() +
  ylab("Tree DBH (cm)") +
  scale_x_discrete(labels = c("Pasture", "T-7", "T-10", "T-15", "Old")) +
  xlab("Treatment") + 
  theme_few() +
  theme(legend.position = "none"))
```

Pretty cool looking graph but I think 2011 doesn't jump out to me the way it does for tree height.

Pulling the densiometer data:


```{r fig.height=4, fig.width=4}
trt_lab <-  c("Pasture", "T-7", "T-10", "T-15", "Old")
names(trt_lab) <- levels(dens$treatment)

(can_op <- ggplot() +
  geom_jitter(data = dens, aes(x = treatment, y = dens_mean), 
              size = 3, width = 0.3, height= 0) +
  scale_x_discrete(labels = c("Pasture", "T-7", "T-10", "T-15", "Old")) +
  ylab("Canopy Openness (%)") +
  xlab("Treatment") + 
  theme_few() +
  theme(legend.position = "none", axis.text.x = element_blank(), 
        axis.ticks.x = element_blank()) +
  facet_grid(~treatment, scales = "free_x", labeller = 
               labeller(treatment = trt_lab)))
```

```{r fig.height=4, fig.width=4}
trt_lab <-  c("Pasture", "T-7", "T-10", "T-15", "Old")
names(trt_lab) <- levels(shrubs$year_planted)

(shrub_den <- ggplot() +
  geom_jitter(data = shrubs, aes(x = year_planted, y = shrubs), 
              width = .2, height = 0, size = 3) +
  ylab("Shrub Density Score") +
  xlab("Treatment") + 
  theme_few() +
    theme(legend.position = "none", axis.text.x = element_blank(), 
        axis.ticks.x = element_blank())   +
   facet_grid(~year_planted, scales = "free_x", 
              labeller = labeller(year_planted = trt_lab)))
```

### Combine veg plots

```{r fig.width=8, fig.height=10}
gl_veg <- list(tr_ht, tr_dbh, can_op, shrub_den)

veg <- arrangeGrob(grobs = gl_veg,
                    heights = c(4,4),
                    widths = c(4,4),
                    layout_matrix = matrix(rbind(c(1,2),
                                           c(3,4)), nrow = 2))

ggsave("../figures/veg_plot.pdf", veg, dpi = 600,
       width = 8, height = 8, units = c("in"))
ggsave("../figures/veg_plot.png", veg, dpi = 600,
       width = 8, height = 8, units = c("in"))
```


## Diversity

Create presence/absence dataset filling in implied "0"s.

```{r}
occur <- dat_join
occur$pres <- 1

ocurr_sites <- occur %>%
  complete(nesting(year_planted, transect, date, start_time), species, 
           fill = list("pres" = 0))
```

First let's look at raw species richness by site:

```{r fig.height=4, fig.width=4}
rich <- ocurr_sites %>% 
  group_by(transect, year_planted) %>%
  distinct(species, transect, year_planted, pres) %>% 
  tally(wt = pres)

rich %>% 
  group_by(year_planted) %>% 
  summarise(mean(n))

rich.visit <- ocurr_sites %>% 
  group_by(year_planted, transect, date, start_time) %>%
  distinct(species, transect, year_planted, pres, start_time) %>% 
  summarise(sp_count = sum(pres))

(total_rich_plot <- ggplot(rich) +
  geom_point(aes(x = year_planted, y = n), size = 3) +
  xlab("Successional Stage") +
  ylab("Total Species Richness") +
  scale_x_discrete(labels = c("Pasture", "T-7", "T-10", "T-15", "Old")) +
    theme_few())

ggplot(rich.visit) +
  geom_point(aes(x = date, y = sp_count, color = transect)) +
  facet_wrap(~year_planted, nrow = 1) +
  xlab("Successional Stage") +
  ylab("Total Species Richness") +
  theme_minimal() +
  theme(legend.position = "none")
```

```{r}
mean.fun <- function(dat, idx) mean(dat[idx], na.rm = TRUE)

boot.sp <- sapply(unique(rich.visit$year_planted), FUN = function(x){
 out <- boot(rich.visit$sp_count[rich.visit$year_planted == x], mean.fun, 
             R=10000, sim = "ordinary")
  }, simplify = F)

            
cis.sp <- sapply(boot.sp, FUN = function(x){boot.ci(boot.out = x, type="bca")},
              simplify = F)
unique(rich.visit$year_planted)
cis.sp

rich.df <- data.frame(year_planted = unique(rich.visit$year_planted),
                     mean = sapply(cis.sp, function(x){x$t0}),
                     ci.low = sapply(cis.sp, function(x){x$bca[4]}),
                     ci.hi = sapply(cis.sp, function(x){x$bca[5]}))
```
 Plot:
```{r fig.height=4, fig.width=4}
(per_rich_plot <- ggplot(rich.df) +
  geom_point(aes(x=year_planted, y = mean)) +
  geom_errorbar(aes(x=year_planted, ymin = ci.low, ymax = ci.hi), width =.1) +
  ylab(expression(paste("Species Richness Point Count "^-1))) +
  xlab("Successional Stage") +
      scale_x_discrete(labels = c("Pasture", "T-7", "T-10", "T-15", "Old")) +
  theme_few())
```

Plot them together:

```{r}
gl1 <- list(total_rich_plot, per_rich_plot)

rich <- arrangeGrob(grobs = gl1,
                    heights = c(4),
                    widths = c(4,4),
                    layout_matrix = matrix(c(1,2), nrow = 1))

ggsave("../figures/rich_plot.pdf", rich)
ggsave("../figures/rich_plot.png", rich)
```


## Species composition (NMDS)


```{r}
comm_dat <- dat_join %>% 
  group_by(year_planted, transect, date, start_time, species) %>% 
  count() %>% 
  ungroup() %>%
  group_by(year_planted, transect, species) %>%
  summarise(m_abund = mean(n)) %>%
  ungroup()

comm_mat <- comm_dat %>% 
  spread(species, m_abund, fill=0) %>% 
  column_to_rownames(var="transect") %>% 
  select(-c(year_planted)) %>% 
  as.matrix

comm_pres <- t(apply(comm_mat, MARGIN = c(1,2), FUN = function(x) {
  (if(x >= 1){1}else{0})
}))

write.csv(comm_pres, "pres-abs_table_020720.csv")

mat_key <-   comm_dat %>% 
  spread(species, m_abund, fill=0) %>% 
  column_to_rownames(var="transect")
```
#Get most common species in each habitat
```{r}
#TODO: make more elegant - summarize by year_planted first
abund_top <- comm_dat %>% 
  group_by(year_planted, species) %>%
  summarize(tot_m_abund = mean(m_abund),
            .groups = "keep") %>% 
  group_by(year_planted) %>% 
  top_n(3, wt = tot_m_abund)
abund_top
```

```{r}
NMDS <- metaMDS(comm_mat,  k=2, trymax = 250, maxit = 999)
```

Plots

```{r}
#extract NMDS scores (x and y coordinates)
data.scores <- as.data.frame(scores(NMDS))
data.scores$year_planted <- mat_key$year_planted

species.score <- as.data.frame(NMDS$species) %>% 
  rownames_to_column(var = "species") %>% 
  left_join(codes, by = "species") %>% 
  select(species, MDS1, MDS2, mig_guild)

species.score$mig_code <- unlist(combineMigCats(x = species.score$mig_guild))  

(nmds <- ggplot() +
  geom_point(data = data.scores, aes(x = NMDS1, y = NMDS2, group = year_planted,
                                     shape = year_planted), 
               color = "black", fill = "black", size = 5) +
  scale_shape_discrete(name = "Successional Stage", 
                       labels = c("Pasture", "T-7", "T-10", "T-15", "Old"),
                       solid = T) +
  # geom_point(data = species.score, aes(x=MDS1, y = MDS2)) +
  # scale_color_discrete(name = "Migratory Status", labels = c("Migrant", "Resident")) +
  geom_text_repel(data = species.score, aes(x=MDS1, y = MDS2, label = species), 
                  size = 2, direction = "both", force_pull = 1, segment.color = 'transparent',
                  max.overlaps = 20) +
  theme_few())

ggsave("../figures/nmds.pdf")
ggsave("../figures/nmds.png")

```

### Do it again but use each transect

```{r}
comm_dat <- dat_join %>% 
  group_by(year_planted, transect, date, start_time, species) %>% 
  count() %>% 
  ungroup()

comm_mat <- comm_dat %>% 
  spread(species, n, fill=0) %>% 
  select(-c(year_planted, transect, date, start_time)) %>% 
  as.matrix

mat_key <- comm_dat %>% 
  spread(species, n, fill=0)
```

```{r}

#NOT CONVERGING
NMDS <- metaMDS(comm_mat,  k=2, trymax = 250, maxit = 999)
```

Plots

```{r}
#extract NMDS scores (x and y coordinates)
data.scores <- as.data.frame(scores(NMDS))
data.scores$year_planted <- mat_key$year_planted

species.score <- as.data.frame(NMDS$species) %>% 
  rownames_to_column(var = "species") %>% 
  left_join(codes, by = "species") %>% 
  select(species, MDS1, MDS2, mig_guild)

species.score$mig_code <- unlist(combineMigCats(x = species.score$mig_guild))  

ggplot() +
  geom_point(data = data.scores, aes(x = NMDS1, y = NMDS2, group = year_planted,
                                     shape = year_planted), 
               color = "black", size = 5) +
  geom_point(data = species.score, aes(x=MDS1, y = MDS2, color = mig_code)) +
  theme_minimal()

```
```{r}
colvec <- c("gray0", "gray0", "gray49", "gray49")   # Identifies colors for group assignments
pchvec <- c(21, 22, 23, 24, 35)   # Identifies character symbols for group assignments

plot(NMDS)
with(key,
     points(NMDS,
            display = "sites",
            col = "black",
            pch = pchvec[transect]))

#Create convex hulls that highlight point clusters based on grouping dataframe
ordihull(
  NMDS,
  key$transect,
  display = "sites",
  draw = c("polygon"),
  col = NULL,
  border = c("gray0", "gray0", "gray48", "gray48"),
  lty = c(1, 2, 1, 2),
  lwd = 2.5
  )

# # Calculating centroids 
# 
# # You can calculate centroids for your groups which can be viewed as the average position of observations in ordination space. 
# 
# # Calculating and plotting centroids of NMDS Result
# scrs <-
#   scores(island.spp_NMS, display = "sites", "species")
# cent <-
#   aggregate(scrs ~ habitat, data = island.spp_groups, FUN = "mean")
# names(cent) [-1] <- colnames(scrs)
# points(cent [,-1],
#        pch = c( 8 , 8 , 8, 8),
#        col = c("gray0", "gray0", "gray48", "gray48"),
#        bg = c("black"),
#        lwd = 3.0,
#        cex = 2.0 # Plots centroids as points on ordination
#        )

```


# Preliminary Conclusions

This is a work in progress, but here's what I'm thinking for now.  There's a bump in both abundance and proportion of migrants in the 2011 treatment.  We'll need to appropriately caveat because, with the small sample size, one or a few days could be driving this (though I think the panel plot helps make the case it's a real trend).  That said, there could be a story here.  We wouldn't necessarily expect the most migrants in pasture because that's actually a totally different habitat type.  So you would expect the most migrants in the earliest successional stages, which is exactly what we see.  What's surprising, is that the trend appears to be nonlinear - it really only holds for the earliest successional habitats and after that, there is no discernable difference when compared to uncut forest. 

The tree height plot helps us disentangle that a bit more - the distribution of tree heights is really different in the 2011 plot compared to the other successional stages or the mature forest.

Here's another way of looking at it - we fit a separate loess smoother over the counts by date grouped by treatment and mig code.  It makes the influx of migrants late winter into early spring more obvious than when looking at the raw data.  This smoother doesn't show what our model did exactly (we don't have enough data to model this exactly, but it at least makes it visible to us).

```{r}
ggplot(mig_counts, aes(x = date, y = n, group = mig_code, 
                color = mig_code)) +
  # geom_line(data = cbind(mig_counts[!is.na(mig_counts$date),], 
  #                        pred = predict(f3)), aes(y = pred), 
            # size = 1) +  # adding predicted line from mixed model 
  geom_smooth(method = "loess") +
  ylim(c(0,20)) +
  facet_wrap(~year_planted)
```

# Notes
- Diversity index
- Check journals:  Biologia Tropical; Journal of Tropical Ecology; JFO; Agirculture, Ecosystems & Environment; Biotropica