---
title: "High- and Low-Level Constraints on Coordination during Conversation: Code for Paxton & Dale (accepted, Frontiers in Psychology)"
output: 
  html_document:
    number_sections: true
    keep_md: true
---

This R markdown provides the basis for our manuscript, "Interpersonal movement synchrony responds to high- and low-level conversational constraints" (Paxton & Dale, *accepted*, *Frontiers in Psychology*). The study explores how high-level (i.e., conversational context) and low-level (i.e., visual stimuli) constraints affect interpersonal synchrony during conversation. We quantify coordination using amplitude of movement from head-mounted accelerometers (using Google Glass; see Paxton, Rodriguez, & Dale, 2015, *Behavior Research Methods*).

To run these analyses from scratch, you will need the following files:

* `./data/prepped_data-DCC.csv`: Contains experimental data. All data for included dyads are freely available in the OSF repository for the project: `https://osf.io/x9ay6/`.
* `./supplementary-code/required_packages-DCC.r`: Installs required libraries, if they are not already installed. **NOTE**: This should be run *before* running this script.
* `./supplementary-code/libraries_and_functions-DCC.r`: Loads in necessary libraries and creates new functions for our analyses.
* `./supplementary-code/continuous_rqa_parameters-DCC.r`: Identifies the appropriate parameters for continuous cross-recurrence quantification analysis (CRQA).

Additional files will be created during the initial run that will help reduce processing time. Several of these files (including the chosen CRQA parameters, the final plotting dataframe, and the final analysis dataframe) are available as CSVs from the OSF repository listed above.

**Written by**: A. Paxton (University of California, Berkeley) and R. Dale (University of California, Merced)
<br>**Date last modified**: 1 July 2017

***

# Data trimming

**NOTE**: The chunks of code in this section do not have to be run each time, since the resulting datasets will be saved to CSV files. As a result, these chunks are currently set to `eval=FALSE`. Bear this in mind if these data need to be re-calculated.

***

## Preliminaries

This section reads in participant data, saved in long format (i.e., one line per sample). The columns include:

* `dyad`: identifier for each dyad
* `partic`: identifier for each participant within each dyad
* `conv.type`: high-level constraint (within subjects)
    + `0` = affiliative conversation
    + `1` = argumentative conversation
* `cond`: low-level constraint (between subjects)
    + `0` = noise
    + `1` = dual-task
* `conv.num`: conversation number (2 total for each dyad)
* `t`: time of sample in seconds
* `x`, `y`, `z`: accelerometer readings (from head-mounted sensor in Google Glass) at each sample along three axes, relative to head location at initialization

```{r trimming-prelims, warning = FALSE, error = FALSE, message = FALSE, eval = FALSE}

# clear our workspace
rm(list=ls())

# read in libraries and create functions
source('./supplementary-code/libraries_and_functions-DCC.r')

# read in unified dataframe
coords = read.table('./data/prepped_data-DCC.csv',sep=',',header = TRUE)

```

***

## Pre-process movement data

Because participants' movement series were sampled separately, we time-align the two time series within each dyad here.

These data were collected using an accelerometer. Because an accelerometer measures the change in velocity along each axis, we might call the resulting measure Euclidean *acceleration* (rather than Euclidean distance).
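The `euclidean()` helper is defined in `libraries_and_functions-DCC.r` and is not reproduced in this document. As a rough sketch only, one plausible form (an assumption, not the project's confirmed code) takes the Euclidean norm of the sample-to-sample change across the three axes. That form returns one fewer value than its input, which would be consistent with the leading `NA` padding in the chunk below. The name `euclidean_sketch` and the toy values are hypothetical.

```r
# hypothetical stand-in for the project's `euclidean()` helper (assumed form)
euclidean_sketch = function(x, y, z) {
  sqrt(diff(x)^2 + diff(y)^2 + diff(z)^2)
}

# toy accelerometer readings (made-up values for illustration)
toy_x = c(0, 3, 3)
toy_y = c(0, 4, 4)
toy_z = c(0, 0, 12)

# pad with a leading NA so the result matches the original series length
toy_accel = c(NA, euclidean_sketch(toy_x, toy_y, toy_z))
```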

```{r align-time-series, eval = FALSE}

# specify Butterworth filters
anti_aliasing_butter = butter(4,.4)
post_downsample_butter = butter(2,.02)

# get Euclidean acceleration, filter, downsample, and take mean of each axis at new time scale
coords = coords %>% ungroup() %>%
  group_by(dyad,partic,conv.num,conv.type,cond) %>%
  mutate(euclid_accel = c(NA,euclidean(x,y,z))) %>% # get Euclidean acceleration
  mutate(euclid_accel = signal::filtfilt(anti_aliasing_butter, euclid_accel)) %>% # filter
  select(-x,-y,-z) %>% # drop unneeded variables
  mutate(t = floor(t * sampling_rate) / sampling_rate) %>% # downsample
  ungroup() %>%
  group_by(dyad, partic, conv.num, conv.type, cond, t) %>%
  summarize_each(funs(mean(.))) %>% # take means
  mutate(euclid_accel = signal::filtfilt(post_downsample_butter, euclid_accel)) # filter

# isolate participants' time series within each dyad
p0 = coords %>% ungroup() %>% 
  dplyr::filter(partic == 0) %>% 
  select(-partic) %>%
  dplyr::rename(euclid0 = euclid_accel)
p1 = coords %>% ungroup() %>% 
  dplyr::filter(partic == 1) %>% 
  select(-partic) %>%
  dplyr::rename(euclid1 = euclid_accel)

# join together the participants' time series, leaving only any overlapping slices
coords = plyr::join(p0, p1, 
                    by=c("dyad", "conv.num", "conv.type", "cond", "t"), 
                    type="inner")

```
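The downsampling step in the chunk above works by flooring each timestamp to the start of its bin and then averaging every sample in that bin. The following is a minimal base-R sketch of that idea on made-up data. It uses `aggregate()` in place of the `dplyr` pipeline, and `toy_rate` is a hypothetical value standing in for the project's `sampling_rate`, which is set elsewhere.

```r
# assumed value for illustration only; the project defines `sampling_rate` elsewhere
toy_rate = 10  # target samples per second

# toy series with irregular timestamps (made-up values)
toy = data.frame(t = c(0.01, 0.04, 0.09, 0.12, 0.18, 0.27),
                 accel = c(1, 2, 3, 4, 6, 8))

# floor each timestamp to the start of its (1 / toy_rate)-second bin
toy$t_bin = floor(toy$t * toy_rate) / toy_rate

# average all samples that fall into the same bin
toy_down = aggregate(accel ~ t_bin, data = toy, FUN = mean)
```

Because both participants' timestamps are floored to the same grid, an inner join on `t` afterwards keeps only the bins in which both participants have data.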

***

## Trim instructions from time series

After receiving instructions, both participants were asked to shake their heads repeatedly to indicate the start of the experiment. Here, we identify that head shake by looking for the first bouts of high-velocity movement and trim everything before it.

***

### Calculate first and second derivatives of Euclidean acceleration

Because participants were instructed to shake their heads to signal the start of the trial, we can identify that by looking at the derivatives of acceleration: jerk (first derivative of acceleration) and jounce (second derivative of acceleration). We calculate each here.

```{r trim-instructions, eval = FALSE}

# get acceleration derivatives
coords.deriv = coords %>% ungroup() %>%
  group_by(dyad, conv.num, conv.type, cond) %>%
  mutate(jerk0 = c(0,diff(euclid0) / diff(t))) %>% # jerk for p0
  mutate(jerk1 = c(0,diff(euclid1) / diff(t))) %>% # jerk for p1
  mutate(jounce0 = c(0,diff(jerk0) / diff(t))) %>% # jounce for p0
  mutate(jounce1 = c(0,diff(jerk1) / diff(t))) # jounce for p1

```
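To make the finite-difference logic concrete, here is a small sketch on a made-up acceleration series. It applies the same `diff(value) / diff(t)` pattern with a leading `0` as padding; the `toy_` names and values are illustrative only.

```r
# toy acceleration series sampled every 0.1 s (made-up values)
toy_t = seq(0, 0.4, by = 0.1)
toy_accel = c(1, 1, 2, 4, 4)

# jerk: change in acceleration per unit time, padded with a leading 0
toy_jerk = c(0, diff(toy_accel) / diff(toy_t))

# jounce: change in jerk per unit time, padded the same way
toy_jounce = c(0, diff(toy_jerk) / diff(toy_t))
```

In this toy series, jerk peaks where acceleration rises fastest, and the largest jounce magnitude falls where jerk changes most abruptly.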

***

### Identify cutoff points for movement

Allowing time for participant questions and any setup issues, the total instruction time preceding each conversation should have been approximately 60 to 120 seconds. We therefore check between 60 and 120 seconds of the data to identify likely beginning times using peak jerk and jounce.

```{r identify-cutoffs, eval = FALSE}

# identify our minimum and maximum possible instruction end times
min.check = 60
max.check = 120

# identify possible cutoff times using jerk
cutoff.points.jerk = coords.deriv %>% ungroup() %>%
  group_by(dyad,conv.num, conv.type, cond) %>%
  dplyr::filter((t > min.check) & (t < max.check)) %>%
  select(dyad, conv.num, conv.type, cond, t, jerk0, jerk1) %>%
  dplyr::summarize(cutoff = max(c(jerk0,jerk1))) %>%
  merge(.,coords.deriv) %>%
  dplyr::filter((jerk0 == cutoff) | (jerk1 == cutoff)) %>%
  select(-ends_with("0"),-ends_with("1"))

# identify possible cutoff times using jounce
cutoff.points.jounce = coords.deriv %>% ungroup() %>%
  group_by(dyad,conv.num, conv.type, cond) %>%
  dplyr::filter((t > min.check) & (t < max.check)) %>%
  select(dyad, conv.num, conv.type, cond, t, jounce0, jounce1) %>%
  dplyr::summarize(cutoff = max(c(jounce0,jounce1))) %>%
  merge(.,coords.deriv) %>%
  dplyr::filter((jounce0 == cutoff) | (jounce1 == cutoff)) %>%
  select(-ends_with("0"),-ends_with("1"))

```
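The cutoff search amounts to finding the time of the largest derivative value, across both participants, inside the 60 to 120 second window. The following base-R sketch shows that logic for a single made-up conversation. Unlike the chunk above, it matches the peak only against rows inside the window, and all `toy_` names and values are hypothetical.

```r
# toy jerk values for one conversation (made-up values)
toy = data.frame(t = c(30, 61, 75, 90, 119, 130),
                 jerk0 = c(9.0, 0.2, 3.1, 0.4, 0.3, 7.5),
                 jerk1 = c(8.0, 0.1, 0.5, 2.2, 0.2, 6.0))

# same window bounds as in the chunk above
toy_min_check = 60
toy_max_check = 120

# keep only samples inside the window
in_window = toy[toy$t > toy_min_check & toy$t < toy_max_check, ]

# find the largest jerk from either participant, then the time at which it occurred
toy_peak = max(c(in_window$jerk0, in_window$jerk1))
toy_cutoff = in_window$t[in_window$jerk0 == toy_peak | in_window$jerk1 == toy_peak]
```

The large values at 30 and 130 seconds are ignored because they fall outside the window.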

```{r invisible-store-jounce-and-jerk-values-for-table, echo=FALSE, eval = FALSE}

# invisible chunk: save file so that we can get the output without having to run everything
write.table(cutoff.points.jounce,'./data/DCC-cutoff_jounce.csv', sep=',',
            row.names=FALSE,col.names=TRUE)
write.table(cutoff.points.jerk,'./data/DCC-cutoff_jerk.csv', sep=',',
            row.names=FALSE,col.names=TRUE)

```

```{r invisible-load-jounce-and-jerk-values-for-table, eval=TRUE, echo=FALSE, warning = FALSE, error = FALSE, message = FALSE}

# invisible chunk: load file to correctly display correlation if everything else is `eval=FALSE`
source('./supplementary-code/libraries_and_functions-DCC.r')
cutoff.points.jounce = read.table('./data/DCC-cutoff_jounce.csv', sep=',', header = TRUE)
cutoff.points.jerk = read.table('./data/DCC-cutoff_jerk.csv', sep=',', header = TRUE)

```

```{r check-correlation-between-measures, eval = TRUE}

# are they correlated?
pander::pander(cor.test(cutoff.points.jounce$t,cutoff.points.jerk$t),
       style = "rmarkdown")

```

```{r plot-jounce-and-jerk, echo=FALSE, fig.width=3, fig.height=3, fig.align='center', eval = FALSE}

# create the plot
jounce.v.jerk = qplot(x = cutoff.points.jounce$t, 
                      y = cutoff.points.jerk$t) +
  xlab('Jounce') + ylab('Jerk') +
  ggtitle('Relation of Jerk and Jounce') +
  geom_abline()

# save a high-resolution version of the plot
ggsave(plot = jounce.v.jerk,
       height = 3,
       width = 3,
       filename = './figures/DCC-jounce_v_jerk.jpg')

# save a smaller version of the plot for knitr
ggsave(plot = jounce.v.jerk,
       height = 3,
       width = 3,
       dpi=100,
       filename = './figures/DCC-jounce_v_jerk-knitr.jpg')

```

![**Figure**. Cutoff times identified by jounce and by jerk for each conversation of each dyad, with an identity line (slope of 1) for reference.](./figures/DCC-jounce_v_jerk-knitr.jpg)

Jerk and jounce are significantly correlated, so we'll use the more conservative measure (i.e., tending to identify later points). We then remove everything before that cutoff point from the analyzed dataset.

```{r trim-data, eval = FALSE}

# identify which tends to be more conservative
cutoff_test = cutoff.points.jerk$t - cutoff.points.jounce$t
conservative_jerk = length(cutoff_test[cutoff_test>=0])
conservative_jounce = length(cutoff_test[cutoff_test<0])
if( conservative_jerk >= conservative_jounce ){
  cutoff.points = plyr::rename(cutoff.points.jerk,c('t'='cutoff.t'))
}else{
  cutoff.points = plyr::rename(cutoff.points.jounce,c('t'='cutoff.t'))
}

# merge with the rest of the dataframe, trim instruction period, drop unneeded variables, and filter
coords.trimmed = coords.deriv %>% ungroup() %>%
  merge(., cutoff.points, 
        by = c('dyad','conv.num','conv.type','cond')) %>%
  group_by(dyad, conv.num, conv.type, cond) %>%
  dplyr::filter(t > unique(cutoff.t)) %>%
  select(-one_of('cutoff.t','cutoff'), 
         -matches('jerk'), 
         -matches('jounce'))

```

***

## Save trimmed data to file

```{r save-trimmed-data, eval = FALSE}

write.table(coords.trimmed,'./data/DCC-trimmed-data.csv', sep=',',
            row.names=FALSE,col.names=TRUE)

```

***

## Summary statistics on conversation lengths

```{r invisible-load-trimmed-for-stats, echo = FALSE}

# invisible chunk: load in the conversation data to correctly display table if everything else is `eval=FALSE`
coords.trimmed = read.table('./data/DCC-trimmed-data.csv', sep=',',header=TRUE)

```


```{r summary-conv-stats, eval = TRUE}

# calculate the duration of each conversation for each dyad
interaction.time = coords.trimmed %>%
  dplyr::group_by(dyad, conv.type) %>%
  dplyr::summarize(duration = max(t) - min(t))

# what's the mean length of conversation data (in seconds)?
mean(interaction.time$duration)

# what's the range of conversation data (in seconds)?
range(interaction.time$duration)

```

```{r plot-recorded-lengths, echo=FALSE, fig.width=3, fig.height=3, fig.align='center', eval = FALSE}

# create the plot
bin_width = 20
recorded.lengths = qplot(round(interaction.time$duration),
                         geom='histogram',binwidth=bin_width) +
  geom_histogram(aes(fill = ..count..),binwidth=bin_width) +
  scale_fill_viridis(discrete=FALSE) +
  xlab('Recorded Length (sec)') + ylab('Frequency') +
  ggtitle('Histogram of Recording Lengths\nfor All Included Conversations') +
  labs(fill="Freq.")

# save a high-resolution version of the plot
ggsave(plot = recorded.lengths,
       height = 3,
       width = 3,
       filename = './figures/DCC-recorded_lengths.jpg')

# save a smaller version of the plot for knitr
ggsave(plot = recorded.lengths,
       height = 3,
       width = 3,
       dpi=100,
       filename = './figures/DCC-recorded_lengths-knitr.jpg')

```

![**Figure**. Histogram of all recorded conversation lengths within the dataset.](./figures/DCC-recorded_lengths-knitr.jpg)

***

# Recurrence analyses

**NOTE**: The chunks of code in this section do not have to be run each time, since the resulting datasets will be saved to CSV files. As a result, these chunks are currently set to `eval=FALSE`. Bear this in mind if these data need to be re-calculated.

***

## Preliminaries

This section clears the workspace and reads in the prepared data files.

```{r recurrence-prelims, warning = FALSE, error = FALSE, message = FALSE, eval = FALSE}

# clear our workspace
rm(list=ls())

# read in libraries and create functions
source('./supplementary-code/libraries_and_functions-DCC.r')

# load data
coords = read.table('./data/DCC-trimmed-data.csv', sep=',', header = TRUE)

```

***

## Identify CRQA parameters

Before we can analyze the data, we need to identify the appropriate parameters for continuous CRQA for the dataset. We identify parameters that provide a steady *rate of recurrence* or *RR* of 5% for each conversation of each dyad and save these parameters to a CSV file.

The source file produces outputs that are useful for tracking progress, but we suppress them here for brevity.

```{r crqa-parameters, echo = FALSE, warning = FALSE, error = FALSE, message = FALSE, eval = FALSE}

# run CRQA parameters
source('./supplementary-code/continuous_rqa_parameters-DCC.r')

```
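The parameter search itself lives in `continuous_rqa_parameters-DCC.r` and is not shown here. As a simplified illustration of what targeting a 5% recurrence rate means, the sketch below scans candidate radii over a cross-distance matrix built from two made-up series. It skips delay and embedding entirely, so it is an assumption-laden toy rather than the project's procedure; every name in it is hypothetical.

```r
set.seed(1)

# two made-up, loosely coupled series
toy_time = seq(0, 8 * pi, length.out = 200)
toy_a = sin(toy_time) + rnorm(200, sd = 0.1)
toy_b = sin(toy_time + 0.5) + rnorm(200, sd = 0.1)

# distance between every sample of toy_a and every sample of toy_b (no embedding)
cross_dist = abs(outer(toy_a, toy_b, "-"))

# recurrence rate (in percent) for a given radius
rr_for_radius = function(radius) 100 * mean(cross_dist <= radius)

# scan candidate radii and keep the one whose RR is closest to the 5% target
candidate_radii = seq(0.001, 1, by = 0.001)
rr_values = sapply(candidate_radii, rr_for_radius)
chosen_radius = candidate_radii[which.min(abs(rr_values - 5))]
```

Recurrence rate can only grow as the radius grows, which is why a simple scan (or a bisection search) is enough to land near the target.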

***

## Prepare for CRQA and DRPs

```{r ready-for-crqa-drps, eval = FALSE}

# read in our chosen parameters
crqa_params = read.table('./data/crqa_data_and_parameters-DCC.csv',
                         sep=',',header=TRUE)

# grab only the parameters we need
crqa_params = crqa_params %>%
  select(-matches('euclid')) %>%
  distinct()

# rescale the data (by mean)
coords_crqa = coords %>% ungroup() %>%
  group_by(dyad,conv.num) %>%
  mutate(rescale.euclid0 = euclid0/mean(euclid0)) %>%
  mutate(rescale.euclid1 = euclid1/mean(euclid1)) %>%
  select(-matches('jounce'))

# fold in our CRQA parameter information
coords_crqa = plyr::join(x=crqa_params,y=coords_crqa,
                         by=c('dyad'='dyad',
                              'conv.num'='conv.num'))

# slice up the data so that we have one dataset per conversation
split_convs = split(coords_crqa,
                    list(coords_crqa$dyad, coords_crqa$conv.num))

```

***

## Run CRQA and DRPs

Now that we have our parameters, we run continuous CRQA over each conversation for each dyad using the `crqa` function from the `crqa` package (Coco & Dale, 2014, *Frontiers in Psychology*).

```{r run-crqa-using-parameters, eval = FALSE}

# identify window size
target_seconds = 5
win_size = target_seconds * sampling_rate

# cycle through each conversation using the sliced subsets
drp_results = data.frame()
crqa_results = data.frame()
for (next_conv in split_convs){
  
  # isolate parameters for this next dyad
  chosen.delay = unique(next_conv$chosen.delay)
  chosen.embed = unique(next_conv$chosen.embed)
  chosen.radius = unique(next_conv$chosen.radius)
  
  # # print update
  # print(paste("CRQA: Dyad ", unique(next_conv$dyad),
  #             ", conversation ",unique(next_conv$conv.num),
  #             sep=""))
  
  # run cross-recurrence
  rec_analysis = crqa(ts1=next_conv$rescale.euclid0,
                      ts2=next_conv$rescale.euclid1,
                      delay=chosen.delay,
                      embed=chosen.embed,
                      r=chosen.radius,
                      normalize=0, 
                      rescale=0, 
                      mindiagline=2,
                      minvertline=2, 
                      tw=0, 
                      whiteline=FALSE,
                      recpt=FALSE)
  
  # save plot-level information to dataframe
  dyad_num = unique(next_conv$dyad)
  next_data_line = data.frame(c(dyad_num,
                                unique(next_conv$conv.type),
                                rec_analysis[1:9]))
  names(next_data_line) = c("dyad",'conv.type',names(rec_analysis[1:9]))
  crqa_results = rbind.data.frame(crqa_results,next_data_line)

  # recreate DRP from diagonal lines within our target window
  diag_lines = spdiags(rec_analysis$RP)
  subset_plot = data.frame(diag_lines$B[,diag_lines$d >= -win_size & diag_lines$d <= win_size])
  rr = colSums(subset_plot)/dim(subset_plot)[1]

  # convert to dataframe, padding (with 0) where no RR was observed
  next_drp = dplyr::full_join(data.frame(lag = as.integer(stringr::str_replace(names(rr),'X',''))-(win_size+1),
                                  rr = rr),
                       data.frame(lag = -win_size:win_size),
                       by='lag')
  next_drp[is.na(next_drp)] = 0

  # save it to dataframe
  next_drp$dyad = dyad_num
  next_drp$conv.type = unique(next_conv$conv.type)
  drp_results = rbind.data.frame(drp_results,next_drp)
}

# save results to file
write.table(crqa_results,'./data/crqa_results-DCC.csv',sep=",")
write.table(drp_results,'./data/drp_results-DCC.csv',sep=',')

```
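A diagonal recurrence profile (DRP) summarizes how much recurrence sits on each diagonal of the cross-recurrence plot within a window of lags around the main diagonal. The sketch below builds one from a small made-up binary matrix using base R only. Dividing by the full series length rather than by each diagonal's own length is an assumption about how the chunk above normalizes, and the sign convention for `lag` here (column index minus row index) is likewise an assumption.

```r
set.seed(42)

# toy binary cross-recurrence matrix (made-up values)
toy_rp = matrix(rbinom(100, 1, 0.2), nrow = 10)
diag(toy_rp) = 1  # force full recurrence at lag 0 for illustration

# half-width of the lag window, in samples (hypothetical value)
toy_win = 3
toy_n = nrow(toy_rp)

# offset of each cell from the main diagonal (column index minus row index)
toy_offsets = col(toy_rp) - row(toy_rp)

# recurrence on each diagonal within the window, divided by the series length
toy_drp = data.frame(lag = -toy_win:toy_win)
toy_drp$rr = sapply(toy_drp$lag,
                    function(k) sum(toy_rp[toy_offsets == k]) / toy_n)
```

Lags with no recurrent points simply receive `rr = 0`, which plays the same role as the zero-padding step in the chunk above.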

***

## Export merged recurrence dataset

```{r merge-recurrence-datasets, eval = FALSE}

# merge CRQA and DRP analysis results
recurrence_results = plyr::join(drp_results, crqa_results,
                                by=c('dyad','conv.type'))

# grab information about experiment condition
additional_dyad_info = coords %>% ungroup() %>%
  select(dyad,conv.num,conv.type,cond) %>% distinct()

# merge recurrence analyses and condition information
recurrence_df = plyr::join(recurrence_results, additional_dyad_info,
           by=c('dyad','conv.type'))

# save to file
write.table(recurrence_df,'./data/recurrence_df-DCC.csv',sep=',')

```

***

# Data preparation

Now that we've calculated our CRQA and DRP measures, we're ready to prepare our data for analysis.

**NOTE**: The chunks of code in this section do not have to be run each time, since the resulting datasets will be saved to CSV files. As a result, these chunks are currently set to `eval=FALSE`. Bear this in mind if these data need to be re-calculated.

***

## Preliminaries

This section clears the workspace and reads in the prepared data files.

```{r prep-prelim, warning=FALSE, error=FALSE, message=FALSE, eval = FALSE}

# clear our workspace
rm(list=ls())

# read in libraries and create functions
source('./supplementary-code/libraries_and_functions-DCC.r')

# read in the recurrence dataframe
recurrence_df = read.table('./data/recurrence_df-DCC.csv',sep=',',header=TRUE)

```

***

## Create first- and second-order polynomials

To examine the linear and curvilinear patterns in the DRPs (cf. Main, Paxton, & Dale, 2016, *Emotion*), we create orthogonal polynomials for the lag term. This section creates the first- and second-order orthogonal polynomials, which allow us to interpret the linear (i.e., first-order polynomial) and quadratic (i.e., second-order polynomial) patterns in the DRP independently of one another.

```{r create-polynomials, eval = FALSE}

# create first- and second-order orthogonal polynomials for lag
raw_lag = min(recurrence_df$lag):max(recurrence_df$lag)
lag_vals = data.frame(raw_lag)
lag_offset = (0-min(raw_lag)) + 1
t = stats::poly((raw_lag + lag_offset), 2)
lag_vals[, paste("ot", 1:2, sep="")] = t[lag_vals$raw_lag + lag_offset, 1:2]

# join it to the original data table
recurrence_df = left_join(recurrence_df,lag_vals, by = c("lag" = "raw_lag"))

```
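To see why orthogonal polynomials matter, compare raw and orthogonal terms on a small made-up lag vector. Raw linear and quadratic terms are strongly correlated with each other, whereas the columns returned by `stats::poly()` are uncorrelated. The `toy_` names are illustrative only.

```r
# toy lag values, already shifted to be positive as in the chunk above
toy_lag = 1:11

# raw linear and quadratic terms are strongly correlated
raw_cor = cor(toy_lag, toy_lag^2)

# orthogonal first- and second-order polynomials are uncorrelated
toy_ot = stats::poly(toy_lag, 2)
orth_cor = cor(toy_ot[, 1], toy_ot[, 2])
```

With uncorrelated terms, the estimate for the quadratic pattern does not shift depending on whether the linear term is in the model.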


***

## Create interaction terms

Because we will be providing both standardized and raw models, we create all interaction terms here. For simplicity, we rename the `conv.type` variable to `convers` and `cond` to `condition`. Additionally, because we will be manually creating all interaction terms, we code `condition` and `convers` with levels `-0.5` and `0.5`; this ensures that we have nonzero values for interaction terms in the affiliative (`convers = 0`) and noise (`condition = 0`) cases.

```{r create-interactions, eval = FALSE}

# rename variables and center the binary variables
recurrence_df = recurrence_df %>% ungroup() %>%
  plyr::rename(.,
               c("conv.type"="convers",
                 "cond"="condition")) %>%
  mutate(condition = condition-.5) %>%
  mutate(convers = convers-.5) %>%
  mutate(condition.convers = condition * convers) %>%

  # first-order polynomials
  mutate(condition.ot1 = condition * ot1) %>%
  mutate(convers.ot1 = convers * ot1) %>%
  mutate(condition.convers.ot1 = condition * convers * ot1) %>%

  # second-order polynomials
  mutate(condition.ot2 = condition * ot2) %>%
  mutate(convers.ot2 = convers * ot2) %>%
  mutate(condition.convers.ot2 = condition * convers * ot2) %>%

  # polynomial interactions
  mutate(ot1.ot2 = ot1 * ot2) %>%
  mutate(condition.ot1.ot2 = condition * ot1 * ot2) %>%
  mutate(convers.ot1.ot2 = convers * ot1 * ot2) %>%
  mutate(condition.convers.ot1.ot2 = condition * convers * ot1 * ot2)

```
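The effect of centering on hand-built interaction terms is easiest to see on the four design cells. In the sketch below (toy data, hypothetical column names), the product of 0/1 codes is zero in three of the four cells, while the product of -0.5/0.5 codes is nonzero everywhere.

```r
# the four cells of the 2 x 2 design, using the original 0/1 coding
toy = expand.grid(convers = c(0, 1), condition = c(0, 1))

# interaction built from 0/1 codes: zero whenever either variable is 0
toy$dummy_interaction = toy$convers * toy$condition

# interaction built from centered codes: nonzero in every cell
toy$centered_interaction = (toy$convers - .5) * (toy$condition - .5)
```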

***

## Create standardized dataframe

Let's create a new dataframe with all standardized variables. This allows us to interpret the resulting values as effect sizes (see Keith, 2005, *Multiple regression and beyond*).

```{r standardized-dataframe, eval = FALSE}

# standardize all variables
rec_st = mutate_each(recurrence_df,funs(as.numeric(scale(.))))

```
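As a brief illustration of why standardized estimates can be read as effect sizes, the sketch below standardizes a made-up two-variable dataset and fits a simple regression. With one predictor, the standardized slope equals the Pearson correlation. This uses base R `lapply()` rather than `mutate_each()`, and all names and values are hypothetical.

```r
set.seed(7)

# toy data with a known linear relation (made-up values)
toy = data.frame(x = rnorm(50, mean = 10, sd = 3))
toy$y = 2 * toy$x + rnorm(50, sd = 4)

# standardize every column to mean 0 and standard deviation 1
toy_st = as.data.frame(lapply(toy, function(v) as.numeric(scale(v))))

# slope from the standardized data
st_slope = coef(lm(y ~ x, data = toy_st))[["x"]]
```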

***

## Export analysis and plotting dataframes

```{r export-analysis-and-plotting, eval = FALSE}

# export plotting dataframe
write.table(recurrence_df,'./data/plotting_df-DCC.csv',row.names=FALSE,sep=',')

# export standardized analysis dataframe
write.table(rec_st,'./data/analysis_df-DCC.csv',row.names=FALSE,sep=',')

```

***

# Data analysis

All data have been cleaned, all parameters have been identified, and all final data preparation has been finished. Using the analysis-ready dataframe (`rec_st`) and plotting dataframe (`rec_plot`), we now analyze our data and generate visualizations.

***

## Preliminaries

This section clears the workspace and reads in the prepared data files.

```{r analysis-prelim, warning=FALSE, error=FALSE, message=FALSE}

# clear our workspace
rm(list=ls())

# read in libraries and create functions
source('./supplementary-code/libraries_and_functions-DCC.r')

# read in the plotting and analysis recurrence dataframes
rec_st = read.table('./data/analysis_df-DCC.csv',sep=',',header=TRUE)
rec_plot = read.table('./data/plotting_df-DCC.csv',sep=',',header=TRUE)

```

***

## Recurrence by lag, conversation type, and condition

We now create a linear mixed-effects model to gauge how linear lag (`ot1`) and quadratic lag (`ot2`) interact with conversation type (`convers`) and task (`condition`) to influence head movement coordination (`rr`). We present both standardized and raw models below.

```{r central-model, warning=FALSE, error=FALSE, message=FALSE}

# standardized maximal random-effects model
rec_convers_condition_gca_st = lmer(rr ~ convers + condition + ot1 + ot2 +
                                       condition.convers + ot1.ot2 +
                                       convers.ot1 + condition.ot1 + condition.convers.ot1 +
                                       convers.ot2 + condition.ot2 + condition.convers.ot2 +
                                       convers.ot1.ot2 + condition.ot1.ot2 + condition.convers.ot1.ot2 +
                                       (1 + ot1 + ot2 + convers + condition.convers.ot1 | conv.num) + 
                                       (1 + ot1 + ot2 + convers + condition.convers.ot1 | dyad), 
                                    data=rec_st, REML=FALSE)
invisible(pander_lme_to_latex(rec_convers_condition_gca_st,'standardized_model_latex-DCC.tex'))
pander_lme(rec_convers_condition_gca_st,stats.caption = TRUE)

# raw maximal random-effects model
rec_convers_condition_gca_raw = lmer(rr ~ convers + condition + ot1 + ot2 +
                                       condition.convers + ot1.ot2 +
                                       convers.ot1 + condition.ot1 + condition.convers.ot1 +
                                       convers.ot2 + condition.ot2 + condition.convers.ot2 +
                                       convers.ot1.ot2 + condition.ot1.ot2 + condition.convers.ot1.ot2 +
                                       (1 + ot1 + ot2 + convers + condition.convers.ot1 | conv.num) +
                                       (1 + ot1 + ot2 + convers + condition.convers.ot1 | dyad),
                                     data=rec_plot,REML=FALSE)
pander_lme(rec_convers_condition_gca_raw,stats.caption = TRUE)

```

***

### Comparing maximal random-effects model to random-intercepts-only model

We next check whether our maximal random-effects model (above) provides a better fit to the data than a model with only random intercepts for `dyad` and `conv.num` that is otherwise identical. We present analyses of both standardized and raw datasets.

The maximal random-effects model (i.e., the model with the maximal permissible random-slope structure for the `dyad` and `conv.num` random intercepts, identified using backwards selection; cf. Barr et al., 2013, *Journal of Memory and Language*) fits the data significantly better than the random-intercepts-only model. Plots of the residuals of both the maximal and random-intercepts-only models suggest that the maximal model better meets the assumptions of homogeneity and normality of residuals than the random-intercepts-only model.

```{r checking-maximal-standardized, warning=FALSE, error=FALSE, message=FALSE}

# standardized random-intercepts-only model
rec_convers_condition_gca_st_rio = lmer(rr ~ convers + condition + ot1 + ot2 +
                                      condition.convers + ot1.ot2 +
                                      convers.ot1 + condition.ot1 + condition.convers.ot1 +
                                      convers.ot2 + condition.ot2 + condition.convers.ot2 +
                                      convers.ot1.ot2 + condition.ot1.ot2 + condition.convers.ot1.ot2 +
                                      (1 | conv.num) + (1 | dyad), data=rec_st, REML=FALSE)

# is the maximal random-effects model a better fit than the random-intercepts-only model? (standardized)
pander_anova(anova(rec_convers_condition_gca_st_rio,rec_convers_condition_gca_st,test="Chisq"))

```

```{r checking-maximal-raw, warning=FALSE, error=FALSE, message=FALSE}

# raw random-intercepts-only model
rec_convers_condition_gca_raw_rio = lmer(rr ~ convers + condition + ot1 + ot2 +
                                       condition.convers + ot1.ot2 +
                                       convers.ot1 + condition.ot1 + condition.convers.ot1 +
                                       convers.ot2 + condition.ot2 + condition.convers.ot2 +
                                       convers.ot1.ot2 + condition.ot1.ot2 + condition.convers.ot1.ot2 +
                                       (1 | conv.num) + (1 | dyad), data=rec_plot,REML=FALSE)

# is the maximal random-effects model a better fit than the random-intercepts-only model? (raw)
pander_anova(anova(rec_convers_condition_gca_raw_rio,rec_convers_condition_gca_raw,test='Chisq'))

```
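Calling `anova()` on two nested models fit with `REML=FALSE` compares them with a likelihood-ratio test. To show the arithmetic behind that test without depending on `lme4`, the sketch below computes it by hand for two nested `lm()` fits on made-up data. It is an illustration of the test statistic only, not a re-analysis of the models above, and every name in it is hypothetical.

```r
set.seed(3)

# toy data in which both predictors matter (made-up values)
toy = data.frame(x1 = rnorm(100), x2 = rnorm(100))
toy$y = 1 + 0.5 * toy$x1 + 0.8 * toy$x2 + rnorm(100)

# nested models: the reduced model omits x2
toy_reduced = lm(y ~ x1, data = toy)
toy_full = lm(y ~ x1 + x2, data = toy)

# likelihood-ratio statistic: twice the gain in log-likelihood
lr_stat = as.numeric(2 * (logLik(toy_full) - logLik(toy_reduced)))

# degrees of freedom: difference in the number of estimated parameters
df_diff = attr(logLik(toy_full), "df") - attr(logLik(toy_reduced), "df")

# compare against a chi-squared distribution
p_value = pchisq(lr_stat, df = df_diff, lower.tail = FALSE)
```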

```{r plot-standardized-residuals, echo = FALSE, eval = FALSE}

# plot residuals for random-intercepts-only model
rio_residuals = qplot(x = fitted(rec_convers_condition_gca_st_rio), 
                      y = resid(rec_convers_condition_gca_st_rio)) +
  geom_hline(aes(yintercept=0)) + geom_smooth() + 
  scale_color_viridis() +
  coord_cartesian(xlim = c(-4,4), ylim = c(-2,3)) +
  ylab('Residuals') + xlab('Fitted') + ggtitle('Random-Intercepts-Only')

# plot residuals for maximal model
max_residuals = qplot(x = fitted(rec_convers_condition_gca_st), 
                      y = resid(rec_convers_condition_gca_st)) +
  geom_hline(aes(yintercept=0)) + geom_smooth() + 
  coord_cartesian(xlim = c(-4,4), ylim = c(-2,3)) +
  ylab('Residuals') + xlab('Fitted') + ggtitle('Maximal')

# plot and save
ggsave('./figures/DCC-standardized_residuals_plot.png',
       units = "in", width = 5, height = 3, dpi=100,
       grid.arrange(
         top=textGrob("Homogeneity of Standardized Models",
                      gp=gpar(fontsize=14)),
         rio_residuals,
         max_residuals,
         nrow = 1
       ))

```

![**Figure**. Residuals plots using standardized dataset. Residuals are more evenly distributed around 0 for the maximal model (right) than the random-intercepts-only model (left), meeting the assumption of homogeneity of residuals.](./figures/DCC-standardized_residuals_plot.png)

```{r hist-standardized-residuals, echo = FALSE, eval = FALSE}

# set bin width
bin_width = .1
y_max = 900

# plot histogram of residuals for random-intercepts-only model
rio_residuals = qplot(resid(rec_convers_condition_gca_st_rio),
                      geom='histogram',binwidth=bin_width) +
  geom_histogram(aes(fill = ..density..),binwidth=bin_width) +
  ylab('Frequency') + xlab('Residuals') +
  scale_fill_viridis(discrete=FALSE,name='Density') +
  coord_cartesian(xlim = c(-3,3), ylim = c(0,y_max)) + 
  theme(legend.position='bottom') +
  ggtitle('Random-Intercepts-Only')
 
# plot histogram of residuals for maximal model
max_residuals = qplot(resid(rec_convers_condition_gca_st),
                      geom='histogram',binwidth=bin_width) +
  geom_histogram(aes(fill = ..density..),binwidth=bin_width) +
  ylab('Frequency') + xlab('Residuals') +
  scale_fill_viridis(discrete=FALSE,name='Density') +
  coord_cartesian(xlim = c(-3,3), ylim = c(0,y_max)) + 
  theme(legend.position='bottom') +
  ggtitle('Maximal')

# plot and save
ggsave('./figures/DCC-standardized_residuals_hist.png',
       units = "in", width = 6, height = 3, dpi=100,
       grid.arrange(
         top=textGrob("Normality of Standardized Models",
                      gp=gpar(fontsize=14)),
         rio_residuals,
         max_residuals,
         nrow = 1
       ))

```

![**Figure**. Frequency density plots of residuals using standardized dataset. Residuals for the maximal model (right) are more normally distributed than the random-intercepts-only model (left), meeting the assumption of normality of residuals.](./figures/DCC-standardized_residuals_hist.png)

***

## Exploring interaction terms

Let's do some investigations into the significant four-way interaction. After inspecting the interaction plot (see Discussion section), we choose to divide the data by conversation type, exploring whether we still see significant differences in synchrony by task condition when we examine the conversations separately.

***

### Prepare separate datasets for each conversation

We create raw and standardized datasets for each conversation type here.

```{r prep-post-hoc-data}

# select only the friendly conversations (convers = -.5)
aff_only_raw = rec_plot %>%
  dplyr::filter(convers < 0)

# restandardize friendly conversation data
aff_only_st = aff_only_raw %>%
  mutate_each(.,funs(as.numeric(scale(.)))) %>%
  mutate(convers = -.5)

# select only the arguments (convers = .5)
arg_only_raw = rec_plot %>%
  dplyr::filter(convers > 0)

# restandardize argument data
arg_only_st = arg_only_raw %>%
  mutate_each(.,funs(as.numeric(scale(.)))) %>%
  mutate(convers = .5)

```

***

### Post-hoc interaction tests

Do we still see significant differences by condition in each conversation?

```{r aff-post-hoc-test}

# check for differences in the friendly conversation (raw)
cond_aff_gca_raw = lmer(rr ~ condition +
                      ot1 + ot2 + ot1.ot2 +
                      condition.ot1 + condition.ot2 + condition.ot1.ot2 +
                      (1 + ot1 + ot2 | conv.num) +
                      (1 + ot1 + ot2 | dyad),
                    data=aff_only_raw,REML=FALSE)
pander_lme(cond_aff_gca_raw, stats.caption=TRUE)

# check for differences in the friendly conversation (standardized)
cond_aff_gca_st = lmer(rr ~ condition +
                      ot1 + ot2 + ot1.ot2 +
                      condition.ot1 + condition.ot2 + condition.ot1.ot2 +
                      (1 + ot1 + ot2 | conv.num) +
                      (1 + ot1 + ot2 | dyad),
                    data=aff_only_st,REML=FALSE)
invisible(pander_lme_to_latex(cond_aff_gca_st,'aff_posthoc_latex-DCC.tex'))
pander_lme(cond_aff_gca_st, stats.caption=TRUE)

```

```{r arg-post-hoc-test}

# check for differences in the argumentative conversation (raw)
cond_arg_gca_raw = lmer(rr ~ condition +
                          ot1 + ot2 + ot1.ot2 +
                          condition.ot1 + condition.ot2 + condition.ot1.ot2 +
                          (1 + ot1 + ot2 | conv.num) +
                          (1 + ot1 + ot2 | dyad),
                        data=arg_only_raw,REML=FALSE)
pander_lme(cond_arg_gca_raw, stats.caption=TRUE)

# check for differences in the argumentative conversation (standardized)
cond_arg_gca_st = lmer(rr ~ condition +
                         ot1 + ot2 + ot1.ot2 +
                         condition.ot1 + condition.ot2 + condition.ot1.ot2 +
                         (1 + ot1 + ot2 | conv.num) +
                         (1 + ot1 + ot2 | dyad),
                       data=arg_only_st,REML=FALSE)
invisible(pander_lme_to_latex(cond_arg_gca_st,'arg_posthoc_latex-DCC.tex'))
pander_lme(cond_arg_gca_st, stats.caption=TRUE)

```

***

# Surrogate analysis: Phase-randomized baseline

To explore how these patterns differ from the amount of coordination that might be expected by chance, we here re-create the above models using surrogate time series that have been phase-randomized to disrupt the moment-to-moment structure of the real data.

***

## Surrogate data preparation

***

### Preliminaries

```{r surrogate-prep-prelim, warning=FALSE, error=FALSE, message=FALSE, eval = FALSE}

# clear our workspace
rm(list=ls())

# read in libraries and create functions
source('./supplementary-code/libraries_and_functions-DCC.r')

# load real data and CRQA parameters
coords = read.table('./data/DCC-trimmed-data.csv', sep=',', header = TRUE)
crqa_params = read.table('./data/crqa_data_and_parameters-DCC.csv',
                         sep=',',header=TRUE)

```

***

### Load CRQA parameters and specify chunks

Load in the CRQA parameters specified for the real data and prepare the dataset for analysis.

```{r surrogate-prep-for-crqa-drps, eval = FALSE}

# grab only the parameters we need
crqa_params = crqa_params %>%
  select(-matches('euclid')) %>%
  distinct()

# rescale the data (by mean)
coords_crqa = coords %>% ungroup() %>%
  group_by(dyad,conv.num) %>%
  mutate(rescale.euclid0 = euclid0/mean(euclid0)) %>%
  mutate(rescale.euclid1 = euclid1/mean(euclid1)) %>%
  select(-matches('jounce'))

# merge with parameters
coords_crqa = plyr::join(x = crqa_params,
                         y = coords_crqa,
                         by = c('dyad', 'conv.num'))

# slice up the data so that we have one dataset per conversation
split_convs = split(coords_crqa,
                    list(coords_crqa$dyad, coords_crqa$conv.num))

```
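To make the rescaling and slicing steps concrete, here is a small non-evaluated sketch on hypothetical toy data (the `toy` dataframe and its values are invented for illustration). It uses base R's `ave` in place of the `group_by`/`mutate` pipeline, which is an assumption of this sketch rather than the approach used above.

```r
# hypothetical toy data: 2 dyads x 2 conversations, 3 samples each
toy = data.frame(dyad = rep(c(1, 2), each = 6),
                 conv.num = rep(rep(c(1, 2), each = 3), times = 2),
                 euclid0 = c(1, 2, 3, 2, 4, 6, 5, 5, 5, 1, 1, 4))

# rescale by the mean within each dyad-by-conversation cell
toy$rescale.euclid0 = ave(toy$euclid0, toy$dyad, toy$conv.num,
                          FUN = function(v) v / mean(v))

# one dataframe per conversation, named "<dyad>.<conv.num>"
split_toy = split(toy, list(toy$dyad, toy$conv.num))

# after rescaling, every cell has a mean of 1
cell_means = sapply(split_toy, function(d) mean(d$rescale.euclid0))
```

Each element of `split_toy` can then be handled independently in a loop, as is done for `split_convs` in the next section.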

***

### Calculate CRQA and DRPs for surrogate data

We create surrogate movement data for each participant through phase-randomization, implemented using the `FFTsurrogate` function in the `nonlinearTseries` package. We then calculate CRQA over those surrogate data in order to create a baseline value of synchrony between participants.
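For intuition, the core idea of phase randomization can be sketched in a few lines of base R: take the Fourier transform, keep each frequency's amplitude, replace the phases with random ones (keeping conjugate symmetry so the result is real-valued), and transform back. This is a minimal illustration under our own assumptions, not the `nonlinearTseries` implementation; the helper name `phase_randomize` is hypothetical.

```r
# hypothetical helper: phase-randomize a numeric vector
phase_randomize = function(x) {
  n = length(x)
  fx = fft(x)

  # number of positive-frequency bins (excluding the mean and Nyquist terms)
  half = floor((n - 1) / 2)

  # random phase shifts, mirrored with opposite sign on the negative frequencies
  rot = rep(0, n)
  if (half > 0) {
    phases = runif(half, 0, 2 * pi)
    rot[2:(half + 1)] = phases
    rot[n:(n - half + 1)] = -phases
  }

  # rotate the phases and return to the time domain
  Re(fft(fx * exp(1i * rot), inverse = TRUE)) / n
}

# toy example: an autocorrelated series and its surrogate
set.seed(111)
x = cumsum(rnorm(64))
x_surrogate = phase_randomize(x)
```

Because only the phases change, the surrogate keeps the amplitude spectrum (and therefore the autocorrelation) of the original series while scrambling when individual events occur.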

```{r surrogate-crqa-calculation, eval = FALSE}

# identify window size and set seed (for reproducibility)
target_seconds = 5
win_size = target_seconds * sampling_rate
set.seed(111)

# cycle through each conversation using the sliced subsets
drp_results = data.frame()
crqa_results = data.frame()
for (next_conv in split_convs){

  # isolate parameters for next dyad
  chosen.delay = unique(next_conv$chosen.delay)
  chosen.embed = unique(next_conv$chosen.embed)
  chosen.radius = unique(next_conv$chosen.radius)
  dyad_num = unique(next_conv$dyad)

  # # print update
  # print(paste("Surrogate CRQA: Dyad ", unique(next_conv$dyad),
  #             ", conversation ",unique(next_conv$conv.num),
  #             sep=""))
  
  # re-create the surrogate dyad and run again
  for (run in 1:10){

    # create phase-randomized baseline for each participant
    shuffle0 = nonlinearTseries::FFTsurrogate(next_conv$rescale.euclid0, n.samples = 1)
    shuffle1 = nonlinearTseries::FFTsurrogate(next_conv$rescale.euclid1, n.samples = 1)

    # run cross-recurrence
    rec_analysis = crqa(ts1=shuffle0,
                        ts2=shuffle1,
                        delay=chosen.delay,
                        embed=chosen.embed,
                        r=chosen.radius,
                        normalize=0,
                        rescale=0,
                        mindiagline=2,
                        minvertline=2,
                        tw=0,
                        whiteline=FALSE,
                        recpt=FALSE)

    # save plot-level information to dataframe
    next_data_line = data.frame(c(run,
                                  dyad_num,
                                  unique(next_conv$conv.type),
                                  rec_analysis[1:9]))
    names(next_data_line) = c('run',"dyad",'conv.type',names(rec_analysis[1:9]))
    crqa_results = rbind.data.frame(crqa_results,next_data_line)

    # recreate DRP from diagonal lines within our target window
    diag_lines = spdiags(rec_analysis$RP)
    subset_plot = data.frame(diag_lines$B[,diag_lines$d >= -win_size & diag_lines$d <= win_size])
    rr = colSums(subset_plot)/dim(subset_plot)[1]

    # convert to dataframe, padding (with 0) where no RR was observed
    next_drp = full_join(data.frame(lag = as.integer(stringr::str_replace(names(rr),'X',''))-(win_size+1),
                                    rr = rr),
                         data.frame(lag = -win_size:win_size),
                         by='lag')
    next_drp[is.na(next_drp)] = 0

    # save it to dataframe
    next_drp$dyad = dyad_num
    next_drp$conv.type = unique(next_conv$conv.type)
    next_drp$run = run
    drp_results = rbind.data.frame(drp_results,next_drp)
  }}

# merge CRQA and DRP analysis results
recurrence_results = plyr::join(drp_results, crqa_results,
                                by=c('dyad',
                                     'conv.type',
                                     'run'))

# grab information about experiment condition
additional_dyad_info = coords %>% ungroup() %>%
  select(dyad,conv.num,conv.type,cond) %>% distinct()

# merge recurrence analyses and condition information
surrogate_recurrence_df = plyr::join(recurrence_results, additional_dyad_info,
           by=c('dyad','conv.type'))

```
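The diagonal recurrence profile (DRP) step above can be hard to picture, so here is a simplified, non-evaluated sketch of the same idea on a toy recurrence matrix. The helper `drp_from_rp` is hypothetical; it works on a dense 0/1 matrix rather than the sparse plot returned by `crqa`, its sign convention for lag is an assumption, and it divides by the full series length so that lags with no recurrent points simply receive 0.

```r
# hypothetical helper: recurrence rate along each diagonal within a lag window
drp_from_rp = function(rp, win_size) {
  n = nrow(rp)
  lags = -win_size:win_size

  # col - row gives the diagonal (lag) that each cell belongs to
  offsets = col(rp) - row(rp)

  # proportion of recurrent points on each diagonal
  rr = sapply(lags, function(l) sum(rp[offsets == l]) / n)
  data.frame(lag = lags, rr = rr)
}

# toy example 1: perfect recurrence at lag 0 only
drp_sync = drp_from_rp(diag(5), win_size = 2)

# toy example 2: recurrence only on the first off-diagonal
rp_shift = matrix(0, 5, 5)
rp_shift[cbind(1:4, 2:5)] = 1
drp_shift = drp_from_rp(rp_shift, win_size = 2)
```

A profile that peaks at lag 0 indicates simultaneous recurrence, while a peak away from 0 indicates that one series tends to revisit the other's states after a delay.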

***

### Create polynomials and center binary variables for surrogate data

```{r surrogate-polynomials, eval = FALSE}

# create first- and second-order orthogonal polynomials for lag
raw_lag = min(surrogate_recurrence_df$lag):max(surrogate_recurrence_df$lag)
lag_vals = data.frame(raw_lag)
lag_offset = (0-min(raw_lag)) + 1
t = stats::poly((raw_lag + lag_offset), 2)
lag_vals[, paste("ot", 1:2, sep="")] = t[lag_vals$raw_lag + lag_offset, 1:2]

# join it to the original data table
surrogate_recurrence_df = left_join(surrogate_recurrence_df,lag_vals, by = c("lag" = "raw_lag")) %>%
  ungroup() %>%
  plyr::rename(.,
               c("conv.type"="convers",
                 "cond"="condition")) %>%
  mutate(condition = condition-.5) %>%
  mutate(convers = convers-.5) 

```
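As a quick check on what this chunk produces, the non-evaluated sketch below builds the same kind of polynomial terms for a hypothetical lag window of ±5 samples (the window size and the `cond` vector are invented for illustration) and confirms the properties that make them convenient predictors.

```r
# hypothetical lag window of +/- 5 samples
raw_lag = -5:5
lag_offset = (0 - min(raw_lag)) + 1
ot = stats::poly(raw_lag + lag_offset, 2)

# columns are orthonormal: their cross-products form the identity matrix
ot_crossprod = crossprod(ot)

# each polynomial is also orthogonal to the intercept (it sums to zero)
ot_sums = colSums(ot)

# centering a 0/1 variable yields -.5/.5 codes
cond = c(0, 1, 1, 0)
cond_centered = cond - .5
```

Because the linear and quadratic terms are uncorrelated with each other, their coefficients can be interpreted separately, and centering the binary variables keeps lower-order terms interpretable when interactions are included.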

***

### Export surrogate dataframe

```{r export-surrogate-df, eval = FALSE}

# export dataframe
write.table(surrogate_recurrence_df,'./data/surrogate_plotting_df-DCC.csv',row.names=FALSE,sep=',')

```

***

## Surrogate data analysis: Phase-randomized baseline

***

### Preliminaries

```{r surrogate-analysis-prelim, warning=FALSE, error=FALSE, message=FALSE}

# clear our workspace
rm(list=ls())

# read in libraries and create functions
source('./supplementary-code/libraries_and_functions-DCC.r')

# read in the raw datasets for real and baseline data
surrogate_plot = read.table('./data/surrogate_plotting_df-DCC.csv', sep=',', header = TRUE)
rec_plot = read.table('./data/plotting_df-DCC.csv',sep=',',header=TRUE)

```

***

### Combine real and surrogate data for analysis dataframes 

We combine the real data (loaded as `rec_plot`) and the surrogate data (loaded as `surrogate_plot`) into a single dataframe for later analysis. We create a new variable called `data`, which designates whether the data were real (`data = .5`) or phase-randomized baseline (`data = -.5`). We then create all relevant interaction terms and create a standardized dataframe.

```{r surrogate-analysis-interactions}

# label real and surrogate data
rec_plot$data = .5
surrogate_plot$data = -.5

# specify interaction terms for surrogate data
surrogate_plot = surrogate_plot %>% ungroup() %>%
  mutate(condition.convers = condition * convers) %>%

  # first-order polynomials
  mutate(condition.ot1 = condition * ot1) %>%
  mutate(convers.ot1 = convers * ot1) %>%
  mutate(condition.convers.ot1 = condition * convers * ot1) %>%

  # second-order polynomials
  mutate(condition.ot2 = condition * ot2) %>%
  mutate(convers.ot2 = convers * ot2) %>%
  mutate(condition.convers.ot2 = condition * convers * ot2) %>%

  # polynomial interactions
  mutate(ot1.ot2 = ot1 * ot2) %>%
  mutate(condition.ot1.ot2 = condition * ot1 * ot2) %>%
  mutate(convers.ot1.ot2 = convers * ot1 * ot2) %>%
  mutate(condition.convers.ot1.ot2 = condition * convers * ot1 * ot2)

# specify variables we'd like to keep
shared_variables = names(rec_plot)[names(rec_plot) %in% names(surrogate_plot)]

# append the real and surrogate dataframes
all_data_plot = rbind.data.frame(dplyr::select(rec_plot,one_of(shared_variables)),
                                 dplyr::select(surrogate_plot,one_of(shared_variables)))

# create interaction terms
all_data_plot = all_data_plot %>% ungroup() %>%
  mutate(data.convers = data * convers) %>%
  mutate(data.condition = data * condition) %>%
  mutate(data.condition.convers = data * condition * convers) %>%

  # first-order polynomials
  mutate(data.ot1 = data * ot1) %>%
  mutate(data.condition.ot1 = data * condition * ot1) %>%
  mutate(data.convers.ot1 = data * convers * ot1) %>%
  mutate(data.condition.convers.ot1 = data * condition * convers * ot1) %>%

  # second-order polynomials
  mutate(data.ot2 = data * ot2) %>%
  mutate(data.condition.ot2 = data * condition * ot2) %>%
  mutate(data.convers.ot2 = data * convers * ot2) %>%
  mutate(data.condition.convers.ot2 = data * condition * convers * ot2) %>%

  # polynomial interactions
  mutate(data.ot1.ot2 = data * ot1 * ot2) %>%
  mutate(data.condition.ot1.ot2 = data * condition * ot1 * ot2) %>%
  mutate(data.convers.ot1.ot2 = data * convers * ot1 * ot2) %>%
  mutate(data.condition.convers.ot1.ot2 = data * condition * convers * ot1 * ot2)

# create standardized dataframe
all_data_st = mutate_each(all_data_plot,funs(as.numeric(scale(.))))

```
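The hand-built interaction columns are plain products of the centered predictors. As a sanity check, the non-evaluated sketch below (using a hypothetical toy grid, not study data) shows that such a product matches the interaction column that R's formula interface would construct on its own.

```r
# hypothetical toy grid of centered predictors
toy = expand.grid(data = c(-.5, .5),
                  condition = c(-.5, .5),
                  ot1 = c(-1, 0, 1))

# hand-built three-way interaction, as in the chunk above
toy$data.condition.ot1 = toy$data * toy$condition * toy$ot1

# the same term as built by the formula interface
mm = model.matrix(~ data * condition * ot1, data = toy)
same_term = isTRUE(all.equal(toy$data.condition.ot1,
                             as.numeric(mm[, "data:condition:ot1"])))
```

Creating the terms explicitly, as we do here, makes it straightforward to standardize them and to include individual interaction terms as random slopes.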

***

### Comparing real data to baseline

To see how the real data compare to baseline, we run a model identical to our central analysis (`rec_convers_condition_gca_st` and `rec_convers_condition_gca_raw`) but add in terms to capture whether the data were real or surrogate baseline (i.e., `data` and all relevant interaction terms).

```{r compare-model-to-baseline, warning=FALSE, error=FALSE, message=FALSE}

# standardized model
baseline_comparison_st = lmer(rr ~ data + convers + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + condition.convers
                              + data.convers
                              + data.condition
                              + data.condition.convers
                              + condition.ot1
                              + data.condition.ot1
                              + convers.ot1
                              + data.convers.ot1
                              + condition.convers.ot1
                              + data.condition.convers.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + convers.ot2
                              + data.convers.ot2
                              + condition.convers.ot2
                              + data.condition.convers.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + convers.ot1.ot2
                              + data.convers.ot1.ot2
                              + condition.convers.ot1.ot2
                              + data.condition.convers.ot1.ot2
                              + (1 + convers | conv.num)
                              + (1 + convers + data.ot2 + data.convers | dyad),
                              data = all_data_st,
                              REML = FALSE)
invisible(pander_lme_to_latex(baseline_comparison_st,'baseline_comparison_latex-DCC.tex'))
pander_lme(baseline_comparison_st,stats.caption=TRUE)

# raw model
baseline_comparison_raw = lmer(rr ~ data + convers + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + condition.convers
                              + data.convers
                              + data.condition
                              + data.condition.convers
                              + condition.ot1
                              + data.condition.ot1
                              + convers.ot1
                              + data.convers.ot1
                              + condition.convers.ot1
                              + data.condition.convers.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + convers.ot2
                              + data.convers.ot2
                              + condition.convers.ot2
                              + data.condition.convers.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + convers.ot1.ot2
                              + data.convers.ot1.ot2
                              + condition.convers.ot1.ot2
                              + data.condition.convers.ot1.ot2
                              + (1 + convers | conv.num)
                              + (1 + convers + data.ot2 + data.convers | dyad),
                              data = all_data_plot,
                              REML = FALSE)
pander_lme(baseline_comparison_raw,stats.caption=TRUE)

```
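To help read the `data` coefficient in the raw model, the following non-evaluated sketch uses an ordinary linear model on simulated toy values (all numbers here are invented, and the sketch ignores the random effects and the other predictors). With `-.5`/`.5` coding, the slope is the difference between the real and surrogate means, and the intercept is the midpoint between them.

```r
# simulated toy data: 30 surrogate rows and 20 real rows
set.seed(111)
toy = data.frame(data = rep(c(-.5, .5), times = c(30, 20)),
                 rr = c(rnorm(30, mean = .02, sd = .01),
                        rnorm(20, mean = .05, sd = .01)))

# simple linear model with the centered real-vs-surrogate code
fit = lm(rr ~ data, data = toy)

# compare coefficients with the group means
real_mean = mean(toy$rr[toy$data > 0])
surrogate_mean = mean(toy$rr[toy$data < 0])
slope_matches = isTRUE(all.equal(unname(coef(fit)["data"]),
                                 real_mean - surrogate_mean))
intercept_matches = isTRUE(all.equal(unname(coef(fit)["(Intercept)"]),
                                     (real_mean + surrogate_mean) / 2))
```

In the full mixed-effects models, the interpretation is analogous but conditional on the other centered predictors and the random effects structure.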

***

### Exploring interaction terms

In order to understand the higher-order interactions, we create separate datasets for the friendly and argumentative conversations and then re-run the model as above.

***

#### Prepare separate datasets for comparison

```{r prep-surrogate-post-hoc-datasets}

# select only the friendly conversations (convers = -.5)
aff_only_raw = all_data_plot %>%
  dplyr::filter(convers < 0)

# restandardize friendly conversation data
aff_only_st = aff_only_raw %>%
  mutate_each(.,funs(as.numeric(scale(.)))) %>%
  mutate(convers = -.5)

# select only the arguments (convers = .5)
arg_only_raw = all_data_plot %>%
  dplyr::filter(convers > 0)

# restandardize argument data
arg_only_st = arg_only_raw %>%
  mutate_each(.,funs(as.numeric(scale(.)))) %>%
  mutate(convers = .5)

```

***

#### Post-hoc interaction tests

```{r aff-posthoc-comparison-test}

# check for differences in the friendly conversation (standardized)
aff_posthoc_comparison_st = lmer(rr ~ data + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + data.condition
                              + condition.ot1
                              + data.condition.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + (1 + ot1 | conv.num)
                              + (1 + ot1 + data + condition + data.ot2 | dyad),
                              data = aff_only_st,
                              REML = FALSE)
invisible(pander_lme_to_latex(aff_posthoc_comparison_st,'aff_posthoc_comparison_latex-DCC.tex'))
pander_lme(aff_posthoc_comparison_st,stats.caption=TRUE)

# check for differences in the friendly conversation (raw)
aff_posthoc_comparison_raw = lmer(rr ~ data + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + data.condition
                              + condition.ot1
                              + data.condition.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + (1 + ot1 | conv.num)
                              + (1 + ot1 + data + condition + data.ot2 + data.condition | dyad),
                              data = aff_only_raw,
                              REML = FALSE)
pander_lme(aff_posthoc_comparison_raw,stats.caption=TRUE)

```

```{r arg-posthoc-comparison-test}

# check for differences in the argumentative conversation (standardized)
arg_posthoc_comparison_st = lmer(rr ~ data + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + data.condition
                              + condition.ot1
                              + data.condition.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + (1 + ot1 | conv.num)
                              + (1 + ot1 + data + condition + ot2 | dyad),
                              data = arg_only_st,
                              REML = FALSE)
invisible(pander_lme_to_latex(arg_posthoc_comparison_st,'arg_posthoc_comparison_latex-DCC.tex'))
pander_lme(arg_posthoc_comparison_st,stats.caption=TRUE)

# check for differences in the argumentative conversation (raw)
arg_posthoc_comparison_raw = lmer(rr ~ data + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + data.condition
                              + condition.ot1
                              + data.condition.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + (1 + ot1 | conv.num)
                              + (1 + ot1 + data + condition + ot2 | dyad),
                              data = arg_only_raw,
                              REML = FALSE)
pander_lme(arg_posthoc_comparison_raw,stats.caption=TRUE)

```

***

# Surrogate analysis: Sample-wise shuffled baseline

Our manuscript uses the phase-randomized baseline (given above) for its surrogate analysis; this baseline preserves the autocorrelation of a time series while breaking its local dependencies. We here also present the sample-wise shuffled baseline, a common way of estimating the amount of interpersonal synchrony that might be expected by chance between individuals' time series by independently randomizing the order of all samples within each time series. Because the phase-randomized baseline is used in this research area less often than the sample-wise shuffled baseline, we present both to demonstrate the robustness of our results against both kinds of baselines.
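The practical consequence of sample-wise shuffling can be seen in a small non-evaluated sketch on a simulated autocorrelated series (the AR(1) series below is a stand-in of our own choosing, not movement data; `acf1` is a hypothetical helper): shuffling keeps exactly the same values but removes the series' dependence on its own recent past.

```r
# simulated stand-in for a movement series: strongly autocorrelated AR(1)
set.seed(111)
x = as.numeric(arima.sim(list(ar = 0.9), n = 2000))

# sample-wise shuffle: same values, random order
x_shuffled = sample(x, replace = FALSE)

# hypothetical helper: lag-1 autocorrelation
acf1 = function(v) acf(v, lag.max = 1, plot = FALSE)$acf[2]

acf_original = acf1(x)
acf_shuffled = acf1(x_shuffled)
```

Comparing `acf_original` with `acf_shuffled` shows why the shuffled baseline is the more liberal of the two: it removes both the coupling between participants and the temporal structure within each participant.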

***

## Surrogate data preparation

***

### Preliminaries

This section clears the workspace and reads in the prepared data files.

```{r shuffled-prep-prelim, warning=FALSE, error=FALSE, message=FALSE, eval = FALSE}

# clear our workspace
rm(list=ls())

# read in libraries and create functions
source('./supplementary-code/libraries_and_functions-DCC.r')

# load real data and CRQA parameters
coords = read.table('./data/DCC-trimmed-data.csv', sep=',', header = TRUE)
crqa_params = read.table('./data/crqa_data_and_parameters-DCC.csv',
                         sep=',',header=TRUE)

```

***

### Load CRQA parameters

Let's load in the CRQA parameters specified for the real data and prepare for CRQA.

```{r shuffled-prep-for-crqa-drps, eval = FALSE}

# grab only the parameters we need
crqa_params = crqa_params %>%
  select(-matches('euclid')) %>%
  distinct()

# rescale the data (by mean)
coords_crqa = coords %>% ungroup() %>%
  group_by(dyad,conv.num) %>%
  mutate(rescale.euclid0 = euclid0/mean(euclid0)) %>%
  mutate(rescale.euclid1 = euclid1/mean(euclid1)) %>%
  select(-matches('jounce'))

# merge with parameters
coords_crqa = plyr::join(x = crqa_params,
                         y = coords_crqa,
                         by = c('dyad', 'conv.num'))

# slice up the data so that we have one dataset per conversation
split_convs = split(coords_crqa,
                    list(coords_crqa$dyad, coords_crqa$conv.num))

```

***

### Calculate CRQA and DRPs for surrogate data

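Just as with the real data, we run continuous CRQA over each conversation of each dyad with the selected parameters using the `crqa` function from the `crqa` package (Coco & Dale, 2014, *Frontiers in Psychology*). Here, however, each participant's time series is first shuffled sample-wise (using `sample` without replacement), and the shuffling and CRQA are repeated 10 times per conversation. Again, this includes code to print out updates, but we suppress them for brevity.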

```{r shuffled-crqa-calculation, eval = FALSE}

# identify window size and set seed (for reproducibility)
target_seconds = 5
win_size = target_seconds * sampling_rate
set.seed(111)

# cycle through each conversation using the sliced subsets
drp_results = data.frame()
crqa_results = data.frame()
for (next_conv in split_convs){

  # isolate parameters for next dyad
  chosen.delay = unique(next_conv$chosen.delay)
  chosen.embed = unique(next_conv$chosen.embed)
  chosen.radius = unique(next_conv$chosen.radius)
  dyad_num = unique(next_conv$dyad)

  # # print update
  # print(paste("Surrogate CRQA: Dyad ", unique(next_conv$dyad),
  #             ", conversation ",unique(next_conv$conv.num),
  #             sep=""))

  # re-shuffle the surrogate dyad and run again
  for (run in 1:10){

    # shuffle time series
    shuffle0 = base::sample(next_conv$rescale.euclid0,replace=FALSE)
    shuffle1 = base::sample(next_conv$rescale.euclid1,replace=FALSE)
    
    # run cross-recurrence
    rec_analysis = crqa(ts1=shuffle0,
                        ts2=shuffle1,
                        delay=chosen.delay,
                        embed=chosen.embed,
                        r=chosen.radius,
                        normalize=0,
                        rescale=0,
                        mindiagline=2,
                        minvertline=2,
                        tw=0,
                        whiteline=FALSE,
                        recpt=FALSE)

    # save plot-level information to dataframe
    next_data_line = data.frame(c(run,
                                  dyad_num,
                                  unique(next_conv$conv.type),
                                  rec_analysis[1:9]))
    names(next_data_line) = c('run',"dyad",'conv.type',names(rec_analysis[1:9]))
    crqa_results = rbind.data.frame(crqa_results,next_data_line)

    # recreate DRP from diagonal lines within our target window
    diag_lines = spdiags(rec_analysis$RP)
    subset_plot = data.frame(diag_lines$B[,diag_lines$d >= -win_size & diag_lines$d <= win_size])
    rr = colSums(subset_plot)/dim(subset_plot)[1]

    # convert to dataframe, padding (with 0) where no RR was observed
    next_drp = full_join(data.frame(lag = as.integer(stringr::str_replace(names(rr),'X',''))-(win_size+1),
                                    rr = rr),
                         data.frame(lag = -win_size:win_size),
                         by='lag')
    next_drp[is.na(next_drp)] = 0

    # save it to dataframe
    next_drp$dyad = dyad_num
    next_drp$conv.type = unique(next_conv$conv.type)
    next_drp$run = run
    drp_results = rbind.data.frame(drp_results,next_drp)
  }}

# merge CRQA and DRP analysis results
recurrence_results = plyr::join(drp_results, crqa_results,
                                by=c('dyad',
                                     'conv.type',
                                     'run'))

# grab information about experiment condition
additional_dyad_info = coords %>% ungroup() %>%
  select(dyad,conv.num,conv.type,cond) %>% distinct()

# merge recurrence analyses and condition information
surrogate_recurrence_df = plyr::join(recurrence_results, additional_dyad_info,
           by=c('dyad','conv.type'))

```

***

### Create polynomials for surrogate data and center variables

As with the real data, we create first- and second-order orthogonal polynomials for the lag term. 

```{r shuffled-polynomials, eval = FALSE}

# create first- and second-order orthogonal polynomials for lag
raw_lag = min(surrogate_recurrence_df$lag):max(surrogate_recurrence_df$lag)
lag_vals = data.frame(raw_lag)
lag_offset = (0-min(raw_lag)) + 1
t = stats::poly((raw_lag + lag_offset), 2)
lag_vals[, paste("ot", 1:2, sep="")] = t[lag_vals$raw_lag + lag_offset, 1:2]

# join it to the original data table
surrogate_recurrence_df = left_join(surrogate_recurrence_df,lag_vals, by = c("lag" = "raw_lag"))

# rename variables and center the binary variables
surrogate_recurrence_df = surrogate_recurrence_df %>% ungroup() %>%
  plyr::rename(.,
               c("conv.type"="convers",
                 "cond"="condition")) %>%
  mutate(condition = condition-.5) %>%
  mutate(convers = convers-.5) %>%
  mutate(condition.convers = condition * convers)

```

***

### Export surrogate dataframe

```{r export-shuffled-dfs, eval = FALSE}

# export plotting dataframe
write.table(surrogate_recurrence_df,'./data/surrogate_shuffled_plotting_df-DCC.csv',
            row.names=FALSE,sep=',')

```

***

## Surrogate data analysis: Sample-wise shuffled baseline

***

### Preliminaries

```{r shuffled-surrogate-analysis-prelim, warning=FALSE, error=FALSE, message=FALSE}

# clear our workspace
rm(list=ls())

# read in libraries and create functions
source('./supplementary-code/libraries_and_functions-DCC.r')

# read in the raw datasets for real and baseline data
surrogate_plot = read.table('./data/surrogate_shuffled_plotting_df-DCC.csv', 
                            sep=',', header = TRUE)
rec_plot = read.table('./data/plotting_df-DCC.csv',sep=',',header=TRUE)

```

***

### Create unified dataframe and appropriate interactions

Now that we've created the surrogate dataset, we join it to the real dataset so that we can compare our observed data to what might be expected by chance. We create a new variable `data` to mark whether the row contains real (`data = .5`) or surrogate (`data = -.5`) data and interact it appropriately with other predictors. We then create a standardized dataframe.

```{r shuffled-analysis-prelim, warning=FALSE, error=FALSE, message=FALSE}

# label real and surrogate data
rec_plot$data = .5
surrogate_plot$data = -.5

# specify interaction terms for surrogate data
surrogate_plot = surrogate_plot %>% ungroup() %>%
  mutate(condition.convers = condition * convers) %>%

  # first-order polynomials
  mutate(condition.ot1 = condition * ot1) %>%
  mutate(convers.ot1 = convers * ot1) %>%
  mutate(condition.convers.ot1 = condition * convers * ot1) %>%

  # second-order polynomials
  mutate(condition.ot2 = condition * ot2) %>%
  mutate(convers.ot2 = convers * ot2) %>%
  mutate(condition.convers.ot2 = condition * convers * ot2) %>%

  # polynomial interactions
  mutate(ot1.ot2 = ot1 * ot2) %>%
  mutate(condition.ot1.ot2 = condition * ot1 * ot2) %>%
  mutate(convers.ot1.ot2 = convers * ot1 * ot2) %>%
  mutate(condition.convers.ot1.ot2 = condition * convers * ot1 * ot2)

# specify variables we'd like to keep
shared_variables = names(rec_plot)[names(rec_plot) %in% names(surrogate_plot)]

# append the real and surrogate dataframes
all_data_plot = rbind.data.frame(dplyr::select(rec_plot,one_of(shared_variables)),
                                 dplyr::select(surrogate_plot,one_of(shared_variables)))

# create interaction terms
all_data_plot = all_data_plot %>% ungroup() %>%
  mutate(data.convers = data * convers) %>%
  mutate(data.condition = data * condition) %>%
  mutate(data.condition.convers = data * condition * convers) %>%

  # first-order polynomials
  mutate(data.ot1 = data * ot1) %>%
  mutate(data.condition.ot1 = data * condition * ot1) %>%
  mutate(data.convers.ot1 = data * convers * ot1) %>%
  mutate(data.condition.convers.ot1 = data * condition * convers * ot1) %>%

  # second-order polynomials
  mutate(data.ot2 = data * ot2) %>%
  mutate(data.condition.ot2 = data * condition * ot2) %>%
  mutate(data.convers.ot2 = data * convers * ot2) %>%
  mutate(data.condition.convers.ot2 = data * condition * convers * ot2) %>%

  # polynomial interactions
  mutate(data.ot1.ot2 = data * ot1 * ot2) %>%
  mutate(data.condition.ot1.ot2 = data * condition * ot1 * ot2) %>%
  mutate(data.convers.ot1.ot2 = data * convers * ot1 * ot2) %>%
  mutate(data.condition.convers.ot1.ot2 = data * condition * convers * ot1 * ot2)

# create standardized dataframe
all_data_st = mutate_each(all_data_plot,funs(as.numeric(scale(.))))

```

***

### Comparing model to baseline

After analyzing the dynamics of the observed data (Section 4), we now explore how those dynamics differ from those we might expect to see by chance. The models below add `data` as an additional predictor (with all appropriate interactions) and use the most maximal random effects structure justified by the data.

```{r compare-shuffle-to-baseline, warning=FALSE, error=FALSE, message=FALSE}

# standardized model
shuffled_baseline_comparison_st = lmer(rr ~ data + convers + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + condition.convers
                              + data.convers
                              + data.condition
                              + data.condition.convers
                              + condition.ot1
                              + data.condition.ot1
                              + convers.ot1
                              + data.convers.ot1
                              + condition.convers.ot1
                              + data.condition.convers.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + convers.ot2
                              + data.convers.ot2
                              + condition.convers.ot2
                              + data.condition.convers.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + convers.ot1.ot2
                              + data.convers.ot1.ot2
                              + condition.convers.ot1.ot2
                              + data.condition.convers.ot1.ot2
                              + (1 + convers | conv.num)
                              + (1 + convers + data.ot2 + data.convers | dyad),
                              data = all_data_st,
                              REML = FALSE)
pander_lme(shuffled_baseline_comparison_st,stats.caption=TRUE)

# raw model
shuffled_baseline_comparison_raw = lmer(rr ~ data + convers + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + condition.convers
                              + data.convers
                              + data.condition
                              + data.condition.convers
                              + condition.ot1
                              + data.condition.ot1
                              + convers.ot1
                              + data.convers.ot1
                              + condition.convers.ot1
                              + data.condition.convers.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + convers.ot2
                              + data.convers.ot2
                              + condition.convers.ot2
                              + data.condition.convers.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + convers.ot1.ot2
                              + data.convers.ot1.ot2
                              + condition.convers.ot1.ot2
                              + data.condition.convers.ot1.ot2
                              + (1 + convers | conv.num)
                              + (1 + convers + data.ot2 + data.convers | dyad),
                              data = all_data_plot,
                              REML = FALSE)
pander_lme(shuffled_baseline_comparison_raw,stats.caption=TRUE)

```
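
The predictor lists above rely on interaction columns (e.g., `data.ot1`) that already exist in the dataframe. As a rough sketch only, assuming that these columns are simple products of their component predictors and that `ot1` and `ot2` are first- and second-order orthogonal polynomials of lag (neither assumption is confirmed in this section), such columns could be built on toy data as follows. The dataframe `toy_df` and all of its values are hypothetical.

```r
# hypothetical toy data: lags from -3 to 3, crossed with an
# assumed contrast code for observed (.5) vs. baseline (-.5) data
toy_df = expand.grid(lag = -3:3, data = c(-.5, .5))

# assumption: ot1 and ot2 are orthogonal polynomials of lag
lag_poly = poly(toy_df$lag, degree = 2)
toy_df$ot1 = lag_poly[, 1]
toy_df$ot2 = lag_poly[, 2]

# assumption: interaction columns are products of their components
toy_df$data.ot1 = toy_df$data * toy_df$ot1
toy_df$data.ot2 = toy_df$data * toy_df$ot2
toy_df$ot1.ot2 = toy_df$ot1 * toy_df$ot2
toy_df$data.ot1.ot2 = toy_df$data * toy_df$ot1 * toy_df$ot2

head(toy_df)
```

Precomputing the products in this way would keep the term names in the model output short and stable, at the cost of very long formulas.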

***

### Exploring interaction terms

***

#### Prepare separate datasets for comparison

```{r prep-shuffle-post-hoc-datasets}

# select only the friendly conversations (convers = -.5)
aff_only_raw = all_data_plot %>%
  dplyr::filter(convers < 0)

# restandardize friendly conversation data
aff_only_st = aff_only_raw %>%
  mutate_each(.,funs(as.numeric(scale(.)))) %>%
  mutate(convers = -.5)

# select only the arguments (convers = .5)
arg_only_raw = all_data_plot %>%
  dplyr::filter(convers > 0)

# restandardize argument data
arg_only_st = arg_only_raw %>%
  mutate_each(.,funs(as.numeric(scale(.)))) %>%
  mutate(convers = .5)

```
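
Resetting `convers` after restandardizing matters because a column that is constant within a subset has zero variance, so z-scoring it produces `NaN`. The sketch below illustrates this on hypothetical toy values, using base R's `lapply()` as a stand-in for the pipeline above; `toy_subset` and `toy_st` are illustrative names only.

```r
# hypothetical subset in which `convers` is constant, as after filtering
toy_subset = data.frame(convers = rep(-.5, 4),
                        rr = c(.02, .04, .03, .05))

# z-score every column (base R stand-in for the dplyr pipeline)
toy_st = as.data.frame(lapply(toy_subset,
                              function(x) as.numeric(scale(x))))

# the constant column has zero variance, so every value is NaN
toy_st$convers

# restore the original contrast code for the subset
toy_st$convers = -.5
toy_st
```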

***

#### Post-hoc interaction tests

As in the analysis of the observed data alone, we next unpack the interactions by analyzing the friendly and argumentative conversations separately, retaining the `data` predictor to compare the observed data with the shuffled baseline.

```{r shuffled-aff-posthoc-comparison-test}

# check for differences in the friendly conversation (standardized)
shuffled_aff_posthoc_comparison_st = lmer(rr ~ data + condition + ot1 + ot2
                              + data.ot1 + data.ot2 + data.condition
                              + condition.ot1 + condition.ot2 + data.condition.ot1 
                              + data.condition.ot2 + ot1.ot2 + data.ot1.ot2
                              + condition.ot1.ot2 + data.condition.ot1.ot2
                              + (1 | conv.num)
                              + (1 + ot1 + data + condition + data.ot2 + data.condition | dyad),
                              data = aff_only_st,
                              REML = FALSE)
pander_lme(shuffled_aff_posthoc_comparison_st,stats.caption=TRUE)

# check for differences in the friendly conversation (raw)
shuffled_aff_posthoc_comparison_raw = lmer(rr ~ data + condition + ot1 + ot2
                              + data.ot1 + data.ot2 + data.condition
                              + condition.ot1 + condition.ot2 + data.condition.ot1 
                              + data.condition.ot2 + ot1.ot2 + data.ot1.ot2
                              + condition.ot1.ot2 + data.condition.ot1.ot2
                              + (1 | conv.num)
                              + (1 + ot1 + data + condition + data.ot2 + data.condition | dyad),
                              data = aff_only_raw,
                              REML = FALSE)
pander_lme(shuffled_aff_posthoc_comparison_raw,stats.caption=TRUE)

```

```{r shuffled-arg-posthoc-comparison-test}

# check for differences in the argumentative conversation (standardized)
shuffled_arg_posthoc_comparison_st = lmer(rr ~ data + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + data.condition
                              + condition.ot1
                              + data.condition.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + (1 + ot1 | conv.num)
                              + (1 + ot1 + data + condition + ot2 | dyad),
                              data = arg_only_st,
                              REML = FALSE)
pander_lme(shuffled_arg_posthoc_comparison_st,stats.caption=TRUE)

# check for differences in the argumentative conversation (raw)
shuffled_arg_posthoc_comparison_raw = lmer(rr ~ data + condition + ot1 + ot2
                              + data.ot1
                              + data.ot2
                              + data.condition
                              + condition.ot1
                              + data.condition.ot1
                              + condition.ot2
                              + data.condition.ot2
                              + ot1.ot2
                              + data.ot1.ot2
                              + condition.ot1.ot2
                              + data.condition.ot1.ot2
                              + (1 + ot1 | conv.num)
                              + (1 + ot1 + data + condition + ot2 | dyad),
                              data = arg_only_raw,
                              REML = FALSE)
pander_lme(shuffled_arg_posthoc_comparison_raw,stats.caption=TRUE)

```

***

# Discussion

The model's results indeed suggest that high- and low-level constraints influence coordination dynamics. Unexpectedly, we saw that participants did not exhibit time-locked head movement synchrony overall in the presence of these additional constraints on the interaction. Extending previous findings with gross body movements in another naturalistic interaction corpus (Paxton & Dale, 2013, *Quarterly Journal of Experimental Psychology*), we here found that argument decreases interpersonal synchrony.

Interestingly, although we hypothesized that the noise condition would increase coordination compared to the dual-task condition, we did not find a simple effect of condition. Instead, condition affected the dynamics of coordination in a three-way interaction among conversation type, condition, and the quadratic term. In both conditions, affiliative conversations induced relatively flat coordination dynamics, although at a rate significantly higher than in arguments. In argument, however, we found very different dynamics by condition: We observed the inverted U-shape characteristic of turn-taking dynamics in the dual-task condition but saw a relatively flat recurrence profile in the noise condition.

Taken together, we view our results as consistent with the growing view of interpersonal communication as a complex dynamical system. 

***

```{r plot-three-way-interaction, echo=FALSE, warning = FALSE, error = FALSE, message = FALSE, eval = FALSE}

# read in the raw datasets for real and baseline data
surrogate_plot = read.table('./data/surrogate_plotting_df-DCC.csv', sep=',', header = TRUE)
rec_plot = read.table('./data/plotting_df-DCC.csv',sep=',',header=TRUE)

# split plotting dataframe by condition
rec_plot$data = 'Real'
real_condition_dfs = split(rec_plot,rec_plot$condition)
real_noise_data = real_condition_dfs[[1]]
real_dual_task_data = real_condition_dfs[[2]]

# surrogate data
surrogate_plot$data = 'Surrogate'
surrogate_condition_dfs = split(surrogate_plot,surrogate_plot$condition)
surrogate_noise_data = surrogate_condition_dfs[[1]]
surrogate_dual_task_data = surrogate_condition_dfs[[2]]

# plot dual-task condition
dual_task_plot = ggplot(real_dual_task_data,
                        aes(x=lag,
                            y=rr,
                            color=as.factor(convers))) +
  scale_colour_manual(values=c("chartreuse4","red2"),
                      breaks=c(min(real_dual_task_data$convers),
                               max((real_dual_task_data$convers)))) +
  theme(legend.position = "none") +
  stat_smooth() +
  ggtitle('Dual-Task') +
  xlab(paste('Lag (',sampling_rate,' Hz)', sep="")) + ylab('Mean RR') +
  coord_cartesian(ylim = c(0, 
                           max(c(mean(real_noise_data$rr),mean(real_dual_task_data$rr)))+.03)) +
  geom_smooth(data = surrogate_dual_task_data,
            aes(x = lag,
                y = rr,
                colour=as.factor(convers)),
            linetype=3)

# plot noise condition
noise_plot = ggplot(real_noise_data,
                    aes(x=lag,
                        y=rr,
                        color=as.factor(convers))) +
  scale_colour_manual(values=c("chartreuse4","red2"),
                      breaks=c(min(real_noise_data$convers),
                               max((real_noise_data$convers)))) +
  theme(legend.position = "none") +
  ggtitle('Noise') +
  stat_smooth() +
  xlab(paste('Lag (',sampling_rate,' Hz)', sep="")) + ylab('') +
  coord_cartesian(ylim = c(0, 
                           max(c(mean(real_noise_data$rr),mean(real_dual_task_data$rr)))+.03)) +
  geom_smooth(data = surrogate_noise_data,
            aes(x = lag,
                y = rr,
                colour=as.factor(convers)),
            linetype=3)

# create another noise plot just for the legend
noise_plot_legend = ggplot(real_noise_data,
                           aes(x=lag,
                               y=rr,
                               color=as.factor(convers))) +
  scale_colour_manual(values=c("chartreuse4","red2"),
                      name='Conversation Type',
                      breaks=c(min(real_noise_data$convers),
                               max((real_noise_data$convers))),
                      labels=c("Affiliative","Argumentative")) +
  ggtitle('Noise') +
  stat_smooth() +
  coord_cartesian(ylim = c(0, 
                           max(c(mean(real_noise_data$rr),mean(real_dual_task_data$rr)))+.03))

# create a master legend
master_legend = gtable_filter(ggplot_gtable(
  ggplot_build(noise_plot_legend + theme(legend.position="bottom"))),
  "guide-box")

# save to file for manuscript submission
ggsave('./figures/DCC-interaction-RR_lag_conversation_condition.png',
       units = "in", width = 4, height = 5,
       grid.arrange(
         top=textGrob("Head movement synchrony\nby task and conversation",
                      gp=gpar(fontsize=14)),
         dual_task_plot,
         noise_plot,
         bottom=master_legend,
         ncol = 2
       ))

# save smaller version for knitr
ggsave('./figures/DCC-interaction-RR_lag_conversation_condition-knitr.png',
       units = "in", width = 4, height = 5, dpi=100,
       grid.arrange(
         top=textGrob("Head movement synchrony\nby task and conversation",
                      gp=gpar(fontsize=14)),
         dual_task_plot,
         noise_plot,
         bottom=master_legend,
         ncol = 2
       ))

# save tiff version
tiff('./figures/DCC-interaction-RR_lag_conversation_condition.tiff',
     units = "in", width = 4, height = 5, res=300)
grid.arrange(
         top=textGrob("Head movement synchrony\nby task and conversation",
                      gp=gpar(fontsize=14)),
         dual_task_plot,
         noise_plot,
         bottom=master_legend,
         ncol = 2
       )
dev.off()


```

![**Figure**. Effects of conversation type (affiliative/argumentative), condition (dual-task/noise), and lag (±5 seconds) on RR.](./figures/DCC-interaction-RR_lag_conversation_condition-knitr.png)

***

```{r plot-all-drps, echo=FALSE, warning = FALSE, error = FALSE, message = FALSE, eval = FALSE}

# split plotting dataframe by condition
rec_plot = read.table('./data/plotting_df-DCC.csv',sep=',',header=TRUE)
condition_dfs = split(rec_plot,rec_plot$condition)
noise_data = condition_dfs[[1]]
dual_task_data = condition_dfs[[2]]

# reorder dyads by rr at lag-0 in the argumentative conversation (convers > 0)
dual_task_data$sorted_dyad = with(dual_task_data,
                                  reorder(dyad,
                                          rr[ifelse(lag == 0, TRUE, NA) &
                                               ifelse(convers > 0, TRUE, NA)],
                                          mean, na.rm = TRUE))
noise_data$sorted_dyad = with(noise_data,
                              reorder(dyad,
                                      rr[ifelse(lag == 0, TRUE, NA) &
                                           ifelse(convers > 0, TRUE, NA)],
                                      mean, na.rm = TRUE))

# plot for noise condition
noise_plot = ggplot(data = noise_data, aes(x = lag, y = rr,
                              color=as.factor(convers))) +
  facet_wrap(~sorted_dyad) + geom_smooth() + geom_line() +
  scale_colour_manual(values=c("chartreuse4","red2"),
                      breaks=c(min(c(noise_data$convers,
                                     dual_task_data$convers)),
                               max(c(noise_data$convers,
                                     dual_task_data$convers)))) +
  ylim(min(c(noise_data$rr,dual_task_data$rr)) - .01,
       max(c(noise_data$rr,dual_task_data$rr)) + .01) +
  theme(legend.position = "none",
        strip.background = element_blank(),
        strip.text.x = element_blank()) +
  ggtitle('Noise Condition') +
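  # (geom_smooth() above already adds the smoother; a trailing stat_smooth()
  # without a preceding `+` was never attached to the plot)
  ylab(' ') + xlab(paste('Lag (',sampling_rate,' Hz)', sep=""))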

# plot for dual-task condition
dual_task_plot = ggplot(data = dual_task_data, aes(x = lag, y = rr,
                                  color=as.factor(convers))) +
  facet_wrap(~sorted_dyad) + geom_smooth() + geom_line() +
  scale_colour_manual(values=c("chartreuse4","red2"),
                      breaks=c(min(c(noise_data$convers,
                                     dual_task_data$convers)),
                               max(c(noise_data$convers,
                                     dual_task_data$convers)))) +
  ylim(min(c(noise_data$rr,dual_task_data$rr)) - .01,
       max(c(noise_data$rr,dual_task_data$rr)) + .01) +
  theme(legend.position = "none",
        strip.background = element_blank(),
        strip.text.x = element_blank()) +
  ggtitle('Dual-Task Condition') +
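  # (geom_smooth() above already adds the smoother; a trailing stat_smooth()
  # without a preceding `+` was never attached to the plot)
  ylab('RR (mean)') + xlab(paste('Lag (',sampling_rate,' Hz)', sep=""))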

# create another noise plot just for the legend
noise_plot_legend = ggplot(noise_data,
                           aes(x=lag,
                               y=rr,
                               color=as.factor(convers))) +
  scale_colour_manual(values=c("chartreuse4","red2"),
                      name='Conversation Type',
                      breaks=c(min(noise_data$convers),
                               max((noise_data$convers))),
                      labels=c("Affiliative","Argumentative")) +
  ggtitle('Noise') +
  stat_smooth() +
  coord_cartesian(ylim=c(.02,.07))

# create a master legend
master_legend = gtable_filter(ggplot_gtable(
  ggplot_build(noise_plot_legend + theme(legend.position="bottom"))),
  "guide-box")

# save to file for manuscript submission
ggsave('./figures/DCC-all_dyad_profiles.png',
       units = "in", width = 8, height = 5,
       grid.arrange(
         top=textGrob("Individual dyad profiles of\nhead movement synchrony",
                      gp=gpar(fontsize=14)),
         dual_task_plot,
         noise_plot,
         bottom=master_legend,
         ncol = 2
       ))

# save smaller version for knitr
ggsave('./figures/DCC-all_dyad_profiles-knitr.png',
       units = "in", width = 8, height = 5, dpi=100,
       grid.arrange(
         top=textGrob("Individual dyad profiles of\nhead movement synchrony",
                      gp=gpar(fontsize=14)),
         dual_task_plot,
         noise_plot,
         bottom=master_legend,
         ncol = 2
       ))

# save tiff version
tiff('./figures/DCC-all_dyad_profiles-knitr.tiff',
     units = "in", width = 8, height = 5, res=300)
grid.arrange(
         top=textGrob("Individual dyad profiles of\nhead movement synchrony",
                      gp=gpar(fontsize=14)),
         dual_task_plot,
         noise_plot,
         bottom=master_legend,
         ncol = 2
       )
dev.off()

```

![**Figure**. Individual profiles for each dyad's RR by conversation type (affiliative/argumentative), condition (dual-task/noise), and lag (±5 seconds).](./figures/DCC-all_dyad_profiles-knitr.png)
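
The dyad ordering used for the panels above depends on a compact indexing idiom: `ifelse(..., TRUE, NA)` builds an index that is `TRUE` for the rows of interest and `NA` elsewhere, so the subsetted `rr` vector keeps its original length and `reorder()` can average only the non-missing values within each dyad. The sketch below walks through that logic on hypothetical toy values; `toy_df`, `keep`, and `rr_subset` are illustrative names that do not appear in the analysis.

```r
# hypothetical toy data: two dyads, two lags, two conversation types
toy_df = data.frame(dyad = factor(rep(c("a", "b"), each = 4)),
                    lag = rep(c(0, 1), times = 4),
                    convers = rep(c(-.5, -.5, .5, .5), times = 2),
                    rr = c(.10, .20, .09, .30, .10, .20, .01, .30))

# TRUE where both conditions hold, NA everywhere else
keep = ifelse(toy_df$lag == 0, TRUE, NA) &
  ifelse(toy_df$convers > 0, TRUE, NA)

# indexing with NA returns NA, so the vector keeps its original length
rr_subset = toy_df$rr[keep]

# order dyad levels by their mean over the non-missing values
toy_df$sorted_dyad = reorder(toy_df$dyad, rr_subset, mean, na.rm = TRUE)
levels(toy_df$sorted_dyad)
```

In this toy case, dyad `b` sorts before dyad `a` because its lag-0 RR in the selected conversation type (.01) is lower than dyad `a`'s (.09).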