Chapter 1 Background and R Setup

The goal of this document is to provide a template for using R to evaluate data from a one-factor repeated measures design, often called a within-subjects design. Rather than providing a single data point for the DV, as in the “between-groups” design, each case provides more than one measurement because it is measured “repeatedly”.
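The difference in data layout can be sketched with a small made-up example. The data frame, the variable names (id, t1 to t3, dv, occasion), and the values below are hypothetical and are not the data set analyzed in this document. The sketch uses only base R, and shows the same scores in "wide" form (one row per case) and "long" form (one row per case-by-occasion combination).

```r
# Hypothetical wide-format data: one row per case,
# one column per repeated measurement of the DV
wide <- data.frame(
  id = factor(1:4),
  t1 = c(10, 12, 9, 11),
  t2 = c(13, 14, 10, 15),
  t3 = c(15, 17, 12, 16)
)

# Base R reshape() to long format: one row per case-by-occasion combination
long <- reshape(
  wide,
  direction = "long",
  varying   = c("t1", "t2", "t3"),  # the repeated-measurement columns
  v.names   = "dv",                 # name for the stacked DV column
  timevar   = "occasion",           # name for the within-subjects factor
  times     = c("t1", "t2", "t3"),
  idvar     = "id"
)
long$occasion <- factor(long$occasion)
rownames(long) <- NULL

nrow(wide)  # number of cases
nrow(long)  # number of cases times number of occasions
head(long)
```

Which layout is needed depends on the function: some approaches work from the wide form, while many modeling functions expect the long form.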

The standard R axiom that there are always multiple ways of performing any task is never more accurate than with ANOVA models. The list of topics begins with graphical depiction and extends to standard NHST inferences, contrast analysis and post hoc tests, and evaluation of assumptions. Add to that list the major divisions in approaches to repeated measures analysis, and this document could become very long.

This document

  • Is intended for use in the APSY511 course at the University at Albany, but can be used more broadly by data analysts.
  • Is a fairly full exposition of one-factor repeated measures ANOVA for a five-category design.
  • Implements graphical summaries and numerical descriptions in an EDA section.
  • Approaches ANOVA as linear modeling, supplemented with analytical contrasts and multiple comparison tests.
  • Includes a section on the Multivariate approach to the repeated measures problem.
  • Provides templates for both the traditional Univariate/GLM approach and linear mixed models approaches.
  • Includes graphical and inferential evaluation of assumptions.
  • Provides brief illustrations of Bayes Factor, resampling, and robust methods, as well as a non-parametric approach.

One primary philosophy drives much of this document: repeated measures ANOVA is not dead. It can be very useful in experimental design situations where there are no missing data; otherwise, linear mixed effects models have an advantage. The traditional methods are much criticized on the basis of flawed error terms when non-sphericity is present, and adjustment methods such as Greenhouse-Geisser (GG) and Huynh-Feldt (HF) are looked at with disdain in some quarters. However, an overarching perspective on ANOVA can argue that omnibus effects are the least interesting parts of an analysis. Follow-up analyses employing contrasts (and, in factorial designs, contrasts on main effects, interactions, and simple main effects) are valuable tools. When each contrast is tested with its own specific error term, the tests are not subject to the consequences of non-sphericity (see the contrasts sections below for citations). The low regard in which mixed effects modelers seem to hold the GG and HF methods therefore becomes a non-issue if the focus is on contrast analysis, perhaps instantiated with orthogonal sets. This recommendation is also informed by an understanding that mixed effects modeling of contrasts is a somewhat fuzzy area where further clarification is required. That said, the document also contains some rudimentary linear mixed modeling approaches.
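The idea of a contrast tested with its own specific error term can be sketched in base R. One common way to do this is to compute a contrast score for each case and test whether the mean of those scores differs from zero. The error term then reflects only the variability of that one contrast across cases. The data below are simulated, and the object names (scores, w, psi) and the choice of a linear trend contrast are assumptions made for illustration. This is a minimal sketch, not necessarily the implementation used in the contrasts sections.

```r
set.seed(511)  # arbitrary seed so the simulated values are reproducible
n <- 12        # hypothetical number of cases

# Simulated wide-format scores: n cases (rows) by 5 conditions (columns),
# with condition means of 10, 11, 12, 13, and 14
scores <- matrix(
  rnorm(n * 5, mean = rep(c(10, 11, 12, 13, 14), each = n), sd = 2),
  nrow = n, ncol = 5
)
# Add a case-level effect so scores within a case are correlated
scores <- scores + rnorm(n, sd = 3)

# Contrast weights: linear trend across five equally spaced conditions
w <- c(-2, -1, 0, 1, 2)

# One contrast score per case
psi <- as.vector(scores %*% w)

# One-sample t test of the contrast scores against zero;
# its error term comes only from the variability of psi
tt <- t.test(psi, mu = 0)
tt

# The same t statistic computed by hand
t_by_hand <- mean(psi) / (sd(psi) / sqrt(n))
all.equal(unname(tt$statistic), t_by_hand)
```

Squaring the t statistic gives an F statistic with 1 and n - 1 degrees of freedom for the contrast.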

The document is always under development.

One of the primary goals is to reproduce all the work we have accomplished with the SPSS GLM and MANOVA procedures (and then some).

Several R packages are required:

#if (!requireNamespace("BiocManager", quietly = TRUE))
#    install.packages("BiocManager")
#BiocManager::install("Biobase", version = "3.8")
knitr::opts_chunk$set(echo = TRUE, warning=FALSE)
# load packages and import data
library(afex)
library(BayesFactor)
library(car)
library(emmeans)
library(ez)
library(foreign)
library(ggplot2)
library(ggthemes)
library(granova)
library(gt)
library(kableExtra)
library(knitr)
library(lme4)
library(multcomp)
library(nlme)
library(nortest)
library(permuco)
library(phia)
library(plyr)
library(psych)
library(rmarkdown)
library(Rmisc)
library(sciplot)
library(sjstats)
library(tidyr)
library(WRS2)
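
Because library() stops with an error at the first package that is not installed, it can help to check the whole set before loading. The following is a minimal sketch using base R only. The package names are copied from the library() calls above, and the object names (pkgs, installed, missing_pkgs) are hypothetical.

```r
# Package names copied from the library() calls above
pkgs <- c("afex", "BayesFactor", "car", "emmeans", "ez", "foreign",
          "ggplot2", "ggthemes", "granova", "gt", "kableExtra", "knitr",
          "lme4", "multcomp", "nlme", "nortest", "permuco", "phia",
          "plyr", "psych", "rmarkdown", "Rmisc", "sciplot", "sjstats",
          "tidyr", "WRS2")

# TRUE for each package found in the local library, FALSE otherwise
installed <- pkgs %in% rownames(installed.packages())
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All packages are installed.")
}
```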

Package citations for packages loaded here (in the above order): afex (Singmann, Bolker, Westfall, Aust, & Ben-Shachar, 2020), BayesFactor (Morey & Rouder, 2018), car (Fox, Weisberg, & Price, 2020), emmeans (Lenth, 2020), ez (Lawrence, 2016), ggplot2 (Wickham et al., 2020), ggthemes (Pruzek & Helmreich, 2014), granova (Pruzek & Helmreich, 2014), gt (Iannone, Cheng, & Schloerke, 2019), kableExtra (Zhu, 2019), knitr (Xie, 2020b), lme4 (Bates, Maechler, Bolker, & Walker, 2019), multcomp (Hothorn, Bretz, & Westfall, 2020), nlme (Pinheiro, Bates, & R-core, 2020), nortest (Gross & Ligges, 2015), permuco (Frossard & Renaud, 2019), phia (De Rosario-Martinez, 2015), psych (Revelle, 2020), rmarkdown (Allaire et al., 2020), Rmisc (Hope, 2013), sciplot (Morales, R Development Core Team, R-help listserv community, & Duncan Murdoch, 2020), sjstats (Lüdecke, 2020), tidyr (Wickham & Henry, 2020), WRS2 (Mair & Wilcox, 2020).

1.1 A note on R functions and usage style

With the large set of packages needed for the suite of functions used in this document, it can be confusing for the novice R user to sort out which package a particular function comes from. I have used two strategies to reduce this confusion. Often, in the introductory text of a section, I refer to a “function from a package” (e.g., bar from foo). In other places I use the double colon convention to call a function from its package. For example, “psych::describe” calls the describe function from the psych package. Where neither form of specification appears, the function is typically in the base set of R packages that are installed on initial setup. For example, aov is in the base system stats package and is used without specifying that.
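
The two calling styles can be compared in a short sketch. The vector x is made up for illustration, and the psych call is guarded so that the code still runs if that package is not installed.

```r
x <- c(2, 4, 4, 5, 7)  # hypothetical data

# Unqualified call: R finds sd() on the search path (it lives in stats)
sd(x)

# Qualified call with the double colon convention
stats::sd(x)

# Both forms reach the same function
identical(sd(x), stats::sd(x))

# To find which package an unqualified function comes from,
# ask for the name of the namespace where it is defined
environmentName(environment(aov))

# Qualified call to a function from an add-on package
if (requireNamespace("psych", quietly = TRUE)) {
  psych::describe(x)
}
```

The double colon form also works when the package has not been attached with library(), and it avoids ambiguity when two loaded packages export functions with the same name.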

1.2 Resources

The following list will provide a good start for those needing a broader background in ANOVA techniques and more detailed sources for the primary packages employed in this document.

In addition, the following internet resources can be helpful.