1  Background and R Setup

This document provides a template for using R to evaluate data from a 1-factor design, typically called a 1-way ANOVA problem. The completely randomized design used for the initial illustration here is a 3-group design. These initial data come from an exercise in the classic Hays textbook. Later chapters use other data sets that have more treatment conditions.

The standard R axiom that there are always multiple ways of performing any task is never more accurate than with ANOVA models. The document begins with graphical depiction and extends to standard NHST inferences, contrast analysis and post hoc tests, and evaluation of assumptions. It also includes some rudimentary Bayesian approaches to inference.

This document is constantly under development.

One of the primary goals is to reproduce all the work we have accomplished with the SPSS REGRESSION, GLM, MANOVA and ONEWAY procedures (and then some).

Several R packages are required:

#if (!requireNamespace("BiocManager", quietly = TRUE))
#    install.packages("BiocManager")
#BiocManager::install("Biobase", version = "3.8")
# load packages
library(afex)
library(asbio)
library(BayesFactor)
library(beeswarm)
library(car)
library(coin)
library(dunn.test)
library(effectsize)
library(emmeans)
library(ez)
library(DTK)
library(ggdist)
library(gghalves)
library(ggplot2)
library(ggrain)
library(ggthemes)
library(ggstatsplot)
library(granova)
library(gridExtra)
library(gt)
library(KScorrect)
library(knitr)
library(lattice)
library(lawstat)
library(lmboot)
library(lmPerm)
library(lsr)
library(multcomp)
library(multtest)
library(mutoss)
library(nortest)
library(outliers)
library(pgirmess)
library(plotrix)
library(plyr)
library(psych)
library(pwr)
library(rcompanion)
library(Rmisc)
library(sciplot)
library(sdamr)
library(sjstats)
library(userfriendlyscience)
library(WRS2)
library(dplyr)

Package citations for packages loaded here (in approximately the above order): afex (Singmann, Bolker, Westfall, & Aust, 2018), asbio (Aho, 2019), BayesFactor (Morey & Rouder, 2018), beeswarm (Eklund, 2016), car (Fox, Weisberg, & Price, 2018), coin (Hothorn, Hornik, van de Wiel, Winell, & Zeileis, 2017), effectsize (Ben-Shachar, Makowski, & Lüdecke, 2021), emmeans (Lenth, 2019), ez (Lawrence, 2016), DTK (Lau, 2013), dunn.test (Dinno, 2017), ggdist (Kay, 2024), gghalves (Tiedemann, 2022), ggplot2 (Wickham et al., 2018), ggrain (Judd, van Langen, & Kievit, 2024), ggthemes (Arnold, 2018), ggstatsplot (Patil, 2021), granova (Pruzek & Helmreich, 2014), gridExtra (Auguie, 2017), gt (Iannone, Cheng, & Schloerke, 2019), KScorrect (Novack-Gottshall & Wang, 2018), knitr (Xie, 2018), lattice (Sarkar, 2018), lawstat (Gastwirth et al., 2017), lmPerm (Wheeler & Torchiano, 2016), lsr (Navarro, 2015), multcomp (Hothorn, Bretz, & Westfall, 2017), multtest (Pollard, Gilbert, Ge, Taylor, & Dudoit, 2018), mutoss (Team et al., 2017), nortest (Gross & Ligges, 2015), outliers (Komsta, 2011), pgirmess (Giraudoux, 2018), plotrix (Lemon et al., 2018), plyr (Wickham, 2016), psych (Revelle, 2019), pwr (Champely, 2018), rcompanion (Mangiafico, 2019), Rmisc (Hope, 2013), sciplot (Morales, R Development Core Team, R-help listserv community, & Duncan Murdoch, 2017), sdamr (Speekenbrink, 2022), sjstats (Lüdecke, 2019), userfriendlyscience (Peters, 2017), WRS2 (Mair & Wilcox, 2018), dplyr (Wickham, François, Henry, & Müller, 2019).
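
Because the package list is long, it can help to check which packages are already installed before running the library calls. The following is a minimal sketch, not part of the original workflow: the vector pkgs is a hypothetical subset of the full list and should be extended as needed, and the install step is left commented out.

```r
# Hypothetical subset of the packages loaded in this chapter; extend as needed.
pkgs <- c("afex", "car", "emmeans", "psych")

# system.file() returns "" when a package is not installed, so this
# checks availability without attaching or loading anything.
installed <- vapply(pkgs,
                    function(p) nzchar(system.file(package = p)),
                    logical(1))

missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All listed packages are installed.")
}
```

Packages that have been archived or that come from Bioconductor will not install with the commented line and need the approaches described in the next section.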

1.1 A note on R version and package installations

R packages undergo constant revision, and some code here may be deprecated or need slight modification with more recent versions of some packages. RStudio makes it simple to update packages. Users can also install the most recent versions of R packages (or archived versions, if the packages are no longer maintained on CRAN) from source files rather than binaries, when source files are available. The general process is to download the appropriate source file (ending in “tar.gz”) from the repository and then use this function to install the package:

#install.packages(file.choose(), repos=NULL, type="source")

Note that Windows users will need to install the Rtools suite of tools before source package installation is attempted.

https://cran.r-project.org/bin/windows/Rtools/

RStudio may permit direct installation from source.

Two packages that are required for permutation tests and bootstrapping, lmPerm and lmboot, may have been archived on CRAN. If so, their source files can be obtained by searching CRAN for the package name.
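
As an alternative to downloading the file by hand, install.packages() can also be given the address of a source tarball when repos = NULL. The sketch below only builds such an address, assuming the usual CRAN Archive layout; the helper archive_url() is hypothetical, and the version string "x.y.z" is a placeholder that must be replaced with a real version number found on the package's CRAN archive page.

```r
# Hypothetical helper: build the CRAN Archive address of a source tarball.
archive_url <- function(pkg, version) {
  sprintf("https://cran.r-project.org/src/contrib/Archive/%s/%s_%s.tar.gz",
          pkg, pkg, version)
}

# "x.y.z" is a placeholder, not a real version number.
url <- archive_url("lmPerm", "x.y.z")
print(url)

# Once a real version number is filled in, uncomment to install from source:
# install.packages(url, repos = NULL, type = "source")
```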

Three packages come from the Bioconductor suite of R packages, and the core Bioconductor installer (BiocManager) should also be installed.
https://www.bioconductor.org/

Search for the pages of each of these four to download and install the latest package source files. By the time you read this, however, the normal process of installing the binary files may work (see the BiocManager page).

BiocManager

Biobase

BiocGenerics

multtest
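
When the normal installation route does work, the packages listed above can be installed through BiocManager rather than from downloaded source files. This is a sketch under that assumption: the wrapper install_bioc() is a hypothetical convenience function, and the call is left commented out so that nothing is installed unintentionally.

```r
# The three Bioconductor packages named above
bioc_pkgs <- c("Biobase", "BiocGenerics", "multtest")

# Hypothetical wrapper: install BiocManager from CRAN if needed,
# then use it to install the Bioconductor packages.
install_bioc <- function(pkgs = bioc_pkgs) {
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  BiocManager::install(pkgs)
}

# install_bioc()  # uncomment to run the installation
```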

1.2 Resources

The following list will provide a good start for those needing a broader background in ANOVA techniques and more detailed sources for the primary packages employed in this document.

  • Salvatore S. Mangiafico’s R Companion: [https://rcompanion.org/rcompanion/d_05.html]
  • Martin Schweinberger’s Blog: [http://www.martinschweinberger.de/blog/one-way-anova/]
  • cwoods on RPub: [https://rpubs.com/cwoods/anova]
  • Daniel Wollschläger’s R Examples Repository [http://dwoll.de/rexrepos/posts/anovaCRp.html]

1.3 A note on R coding style

In this document, a great many functions from a great many packages are used. Sometimes a package uses the same name for a function that is used in another package. To reduce ambiguity, I have attempted to call functions in a consistent way throughout the code.

Normally, if a package is loaded, we can write code that just calls the function. For example, here is how one can call the describe function to analyze an object/dataframe/variable:

describe(variable_name)

But a describe function exists in multiple packages. Readers would not necessarily know which package the describe function employed here came from unless the text or a comment in the code chunk identified it. When several loaded packages contain a function with the same name, the most recently loaded package takes priority and masks the others.

So, in order to add clarity, I have tried to use the pkgname::functionname convention. Preceding the function name with the package name and a double colon executes the function from that package. This works even if the package has not previously been loaded with the library function, as long as it is installed. For example:

psych::describe(variable_name)

The exception is functions that are found in the base R packages rather than in add-on packages. For those I don’t use the :: approach.
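
The masking behavior and the :: convention can be illustrated without any add-on packages. In this sketch, a user-defined function named sd stands in for a function from a later-loaded package; it is only an analogy for package masking, and the object names are chosen for illustration.

```r
# A user-defined sd() masks stats::sd(), much as a later-loaded package
# masks an earlier one that has a function of the same name.
sd <- function(x) "masked version"

# List every place on the search path that defines an object named sd
where <- utils::find("sd")
print(where)

unqualified <- sd(c(1, 2, 3))         # the first match wins: our version
qualified   <- stats::sd(c(1, 2, 3))  # explicitly the version in stats

print(unqualified)
print(qualified)

# :: also reaches an installed package that has not been attached with
# library(); tools ships with R but is not attached by default.
ext <- tools::file_ext("scores.csv")
print(ext)

rm(sd)  # remove the masking function again
```

The same check, utils::find("describe"), shows which attached packages supply a describe function in a given session.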