Chapter 8 Effect Sizes, Power, and Sample Size Planning

An important adjunct to NHST work with ANOVA is the reporting of effect sizes. R offers several approaches to computing effect size statistics, and all of them are fairly simple to use. Sample size planning can be accomplished with tools from the pwr package, although I tend to prefer the G*Power software for its broader capabilities.

8.1 Effect Sizes

In earlier sections of this document, we have already seen some effect size statistics. The multiple R-squared reported for ‘lm’ fit objects is also called eta-squared. The afex package functions give the “generalized eta squared” statistic.

Note that in a 1-way design, partial eta squared values will equal the eta squared value, and partial omega squared values will equal omega squared values.
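
The equivalence follows directly from the definitions. The symbols below are notation introduced here for illustration: SS_effect is the sum of squares for the factor and SS_error is the residual sum of squares.

$$
\eta^2 = \frac{SS_{\text{effect}}}{SS_{\text{total}}}, \qquad
\eta_p^2 = \frac{SS_{\text{effect}}}{SS_{\text{effect}} + SS_{\text{error}}}
$$

In a 1-way design the total sum of squares has only these two components, so the two denominators coincide. Using the sums of squares reported for the example data in the anova_stats output later in this chapter (233.867 for factora, 535.600 for the residuals):

$$
SS_{\text{total}} = SS_{\text{effect}} + SS_{\text{error}}
\;\Rightarrow\;
\eta^2 = \eta_p^2 = \frac{233.867}{233.867 + 535.600} \approx 0.304
$$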

Eta squared, omega squared, and Cohen’s f can be obtained quickly from the effectsize package.

One additional useful function, anova_stats from the sjstats package, is also shown here; it computes several commonly used effect sizes at once.

8.1.1 Use of the effectsize package

Two functions, eta_squared and omega_squared, permit confidence interval calculation for either full or partial effect size statistics. The partial effect size concept has no distinct meaning in 1-way ANOVAs, and that is reflected in the messages the functions print.

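# The confidence level is set with `ci` (a `ci.lvl` argument would be silently ignored); 0.90 matches the interval shown.
effectsize::omega_squared(fit.1, ci = .90)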
## For one-way between subjects designs, partial omega squared is equvilant to omega squared.
## Returning omega squared.
## Parameter | Omega2 |       90% CI
## ---------------------------------
## factora   |   0.25 | [0.02, 0.44]
effectsize::eta_squared(fit.1, ci = .90)  # `ci`, not `ci.lvl`; matches the 90% interval shown
## For one-way between subjects designs, partial eta squared is equvilant to eta squared.
## Returning eta squared.
## Parameter | Eta2 |       90% CI
## -------------------------------
## factora   | 0.30 | [0.06, 0.49]

Cohen’s f statistic is also provided by a function from the effectsize package:

effectsize::cohens_f(fit.1, ci = .90)  # `ci`, not `ci.lvl`; matches the 90% interval shown
## For one-way between subjects designs, partial eta squared is equvilant to eta squared.
## Returning eta squared.
## Parameter | Cohen's f |       90% CI
## ------------------------------------
## factora   |      0.66 | [0.26, 0.98]
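
These indices are all simple functions of the same ANOVA table, which can be checked by hand. The following is a minimal sketch, assuming the standard textbook formulas for a one-way between-subjects design; the sums of squares and degrees of freedom are typed in from the ANOVA table for fit.1 that appears later in this chapter, and the variable names are my own.

```r
# Sums of squares and degrees of freedom copied by hand from the one-way
# ANOVA table for fit.1 reported in this chapter.
ss_effect <- 233.867
ss_error  <- 535.600
df_effect <- 2
df_error  <- 27

ss_total <- ss_effect + ss_error
ms_error <- ss_error / df_error

# Eta squared: proportion of total variation attributable to the factor
eta2 <- ss_effect / ss_total

# Omega squared: a less biased estimate of the same population quantity
omega2 <- (ss_effect - df_effect * ms_error) / (ss_total + ms_error)

# Cohen's f expressed in terms of eta squared
cohens_f <- sqrt(eta2 / (1 - eta2))

round(c(eta2 = eta2, omega2 = omega2, cohens_f = cohens_f), 3)
```

The rounded results (0.304, 0.246, and 0.661) agree with the values the package functions report for these data.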

8.2 Use of the sjstats package

The anova_stats function strikes me as very useful. It not only returns the basic ANOVA table but also provides several effect size indices, including Cohen’s f. Note that the “power” value in the data frame returned by anova_stats is a post-hoc (observed) power calculation, a concept we have reviewed as problematic in the way it has often been applied. Refer to the stattoolkit bibliography for the literature on this.

kable(sjstats::anova_stats(fit.1))
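|         | term      | df | sumsq   | meansq  | statistic | p.value | etasq | partial.etasq | omegasq | partial.omegasq | epsilonsq | cohens.f | power |
|---------|-----------|----|---------|---------|-----------|---------|-------|---------------|---------|-----------------|-----------|----------|-------|
| factora | factora   | 2  | 233.867 | 116.933 | 5.895     | 0.008   | 0.304 | 0.304         | 0.246   | 0.246           | 0.252     | 0.661    | 0.874 |
| …2      | Residuals | 27 | 535.600 | 19.837  | NA        | NA      | NA    | NA            | NA      | NA              | NA        | NA       | NA    |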
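
Part of the problem with post-hoc power is that it is fully determined by the observed effect size and the degrees of freedom, so it adds no information beyond the F test itself. The sketch below illustrates this with base R only. It assumes the noncentrality parameter is the squared observed f times the total sample size, which is the convention pwr.anova.test uses; whether anova_stats performs exactly this computation internally is an assumption on my part.

```r
# Observed effect size and degrees of freedom, copied from the table above
f_obs     <- 0.661
df_effect <- 2
df_error  <- 27
n_total   <- df_effect + df_error + 1   # 30 observations in total

# Noncentrality parameter implied by the observed effect (assumed convention)
lambda <- f_obs^2 * n_total

# Critical F at alpha = .05, then the probability of exceeding it under the
# noncentral F distribution defined by the observed effect
f_crit <- qf(0.95, df_effect, df_error)
post_hoc_power <- pf(f_crit, df_effect, df_error, ncp = lambda, lower.tail = FALSE)
post_hoc_power
```

If the assumption holds, the result should land close to the value in the power column. Nothing beyond the observed f and the two df values went into it.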

The anova_stats function can work on either an anova object or an object of class Anova (from the car package). For larger factorial designs, this is an important way of obtaining effect sizes based on whichever SS Type is specified. In the above section, we passed the already-created anova fit object to anova_stats, but we can also do the whole analysis in one line of code, using the car package Anova function to specify SS Type. For our example here, the Type 1 and Type 3 SS are identical: the design is balanced, and with only one factor the distinction between Type 1 and Type 3 is not relevant in any case.

contrasts(hays$factora) <- contr.sum
kable(sjstats::anova_stats(Anova(aov(dv~factora, data=hays), type=3)))
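|         | term      | sumsq   | meansq  | df | statistic | p.value | etasq | partial.etasq | omegasq | partial.omegasq | epsilonsq | cohens.f | power |
|---------|-----------|---------|---------|----|-----------|---------|-------|---------------|---------|-----------------|-----------|----------|-------|
| factora | factora   | 233.867 | 116.933 | 2  | 5.895     | 0.008   | 0.304 | 0.304         | 0.246   | 0.246           | 0.252     | 0.661    | 0.874 |
| …2      | Residuals | 535.600 | 19.837  | 27 | NA        | NA      | NA    | NA            | NA      | NA              | NA        | NA       | NA    |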

8.2.1 Use of the lsr package

Another option for effect size calculation is the etaSquared function from the lsr package. It can take an argument that permits specification of SS Type, but that will not matter in this balanced design.

library(lsr)
#etaSquared(fit.1, type=1)
#etaSquared(fit.1, type=2)
etaSquared(fit.1, type=3)
##            eta.sq eta.sq.part
## factora 0.3039335   0.3039335

8.3 Power and sample size planning for completely randomized 1-factor ANOVA designs

The pwr package provides a fairly comprehensive way to estimate sample size requirements when designing studies. For a one-factor design, the logic of the code is very straightforward. In other work, we have seen how to use G*Power as well.

To use the pwr.anova.test function:

  • We need to tell it how many groups there are.
  • We need an estimate of the within-group standard deviation (this assumes homogeneity of variance).
  • We need a set of expected values for the group means. With these means and the within-group variation (the sd), we can compute Cohen’s effect size statistic (the “f”).
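
Written out, the effect size described in the last bullet is the standard deviation of the expected group means about their grand mean, divided by the common within-group standard deviation. This form assumes equal group sizes; the symbols (k groups, expected means μ_i, grand mean μ̄, within-group sd σ) are notation introduced here.

$$
f = \frac{\sigma_m}{\sigma}, \qquad
\sigma_m = \sqrt{\frac{1}{k}\sum_{i=1}^{k} (\mu_i - \bar{\mu})^2}
$$

As a worked check with three expected means of 25, 20, and 20 and a within-group sd of 5 (the values used in the code that follows):

$$
\bar{\mu} = 21.667, \qquad
\sigma_m = \sqrt{\frac{11.111 + 2.778 + 2.778}{3}} \approx 2.357, \qquad
f = \frac{2.357}{5} \approx 0.471
$$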

Since we have been working with a 3-group design, let’s see how we might have planned for that with the ‘pwr.anova.test’ function. I set the means and sd arbitrarily in this example. They would normally be chosen on the basis of prior information, perhaps from pilot studies or published literature.

It is instructive to “fiddle” with this code, changing the means and the sd to see how they affect the required n per group. This is an alternative to G*Power.

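library(pwr)                      # provides pwr.anova.test
groups = 3                        # number of groups (k)
means = c(25,20,20)               # expected group means
sd = 5                            # expected within-group sd (assumed equal across groups)
grand.mean  = mean(means)
# Cohen's "f": sd of the group means about the grand mean, divided by the within-group sd
efsize = sqrt( sum( (1/groups) * (means-grand.mean)^2) ) /sd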
efsize
## [1] 0.4714045
pwr.anova.test(k = groups,
               n = NULL,
               f = efsize,
               sig.level = 0.05,
               power = 0.90)
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 3
##               n = 20.01726
##               f = 0.4714045
##       sig.level = 0.05
##           power = 0.9
## 
## NOTE: n is number in each group
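
One way to “fiddle” systematically is to run the same calculation over a grid of plausible within-group standard deviations. The following is a minimal sketch, assuming the pwr package is installed; the grid of sd values is hypothetical and chosen only for illustration, and the variable names are my own.

```r
library(pwr)  # assumes the pwr package is installed

means <- c(25, 20, 20)                            # expected group means
k <- length(means)
sigma_m <- sqrt(mean((means - mean(means))^2))    # sd of the means about the grand mean

sd_grid <- c(4, 5, 6, 8)                          # hypothetical within-group sd values

# Solve for n per group at each sd, rounding up to a whole participant
n_per_group <- sapply(sd_grid, function(s) {
  res <- pwr.anova.test(k = k, f = sigma_m / s, sig.level = 0.05, power = 0.90)
  ceiling(res$n)
})

data.frame(sd = sd_grid, f = round(sigma_m / sd_grid, 3), n_per_group)
```

Because f shrinks as the within-group sd grows, the required n per group rises quickly across the grid. The row for sd = 5 reproduces the calculation above, rounded up to 21 per group.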

This and other pwr functions work by passing all but one of the relevant characteristics to the function. By leaving out sample size and passing alpha, power, and effect size, we obtain the minimum sample size per group required to achieve that power. Since n must be a whole number, the returned value is rounded up (here, to 21 per group).
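
The same logic runs in the other direction. As a minimal sketch, assuming the pwr package is installed and using the same expected means and sd as above, leaving out power and supplying n instead returns the power that a given sample size would provide. The variable names are my own.

```r
library(pwr)  # assumes the pwr package is installed

means <- c(25, 20, 20)     # expected group means
sd_within <- 5             # expected within-group sd
f <- sqrt(mean((means - mean(means))^2)) / sd_within   # Cohen's f

# Supply n and omit power: the function solves for power instead
res <- pwr.anova.test(k = 3, n = 20, f = f, sig.level = 0.05)
res$power
```

Because 20 per group is just under the 20.02 found above, the returned power should fall just short of .90.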