Confidence Intervals from Normalized Data: A correction to Cousineau (2005)




Richard D. Morey
University of Missouri-Columbia
Presenting confidence intervals around means is a common method of expressing uncertainty in data. Loftus and Masson (1994) describe confidence intervals for means in within-subjects designs. These confidence intervals are based on the ANOVA mean squared error. Cousineau (2005) presents an alternative to the Loftus and Masson method, but his method produces confidence intervals that are smaller than those of Loftus and Masson. I show why this is the case and offer a simple correction that makes the expected size of Cousineau confidence intervals the same as that of Loftus and Masson confidence intervals.
Confidence intervals (CIs) are a staple in the presentation of psychological data because they allow researchers to quickly gauge the amount of uncertainty in data (Rouder & Morey, 2005). For within-subjects designs, there are several approaches to creating confidence intervals, and for a given design it may not be clear which to choose. Consider a simple within-subjects design with two conditions, a pre-test and a post-test. For this design, there are multiple methods of generating confidence intervals. I will discuss each in turn.

Approaches to confidence intervals
The standard way to build confidence intervals is to compute the standard error of the mean for each condition and multiply it by the appropriate t-distribution quantile. To make this concrete, Table 1 lists sample results for this design; with N = 10 participants, the appropriate quantile is t(9) = 2.26. Multiplying each condition's standard error by this quantile gives the size of the CI around the mean. The resulting confidence intervals are labeled "A" in Figure 1. The size of the confidence intervals is larger than what might be expected from the results of the t-test: the t-test indicates a large, significant effect, but the confidence intervals overlap. The reason for this is that each observation reflects three sources of variance: the fixed effect of condition, the random ability of each participant, and random error. The size of the CIs labeled "A" is determined by both participant variance and error variance. The paired t-test, in contrast, accounts for participant variability in computing the significance of the effect.

Loftus and Masson (1994) suggested that confidence intervals in within-subjects designs be based on the denominator mean square (MS) from the appropriate ANOVA analysis. Because the MS is an unbiased estimate of the appropriate error variance (Moser, 1996; Loftus & Masson, 1994), using the MS yields a confidence interval uncontaminated by variance from participants. To build Loftus and Masson CIs, one computes a within-subjects standard error by dividing the ANOVA denominator MS by N and taking the square root. Using this standard error in place of the standard error of each condition, then multiplying by the appropriate t quantile, yields a within-subjects confidence interval. The CIs labeled "B" in Figure 1 are Loftus and Masson confidence intervals. These confidence intervals are much tighter and better reflect our conclusions from the t-test analysis.
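As a concrete sketch of both computations, the following Python fragment computes the standard per-condition CI half-widths and the Loftus and Masson within-subjects half-width for a two-condition design. The scores are made up for illustration (they are not the values from Table 1); the quantile t(9) = 2.26 from the text is hardcoded.

```python
import statistics

# Made-up pre/post scores for N = 10 participants (illustrative only;
# not the values from the paper's Table 1).
pre  = [10, 12, 13, 11, 14, 12, 11, 13, 12, 12]
post = [13, 14, 16, 13, 16, 15, 14, 15, 14, 15]
N, M = len(pre), 2
t_crit = 2.26  # t quantile with N - 1 = 9 df, as in the text

def standard_halfwidth(scores):
    """Usual CI half-width: t quantile times the condition's standard error."""
    return t_crit * statistics.stdev(scores) / N ** 0.5

# Loftus & Masson: use sqrt(MS_error / N) as the standard error, where
# MS_error is the denominator mean square of the subjects x conditions ANOVA.
subj_means = [(a + b) / M for a, b in zip(pre, post)]
cond_means = [statistics.mean(pre), statistics.mean(post)]
grand = statistics.mean(pre + post)
ss_error = sum((y - subj_means[i] - cond_means[j] + grand) ** 2
               for j, cond in enumerate((pre, post))
               for i, y in enumerate(cond))
ms_error = ss_error / ((N - 1) * (M - 1))
lm_halfwidth = t_crit * (ms_error / N) ** 0.5
# For these data lm_halfwidth is much smaller than standard_halfwidth(pre),
# because the consistent participant differences have been removed.
```

Because the pre/post differences here are nearly constant across participants, the within-subjects half-width is several times smaller than the per-condition ones, mirroring the contrast between intervals "A" and "B" in Figure 1.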
Loftus and Masson's method has two primary drawbacks: first, all CIs are the same size, so any heterogeneity of variance is hidden; second, it may be inconvenient to extract the appropriate MS and force a statistical program, such as SPSS, to use it to build CIs. Cousineau (2005) suggested another method for constructing confidence intervals in within-subjects designs. Cousineau's method is simple and seems reasonable: normalize the data by subtracting the appropriate participant's mean performance from each observation and then adding the grand mean to every observation. Then, use the normalized data to build confidence intervals using the standard method described above.
In notation, the Cousineau method can be described as follows. Let y_ij be the ith participant's score in the jth condition (i = 1, ..., N; j = 1, ..., M). Then define the normalized observations z_ij as

z_ij = y_ij − ȳ_i· + ȳ··,    (1)

where ȳ_i· is the mean of participant i's scores and ȳ·· is the grand mean. The variances of the resulting normalized values in a condition, and thus the size of the CIs, no longer depend on the participant effects (Loftus & Masson, 1994; Masson & Loftus, 2003). The intervals labeled "C" in Figure 1 were generated by Cousineau's method. The primary benefit of the normalization method over basing the CIs on the MS is ease of use and intuitiveness, and Cousineau provides SPSS code to easily generate the normalized scores.
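The normalization in Eq. 1 is a one-line transformation per observation. A minimal Python sketch, with invented numbers, is:

```python
import statistics

# Normalize each score per Cousineau (2005): subtract the participant's own
# mean, add the grand mean. The data values are invented for illustration.
data = [  # rows are participants, columns are conditions
    [10, 13],
    [12, 14],
    [13, 16],
    [11, 13],
]
grand = statistics.mean(y for row in data for y in row)
normalized = [[y - statistics.mean(row) + grand for y in row] for row in data]
# Every participant's normalized mean now equals the grand mean, so
# between-participant differences no longer inflate the per-condition CIs,
# while each condition's mean is left unchanged.
```

Because the participant means are all shifted to the grand mean, any CI built from the normalized columns reflects only error variance (plus the bias discussed next), not participant variance.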

Inconsistency between approaches
Examination of Figure 1 reveals a mismatch between the size of Loftus and Masson style CIs and CIs based on normalized data. The reason that CIs based on normalized scores will be smaller on average is that normalizing scores induces positive covariance between scores within a condition. Because all of the data are used in the computation of a single normalized observation (see Eq. 1), the observations become correlated. If scores are correlated, the sample variance will be biased low, because positively correlated scores will not vary as much from one another as expected. Thus, even though both Loftus and Masson CIs and Cousineau CIs are functions of error variance alone, Loftus and Masson CIs are based on an unbiased estimate, while Cousineau CIs are based on an estimate biased to be small. All CIs generated from normalized scores will be "too small" with respect to Loftus and Masson CIs. Loftus and Masson (1994) noted this in their Appendix, and a general proof of this fact is provided in the Appendix of this manuscript.
The size of Cousineau CIs relative to Loftus and Masson CIs is a function of the number of within-subjects conditions. The expected value of the sample variance in a condition of normalized data is

E[ SS_j / (N − 1) ] = ((M − 1) / M) σ²_ε,    (2)

where σ²_ε is the error variance, SS_j is the sum of squares of the normalized data in condition j, and M is the number of within-subjects conditions. The worst-case bias occurs for M = 2, and the bias decreases as M increases.

An easy correction
Eq. 2 suggests a way of correcting the bias of Cousineau CIs. Compute the sample variance of the normalized data as Cousineau suggests, then simply multiply the sample variance in each condition by M/(M − 1). Multiplying by this correction factor will bring the size of Cousineau CIs into line with Loftus and Masson CIs. For M = 2, the resulting corrected CIs will be exactly the same size as the Loftus and Masson intervals; the corrected Cousineau CIs for the hypothetical data are represented by "B" in Figure 1. For M > 2, corrected CIs will not be exactly the same size in general, but their expected size is the same as that of Loftus and Masson CIs. This simple correction also holds for more complicated designs. It can accommodate any number of fixed effects; simply replace M with the product M_1 M_2 ... M_P, where P is the number of within-subjects fixed factors and M_p is the number of levels of the pth factor. If there are between-subjects factors, each level of the between-subjects factor should be considered separately. Using this simple correction, the ease of use of the normalizing approach is retained, while the expected size of the Loftus and Masson intervals is also retained.
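The correction itself is a single multiplication; the helper names below are illustrative, not part of any published code:

```python
# Sketch of the correction: multiply each condition's normalized-data variance
# by M/(M - 1), where M is the product of the numbers of levels of the
# within-subjects fixed factors.

def correction_factor(*factor_levels):
    """M / (M - 1), with M the product of the within-subjects factor levels."""
    m = 1
    for levels in factor_levels:
        m *= levels
    return m / (m - 1)

def corrected_variance(sample_variance, *factor_levels):
    """Bias-corrected variance of normalized data in one condition."""
    return sample_variance * correction_factor(*factor_levels)

# For M = 2 the variances double, so CI widths grow by a factor of sqrt(2);
# for a 2 x 3 within-subjects design, M = 6 and the factor is 6/5.
```

Since CI half-widths scale with the standard error, applying the factor to the variance stretches each Cousineau interval by the square root of M/(M − 1).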

General Discussion
The use of within-subjects confidence intervals in data presentation is an important way of communicating the amount of uncertainty in data. In within-subjects designs, a researcher may desire CIs whose size accords more with the result of inferential tests. Loftus and Masson (1994) and Cousineau (2005) provide methods of building these confidence intervals. Cousineau's method has two advantages: first, it is easy to compute; second, it may show violations of homogeneity of variance (Cousineau, 2005).
Understanding the properties of the methods used to create CIs is a vital part of understanding graphical presentations of data. I have shown that within-subjects confidence intervals based on normalized data are small relative to Loftus and Masson within-subjects confidence intervals. Fortunately, the bias in normalized CIs is easily corrected; the simple correction suggested above will make the expected size of Cousineau CIs the same as the commonly used Loftus and Masson CIs.
Appendix

Assume the data follow the usual additive model

y_ij = μ + α_i + β_j + ε_ij,

where the α_i (i = 1, ..., N) are random participant effects, the β_j (j = 1, ..., M) are fixed condition effects (i.e., Σ_j β_j = 0), and all ε_ij and α_i are mutually independent, with α_i ~ Normal(0, σ²_α) and ε_ij ~ Normal(0, σ²_ε). Let the multivariate normal data vector y be

y = (y_11, y_21, ..., y_N1, y_12, ..., y_NM)′.    (3)

Given the model, the covariance matrix for y is

Σ_y = σ²_α (J_M ⊗ I_N) + σ²_ε (I_M ⊗ I_N),

where I_N is the N × N identity matrix, J_M is an M × M matrix of all 1s, and ⊗ is the Kronecker product operator.
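The block structure of this covariance matrix can be made concrete with a small numeric sketch, assuming (as in Eq. 3) that y is stacked condition by condition with participants varying fastest; plain lists and a hand-rolled Kronecker product are used so no external libraries are needed:

```python
# Build Sigma_y = sigma2_alpha (J_M kron I_N) + sigma2_eps (I_M kron I_N)
# for a toy design, to show the block structure.

def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def ones(n):
    return [[1.0] * n for _ in range(n)]

N, M = 3, 2                   # participants, conditions (toy sizes)
s2_alpha, s2_eps = 2.0, 1.0   # participant and error variances (arbitrary)

J_part = kron(ones(M), identity(N))      # 1 wherever the participant matches
I_full = kron(identity(M), identity(N))  # NM x NM identity
Sigma = [[s2_alpha * J_part[r][c] + s2_eps * I_full[r][c]
          for c in range(N * M)] for r in range(N * M)]
# Diagonal entries equal s2_alpha + s2_eps; same-participant entries across
# different conditions equal s2_alpha; all other entries are 0.
```

Inspecting Sigma shows directly why raw scores within a participant covary (the s2_alpha blocks) while scores from different participants do not.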
If the size of both types of CIs is based only on the error variance, why are normalized CIs smaller than those based on the MS? The MS from the ANOVA analysis has properties that are well understood in linear model theory. There is very little discussion in the literature, however, of the properties of CIs built from normalized scores. If CIs based on normalized data are to be useful, it is important to understand the reason for the mismatch between Loftus and Masson CIs and Cousineau CIs.

Table 1. Sample results used in Figure 1.