Psychonomic Bulletin & Review, 1994, 1 (4), 476-490 | Geoffrey R. Loftus and Michael E. J. Masson
Loftus and Masson argue that confidence intervals can effectively supplement or replace traditional hypothesis testing in within-subject designs. Because within-subject analyses remove between-subject variance, the appropriate error term for such intervals is the subject × condition interaction. A confidence interval built on this term rests on the same error term as the analysis of variance (ANOVA) and therefore leads to comparable conclusions; it is also related to the confidence interval for a difference between two means by a factor of √2, which permits inferences about the pattern of population means.
In experimental psychology, most statistical analysis concerns the relationship between sample means and population means. In an ideal experiment, sample means would equal the population means and hypothesis testing would be unnecessary. In real experiments, however, population means can only be estimated, so statistical analysis is needed to determine how reliably the observed pattern of sample means reflects the underlying pattern of population means.
Hypothesis testing is the dominant method in the social sciences: a null hypothesis is formulated and a decision about it is made on the basis of sample data. Loftus and Masson suggest that graphical procedures, particularly confidence intervals, can supplement or replace hypothesis testing. They note that current hypothesis-testing practice arose as a compromise among conflicting approaches, including Bayesian techniques and what became null hypothesis significance testing (NHST).
NHST, as developed by Fisher, emphasizes inductive reasoning: it asks how likely the observed results would be if the null hypothesis were true. It differs from Bayesian methods, which update the probabilities of hypotheses in light of prior information. Neyman and Pearson extended the framework by introducing two competing hypotheses and the associated decision errors (Type I and Type II), emphasizing the role of error rates in setting significance levels.
The debate between Fisher and Neyman-Pearson remains unresolved, but a "silent solution" has emerged in the behavioral sciences, combining elements of both approaches. This solution includes specifying significance levels before data collection and using confidence intervals to assess the degree of belief in hypotheses.
Confidence intervals provide a direct answer to the question of how well sample means represent population means. They can supplement or replace hypothesis testing by offering a visual representation of the uncertainty around sample means. In within-subject designs, confidence intervals based on the subject × condition interaction are appropriate, as they reflect the variability due to the interaction rather than between-subject variance.
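As a concrete point of reference, the sketch below computes the conventional between-subject confidence interval around a single condition mean, mean ± t(n−1) × s/√n. It assumes NumPy and SciPy are available; the function name and data values are illustrative, not from the article.

```python
import numpy as np
from scipy import stats

def between_subject_ci(scores, confidence=0.95):
    """Conventional confidence interval for one condition mean,
    based on between-subject variability (s / sqrt(n))."""
    scores = np.asarray(scores, dtype=float)
    n = scores.size
    sem = scores.std(ddof=1) / np.sqrt(n)            # standard error of the mean
    t_crit = stats.t.ppf((1 + confidence) / 2, df=n - 1)
    m = scores.mean()
    return m - t_crit * sem, m + t_crit * sem

lo, hi = between_subject_ci([4.0, 6.0, 5.0, 7.0, 3.0])
```

In a within-subject design this interval is misleadingly wide, because it includes the between-subject variance that the within-subject analysis removes.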
In a hypothetical experiment, within-subject data showed a consistent study-time effect, yielding a significant ANOVA result. A confidence interval based on between-subject variance, however, would appear to conflict with that result, because it includes subject variability that the within-subject ANOVA removes. By normalizing the data to remove subject variability, a confidence interval based on the interaction variance can be computed, and it aligns with the ANOVA.
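The normalization step can be sketched as follows: each subject's mean is subtracted from that subject's scores and the grand mean is added back, which removes between-subject variability while leaving the condition means unchanged. The data values here are hypothetical.

```python
import numpy as np

# data[i, j] = score of subject i in condition j (hypothetical 3 x 3 example)
data = np.array([[10., 13., 13.],
                 [ 6.,  8., 10.],
                 [11., 14., 14.]])

# Subtract each subject's mean, then add back the grand mean.
subject_means = data.mean(axis=1, keepdims=True)
normalized = data - subject_means + data.mean()
```

After normalization every subject has the same mean (the grand mean), so the only variability left around the condition means is the subject × condition interaction.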
Confidence intervals in within-subject designs are therefore computed from the subject × condition interaction mean square, yielding intervals that reflect the relevant variability around the condition means. Such intervals give a clear picture of both the underlying pattern of population means and the statistical power of the design. They are related to confidence intervals around mean differences by a factor of √2, allowing inferences about population mean patterns.
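A minimal sketch of this computation, assuming NumPy and SciPy (the function name and sample data are illustrative): the half-width of the interval is t(df) × √(MS_S×C / n), where MS_S×C is the subject × condition interaction mean square and df = (n − 1)(k − 1).

```python
import numpy as np
from scipy import stats

def within_subject_ci_halfwidth(data, confidence=0.95):
    """Confidence-interval half-width for condition means, based on the
    subject x condition interaction mean square (Loftus-Masson style).
    data[i, j]: score of subject i in condition j."""
    data = np.asarray(data, dtype=float)
    n, k = data.shape
    grand = data.mean()
    ss_subj = k * ((data.mean(axis=1) - grand) ** 2).sum()   # subjects
    ss_cond = n * ((data.mean(axis=0) - grand) ** 2).sum()   # conditions
    ss_total = ((data - grand) ** 2).sum()
    df_inter = (n - 1) * (k - 1)
    ms_inter = (ss_total - ss_subj - ss_cond) / df_inter     # MS_SxC
    t_crit = stats.t.ppf((1 + confidence) / 2, df=df_inter)
    return t_crit * np.sqrt(ms_inter / n)

hw = within_subject_ci_halfwidth([[10., 13., 13.],
                                  [ 6.,  8., 10.],
                                  [11., 14., 14.]])
```

For an interval around the difference between two condition means, this half-width would be multiplied by √2.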
In multifactor designs, different effects are tested against different error terms, so a single confidence interval need not apply to every effect; a separate interval can be computed from each effect's error term, or a single interval from a pooled error term when the relevant interaction variances are comparable.