T-Tests and One-Way ANOVA
T-tests and analysis of variance (ANOVA) are widely used statistical methods for comparing group means. Both are parametric techniques, meaning they rest on a number of assumptions: a normally distributed population; a dependent variable measured on a continuous interval or ratio scale; random sampling of data; observations that are independent of one another; and homogeneity of variance (population means may differ, but all populations should have the same standard deviation). The independent variable is categorical.
Both t-tests and analysis of variance (ANOVA) procedures are used to test hypotheses – by means of the null hypothesis and alternative hypothesis. The researcher asks: Does the observed variation represent a real difference between the two populations, or just a chance difference in the samples? The null hypothesis asserts that there is no difference between the population groups and that any observed variation is due to chance alone. The rival hypothesis is the alternative (research) hypothesis, which asserts that an observed effect is genuine.
Assuming that the null hypothesis is true, what is the probability of obtaining the observed value for the test statistic? Statistical significance (p value ≤ 0.05) is a possible finding of both the t-test statistic and the F-ratio statistic. It would indicate that a result as extreme as the one observed is unlikely to have occurred by chance alone. Therefore, the null hypothesis would be rejected, and the alternative hypothesis supported.
The t-test is used to test differences in means between two groups. The t-test is used when the dependent variable is a continuous interval/ratio scale variable (such as total self-esteem) and the independent variable is a two-level categorical variable (such as gender). The t-test can be used even if sample sizes are very small, as long as the variables within each group are normally distributed and the variation of scores within the two groups is equal (no reliable differences). With the t-test, the test statistic used to generate p values has a Student's t distribution, with n-1 degrees of freedom for a paired-samples test (n pairs) and n1 + n2 - 2 degrees of freedom for an independent-samples test with equal variances.
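As a rough illustration of the independent-samples case, the sketch below computes a pooled-variance t statistic and its degrees of freedom using only Python's standard library. The function name `pooled_t` and the two small score lists are made up for illustration. It assumes equal variances in the two groups, and it stops at the test statistic: turning t into a p value requires the t distribution, which a statistics package would normally supply.

```python
import math
import statistics

def pooled_t(group_a, group_b):
    """Return (t, df) for an equal-variance independent-samples t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    # Standard error of the difference between the two group means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t_stat = (statistics.mean(group_a) - statistics.mean(group_b)) / se_diff
    return t_stat, df

# Hypothetical scores, not data from any study.
treatment = [12, 15, 11, 14, 13]
control = [10, 9, 12, 8, 11]
t_stat, df = pooled_t(treatment, control)
print(f"t({df}) = {t_stat:.2f}")
```

The resulting t would then be compared against a t distribution with `df` degrees of freedom to obtain the p value.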
The statistical t-test procedure is used to determine a p-value that indicates how likely it is that the results would be obtained by chance. If there is a ≤ 5% chance of getting the observed differences by chance, the null hypothesis is rejected because a statistically significant difference was found between the two groups.
The t-test can be used with two independent groups (independent samples t-test) and when the sample is paired or dependent (paired samples t-test). Independent samples are usually two groups chosen by random selection. Dependent samples are two groups matched on some variable (such as gender or age) or the same group being tested twice (repeated measures).
The two-sample t-test simply tests whether or not two independent populations have different mean values on some measure. An example of an independent samples t-test is evaluating differences in test scores between a group of patients who were given a treatment intervention and a control group who received a placebo. An example of a paired samples t-test is computing differences in test scores on the same sample of patients using a pretest-posttest design (such as measuring pretreatment and posttreatment cholesterol levels).
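For the paired design, the test works on the within-person differences rather than on the two sets of scores separately. A minimal sketch, assuming hypothetical pretreatment and posttreatment cholesterol values and a made-up helper name `paired_t`:

```python
import math
import statistics

def paired_t(before, after):
    """Return (t, df) for a paired-samples t-test on matched scores."""
    # One difference per subject: positive values mean a decrease over time.
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference.
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1

# Hypothetical cholesterol levels for five patients (illustration only).
pretreatment = [220, 240, 210, 250, 230]
posttreatment = [210, 225, 205, 230, 220]
t_paired, df_paired = paired_t(pretreatment, posttreatment)
print(f"t({df_paired}) = {t_paired:.2f}")
```

Because each subject serves as his or her own control, the degrees of freedom are the number of pairs minus one.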
Whereas statistical significance determines how likely an observed finding occurred by chance, effect size measures the strength of relationship between two variables. Effect size is a population effect and its indices are independent of sample size. The effect size statistic for the independent-samples t-test is either Cohen's d or eta squared. The effect size (d) is the difference between the two population means, divided by the estimated population standard deviation. The formula for eta squared is $\eta^2 = \dfrac{t^2}{t^2 + (N_1 + N_2 - 2)}$.
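Both effect size measures can be sketched in a few lines. The example below is illustrative only: the function names and data are hypothetical, and Cohen's d is computed with the pooled standard deviation, which is one common way of estimating the population standard deviation.

```python
import math
import statistics

def cohens_d(group_a, group_b):
    """Mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / (n_a + n_b - 2)
    return (statistics.mean(group_a) - statistics.mean(group_b)) / math.sqrt(pooled_var)

def eta_squared_from_t(t_stat, n_a, n_b):
    """Eta squared for an independent-samples t-test: t^2 / (t^2 + N1 + N2 - 2)."""
    return t_stat ** 2 / (t_stat ** 2 + (n_a + n_b - 2))

# Hypothetical scores; t = 3.0 is the pooled-variance t statistic for these lists.
treatment = [12, 15, 11, 14, 13]
control = [10, 9, 12, 8, 11]
d = cohens_d(treatment, control)
eta_sq = eta_squared_from_t(3.0, len(treatment), len(control))
print(f"d = {d:.2f}, eta squared = {eta_sq:.2f}")
```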
To ascertain how precise an estimate (for instance, the mean) is, a confidence interval (CI) is formulated. The CI is constructed around a sample mean or another statistic to establish a range of values for the unknown population parameter (mean or mean difference), together with the degree of confidence that the range contains that parameter. The 95% or 99% CI is most commonly used.
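A confidence interval is the estimate plus or minus a critical value times the standard error. The sketch below uses the normal (z) critical value from Python's standard library, which is a large-sample approximation; with small samples the critical value should come from the t distribution instead. The function name and the numbers are hypothetical.

```python
from statistics import NormalDist

def normal_ci(estimate, std_error, confidence=0.95):
    """Approximate CI: estimate +/- z * SE (large-sample assumption)."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return estimate - z * std_error, estimate + z * std_error

# Hypothetical mean difference of 3.0 with a standard error of 1.0.
lower, upper = normal_ci(3.0, 1.0)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```

A 99% interval (`confidence=0.99`) comes out wider, since more confidence requires a larger critical value.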
When a researcher reports the results from an independent or paired-samples t-test, he or she needs to include the following information: verification of parametric assumptions; dependent variable scores; independent variable, levels; statistical data: significance, t-scores, probability, group means, group standard deviations, mean differences, confidence intervals, and effect size. Examples are below.
Presenting the results for independent-samples t-test
An independent-samples t-test was conducted to compare the sleepiness scores for males and females. There was no significant difference in scores for males (M = 31.04, SD = 2.36) and females (M = 34.53, SD = 3.22); t (588) = 1.62, p = .14 (two-tailed). The magnitude of the differences in the means (mean difference = 3.49, 95% CI: -1.80 to 1.87) was very small (eta squared = .008).
Presenting the results for paired-samples t-test
A paired-samples t-test was conducted to evaluate the impact of the intervention on students' scores on the Fear of Statistics Test (FOST). There was a statistically significant decrease in FOST scores from Time 1 (M = 39.16, SD = 4.25) to Time 2 (M = 35.55, SD = 4.35), t (32) = 5.12, p < .0005 (two-tailed). The mean decrease in FOST scores was 3.61 with a 95% confidence interval ranging from 1.45 to 4.38. The eta squared statistic (.50) indicated a large effect size.
While the t-test is used to compare the means between two groups, ANOVA is a statistical procedure used to compare means between three or more groups. Analysis of variance (ANOVA), despite its name, is concerned with differences between means of groups, not differences between variances. The term analysis of variance comes from the way the procedure uses variances to decide whether the means are different.
The ANOVA statistical procedure examines what the variation (difference) is within the groups (SSw), then examines how that variation translates into variation between the groups (SSb), taking into account how many subjects there are in the groups (degrees of freedom). If the observed differences are greater than what is likely to occur by chance, then there is statistical significance.
The statistic computed in ANOVA to generate p-values is the F-ratio, the ratio of the mean square between groups to the mean square within groups: F = MSb / MSw (each mean square = SS / df). Like the t, F depends on degrees of freedom to determine probabilities and critical values. The F statistic and the p-value depend on the variability of the data within groups and the differences among the means.
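The arithmetic behind the F-ratio can be sketched as follows. The function name `one_way_anova` and the three small groups are hypothetical, and the code stops at F; the p-value would come from the F distribution with the two degrees of freedom shown.

```python
import statistics

def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, and F for a one-way ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = statistics.mean(all_scores)
    # SSb: how far each group mean lies from the grand mean, weighted by group size.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # SSw: how far each score lies from its own group mean.
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_between": ss_between, "ss_within": ss_within,
            "df_between": df_between, "df_within": df_within, "F": f_ratio}

# Hypothetical scores for three groups (illustration only).
result = one_way_anova([[4, 5, 6], [6, 7, 8], [8, 9, 10]])
print(f"F({result['df_between']}, {result['df_within']}) = {result['F']:.2f}")
```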
The null hypothesis for ANOVA is that the population mean (average value of the dependent variable) is the same for all groups. In other words, there are no differences among the group means. The alternative hypothesis is that the average is not the same for all groups. A significant F test means the null hypothesis is rejected – the population means are not equal. When the null hypothesis is true, the F-ratio is approximately 1. When the alternative hypothesis is true, the F statistic tends to be large.
The F test is always one-sided because any differences among the group means tend to make F large. The ANOVA F test shares the robustness of the two-sample t test.
With ANOVA, if the null hypothesis is rejected, then it is known that at least two groups are different from each other. It is not known specifically which of the groups differ. In order to determine which groups differ, post-hoc t-tests are performed using some form of correction (such as the Bonferroni correction) to adjust for an inflated probability of a Type I error (false positive conclusion).
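The Bonferroni correction itself is simple arithmetic: divide the overall significance level by the number of comparisons being made. A small sketch, with hypothetical group labels and a made-up helper name:

```python
from itertools import combinations

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / n_comparisons

# Three groups give three possible pairwise comparisons.
group_labels = ["Group 1", "Group 2", "Group 3"]
pairs = list(combinations(group_labels, 2))
adjusted_alpha = bonferroni_alpha(0.05, len(pairs))

for first, second in pairs:
    print(f"{first} vs {second}: significant only if p <= {adjusted_alpha:.4f}")
```

With k groups there are k(k-1)/2 pairwise comparisons, so the per-comparison threshold becomes stricter quickly as groups are added.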
Effect size for ANOVA is determined by estimating eta squared. Eta squared is calculated by dividing the sum of squares between (SSb) by the total sum of squares (SSt) and it indicates the proportion of variance explained in ANOVA.
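Eta squared can be computed directly from the sums of squares or, when only the F-ratio and its degrees of freedom are reported, recovered algebraically (since SSb = MSb × dfb and SSw = MSw × dfw). The sketch below uses hypothetical function names; the second call plugs in the F (2, 432) = 4.6 reported in the one-way ANOVA example later in this article.

```python
def eta_squared(ss_between, ss_total):
    """Proportion of total variance explained by group membership."""
    return ss_between / ss_total

def eta_squared_from_f(f_ratio, df_between, df_within):
    """Same quantity recovered from a reported F-ratio and its df."""
    return (f_ratio * df_between) / (f_ratio * df_between + df_within)

# Hypothetical sums of squares: SSb = 24, SSw = 6, so SSt = 30.
eta_from_ss = eta_squared(24, 24 + 6)

# Reported result from the example in this article: F (2, 432) = 4.6.
eta_from_f = eta_squared_from_f(4.6, 2, 432)
print(f"eta squared = {eta_from_f:.2f}")
```

The value recovered from F rounds to .02, which agrees with the effect size quoted in that example.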
There are several varieties of ANOVA, such as one-factor (or one-way) ANOVA or two-factor (or two-way) ANOVA. The factors are the independent variables, each of which must be measured on a categorical scale. The levels of the independent variable (factor) define the separate groups.
The one-way ANOVA is used with an interval or ratio level continuous dependent variable, and a categorical independent variable (factor) that has two or more different levels. The levels correspond to different groups or conditions. There are two different types of one-way ANOVA: between groups ANOVA (comparing two or more different groups; independent design), and repeated measures ANOVA (one group of subjects exposed to two or more conditions; within-subjects design).
An example of one-way between groups ANOVA is a research study comparing the effectiveness of four different dosage regimens of the same antidepressant medication on depression scores. A questionnaire that measures depression is given to participants in the four different intervention groups.
When a researcher reports the results from a one-way between groups ANOVA or repeated measures ANOVA, he or she needs to include the following information: verification of parametric assumptions; dependent variable scores; independent variable, levels; statistical data: significance, F-ratio scores, probability, group means, group standard deviations, mean differences, confidence intervals, effect size, and post-hoc comparisons. An example is below.
Presenting the results from one-way between groups ANOVA with post-hoc tests (Pallant, 2007, p. 248)
A one-way between groups analysis of variance was conducted to explore the impact of age on levels of optimism, as measured by the Life Orientation Test (LOT). Subjects were divided into three groups according to their age (Group 1: 29 yrs or less; Group 2: 30 to 44 yrs; Group 3: 45 yrs and above). There was a statistically significant difference at the p < .05 level in LOT scores for the three age groups: F (2, 432) = 4.6, p = .01. Despite reaching statistical significance, the actual difference in mean scores between the groups was quite small. The effect size, calculated using eta squared, was .02. Post-hoc comparisons using the Tukey HSD test indicated that the mean score for Group 1 (M = 21.36, SD = 4.55) was significantly different from Group 3 (M = 22.96, SD = 4.49). Group 2 (M = 22.10, SD = 4.15) did not differ significantly from either Group 1 or 3.
References
Moore, D. S., & McCabe, G. P. (2003). Introduction to the practice of statistics (4th ed.). New York: W. H. Freeman and Company.
Pallant, J. (2007). SPSS survival manual. New York: McGraw-Hill Education.
Polit, D. F., & Beck, C. T. (2008). Nursing research: Generating and assessing evidence for nursing practice (8th ed.). Philadelphia: Wolters Kluwer Health.
About the author: VickyRN has 16 years of experience and specializes in gerontological, cardiac, med-surg, and peds nursing.

Mar 18, '09 by pshs_2000, RN
Quote from VickyRN: No. You'll need to contact a statistician.
Oh, I was just joking...lol. I'm working on my master's thesis and yesterday was crazy trying to get my code to run in SAS. It eventually did run, but I think my ORs are wrong. Anyway, thanks for posting info on t-test and ANOVA. I liked the refresher. It brings back memories of biostats. :-)

Mar 19, '09 by nursemarion
Nicely written overview. I love statistics, and so seldom get to use them other than my annual QI project. Thanks for posting it!
Oh how I dream of just doing QI, working with variables, doing T-tests, and making statistically significant differences in outcomes!!! Why are there no jobs anywhere for a nurse who wants to do this kind of work???

Mar 22, '09 by ghillbert, MSN, NP
Quote from cxg174: Why are there no jobs anywhere for a nurse who wants to do this kind of work???
Lots of research jobs in industry.