Page: ANOVA -- Comparison of Means Test
Stop! Are you sure you have an interval variable and a polychotomous (3+ categories) categorical variable? If not, go back to this Page. If so, proceed.
When ready, you may find the generic ANOVA do-file a useful complement to this Page.
Tests for comparing the means of three or more groups (ANOVA)
Are you trying to compare the means of three or more groups? Do you have a categorical variable (e.g., race) and an interval outcome (e.g., composite of social trust)?
In addition to this page and the videos embedded within, you may wish to consult this chapter on how to conduct ANOVA in Stata.
One-Way ANOVA
You will almost always be using the One-Way ANOVA:
- . oneway intvar groupvar
- . oneway intvar groupvar, tabulate bonferroni
https://www.youtube.com/watch?v=U5wCBHoX_IQ&feature=youtu.be
Determining which groups’ means are significantly different from one another
ANOVA only tells you whether there is a significant difference among your groups overall (see Urdan, pp. 106-107 for interpreting ANOVA output). It does not tell you which groups are significantly different from one another. Fortunately, Stata can conduct multiple-comparison tests as part of the ANOVA: simply add the bonferroni option. Be sure you are reading the results correctly.
- . oneway intvar groupvar, tabulate bonferroni
The F-statistic is the ratio F = MSbetween / MSwithin, where MSbetween is the average amount of variation between groups and MSwithin is the average amount of variation within each of our groups. Note that in order for F to be large (and therefore unlikely to be due to a fluke sample), the average variation between groups has to be high relative to the average variation within groups. Either can affect the size of the F-statistic.
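As a hypothetical illustration (these numbers are invented for demonstration, not taken from real output), suppose your ANOVA table reported an MSbetween of 10.86 and an MSwithin of 2.00. The F-statistic is simply their ratio, which you can verify by hand:
. display 10.86/2.00
This returns 5.43; Stata then evaluates that value against the F distribution with your between- and within-group degrees of freedom to produce the p-value.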
Determining effect size for ANOVA results
You may find statistical significance, but that doesn't tell you whether the difference you are seeing is practically or substantively significant. It is always good practice to gauge the effect (difference) size, though many fail to do this. Unfortunately, the effectsize command does not work with the "oneway" ANOVA command. Instead, you will have to (re)run your ANOVA with the "anova" command. Thus, use the following command sequence:
- . anova intvar groupvar
- . effectsize groupvar (type this command after your ANOVA and it will calculate eta-squared along with another measure of effect size, omega-squared. These are similar to r-squared and can be interpreted as the percent of variation explained by your grouping variable.)
Conventional benchmarks exist for interpreting eta-squared and omega-squared, analogous to those for Cohen's d but smaller, since these are proportions of variance explained. Again, these are somewhat arbitrary values, and one should take into account the phenomenon in question when characterizing them.
Small effect |0.01|
Medium effect |0.06|
Large effect |0.14|
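If you want to see where eta-squared comes from, it is simply the between-groups sum of squares divided by the total sum of squares from your ANOVA table. As a hypothetical illustration (invented numbers), if SSbetween were 42 and SStotal were 420:
. display 42/420
This returns .10, meaning the grouping variable would account for 10 percent of the variation in the outcome.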
Note: You can also run an OLS regression and the R-squared will match the eta-squared value from the effectsize command above. This only applies when your categorical variable, dummied out, is the only independent variable in your regression model. You can also run the test command afterward if you want to see if any two groups that aren't the reference group are significantly different from one another (which Bonferroni also does for you).
- . regress intvar i.groupvar
- . test 2.groupvar=3.groupvar
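As a consistency check, you can also jointly test all of the group dummies after the regression; this reproduces the overall ANOVA F-test:
- . testparm i.groupvar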
Assumptions of ANOVA
Like the t-test, ANOVA assumes certain things about your data; your p-value is only trustworthy to the extent that your data conform to these assumptions:
1) the interval variable is normally distributed
2) groups exhibit equal variances
3) sample sizes for each group are roughly equal
The ANOVA is remarkably robust against violations of these assumptions, particularly if your sample size is large. Still, it’s worth conducting diagnostics on your data to determine whether a non-parametric test is needed or appropriate.
Diagnostics of assumptions
- Testing for normality:
- . histogram intvar, normal (for eyeballing the distribution)
- . sktest intvar [A normal distribution has a skewness of 0 and a kurtosis of approximately 3.0; a kurtosis value higher than 3 indicates greater "peakedness" and a value lower than 3 indicates a flatter distribution. P-values less than .05 on the sktest indicate a distribution that departs from a normal distribution.]
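- You can also view the skewness and kurtosis values directly in the detailed summary statistics:
- . summarize intvar, detail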
- Testing for unequal variances between your groups
- The ANOVA results return a Bartlett's test for equal variances, but many prefer to ignore it because the test is overly influenced by sample size. Eyeballing (comparing the standard deviations of the groups) is often sufficient.
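- To compare the groups' standard deviations (and sizes) side by side in one table:
- . tabstat intvar, statistics(sd n) by(groupvar)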
- Compare each group’s sample size (tests assume they are similar). This can be found in a number of different ways—through summary statistics or by including the tabulate command in the oneway ANOVA command.
- . oneway intvar groupvar, tabulate
Non-parametric alternatives
As noted above, ANOVA is remarkably robust against violations of its assumptions, particularly if you have a decent-sized sample. If in doubt, run the non-parametric alternative (below). I generally report both results or, if the non-parametric alternative confirms the ANOVA, just report the ANOVA results and simply state that the Kruskal-Wallis Rank Test supports the finding. If there is a discrepancy, then say as much. In such a case, the non-parametric alternative should be privileged.
- Kruskal-Wallis Rank Test
- . kwallis intvar, by(groupvar)
- Comparing medians
- . tabstat intvar, statistics(mean median sd) by(groupvar)
As with the Wilcoxon (Mann-Whitney) Rank Sum Test, the Kruskal-Wallis Rank Test can be used in situations where your dependent variable is ordinal rather than interval in nature.
Writing up your results
ANOVAs are reported like the t-test, but there are two degrees-of-freedom numbers to report. First report the between-groups degrees of freedom, then report the within-groups degrees of freedom (separated by a comma). After that, report the F-statistic (rounded to two decimal places) and the significance level. For example:
"There is a significant difference in mean levels of trust by race/ethnicity, F(3, 145) = 5.43, p = .02. A Bonferroni comparison of means shows the statistically significant differences to be between A-B, A-D, .... The associated eta-squared had a value of #, indicating a moderately sized effect of race/ethnicity on trust."
Complementary graphs
You may opt to include graphs to help illustrate your data. I recommend consulting the Graphing Results Page.
. graph hbar (median) intvar, over(groupvar)
. graph box intvar, over(groupvar)
https://www.youtube.com/watch?v=05Nsl6u5EKU&feature=youtu.be
A strip plot is similar to a box plot but also includes the actual data points that inform those box plots. There is a user-written command in Stata to generate these. Note: You can make the marker size small, vsmall, tiny, etc. The degree of transparency of the points can be adjusted using the mcolor setting. The jitter setting "shakes out" the points so that they don't overlap as much. You can play with the settings.
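Note that stripplot is not built into Stata; if you don't already have it, install it once from SSC:
. ssc install stripplot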
. stripplot intvar, over(groupvar) box iqr msize(tiny) mcolor(%25) jitter(5)
A kernel density plot shows the distributions for each group (replace the numbers with whatever numbers your groups are assigned to):
. twoway kdensity intvar if groupvar == 1 || kdensity intvar if groupvar == 2 || kdensity intvar if groupvar == 3
Alternatively, you can graph the means and include the 95% confidence intervals around them. To do so, run the following sequence of commands.
. anova intvar groupvar
. margins groupvar
. marginsplot, xdimension(groupvar) recast(bar)
Or, to represent means with a dot:
. marginsplot, xdimension(groupvar) recast(dot)