By James Marquez, April 5, 2017

ANOVA stands for **AN**alysis **O**f **VA**riance. It is a test to determine whether there is a significant difference between the means of two or more populations. It does this by comparing the variance between groups to the variance within groups. The null hypothesis states that all population means are equal, while the alternative hypothesis states that at least one mean is different. One-way ANOVA is used when the groups are defined by a single factor (one independent variable) and there is one continuous response variable.

The first step is to load the add-on packages we will use to extend the functionality of base R.

In [2]:

```
library(ggplot2) # Functions used to create beautiful plots
library(mosaic) # Provides mplot() for plotting TukeyHSD results
```

Next, let's load the iris dataset and explore it a few different ways to get a good understanding of the data we have available to us.

In [3]:

```
data('iris') # Load the iris dataset into local memory
str(iris) # View the structure of the iris dataset
table(iris$Species) # View the number of samples in each category of Species
head(iris) # View the first six rows of the iris dataset
```
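Because ANOVA compares group means, it can also help to look at the mean and spread of the response within each species before running the test. The following is a minimal sketch using base R only; the names `widths_by_species` and `group_stats` are illustrative and not part of the original notebook.

```r
# Split sepal widths into one numeric vector per species
widths_by_species <- split(iris$Sepal.Width, iris$Species)

# Per-species sample size, mean, and standard deviation
group_stats <- data.frame(
  n    = sapply(widths_by_species, length),
  mean = sapply(widths_by_species, mean),
  sd   = sapply(widths_by_species, sd)
)

print(group_stats)
```

Clearly separated means combined with similar standard deviations are a first hint that the between-group variance is large relative to the within-group variance.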

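A density plot of sepal width for each species gives a first visual impression of how the groups differ.

In [32]: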

```
# Group, color, and fill by Species
ggplot(iris, aes(x=Sepal.Width)) +
  geom_density(aes(group=Species, color=Species, fill=Species), alpha=0.3)
```

Next, let's perform our ANOVA test. We'll use the aov() function with a formula that puts the response (dependent variable) on the left of the ~ and the grouping factor (independent variable) on the right. We'll save the results to an object we name ANOVA.

In [5]:

```
ANOVA <- aov(Sepal.Width ~ Species, data=iris) # (DV ~ IV, data=dataset)
summary(ANOVA) # View results of the ANOVA test
```
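To see where the F value in the summary table comes from, the between-group and within-group variances described in the introduction can be computed by hand. This is a sketch under the usual one-way ANOVA definitions, using base R only; all variable names here are illustrative.

```r
# Reproduce the one-way ANOVA F statistic from its definition
y <- iris$Sepal.Width
g <- iris$Species

k <- nlevels(g)   # number of groups
n <- length(y)    # total number of observations

grand_mean   <- mean(y)
fitted_means <- ave(y, g)   # each observation's own group mean

# Between-group sum of squares: group means vs. the grand mean
ss_between <- sum((fitted_means - grand_mean)^2)
# Within-group sum of squares: observations vs. their group mean
ss_within  <- sum((y - fitted_means)^2)

# Mean squares divide each sum of squares by its degrees of freedom
f_stat  <- (ss_between / (k - 1)) / (ss_within / (n - k))
p_value <- pf(f_stat, df1 = k - 1, df2 = n - k, lower.tail = FALSE)

# Compare with the value reported by aov()
aov_f <- summary(aov(Sepal.Width ~ Species, data = iris))[[1]][["F value"]][1]
print(c(by_hand = f_stat, aov = aov_f))
```

A large F value means the group means vary much more than would be expected from the spread inside each group, which is what leads to a small p-value.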

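A significant F test only tells us that at least one mean differs. To find out which pairs of species differ, we run Tukey's HSD post-hoc test on the fitted model.

In [6]: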

```
TukeyHSD(ANOVA)
```
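The TukeyHSD() result is a list with one matrix per factor, so the pairwise comparisons can also be filtered programmatically. The sketch below assumes a 0.05 significance threshold, which the original does not state; `comparisons` and `significant` are illustrative names.

```r
# Fit the model and run Tukey's HSD at the default 95% family-wise level
fit   <- aov(Sepal.Width ~ Species, data = iris)
tukey <- TukeyHSD(fit, conf.level = 0.95)

# One row per pair of species; columns: diff, lwr, upr, p adj
comparisons <- tukey$Species

# Keep only the pairs whose adjusted p-value is below the assumed threshold
significant <- comparisons[comparisons[, "p adj"] < 0.05, , drop = FALSE]
print(round(significant, 4))
```

Each row name identifies the pair being compared, `diff` is the difference in mean sepal width, and `lwr` and `upr` bound its confidence interval. An interval that excludes zero corresponds to a significant difference.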

In [7]:

```
mplot(TukeyHSD(ANOVA), system="ggplot") # Use the mosaic package to plot the results of TukeyHSD
```

If you prefer t-tests, the pairwise.t.test() function runs a t-test for every pair of factor levels. It lets you choose among eight p-value adjustment options to help counteract the problem of multiple comparisons: holm, hochberg, hommel, bonferroni, BH, BY, fdr, and none (fdr is an alias for BH). To reduce the chance of incorrectly rejecting a true null hypothesis (Type I error), we'll use the Bonferroni correction when performing our multiple comparisons.

In [8]:

```
pairwise.t.test(iris$Sepal.Width, iris$Species,
                p.adjust.method="bonferroni", paired=FALSE)
```
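To make the Bonferroni correction concrete, the adjusted p-values can be reproduced from the unadjusted ones: each raw p-value is multiplied by the number of comparisons and capped at 1. The following is a small sketch using base R only; the variable names are illustrative.

```r
# Unadjusted pairwise p-values (lower-triangular matrix, NA elsewhere)
raw <- pairwise.t.test(iris$Sepal.Width, iris$Species,
                       p.adjust.method = "none")$p.value

# Bonferroni-adjusted p-values as computed by R
bonf <- pairwise.t.test(iris$Sepal.Width, iris$Species,
                        p.adjust.method = "bonferroni")$p.value

# Manual Bonferroni: multiply by the number of comparisons, cap at 1
m      <- sum(!is.na(raw))   # three species give three pairwise comparisons
manual <- pmin(raw * m, 1)

print(all.equal(manual, bonf))
```

Bonferroni is the most conservative of the listed methods, so it protects well against Type I errors at the cost of some statistical power.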
