ANOVA (Analysis of Variance) is a statistical test used to analyze the differences between the means of more than two groups.

A two-way ANOVA is used to estimate how the mean of a quantitative variable changes according to the levels of two categorical variables. Use a two-way ANOVA when you want to know how two independent variables, in combination, affect a dependent variable.

Two-way ANOVA: When should it be used?

A two-way ANOVA can be used when you have collected data on a quantitative dependent variable at multiple levels of two categorical independent variables.

A quantitative variable represents numbers or amounts. Its values can be split by group to calculate group means.

Categorical variables represent types or categories. A level is an individual category within a categorical variable.
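To make the idea of group means concrete, here is a minimal sketch in Python. The level names ("A", "B", "C") and the values are invented for illustration, and the helper group_means is a hypothetical name, not part of any library.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations: (level of a categorical variable, quantitative value).
# The numbers are made up for illustration only.
observations = [
    ("A", 176.0), ("A", 178.0),
    ("B", 177.0), ("B", 179.0),
    ("C", 180.0), ("C", 182.0),
]

def group_means(rows):
    """Return the mean of the quantitative variable for each level."""
    groups = defaultdict(list)
    for level, value in rows:
        groups[level].append(value)
    return {level: mean(values) for level, values in groups.items()}

means = group_means(observations)
```

Each key of means is one level of the categorical variable, and each value is the mean of the quantitative variable within that level. ANOVA asks whether differences among such means are larger than chance variation would explain.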

How to perform a two-way ANOVA

As an example, consider an imaginary crop yield experiment. The data set includes observations of:

  • Final crop yield (bushels/acre)
  • Type of fertilizer (fertilizer type 1, 2, or 3)
  • Planting density (1=low density, 2=high density)
  • Block in the field (1, 2, 3, or 4)
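One possible way to lay out such a data set in Python is sketched below. It assumes one plot for every fertilizer and density combination in each block, which the description above does not state, and it leaves the yields empty because no measurements are shown here. The field names are hypothetical.

```python
from collections import Counter
from itertools import product

# Factor levels as described in the list above.
FERTILIZERS = (1, 2, 3)
DENSITIES = (1, 2)        # 1 = low density, 2 = high density
BLOCKS = (1, 2, 3, 4)

# Assumption: one plot per fertilizer x density combination in each block.
# crop_yield (bushels/acre) is None because real measurements are not shown.
plots = [
    {"fertilizer": f, "density": d, "block": b, "crop_yield": None}
    for b, f, d in product(BLOCKS, FERTILIZERS, DENSITIES)
]

# Count the plots in each fertilizer x density cell to check for balance.
cell_counts = Counter((p["fertilizer"], p["density"]) for p in plots)
is_balanced = len(set(cell_counts.values())) == 1
```

Checking balance up front is useful because the simple sums-of-squares formulas for a two-way ANOVA only apply directly when every cell holds the same number of observations.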

Two-way ANOVA tests whether the independent variables (fertilizer types and planting density) have an impact on the dependent variable (average yield). We also need to consider other sources of variability in the data.

Our experimental treatment was applied in blocks, so we want to know whether the planting block makes a difference to average crop yield. We also want to check whether there is an interaction between the two independent variables. For example, planting density might affect the plants' ability to take up fertilizer.

Because there are a few possible relationships among these variables, we will compare three models:

  1. A two-way ANOVA that does not include any interaction or blocking variables (also known as an additive two-way ANOVA).
  2. A two-way ANOVA with interaction, but no blocking variable.
  3. A two-way ANOVA with interaction and with the blocking variable.

Model 1 assumes that there is no interaction between the two independent variables. Model 2 assumes that there is an interaction between the two independent variables. Model 3 assumes that there is an interaction between the independent variables and that the blocking variable is an important source of variation in the data.
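The difference between the three models can be sketched as a sums-of-squares decomposition. The example below is a rough illustration, not a full analysis: the yields come from an invented formula (made_up_yield is a hypothetical helper), the design is assumed to be balanced with one plot per fertilizer, density, and block combination, and the shortcut formulas used here are only valid under that balance assumption.

```python
from itertools import product
from statistics import mean

FERTILIZERS, DENSITIES, BLOCKS = (1, 2, 3), (1, 2), (1, 2, 3, 4)

def made_up_yield(fert, dens, block):
    """Invented yield in bushels/acre, for illustration only (not real data)."""
    wobble = 0.2 * ((fert * dens + block) % 3)
    return 175.0 + 1.0 * fert + 0.5 * dens + 0.3 * block + wobble

# Assumption: a balanced design with one plot per fertilizer x density x block.
data = [(f, d, b, made_up_yield(f, d, b))
        for f, d, b in product(FERTILIZERS, DENSITIES, BLOCKS)]
grand_mean = mean(row[3] for row in data)

def explained_ss(key):
    """Sum of squares explained by the group means defined by key(row)."""
    groups = {}
    for row in data:
        groups.setdefault(key(row), []).append(row[3])
    return sum(len(ys) * (mean(ys) - grand_mean) ** 2 for ys in groups.values())

ss_total = sum((row[3] - grand_mean) ** 2 for row in data)
ss_fert = explained_ss(lambda r: r[0])
ss_dens = explained_ss(lambda r: r[1])
ss_block = explained_ss(lambda r: r[2])
ss_cells = explained_ss(lambda r: (r[0], r[1]))
ss_inter = ss_cells - ss_fert - ss_dens  # valid because the design is balanced

n = len(data)
n_fert, n_dens, n_block = len(FERTILIZERS), len(DENSITIES), len(BLOCKS)

# Residual sum of squares and residual degrees of freedom for each model.
residuals = {
    "model 1": (ss_total - ss_fert - ss_dens, n - 1 - (n_fert - 1) - (n_dens - 1)),
    "model 2": (ss_total - ss_cells, n - n_fert * n_dens),
    "model 3": (ss_total - ss_cells - ss_block, n - n_fert * n_dens - (n_block - 1)),
}

# F statistic for the fertilizer effect under each model.
f_fertilizer = {
    name: (ss_fert / (n_fert - 1)) / (ss_res / df_res)
    for name, (ss_res, df_res) in residuals.items()
}
```

Each model moves more of the total variation out of the residual term: model 2 removes the interaction sum of squares, and model 3 additionally removes the block sum of squares. Turning the F statistics into p-values requires the F distribution, which the Python standard library does not provide, so a statistics package would be needed for that step.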

By running all three versions and comparing them, we can test which variables, and in which combinations, are important for describing the data, and see whether the planting block matters for average crop yield.

This is not the only way to do your analysis, but it is an efficient way to compare models based on the combinations of variables you consider reasonable.