 # ANOVA Output

Note

Choose View, Annotated ANOVA to activate blue hints and tips for how to interpret the ANOVA results.

At the top are the name of the response, its number, and the name given when the design was built.

The next line gives a brief description of the model being fit, followed by the type of sum of squares used for the calculations.

## The ANOVA table

### Rows

Block: The block row shows how much variation in the response is attributed to blocks. Block variation is removed from the analysis. Blocks are not tested (no F-value or p-value) because they are considered a non-replicated, hard-to-change factor. Blocks are assumed not to interact with the factors. If there are no blocks in the design, this row will not be present.

Model: The model row shows how much variation in the response is explained by the model, along with the overall model test for significance.

Terms: The model is separated into individual terms and tested independently.

Residual: The residual row shows how much variation in the response is still unexplained.

Lack of Fit: The amount by which the model predictions miss the observations.

Pure Error: The amount of difference between replicate runs.

Cor Total: This row (the corrected total) shows the amount of variation around the mean of the observations. The model explains part of it; the residual accounts for the rest.
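To illustrate how the Residual row splits into Lack of Fit and Pure Error, the following Python sketch fits a straight line to a small made-up data set with replicate runs. The data, the function names, and the choice of a straight-line model are assumptions for illustration; this is not output from the software.

```python
from collections import defaultdict

def fit_line(x, y):
    """Least-squares intercept and slope for a straight-line model."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def residual_breakdown(x, y):
    """Split the residual sum of squares into lack of fit and pure error."""
    b0, b1 = fit_line(x, y)
    ss_residual = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

    # Pure error: variation of replicate runs around their own group mean.
    replicates = defaultdict(list)
    for xi, yi in zip(x, y):
        replicates[xi].append(yi)
    ss_pure_error = 0.0
    for runs in replicates.values():
        group_mean = sum(runs) / len(runs)
        ss_pure_error += sum((yi - group_mean) ** 2 for yi in runs)

    return {
        "ss_residual": ss_residual,
        # Lack of fit is whatever part of the residual is not pure error.
        "ss_lack_of_fit": ss_residual - ss_pure_error,
        "ss_pure_error": ss_pure_error,
        # Distinct factor settings minus the 2 parameters of the line.
        "df_lack_of_fit": len(replicates) - 2,
        "df_pure_error": len(y) - len(replicates),
    }

# Hypothetical data: three factor settings, two replicate runs each.
x = [1, 1, 2, 2, 3, 3]
y = [2.0, 2.2, 3.9, 4.1, 5.0, 5.2]
breakdown = residual_breakdown(x, y)
print(breakdown)
```

In this sketch the two pieces always add back up to the residual sum of squares, and their degrees of freedom add up to the residual degrees of freedom.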

### Columns

Source: A meaningful name for the rows.

Sum of Squares: Sum of the squared differences between the overall average and the amount of variation explained by that row's source.

df: Degrees of Freedom: The number of estimated parameters used to compute the source’s sum of squares.

Mean Square: The sum of squares divided by the degrees of freedom. Also called variance.

F Value: Test for comparing the source’s mean square to the residual mean square.

Prob > F: (p-value) Probability of seeing an F-value at least as large as the one observed if the null hypothesis is true (there are no factor effects). Small probability values call for rejection of the null hypothesis. The probability equals the area under the curve of the F-distribution that lies beyond the observed F-value.

In “plain English”, if the Prob>F value is very small (less than 0.05 by default) then the source has tested significant. Significant model terms probably have a real effect on the response. Significant lack of fit indicates the model does not fit the data within the observed replicate variation.
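The column definitions can be traced through a small example. The Python sketch below builds the Model, Residual, and Cor Total rows for a hypothetical one-factor experiment with three treatments. The data and the function name are made up for illustration, and the sketch stops at the F-value because the Python standard library has no F-distribution.

```python
def one_factor_anova(groups):
    """Return sums of squares, df, mean squares and F for one categorical factor."""
    all_runs = [y for g in groups for y in g]
    overall_mean = sum(all_runs) / len(all_runs)
    group_means = [sum(g) / len(g) for g in groups]

    # Model: variation of the treatment means around the overall average.
    ss_model = sum(len(g) * (m - overall_mean) ** 2
                   for g, m in zip(groups, group_means))
    # Residual: variation of the runs around their own treatment mean.
    ss_residual = sum((y - m) ** 2
                      for g, m in zip(groups, group_means) for y in g)

    df_model = len(groups) - 1
    df_residual = len(all_runs) - len(groups)
    ms_model = ss_model / df_model
    ms_residual = ss_residual / df_residual

    return {
        "ss_model": ss_model, "df_model": df_model, "ms_model": ms_model,
        "ss_residual": ss_residual, "df_residual": df_residual,
        "ms_residual": ms_residual,
        "ss_cor_total": ss_model + ss_residual,
        "f_value": ms_model / ms_residual,
    }

# Hypothetical responses for three treatments, three runs each.
table = one_factor_anova([[4, 5, 6], [7, 8, 9], [10, 11, 12]])
print(table)

# If SciPy happens to be installed, Prob > F could be obtained with
# scipy.stats.f.sf(table["f_value"], table["df_model"], table["df_residual"]).
```

For these made-up numbers the model sum of squares is 54 on 2 df, the residual sum of squares is 6 on 6 df, and the F-value is 27.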

### Modeling Statistics

Std Dev: (Root MSE) Square root of the residual mean square. Consider this to be an estimate of the standard deviation associated with the experiment.

Mean: Overall average of all the response data.

C.V.: Coefficient of Variation, the standard deviation expressed as a percentage of the mean. Calculated by dividing the Std Dev by the Mean and multiplying by 100.
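As a quick arithmetic check of these three statistics, the sketch below computes them from a residual mean square and a list of responses. The input values and the function name are hypothetical.

```python
import math

def summary_statistics(ms_residual, responses):
    """Std Dev (root MSE), Mean, and C.V. % as defined above."""
    std_dev = math.sqrt(ms_residual)
    mean = sum(responses) / len(responses)
    cv_percent = 100 * std_dev / mean
    return std_dev, mean, cv_percent

# Hypothetical residual mean square of 1.0 and nine response values.
std_dev, mean, cv_percent = summary_statistics(
    1.0, [4, 5, 6, 7, 8, 9, 10, 11, 12])
print(std_dev, mean, cv_percent)  # 1.0 8.0 12.5
```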

PRESS: Predicted Residual Error Sum of Squares – A measure of how the model fits each point in the design. The PRESS is computed by first predicting where each point should be from a model that contains all other points except the one in question. The squared residuals (difference between actual and predicted values) are then summed.

$$e_{-i} = y_i\, -\, \hat{y}_{-i}\, =\, \frac{e_i}{1\, -\, h_{ii}}$$

$$PRESS = \sum_{i=1}^n(e_{-i})^2$$

$$e_{-i}$$ is a deletion residual computed by fitting a model without the $$i^{th}$$ run then trying to predict the $$i^{th}$$ observation with the resulting model.

$$e_i$$ is the residual for each observation left over from the model fit to all the data.

$$h_{ii}$$ is the leverage of the run in the design.
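The leverage shortcut means PRESS can be computed without refitting the model for each run. The Python sketch below checks this for a straight-line model, where the leverage has the closed form $$h_{ii} = 1/n + (x_i - \bar{x})^2 / S_{xx}$$. The data and function names are assumptions for illustration; a general model would take the leverages from the diagonal of the hat matrix.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for a straight-line model."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

def press_from_leverage(x, y):
    """PRESS using e_i / (1 - h_ii) from a single fit to all the data."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b0, b1 = fit_line(x, y)
    total = 0.0
    for xi, yi in zip(x, y):
        residual = yi - (b0 + b1 * xi)
        leverage = 1 / n + (xi - x_bar) ** 2 / s_xx
        total += (residual / (1 - leverage)) ** 2
    return total

def press_by_refitting(x, y):
    """PRESS by leaving each run out, refitting, and predicting that run."""
    total = 0.0
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - (b0 + b1 * x[i])) ** 2
    return total

# Hypothetical data: five runs of a single numeric factor.
x = [1, 2, 3, 4, 5]
y = [1.1, 1.9, 3.2, 3.8, 5.1]
press_fast = press_from_leverage(x, y)
press_slow = press_by_refitting(x, y)
print(press_fast, press_slow)
```

The two routes agree to within floating-point rounding.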

R-squared: A measure of the amount of variation around the mean explained by the model.

$$R^2 = 1 - \begin{bmatrix}\frac{SS_{residual}}{SS_{residual}\, +\, SS_{model}}\end{bmatrix}\, =\, 1\, -\, \begin{bmatrix}\frac{SS_{residual}}{SS_{total}\, -\, SS_{curvature}\, -\, SS_{block}}\end{bmatrix}$$

Adj R-squared: A measure of the amount of variation around the mean explained by the model, adjusted for the number of terms in the model. The adjusted R-squared decreases as the number of terms in the model increases if those additional terms don’t add value to the model.

$$Adj. R^2 = 1 - \begin{bmatrix}\left(\frac{SS_{residual}}{df_{residual}}\right)\, /\, \left(\frac{SS_{residual}\, +\, SS_{model}}{df_{residual}\, +\, df_{model}}\right)\end{bmatrix}\, =\, 1\, -\, \begin{bmatrix}\,\left(\frac{SS_{residual}}{df_{residual}}\right)\, /\, \left(\frac{SS_{total}\, -\, SS_{curvature}\, -\, SS_{block}}{df_{total}\, -\, df_{curvature}\, -\, df_{block}}\right) \end{bmatrix}$$

Pred R-squared: A measure of the amount of variation in new data explained by the model.

$$Pred. R^2 = 1 - \begin{bmatrix}\frac{PRESS}{SS_{residual}\, +\, SS_{model}}\end{bmatrix}\, =\, 1\, -\, \begin{bmatrix}\frac{PRESS}{SS_{total}\, -\, SS_{curvature}\, -\, SS_{block}}\end{bmatrix}$$

The predicted R-squared and the adjusted R-squared should be within 0.20 of each other. Otherwise there may be a problem with either the data or the model. Look for outliers, consider transformations, or consider a different order polynomial.
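The three R-squared formulas and the 0.20 guideline can be sketched in a few lines of Python. The sums of squares, degrees of freedom, and PRESS below are illustrative numbers, not results from a real analysis, and the function name is an assumption.

```python
def r_squared_statistics(ss_model, ss_residual, df_model, df_residual, press):
    """R-squared, adjusted R-squared and predicted R-squared."""
    # Model plus residual excludes any block and curvature sums of squares.
    ss_model_plus_residual = ss_residual + ss_model
    r2 = 1 - ss_residual / ss_model_plus_residual
    adj_r2 = 1 - (ss_residual / df_residual) / (
        ss_model_plus_residual / (df_residual + df_model))
    pred_r2 = 1 - press / ss_model_plus_residual
    return r2, adj_r2, pred_r2

# Illustrative inputs only.
r2, adj_r2, pred_r2 = r_squared_statistics(
    ss_model=54.0, ss_residual=6.0, df_model=2, df_residual=6, press=13.5)
print(r2, adj_r2, pred_r2)

# Guideline from the text: the two values should be within 0.20 of each other.
in_reasonable_agreement = abs(adj_r2 - pred_r2) < 0.20
print(in_reasonable_agreement)
```

With these inputs the adjusted and predicted values differ by less than 0.1, so the guideline is met.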

Adequate Precision: This is a signal-to-noise ratio. It compares the range of the predicted values at the design points to the average prediction error. Ratios greater than 4 indicate adequate model discrimination.

$$\frac{max(\hat{Y})\, -\, min(\hat{Y})}{\sqrt{\bar{V}_{\hat{Y}}}}\, >\, 4$$

$$\bar{V}_{\hat{Y}}\, =\, \frac{p\hat{\sigma}^2}{n}$$
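A minimal Python sketch of the Adequate Precision ratio follows. It takes the predicted values at the design points, the number of model parameters $$p$$, the residual mean square, and the number of runs $$n$$. All input values and the function name are hypothetical.

```python
import math

def adequate_precision(predictions, p, residual_mean_square, n):
    """Range of the predictions divided by the average prediction standard error."""
    average_prediction_variance = p * residual_mean_square / n
    signal = max(predictions) - min(predictions)
    return signal / math.sqrt(average_prediction_variance)

# Hypothetical: 9 runs, 3 model parameters, residual mean square of 1.0.
predictions = [5.0] * 3 + [8.0] * 3 + [11.0] * 3
ratio = adequate_precision(predictions, p=3, residual_mean_square=1.0, n=9)
print(ratio, ratio > 4)
```

Here the range of 6 is divided by the square root of 1/3, giving a ratio of about 10.4, which is above the threshold of 4.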

-2 Log Likelihood: This is derived by iteratively improving the coefficient estimates for the chosen model to maximize the likelihood that the fitted model is the correct model. For balanced, orthogonal designs, this is exactly the same result as least squares regression. The -2 log likelihood is used to compute the following penalized modeling statistics.

BIC: A penalized likelihood statistic, suited to large designs, used to choose the best model.

$$BIC(M) = -2\, \cdot\, \ln(L[M|Data])\, +\, \ln(n)\, \cdot\, p$$

AICc: A penalized likelihood statistic, suited to small to medium designs (most designs), used to choose the best model.

$$AIC(M) = -2\, \cdot\, \ln(L[M|Data])\, +\, 2\, \cdot\, p$$

$$AICc(M) = AIC(M)\, +\, \frac{2p(p\, +\, 1)}{n\, -\, p\, -\, 1}$$

$$p$$ = number of model parameters, including the intercept (b0) and any block coefficients

$$n$$ = number of runs in the experiment

$$\hat{\sigma}^2$$ = residual mean square from the ANOVA table
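Given a -2 log likelihood value, the penalized statistics follow directly from the formulas above. The Python sketch below uses made-up inputs, and the function name is an assumption.

```python
import math

def penalized_statistics(neg2_log_likelihood, n, p):
    """BIC, AIC and AICc from a -2 log likelihood, n runs and p parameters."""
    bic = neg2_log_likelihood + math.log(n) * p
    aic = neg2_log_likelihood + 2 * p
    # The small-sample correction requires n > p + 1.
    aicc = aic + 2 * p * (p + 1) / (n - p - 1)
    return bic, aic, aicc

# Hypothetical: -2 log likelihood of 40.0 from 20 runs and 4 parameters.
bic, aic, aicc = penalized_statistics(40.0, n=20, p=4)
print(bic, aic, aicc)
```

When comparing candidate models fit to the same data, smaller values of these statistics are preferred.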

## Coefficient Estimates

This table has one row per estimated term in the model.

The number of columns depends on the type of analysis.

Factor: Experimental variables selected for inclusion in the predictive model.

Coefficient Estimate: Regression coefficient representing the expected change in response Y per unit change in X when all remaining factors are held constant. In orthogonal two-level designs, it equals one-half the factorial effect.
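The half-effect relationship can be checked on a small orthogonal two-level design. The Python sketch below uses a hypothetical replicate of a two-factor, two-level factorial in coded units. The responses and function names are made up, and the coefficient formula shown relies on the coded columns being orthogonal and centered.

```python
def factorial_effect(coded_column, responses):
    """Average response at the high level minus average at the low level."""
    high = [y for c, y in zip(coded_column, responses) if c > 0]
    low = [y for c, y in zip(coded_column, responses) if c < 0]
    return sum(high) / len(high) - sum(low) / len(low)

def coded_coefficient(coded_column, responses):
    """Least-squares coefficient for an orthogonal, centered coded column."""
    numerator = sum(c * y for c, y in zip(coded_column, responses))
    denominator = sum(c * c for c in coded_column)
    return numerator / denominator

# Hypothetical design in coded units (-1 = low, +1 = high).
factor_a = [-1, 1, -1, 1]
factor_b = [-1, -1, 1, 1]
responses = [20, 30, 40, 52]

effect_a = factorial_effect(factor_a, responses)
coef_a = coded_coefficient(factor_a, responses)
effect_b = factorial_effect(factor_b, responses)
coef_b = coded_coefficient(factor_b, responses)
print(effect_a, coef_a)  # 11.0 5.5
print(effect_b, coef_b)  # 21.0 10.5
```

The factor moves two coded units from low to high, which is why the per-unit coefficient is half the effect.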

Coefficient Estimate for General Factorial Designs: Coefficients for multi-level categorical factors are not as simple to interpret. Beta(1) is the difference of level 1’s average from the overall average. Beta(2) is the difference of level 2’s average from the overall average. Beta(k-1) is the difference of level (k-1)’s average from the overall average. The negative sum of the coefficients will be the difference of level k’s average from the overall average. Don’t use these coefficients for interpretation of the model – use the model graphs!
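The relationship between the coefficients and the level averages can be illustrated for a three-level categorical factor. This Python sketch assumes a balanced design (the same number of runs at every level), so the overall average equals the average of the level averages. The data and names are hypothetical.

```python
def categorical_coefficients(runs_by_level):
    """Return the overall average, Beta(1)..Beta(k-1), and the level-k offset."""
    level_means = [sum(runs) / len(runs) for runs in runs_by_level.values()]
    overall = sum(level_means) / len(level_means)
    # One coefficient for each of the first k-1 levels.
    betas = [m - overall for m in level_means[:-1]]
    # The last level has no coefficient of its own.
    last_level_offset = -sum(betas)
    return overall, betas, last_level_offset

# Hypothetical balanced data: three levels, two runs each.
runs_by_level = {"level 1": [10, 12], "level 2": [14, 16], "level 3": [19, 21]}
overall, betas, last_level_offset = categorical_coefficients(runs_by_level)
print(overall, betas, last_level_offset)
```

The negative sum of the first two coefficients reproduces the difference between the level 3 average (20) and the overall average.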

df: Degrees of Freedom – equal to one for testing coefficients.

Standard Error: The standard deviation associated with the coefficient estimate.

95% CI High and Low: The limits of the 95% confidence interval for the coefficient. If this range spans 0 (one limit is positive and the other negative), then a true coefficient of 0 is plausible, indicating the term is not significant.

VIF: Variance Inflation Factor – Measures how much the variance of the coefficient estimate is inflated by the lack of orthogonality in the design. If the factor is orthogonal to all other factors in the model, the VIF is one. Values greater than 10 indicate that the factors are too highly correlated (they are not independent). VIFs are a less important statistic when working with mixture designs and constrained response surface designs.
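For a model with only two factor columns, the VIF reduces to $$1 / (1 - r^2)$$, where $$r$$ is the correlation between the columns. The Python sketch below uses that simplified two-column case. The design columns and function names are made up, and a general model would instead regress each column on all the others.

```python
import math

def correlation(a, b):
    """Pearson correlation between two design columns."""
    n = len(a)
    a_bar = sum(a) / n
    b_bar = sum(b) / n
    s_ab = sum((x - a_bar) * (y - b_bar) for x, y in zip(a, b))
    s_aa = sum((x - a_bar) ** 2 for x in a)
    s_bb = sum((y - b_bar) ** 2 for y in b)
    return s_ab / math.sqrt(s_aa * s_bb)

def vif_two_columns(a, b):
    """VIF shared by both columns when the model has only these two terms."""
    r = correlation(a, b)
    return 1 / (1 - r * r)

# Orthogonal two-level columns: no inflation.
vif_orthogonal = vif_two_columns([-1, 1, -1, 1], [-1, -1, 1, 1])

# Hypothetical botched design in which B nearly tracks A.
vif_correlated = vif_two_columns([-1, 1, -1, 1], [-1, 1, -0.5, 1])

print(vif_orthogonal, vif_correlated)
```

The orthogonal pair gives a VIF of 1, while the nearly collinear pair gives about 25.5, well above the guideline of 10.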

The predictive model is listed in both actual and coded terms. (For mixture experiments, the prediction equations are given in pseudo, real and actual values of the components.) The coded (or pseudo) equation is useful for identifying the relative significance of the factors by comparing the factor coefficients. All equations give identical predictions when hierarchy is enforced. These equations, used for prediction, have no block effects. Blocking is a restriction on the randomization of the experiment, used to reduce error. It is not a factor being studied. Blocks are only used to fit the observations for this experiment, not to make predictions.

For Linear Mixture Models Only:

The coefficient table is augmented for linear mixture models to include statistics on the adjusted linear effects. Because the linear coefficients cannot be compared to zero, the linear effect of component i is measured by how different the ith coefficient is from the other (q-1) coefficients. The t-test is applicable to the difference in the mixture coefficient estimates. When the design space is not a simplex, the formula for calculating the component effects is adjusted for the differences in the ranges.

The gradient is the estimated slope across the linear response surface projected through a reference blend in both the Cox and Piepel directions. The total effect of a component is the gradient times the range over which the component varied. These effects are plotted as a trace plot under the model graphs button.

For One Factor Designs Only:

The next section in the ANOVA lists results for each treatment (factor level) and shows the significance of the difference between each pair of treatments.

Estimated Mean: The average response at each treatment level.

Standard Error: The standard error associated with the calculation of this mean. It comes from the standard deviation of the data divided by the square root of the number of repetitions in a sample.

Treatment: This lists each pairwise combination of the factor levels.

Mean Difference: The difference between the average response from the two treatments.

df: The degrees of freedom associated with the difference.

Standard Error: The standard error associated with the difference between the two means.

t value: The Mean Difference divided by the Standard Error. It represents the number of standard errors separating the two means.

Prob>t: This is the probability of getting a t-value at least this large in magnitude if the two means are truly not different. A value less than 0.05 indicates that there is a statistically significant difference between the means.
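The pairwise comparison arithmetic can be sketched as follows. This Python example assumes the standard error of the difference is the Std Dev multiplied by the square root of (1/n1 + 1/n2), which is the usual pooled form. The data, the Std Dev value, and the function name are hypothetical, and the sketch stops at the t value because the Python standard library has no t-distribution.

```python
import math

def pairwise_comparison(treatment_1, treatment_2, std_dev):
    """Mean difference, its standard error, and the resulting t value."""
    mean_1 = sum(treatment_1) / len(treatment_1)
    mean_2 = sum(treatment_2) / len(treatment_2)
    mean_difference = mean_1 - mean_2
    # Assumed pooled form: std_dev is the Std Dev (root MSE) from the ANOVA.
    standard_error = std_dev * math.sqrt(
        1 / len(treatment_1) + 1 / len(treatment_2))
    return mean_difference, standard_error, mean_difference / standard_error

# Hypothetical treatments with three runs each and a Std Dev of 1.0.
mean_difference, standard_error, t_value = pairwise_comparison(
    [4, 5, 6], [7, 8, 9], std_dev=1.0)
print(mean_difference, standard_error, t_value)
```

Here the means differ by 3 units and the standard error is about 0.82, so the two means are separated by roughly 3.7 standard errors.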