Diagnostics Plots

Externally Studentized residuals are the default, with Internally Studentized and raw residuals available as options. Unless the leverages of all the runs in a design are identical, the standard errors of the residuals differ from run to run. This means that the raw residuals belong to different populations (one for each distinct standard error). Therefore, raw residuals should not be used for checking the regression assumptions. Studentizing divides each residual by its estimated standard error, which maps all the different normal distributions to a single standard normal distribution.

Externally Studentized residuals, which are based on a deletion method, are the default because they are more sensitive for finding problems with the analysis. Internally Studentized residuals are also available but are less sensitive to such problems.
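
As an illustration of the difference between the three residual types, the following minimal sketch computes them for a straight-line fit using only the Python standard library. It is not the software's implementation; the function name studentized_residuals and the example data are assumptions made for illustration, and the leverage formula used here applies only to a one-factor linear model.

```python
import math

def studentized_residuals(x, y):
    """Fit y = b0 + b1*x by least squares.

    Returns three lists: raw, internally Studentized, and
    externally Studentized residuals.
    """
    n = len(x)
    p = 2  # number of model parameters (intercept and slope)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar

    raw = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Leverage of each run (hat-matrix diagonal for a straight-line fit).
    lev = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    sse = sum(e * e for e in raw)
    mse = sse / (n - p)

    # Internal: scale by the standard error estimated from all runs.
    internal = [e / math.sqrt(mse * (1 - h)) for e, h in zip(raw, lev)]

    # External: re-estimate the error variance with run i deleted.
    external = []
    for e, h in zip(raw, lev):
        mse_deleted = (sse - e * e / (1 - h)) / (n - p - 1)
        external.append(e / math.sqrt(mse_deleted * (1 - h)))
    return raw, internal, external

# Hypothetical data: the last run sits well above the trend of the others.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 20.0]
raw, internal, external = studentized_residuals(x, y)
```

In this example the last run inflates the error estimate used by the internal version, so its externally Studentized residual is much larger in magnitude than its internally Studentized residual. This is the sense in which the deletion method is more sensitive.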

Normal Probability: The normal probability plot indicates whether the residuals follow a normal distribution, in which case the points fall along a straight line. Expect some scatter even with normal data. Look only for definite patterns, like an “S-shaped” curve, which indicates that a transformation of the response may provide a better analysis.

Note

The Shapiro-Wilk test for normality (used on the Half-Normal and Normal Plots of Effects) is not shown on the Residuals Normal Probability plot because this plot violates the assumption of independence by ordering the residuals. Therefore, it is not an appropriate test.
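
To show what the normal probability plot of residuals is built from, here is a minimal sketch that pairs each ordered residual with a theoretical standard normal quantile. The function name normal_plot_points and the sample residuals are hypothetical, and the plotting position (i - 0.5) / n is one common convention assumed here; the software may use a different one.

```python
from statistics import NormalDist

def normal_plot_points(residuals):
    """Return (residual, normal quantile) pairs, sorted by residual."""
    n = len(residuals)
    ordered = sorted(residuals)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Assumed plotting position for the i-th smallest residual: (i - 0.5) / n.
    return [
        (r, std_normal.inv_cdf((i - 0.5) / n))
        for i, r in enumerate(ordered, start=1)
    ]

# Hypothetical Studentized residuals from seven runs.
points = normal_plot_points([0.3, -1.2, 0.8, -0.4, 1.9, -0.1, 0.5])
```

Plotting the second element of each pair against the first gives the points that should fall near a straight line when the residuals are approximately normal.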

Residuals vs. Predicted: This is a plot of the residuals versus the predicted response values in ascending order. It checks the assumption of constant variance. The plot should be a random scatter (constant range of residuals across the graph). Expanding variance (“megaphone pattern <”) in this plot indicates the need for a transformation.

Residuals vs. Run: This is a plot of the residuals versus the experimental run order. It checks for lurking variables that may have influenced the response during the experiment. The plot should show a random scatter. Trends indicate a time-related variable lurking in the background. Blocking and randomization provide insurance against trends ruining the analysis.

Predicted vs. Actual: This is a graph of the predicted response values versus the actual response values. Its purpose is to detect a value, or group of values, that is not easily predicted by the model.

Box-Cox Plot for Power Transforms: This plot provides a guideline for selecting an appropriate power law transformation. A recommended transformation is listed, based on the best lambda value, which is found at the minimum of the curve of the natural log of the residual sum of squares plotted against lambda. If the 95% confidence interval around this lambda includes 1 (which corresponds to no transformation), then the software does not recommend a specific transformation. This plot is not displayed when either the logit or the arcsine square root transformation has been applied.
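
The curve behind a Box-Cox plot can be sketched as follows. This is a minimal illustration under stated assumptions, not the software's algorithm: it uses the standard geometric-mean-scaled power transform so that residual sums of squares are comparable across lambda values, fits a straight-line model, and searches a coarse lambda grid. The helper names line_fit_sse and box_cox_curve and the data are hypothetical, and the confidence interval is not computed.

```python
import math

def line_fit_sse(x, z):
    """Residual sum of squares from a least-squares line fit of z on x."""
    n = len(x)
    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z)) / sxx
    b0 = z_bar - b1 * x_bar
    return sum((zi - b0 - b1 * xi) ** 2 for xi, zi in zip(x, z))

def box_cox_curve(x, y, lambdas):
    """Return a list of (lambda, ln(residual SS)) pairs.

    The response must be strictly positive.
    """
    gm = math.exp(sum(math.log(yi) for yi in y) / len(y))  # geometric mean
    curve = []
    for lam in lambdas:
        if lam == 0:
            z = [gm * math.log(yi) for yi in y]  # limiting case: log transform
        else:
            z = [(yi ** lam - 1) / (lam * gm ** (lam - 1)) for yi in y]
        curve.append((lam, math.log(line_fit_sse(x, z))))
    return curve

# Hypothetical response that grows exponentially with the factor,
# so a log transform (lambda = 0) should be close to the best choice.
x = [1, 2, 3, 4, 5, 6, 7, 8]
noise = [0.05, -0.03, 0.02, -0.04, 0.03, -0.02, 0.04, -0.05]
y = [math.exp(0.5 * xi + d) for xi, d in zip(x, noise)]

lambdas = [k / 4 for k in range(-8, 9)]  # -2.0 to 2.0 in steps of 0.25
curve = box_cox_curve(x, y, lambdas)
best_lambda, best_ln_ss = min(curve, key=lambda pair: pair[1])
```

The lambda at the minimum of this curve plays the role of the "best lambda" described above; a value near 0 suggests a log transformation, while a value near 1 suggests leaving the response untransformed.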

Residuals vs. Factor: This is a plot of the residuals versus any factor of your choosing. It checks whether the variance not accounted for by the model is different for different levels of a factor. If all is okay, the plot should exhibit a random scatter. Pronounced curvature may indicate a systematic contribution of the independent factor that is not accounted for by the model.

For further reading:

  • Geoff Vining. Technical advice: residual plots to check assumptions. Quality Engineering, 23(1):105–110, 2011.