Fit Summary (RSM/MIX Model Selection)

Stat-Ease provides several useful statistical tables that you can use to identify which model to choose for in-depth study (the selection is actually made on the Model button screen). The software underlines and labels as “Suggested” the full-order model that meets the criteria specified below.

Stat-Ease displays a warning about aliased models. A model is aliased when there are not enough unique design points to estimate all of its coefficients. The least squares estimates are then not unique, which can produce contour plots with misleading shapes.
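As a rough illustration of the point-count condition, the sketch below compares the number of unique design points with the number of coefficients in a full polynomial model. The function names are hypothetical and not part of Stat-Ease. Having enough points is only a necessary condition: whether a model is actually aliased depends on where the points sit, which this count does not check.

```python
from math import comb


def full_polynomial_terms(k: int, order: int) -> int:
    """Coefficients (intercept included) in a full polynomial of this order in k factors."""
    return comb(k + order, order)


def too_few_points(n_unique_points: int, k: int, order: int) -> bool:
    """True when the design cannot possibly estimate every coefficient of the model."""
    return n_unique_points < full_polynomial_terms(k, order)


# Assumed example: a three-factor design with 15 unique points.
print(too_few_points(15, k=3, order=2))  # quadratic needs 10 coefficients
print(too_few_points(15, k=3, order=3))  # full cubic needs 20 coefficients
```

In this example the quadratic model passes the count check, while the full cubic does not and would be flagged.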

Fit Summary

The Fit Summary collects the important statistics used to select the correct starting point for the final model. The model(s) suggested are picked via the Whitcomb Score. The suggested model should be considered a good starting point for the model fitting.

Sequential Model Sum of Squares

Mean: The sum of squares for the effect of the mean.

Blocks: Sequential sum of squares for the effect of blocking (if applicable), after removing the effect of the intercept.

Linear: Sequential sum of squares for the linear terms. The F-value tests the significance of adding linear terms to the intercept and block effects. A small p-value (Prob>F) indicates that adding linear terms has improved the model.

2FI: Sequential sum of squares for the two-factor interaction (AB, BC, etc.) terms. The F-value tests the significance of adding interaction terms to the linear model. A small p-value (Prob>F) indicates that adding interaction terms has improved the model.

Quadratic: Sequential sum of squares for the quadratic (A-squared, B-squared, etc.) terms. The F-value tests the significance of adding quadratic terms to the 2FI model. A small p-value (Prob>F) indicates that adding quadratic terms has improved the model.

Cubic: Sequential sum of squares for the cubic terms. The F-value tests the significance of adding cubic terms to the quadratic model. A small p-value (Prob>F) indicates that adding cubic terms has improved the model.
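The F-value on each of these lines is a ratio of mean squares. The sketch below shows the arithmetic, assuming the denominator is the residual mean square of the model that already includes the terms being tested. That choice of denominator is an assumption made for illustration, and the function name and numbers are hypothetical.

```python
def sequential_f(ss_added: float, df_added: int,
                 ss_residual: float, df_residual: int) -> float:
    """F-value for a block of terms added to the model.

    ss_added / df_added: sequential sum of squares and df for the new terms.
    ss_residual / df_residual: residual sum of squares and df (assumed to come
    from the model that includes the new terms).
    """
    ms_added = ss_added / df_added
    ms_residual = ss_residual / df_residual
    return ms_added / ms_residual


# Made-up numbers: two added terms explain 30 units; 20 units remain on 10 df.
print(sequential_f(30.0, 2, 20.0, 10))  # 7.5
```

The p-value (Prob>F) then comes from the F distribution with (df_added, df_residual) degrees of freedom.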

The column labeled “df” provides the degrees of freedom for each source. For response surface models, the degrees of freedom on each model line equal the number of coefficients added to the model at that step.
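The per-line counts for a response surface model in k factors can be tallied directly. This is a minimal sketch assuming full polynomial models at each order; the function name is hypothetical.

```python
from math import comb


def rsm_sequential_df(k: int) -> dict:
    """Coefficients added at each step of the sequential table for k factors."""
    return {
        "Linear": k,                # A, B, C, ...
        "2FI": comb(k, 2),          # AB, AC, BC, ...
        "Quadratic": k,             # A^2, B^2, ...
        "Cubic": comb(k + 2, 3),    # A^3, A^2B, ABC, ...
    }


df = rsm_sequential_df(3)
print(df)
# Adding 1 for the intercept gives the size of the full cubic model.
print(1 + sum(df.values()))
```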

For a mixture model, let q be the number of components. The linear terms have (q-1) degrees of freedom, rather than q, because the sums of squares are corrected for the mean.
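For comparison, here is the same kind of tally for a mixture. It assumes Scheffé-type models, in which the quadratic step adds one cross-product term per pair of components; the function name is hypothetical and only the first two model orders are shown.

```python
from math import comb


def mixture_sequential_df(q: int) -> dict:
    """Degrees of freedom for the first two lines of a mixture fit summary."""
    return {
        "Linear": q - 1,          # q linear terms, less one for the mean correction
        "Quadratic": comb(q, 2),  # one cross-product term per pair of components
    }


print(mixture_sequential_df(3))  # {'Linear': 2, 'Quadratic': 3}
```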

Select the highest-order model whose p-value (Prob>F) is lower than your chosen level of significance (for example, 0.05).
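The p-value rule can be written as a short loop. This sketch applies only that rule; it is not the scoring the software uses to mark a model as "Suggested", and the function name and p-values are made up for illustration.

```python
def pick_model(p_values: dict, alpha: float = 0.05):
    """Return the highest-order model whose sequential p-value is below alpha."""
    order = ["Linear", "2FI", "Quadratic", "Cubic"]  # lowest to highest order
    chosen = None
    for name in order:
        p = p_values.get(name)
        if p is not None and p < alpha:
            chosen = name  # keep overwriting so the highest qualifying order wins
    return chosen


example = {"Linear": 0.0001, "2FI": 0.42, "Quadratic": 0.003, "Cubic": 0.61}
print(pick_model(example))  # Quadratic
```

A model that passes this rule should still be checked against the lack-of-fit test and the aliasing warning before it is used.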

Lack of Fit

The next table displays lack-of-fit tests that diagnose how well each of the full models fits the data. Models with a significant lack of fit should not be used for predictions. The interpretation for the linear model is given below. The lines for the other models can be interpreted in the same way.

Linear: Lack-of-fit sum of squares for the linear model. The F-value compares two sources of variation. The first is the difference between the average responses at the design points and the responses the linear model estimates at those points. The second is the expected experimental variation, as estimated from replicated design points (Pure Error). The F-value is the mean square for the linear model lack-of-fit divided by the mean square for pure error.

The lack-of-fit tests compare the residual error to the pure error from replicated design points. A lack-of-fit error significantly larger than the pure error indicates that something remains in the residuals that can be removed by a more appropriate model. If you see significant lack-of-fit (Prob>F value 0.10 or smaller) then don’t use the model as a predictor of the response.
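The lack-of-fit F-value can be sketched from replicated data as follows. The function name, the responses, and the model predictions are all hypothetical; the predictions are stand-ins for the fitted values of a two-coefficient model, not the result of an actual least squares fit.

```python
def lack_of_fit_f(replicates, predictions, n_coefficients):
    """F-value = lack-of-fit mean square / pure-error mean square.

    replicates: one list of observed responses per unique design point.
    predictions: the model's estimate at each of those design points.
    """
    n_points = len(replicates)
    n_runs = sum(len(ys) for ys in replicates)
    ss_pure_error = 0.0
    ss_lack_of_fit = 0.0
    for ys, y_hat in zip(replicates, predictions):
        mean = sum(ys) / len(ys)
        # Scatter of the replicates around their own average.
        ss_pure_error += sum((y - mean) ** 2 for y in ys)
        # Gap between the point average and the model, weighted by replicate count.
        ss_lack_of_fit += len(ys) * (mean - y_hat) ** 2
    df_lack_of_fit = n_points - n_coefficients
    df_pure_error = n_runs - n_points
    return (ss_lack_of_fit / df_lack_of_fit) / (ss_pure_error / df_pure_error)


observed = [[10, 12], [20, 22], [30, 32], [40, 42]]
predicted = [11, 19, 31, 43]
print(lack_of_fit_f(observed, predicted, n_coefficients=2))  # 4.0
```

Here the lack-of-fit mean square is 16/2 = 8 and the pure-error mean square is 8/4 = 2, giving an F-value of 4.0.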

Note

There will be no Lack of Fit statistics if the design has no replicates, or if the number of unique design points does not exceed the number of model coefficients.
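Both conditions reduce to a degrees-of-freedom check: pure error needs at least one replicated point, and lack of fit needs more unique design points than model coefficients. The sketch below uses a hypothetical function name and assumed run counts.

```python
def lack_of_fit_available(n_runs: int, n_unique_points: int, n_coefficients: int) -> bool:
    """True when both pure error and lack of fit have at least one degree of freedom."""
    df_pure_error = n_runs - n_unique_points
    df_lack_of_fit = n_unique_points - n_coefficients
    return df_pure_error > 0 and df_lack_of_fit > 0


# Assumed example: 20 runs at 15 unique points.
print(lack_of_fit_available(20, 15, 10))  # True: replicates and spare points
print(lack_of_fit_available(20, 15, 20))  # False: too many coefficients
print(lack_of_fit_available(15, 15, 10))  # False: no replicates
```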

Model Summary

The “Model Summary Statistics” table lists other statistics used to compare models.