Screenshots may differ slightly depending on software version.

Historical Data (pt 2)

Part 2 – Advanced Topics

Design Evaluation

If you still have the Longley data active in Stat-Ease® software from Part 1 of this tutorial, continue on. If you exited the program, restart it, open the Help, Tutorial Data menu, and select Employment. Under the Design branch of the program, click Evaluation. The software brings up a quadratic polynomial model by default. The screenshot shows the Response field set at “Design Only” as opposed to the Employment response. In other words, it will evaluate the entire matrix of factors, regardless of whether response data are present. The other option (response by response) comes in handy when experimenters end up with missing data, thus degrading the “designed-for” model.


Design evaluation (design only)

Press the Results tab.


Results of evaluation for quadratic polynomial

This model is badly aliased. For example, the effect of A is confounded with -24.5 CD, etc. Go back to Model and reduce the Order to Linear.


Re-setting order to linear

Press Results again, then move to the Alias Matrix pane and note “No aliases found…” Much better!


Results of evaluation for linear model
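To see what an alias matrix measures, here is a minimal sketch in Python with NumPy. It does not use the Longley data and is not the software's own routine: it assumes a hypothetical four-run half-fraction in which C = AB, fits a linear model, and asks how the omitted two-factor interactions would bias each coefficient, using the textbook formula (X1'X1)^-1 X1'X2.

```python
import numpy as np
from itertools import product

# Hypothetical 4-run half-fraction in coded units, built so that C = A*B.
ab = np.array(list(product([-1.0, 1.0], repeat=2)))
A, B = ab[:, 0], ab[:, 1]
C = A * B

# Terms in the fitted (linear) model: intercept, A, B, C.
X1 = np.column_stack([np.ones(4), A, B, C])
# Terms left out of the model: the two-factor interactions AB, AC, BC.
X2 = np.column_stack([A * B, A * C, B * C])

# Alias matrix: rows are fitted terms, columns are omitted terms.
# A nonzero entry means the omitted term biases that coefficient.
alias = np.linalg.solve(X1.T @ X1, X1.T @ X2)
print(np.round(alias, 3))
```

In this orthogonal example every nonzero entry is exactly 1 (A is aliased with BC, B with AC, C with AB). With correlated happenstance factors the entries are generally fractional and can be large, which is the kind of coefficient the evaluation reported for the quadratic model.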

Move over to the Degrees of Freedom pane to evaluate them.


Degrees of Freedom

Looking over the annotations provided by the software (activated via View, Show Annotation), notice that this design flunks the recommendation for pure error df. Of course, this is not really a designed experiment, but rather historical data collected by happenstance.


Annotations for degrees of freedom
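The degrees-of-freedom bookkeeping can be sketched in a few lines of Python. This is an illustration, not the software's code: the function name df_breakdown and the placeholder rows are hypothetical, and the example assumes 16 rows with six factors (A through F), none of them repeated, as is typical of happenstance data.

```python
from collections import Counter

def df_breakdown(runs, n_model_terms):
    """Split residual df into lack of fit and pure error.

    runs: list of tuples of factor settings, one tuple per row.
    n_model_terms: number of model terms, not counting the intercept.
    """
    n = len(runs)
    # Each group of identical rows contributes (group size - 1) pure error df.
    pure_error = sum(count - 1 for count in Counter(runs).values())
    residual = n - 1 - n_model_terms
    return {
        "model": n_model_terms,
        "residual": residual,
        "lack_of_fit": residual - pure_error,
        "pure_error": pure_error,
    }

# Placeholder rows: 16 distinct settings of six factors (values are made up).
distinct_rows = [(i, i + 1, i + 2, i + 3, i + 4, i + 5) for i in range(16)]
print(df_breakdown(distinct_rows, n_model_terms=6))
```

With no repeated rows there is nothing to estimate pure error from, so all residual df land in lack of fit. A designed experiment would normally include replicates to avoid this.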

Study the Model Terms section of the evaluation. Do any of the statistics pass the tests suggested for a good design? No!


Details on model terms, including power

Now move on to the Leverage report. These statistics come out surprisingly well: none exceeds twice the average.
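For readers who want to see where the leverage numbers come from, here is a minimal sketch in Python with NumPy. It assumes randomly generated stand-in data (not the Longley values) and computes leverage as the diagonal of the hat matrix; the average leverage is the number of model terms divided by the number of runs, and the rule of thumb flags any run above twice that average.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16

# Hypothetical stand-in for happenstance factor settings (not real data).
factors = rng.normal(size=(n, 3))
X = np.column_stack([np.ones(n), factors])  # intercept plus linear terms

# Diagonal of the hat matrix H = X (X'X)^-1 X', computed via QR for stability:
# with X = QR, H = QQ', so each leverage is the sum of squares of a row of Q.
Q, _ = np.linalg.qr(X)
leverage = np.sum(Q**2, axis=1)

average = X.shape[1] / n  # p / n
flagged = np.flatnonzero(leverage > 2 * average)

print("average leverage:", average)
print("runs above twice the average:", flagged.tolist())
```

The leverages always sum to the number of model terms, so a design with few runs per term leaves little room for every run to stay below the threshold.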

More statistics are available by going back to Model, selecting Options, and checking Matrix Measure and Highlight Correlation Values (if not already selected).


Turning on more options for report

Click OK and view the Results. Now look at the Matrix pane and go to the Matrix Measures tab to see new statistics.


Matrix measures for design evaluation

Notice the condition number (12,220) far exceeds the level considered to represent severe multicollinearity for a design matrix (greater than 1000). Check out the Correlation Matrix and Pearson’s r panes to see specific correlations and reveal why.
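A rough feel for the condition number can be had from the sketch below, in Python with NumPy. It uses one common definition, the ratio of the largest to the smallest eigenvalue of the factor correlation matrix; the software may compute its figure differently, so treat this as an assumption. Both data sets are made up: an orthogonal two-level factorial, and three hypothetical variables that all drift with time.

```python
import numpy as np

def condition_number(factors):
    """Largest over smallest eigenvalue of the factor correlation matrix."""
    corr = np.corrcoef(factors, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)  # returned in ascending order
    return eigenvalues[-1] / eigenvalues[0]

# Orthogonal 2^3 factorial in coded units.
factorial = np.array(
    [[a, b, c] for a in (-1, 1) for b in (-1, 1) for c in (-1, 1)],
    dtype=float,
)

# Hypothetical happenstance data: three variables that all track time.
t = np.arange(16, dtype=float)
rng = np.random.default_rng(1)
trending = np.column_stack([
    t,
    2 * t + rng.normal(scale=0.05, size=16),
    -t + rng.normal(scale=0.05, size=16),
])

print("factorial:", condition_number(factorial))
print("trending: ", condition_number(trending))
```

The factorial comes out at the ideal value of 1, while the trending data lands far beyond the threshold for severe multicollinearity.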


Click the blue layout icons on your toolbar to select different pane layouts. Click and drag the tab for each pane to the different sections to customize your view.

The Correlation Matrix shows how the factors are correlated with one another on a scale of -1 (perfect negative correlation) to +1 (perfect positive correlation). These correlations are shown in grid form and color-coded so you can see at a glance where there may be issues. Remember, we don’t want our factors to be correlated. We want independent estimates of how they affect the responses. Therefore, white boxes on the grid are good. By just glancing at this grid, you can see there are a lot of correlations among factors (dark blue and red colors). It’s no wonder Longley picked this data set to test regression software! The Pearson’s r matrix shows Pearson’s correlation coefficients. It’s just a different way of calculating correlation. You can learn more about that by clicking on the tips icon.


Correlation matrices
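The same kind of grid can be produced outside the software. The sketch below, in Python with NumPy, computes Pearson's r for every pair of columns and lists the pairs that would show up in strong colors. The column names, the data, and the 0.8 cutoff are all hypothetical choices made for illustration.

```python
import numpy as np

# Made-up columns: two that rise together and one balanced against both.
names = ["time", "time_sq", "balanced"]
t = np.arange(8, dtype=float)
data = np.column_stack([t, t**2, [1, -1, -1, 1, -1, 1, 1, -1]])

# Pearson correlation between every pair of columns.
r = np.corrcoef(data, rowvar=False)

# Hypothetical cutoff for a "strongly colored" cell in the grid.
threshold = 0.8
flagged = [
    (names[i], names[j], round(float(r[i, j]), 3))
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(r[i, j]) > threshold
]
print(flagged)
```

Only the time and time_sq pair is flagged; the balanced column is uncorrelated with both, which corresponds to a white box on the grid.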

Now, just for fun, press the Graphs tab and select Perturbation from the toolbar.


Perturbation plot for standard error

Notice factors B and F exhibit the most dramatic tracks for standard error. On the Graphs Toolbar select 3D Surface. On the Factors Tool, right-click factor F:Time and change it to X1 axis.


3D view of standard error for factors B and F
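The quantity traced on these plots can be sketched as follows, in Python with NumPy. The sketch assumes the standard error of the predicted mean in units of sigma, sqrt(x0'(X'X)^-1 x0), which may not match the software's calculation in every detail, and it uses a hypothetical two-factor factorial rather than the Longley data. A perturbation-style track varies one factor while holding the other at a reference level.

```python
import numpy as np

def standard_error(X, x0):
    """Standard error of the predicted mean at x0, in units of sigma."""
    xtx_inv = np.linalg.inv(X.T @ X)
    return float(np.sqrt(x0 @ xtx_inv @ x0))

# Hypothetical 2^2 factorial with intercept, in coded units.
X = np.array([[1, a, b] for a in (-1, 1) for b in (-1, 1)], dtype=float)

# Perturbation-style track: vary factor A, hold B at its reference level 0.
levels = np.linspace(-1, 1, 5)
track_a = [standard_error(X, np.array([1.0, a, 0.0])) for a in levels]

for level, se in zip(levels, track_a):
    print(f"A = {level:+.1f}  standard error = {se:.3f}")
```

For a balanced design the track is shallow and symmetric about the center. Steep or lopsided tracks, like those seen for B and F, indicate regions where predictions rest on very little information.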

There’s no sense doing anything more. By now it’s clear that this ‘design’ fails all the tests for a good experiment, but that’s generally the nature of the beast for happenstance data.