Stat-Ease Blog

10 highly intelligent features that make the most from every experiment

posted by Mark Anderson on May 26, 2026

Stat-Ease software provides powerful tools for design of experiments (DOE) with a great deal of intelligence baked in. Here are 10 “smart” features that make DOE easy for our users. From bottom to top (ordered by DOE phase: design, modeling, optimization, and confirmation), every one of them provides great value.

Here we go—the countdown begins!

Half-normal plot for the selection of effects.

Factorial design-building wizard guides you to right-sized experiments via a ‘heads-up’ on power to detect important effects despite the variability of run, sample, and test.
Optimal design builder’s exchange algorithm delivers a finely crafted experiment customized per your specifications.
Preset lineup of near-zero effects on the half-normal graph of factorial effects makes it easy to see those that merit selection.
Scoring system for polynomial models suggests just the right 'Goldilocks' level that does not underfit or overfit your results.
Box-Cox plot studies your model residuals and recommends whether or not to apply a transformation for a better fit and advises which one will do best.
Detection of non-hierarchical models and, if you agree to fix this, the needed terms get added back for a well-formulated polynomial.

3D surface plot of a factorial design with centerpoints.

Application of a curvature test to two-level factorial designs with center points with advice on how to augment the design if significant.
Annotations on statistical outputs that explain them in plain English and provide advice on what to do when they go awry.
Numerical search using a highly effective variable-size simplex algorithm finds the most desirable combination of factor settings and/or component levels meeting all your goals for process efficiency, product efficacy, and cost reduction.
Confirmation tool smartly updates the prediction interval based on the number of follow-up runs at your chosen setting.

Finally, one bonus feature in Stat-Ease software that will make you more intelligent: screen tips via the lightbulb icon (click the >> chevron if showing) next to the Help bubble. This will show interesting information about each feature on the screen for you to understand the underlying statistics.

Email me your favorite “they thought of everything” quality aspect of Stat-Ease software, and I will add it to my list for my next ‘shout out’ on intelligent features.

Good Enough is Great: Why the Simpler Model Might Be Best

posted by Stat-Ease Team on April 15, 2026

(Adapted from Mark Anderson’s 2023 webinar “Selecting a Most Useful Predictive Model”)

There can be a moment when analyzing your response surface method (RSM) experiment that you feel let down. You designed it carefully, maybe as a central composite design built specifically to capture curvature via a quadratic model, but when the results come in, the fit statistics tell you that a linear model fits just fine—no curves needed.

At this point you probably feel cheated. You paid for quadratic, but you only got linear. Now you have to recognize that's not a failure: that's the experiment doing its job.

Designed for Quadratic, Fitted with Less

When George Box and K.B. Wilson developed the central composite design back in 1951, they built it to estimate a full quadratic model: main effects, two-factor interactions, and squared terms that let you map response peaks, valleys, and saddle points. It's a powerful structure, and for many process optimization problems you'll need every bit of it. But not always.

Take a typical study with three factors: say, reaction time, temperature, and catalyst concentration; and two responses to optimize, for example, conversion (yield) and activity. Fit the conversion response, and the quadratic earns its keep. The squared terms are significant, and curvature is real. You get a rich surface to work with. Satisfying.

Then you turn to activity. You run through the same fitting sequence: check the mean, add linear terms, layer in two-factor interactions, and try the quadratic, but the data keeps saying “no thank you” at each step beyond linear. The sequential p-values tell a clear story: main effects matter, but the added complexity contributes nothing.

The right answer isn't to force a quadratic model because that's what you designed for. Use the linear model. That's what the data supports.

Simpler Models Are Easier to Trust

A more parsimonious model—statistician-speak for "simpler, with fewer unnecessary terms"—has real advantages beyond just passing significance tests. Every term you add raises the risk of overfitting: chasing noise instead of signal. A model stuffed with insignificant terms can look impressive on paper while quietly falling apart when you try to predict new results.

The major culprit for bloated models is the R-squared (R²) statistic that most scientists tout as a measure of how well they fitted their results. Unfortunately, R² in its raw form is a very poor quality-indicator for predictive models because it climbs whenever you add a term, regardless of whether it means anything. It is far better to use a more refined form of this statistic called “predicted” R², which estimates how well your model will perform on data it hasn't seen yet.

Trim the insignificant terms from a bloated model and you'll often see predicted R² go up, even as raw R² dips slightly. That's a good sign. For a good example of this counterintuitive behavior of R²s, check out this Stat-Ease software table showing the fit statistics on activity fit by quadratic versus linear models:

Fit statistics on activity fit by quadratic versus linear models
	Activity (quadratic)	Activity (linear)
Std. Dev.	1.08	0.9806
Mean	60.23	60.23
C.V. %	1.79	1.63
R²	0.9685	0.9564
Adjusted R²	0.9370	0.9477
Predicted R²	0.7696	0.9202
Adeq Precision	18.2044	29.2274
Lack of Fit (p-values)	0.3619	0.5197

By the way, if you have Stat-Ease software installed, you can easily reproduce these results by opening the Chemical Conversion tutorial data (accessible via program Help) and, via the [+] key on the Analysis branch, creating these alternative models. This is a great way to work out which model will be most useful. Don’t forget, all else equal, the simpler one is always best—easier to explain with fewer terms to tell a cleaner story.

Here's a guiding principle: if adjusted R² and predicted R² differ by more than 0.2, try reducing your model. Bringing those two statistics closer together is usually a sign you're moving in the right direction.

So, When Do You Stop Tweaking?

This is where a lot of practitioners get into trouble—not by underfitting, but by endlessly refitting. There's always another criterion to check, another comparison to agonize over. Beware of “paralysis by analysis”!

George Box said it well: all models are wrong, but some are useful. The goal isn't a perfect model. The goal is a useful one. Here's how you know when you’ve made a good choice:

Check adequate precision. This statistic measures signal-to-noise ratio: anything above 4 is generally good. Strong adequate precision alongside reasonable R² values usually means you have enough model to work with, even if lack of fit is technically significant. (Lack-of-fit can mislead you, particularly when center-point replicates are run by highly practiced hands who nail that standard condition every time, giving you an artificially tight estimate of pure error.)

Look at your diagnostics, but don't over-interpret them. The top three are the normal plot of residuals, residuals-versus-run, and the Box-Cox plot for potential transformations. On the normal plot, apply the “fat pencil” test: if you can cover the points with a broad marker held along the line, you're fine. You're looking for a dramatic S-shape or an obvious outlier, not minor wobbles.

Try the algorithmic reduction, then compare. Stat-Ease software offers automatic model reduction tools. Run it, compare the reduced model to the full model on predicted R² and adequate precision, and make a judgment call. If the statistics are similar and the model is simpler, take it.

Then press ahead. Once you've checked your fit statistics, run your diagnostics, and done a sensible reduction, go use the model! You can always get a second opinion (Stat-Ease users can request one from our StatHelp team), but at some point the model is good enough. That's the whole point.

The Liberating Truth

There's something freeing about accepting a linear model from an experiment designed for a quadratic. It means your process is well-behaved in that region, easy to interpret and likely to predict well. Now you can get on with finding the conditions that meet your experimental goals—a process that hits the sweet spot for quality and cost at robust operating conditions.

The experiment isn't a failure when it gives you something simpler than expected. It's doing exactly what a good experiment should do: telling you the truth.

Like the blog? Never miss a post - sign up for our blog post mailing list.

Tips and tricks for designing statistically optimal experiments

posted by Mark Anderson on Nov. 19, 2025

Like the blog? Never miss a post - sign up for our blog post mailing list.

A fellow chemical engineer recently asked our StatHelp team about setting up a response surface method (RSM) process optimization aimed at establishing the boundaries of his system and finding the peak of performance. He had been going with the Stat-Ease software default of I-optimality for custom RSM designs. However, it seemed to him that this optimality “focuses more on the extremes” than modified distance or distance.

My short answer, published in our September-October 2025 DOE FAQ Alert, is that I do not completely agree that I-optimality tends to be too extreme. It actually does a lot better at putting points in the interior than D-optimality as shown in Figure 2 of "Practical Aspects for Designing Statistically Optimal Experiments." For that reason, Stat-Ease software defaults to I-optimal design for optimization and D-optimal for screening (process factorials or extreme-vertices mixture).

I also advised this engineer to keep in mind that, if users go along with the I-optimality recommended for custom RSM designs and keep the 5 lack-of-fit points added by default using a distance-based algorithm, they achieve an outstanding combination of ideally located model points plus other points that fill in the gaps.

For a more comprehensive answer, I will now illustrate via a simple two-factor case how the choice of optimality parameters in Stat-Ease software affects the layout of design points. I will finish up with a tip for creating custom RSM designs that may be more practical than ones created by the software strictly based on optimality.

An illustrative case

To explore options for optimal design, I rebuilt the two-factor multilinearly constrained “Reactive Extrusion” data provided via Stat-Ease program Help to accompany the software’s Optimal Design tutorial via three options for the criteria: I vs D vs modified distance. (Stat-Ease software offers other options, but these three provided a good array to address the user’s question.)

For my first round of designs, I specified coordinate exchange for point selection aimed at fitting a quadratic model. (The default option tries both coordinate and point exchange. Coordinate exchange usually wins out, but not always due to the random seed in the selection algorithm. I did not want to take that chance.)

As shown in Figure 1, I added 3 additional model points for increased precision and kept the default numbers of 5 each for the lack-of-fit and replicate points.

Stat-Ease 360 software screenshot showing the Optimal (Custom) Design screen.

Figure 1: Set up for three alternative designs—I (default) versus D versus modified distance

As seen in Figure 2’s contour graphs produced by Stat-Ease software’s design evaluation tools for assessing standard error throughout the experimental region, the differences in point location are trivial for only two factors. (Replicated points display the number 2 next to their location.)

Design evaluation for the I-optimal design.

Design evaluation for the D-optimal design.

Design evaluation for the modified distance design.

Figure 2: Designs built by I vs D vs modified distance including 5 lack-of-fit points (left to right)

Keeping in mind that, due to the random seed in our algorithm, run-settings vary when rebuilding designs, I removed the lack-of-fit points (and replicates) to create the graphs in Figure 2.

Design evaluation for the I-optimal design without lack-of-fit points.

Design evaluation for the D-optimal design without lack-of-fit points.

Design evaluation for the modified distance design without lack-of-fit points.

Figure 3: Designs built by I vs D vs modified distance excluding lack-of-fit points (left to right)

Now you can see that D-optimal designs put points around the outside, whereas I-optimal designs put points in the interior, and the space-filling criterion spreads the points around. Due to the lack of points in the interior, the D-optimal design in this scenario features a big increase in standard error as seen by the darker shading—a very helpful graphical feature in Stat-Ease software. It is the loser as a criterion for a custom RSM design. The I-optimal wins by providing the lowest standard error throughout the interior as indicated by the light shading. Modified distance base selection comes close to I optimal but comes up a bit short—I award it second place, but it would not bother me if a user liking a better spread of their design points make it their choice.

In conclusion, as I advised in my DOE FAQ Alert, to keep things simple, accept the Stat-Ease software custom-design defaults of I optimality with 5 lack-of-fit points included and 5 replicate points. If you need more precision, add extra model points. If the default design is too big, cut back to 3 lack-of-fit points included and 3 replicate points. When in a desperate situation requiring an absolute minimum of runs, zero out the optional points and ignore the warning that Stat-Ease software pops up (a practice that I do not generally recommend!).

A practical tip for point selection

Look closely at the I-optimal design created by coordinate exchange in Figure 3 on the left and notice that two points are placed in nearly the same location (you may need a magnifying glass to see the offset!). To avoid nonsensical run specifications like this, I prefer to force the exchange algorithm to point selection. This restricts design points to a geometrically registered candidate set, that is, the points cannot move freely to any location in the experimental region as allowed by coordinate exchange.

Figure 4 shows the location of runs for the reactive-extrusion experiment with point selection specified.

Design evaluation for the I-optimal design by point exchange.

Design evaluation for the D-optimal design by point exchange.

Design evaluation for the modified distance design by point exchange.

Figure 4: Designs built by I vs D vs modified distance by point exchange (left to right)

The D optimal remains a bad choice—the same as before. The edge for I optimal over modified distance narrows due to point exchange not performing quite as well for as coordinate exchange.

As an engineer with a wealth of experience doing process development, I like the point exchange because it:

Reaches out for the ‘corners’—the vertices in the design space,
Restricts runs to specific locations, and
Allows users to see where they are by showing space point type on the design layout enabled via a right-click over the upper left corner.

Figures 5a and 5b illustrate this advantage of point over coordinate exchange.

Figure 5a: Design built by coordinate exchange with Space Point Type toggled on

On the table displayed in Figure 5a for a design built by coordinate exchange, notice how points are identified as “Vertex” (good the software recognized this!), “Edge” (not very specific) and “Interior” (only somewhat helpful).

Figure 5b: Design built by point exchange with Space Point Type shown

As shown by Figure 5b, rebuilding the design via point exchange produces more meaningful identification of locations (and better registered geometrically): “Vertex” (a corner), “CentEdge” (center of edge—a good place to make a run), “Center” (another logical selection) and “Interior” (best bring up the contour graph via design evaluation to work out where these are located—click any point to identify them by run number).

Full disclosure: There is a downside to point exchange—as the number of factors increases beyond 12, the candidate set becomes excessive and thus the build takes more time than you may be willing to accept. Therefore, Stat-Ease software recommends going only with the far faster coordinate exchange. If you override this suggestion and persist with point exchange, no worries—during the build you can cancel it and switch to coordinate exchange.

Final words

A fellow chemical engineer often chastised me by saying “Mark, you are overthinking things again.” Sorry about that. If you prefer to keep things simple (and keep statisticians happy!), go with the Stat-Ease software defaults for optimal designs. Allow it to run both exchanges and choose the most optimal one, even though this will likely be the coordinate exchange. Then use the handy Round Columns tool (seen atop Figure 5a) to reduce the number of decimal places on impossibly precise settings.

Like the blog? Never miss a post - sign up for our blog post mailing list.

Unraveling the Mystery of Multi-Response Optimization

posted by Shari Kraber on June 1, 2023

The final stage of analyzing designed experiments data is determining the optimal set of process conditions that works for all responses. Stat-Ease software does this via a numerical optimization algorithm. This routine simultaneously optimizes all responses at once, based on goals set by the experimenter. This is achieved by deploying the Derringer-Suich(1) desirability criteria in conjunction with the Nelder-Mead(2) variable-sized simplex search algorithm. This optimization function balances competing response goals to find the “sweet spot” that produces the best of all worlds. Without getting deep into the mathematical weeds of these tools, I would like to provide some basic concepts and discuss how to use this method to optimize DOE results.

Starting point: minimum model requirements

Numerical optimization uses prediction models created by the analysis of each measured response. The stronger the prediction models, the more accurate the optimization results. If the analysis does not show a strong relationship between the factors and the response, then optimization will not work well. At a minimum, the model p-value should be less than 0.05, and the model should only include terms that are statistically significant plus those needed to maintain model hierarchy. If the DOE data included replicates, then there should be an insignificant lack of fit test (p-value >0.10). Key summary statistics for modeling include adjusted R-squared and predicted R-squared. Higher is better for each of these, meaning that more variation in the data and in the predictions is explained by the model. There is not a particular “cut-off” for these values but models that explain more than 50% of the variation are going to perform better than those that do not. In summary, start optimization with response models that explain the data and produce reliable predictions.

Desirability at a specific point

Numerical optimization is driven by a mathematical calculation called desirability. Points within the design space are evaluated via the desirability function that is defined by the user-specified goals for each response. The overall (multi-response) desirability (D) is the geometric mean of the individual desirability (di) for each response.

Figure 1: Desirability function

An individual desirability “little d” (range of 0 to 1) is defined by how closely the evaluated point meets the response goal. Typical response goals are maximize, minimize or target a specific value. In addition to the goal, upper and lower “acceptable” limits on the response values must be set.

Illustration: The experimenters study a process that has 3 input factors and 2 output responses. In this example, the first response (% Conversion) measurements has an observed range of 51-97 percent. The goal for conversion is maximize. Considering business expectations, the minimum acceptable conversion is determined to be 80%, so that is defined as the lower limit. The upper limit is set to the theoretical maximum of 100%. These limits, along with the goal, define the desirability function for the conversion response. When evaluating a particular point in the design space, if the measured conversion is less than 80% (defined lower limit), desirability = 0. If conversion is 80-100%, desirability equals the proportion of the way towards the upper limit (100). Therefore, a conversion of 90 gives d=.5 and a conversion of 95 gives d=0.75. Any point that gives % conversion at 100% or higher will result in d=1.

Figure 2: Response 1 goal: Maximize with an acceptable range of 80-100%.

Response 2 is Activity and the goal is a Target of 63 and a range of 60-66 (Figure 3). Desirability will be 1 only at the exact value of 63. Evaluated points that result in activity levels between 60-63 and 63-66 are rated with desirability values that are proportional to the distance from the target. Activity levels that are either below 60 or above 66 are assigned a desirability of 0.

Figure 3: Response 2 goal: Target 63, with acceptable range 60-66.

The optimization algorithm at work

Once the goals and limits for each response are defined, the search algorithm can start. Stat-Ease software begins with a set of starting points (locations in the design space). For a single starting point, overall desirability (D) is calculated. Then the simplex search starts evaluating desirability (D) in the nearby area and takes “steps” that increase desirability. Steps are taken across the design space until desirability is maximized. All the starting points follow this process, resulting in a set of final “solutions” which are process conditions that at least minimally meet the requirements for all responses (individual desirability is greater than 0).

Optimization solutions

If the process is easy to optimize (the responses don’t compete with each other too much), there may be a large robust space that meets the response goals. In this case a very large number of solutions (process conditions) may be found. These solutions are sorted by the desirability value. Common practice is to focus on the top solution(s). Remember however, all the solutions meet the goals set by the experimenter. Optimization does not mean there is a single set of conditions that is best. If the area is very large (many solutions found) then tightening up the upper or lower limits may be merited. There may also be other external criteria to consider such as cost of the solution, manufacturability, ease of implementation, etc. The experimenter should review all the solutions presented and consider which ones make sense from a business perspective.

Figure 4 shows the optimal conditions for the illustration. The red dots show the location of the optimal settings for the factors, within their range. In this case time is set mid-way in the range (47 min), while temperature is maximized at 90 degrees and catalyst is approximately 2.7%. These process conditions are predicted to result in a conversion of 91% and activity level of 63. Confirmation runs should be completed to verify these results.

Figure 4: Numerical solution “ramps view” for illustration

A side note: Desirability is only a mathematical evaluation tool to compare solutions. Although it ranges from 0 to 1, it is a relative measure within a set of solutions, and not a statistic that needs to be as high as possible. Within a specific DOE, higher desirability means that the solution (set of conditions) met the stated goals more closely than a solution with lower desirability.

Summary

The success of numerical optimization starts with strong prediction models from the DOE analysis. Once models are established, the experimenter specifies each response goal, as well as upper and lower limits around that goal. The numerical search algorithm evaluates areas within the design space, searching for areas that simultaneously meet the goals for all the responses. This optimization function balances competing response goals to find the “sweet spot” that produces the best of all worlds.

References:

G.C. Derringer and R. Suich in “Simultaneous Optimization of Several Response Variables,” Journal of Quality Technology, October 1980, pp. 214-219
Numerical Recipes in Pascal by William H. Press et. al., p.326

Wrap-Up: Thanks for a great 2022 Online DOE Summit!

posted by Rachel Poleke on Oct. 10, 2022

Thank you to our presenters and all the attendees who showed up to our 2022 Online DOE Summit! We're proud to host this annual, premier DOE conference to help connect practitioners of design of experiments and spread best practices & tips throughout the global research community. Nearly 300 scientists from around the world were able to make it to the live sessions, and many more will be able to view the recordings on the Stat-Ease YouTube channel in the coming months.

Due to a scheduling conflict, we had to move Martin Bezener's talk on "The Latest and Greatest in Design-Expert and Stat-Ease 360." This presentation will provide a briefing on the major innovations now available with our advanced software product, Stat-Ease 360, and a bit of what's in store for the future. Attend the whole talk to be entered into a drawing for a free copy of the book DOE Simplified: Practical Tools for Effective Experimentation, 3rd Edition. New date and time: Wednesday, October 12, 2022 at 10 am US Central time.

Even if you registered for the Summit already, you'll need to register for the new time on October 12. Click this link to head to the registration page. If you are not able to attend the live session, go to the Stat-Ease YouTube channel for the recording.

Want to be notified about our upcoming live webinars throughout the year, or about other educational opportunities? Think you'll be ready to speak on your own DOE experiences next year? Sign up for our mailing list! We send emails every month to let you know what's happening at Stat-Ease. If you just want the highlights, sign up for the DOE FAQ Alert to receive a newsletter from Engineering Consultant Mark Anderson every other month.

Thank you again for helping to make the 2022 Online DOE Summit a huge success, and we'll see you again in 2023!