Stat-Ease Blog

Tips and tricks for designing statistically optimal experiments

posted by Mark Anderson on Nov. 19, 2025

Like the blog? Never miss a post - sign up for our blog post mailing list.

A fellow chemical engineer recently asked our StatHelp team about setting up a response surface method (RSM) process optimization aimed at establishing the boundaries of his system and finding the peak of performance. He had been going with the Stat-Ease software default of I-optimality for custom RSM designs. However, it seemed to him that this optimality “focuses more on the extremes” than modified distance or distance.

My short answer, published in our September-October 2025 DOE FAQ Alert, is that I do not completely agree that I-optimality tends to be too extreme. It actually does a lot better at putting points in the interior than D-optimality as shown in Figure 2 of "Practical Aspects for Designing Statistically Optimal Experiments." For that reason, Stat-Ease software defaults to I-optimal design for optimization and D-optimal for screening (process factorials or extreme-vertices mixture).

I also advised this engineer to keep in mind that, if users go along with the I-optimality recommended for custom RSM designs and keep the 5 lack-of-fit points added by default using a distance-based algorithm, they achieve an outstanding combination of ideally located model points plus other points that fill in the gaps.

For a more comprehensive answer, I will now illustrate via a simple two-factor case how the choice of optimality parameters in Stat-Ease software affects the layout of design points. I will finish up with a tip for creating custom RSM designs that may be more practical than ones created by the software strictly based on optimality.

An illustrative case

To explore options for optimal design, I rebuilt the two-factor multilinearly constrained “Reactive Extrusion” data provided via Stat-Ease program Help to accompany the software’s Optimal Design tutorial via three options for the criteria: I vs D vs modified distance. (Stat-Ease software offers other options, but these three provided a good array to address the user’s question.)

For my first round of designs, I specified coordinate exchange for point selection aimed at fitting a quadratic model. (The default option tries both coordinate and point exchange. Coordinate exchange usually wins out, but not always due to the random seed in the selection algorithm. I did not want to take that chance.)

As shown in Figure 1, I added 3 additional model points for increased precision and kept the default numbers of 5 each for the lack-of-fit and replicate points.

Stat-Ease 360 software screenshot showing the Optimal (Custom) Design screen.

Figure 1: Set up for three alternative designs—I (default) versus D versus modified distance

As seen in Figure 2’s contour graphs produced by Stat-Ease software’s design evaluation tools for assessing standard error throughout the experimental region, the differences in point location are trivial for only two factors. (Replicated points display the number 2 next to their location.)

Design evaluation for the I-optimal design.

Design evaluation for the D-optimal design.

Design evaluation for the modified distance design.

Figure 2: Designs built by I vs D vs modified distance including 5 lack-of-fit points (left to right)

Keeping in mind that, due to the random seed in our algorithm, run settings vary when rebuilding designs, I removed the lack-of-fit points (and replicates) to create the graphs in Figure 3.

Design evaluation for the I-optimal design without lack-of-fit points.

Design evaluation for the D-optimal design without lack-of-fit points.

Design evaluation for the modified distance design without lack-of-fit points.

Figure 3: Designs built by I vs D vs modified distance excluding lack-of-fit points (left to right)

Now you can see that D-optimal designs put points around the outside, whereas I-optimal designs put points in the interior, and the space-filling criterion spreads the points around. Due to the lack of points in the interior, the D-optimal design in this scenario features a big increase in standard error, as seen by the darker shading (a very helpful graphical feature in Stat-Ease software). It is the loser as a criterion for a custom RSM design. The I-optimal wins by providing the lowest standard error throughout the interior, as indicated by the light shading. Modified-distance-based selection comes close to I-optimal but comes up a bit short. I award it second place, but it would not bother me if a user who likes a better spread of design points made it their choice.
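For readers who want to see how the two criteria are scored, here is a minimal sketch in Python with NumPy. It is not how Stat-Ease software builds designs. It assumes a simple square region from -1 to +1 in both factors (not the constrained reactive-extrusion region), a full quadratic model, and two made-up nine-run layouts; the function names are my own. The D criterion is the determinant of X'X (bigger is better) and the I criterion is the average prediction variance over the region (smaller is better), approximated here on a grid.

```python
import numpy as np

def quad_terms(x1, x2):
    """Model terms for a full quadratic in two factors."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])

def d_criterion(points):
    """Determinant of X'X; D-optimality seeks to maximize this."""
    X = quad_terms(points[:, 0], points[:, 1])
    return np.linalg.det(X.T @ X)

def i_criterion(points, grid_n=21):
    """Average of x'(X'X)^-1 x over a grid; I-optimality seeks to minimize this."""
    X = quad_terms(points[:, 0], points[:, 1])
    xtx_inv = np.linalg.inv(X.T @ X)
    g = np.linspace(-1.0, 1.0, grid_n)
    g1, g2 = np.meshgrid(g, g)
    F = quad_terms(g1.ravel(), g2.ravel())
    return np.mean(np.einsum("ij,jk,ik->i", F, xtx_inv, F))

# Hypothetical layouts: 4 corners + 4 edge centers, then one extra run
perimeter = np.array([[-1, -1], [1, -1], [-1, 1], [1, 1],
                      [0, -1], [0, 1], [-1, 0], [1, 0]], dtype=float)
outside_only = np.vstack([perimeter, [[1.0, 1.0]]])  # ninth run replicates a corner
with_center = np.vstack([perimeter, [[0.0, 0.0]]])   # ninth run goes to the interior

for name, design in [("outside only", outside_only), ("with center", with_center)]:
    print(name, "D:", round(d_criterion(design), 1), "I:", round(i_criterion(design), 3))
```

In this toy case, moving the ninth run to the center lowers the average prediction variance considerably, which is the behavior the I criterion rewards. The square region is too simple to reproduce the differences seen in Figure 3, so treat the numbers only as an illustration of how each criterion is calculated.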

In conclusion, as I advised in my DOE FAQ Alert, to keep things simple, accept the Stat-Ease software custom-design defaults of I optimality with 5 lack-of-fit points included and 5 replicate points. If you need more precision, add extra model points. If the default design is too big, cut back to 3 lack-of-fit points included and 3 replicate points. When in a desperate situation requiring an absolute minimum of runs, zero out the optional points and ignore the warning that Stat-Ease software pops up (a practice that I do not generally recommend!).

A practical tip for point selection

Look closely at the I-optimal design created by coordinate exchange in Figure 3 on the left and notice that two points are placed in nearly the same location (you may need a magnifying glass to see the offset!). To avoid nonsensical run specifications like this, I prefer to force the exchange algorithm to point selection. This restricts design points to a geometrically registered candidate set, that is, the points cannot move freely to any location in the experimental region as allowed by coordinate exchange.

Figure 4 shows the location of runs for the reactive-extrusion experiment with point selection specified.

Design evaluation for the I-optimal design by point exchange.

Design evaluation for the D-optimal design by point exchange.

Design evaluation for the modified distance design by point exchange.

Figure 4: Designs built by I vs D vs modified distance by point exchange (left to right)

The D optimal remains a bad choice, the same as before. The edge for I optimal over modified distance narrows because point exchange does not perform quite as well as coordinate exchange.

As an engineer with a wealth of experience doing process development, I like the point exchange because it:

Reaches out for the ‘corners’—the vertices in the design space,
Restricts runs to specific locations, and
Allows users to see where they are by showing space point type on the design layout enabled via a right-click over the upper left corner.

Figures 5a and 5b illustrate this advantage of point over coordinate exchange.

Figure 5a: Design built by coordinate exchange with Space Point Type toggled on

On the table displayed in Figure 5a for a design built by coordinate exchange, notice how points are identified as “Vertex” (good the software recognized this!), “Edge” (not very specific) and “Interior” (only somewhat helpful).

Figure 5b: Design built by point exchange with Space Point Type shown

As shown by Figure 5b, rebuilding the design via point exchange produces more meaningful identification of locations (and better registered geometrically): “Vertex” (a corner), “CentEdge” (center of edge—a good place to make a run), “Center” (another logical selection) and “Interior” (best bring up the contour graph via design evaluation to work out where these are located—click any point to identify them by run number).

Full disclosure: There is a downside to point exchange—as the number of factors increases beyond 12, the candidate set becomes excessive and thus the build takes more time than you may be willing to accept. Therefore, Stat-Ease software recommends going only with the far faster coordinate exchange. If you override this suggestion and persist with point exchange, no worries—during the build you can cancel it and switch to coordinate exchange.

Final words

A fellow chemical engineer often chastised me by saying “Mark, you are overthinking things again.” Sorry about that. If you prefer to keep things simple (and keep statisticians happy!), go with the Stat-Ease software defaults for optimal designs. Allow it to run both exchanges and choose the most optimal one, even though this will likely be the coordinate exchange. Then use the handy Round Columns tool (seen atop Figure 5a) to reduce the number of decimal places on impossibly precise settings.


October Publication Roundup

posted by Rachel Poleke, Mark Anderson on Nov. 3, 2025

Here's the latest Publication Roundup! In these monthly posts, we'll feature recent papers that cited Design-Expert® or Stat-Ease® 360 software. Please submit your paper to us if you haven't seen it featured yet!

Featured Article

Green extraction of poplar type propolis: ultrasonic extraction parameters and optimization via response surface methodology
BMC Chemistry, 19, Article number: 266 (2025)
Authors: Milena Popova, Boryana Trusheva, Ralitsa Chimshirova, Hristo Petkov, Vassya Bankova

Mark's comments: A worthy application of response surface methods for optimizing an environmentally friendly process producing valuable bioactive compounds. I see they used Box-Behnken designs appropriately - good work!

Be sure to check out this important study, and the other research listed below!

More new publications from October

Development and evaluation of a battery powered harvester for sustainable leafy vegetable cultivation
Scientific Reports, volume 15, Article number: 33812 (2025)
Authors: Kalluri Praveen, Yenikapalli Anil Kumar, Atul Kumar Shrivastava
Rutin/ZnO/mesoporous Silica-based Nano-hydrogel accelerated topical wound healing in albino mice via potential synergistic bioactive response
European Journal of Pharmaceutics and Biopharmaceutics, Volume 216, November 2025, 114875
Authors: Huma Butt, Haji Muhammad Shoaib Khan, Muhammad Sohail, Amina Izhar, Farhan Siddique, Maryam Bashir, Usman Aftab, Hasnain Shaukat
Production of improved Ethiopian Tej using mixed lactic acid bacteria and yeast starter cultures
Scientific Reports, volume 15, Article number: 33460 (2025)
Authors: Ketemaw Denekew, Fitsum Tigu, Dagim Jirata Birri, Mogessie Ashenafi, Feng-Yan Bai, Asnake Desalegn
Design and Optimization of Trastuzumab-Functionalized Nanolipid Carriers for Targeted Capecitabine Delivery: Anti-Cancer Effectiveness Evaluation in MCF-7 and SKBR3 Cells
International Journal of Nanomedicine, Volume 2025:20 Pages 12075–12102, 3 October 2025
Authors: Shubhashree Das, Bhabani Sankar Satapathy, Gurudutta Pattnaik, Sovan Pattanaik, Yahya Alhamhoom, Mohamed Rahamathulla, Mohammed Muqtader Ahmed, Ismail Pasha
Design and test analysis of a rotary cutter device for root cutting of golden needle mushroom
Scientific Reports, volume 15, Article number: 37219 (2025)
Authors: Limin Xie, Yuxuan Gao, Zhiqiang Lin, Feifan He, Wenxin Duan, Dapeng Ye
Central composite design optimized fluorescent method using dual doped graphene quantum dots for lacosamide determination in biological samples
Scientific Reports, volume 15, Article number: 36507 (2025)
Authors: Ahmed Serag, Rami M. Alzhrani, Reem M. Alnemari, Maram H. Abduljabbar, Atiah H. Almalki
Innovation in functional bakery products: formulation and analysis of moringa-fortified millet cookies
Journal of Food Measurement and Characterization, Published: 17 October 2025
Authors: Anshu, Neeru & Ashwani Kumar
Improving the efficacy and targeting of letrozole for the control of breast cancer: in vitro and in vivo studies
Naunyn-Schmiedeberg's Archives of Pharmacology, Published: 13 October 2025
Authors: Shahira F. El Menshawe, Seif E. Ahmed, Amr Gamal Fouad, Amira H. Hassan
Optimization and Evaluation of Functionally Engineered Paliperidone Nanoemulsions for Improved Brain Delivery via Nasal Route
Molecular Pharmaceutics, Published October 7, 2025
Authors: Niserga D. Sawant, Pratima A. Tatke, Namita D. Desai
Innovative inhalable dry powder: nanoparticles loaded with Crizotinib for targeted lung cancer therapy
BMC Cancer, volume 25, Article number: 1526 (2025)
Authors: Faiza Naureen, Yasar Shah, Maqsood Ur Rehman, Fazli Nasir, Abdul Saboor Pirzada, Jamelah Saleh Al-Otaibi, Maria Daglia, Haroon Khan

September Publication Roundup

posted by Rachel Poleke, Mark Anderson on Oct. 1, 2025

When choosing a featured article for each month, we try to make sure it's available for everyone to read. Unfortunately, none of this month's publications that met our standards are available to the public, so there's no featured article this month. We still recommend checking out the incredible research done by these teams, and congratulations to everyone for publishing!

New publications from September

Evaluating the efficacy of nintedanib-invasomes as a therapy for non-small cell lung cancer
European Journal of Pharmaceutics and Biopharmaceutics, Volume 214, September 2025, 114810
Authors: Tamer Mohamed Mahmoud, Mohamed AbdElrahman, Mary Eskander Attia, Marwa M. Nagib, Amr Gamal Fouad, Amany Belal, Mohamed A.M. Ali, Nisreen Khalid Aref Albezrah, Shatha Hallal Al-Ziyadi, Sherif Faysal Abdelfattah Khalil, Mary Girgis Shahataa, Dina M. Mahmoud
Optimization of fermentation conditions for bioethanol production from oil palm trunk sap
Journal of the Indian Chemical Society, Volume 102, Issue 9, September 2025, 101943
Authors: Abdul Halim Norhazimah, Teh Ubaidah Noh, Siti Fatimah Mohd Noor
Microwave-assisted modification of a solid epoxy resin with a Peruvian oil
Progress in Organic Coatings, Volume 206, September 2025, 109333
Authors: Daniel Obregón, Antonella Hadzich, Lunjakorn Amornkitbamrung, G. Alexander Groß, Santiago Flores
Authors: Hüsniye Hande Aydın, Esra Karataş, Zeynep Şenyiğit, Hatice Yeşim Karasulu
Development of an innovative method of Salmonella Typhi biofilm quantification using tetrahydrofuran and response surface methodology
Microbial Pathogenesis, Volume 208, November 2025, 107992
Authors: Aditya Upadhyay, Dharm Pal, Awanish Kumar
Bauhinia monandra derived mesoporous activated carbon for the efficient adsorptive removal of phenol from wastewater
Scientific Reports volume 15, Article number: 31790 (2025)
Authors: Bhojaraja Mohan, Chikmagalur Raju Girish, Gautham Jeppu, Praveengouda Patil
Quality by Design-Driven Development, Greenness, and Whiteness Assessment of a Robust RP-HPLC Method for Simultaneous Quantification of Ellagic, Sinapic, and Syringic Acids
Separation Science Plus, Volume 8, Issue 9, September 2025, e70126
Authors: V. S. Mannur, Rahul Koli, Atith Muppayyanamath
Dissolution and separation of carbon dioxide in biohydrogen by monoethanolamine-based deep eutectic solvents
Journal of Chemical Technology and Biotechnology, Early View, 12 September 2025
Authors: Xiaokai Zhou, Yanyan Jing, Cunjie Li, Quanguo Zhang, Yameng Li, Tian Zhang, Kai Zhang
Strategic Implementation of Analytical Quality by Design in RP-HPLC Method Development for Andrographis paniculata and Chrysopogon zizanioides Extract-Loaded Phytosomes
Separation Science Plus, Volume 8, Issue 9, September 2025, e70125
Authors: Abisesh Muthusamy, Vinayak Mastiholimath, Darasaguppe R. Harish, Atith Muppayyanamath, Rahul Koli
Improving the bioavailability and therapeutic efficacy of valsartan for the control of cardiotoxicity-associated breast cancer
Journal of Drug Targeting, Published online: 29 Sep 2025
Authors: Mary Eskander Attia, Fatma I. Abo El-Ela, Saad M. Wali, Amr Gamal Fouad, Amany Belal, Fahad H. Baali, Nisreen Khalid Aref Albezrah, Mohammed S. Alharthi, Marwa M. Nagib

August Publication Roundup

posted by Rachel Poleke, Mark Anderson on Sept. 2, 2025

Featured Article

Design and optimization of imageable microspheres for locoregional cancer therapy
Scientific Reports volume 15, Article number: 27487 (2025)
Authors: Brenna Kettlewell, Andrea Armstrong, Kirill Levin, Riad Salem, Edward Kim, Robert J. Lewandowski, Alexander Loizides, Robert J. Abraham, Daniel Boyd

Mark's comments: This is a great application of mixture design for optimal formulation of a medical-grade glass. The researchers used Stat-Ease software tools to improve the properties of microspheres to an extent that their use can be extended to cancers beyond the current application to those located in the liver. Well done!

Be sure to check out this important study, and the other research listed below!

More new publications from August

Use of experimental design for screening and optimization of variables influencing photocatalytic degradation of pollutants in aqueous media: A review of chemometrics tools
Chemical Engineering Research and Design, Volume 220, August 2025, Pages 270-291
Authors: Pedro César Quero–Jiménez, Aracely Hernández–Ramírez, Jorge Luis Guzmán–Mar, Jorge Basilio de la Torre–López, Matheus Silva–Gigante, Laura Hinojosa–Reyes
Analytical Quality by Design-Based Stability-Indicating UHPLC Method for Determination of Inavolisib in Bulk and Formulation
Separation Science Plus, no. 8 (2025): 8, e70110
Authors: Ashwinkumar Matta, Raja Sundararajan
Enhanced anti-infective activities of sinapic acid through nebulization of lyophilized protransferosomes
Frontiers in Nanotechnology | Biomedical Nanotechnology, Volume 7 - 2025
Authors: Hani A. Alhadrami, Amr Gamal, Ngozi Amaeze, Ahmed M. Sayed, Mostafa E. Rateb, and Demiana M. Naguib
Optimizing Anti-Corrosive Properties of Polyester Powder Coatings Through Montmorillonite-Based Nanoclay Additive and Film Thickness
Corrosion and Materials Degradation, 2025, 6(3), 39
Authors: Marshall Shuai Yang, Chengqian Xian, Jian Chen, Yolanda Susanne Hedberg, James Joseph Noël
Regulatory mechanism and multi-index coordinated optimization of pipeline transportation performance of coarse-grained gangue slurry: Experimental and simulation investigation
Physics of Fluids 37, 073343 (2025)
Authors: Jianfei Xu (许健飞); Jixiong Zhang (张吉雄); Nan Zhou (周楠); Hao Yan (闫浩); Wenfu Zhou (周文福); Qian Chen (陈乾); Jiarun Chen (陈嘉润)
Optimization of clayey soil parameters with aeolian sand through response surface methodology and a desirability function
Scientific Reports volume 15, Article number: 30831 (2025)
Authors: Ghania Boukhatem, Messaouda Bencheikh, Mohammed Benzerara, Mehmet Serkan Kırgız, N. Nagaprasad, Krishnaraj Ramaswamy, Souhila Rehab-Bekkouche, R. Shanmugam
Development of electromagnetic drop weight release mechanism for human occupied vehicle
Scientific Reports volume 15, Article number: 30663 (2025)
Authors: Sathia Narayanan Dharmaraj, Karthikeyan Shanmugam, Jothi Chithiravel, Ramesh Sethuraman
Operating parameter optimization and experiment of spiral outer grooved wheel seed metering device based on discrete element method
Scientific Reports volume 15, Article number: 30762 (2025)
Authors: Tao Zhang, Xinglong Tang, Cong Dai, Guiying Ren
Parameter optimization of key components in seed-metering device for pre-cut seed stems of Pennisetum hydridum
Scientific Reports volume 15, Article number: 31318 (2025)
Authors: Chong Liu, Xiongfei Chen, Qiang Xiong, Muhua Liu, Junan Liu, Jiajia Yu, Peng Fang, Yihan Zhou, Chuanhong Zhan, Yao Xiao
Optimization of new and thermally aged natural monoesters blends for a sustainable management of power transformers
Industrial Crops and Products, Volume 235, 1 November 2025, 121741
Authors: Gerard Ombick Boyekong, Gabriel Ekemb, Emeric Tchamdjio Nkouetcha, Ghislain Mengata Mengounou, Adolphe Moukengue Imano

Beware of totally leveraged runs!

posted by Mark Anderson on Aug. 18, 2025

A challenge for a statistical sleuth

A few weeks ago, a process engineer hoping to glean a model of yield as a function of 8 factors asked me to explain the failure by analysis of variance (ANOVA) to produce p values. See this deficiency on the left side of the software output shown in Figure 1. On the right side notice the dire warning about the fit statistics. The missing p’s and other non-available (“NA”) stats created great concern about the validity of the entire analysis.

Design-Expert software screenshot showing the right-click menu for a factor.

Figure 1: Alarming results in ANOVA and fit statistics

The tip-off for what went wrong can be found in the footnote: “Case(s) with leverage of 1.” After poring over the inputs, which stemmed from existing data—not from a designed experiment, I discovered that many of the rows had been duplicated. Removing these ‘dups’ left only 9 unique runs to fit a linear model featuring 8 coefficients for the 8 factors (main-effect slopes) plus 1 coefficient required for the intercept. The statistical software did the best it could from this ‘mission impossible.’ It did nothing wrong.

Creating total leverage as in this multifactor case can be likened to fitting a line to two points. It leaves no degrees of freedom (df) for estimating error (see this shown in the Figure 1 ANOVA). Thus, the F-test cannot be performed and, therefore, no p values can be estimated.
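To make the 'mission impossible' concrete, here is a minimal Python sketch. The factor settings are random numbers that I made up, not the engineer's data. It fits 9 coefficients (intercept plus 8 slopes) to 9 unique runs and computes each run's leverage as the diagonal of the hat matrix, obtained here from a QR decomposition.

```python
import numpy as np

rng = np.random.default_rng(1)              # made-up settings, not the engineer's data
factors = rng.uniform(-1, 1, size=(9, 8))   # 9 unique runs, 8 factors
X = np.column_stack([np.ones(9), factors])  # intercept + 8 main-effect slopes

n_runs, n_coefs = X.shape
residual_df = n_runs - n_coefs              # df left over for estimating error

Q, _ = np.linalg.qr(X)                      # hat matrix H = Q Q'
h = (Q**2).sum(axis=1)                      # leverage of each run = diagonal of H

print("residual df:", residual_df)
print("leverages:", np.round(h, 6))
```

With as many coefficients as runs, every leverage comes out at 1 and nothing is left over for error, so no F-test can be performed.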

A model can be generated (barely!), but the lack of statistical tests provides no confidence in the outcome, literally (zero).

The remedy is very simple: Collect more data!

What is leverage?

Leverage is a numerical value between 0 and 1 that indicates the potential for a design point to influence the model fit. It’s strictly a function of the design itself—not the responses. Thus, leverage can be assessed before running the experiment.

A leverage of 1 means that the model will exactly fit the observation. That is never good because, unless that point falls exactly where it ought to be, your predictive model will be off kilter.

Leverage (“L”) is an easy statistic to master. On average, it equals the number of coefficients in your model divided by the number of unique runs (dups do not count!).

You have seen what happens when all the runs are completely leveraged (L=1). But even one run at a leverage of 1 creates issues. For example, consider a hypothetical experiment aimed at establishing a linear fit of a key process attribute Y on a single factor X. The researchers intend to make 20 runs at two levels. However, due to circumstances beyond their control, they only achieve one run at high level. The 10 points at the low end come in at a leverage of 0.1 each, so none of them individually create much influence on the fitting. That’s good. But the single point at the highest level exhibits a leverage of 1, so it will be exactly fitted wherever it may go. That’s not good, but it may be OK if the result is where it ought to be. However, if something unusual happens at high level, there will be no way of knowing. I would be very skeptical of such an experiment—best to go for a complete ‘do over.’
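The hypothetical experiment above can be checked with a few lines of Python. This is only a sketch: the responses are invented, and the value of 80 at the high level is arbitrary.

```python
import numpy as np

x = np.array([-1.0] * 10 + [1.0])           # 10 runs at low level, 1 run at high
X = np.column_stack([np.ones_like(x), x])   # intercept + slope

Q, _ = np.linalg.qr(X)
h = (Q**2).sum(axis=1)                      # leverages (diagonal of the hat matrix)

rng = np.random.default_rng(7)
y = np.append(rng.normal(50.0, 2.0, size=10), 80.0)  # invented responses
fitted = Q @ (Q.T @ y)                      # least-squares predictions
residuals = y - fitted

print("leverages:", np.round(h, 3))
print("residual at the high level:", round(abs(residuals[-1]), 6))
```

The ten low-level runs each carry a leverage of 0.1, while the lone high-level run carries a leverage of 1 and a residual of zero. Change the 80 to any other number and the fitted line will still pass exactly through it.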

Watch for leverages close to 1.0. Consider replicating these points, or make sure they are run very carefully.

What if no runs exhibit leverage of 1, but some are highly leveraged relative to others?

Some designs, such as standard two-level factorials with no center points, produce runs with equal leverage. However, others do not. For example, a two-level design on 4 factors with 4 center points features 16 runs with a leverage of 0.9875, far exceeding the center-point leverages of 0.05. Nevertheless, under the generally accepted guideline that leverages less than 2 times the average cause no great concern, this design gets a pass, the average leverage being 0.8. A two-level design with center points is like a teeter-totter: points at the center sit at the fulcrum and thus create very low leverage.
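These leverages can be reproduced with a short Python sketch. One assumption on my part: the model is the full factorial model (intercept, 4 main effects and all interactions, 16 coefficients in total), which is what yields the figures quoted above.

```python
import itertools
import numpy as np

factorial = np.array(list(itertools.product([-1.0, 1.0], repeat=4)))  # 16 runs
center = np.zeros((4, 4))                                             # 4 center points
runs = np.vstack([factorial, center])

# Assumed model: intercept, 4 main effects, 6 two-factor, 4 three-factor
# and 1 four-factor interaction (16 coefficients)
columns = [np.ones(len(runs))]
for order in range(1, 5):
    for idx in itertools.combinations(range(4), order):
        columns.append(np.prod(runs[:, list(idx)], axis=1))
X = np.column_stack(columns)

Q, _ = np.linalg.qr(X)
h = (Q**2).sum(axis=1)                     # leverages
average = h.mean()                         # coefficients divided by runs: 16/20
flagged = np.flatnonzero(h > 2 * average)  # the 2x-average guideline

print("factorial:", np.round(h[:16], 4))
print("center:", np.round(h[16:], 4))
print("average:", round(average, 4), "flagged runs:", flagged)
```

Because twice the average is 1.6 and no leverage can exceed 1, no run gets flagged.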

I advise you focus only on runs with leverage greater than 2 times the average leverage (or any with leverage of 1, of course). It is best to identify high-leverage points before running the experiment via a design evaluation and, if affordable, replicate them, thus reducing their leverage.

Do not be greatly concerned if leverages get flagged after you reduce insignificant terms from your model. For example, see the case study by our founder Pat Whitcomb in his article on “Bad Leverages” in the March 1998, Stat-Teaser—a must read if you want to get a good grasp on leverage.

Keep in mind that, despite being flagged for high leverage (2x average), a design point may generate a response that typifies how the process behaves at that setting. In that case it does not invalidate the model. Apply your subject-matter knowledge and/or ask an expert colleague to be the judge of that.

General advice on leverage and situations to avoid

If you use standard DOE templates or optimal tools to lay out an experiment, it is unlikely that your design will include points with leverage over twice the average leverage. But, if you override the defaults and warnings in your software, issues with leverage can arise. For example, I often see published factorial designs with only 1 center point, not the 3 or 4 that our software advises. This creates a leverage of 1 for the curvature test, which is not good. Believe it or not, as a peer reviewer for a number of technical journals I’ve also seen many manuscripts that lay out the recommended number of center points for standard designs (e.g., 4 for a two-level factorial), but the center points all show the same results. As already explained, when it comes to leverage do not be duped by ‘dups.’

The dangers of happenstance data

I am particularly wary of historical data with runs done haphazardly (no plan). These often create a cloud of points at one end with very few at the opposite extreme. For example, see the scatter plot in Figure 2 (real data from a study of infection rates after varying number of days at various hospitals in the USA).

Figure 2: A real-life dataset with a badly leveraged point

In this case, the point at the upper right exhibits a leverage of 0.99 versus all the other 12 points averaging 0.17. If possible, replicating such a high-leverage point would be very helpful, thus reducing its leverage by about half. Better yet, run two replicates to reduce this problematic point’s leverage to about one-third of its original value. Though not emerging as an outlier in the diagnostics (very unlikely for a highly leveraged point because it will be closely fitted), this particular result must be carefully evaluated and ignored if determined to be exceptional.
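Here is a minimal Python sketch of how replication dilutes leverage. The x values are invented (they are not the hospital data behind Figure 2) and the model is assumed to be a straight line. If a single point has leverage h, adding one exact replicate leaves each copy with h/(1+h), and adding two leaves each with h/(1+2h). That works out to roughly one-half and one-third when h is close to 1.

```python
import numpy as np

def leverages(x):
    """Leverages for a straight-line fit (intercept + slope)."""
    X = np.column_stack([np.ones_like(x), x])
    Q, _ = np.linalg.qr(X)
    return (Q**2).sum(axis=1)

# Invented data: a cloud of 12 points plus one far-out point
cloud = np.array([7.5, 8.0, 8.2, 8.8, 9.0, 9.3, 9.5, 9.9, 10.2, 10.5, 11.0, 11.4])
extreme = 20.0

h_extreme = {}
for copies in (1, 2, 3):
    x = np.concatenate([cloud, np.full(copies, extreme)])
    h_extreme[copies] = leverages(x)[-1]   # leverage of each copy of the far-out point
    print(copies, "run(s) at the extreme -> leverage", round(h_extreme[copies], 3))
```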

Conclusion

Pay attention to leverage, ideally before you complete your experiment, but if you are developing a model from existing data, do so in the diagnostics from your statistical software. Beware of totally leveraged runs, this being the worst-case scenario. If not quite this bad, watch for leverages more than twice the average and, if possible, replicate them. Otherwise, apply engineering and scientific expertise to decide if the results can be accepted.
