Saturday, August 29, 2009

George Box's "Improving Almost Anything"

George E. P. Box is probably the greatest industrial statistician of the 20th century. He is quoted as saying, "Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful." In short, whenever a scientist builds a model or formula from observed or historical data, that model inherently contains errors. Models give us a way to predict the behavior of physical systems, but many times I have seen a scientific model applied to a process or system whose response to a change it could not predict.

The reasons are simple.

1) Models built on historical data cannot separate causal factors from nuisance factors or noise very well.
2) Time trends affect physical processes. If the data were collected over long periods, such as years, there is little accounting for the random factors that affected the process or system along the way.
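
A small simulation can make both reasons concrete. This is only a sketch under made-up assumptions: the process, the numbers, and the function names (`historical_data`, `designed_experiment`, `ols_slope`) are hypothetical and do not come from Box. In the simulated process, temperature has no causal effect on yield at all, but an unrecorded time trend moves yield while the temperature setpoint creeps upward over the same period.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def historical_data(n_days=500, seed=0):
    """Passively observed data: temperature and yield both follow the calendar."""
    rng = random.Random(seed)
    temps, yields = [], []
    for day in range(n_days):
        drift = 0.02 * day  # assumed unrecorded time trend in the process
        # Assumed: the setpoint crept upward over the same period.
        temps.append(150 + 0.02 * day + rng.gauss(0, 1))
        # Temperature does NOT appear here: it has no causal effect on yield.
        yields.append(80 + drift + rng.gauss(0, 1))
    return temps, yields


def designed_experiment(n_runs=500, seed=1):
    """Same process, but the temperature for each run is chosen at random."""
    rng = random.Random(seed)
    temps, yields = [], []
    for run in range(n_runs):
        drift = 0.02 * run  # the time trend is still present
        temps.append(rng.choice([150, 160]))  # randomization breaks the link to time
        yields.append(80 + drift + rng.gauss(0, 1))
    return temps, yields


print("slope from historical data:  ", round(ols_slope(*historical_data()), 3))
print("slope from designed experiment:", round(ols_slope(*designed_experiment()), 3))
```

The regression on the historical data should report a clearly positive slope for temperature even though none exists, while the randomized runs should give a slope near zero. A model fitted to the first data set would predict a yield gain from raising temperature that the real process would never deliver.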

Box also stated, "To find out what happens to a system when you interfere with it you have to interfere with it (not just passively observe it)." Here George Box presents the wisdom of designed experiments and planned experimentation. When you deliberately interfere with a process or system, you can observe the result directly. I think George believes almost anything can be improved through the use of design of experiments (DOE) and statistical methods. Above all, he advocates a scientific approach built on planned experimentation, and I definitely agree.
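
As an illustration of what "interfering" with a process looks like in practice, here is a minimal sketch of a two-level full factorial experiment on three factors, run in random order. The process itself (`hypothetical_process`) and its coefficients are invented for the example; in a real study the response would come from the plant or the lab, not from a formula.

```python
import itertools
import random


def full_factorial(n_factors):
    """All combinations of coded low (-1) and high (+1) levels."""
    return list(itertools.product((-1, 1), repeat=n_factors))


def hypothetical_process(a, b, c, rng, noise_sd=0.5):
    """Made-up response: A helps, B hurts, C does nothing, A and B interact."""
    return 60 + 5 * a - 3 * b + 0 * c + 2 * a * b + rng.gauss(0, noise_sd)


def run_experiment(seed=0, noise_sd=0.5):
    """Run every design point once, in randomized order."""
    rng = random.Random(seed)
    design = full_factorial(3)
    order = list(range(len(design)))
    rng.shuffle(order)  # randomizing run order guards against time trends
    responses = [None] * len(design)
    for i in order:
        responses[i] = hypothetical_process(*design[i], rng, noise_sd)
    return design, responses


def main_effect(design, responses, column):
    """Average response at the high level minus average at the low level."""
    high = [y for run, y in zip(design, responses) if run[column] == 1]
    low = [y for run, y in zip(design, responses) if run[column] == -1]
    return sum(high) / len(high) - sum(low) / len(low)


design, responses = run_experiment()
for name, column in (("A", 0), ("B", 1), ("C", 2)):
    print(name, "main effect:", round(main_effect(design, responses, column), 2))
```

With only eight runs, the estimated effects should land close to the values built into the made-up process (10 for A, -6 for B, 0 for C), because each factor was changed deliberately and independently of the others.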

A recent trend in data mining is somewhat disturbing to me. It goes against the teachings of R. A. Fisher and George Box. Whenever we look at past data, we never know for sure whether apparent causal factors are real. There is always the possibility of spurious correlations and erroneous conclusions. Nonetheless, because of the large growth in available data, the profession and activities of data miners are growing. I always urge caution about research that relies on data mining. In my experience with historical data from chemical processes, good models rarely emerge without additional experimentation to validate or verify the results.
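
The risk of spurious correlation is easy to demonstrate with pure noise. The sketch below is an illustration only, with hypothetical names (`mine_noise`, `pearson_r`) and arbitrary sizes: it screens 200 random "process variables" against a random "response" over 30 observations, none of which are related in any way.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mine_noise(n_rows=30, n_candidates=200, seed=0):
    """Correlate many unrelated noise variables with a noise response."""
    rng = random.Random(seed)
    response = [rng.gauss(0, 1) for _ in range(n_rows)]
    correlations = []
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_rows)]
        correlations.append(pearson_r(candidate, response))
    return correlations


correlations = mine_noise()
strongest = max(abs(r) for r in correlations)
# |r| > 0.361 is approximately the two-sided 5% cutoff for 30 observations.
flagged = sum(1 for r in correlations if abs(r) > 0.361)
print("strongest correlation found in pure noise:", round(strongest, 2))
print("candidates that look 'significant' at 5%:", flagged, "of", len(correlations))
```

Screening enough variables will nearly always turn up a few that appear related to the response. Without a follow-up experiment, there is no way to tell these accidents apart from real causal factors.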

To read more about George Box's philosophies, see the book "Improving Almost Anything: Ideas and Essays" by George Box and Friends, 2006, Wiley.

Almost all processes and systems can be improved. All that is needed is a planned and guided scientific approach to a solution. Box points out that this was the critical ingredient of the Six Sigma revolution, when the CEOs of Motorola, GE, Honeywell and Texas Instruments realized the power of statistical methods and the need to train employees for continuous improvement.

