Design of Experiments
An Introduction
Design of Experiments (DOE) is a powerful statistical tool for solving complex problems. After reading that, you are probably thinking, "Powerful, statistical, and complex: this sounds hard," and "I already know how to use lots of tools; why should I learn one more?" Our job is to convince you that this tool can save you time and effort and that you can use it to solve otherwise unsolvable problems.
During your first 14 years of schooling you learned about the Scientific Method, especially the notion that to learn anything about a system through experimentation, you have to vary only one factor and hold everything else constant. With DOE we can get more information out of the Scientific Method, or we can "cheat the system" by skipping some of the required steps and still get lots of useful information. To show you how DOE can help, we will first discuss the problems engineers face and the inadequacies of the traditional approaches.
Real World Problems
Suppose you want to do something common like blow-mold a 2-liter bottle. Since this is a commodity item, profit margins are tight, and very small differences in manufacturing efficiency can mean the difference between wealth and financial ruin. As the engineer in charge, you are faced with several problems:
A number of factors that affect the outcome (temperatures, pressures, flow rate...)
Inadequate theoretical models (note that programs like Moldflow are available)
The huge cost of downtime
Let's say that you have only 8 factors and you wish to set up an experimental plan to determine the effect of each. If you look at only two levels for each factor (high and low) and you use the scientific method, you will get to do 2^8, or 256, experiments. If each test takes 10 minutes and you run three repetitions of each, the test program will require only 128 hours of work. Then, when you analyze the data, you will probably miss any interactive effects between the factors because you were only looking for individual effects.
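The arithmetic behind those numbers can be checked with a short sketch. It assumes nothing beyond the figures quoted above (8 factors, 2 levels, 10 minutes per test, 3 repetitions); the variable names are illustrative only.

```python
from itertools import product

n_factors = 8               # factors in the bottle example
levels = ("low", "high")    # two levels per factor
minutes_per_test = 10
repetitions = 3

# Every combination of levels across all factors (a full factorial plan).
runs = list(product(levels, repeat=n_factors))

n_runs = len(runs)  # 2**8 combinations
total_hours = n_runs * repetitions * minutes_per_test / 60

print(n_runs, "combinations,", total_hours, "hours of testing")
```

Adding a ninth factor doubles both numbers, which is why the full plan becomes impractical so quickly.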
Note
that we just snuck in several terms that are very important in DOE and bear a
quick look:
Since you are worried about excessive downtime of the machine, you decide to find an analytical solution. After struggling with the problems of transient heat transfer, the flow of a viscoplastic solid undergoing dynamic deformation while changing temperature, and the complex flow of a compressible fluid, you decide that your theoretical model is unlikely to reflect reality.
How Can DOE Help?
With design of experiments you can:
run a relatively small number of tests to isolate the most important factors (screening test)
determine if any of the factors interact (combined effects are as important as individual effects) and the level of interaction
predict the response for any combination of factors using only empirical results
optimize using only empirical results
An example of a screening test is the Helicopter Drop Experiment run in ME 311. In that test, 8 factors were selected as potentially important to the flight time of a paper helicopter. Instead of the 256 tests required by the scientific method, a carefully selected set of 16 tests (a fractional factorial) was run to isolate the most significant factors. This is done so that designers can focus their efforts more productively. Note that this is not a good technique for squeezing the last little bit of performance out of a system. It is a good way of determining how to allocate resources at the beginning of a design effort, or where to look for likely bottlenecks in a process.
Response prediction and factor interaction can be determined using DOE. Consider the table below. Suppose that factor A is the silicon content of a heat-treatable aluminum and that factor B is the heat treating temperature. In this case the response values may represent the hardness.

A
plot of the data shows how response is a function of factors A and B. Since the lines are parallel, we can assume
that little interaction between the variables occurs.

Consider
another set of data as shown below. In
this instance, the response may be wear rate, factor A may be RPM, and factor B
may be the type of lubricant.

Again, a plot of the data shows how the response depends on the factors. Now we see that B2 is a better lubricant at low RPM but not at high RPM. The crossing of the lines indicates an interactive effect between the factors. In this case, RPM and lubricant type interact.

We
can construct an empirical model from these results that includes predictive
equations. The predictive empirical
model from DOE doesn't provide as much engineering intuition regarding the
process as a theoretical model, but it often is a better predictor of
outcomes. Since the ultimate goal of
the work was to predict outcomes, we are not overly disappointed.
Optimization was not specifically covered in any of the examples, but it can be an important reason for using DOE. Figure 2 shows the kind of response surface that can be generated using DOE software. Maxima and minima can be found from these response surfaces using standard numerical techniques. Optimizing with DOE can be faster than with other techniques and, because factor interactions are taken into account, can be more accurate.

Figure 2 Response surface.
Overview - Steps in the Process
The approach to an experimental problem using DOE is not significantly different from the traditional approach. You still need to define the problem and determine the properties to be measured. The general approach is outlined below.
1. Statement of the Problem
As with any engineering problem, defining what you wish to know is the key starting point. In the case of the helicopter, we wished to design a paper helicopter that had the longest time to drop. We also had certain constraints in terms of available materials, etc. For the catapult, we wanted to predict the distance traveled by the marble. The most important point is that this problem statement is not significantly different from those you have written before for design or experimentation.
2. Choice of Factors and Levels
Factors are the potential design parameters, such as wing length or paper type.
Levels are the values at which each factor is tested. In our work we used two-level experiments: each factor was either High or Low. Alternatively, we could have chosen High, Medium, and Low, or we could have used some larger number of levels.
3. Selection of Response Variable
For the helicopter design, the most important response was the time to drop. In other cases, the appropriate response variable may be more difficult to choose. For example, improved gas mileage may be the goal, but brake-specific fuel consumption may be a better quantity to measure.
4. Choice of Design
A variety of choices are possible here. For the helicopter, we selected a screening test. Specifics on how to set up the different test designs can be found in design of experiments textbooks. One important aspect of testing is the testing order. Tests should be run in a random order (including replicate tests) to minimize the effect of unknown or uncontrollable factors. For example, muzzle velocity tests of different ammunition would be run in random order so that barrel heating does not skew the results.
5. Perform Experiment
6. Data Analysis
Effects of each
factor can be found by the difference between the high responses and the low
responses for the factor. To get these,
the responses from each experiment at high setting for one factor are summed,
and the responses from each experiment at low setting for the same factor are
summed. The difference between these
two sums is a measure of the importance of the factor. Interactive effects can be compared in the
same way.
Paper Helicopter Experiment
Goal:
To determine the important factors
controlling the time aloft of a paper helicopter.
Task:
Eight variables or factors were
chosen that we believe could be important in the flight time of the
helicopter. Since no theoretical
relationship is available relating flight time to the chosen factors, we will
try to screen the factors to determine which are the most important, and thus
worthy of more study.
To keep things simple we will only look at high and low settings for each variable. For further simplicity, we select a screening method with only 16 required tests. If we wished to look for interactions between variables, we would need a larger set of tests. To fully test 8 variables at two settings (high and low) would require 2^8, or 256, tests. This would be prohibitively expensive for most companies in terms of time and money, so the screening test is used to narrow the scope of the investigation before looking for higher order effects.
Experimental procedure:
Table 1 lists the eight factors that may influence flight time. Figure 1 shows the pattern of the helicopter. Each factor is assigned a high and a low value, designated + and - respectively in the design matrix (Table 2).
A total of 16 helicopters must be constructed. Each will have a different combination of factors as designated in the design matrix of Table 2. (Note that Table 2 represents a 1/16 fractional factorial screening matrix. The method of setting up such a matrix is described in reference 1.)
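As a rough illustration of how a 16-run matrix for 8 two-level factors can be built, the sketch below runs four "base" factors as a full factorial and sets each of the other four to a product of base-factor levels. The letters A to H are stand-ins for the eight helicopter factors, and the generators E=BCD, F=ACD, G=ABC, H=ABD are an assumption (a common textbook choice); the handout's own design matrix may use a different assignment.

```python
from itertools import product
from math import prod

BASE = "ABCD"  # four base factors run as a full 2**4 factorial
# Assumed generators (a common textbook choice, not taken from the handout).
GENERATORS = {"E": "BCD", "F": "ACD", "G": "ABC", "H": "ABD"}

def fractional_factorial():
    """Return 16 runs, each a dict mapping factor letter to -1 or +1."""
    runs = []
    for settings in product((-1, 1), repeat=len(BASE)):
        run = dict(zip(BASE, settings))
        # Each extra factor's level is the product of its generator's levels.
        for name, word in GENERATORS.items():
            run[name] = prod(run[f] for f in word)
        runs.append(run)
    return runs

design = fractional_factorial()
for i, run in enumerate(design, start=1):
    print(f"{i:2d}", " ".join("+" if run[f] > 0 else "-" for f in "ABCDEFGH"))
```

Every column in this matrix has eight + and eight - entries, and any two columns are orthogonal, which is what allows the main effects to be estimated separately from only 16 builds.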
After constructing all helicopters,
each will be dropped in a randomized order.
Three runs of each will be made and the average of the three drop times
used for the response value for that helicopter.
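A randomized drop order can be drawn up before testing starts. The sketch below assumes that every individual drop (16 helicopters, 3 drops each) is shuffled, replicates included; the seed value is arbitrary and is fixed only so the same plan can be printed again.

```python
import random

n_helicopters = 16
n_drops = 3

# One entry per individual drop: (helicopter number, repetition number).
drops = [(h, r)
         for h in range(1, n_helicopters + 1)
         for r in range(1, n_drops + 1)]

rng = random.Random(311)  # arbitrary fixed seed so the plan is reproducible
rng.shuffle(drops)

for order, (h, r) in enumerate(drops, start=1):
    print(f"drop {order:2d}: helicopter {h:2d}, repetition {r}")
```

Shuffling the individual drops, rather than only the helicopters, spreads slow drifts such as air currents or a tiring timekeeper across all 16 designs.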
Data Analysis:
To evaluate the effect of a variable
such as the paper type, we look at the difference in average response values
between helicopters with plain paper and helicopters made of construction
paper. This is done by first summing
the response values for helicopters made from plain paper and finding an
average. Then sum the response values
for helicopters made with construction paper and find an average. The difference in these averages represents
the "main effect" of paper type on flight time. If the difference is relatively high with
respect to the "effects" of other variables, then paper type is
important.
So, for each variable, sum all
response values at the + setting and find an average. From this "average of the highs" subtract the
"average of the lows" (the average of all response values at the
minus setting) to get the "effect".
List the variables with their "effects" on flight time to see
which are most important. A plot may be
helpful.
Conclusion:
This was a very basic glimpse at a
very powerful collection of tools. We
only looked at screening of variables and not at variable interactions and
optimization. To learn more about this
topic you can take STAT 303 Design of Engineering Experiments.
Statistical design of experiments is
being used by a number of companies to optimize their designs or
processes. Software for this purpose is
readily available. Some of the software
companies offer training courses in the method and/or assistance with your
particular problem (for a fee).
Reference: Statistical Design and Analysis of Experiments, Mason, Gunst, and Hess.




Catapult Experiment
Schubert et al. published an article on a catapult experiment like this in Quality Engineering in 1992. Much of what follows is drawn from their work.

Figure 3 The catapult.
Analytical Approach
If we consider the catapult shown in Figure 3, we might reasonably conclude that with a little dynamics we could predict exactly how far a marble would be thrown for any combination of factors. Schubert et al. report the following equation for such an analysis.
(eqn. 1)
where I0 is the moment of inertia of the moment arm/ball combination relative to the pivot point 0, θ0 and θ1 are the initial and launching angles, M and m are the masses of the moment arm and ball, respectively, and rG is the distance from 0 to the mass center G of the moment arm. The other dimensions are shown in Figure 4. F(θ) is the force exerted by the rubber band on the moment arm.

Figure 4 Schematic of the catapult with dimensions.
Now that you have the equation, you can measure all the parts and try to model the nonlinear response of the rubber band. According to the paper, the analysis took about 200 man-hours, and experiments showed that the predicted landing site of the ball was accurate to within 15 inches. Accuracy is affected by the measurement precision and by idealizations in the analysis. The history-dependent, nonlinear behavior of the rubber band is also troublesome.
DOE Approach
In their case, a design of experiments approach required 6 man-hours, and accuracy was within 3 inches. The design of experiments approach effectively defines a response envelope for the system. Once the envelope is defined, the response for any combination of factor settings can be estimated.
Factors - For the example in this class we use only three factors:
Stop angle
Hook attachment point
Arm length
Levels - These factors are each tested at only two levels, high and low. A complete set of tests is run as shown in Table 1. In Table 1, a plus sign indicates a high level and a minus sign indicates a low level. Thus, in test #1 the hook, the stop angle, and the arm length are all at the low level. In test #8, all factors are at the high level.
Interactive effects - You will notice three other columns in Table 1: AB, BC, and AC. AB is the product of the level values for A and B (hook and arm length). The other columns are also products. These products are used as indicators of interactive effects between the factors.
Table 1 Test matrix for catapult test
| Test # | Hook (A) | Arm length (B) | Stop angle (C) | AB | BC | AC |
| 1 | -1 | -1 | -1 | +1 | +1 | +1 |
| 2 | -1 | -1 | +1 | +1 | -1 | -1 |
| 3 | -1 | +1 | -1 | -1 | -1 | +1 |
| 4 | -1 | +1 | +1 | -1 | +1 | -1 |
| 5 | +1 | -1 | -1 | -1 | +1 | -1 |
| 6 | +1 | -1 | +1 | -1 | -1 | +1 |
| 7 | +1 | +1 | -1 | +1 | -1 | -1 |
| 8 | +1 | +1 | +1 | +1 | +1 | +1 |
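A test matrix of this kind can be generated rather than typed by hand. The sketch below is one way to do it, assuming the column labels A = hook, B = arm length, C = stop angle; it lists the eight combinations in the same order as the table and forms each interaction column as a product of two factor columns.

```python
from itertools import product

FACTORS = ("A", "B", "C")           # A = hook, B = arm length, C = stop angle
INTERACTIONS = ("AB", "BC", "AC")

rows = []
for test, (a, b, c) in enumerate(product((-1, 1), repeat=3), start=1):
    rows.append({"test": test, "A": a, "B": b, "C": c,
                 "AB": a * b, "BC": b * c, "AC": a * c})

print("Test   A   B   C  AB  BC  AC")
for row in rows:
    cells = " ".join(f"{row[k]:+3d}" for k in FACTORS + INTERACTIONS)
    print(f"{row['test']:>4} {cells}")
```

Because `product` varies the last factor fastest, test 2 differs from test 1 only in stop angle, matching the layout of the table.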
For
each test number, four repetitions were made.
In other words, the marble was thrown four times at each setting and the
travel distance was recorded. From
these an average value for each test is determined.
Table 2 Measured values
| Test # | Trial 1 | Trial 2 | Trial 3 | Trial 4 | Avg. (Yr) |
| 1 | 50.5 | 51 | 51.5 | 51.5 | 51 |
| 2 | 24 | 23.5 | 24 | 24.5 | 24 |
| 3 | 90 | 94 | 90.25 | 87.5 | 91 |
| 4 | 39.5 | 42 | 40 | 40.5 | 41 |
| 5 | 76.5 | 76 | 76.5 | 75.5 | 76 |
| 6 | 48.5 | 48.5 | 50 | 50.5 | 49 |
| 7 | 117.5 | 116 | 117 | 119.5 | 117 |
| 8 | 84 | 81.5 | 82 | 80 | 82 |
Data Reduction - The interesting thing is how the data are analyzed. If we average all the distances from tests in which the hook is at the high level, then compare that with the average from all the tests in which the hook is at the low level, we would expect to see some difference. Indeed, this difference between high and low settings is the key to this technique. Table 3 shows the results of those calculations.
Important variables (screening) - If we see a small difference, we can say that the factor is not so important. Thus, the size of the difference defines the relative importance of each factor. This also works for the combined factors, so we can see the relative importance of the interactive effects.
Table 3 Relative effects of each factor and interactive effects
| | Hook A | Arm Length B | Stop Angle C | AB | BC | AC |
| Avg Y- | 51.8 | 50.1 | 83.8 | 64.4 | 70.4 | 64.5 |
| Avg Y+ | 81.1 | 82.8 | 49.1 | 68.5 | 62.5 | 68.4 |
| ΔY | 29.4 | 32.7 | -34.7 | 4.1 | -7.9 | 3.9 |
From the ΔY values (Avg Y+ minus Avg Y-) we can see that the three factors are of roughly equal importance to the travel distance of the marble. The interactive effects, however, are of relatively small importance.
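The high-minus-low calculation can be reproduced with a few lines of code. The sketch below starts from the rounded averages in the Avg. column of Table 2 and rebuilds the level columns of Table 1; the function and variable names are illustrative. Because the inputs are rounded, the results agree with Table 3 only to within a few tenths.

```python
from itertools import product

# Average distance for tests 1-8, copied from the Avg. column of Table 2.
avg_distance = [51, 24, 91, 41, 76, 49, 117, 82]

# Rebuild the Table 1 columns in the same test order.
columns = {name: [] for name in ("A", "B", "C", "AB", "BC", "AC")}
for a, b, c in product((-1, 1), repeat=3):
    for name, level in zip(columns, (a, b, c, a * b, b * c, a * c)):
        columns[name].append(level)

def effect(levels, responses):
    """Average response at +1 minus average response at -1."""
    high = [y for s, y in zip(levels, responses) if s > 0]
    low = [y for s, y in zip(levels, responses) if s < 0]
    return sum(high) / len(high) - sum(low) / len(low)

effects = {name: effect(levels, avg_distance)
           for name, levels in columns.items()}

# List the largest effects first, ignoring sign.
for name, value in sorted(effects.items(), key=lambda kv: -abs(kv[1])):
    print(f"{name:>2}: {value:+.2f}")
```

Sorting by absolute value puts C, B, and A at the top and the three interaction columns at the bottom, which is the screening conclusion stated above.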
Predicting with the Empirical Model
The information in Table 3 can be combined to produce a predictive equation for travel distance.
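One standard way to build such an equation for a two-level design is to code each factor from -1 (low) to +1 (high) and add half of each effect, multiplied by the coded level, to the overall mean. The sketch below assumes that form; it is not necessarily the exact equation used by the authors. The effects are copied from Table 3, the overall mean is taken from the averages in Table 2, and the function name is hypothetical.

```python
# Effects copied from the last row of Table 3 (high average minus low average).
EFFECTS = {"A": 29.4, "B": 32.7, "C": -34.7, "AB": 4.1, "BC": -7.9, "AC": 3.9}

# Overall mean of the eight test averages in Table 2.
GRAND_MEAN = sum([51, 24, 91, 41, 76, 49, 117, 82]) / 8

def predict_distance(hook, arm, stop):
    """Estimate travel distance for coded settings between -1 and +1."""
    coded = {"A": hook, "B": arm, "C": stop,
             "AB": hook * arm, "BC": arm * stop, "AC": hook * stop}
    # Moving a factor from -1 to +1 is two coded units, so each unit
    # contributes half of the tabulated effect.
    return GRAND_MEAN + sum(EFFECTS[k] / 2 * coded[k] for k in EFFECTS)

print(round(predict_distance(+1, +1, +1), 1))  # all factors high (test 8)
print(round(predict_distance(-1, -1, -1), 1))  # all factors low (test 1)
```

With all three factors high, this form predicts about 80 against the measured average of 82 for test 8; the remaining gap comes mostly from the three-factor interaction, which Table 3 does not include. Predictions at intermediate coded values assume the response varies linearly between the two tested levels.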