Design of Experiments

An Introduction

 

Design of Experiments (DOE) is a powerful statistical tool for solving complex problems.  After reading that, you are probably thinking "Powerful, statistical, and complex: this sounds hard" and "I already know how to use lots of tools, so why should I learn one more?"  Our job is to convince you that this tool can save you time and effort and that you can use it to solve otherwise unsolvable problems.

 

During your first 14 years of schooling you learned about the Scientific Method, especially the notion that to learn anything about a system through experimentation, you have to vary only one factor and hold everything else constant.  With DOE we can get more information from the Scientific Method, or we can "cheat the system" by not following all the required steps and still get lots of useful information.  To show you how DOE can help you, we will first discuss problems engineers face and the inadequacies of the traditional approaches.

 

Real World Problems

Suppose you want to do something common like blow-mold a 2-liter bottle.  Since this is a commodity item, profit margins are tight and very small differences in manufacturing efficiency can mean the difference between wealth and financial ruin.  As the engineer in charge you are faced with several problems:

     A number of factors which affect the outcome (temperatures, pressures, flow rate...)

     Inadequate theoretical models (Note that programs like Moldflow are available)

     The huge cost of downtime

 

Let's say that you have only 8 factors and you wish to set up the experimental plan to determine the effect of each.  If you only look at two levels for each factor (high and low) in your work and you use the scientific method, you will get to do 2^8, or 256, experiments.  If each test takes 10 minutes and you run three repetitions of each, the test program will require only 128 hours of work.  Then, when you analyze the data, you will probably miss any interactive effects between the factors because you were only looking for individual effects.
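The arithmetic above can be checked with a short sketch.  The variable names are ours, chosen for illustration; the numbers (8 factors, two levels, 10 minutes per test, three repetitions) come from the paragraph above.

```python
from itertools import product

n_factors = 8
levels = (-1, +1)          # low and high setting of each factor
minutes_per_test = 10      # from the text
repetitions = 3            # from the text

# Every combination of high/low settings across all eight factors.
runs = list(product(levels, repeat=n_factors))

total_tests = len(runs) * repetitions
total_hours = total_tests * minutes_per_test / 60

print(len(runs), total_tests, total_hours)
```

Adding a ninth factor doubles the number of combinations, which is why testing every combination becomes impractical so quickly.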

 

Note that we just snuck in several terms that are very important in DOE and bear a quick look: factors, levels, repetitions, and interactive effects.

 

Since you are worried about excessive downtime of the machine, you decide to find an analytical solution.  After struggling with the problems of transient heat transfer, the flow of a viscoplastic solid undergoing dynamic deformation while changing temperature, and the complex flow of a compressible fluid you decide that your theoretical model is unlikely to reflect reality.

 

 

How Can DOE Help?

 

With design of experiments you can:

     run a relatively small number of tests to isolate the most important factors (screening test).

     determine if any of the factors interact (combined effects are as important as individual effects) and the level of interaction.

     predict response for any combination of factors using only empirical results

     optimize using only empirical results

 

An example of the screening test is the Helicopter Drop Experiment run in ME 311.  In that test 8 factors were selected as important to the flight time of a paper helicopter.  Instead of the 256 tests required by the scientific method, a carefully selected set of 16 tests (fractional factorial) was done to isolate the most significant factors.  This would be done so that designers could more productively focus their efforts.  Note that this is not a good technique for squeezing the last little bit of performance out of a system.  It is a good way of determining how to allocate resources at the beginning of a design effort, or where to look for likely bottlenecks in a process.

 

Response prediction and factor interaction can be determined using DOE.  Consider the table below.  Suppose that factor A is the silicon content of a heat-treatable aluminum.  Suppose that factor B is the heat treating temperature.  In this case the response values may represent the hardness.

A plot of the data shows how response is a function of factors A and B.  Since the lines are parallel, we can assume that little interaction between the variables occurs.

Consider another set of data as shown below.  In this instance, the response may be wear rate, factor A may be RPM, and factor B may be the type of lubricant.

Again, a plot of the data shows how response depends on the factors.  Now we see that B2 is a better lubricant at low RPM but not at high RPM.  The crossing of the lines indicates an interactive effect between the factors.  In this case, test RPM and lubricant type are inter-related.
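The parallel-versus-crossing idea can also be expressed as a single number.  The sketch below is only an illustration: the function name is ours, and the response values are hypothetical, invented to show the two cases (they are not the data from the tables in this handout).

```python
def interaction_contrast(y_a1b1, y_a2b1, y_a1b2, y_a2b2):
    """Change in the effect of A when B moves from level 1 to level 2."""
    effect_of_a_at_b1 = y_a2b1 - y_a1b1
    effect_of_a_at_b2 = y_a2b2 - y_a1b2
    return effect_of_a_at_b2 - effect_of_a_at_b1

# Hypothetical responses, invented only to illustrate the two cases.
parallel = interaction_contrast(10.0, 20.0, 15.0, 25.0)   # B1 and B2 lines are parallel
crossing = interaction_contrast(10.0, 20.0, 18.0, 12.0)   # B1 and B2 lines cross

print(parallel, crossing)
```

A contrast of zero corresponds to parallel lines (no interaction); the further the value is from zero, the more the effect of one factor depends on the setting of the other.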

We can construct an empirical model from these results that includes predictive equations.  The predictive empirical model from DOE doesn't provide as much engineering intuition regarding the process as a theoretical model, but it often is a better predictor of outcomes.  Since the ultimate goal of the work was to predict outcomes, we are not overly disappointed.

 

Optimization was not specifically covered in any of the examples, but it can be an important reason for using DOE.  Figure 2 shows the response surfaces that can be generated using DOE software.  Maxima and minima can be found from these response surfaces using standard numerical techniques.  Optimizing with DOE can be faster than with other techniques and, because factor interactions are taken into account, can be more accurate.

Figure 2  Response surface.


Overview - Steps in the Process

 

The approach to an experimental problem using DOE is not significantly different than the traditional approach.  You still need to define the problem and determine the properties to be measured.  The general approach is outlined below.

 

1. Statement of the Problem

As with any engineering problem, defining what you wish to know is the key starting point.  In the case of the helicopter, we wished to design a paper helicopter that had the longest time to drop.  We also had certain constraints in terms of available materials, etc.  For the catapult, we wanted to predict the distance traveled by the marble.  The most important point is that this problem statement is not significantly different than those you have done before for design or experimentation.

 

2. Choice of Factors and Levels

Factors are the potential design parameters, such as wing length or paper type.

Levels are the range of values for the factors.  In our work we used two level experiments.  Each factor was either High or Low.  Alternatively, we could have chosen High, Medium, and Low, or we could have used some larger number of intervals.

 

3. Selection of Response Variable

For the helicopter design, the most important response was the time to drop.  In other cases, the appropriate response variable may be more difficult to choose.  For example, improved gas mileage may be important, but measuring brake specific fuel economy may be a better measure.

 

4. Choice of Design

A variety of choices is possible here.  For the helicopter, we selected a screening test.  Specifics on how to set up the different test designs can be found in design of experiments textbooks.  One important aspect of testing is the testing order.  Tests should be run in a random order (including replicate tests).  This is to minimize the effect of unknown or uncontrollable factors.  For example, muzzle velocity tests of different ammunition would be done in random order so that barrel heating does not skew the results.
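A randomized run order, replicates included, can be produced with a few lines of code.  This is a minimal sketch: the counts (16 tests, three replicates) mirror the helicopter experiment, and the seed value is arbitrary, used only so that the same order can be regenerated.

```python
import random

n_tests = 16        # e.g. a 16-run screening design
n_replicates = 3    # each test repeated three times

# One entry per physical run: (test number, replicate number).
run_list = [(test, rep)
            for test in range(1, n_tests + 1)
            for rep in range(1, n_replicates + 1)]

rng = random.Random(311)   # arbitrary fixed seed so the order is reproducible
run_order = run_list[:]
rng.shuffle(run_order)

# Show the first few runs of the randomized schedule.
for position, (test, rep) in enumerate(run_order[:5], start=1):
    print(position, test, rep)
```

Shuffling the full list, rather than shuffling the tests and then running the replicates back to back, spreads each test's replicates across the whole session.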

 

5. Perform Experiment

 

6. Data Analysis

Effects of each factor can be found by the difference between the high responses and the low responses for the factor.  To get these, the responses from each experiment at high setting for one factor are summed, and the responses from each experiment at low setting for the same factor are summed.  The difference between these two sums is a measure of the importance of the factor.  Interactive effects can be compared in the same way.
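In symbols, the calculation can be written as follows.  The notation is ours, introduced for illustration: y_i is the response of test i, x_ij is the coded level (+1 for high, -1 for low) of factor j in test i, and n is the number of tests, half of which are at each level in a balanced two-level design.

```latex
\[
E_j \;=\; \frac{2}{n}\left(\sum_{i:\,x_{ij}=+1} y_i \;-\; \sum_{i:\,x_{ij}=-1} y_i\right)
     \;=\; \bar{y}_{j+} - \bar{y}_{j-}
\]
```

Dividing the difference of the two sums by n/2 turns it into the difference between the average high response and the average low response.  Because every factor has the same number of high and low tests, ranking factors by the difference of sums or by the difference of averages gives the same order.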

 

 


Paper Helicopter Experiment

 

Goal:

            To determine the important factors controlling the time aloft of a paper helicopter.

Task:

            Eight variables or factors were chosen that we believe could be important in the flight time of the helicopter.  Since no theoretical relationship is available relating flight time to the chosen factors, we will try to screen the factors to determine which are the most important, and thus worthy of more study.

            To keep things simple we will only look at high and low settings for each variable.  For further simplicity, we select a screening method with only 16 required tests.  If we wished to look for interactions between variables, we would need a larger set of tests.  To fully test 8 variables at two settings (high and low) would require 2^8, or 256, tests.  This would be prohibitively expensive to most companies in terms of time and money, so the screening test is used to narrow the scope of the investigation before looking for higher order effects.

Experimental procedure:

            Table 1 lists the eight factors that may influence flight time.  Figure 1 shows the pattern of the helicopter.  Each factor is assigned a high and low value that will be designated + and - respectively in the design matrix (Table 2).

            A total of 16 helicopters must be constructed.  Each will have a different combination of factors as designated in the design matrix of Table 2.  (Note that Table 2 represents a 1/16 fractional factorial screening matrix.  The method of setting up such a matrix is described in reference 1.)
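            For readers curious how a 16-run matrix for eight factors can be built, here is a minimal sketch of one textbook construction: a full factorial in four base factors, with the other four columns formed as products of base columns.  The generators used (E = BCD, F = ACD, G = ABC, H = ABD) and the letter names are assumptions chosen for illustration; they are not necessarily the ones behind Table 2.

```python
from itertools import product

# Full two-level factorial in four base factors: 2**4 = 16 runs.
base_runs = list(product((-1, +1), repeat=4))

design = []
for a, b, c, d in base_runs:
    # The remaining four columns are products of base columns.
    # These generators are an assumed, commonly used choice.
    e = b * c * d
    f = a * c * d
    g = a * b * c
    h = a * b * d
    design.append((a, b, c, d, e, f, g, h))

# Print the matrix with + and - signs, one helicopter per row.
for run_number, row in enumerate(design, start=1):
    print(f"{run_number:>2}", " ".join("+" if x > 0 else "-" for x in row))
```

            Every column has eight + and eight - entries, and any two columns are orthogonal, which is what allows each main effect to be estimated from the same 16 tests.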

            After constructing all helicopters, each will be dropped in a randomized order.  Three runs of each will be made and the average of the three drop times used for the response value for that helicopter.

Data Analysis:

            To evaluate the effect of a variable such as the paper type, we look at the difference in average response values between helicopters with plain paper and helicopters made of construction paper.  This is done by first summing the response values for helicopters made from plain paper and finding an average.  Then sum the response values for helicopters made with construction paper and find an average.  The difference in these averages represents the "main effect" of paper type on flight time.  If the difference is relatively high with respect to the "effects" of other variables, then paper type is important.

            So, for each variable, sum all response values at the + setting and find an average.  From this "average of the highs" subtract the "average of the lows" (the average of all response values at the minus setting) to get the "effect".  List the variables with their "effects" on flight time to see which are most important.  A plot may be helpful.

Conclusion:

            This was a very basic glimpse at a very powerful collection of tools.  We only looked at screening of variables and not at variable interactions and optimization.  To learn more about this topic you can take STAT 303 Design of Engineering Experiments.

            Statistical design of experiments is being used by a number of companies to optimize their designs or processes.  Software for this purpose is readily available.  Some of the software companies offer training courses in the method and/or assistance with your particular problem (for a fee). 

 

Reference: Statistical Design and Analysis of Experiments, Mason, Gunst, and Hess.

 


Catapult Experiment

 

Schubert, et al. published an article on a catapult experiment like this in Quality Engineering in 1992.  Much of what follows is drawn from their work.

Figure 3  The catapult.

 

Analytical Approach

If we consider the catapult shown in Figure 3, we might reasonably conclude that with a little dynamics we could predict exactly how far a marble would be thrown for any combination of factors.  Schubert et al. report the following equation for such an analysis.

                                      (eqn. 1)

 

where I0 is the moment of inertia of the moment arm/ball combination relative to the pivot point 0, θ0 and θ1 are the initial and launching angles, M and m are the masses of the moment arm and ball, respectively, and rG is the distance from 0 to the mass center G of the moment arm.  The other dimensions are shown in Figure 4.  F(θ) is the force exerted by the rubber band on the moment arm.

Figure 4 Schematic of the catapult with dimensions.


Now that you have the equation, you can measure all the parts and try to model the nonlinear response of the rubber band.  According to the paper, the analysis took about 200 man hours and experimental results showed that predicting the ball landing site was accurate to 15 inches.  Accuracy is affected by the measurement precision and idealizations in the analysis.  The history dependent nonlinear rubber band is also troublesome.

 

DOE Approach

In their case, a design of experiments approach required 6 man hours and accuracy was within 3 inches.  The design of experiments approach effectively defines a response envelope for the system.  Once the envelope is defined, any combination of factor settings can be estimated.

 

Factors - For the example in this class we use only three factors

     Stop angle

     Hook attachment point

     Arm length

 

Levels - These factors are each tested at only two levels, high and low.  A complete set of tests is run as shown in Table 1.  In Table 1, a plus sign indicates a high level and a minus sign indicates a low level.  Thus, in test #1 the hook, the arm length, and the stop angle are all at the low level.  In test #8, all three factors are at the high level.

 

Interactive effects - You will notice three other columns in Table 1: AB, BC, and AC.  AB is the product of the level values for A and B (hook and arm length).  The other columns are also products.  These products will be indicators of interactive effects between the factors.

 

Table 1  Test matrix for catapult test

Test #    Hook (A)    Arm length (B)    Stop angle (C)    AB    BC    AC
1         -1          -1                -1                +1    +1    +1
2         -1          -1                +1                +1    -1    -1
3         -1          +1                -1                -1    -1    +1
4         -1          +1                +1                -1    +1    -1
5         +1          -1                -1                -1    +1    -1
6         +1          -1                +1                -1    -1    +1
7         +1          +1                -1                +1    -1    -1
8         +1          +1                +1                +1    +1    +1
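The test matrix of Table 1, including the interaction columns, can be generated rather than typed by hand.  This is a minimal sketch; the variable names are ours, and the column order assumes A = hook, B = arm length, C = stop angle as in Table 1.

```python
from itertools import product

# Coded levels: -1 = low, +1 = high.
# A = hook, B = arm length, C = stop angle.
rows = []
for test_number, (a, b, c) in enumerate(product((-1, +1), repeat=3), start=1):
    # Interaction columns are the products of the factor columns.
    rows.append((test_number, a, b, c, a * b, b * c, a * c))

print("Test   A   B   C  AB  BC  AC")
for row in rows:
    print(f"{row[0]:>4} " + " ".join(f"{v:+3d}" for v in row[1:]))
```

Because `product` varies the last factor fastest, the rows come out in the same order as Table 1, with the stop angle alternating on every test.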

 

 

For each test number, four repetitions were made.  In other words, the marble was thrown four times at each setting and the travel distance was recorded.  From these an average value for each test is determined.

 


Table 2  Measured values

Test #    Trial 1    Trial 2    Trial 3    Trial 4    Avg. (Yr)
1         50.5       51         51.5       51.5       51
2         24         23.5       24         24.5       24
3         90         94         90.25      87.5       91
4         39.5       42         40         40.5       41
5         76.5       76         76.5       75.5       76
6         48.5       48.5       50         50.5       49
7         117.5      116        117        119.5      117
8         84         81.5       82         80         82

                                                                                                                    

 

Data Reduction - The interesting thing is how the data is analyzed.  If we average all the distances from tests in which the hook is at the high level, then compare that with the average from all the tests in which the hook is at the low level, we would expect to see some difference.  Indeed, this difference between high and low settings is the key to this technique.  Table 3 shows the results for those calculations.

 

Important variables (screening) - If we see a small difference we can say that the factor is not so important.  Thus, the difference defines the relative importance of each factor.  This also works for the combined factors so we can see the relative importance of the interactive effects.

 

Table 3  Relative effects of each factor and interactive effects

          Hook (A)    Arm Length (B)    Stop Angle (C)    AB      BC      AC
Avg Y-    51.8        50.1              83.8              64.4    70.4    64.5
Avg Y+    81.1        82.8              49.1              68.5    62.5    68.4
Y         29.4        32.7              -34.7             4.1     -7.9    3.9

 

From the Y values (Avg Y+ minus Avg Y-) we can see that the three factors are of roughly equal importance to the travel distance of the marble: each changes the distance by about 30 to 35.  The interactive effects are small by comparison, all less than 8 in magnitude.
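The calculation behind Table 3 can be reproduced from the Table 1 signs and the Table 2 averages.  This is a sketch under one assumption: it uses the rounded averages as printed in Table 2, so the results agree with Table 3 only to within about 0.2, presumably because the original calculation carried unrounded averages.  The function and variable names are ours.

```python
from itertools import product

# Average distance for tests 1-8, copied from the Avg. column of Table 2.
avg_distance = [51, 24, 91, 41, 76, 49, 117, 82]

# Rebuild the Table 1 sign columns (A = hook, B = arm length, C = stop angle).
columns = {"A": [], "B": [], "C": [], "AB": [], "BC": [], "AC": []}
for a, b, c in product((-1, +1), repeat=3):
    for name, sign in zip(columns, (a, b, c, a * b, b * c, a * c)):
        columns[name].append(sign)

def effect(signs, responses):
    """Average response at +1 minus average response at -1."""
    high = [y for s, y in zip(signs, responses) if s > 0]
    low = [y for s, y in zip(signs, responses) if s < 0]
    return sum(high) / len(high) - sum(low) / len(low)

effects = {name: effect(signs, avg_distance) for name, signs in columns.items()}
for name, value in effects.items():
    print(f"{name:>2}: {value:+.2f}")
```

The same `effect` function works unchanged for the interaction columns, which is the point of writing them as products of the factor columns.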

 

Predicting with the Empirical Model

The information in Table 3 can be combined to produce a predictive equation for travel distance.
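As a rough illustration, one common way to build such an equation for a two-level design is to start from the overall mean and add half of each effect multiplied by its coded setting (-1 for low, +1 for high).  The sketch below assumes that form, which may differ in detail from the equation used in this class; the function name and constants are ours, with the numbers taken from Table 3.

```python
# Effects (last row of Table 3) for each coded column.
EFFECTS = {"A": 29.4, "B": 32.7, "C": -34.7, "AB": 4.1, "BC": -7.9, "AC": 3.9}

# In a balanced design the overall mean lies midway between Avg Y- and Avg Y+
# of any column; the hook column of Table 3 is used here.
GRAND_MEAN = (51.8 + 81.1) / 2

def predict_distance(a, b, c):
    """Estimate travel distance for coded settings between -1 and +1.

    a = hook, b = arm length, c = stop angle.
    """
    terms = {"A": a, "B": b, "C": c, "AB": a * b, "BC": b * c, "AC": a * c}
    # Each effect spans a change of 2 in the coded variable, hence the /2.
    return GRAND_MEAN + sum(EFFECTS[k] / 2 * terms[k] for k in EFFECTS)

print(round(predict_distance(+1, +1, +1), 1))   # settings of test 8
print(round(predict_distance(0, 0, 0), 1))      # all factors at mid-range
```

Because this form leaves out the three-way ABC term, it does not reproduce the Table 2 averages exactly, but it stays within about 3 of every measured average.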