Provided by James R. Martin, Ph.D., CMA
Professor Emeritus, University of South Florida
The components of the red bead experiment include a box of 4,000 wooden beads (800 red and 3,200 white), a paddle with fifty bead-sized depressions, a second smaller box for mixing the beads, six willing workers, two inspectors who make independent counts, a chief inspector who verifies the counts, an accountant who records the counts, and a customer who will not accept red beads. The job is to produce white beads, and the standard for each worker is fifty white beads per day.
The daily production operation for each worker includes: 1) pouring the beads from the first box into the second box and then back into the first box (to mix the beads), 2) dipping the paddle into the first box without shaking it, 3) carrying the loaded paddle to each inspector for separate counts and then verification, and 4) dumping the day's work back into the supply box. The six workers perform this operation four times to represent four days' work. The results of one of Deming's experiments appear in Table 1.
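The procedure above can be sketched as a small simulation. This is only an idealized illustration, not Deming's experiment: it assumes each paddle dip is a pure random sample of fifty beads, which ignores the physical differences between beads and paddle depressions. The names `draw_paddle` and `run_experiment` and the seed value are hypothetical choices made for this sketch.

```python
import random

TOTAL_BEADS = 4000   # beads in the supply box
RED_BEADS = 800      # 20 percent of the supply
PADDLE_SIZE = 50     # depressions in the paddle
WORKERS = 6
DAYS = 4


def draw_paddle(rng):
    """Count red beads in one idealized paddle dip (assumption: pure random sampling)."""
    # Treat bead indices 0..799 as red and sample 50 beads without replacement.
    picks = rng.sample(range(TOTAL_BEADS), PADDLE_SIZE)
    return sum(1 for bead in picks if bead < RED_BEADS)


def run_experiment(seed=None):
    """Return a WORKERS x DAYS table of red bead counts."""
    rng = random.Random(seed)
    # The day's work is dumped back into the supply, so every dip sees all 4,000 beads.
    return [[draw_paddle(rng) for _ in range(DAYS)] for _ in range(WORKERS)]


if __name__ == "__main__":
    results = run_experiment(seed=1985)  # seed is arbitrary, chosen only for repeatability
    for worker, counts in enumerate(results, start=1):
        total = sum(counts)
        print(f"Worker {worker}: days={counts} total={total} mean={total / DAYS:.2f}")
```

Even in this simplified version, where every simulated worker follows exactly the same code path, the daily counts and four-day means will generally differ from worker to worker.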
Red Beads Recorded in One of Deming's Red Bead Experiments*
|Worker||Day 1||Day 2||Day 3||Day 4||Total||Mean|
|*Conducted in a seminar in West Springfield, Mass., February 6, 1985 (Walton 1986, Chapter 4).|
Table 1 reveals that each of the six workers performed differently, with daily defects (red beads) ranging from 5 to 17 and four-day averages ranging from 7.75 to 10.75. The first point is clear: with identical tools, tasks and abilities, performance will vary.
Now, perhaps the reader is thinking that over the long run the mean defects for each worker will be ten. Logically, since 20 percent of the beads in the box are red, the long-run average will be 20 percent of fifty, or ten. Right? Wrong! First, the beads are different. No two beads are, or could be, exactly alike, and the red beads tend to be slightly larger and heavier than the white beads (or it could be the other way around depending on how the beads were produced). The paddle responds to the red and white beads differently, and the red and white beads respond to the paddle differently. In addition, the depressions in the paddle are different. No two depressions are exactly alike. The different depressions and beads interact differently. Even though all of these differences are undetectable without precision equipment, they affect the results.
In addition, although each of the six workers in the experiment used the same paddle, the variation in the results would be different if each worker had used a different paddle. Deming used four paddles over a forty-five to fifty-year period, and the long-run mean defects for the four paddles were 11.3, 9.6, 9.2 and 9.4 (Deming 1993, 168). All of the variation in these experiments was attributed to the system, i.e., incoming beads and paddles.
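The arithmetic behind this comparison can be laid out in a few lines. The sketch below only restates figures from the text (the 20 percent red share, the fifty-bead paddle, and the four long-run paddle means reported in Deming 1993, 168); the variable and function names are hypothetical.

```python
PADDLE_SIZE = 50
RED_SHARE_IN_SUPPLY = 800 / 4000  # 0.20, the proportion of red beads in the box

# The "logical" expectation that the text rejects: 20 percent of fifty.
naive_mean = PADDLE_SIZE * RED_SHARE_IN_SUPPLY

# Long-run mean red beads per paddle load, as reported in the text.
paddle_means = [11.3, 9.6, 9.2, 9.4]


def implied_red_share(mean_red):
    """Proportion of red beads in the output implied by a long-run mean count."""
    return mean_red / PADDLE_SIZE


for number, mean_red in enumerate(paddle_means, start=1):
    share = implied_red_share(mean_red)
    gap = mean_red - naive_mean
    print(f"Paddle {number}: mean {mean_red:.1f}, implied red share {share:.1%}, gap {gap:+.1f}")
```

The implied red shares in the output are 22.6, 19.2, 18.4 and 18.8 percent. None of the four paddles matches the 20 percent share in the supply.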
What is the point of the red bead experiment?
1. The experiment provides a typical illustration of bad management. There are too many employees involved (e.g., inspectors), and the rigid procedures do not allow workers to offer suggestions for improvement. In addition, during the experiment, Deming (who serves as the manager) continually blames the workers for defective products that are caused by the system.
2. System variation (frequently referred to as random variation) is inevitably present in any process, operation or activity.
3. Knowledge of one source of system variation, such as the proportion of defects (red beads) in the incoming supply, cannot be used to determine the total effect of system variation, such as the proportion of defects in the output. This is because unobservable factors will always affect performance and there is no basis for assuming that the effects of these factors will be equally distributed across workers.
4. All workers perform within a system that is beyond their control.
5. There will always be some workers who are above the average and some workers who are below the average.
6. A worker's position in the ranking may vary from one period to the next.
7. Workers should not be ranked because doing so merely represents a ranking of the effect of the system on the workers. In the red bead experiment, 100 percent of the variation in the workers' performances is determined by the system. Even in this controlled experiment where the workers use the same inputs and tools, they are all victims of the system and cannot be compared in any meaningful way.1
8. Only management can change the system.
9. Empirical evidence (i.e., observations of facts, as opposed to secondhand information, or information further removed from fact such as opinion) is never complete. There are always a large number of variables that affect any set of performance results, many of which are unknown and unknowable.
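Points 5 through 7 can be illustrated with a short sketch that ranks six simulated workers over two four-day periods. It assumes pure random sampling from the bead supply, so by construction nothing separates the workers except chance. As the text stresses, a real system has many more sources of variation than random numbers do. The function names and the seed are hypothetical.

```python
import random


def red_count(rng, total=4000, red=800, paddle=50):
    """Red beads in one idealized paddle dip (assumption: pure random sampling)."""
    picks = rng.sample(range(total), paddle)
    return sum(1 for bead in picks if bead < red)


def period_totals(rng, workers=6, days=4):
    """Total red beads per worker over one period of `days` dips."""
    return [sum(red_count(rng) for _ in range(days)) for _ in range(workers)]


def rank(totals):
    """Rank 1 = fewest red beads; ties are broken by worker order."""
    order = sorted(range(len(totals)), key=lambda w: (totals[w], w))
    ranks = [0] * len(totals)
    for position, worker in enumerate(order, start=1):
        ranks[worker] = position
    return ranks


rng = random.Random(7)  # arbitrary seed, used only for repeatability
first = period_totals(rng)
second = period_totals(rng)

for worker, (r1, r2) in enumerate(zip(rank(first), rank(second)), start=1):
    print(f"Worker {worker}: rank {r1} in period 1, rank {r2} in period 2")
```

Because every simulated worker is identical by construction, any ranking the sketch prints reflects only the sampling process, not the workers.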
The red bead experiment is deceptively simple because it provides a powerful message that is difficult for many to grasp. In summary, the misconception that workers can be meaningfully ranked is based on two faulty assumptions. The first assumption is that each worker can control his or her performance. Deming (1986, 315) estimated that 94 percent of the variation in any system is attributable to the system, not to the people working in the system. The second assumption is that any system variation will be equally distributed across workers. Deming (1986, 353) taught that there is no basis for this assumption in real-life experiences.
The source of the confusion comes from statistical (probability) theory, where random numbers are used to obtain samples from a known population. When random numbers are used in an experiment, there is only one source of variation, so the randomness tends to be equally distributed. This is because samples based on random numbers are not influenced by such things as the characteristics of the inputs and tools (e.g., size of the beads and depressions in the paddles) and other real-world phenomena. However, in real-life experiences, there are many identifiable causes of variation, as well as a great many others that are unknown. The interaction of these forces will produce unbelievably large differences between people (Deming 1986, 110), and there is no logical basis for assuming that these differences will be equally distributed.2
1 Some readers may think that Deming's experiment is inappropriate because he deliberately chose different size beads, and paddles with different size depressions, but this was not the case. Deming points out that the only way to remove system variation from an experiment is to use random numbers (Deming 1986, 334).
2 Deming was critical of probability theory and the central limit theorem in his seminars. For example, in response to a comment from the audience that the mean number of defects in the red bead experiment would be ten based on these concepts, Deming said "I think it is necessary to think and not to assume what you don't know" (Walton 1986, 49).
References and related summaries:
Deming, W. E. 1986. Out of the Crisis. Cambridge, MA: Massachusetts Institute of Technology Center for Advanced Engineering Study.
Deming, W. E. 1993. The New Economics for Industry, Government & Education. Cambridge, MA: Massachusetts Institute of Technology Center for Advanced Engineering Study. Chapter 7. (Summary).
Huber, M. M. 2016. Work less, play more... Get results: Achieve gamification success with an appropriate, effective design and the right performance measures. Strategic Finance (April): 40-46. (Summary).
Martin, J. R. Not dated. Deming's Theory of Profound Knowledge. Management And Accounting Web.
Martin, J. R. Not dated. Illustration of common cause vs. special cause variation. Management And Accounting Web.
Walton, M. 1986. Deming Management Method. New York, NY: The Putnam Publishing Company.
Red Bead Experiment Videos by Steve Prevette (YouTube link).