How to Construct Histograms

An important aspect of total quality is the identification and control of all the sources of variation so that processes produce essentially the same result again and again. A histogram is a tool that allows you to understand at a glance the variation that exists in a process. Although the histogram is essentially a bar chart, it creates a "lumpy distribution curve" that can be used to help identify and eliminate the causes of process variation. Histograms are especially useful in the measure, analyze and control phases of the Lean Six Sigma methodology.

What can it do for you?

A histogram will show you the central value of a characteristic produced by your process, and the shape and size of the dispersion on either side of this central value. The shape and size of the dispersion will help identify otherwise hidden sources of variation. The data used to produce a histogram can ultimately be used to determine the capability of a process to produce output that consistently falls within specification limits.

How do you do it?

1. Decide which Critical-To-Quality characteristic you wish to examine. This CTQ must be measurable on a linear scale. That is, the incremental value between units of measurement must be the same. For example, a micrometer or a thermometer or a stopwatch can produce linear data. Asking your customers to rate your performance from "poor" to "excellent" on a five-point scale probably will not.

2. Measure the characteristic and record the results. If the characteristic is continually being produced, such as voltage in a line or temperature in an oven, or if there are too many items being produced to measure all of them, you will have to sample. Take care to ensure that your sampling is random.

3. Count the number of individual data points. Add the values for each of the data points and divide by the number of points. This is the mean (or average) value.

4. Determine the highest data value and the lowest data value. Subtract the lower number from the higher. This is the range.

5. The next step is determining how many "classes" or bars your histogram should have. To make an initial determination, you can use this table:

Number of data points    Number of classes
under 50                 5 to 7
50 to 100                6 to 10
100 to 250               7 to 12
over 250                 10 to 20

6. Divide the range by the trial number of classes you selected. The resulting number will be your trial class interval (the horizontal graduation or width) for each bar on your chart. You may round or simplify this number to make it easier to work with, but the total number of classes should be within those shown above. In determining the number of classes and the class interval, consider how you are measuring data. Increase or decrease the number of classes or modify the class interval until there is essentially the same number of measurement possibilities in each class.

7. Determine the class boundaries. You can do this by starting at the center of the range. If you have an odd number of classes, center the middle class approximately at the mid-point of the range, then alternately add or subtract the class interval to define the other class boundaries. If you have an even number of classes, begin the process of adding or subtracting the class interval at approximately the center of the range.

8. Tally the number of data points that fall in each of the classes. Add the frequency totals for each class. This number should equal the total number of data points. Divide the number of data points in each class by the total number of data points. This will give you the percentage of points falling in each class. Add the percentages of all the classes. The result should be approximately 100.

9. Graph the results by beginning with the lowest measurement-value class. Make the bar height correspond to the percentage of data points that fall in that class. Draw the bar for the second class to the right and touching the first bar. Again, make the height correspond to the percentage of data points in that class. Continue in this way until you have drawn in all the classes.

10. Draw a vertical dotted line through your histogram to represent the mean value of all your data points.

11. If there are specification limits for the characteristic you are studying, indicate them as vertical lines as well.

12. Title and label your histogram.

Now what?

The shape that your histogram takes tells a lot about your process. Often, it will tell you to dig deeper for otherwise unseen causes of variation.

The symmetrical or bell-shaped type of histogram: The mean value is in the middle of the range of data. The frequency is high in the middle of the range and falls off fairly evenly to the right and left. This shape occurs most often.

The "comb" or multi-modal type of histogram: Adjacent classes alternate higher and lower in frequency. This usually indicates a data collection problem. The problem may lie in how a characteristic was measured or how values were rounded. It could also indicate an error in the calculation of class boundaries.

If the distribution of frequencies is shifted noticeably to either side of the center of the range, the distribution is said to be skewed. When the histogram is positively skewed, the mean value is to the left of the center of the range, and the frequency decreases abruptly to the left but gently to the right. This shape normally occurs when the lower limit, the one on the left, is controlled either by specification or because values lower than a certain value do not occur for some other reason.

If the skewness of the distribution is even more extreme, a clearly asymmetrical, precipice-type histogram is the result. This shape frequently occurs when a 100% screening is being done for one specification limit.

If the classes in the center of the distribution have more or less the same frequency, the resulting histogram looks like a plateau. This shape occurs when there is a mixture of two distributions with different mean values blended together. Look for ways to stratify the data to separate the two distributions. You can then produce two separate histograms to more accurately depict what is going on in the process.

If two distributions with widely different means are combined in one data set, the plateau splits to become twin peaks. The two separate distributions become much more evident than with the plateau. Examining the data to identify the two different distributions will help you understand how variation is entering the process.

If there is a small, essentially disconnected peak along with a normal, symmetrical peak, this is called an isolated-peak histogram. It occurs when there is a small amount of data from a different distribution included in the data set. This could also represent a short-term process abnormality, a measurement error or a data collection problem.

If specification limits are involved in your process, the histogram is an especially valuable indicator for corrective action.

The histogram shows that the process is centered between the limits with a good margin on either side. Maintaining the process is all that is needed.

When the process is centered but with no margin, it is a good idea to work at reducing the variation in the process since even a slight shift in the process center will produce defective material.

A process that would have produced material within specification limits if it were centered is shifted to the left. Action must be taken to bring the mean closer to the center of the specification limits.

A histogram that shows a process that has too much variation to meet specifications no matter how it is centered. Action must be taken to reduce variation in this process.

A process that is both shifted, in this case to the right, and has too much variation. Action is necessary to both center the process and reduce variation.

This is a picture of the statistical variation in your process. Not only can histograms help you know which processes need improvement, they can also help you track that improvement.
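
The arithmetic in construction steps 3 through 8 under "How do you do it?" can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the original procedure: the function names `suggested_class_range` and `build_histogram` are hypothetical, the class interval is left unrounded, the class boundaries start at the lowest data value instead of being centered on the mid-point of the range as step 7 describes, and a data set of exactly 100 points is assigned to the "50 to 100" row of the table.

```python
def suggested_class_range(n_points):
    """Return the (min, max) trial number of classes from the step 5 table.

    Assumption: the table rows overlap at 100, so exactly 100 points
    is treated here as belonging to the "50 to 100" row.
    """
    if n_points < 50:
        return (5, 7)
    if n_points <= 100:
        return (6, 10)
    if n_points <= 250:
        return (7, 12)
    return (10, 20)


def build_histogram(data, n_classes=None):
    """Compute mean, range, class boundaries, tallies and percentages."""
    n = len(data)
    if n == 0:
        raise ValueError("need at least one data point")

    mean = sum(data) / n                    # step 3: mean value
    low, high = min(data), max(data)        # step 4: lowest and highest value
    data_range = high - low                 # step 4: range
    if data_range == 0:
        raise ValueError("all values are identical; the range is zero")

    if n_classes is None:                   # step 5: trial number of classes
        n_classes = suggested_class_range(n)[0]

    interval = data_range / n_classes       # step 6: trial class interval

    # Step 7, simplified: boundaries run upward from the lowest value.
    boundaries = [low + i * interval for i in range(n_classes + 1)]

    # Step 8: tally each point; the highest value goes in the last class.
    counts = [0] * n_classes
    for x in data:
        index = min(int((x - low) / interval), n_classes - 1)
        counts[index] += 1

    percentages = [100 * c / n for c in counts]
    return {
        "mean": mean,
        "range": data_range,
        "interval": interval,
        "boundaries": boundaries,
        "counts": counts,
        "percentages": percentages,
    }


if __name__ == "__main__":
    result = build_histogram([1, 2, 2, 3, 3, 3, 4, 4, 5, 6])
    # Step 9, as a rough text chart: one '#' per 5 percent.
    for lower, pct in zip(result["boundaries"], result["percentages"]):
        print(f"{lower:6.2f} | {'#' * round(pct / 5)} {pct:.0f}%")
    print("mean:", result["mean"])
```

As step 8 suggests, two quick checks are that the tallies add up to the number of data points and the percentages add up to approximately 100. In practice you would likely round the interval to a convenient value and re-center the boundaries by hand, as steps 6 and 7 describe.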