| Box-and-whisker diagrams, or box plots, use the | | | | of the box. Find the interquartile range (IQR) by |
| concept of breaking a data set into fourths, or quartiles, | | | | subtracting the value of the first quartile boundary from |
| to create a display. The box part of the diagram is | | | | that of the third quartile boundary. |
| based on the middle (the second and third quartiles) of | | | | The whiskers follow three rules: |
| the data set. The whiskers are lines that extend from | | | | 1. The lower whisker ends at the smallest data point that is |
| either side of the box. The maximum length of the | | | | greater than or equal to Q1 - 1.5 IQR. |
| whiskers is calculated based on the length of the box. | | | | 2. The upper whisker ends at the largest data point that is |
| The actual length of each whisker is determined after | | | | less than or equal to Q3 + 1.5 IQR. |
| considering the data points in the first and the fourth | | | | 3. Any points outside the interval [Q1 - 1.5 IQR, Q3 + 1.5 IQR] |
| quartiles. | | | | are plotted separately. |
| Although box-and-whisker diagrams present less | | | | 11. Multiply the IQR by 1.5. (The use of 1.5 as a multiplier |
| information than histograms or dot plots, they do say a | | | | is a convention that has no exact statistical basis. |
| lot about distribution, location and spread of the | | | | Multiplying by this constant helps take into consideration |
| represented data. They are particularly valuable | | | | the fact that the first and fourth quartiles will naturally |
| because several box plots can be placed next to | | | | have a somewhat wider dispersion than the second |
| each other in a single diagram for easy comparison of | | | | and third quartiles.) |
| multiple data sets. | | | | 12. Subtract the value of 1.5(IQR) from the value of the |
| What can it do for you? | | | | first quartile boundary. Find the smallest data point in |
| If your improvement project involves a relatively limited | | | | your list that is equal to or larger than this value. Make |
| amount of individual quantitative data, a | | | | a tick mark representing this data point to the left of |
| box-and-whisker diagram can give you an instant | | | | your box (or below, if you used a vertical scale). Draw |
| picture of the shape of variation in your process. Often | | | | a line, the first whisker, from the side of the box to the |
| this can provide an immediate insight into the search | | | | tick mark. |
| strategies you could use to find the cause of that | | | | 13. Add the value of 1.5(IQR) to the value of the third |
| variation. | | | | quartile boundary. Find the largest data point in your list |
| Box-and-whisker diagrams are especially valuable for | | | | that is equal to or smaller than this value. Make a tick |
| comparing the output of two processes creating the | | | | mark representing this data point to the right of your |
| same characteristic or to track improvement in a single | | | | box (or above, if you used a vertical scale). Draw |
| process. They can be used throughout the phases of | | | | another whisker to this tick mark. |
| the Lean Six Sigma methodology, but you will find | | | | 14. It is possible that some data points in your list will lie |
| box-and-whisker diagrams particularly useful in the | | | | outside of the ends of the whiskers you determined in |
| analyze phase. | | | | steps 12 and 13. These points are called outliers. Plot |
| How do you do it? | | | | any outliers as dots beyond the whiskers. |
| 1. Decide which Critical-To-Quality (CTQ) characteristic | | | | [Note: steps 3 through 14 happen automatically if you |
| you wish to examine. This CTQ must be measurable | | | | use Excel, Minitab, or JMP to create your |
| on a linear scale. That is, the incremental value | | | | box-and-whisker diagram. If you are familiar with these |
| between units of measurement must be the same. For | | | | software packages, their use can greatly simplify the |
| example, time, temperature, dimension and spatial | | | | process of making effective box-and-whisker |
| relationships can usually be measured in consistent | | | | diagrams.] |
| incremental units. | | | | 15. Title and label your box-and-whisker diagram. |
| 2. Measure the characteristic and record the results. If | | | | Now what? |
| the characteristic is continually being produced, such as | | | | The shape that your box-and-whisker diagram takes |
| voltage in a line or temperature in an oven, or if there | | | | tells a lot about your process. |
| are too many items being produced to measure all of | | | | One way to help you interpret box plots is to imagine |
| them, you will have to sample. Take care to ensure | | | | that the way a data set looks as a histogram is |
| that your sampling is random. | | | | something like a mountain viewed from ground level |
| 3. Count the number of individual data points. | | | | and a box-and-whisker diagram is something like a |
| 4. List the data points in ascending order. | | | | contour map of that mountain as viewed from above. |
| 5. Find the median value. If there is an odd number of | | | | Skewed histogram and box plot compared |
| data points, the median is the data point in the middle | | | | The second-quartile box is considerably larger than the |
| of the ordered list. (For | | | | third-quartile box, and the whisker associated with the |
| example, if there are 35 data points, the median value | | | | first quartile extends almost to the end of the 1.5 IQR |
| is the value of the 18th data point from either the top | | | | limit. An outlier beyond the 1.5 IQR limit of the whisker |
| or the bottom of the list.) If there is an even number of | | | | further emphasizes the fact that the data is strongly |
| points, the median is halfway between the two points | | | | skewed in this direction. On the other side of the |
| that occupy the centermost positions. (If there were 36 | | | | distribution, the whisker associated with the fourth |
| points, the median would be halfway between point 18 | | | | quartile is well within the 1.5 IQR. In fact, the |
| and point 19. To find the median value, add the values | | | | fourth-quartile whisker is shorter than the third-quartile |
| of points 18 and 19, and divide the result by 2.) If you | | | | box. A histogram of this data would show a strongly |
| think of the list of data points being divided into | | | | skewed distribution verging on a precipice that fell off |
| quarters (quartiles), the median is the boundary | | | | at the high end of the values. This kind of data set |
| between the second and the third quartile. | | | | often occurs when there is a natural limit at one end of |
| Order Value Boundary | | | | the distribution or a 100% screening is done for one |
| 1 27.75 | | | | specification limit. |
| 2 37.35 | | | | Although box-and-whisker diagrams can be oriented |
| 3 38.35 | | | | horizontally, they are more often displayed vertically, |
| 4 38.35 | | | | with lower values at the bottom of the scale. |
| 5 38.75 | | | | Normal distribution curve and box plot compared |
| First quartile boundary: 39.250 | | | | The second- and third-quartile boxes are |
| 6 39.75 | | | | approximately the same size. The whiskers are similar |
| 7 40.50 | | | | to each other in length and extend close to the 1.5 IQR |
| 8 41.00 | | | | limit. If the data set were actually a combination of two |
| 9 41.15 | | | | different distributions, for example, material from two |
| 10 42.55 | | | | suppliers or two machines, it might form a histogram |
| Median (second quartile boundary): 42.725 | | | | that looked like a plateau or a mountain with twin |
| 11 42.90 | | | | peaks. |
| 12 43.60 | | | | Plateau histogram and box plot compared |
| 13 43.85 | | | | The box plot would show an even distribution, but |
| 14 47.30 | | | | would have relatively large boxes and relatively short |
| 15 47.90 | | | | whiskers. If there were a small amount of data from a |
| Third quartile boundary: 48.025 | | | | different distribution included in the data set, for |
| 16 48.15 | | | | example, if there were a short-term process |
| 17 49.86 | | | | abnormality or a data collection error, the histogram |
| 18 51.25 | | | | formed would look like a mountain with a small isolated |
| 19 51.60 | | | | peak. |
| 20 56.00 | | | | Isolated peak histogram and box plot compared |
| Data table divided into quartiles | | | | The box plot for that data set would look like one for |
| 6. The next step is to find the boundaries between the | | | | a normal distribution but with a number of outliers |
| first and second and the third and fourth quartiles. The | | | | beyond one whisker. |
| first quartile boundary is halfway between the last | | | | Some final tips |
| data point in the first quartile and the first data point in | | | | A box-and-whisker diagram is an easy way to |
| the second quartile. (If one data point is on the median, | | | | compare processes or to chart the improvement |
| that data point is considered to be the last point in the | | | | process in one process. Box-and-whisker diagrams |
| second quartile and the first point in the third quartile.) In | | | | can quickly give you a comparative feel of the |
| a similar way, find the third quartile boundary, the | | | | distribution of sets of data. They show the distributional |
| halfway point between the last value in the third | | | | spread through the length of the box and the whiskers. |
| quartile and the first value in the fourth quartile. | | | | Some idea of the symmetry of the distribution can |
| 7. Draw and label a scale line with values. The value of | | | | also be gained by comparing the two segments of the |
| the scale should begin lower than your lowest value | | | | box and the relative lengths of the whiskers. The |
| and extend higher than your highest value. The scale | | | | existence and displacement of outliers give some |
| line may be either vertical or horizontal. | | | | indication of the level of control in the process. |
| 8. Using the scale as a guideline, create a box above | | | | Two or more box-and-whisker diagrams drawn side |
| or to the right of the scale. One end of the box will be | | | | by side to the same scale are an effective way to |
| the first quartile boundary; the other will be the third | | | | compare samples in a way that is compact and |
| quartile boundary. (The width of the box is somewhat | | | | uncluttered. Many box plots can be added to a |
| arbitrary. Boxes tend to be long and thin. As an option, | | | | diagram without creating visual overload. |
| if you have multiple data sets with different numbers | | | | Not only can box-and-whisker diagrams help you see |
| of data points in each set, make the width of the | | | | which processes need improvement; by comparing |
| boxes so that they correspond roughly with the | | | | initial box-and-whisker diagrams with subsequent ones, |
| relative quantity of data represented in each box.) | | | | they can also help you track that improvement. If |
| 9. Draw a line through the box to represent the median | | | | specification limits or improvement targets are involved |
| (second quartile boundary). | | | | in your process, they can be added to the diagram to |
| 10. The next step is to draw the whiskers on the ends | | | | help visualize progress. |
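
The calculations in steps 3 through 6 and 10 through 14 can be sketched in a few lines of Python. This is a minimal illustration, not the method used by Excel, Minitab, or JMP, which may compute quartiles differently. The function name `box_plot_summary` and its dictionary keys are hypothetical. One assumption is labeled in the code: each quartile boundary is taken as the median of the lower or upper half of the ordered list, with a data point that sits on the median counted in both halves, which is one reading of step 6.

```python
from statistics import median


def box_plot_summary(values, multiplier=1.5):
    """Return the numbers needed to draw a box-and-whisker diagram."""
    data = sorted(values)          # step 4: list in ascending order
    n = len(data)                  # step 3: count the data points

    # Assumption: each half includes the median point when n is odd,
    # following the parenthetical rule in step 6.
    half = (n + 1) // 2
    q1 = median(data[:half])       # first quartile boundary
    q2 = median(data)              # median (second quartile boundary)
    q3 = median(data[n - half:])   # third quartile boundary

    iqr = q3 - q1                  # step 10: inter-quartile range
    reach = multiplier * iqr       # step 11: maximum whisker length
    lower_limit = q1 - reach       # step 12
    upper_limit = q3 + reach       # step 13

    # Whiskers end at the most extreme data points inside the limits.
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    # Step 14: anything beyond the limits is plotted as an outlier.
    outliers = [x for x in data if x < lower_limit or x > upper_limit]

    return {
        "q1": q1,
        "median": q2,
        "q3": q3,
        "iqr": iqr,
        "low_whisker": inside[0],
        "high_whisker": inside[-1],
        "outliers": outliers,
    }


# The 20 values from the data table above.
DATA = [
    27.75, 37.35, 38.35, 38.35, 38.75,
    39.75, 40.50, 41.00, 41.15, 42.55,
    42.90, 43.60, 43.85, 47.30, 47.90,
    48.15, 49.86, 51.25, 51.60, 56.00,
]

summary = box_plot_summary(DATA)
for name, value in summary.items():
    print(name, value)
```

For the data table above, the quartile boundaries should agree with the boundary values listed in the table. The 1.5 IQR limits work out to about 26.09 and 61.19, so every point lies inside them: the whiskers end at 27.75 and 56.00 and there are no outliers to plot.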