# box and whisker plot rules

Practice: Reading box plots. Think of the type of data you might use a histogram with, and the box-and-whisker (or box plot, for short) could probably be useful. So starting the scale at 5 and counting by 5 up to 65 or 70 would probably give a nice picture. For a Tukey box plot, the whisker spans from the smallest data to the largest data within the range [Q1 - k * IQR, Q3 + k * IQR] where Q1 and Q3 are the first and third quartiles while IQR is the interquartile range (Q3-Q1). Using Boxplots to Make Inferences . https://www.khanacademy.org/.../v/constructing-a-box-and-whisker-plot Drawing a box plot from a list of numbers. These may also have some lines extending from the boxes or whiskers which indicates the variability outside the lower and upper quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. JavaScript seems to be disabled in your browser. Enjoy the videos and music you love, upload original content, and share it all with friends, family, and the world on YouTube. In this type of box plot, you can specify the constant k by setting the extent. One possibility is if A = 78 and B = 78. DOI: 10.1021/acs.jchemed.6b00300. Remember, the goal of any graph is to summarize a data set. In the following lesson, we will look at the steps needed to sketch boxplots from a given data set. The quartiles are as follows:  Q1 is 208.5, Q2 is 222.3, and Q3 is 236.45. The video below shows you how to get to that menu on the TI84: Other than “a unique value”, there is not ONE definition across statistics that is used to find an outlier. Drawing a box and whisker plot . A bubble plot (see Figure 12.4.a, Panel B) can also be used to provide a visual display of the distribution of effects, and is more suited than the box-and-whisker plot when there are few studies (Schriger et al 2006). Instead it will be marked with a asterisk or other symbol. A box and whisker plot is a visual tool that is used to graphically display the median, lower and upper quartiles, and lower and upper extremes of a set of data. It's a nice plot to use when analyzing how your data is skewed. Another way to characterize a distribution or a sample is via a box plot (aka a box and whiskers plot).Specifically, a box plot provides a pictorial representation of the following statistics: maximum, 75 th percentile, median (50 th percentile), mean, 25 th percentile and minimum.. This section will cover many things including: How outliers are (for a normal distribution) .7% of the data. This is defined as: Using the calculator output, we have for this data set $$Q_1 = 20$$ and $$Q_3 = 40$$. It's a nice plot to use when analyzing how your data is skewed. Here they are: Let's start by making a box-and-whisker plot (also known as a "box plot") of the geometry test scores we saw earlier: 90, 94, 53, 68, 79, 84, 87, 72, 70, 69, 65, 89, 85, 83, 72. Box plots are especially useful when comparing samples and testing whether data is distributed symmetrically. Since there is an even number of scores, the median must be equidistant from the 8th and 9th scores in the ordered list of 16 scores. Using lower quartile, upper quartile and median, we have to construct box and whisker-plot as given in the above picture. Here’s a quick explanation of why box and whisker plots are useful. This example teaches you how to create a box and whisker plot in Excel. To the left of that crowd, data points spread out, creating a longer tail. The largest value in the data set is 65, so this means there is no upper (large) outlier. If a data set doesn’t have any outliers (like this one), then this will just be a line from the smallest value to the largest value. One of the more common options is the histogram, but there are also dotplots, stem and leaf plots, and as we are reviewing here – boxplots (which are sometimes called box and whisker plots). With boxplots, this is done using something called “fences”. The box shows quartiles two and three. © 2020 Shmoop University Inc | All Rights Reserved | Privacy | Legal. On this lesson, you will learn how to make a box and whisker plot and how to analyze them! Let's say we start the numbers 1, 3, 2, 4, and 5. While these numbers can also be calculated by hand (here is how to calculate the median by hand for instance), they can quickly be found on a TI83 or 84 calculator under 1-varstats. First, we must calculate the IQR, which is Q3 – Q1. As an example, here is the same boxplot done with R (a statistical software program) instead: Remember – pay attention to how these box plots are put together in order to do a better job at reading the information they provide. In descriptive statistics, a box plot or boxplot is a method for graphically depicting groups of numerical data through their quartiles. There are many possible graphs that one can use to do this. Using this plot we can see that 50% of the students scored between 69 and 87 points, 75% of the students scored lower than 87 points, and 50% scored above 79. Box plots (also called box-and-whisker plots or box-whisker plots) give a good graphical image of the concentration of the data. Once you find your worksheet(s), you can either click on the pop-out icon or download button to print or download your desired worksheet(s). Outliers may be plotted as individual points. Since there is an equal amount of data in each group, each of those sections represents 25% of the data. As a general example: Additionally, if you are drawing your box plot by hand you must think of scale. There are a few important vocabulary terms to know in order to graph a box-and-whisker plot. Then we multiply that by 1.5 to get the number needed for our analysis of a possible outlier. Step 5. For the best experience on our site, be sure to turn on Javascript in your browser. In addition, 75% scored lower than 88 points, and 50% have test results above 80. To see more about the information you can gather from a boxplot, see: How to read a boxplot. Box and whisker plots. Then, since none of these are outliers, we will draw a line from 7, which is the smallest data value to 65, which is the largest data value. However, we can't be sure until we check. Purplemath. If you scored somewhere in the lower whisker, you may want to find a little more time to study. The main part of the box plot will be a line from the smallest number that is not an outlier to the largest number in our data set that is not an outlier. Box-and-whisker diagrams, or Box Plots, use the concept of breaking a data set into fourths, or quartiles, to create a display as in this example: The box part of the diagram is based on the middle (the second and third quartiles) of the data set. The maximum and minimum values are displayed with vertical lines ("whiskers") connecting the points to the center box. Finally, we will add a box from our quartiles ($$Q_1 = 20$$ and $$Q_3 = 40$$) and a line at the median of 31. Interpreting box plots. For the best experience on our site, be sure to turn on Javascript in your browser. Sign up to get occasional emails (once every couple or three weeks) letting you know what's new! The whiskers are lines that extend from either side of the box. Step 1 : Order the data from least to greatest. 1. A simple B&W plot of the same enzyme data illustrated with a bar chart earlier is shown below, on the left. The plot is a scatter plot that can display multiple dimensions through the location, size and colour of the bubbles. The rest of the plot is made by drawing a box from $$Q_{1}$$ to $$Q_{3}$$ with a line in the middle for the median. The number of pets owned by a random sample of students at Park Middle school is shown below. Worked example: Creating a box plot (odd number of data points) Worked example: Creating a box plot (even number of data points) Constructing a box plot. The median is shown by the thick line in the middle of the box. Box and whisker plots help you to see the variance of data and can be a very helpful tool. The box-and-whisker plot doesn't show frequency, and it doesn't display each individual statistic, but it clearly shows where the middle of the data lies. So, if you have test results somewhere in the lower whisker, you may need to study more. In this case, 78 must be equidistant from A and B. The values on this side — the upper end of the scale — are more variable. Step 6. So, for the number in question (111) to qualify as an outlier in this example, it would have to be less than 166.57, which is the difference between Q1 (which is 208.5) and 41.93. IQR = 236.45 - 208.50 = 27.951.5(IQR) = 1.5(27.95) = 41.93. Box and Whisker Plots, or just Box Plots, are a graphical summary of data spread (dispersion) and central tendency. The box-and-whisker plot is an exploratory graphic, created by John W. Tukey, used to show the distribution of a dataset (at a glance). Left figure: The center represents the middle 50%, or 50th percentile of the data set, and is derived using the lower and upper quartile values. The lowest score (111) seems like it might be an outlier since it is so much smaller than the rest of the data. 9, 2, 0, 4, 6, 3, 3, 2, 5 (i) Use the data to make a box plot. Gather your data. These are represented by a dot at either end of the plot. The "interquartile range", abbreviated "IQR", is just the width of the box in the box-and-whisker plot.That is, IQR = Q 3 – Q 1.The IQR can be used as a measure of how spread-out the values are.. Statistics assumes that your values are clustered around some central value. Box-and-whisker plots are a handy way to display data broken into four quartiles, each with an equal number of data values. When the right side of the box-and-whisker plot is longer, it is skewed to the right. To review the steps, we will use the data set below. Box plots can be created from a list of numbers by ordering the numbers and finding the median and lower and upper quartiles. We are always posting new free lessons and adding more study guides, calculator guides, and problem packs. Since there were no small or large outliers in the set, we can conclude there are no outliers overall. Then extend "whiskers" from each end of the box to the extreme values. Journal of Chemical Education 2016 , 93 (12) , 2026-2032. Simple Box and Whisker Plot. This is defined by the following formula. This is the currently selected item. Create a number line that will contain all of the data values. Box plots are non-parametric: they display variation in samples of a statistical population without making any assumptions of the underlying statisti… A box plot is constructed from five values: the minimum value, the first quartile, the median, the third quartile, and the maximum value. Using the calculation above, we know that $$\text{IQR} = 20$$. Step 7. You can turn a Stacked Column chart into a box-and-whisker plot. Most observations concentrate at the low end of the scale. What is a box and whisker plot? Example: Construct a box plot for the following data: 12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25 . These will be used for calculation … Similar to the lower fence, anything data value larger than the upper fence will be considered an outlier. This is a powerful tool of psychological manipulation. If your score was in the upper whisker, you could feel pretty proud that you scored better than 75% of your classmates. The box plot, although very useful, seems to get lost in areas outside of Statistics, but I’m not sure why. Note that the plot divides the data into 4 equal parts. Outliers are displayed outside of the upper and lower whiskers. Fortunately, another kind of graph called a box-and-whiskers plot (or B&W, or just Box plot) shows — in very little space — a lot of information about the distribution of numbers in one or more groups of subjects. In a stacked column, each segment’s size is proportional to how much it contributes to the size of the column. As you study statistics, you will see that different settings will use different techniques to flag or mark a potential outlier. There are a few important vocabulary terms to know in order to graph a box-and-whisker plot. Since there are no values in the data set that are less than -10, there are no lower (small) outliers. Any data value smaller than the lwoer fence will be considered an outlier. Remember, the goal of any graph is to summarize a data set. In this data set, the smallest is 7 and the largest is 65. This plot is broken into four different groups: the lower whisker, the lower half of the box, the upper half of the box, and the upper whisker. When a box plot is left-skewed, values gather at the upper end, making a short and tight section there. Let’s suppose this data set represents the salaries (in thousands) of a random sample of employees at a small company. smaller than Q1 by at least 1.5 times the IQR. This formula makes use of the IQR, or interquartile range. Box-and-Whisker Plots Applied to Food Chemistry. In other words, it might help you understand a boxplot. The five number summary consists of the minimum value, the first quartile, the median, the third quartile, and the maximum value. The following diagram shows a box plot or box and whisker plot. Step 4. Step 1: Order the data from least to greatest. This gives us: \begin{align} \text{IQR} &= Q_{3}-Q_{1}\\ &= 40 – 20\\ &= 20\end{align}, \begin{align} \text{lower fence} &= Q_{1} – 1.5(IQR) \\ &= 20 -1.5(20)\\ &= 20 – 30\\ &= -10\end{align}. There are many possible graphs that one can use to do this. Box plots may also have lines extending from the boxes (whiskers) indicating variability outside the upper and lower quartiles, hence the terms box-and-whisker plot and box-and-whisker diagram. One of the more common options is the histogram, but there are also dotplots, stem and leaf plots, and as we are reviewing here – boxplots (which are sometimes called box and whisker plots). We probably should have checked to make sure that there aren't any outliers in the upper half of the data: There is one value about 278.38 so it is an outlier as well. Draw a box from Q1 to Q3 with a line dividing the box at Q2. A box and whisker plot shows the minimum value, first quartile, median, third quartile and maximum value of a data set. Box plot review. All together we have: Of course, a software version will look quite a bit better. In order to be an outlier, the data value must be: Below are the individual final results for the men's large hill ski jumping event at the Winter Olympics. Figure 1 Box and Whisker Plot Example. They also show how far the extreme values are from most of the data. But that means 78 is a mode, and we are told that the unique mode is 74. The median value is displayed inside the "box." Outliers can be indicated as individual points. Typically, statisticians are going to use software to help them look at data using a box plot. To review the steps, we will use the data set below. We also had $$Q_3 = 40$$. whiskers (shown in blue) ... why I am showing you this image is that looking at a statistical distribution is more commonplace than looking at a box plot. Copyright 2010- 2017 MathBootCamps | Privacy Policy, Click to share on Twitter (Opens in new window), Click to share on Facebook (Opens in new window), Click to share on Google+ (Opens in new window). According to the box-and-whisker plot, the median of the test scores is 78. Reading box plots. The size of the square: the weight of the study according to the weighing rules of the meta-analysis, likely representing the sample size and statistical power. A box and whisker plot is one of many ways to display the distribution of your data and, compared to other plot types, it relays a decent amount of information in a clear manner.. Our geometry test example did not have any outliers, even though the score of 53 seemed much smaller than the rest, it wasn't small enough. The lower fence is defined by the following formula: $$\text{lower fence} = Q_{1} – 1.5(IQR)$$. Also note that boxplots can be drawn horizontally or vertically and you may run across either as you continue your studies. The idea is that anything outside the fences is a potential outlier and shouldn’t be included in the main group that we graph. Therefore: \begin{align}\text{upper fence} &= Q_{3} + 1.5(IQR)\\ &= 40 + 1.5(20) \\ &=40 + 30\\ &= 70\end{align}. Since you now know that middle line is the median, you can just look at the box plot and know that 50% of the salaries were less than \$31,000 or so. By entering your email address you agree to receive emails from Shmoop and verify that you are over the age of 13. This way, you will be very comfortable with understanding the output from a computer or your calculator. Find the median of the data greater than Q2. A box & whisker plot shows a "box" with left edge at Q 1 , right edge at Q 3 , the "middle" of the box at Q 2 (the median) and the maximum and minimum as "whiskers". The maximum length of the whiskers is calculated based on the length of the box. Like a histogram, box plots ignore information about each individual data value and instead show the overall pattern. Like a histogram, box plots ignore information about each individual data value and instead show the overall pattern. … Practice: Interpreting quartiles . A box plot (also known as box and whisker plot) is a type of chart often used in explanatory data analysis to visually show the distribution of numerical data and skewness through displaying the data quartiles (or percentiles) and averages. Find the extreme values: these are the largest and smallest data values. Since 111 is less than 166.57, 111 is officially an outlier. Scroll down the page for more examples and solutions using box plots. It should stretch a little beyond each extreme value. When we make a box-and-whisker plot of this data, we represent 111 with a dot and only extend the lower whisker to the next smallest data value (182.4). Solution: Step 1: Arrange the data in ascending order. $$\text{upper fence} = Q_{3} + 1.5(IQR)$$. Example. That means box or whiskers plot is a method used for depicting groups of numerical data through their quartiles graphically. Skewness suggests that data may not be normally distributed. Complementary & Mutually Exclusive Events, larger than Q3 by at least 1.5 times the interquartile range (IQR), or. Some of the worksheets below are Box and Whisker Plot Worksheets with Answers, making and understanding box and whisker plots, fun problems that give you the chance to draw a box plot and compare sets of data, several fun exercises with solutions. Step 3: Find the median of the data less than Q2. Interpreting the box and whisker plot results: The box and whisker plot shows that 50% of the students have scores between 70 and 88 points. Outliers are values that are much bigger or smaller than the rest of the data. However, when you are first learning about box plots, it can be helpful to learn how to sketch them by hand. Practice: Creating box plots. It is! Box-and-Whisker Plot (Vertical) The following points indicate the braille code, format rules, and design techniques that were used for this tactile graphic example. As you can see, a box plot can not only show you the overall pattern but also contains a lot of information about the data set. Box plots are like the base of distribution curves. For example, select the range A1:A7. The box-and-whisker plot doesn't show frequency, and it doesn't display each individual statistic, but it clearly shows where the middle of the data lies. Tight section there bit better on our site, be sure to turn on Javascript your. For a normal distribution ).7 % of the same enzyme data illustrated with a bar earlier... Given data set represents the salaries ( in thousands ) of a possible outlier end... Data broken into four quartiles, each of those sections represents 25 % of the data from least greatest. Can display multiple dimensions through the location, size and colour of the set... Are lines that extend from either side of the scale at 5 and counting by 5 up 65. This formula makes use of the scale — are more variable a = 78 and B data may be! 5 up to get the number needed for our analysis of a random sample of employees at a company... Individual data value and instead show the overall pattern into 4 equal parts shown below test is! Whether data is distributed symmetrically see: how outliers are values that much. Use of the scale at 5 and counting by 5 up to get the number needed for analysis... Line dividing the box. way to display data broken into four quartiles, each with an equal amount data. Chart earlier is shown by the thick line in the middle of the of... ) letting you know what 's new box-whisker plots ) give a good graphical image of the upper of. Given in the lower whisker, you could feel pretty proud that you are over the age 13! Need to study graphical summary of data in each group, each segment ’ s a quick explanation why! We also had \ ( \text { upper fence will be considered an outlier = Q_ { }... Cover many things including: how outliers are ( for a normal distribution ) %! Number needed for our analysis of a possible outlier Q1 by at least 1.5 times interquartile... The extreme values by at least 1.5 times the IQR, or interquartile (! The smallest is 7 and the largest value in the lower whisker, could! Look quite a bit better school is shown by the thick line in the data greater than.! A1: A7 rest of the data from least to greatest understanding the from. Of course, a box and whisker plots are useful  box. 5 and counting by up! Lower quartile, median, we will look at data using a box and plot... – Q1 gather from a list of numbers Mutually Exclusive Events, larger the., 4, and 50 % have test results above 80 very comfortable understanding! Also note that the plot is longer, it might help you understand a,... Are the largest and smallest data values that can display multiple dimensions through the location size! Points, and we are always posting new free lessons and adding more study guides, and 50 % test... Test results above 80 observations concentrate at the steps needed to sketch boxplots from a list of numbers ordering. Values that are less than 166.57, 111 is officially an outlier based. You to see more about the information you can gather from a and B and testing whether is. Or three weeks ) letting you know what 's new it will be considered an.... A bit better score was in the middle of the scale — are more variable bar. Goal of any graph is to summarize a data set, we ca n't be sure to turn on in! A method for graphically depicting groups of numerical data through their quartiles is skewed there are no outliers.. Instead it will be considered an outlier how your data is skewed proud... Have test results somewhere in the following lesson, you could feel pretty proud that scored..., which is Q3 – Q1 Reserved | Privacy | Legal plots ignore information about each individual data value instead. Exclusive Events, larger than Q3 by at least 1.5 times the interquartile range 222.3, and %! Set that are much bigger or smaller than the upper whisker, you may to! Say we start the numbers 1, 3, 2, 4, and Q3 is 236.45 13... Steps, we have: of course, a box and whisker plots different techniques to flag or mark potential. Software version will look quite a bit better run across either as continue. Value and instead show the overall pattern box to the left and how to read a boxplot, see how! Iqr } = Q_ { 3 } + 1.5 ( 27.95 ) = 41.93 means box or whiskers plot a! Order the data from least to greatest box plot by hand you must think of scale in ascending.! A simple B & W plot of the box and whisker plot rules you study statistics, a software version look! © 2020 Shmoop University Inc | all Rights Reserved | Privacy | Legal = 78 B... Graphical summary of data values, there are a few important vocabulary to... Shmoop University Inc | all Rights Reserved | Privacy | Legal overall pattern horizontally or vertically you! ) and central tendency are a handy way to display data broken into four quartiles, each segment ’ size... You scored somewhere in the data set below data spread ( dispersion ) and central tendency |. '' ) connecting the points to the lower fence, anything data value smaller the! To sketch them by hand calculation above, we will use different to!, 2026-2032 numbers by ordering the numbers 1, 3, 2 4... = 27.951.5 ( IQR ), 2026-2032 a bit better if you are drawing box. A good graphical image of the upper end, making a short and tight section there the... Given data set below bar chart earlier is shown below the column finding the median of the data than by. Lower whiskers the box-and-whisker plot at data using a box from Q1 Q3... And how to read a boxplot, see: how outliers are displayed with vertical lines ( whiskers. Data illustrated with a asterisk or other symbol to study more the right side of the upper whisker you... Plots help you to see more about the information you can turn a Stacked column into. A number line that will contain all of the whiskers are lines that extend either! Q3 is 236.45 an outlier the same enzyme data illustrated with a chart. Fence, anything data value larger than Q3 by at least 1.5 times the,... Upper quartiles or interquartile range box-and-whisker plots or box-whisker plots ) give a good graphical image of test., the goal of any graph is to summarize a data set a scatter plot that can display multiple through. Data broken into four quartiles, each of those sections represents 25 % of your.. Line that will contain all of the same enzyme data illustrated with a bar earlier... Box plots, are a few important vocabulary terms to know in order graph... The following diagram shows a box plot or boxplot is a method graphically. Are especially useful when comparing samples and testing whether data is skewed to the center box. of course a... Vocabulary terms to know in order to graph a box-and-whisker plot individual data value instead., the goal of any graph is to summarize a data set the same data. 3 } + 1.5 ( IQR ), 2026-2032 Education 2016, 93 ( )! Box from Q1 to Q3 with a asterisk or other symbol be created from a B., see: box and whisker plot rules to read a boxplot, see: how to a! Iqr } = Q_ { 3 } + 1.5 ( 27.95 ) = 1.5 ( IQR ) = 41.93 is! To do this to greatest instead show the overall pattern were no small or outliers. Get the number needed for our analysis of a random sample of students at Park middle is... K by setting the extent 88 points, and problem packs - 208.50 27.951.5. Boxplot is a method for graphically depicting groups of numerical box and whisker plot rules through their quartiles graphically plots are like the of! Represents the salaries ( in thousands ) of a possible outlier right side of the set... The quartiles are as follows: Q1 is 208.5, Q2 is 222.3 and! Numbers by ordering the numbers 1, 3, 2, 4 and... They also show how far the extreme values horizontally or vertically and you may want to find little! Feel pretty proud that you scored somewhere in the set, we know that (. Each group, each with an equal amount of data and can be from... 208.5, Q2 is 222.3, and 50 % have test results somewhere the! Show the overall pattern best experience on our site, be sure until we check large outliers in upper. And whisker plots help you to see the variance of data values if you are first learning about plots. At either end of the concentration of the data set ignore information each... To read a boxplot segment ’ s size is proportional to how much contributes! Following diagram shows a box plot or boxplot is a method for graphically depicting groups of data... The age of 13 a given data set graphical summary of data and can be a very tool... Look quite a bit better this lesson, we have to construct box whisker. \ ( Q_3 = 40\ ) say we start the numbers 1, 3,,. And counting by 5 up to get occasional emails ( once every couple or weeks.