A Look At Box Plots

Severino Ribecca continues his special series about some of the most popular types of charts

April 7, 2015

[This is a guest post by Severino Ribecca*, part of a series dedicated to each kind of chart he has looked into for his main research project.]

 

 

Box Plots, also known as Box & Whisker Plots, are a type of chart that displays the distribution of numerical data through its quartiles (or percentiles) and its averages (typically the median, and sometimes the mean). Box Plots are mostly used in descriptive statistics, as they are a great way to get a quick overview of one or more data sets and their range. While they may seem primitive in comparison to a histogram or density plot, they have the advantage of being more compact.
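To make the idea of quartiles concrete, here is a minimal Python sketch of the five values a basic Box Plot is drawn from. The function name `five_number_summary` and the sample data are made up for illustration, and the "inclusive" quartile method is just one of several conventions, so other tools may give slightly different quartiles for the same data.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    data = sorted(values)
    # With n=4, statistics.quantiles returns the three quartile cut points.
    # method="inclusive" is an assumption; other quartile conventions exist.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical sample data, not taken from the article.
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12]
print(five_number_summary(sample))  # (2, 4.0, 7.0, 9.0, 12)
```

The box spans the lower to the upper quartile, the line inside it marks the median, and in the simplest form of the chart the whiskers reach out to the minimum and maximum.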

Below is a diagram on how to read a Box Plot:

box_plot_anatomy

If you need to test yourself on reading Box Plots, you can use Khan Academy’s section on Box Plots to improve your skill.

box_plot_1

There are a number of observations one can make from viewing a box plot:

• What the key values are, such as the average, the median, or the lower and upper quartiles.
• If there are any outliers and what their values are.
• If the data is symmetrical or not.
• How tightly the data is grouped.
• If the data is skewed at all and if so, in what direction.
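The observations above can also be worked out numerically. The sketch below is only an illustration: it assumes the common convention of flagging points more than 1.5 times the interquartile range beyond the quartiles as outliers, and it uses a rough skew hint based on where the median sits inside the box. The function name `describe_box` and the sample data are hypothetical.

```python
import statistics


def describe_box(values, k=1.5):
    """Summarise what a Box Plot of `values` would show."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range: how tightly the middle 50% is grouped

    # Fences at k * IQR beyond the quartiles (k=1.5 is an assumed convention).
    lower_fence = q1 - k * iqr
    upper_fence = q3 + k * iqr
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    inside = [x for x in data if lower_fence <= x <= upper_fence]

    # Rough skew hint: compare the two halves of the box around the median.
    upper_half = q3 - median
    lower_half = median - q1
    if upper_half > lower_half:
        skew = "right"
    elif upper_half < lower_half:
        skew = "left"
    else:
        skew = "symmetric"

    return {
        "median": median,
        "iqr": iqr,
        "whiskers": (inside[0], inside[-1]),  # most extreme non-outlier values
        "outliers": outliers,
        "skew": skew,
    }


# Hypothetical sample data, not taken from the article.
sample = [1, 2, 2, 3, 3, 4, 5, 7, 20]
result = describe_box(sample)
print(result)
```

For this sample, the value 20 falls outside the upper fence and is reported as an outlier, the whiskers stop at 1 and 7, and the median sits closer to the lower quartile, which hints at a skew to the right.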

History

Box Plots were invented by John Wilder Tukey, an American mathematician. Tukey first developed the Box Plot in 1970* as part of his toolkit for exploratory data analysis. However, his technique didn’t become widely known until he formally published it in his book Exploratory Data Analysis in 1977.


*Date reference: source 1, source 2

John Wilder Tukey | (photo source: http://thisisstatistics.org/famous-statisticians/ )

Different types

Since Tukey introduced the Box Plot, a number of variations have been developed:

box_plot_variations

Variable Width Box Plots use the width of the box to represent the size of each group, so a group with more data points gets a wider box. Notched Box Plots have a narrowing of the box around the median. This is a useful way to compare median values, as the “notches” act as a visual guide: when the notches of two boxes do not overlap, their medians are likely to differ. Violin Plots are a pair of mirrored kernel density plots joined together, while Vase Plots and Bean Plots are two further variations of the Box Plot.
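As a rough sketch of how the first two variations could be computed, the Python below makes two assumptions that are not spelled out in this post: the notch extends 1.57 times the interquartile range divided by the square root of the group size on either side of the median, and box widths scale with the square root of the group size. Both are commonly used conventions rather than the only options, and the function names and data are hypothetical.

```python
import math
import statistics


def notch_interval(values, factor=1.57):
    """Return the (lower, upper) extent of the notch around the median."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    # factor=1.57 is an assumed, commonly used constant.
    half_width = factor * (q3 - q1) / math.sqrt(len(data))
    return median - half_width, median + half_width


def relative_widths(groups):
    """Scale each box width by sqrt(group size), relative to the largest group."""
    roots = {name: math.sqrt(len(vals)) for name, vals in groups.items()}
    largest = max(roots.values())
    return {name: root / largest for name, root in roots.items()}


# Hypothetical groups, not taken from the article.
groups = {"A": list(range(1, 17)), "B": [3, 5, 6, 9]}

for name, vals in groups.items():
    print(name, notch_interval(vals))
print(relative_widths(groups))  # {'A': 1.0, 'B': 0.5}
```

Here group A has four times as many points as group B, so under the square-root assumption its box is drawn twice as wide.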

While Box Plots are great for showing the different ranges in a data set, their structure is not intuitive and reading them takes time to learn. Box Plots are primarily used for statistical insights, so they are unlikely to be understood by an audience that is not literate in statistics. That describes most of the population, so if you plan to design for a wider audience, avoid Box Plots.

In my next post I will be looking at Bubble Charts.

Further reading on Box Plots:

Box and Whisker Plot Reference Page – The Data Visualisation Catalogue
How to Read and Use a Box-and-Whisker Plot – Flowing Data
40 years of boxplots – Hadley Wickham and Lisa Stryjewski
Box Plots – Khan Academy
Boxes of Insight – Stephen Few
Wikipedia entry on box plots

 

*Severino Ribecca is a British graphic and information designer interested in data visualization. Currently he’s building an online library of different information visualization methods called The Data Visualisation Catalogue. You can follow the project’s updates on Twitter (@dataviz_catalog) and support further developments on the Patreon Page.

Written by Tiago Veloso

Tiago Veloso is the founder and editor of Visualoop and Visualoop Brasil. He is Portuguese, currently based in Bonito, Brazil.
