Box and Whisker Plots
I’m going to share what I learnt about Box and Whisker Plots. A Box and Whisker Plot, also known as a Box Plot, is a visual representation of the variation in a set of data.
It primarily shows the five-number summary: minimum, first quartile, median, third quartile, and maximum.
In the ML world, this visualization gives a good understanding of how the data is distributed and how much it varies. Hence, understanding this plot is very useful in the ML journey.
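To make the five-number summary concrete, here is a minimal sketch that computes it with Python's standard library. The function name five_number_summary and the sample values are my own, chosen for illustration (they are not taken from the Iris dataset). The "inclusive" method interpolates linearly between data points.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(values)
    # n=4 splits the data into quartiles and returns the three cut points.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Illustrative values only, not real Iris measurements.
sample = [4.3, 4.9, 5.0, 5.1, 5.4, 5.8, 6.3, 6.7, 7.9]
summary = five_number_summary(sample)
print(summary)
```

The box in a Box Plot spans from the first to the third quartile, and the line inside the box marks the median.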
For instance, I was working on the Iris dataset. If I describe this dataset, this is what it looks like:
Note that the values from min to max, including 25%, 50%, and 75%, are exactly what a Box Plot represents graphically. The resulting Box and Whisker plot looks like this:
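A minimal sketch of how a plot like this could be drawn with matplotlib. The numbers are made up for illustration (they are not the real Iris measurements), the column names are only borrowed from the dataset, and the output file name boxplot.png is arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Made-up values for illustration only, not the real Iris data.
data = {
    "sepal-length": [4.3, 4.9, 5.0, 5.1, 5.4, 5.8, 6.3, 6.7, 7.9],
    "sepal-width": [2.0, 2.8, 2.9, 3.0, 3.0, 3.1, 3.2, 3.3, 4.4],
}

fig, ax = plt.subplots()
# boxplot draws one box per list and returns the drawn artists in a dict.
result = ax.boxplot(list(data.values()))
# Boxes sit at x positions 1..N by default, so label those ticks.
ax.set_xticks(list(range(1, len(data) + 1)))
ax.set_xticklabels(list(data.keys()))
fig.savefig("boxplot.png")
```

With these values, the second box should show one small circle below the lower whisker and one above the upper whisker.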
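If you look closely at sepal-width, there are some small circles. These are called outliers: points that lie unusually far from the rest of the data, conventionally more than 1.5 times the interquartile range below the first quartile or above the third quartile.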
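As a rough sketch of how such points can be found, the common convention is to flag any value that lies more than 1.5 times the interquartile range (IQR = Q3 - Q1) below Q1 or above Q3. The function name iqr_outliers, the whisker parameter, and the sample values below are my own and purely illustrative.

```python
import statistics

def iqr_outliers(values, whisker=1.5):
    """Return the values lying beyond whisker * IQR from the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - whisker * iqr  # anything below this is flagged
    upper = q3 + whisker * iqr  # anything above this is flagged
    return [v for v in values if v < lower or v > upper]

# Illustrative values only, not real Iris measurements.
widths = [2.0, 2.8, 2.9, 3.0, 3.0, 3.1, 3.2, 3.3, 4.4]
outliers = iqr_outliers(widths)
print(outliers)
```

Here Q1 is 2.9 and Q3 is 3.2, so the limits are roughly 2.45 and 3.65, and the two extreme values fall outside them.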
This Khan Academy video beautifully explains how to interpret box plots, and this YouTube video explains outliers.