Box and Whisker Plots

I’m going to share what I learnt about Box and Whisker Plots. Also known as a Box Plot, it is a visual representation of the variation in a set of data.

It primarily shows the five-number summary: minimum, first quartile, median, third quartile, and maximum.
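
To make the five-number summary concrete, here is a minimal sketch using only Python's standard library. The function name `five_number_summary` and the sample values are made up for illustration; they are not from the Iris dataset. Quartiles can be computed under several conventions, so other tools may give slightly different numbers for the same data.

```python
import statistics


def five_number_summary(values):
    """Return min, quartiles, and max for a list of numbers.

    Hypothetical helper for illustration. It uses the "inclusive"
    quartile method, which interpolates between the sorted data points.
    """
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": ordered[-1],
    }


# Made-up sample values, not real measurements
sample = [2, 4, 4, 5, 7, 8, 9, 11, 12]
print(five_number_summary(sample))
# {'min': 2, '25%': 4.0, '50%': 7.0, '75%': 9.0, 'max': 12}
```

In a box plot, the box spans the 25% to 75% values and the line inside the box marks the 50% value (the median).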

In the ML world, this visualization gives a good understanding of the distribution and variation of data, so learning to read this plot is very useful in the ML journey.

For instance, I was working on the Iris dataset. If I describe this dataset, this is what it looks like:

Note that the statistics from min to max, including 25%, 50%, and 75%, are what a Box Plot represents graphically. The resulting Box and Whisker plot looks like this:

If you look at the box for sepal-width, you will see some small circles. These are called outliers: data points that lie unusually far from the rest of the values.
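
How does a plotting tool decide which points become circles? A common convention, and the default in matplotlib's `boxplot`, is to flag any value more than 1.5 times the interquartile range (IQR) below the first quartile or above the third quartile. The sketch below assumes that convention; the function name `find_outliers` and the sample values are made up for illustration.

```python
import statistics


def find_outliers(values, whisker=1.5):
    """Return the values lying outside the whisker fences.

    Hypothetical helper for illustration, assuming the common
    1.5 * IQR rule. Quartiles use the "inclusive" method.
    """
    q1, _, q3 = statistics.quantiles(sorted(values), n=4, method="inclusive")
    iqr = q3 - q1                      # width of the box
    lower_fence = q1 - whisker * iqr   # values below this are outliers
    upper_fence = q3 + whisker * iqr   # values above this are outliers
    return [v for v in values if v < lower_fence or v > upper_fence]


# Made-up sample values, not real measurements
sample = [1, 10, 11, 12, 13, 14, 15, 16, 30]
print(find_outliers(sample))
# [1, 30]
```

Here the quartiles are 11 and 15, so the IQR is 4 and the fences sit at 5 and 21. The values 1 and 30 fall outside the fences and would be drawn as circles.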

This video on Khan Academy beautifully explains how to interpret box plots. This video on YouTube explains outliers.

