How to compute the quantiles of a DataFrame with Pandas using Python

Quantiles are useful when you want to see where most of the mass of your distribution lies, or when you want to remove outliers.

Figure: the quantiles of a normal distribution (source: Wikipedia)

The most commonly used quantiles are the quartiles. These are the quantiles that divide the distribution into 4 equal parts (25% each, as shown on the image).

The values located below the Q1 quartile are the lowest 25% of the values in our distribution.

The values located above the Q3 quartile are the highest 25% of the values in our distribution.

The Q2 quartile is simply the median of our distribution: half of the mass of the distribution is located above it and half is located below it.
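The three statements above can be checked on sample data. The following is a minimal sketch, assuming a hypothetical series of 1000 values drawn from a standard normal distribution (the seed and the name `values` are illustrative choices, not part of the original example):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: 1000 draws from a standard normal distribution
rng = np.random.default_rng(0)
values = pd.Series(rng.normal(0, 1, 1000), name="col1")

# Compute the three quartiles in one call
q1, q2, q3 = values.quantile([0.25, 0.5, 0.75])

# Q2 should match the median (compared with a tolerance for rounding)
print(np.isclose(q2, values.median()))

# Share of the values below Q1 and above Q3: each should be close to 25%
share_below_q1 = (values < q1).mean()
share_above_q3 = (values > q3).mean()
print(share_below_q1, share_above_q3)
```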

Quartiles are useful, but percentiles give you a finer view of the distribution.

Percentiles are probably more intuitive to most readers.

Instead of dividing the distribution into 4 equal parts like the quartiles, the percentiles divide it into 100 equal parts, each holding 1% of the values.

So if you want to know the range of your lowest 5% of values, you can take all the values that are located below the 5th percentile.
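As an illustration, the lowest 5% of values can be selected with a boolean mask. This is a sketch under assumptions: the DataFrame, its column name `col1`, and the seed are hypothetical sample data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data
rng = np.random.default_rng(0)
df = pd.DataFrame({"col1": rng.normal(0, 1, 1000)})

# Value of the 5th percentile of the column
p5 = df["col1"].quantile(0.05)

# Keep only the rows located below the 5th percentile
lowest = df[df["col1"] < p5]

# Number of rows selected, and the range they cover
print(len(lowest))
print(lowest["col1"].min(), lowest["col1"].max())
```

With 1000 rows, you should get roughly 50 rows back.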

Sometimes it comes in handy to filter out outliers, that is, extreme values. We can do so by filtering out the values that are below or above a certain percentile (e.g. we don't want values above the 95th percentile).
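One possible way to do this filtering is sketched below. The 5th and 95th percentile cut-offs, the column name `col1`, and the sample data are assumptions made for the example; pick the thresholds that suit your data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data
rng = np.random.default_rng(0)
df = pd.DataFrame({"col1": rng.normal(0, 1, 1000)})

# Lower and upper cut-offs: the 5th and 95th percentiles
low, high = df["col1"].quantile([0.05, 0.95])

# Keep only the rows between the two cut-offs (bounds included)
trimmed = df[df["col1"].between(low, high)]

print(len(df), len(trimmed))
```

Here `Series.between` keeps both bounds by default, so about 90% of the rows remain.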

With the DataFrame.quantile() method, you can compute a quantile for every column of a DataFrame. To compute it for a specific column only, select that column first and call quantile() on it.

Here is the code:

# pandas provides the DataFrame, numpy generates the sample data
import pandas as pd
import numpy as np

# We create a sample dataframe
df = pd.DataFrame({"col1" : np.random.normal(0, 1, 1000)})

# We print the 1st quartile
print(df.quantile(.25))

# We print the 2nd quartile
print(df.quantile(.5))

# We print the 3rd quartile
print(df.quantile(.75))

# We print the 95th percentile
print(df.quantile(.95))

# We print the 5th percentile
print(df.quantile(.05))
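The calls above each return a Series with one value per column. The quantile() method also accepts a list of quantiles, and it can be called on a single column. The sketch below assumes a hypothetical two-column DataFrame (`col1`, `col2`) to show the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with two numeric columns
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "col1": rng.normal(0, 1, 1000),
    "col2": rng.uniform(0, 10, 1000),
})

# Several quantiles at once: the result is a DataFrame
# with one row per quantile and one column per original column
quartiles = df.quantile([0.25, 0.5, 0.75])
print(quartiles)

# A single quantile of a single column: the result is a plain number
median_col1 = df["col1"].quantile(0.5)
print(median_col1)
```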

There you go! You now know how to compute the quantiles of a DataFrame with Pandas using Python.

More on DataFrames

If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic here:

Pandas - The Python You Need
We gathered the only Python essentials that you will probably ever need.
