How to compute the statistics of a DataFrame

It is often useful to look at the statistics of each column in order to make a quick analysis of our data.

To do so, we can use the Pandas .describe() method.

import pandas as pd

# We read a sample dataset from the web.
df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
We use a sample DataFrame

print(df.describe())
We use the describe method to print each column's statistics

You end up with the following indicators for each numeric column, in this order:

  1. The count
  2. The mean
  3. The standard deviation
  4. The minimum
  5. The 25th percentile
  6. The 50th percentile (the median)
  7. The 75th percentile
  8. The maximum

This gives you a good overview of what your data looks like.
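
Since .describe() returns a regular DataFrame, individual statistics can be read back out of it. The sketch below is a minimal illustration; the toy DataFrame and its column names are made up for the example so that it runs without downloading anything.

```python
import pandas as pd

# Hypothetical toy data (not the iris dataset), so the example runs offline.
toy = pd.DataFrame({
    "length": [1.0, 2.0, 3.0, 4.0],
    "width": [10.0, 20.0, 30.0, 40.0],
})

stats = toy.describe()

# The row labels are the indicators, in the order pandas reports them.
print(list(stats.index))
# ['count', 'mean', 'std', 'min', '25%', '50%', '75%', 'max']

# A single value is read with .loc[row_label, column_name].
print(stats.loc["mean", "length"])  # 2.5
print(stats.loc["50%", "width"])    # 25.0
```

The same .loc lookup should work on the iris summary, using its own column names.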

Categorical Variables

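By default, .describe() only computes statistics for numeric columns. If your DataFrame also has categorical variables, you can pass include="all" to include every column type. Non-numeric columns are then summarized by their count, their number of unique values, their most frequent value (top) and how often it occurs (freq).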

print(df.describe(include='all'))
Describe all variable types
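
To see what include="all" adds, here is a small sketch on a mixed DataFrame. The data and column names are invented for illustration only.

```python
import pandas as pd

# Hypothetical mixed-type data: one text column, one numeric column.
pets = pd.DataFrame({
    "species": ["cat", "dog", "cat", "bird"],
    "weight": [4.0, 20.0, 5.0, 0.5],
})

summary = pets.describe(include="all")
print(summary)

# Text columns are summarized with unique / top / freq.
print(summary.loc["unique", "species"])  # 3
print(summary.loc["top", "species"])     # cat
print(summary.loc["freq", "species"])    # 2

# Rows that do not apply to a column are filled with NaN.
print(pd.isna(summary.loc["mean", "species"]))   # True
print(pd.isna(summary.loc["unique", "weight"]))  # True
```

Because every row is shown for every column, the combined table contains many NaN cells. That is expected and does not indicate missing data in the original DataFrame.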

Timestamps

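.describe() can also describe timestamps. In pandas 1.1 to 1.5 you have to request it with the datetime_is_numeric=True parameter. pandas 2.0 removed that parameter and always summarizes datetime columns like numeric ones, so a plain .describe() call is enough there.

# pandas 1.1 to 1.5 only; on pandas 2.0+ drop the argument.
print(df_containing_timestamp.describe(datetime_is_numeric=True))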
Describe timestamp data
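
If the same script has to run on several pandas versions, one possible approach is to try the keyword and fall back to a plain call. This is only a sketch: the events DataFrame is hypothetical, and the fallback relies on pandas raising TypeError for a keyword it no longer accepts.

```python
import pandas as pd

# Hypothetical data with a single datetime column.
events = pd.DataFrame({
    "timestamp": pd.to_datetime(["2023-01-01", "2023-01-02", "2023-01-04"]),
})

try:
    # pandas 1.1 to 1.5: explicitly ask for a numeric-style summary.
    summary = events.describe(datetime_is_numeric=True)
except TypeError:
    # pandas 2.0+: the keyword no longer exists and this behavior is the default.
    summary = events.describe()

print(summary)

# The summary includes rows such as min, max and mean for the datetime column.
print(summary.loc["min", "timestamp"])
print(summary.loc["max", "timestamp"])
```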