How to compute the standard error of the mean with Pandas using Python

What is the standard error of the mean?

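The standard error of a statistic tells you how much that statistic would vary if you drew a new sample each time. That gives you an idea of how close your sample's estimate is likely to be to the true population value.

For the mean, that is how close your sample mean is likely to be to the true population mean.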
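
As a minimal sketch of that idea: the usual estimate of the standard error of the mean is the sample standard deviation divided by the square root of the sample size. The values in `sample` below are made up for illustration, and the comparison relies on `Series.sem()` using `ddof=1` by default.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: 1 means rotten, 0 means good (illustrative values only)
sample = pd.Series([0, 0, 1, 0, 0, 0, 1, 0, 0, 0])

# Standard error of the mean = sample standard deviation / sqrt(sample size)
manual_sem = sample.std(ddof=1) / np.sqrt(len(sample))

# pandas computes the same quantity directly (ddof=1 is the default)
pandas_sem = sample.sem()

print(manual_sem, pandas_sem)
```

Both numbers should match, which is a quick way to convince yourself of what `.sem()` does under the hood.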

An example

Say you have a basket of 100 apples and 1 of them is rotten. In your basket, 1 out of 100 apples is rotten (so 1/100 = 0.01 or 1%).

The farmer who sold you the apples has 1000 rotten apples out of 50000 in his stock (so 1000/50000 = 0.02 or 2%).

We can assume that you were lucky when buying your basket of apples, because you got more good apples than you should have (0.02 - 0.01 = 0.01, a difference of 1 percentage point).

Now, if your friends also bought baskets of apples, the standard error of the mean would describe how much the share of rotten apples typically varies between your basket and your friends' baskets.
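
To make the friends picture concrete, here is a rough simulation. It assumes every friend buys a basket of 100 apples and that each apple is rotten independently with the farmer's 2% rate. The number of friends (`n_friends`) is a hypothetical value picked only to get a stable estimate.

```python
import numpy as np

# Seeded generator so the simulation is reproducible
rng = np.random.default_rng(3)

true_rotten_rate = 0.02   # the farmer's rate from the example
basket_size = 100         # apples per basket
n_friends = 5000          # hypothetical number of friends, for illustration

# Each row is one friend's basket; True means the apple is rotten
baskets = rng.random((n_friends, basket_size)) < true_rotten_rate

# Rotten rate observed by each friend
rates = baskets.mean(axis=1)

# Spread of those rates across all the friends
observed_spread = rates.std(ddof=1)

# What theory predicts for a proportion: sqrt(p * (1 - p) / n)
theoretical_sem = np.sqrt(true_rotten_rate * (1 - true_rotten_rate) / basket_size)

print(observed_spread, theoretical_sem)
```

The two values should land close to each other: the spread between baskets is what the standard error of the mean is estimating.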

The conclusion

All that to say that you cannot reliably conclude from a single basket that 1% of the apples are rotten on average. The standard error of the mean is here to tell you how far your sample mean is likely to be from reality.

Enough talking,

Here is the code

# Wassup, you look good today!
# If you want more check out other snippets on thepythonyouneed.com!

# To work with dataframes
import pandas as pd

# To generate data
import numpy as np

# For you to be able to reproduce the results
np.random.seed(3)

# We create our sample dataframe
df_apple_stock = pd.DataFrame({
    "rotten_apple": np.random.choice(a=[True, False],
                                     size=50000,
                                     p=[0.02, 0.98])
})

# Your stock: a basket of 100 apples drawn at random from the farmer's stock
df_your_stock = df_apple_stock.sample(100)

# We show the share of rotten/non-rotten apples in your stock
print(df_your_stock["rotten_apple"].value_counts() / len(df_your_stock))

# And the same for the farmer's stock
print(df_apple_stock["rotten_apple"].value_counts() / len(df_apple_stock))

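# We print the unbiased standard error of the mean (ddof=1 by default)
# for your basket: the uncertainty on your observed share of rotten apples
print(df_your_stock["rotten_apple"].sem())

# And for the farmer's whole stock, treated as one very large sample
print(df_apple_stock["rotten_apple"].sem())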
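
The size of the sample matters a lot here. As a rough sketch, drawing bigger baskets from a simulated stock should give a smaller standard error. The basket sizes and the `random_state` value are arbitrary choices for illustration, and the stock uses 1/0 instead of True/False only to keep the data numeric.

```python
import numpy as np
import pandas as pd

# Simulated farmer's stock: 1 means rotten, 0 means good
np.random.seed(3)
stock = pd.Series(np.random.choice(a=[1, 0], size=50000, p=[0.02, 0.98]))

# Standard error of the mean for baskets of increasing size
sem_by_size = {}
for n in (100, 1000, 10000):
    basket = stock.sample(n, random_state=3)
    sem_by_size[n] = basket.sem()
    print(n, sem_by_size[n])
```

Since the standard error is the standard deviation divided by the square root of the sample size, you need roughly 100 times more apples to make it 10 times smaller.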

Here you are! You now know how to compute the standard error of the mean with Pandas using Python.

More on DataFrames

If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic:

Pandas - The Python You Need
We gathered the only Python essentials that you will probably ever need.