How to resample a DataFrame with Pandas using Python
When you start working with time-series data, the resample method comes in handy.
This method lets you change the interval between observations: you can increase it (downsampling, for example from daily to monthly) or reduce it (upsampling, for example from monthly to daily).
Let me illustrate.
Suppose we want to transform daily observations into monthly observations.
The resample method groups the daily observations month by month.
By then applying the mean() method, you get the monthly average of those daily observations.
Here is a figure that depicts exactly that.
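To make the idea concrete, here is a minimal sketch on a tiny series whose monthly averages can be checked by hand. The data and the names toy, monthly and by_period are hypothetical, chosen only for illustration. The sketch uses the "MS" alias, which labels each month by its first day, as an assumption to keep it working across pandas versions.

```python
import pandas as pd

# Hypothetical toy data: one value per day for January and February 2021,
# numbered 0, 1, 2, ... so the monthly means are easy to verify by hand
dates = pd.date_range(start="2021-01-01", end="2021-02-28", freq="D")
toy = pd.DataFrame(index=dates, data={"col1": range(len(dates))})

# Group the daily rows month by month, then average each group
monthly = toy.resample("MS").mean()
print(monthly)

# Conceptually, this is a group-by on the calendar month of each date
by_period = toy.groupby(toy.index.to_period("M")).mean()
print(by_period)
```

January holds the values 0 to 30 and February the values 31 to 58, so the two monthly means are 15.0 and 44.5. Both approaches give the same numbers and differ only in how the resulting index is labelled.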
The code
# Import the libraries
import numpy as np
import pandas as pd

# Generate a daily date range
dates = pd.date_range(start="01-01-1980",
                      end="01-01-2021",
                      freq="D")

# Generate random observations (standard normal)
x = np.random.normal(0, 1, len(dates))

# Create a sample DataFrame indexed by date
df = pd.DataFrame(index=dates,
                  data={"col1": x})

# Print the initial DataFrame
print(df)

# Resample the DataFrame to monthly frequency and average each month
# ("M" labels each bin by its month end; pandas 2.2+ prefers the alias "ME")
df = df.resample("M").mean()

# Print the transformed DataFrame
print(df)
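The example above reduces the frequency. Going the other way, from monthly to daily, creates rows that have no value yet, so you have to decide how to fill them. The following is a minimal sketch assuming a hypothetical three-month series; forward-filling and linear interpolation are just two possible choices, and the right one depends on your data.

```python
import pandas as pd

# Hypothetical monthly data: one observation at the start of each month
monthly = pd.DataFrame(
    index=pd.date_range(start="2021-01-01", periods=3, freq="MS"),
    data={"col1": [1.0, 2.0, 3.0]},
)

# Upsample to daily frequency and repeat the last known value
daily_ffill = monthly.resample("D").ffill()

# Upsample to daily frequency and interpolate linearly between known points
daily_interp = monthly.resample("D").interpolate()

print(daily_ffill.head())
print(daily_interp.head())
```

Forward-filling keeps each monthly value constant until the next observation, while interpolation draws a straight line between consecutive observations.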
There you go! You now know how to resample a DataFrame with Pandas using Python.
More on DataFrames
If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic.