How to compute the cumulative sum of a column with Pandas using Python


If you have already worked with time-series data, let me tell you a bit more about the cumulative sum function.

The cumulative sum is helpful when you want the running total of a variable over time: each value in the result is the sum of all values up to and including that point.
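To make the idea concrete, here is a minimal sketch on a small made-up Series. The values and the names `values` and `running_total` are purely illustrative and are not part of the article's sales data.

```python
import pandas as pd

# Hypothetical values, chosen only to illustrate the running total
values = pd.Series([1, 2, 3, 4])

# Each entry is the sum of all entries up to and including that position
running_total = values.cumsum()
print(running_total.tolist())  # [1, 3, 6, 10]
```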

Let me give you an example with e-commerce sales data.

# For the dataframe
import pandas as pd

# We create our e-commerce sales_in_usd over time dataframe
df_sales = pd.DataFrame(index=['2021-01-31', '2021-02-28', 
                               '2021-03-31', '2021-04-30',
                               '2021-05-31', '2021-06-30',
                               '2021-07-31', '2021-08-31',
                               '2021-09-30', '2021-10-31',
                               '2021-11-30', '2021-12-31'],
                        data={"sales_in_usd" : 
                              [303.0, 591.0, 918.0, 1221.0, 1509.0,
                               1806.0, 2112.0, 2413.0, 2706.0, 3005.0,
                               3291.0, 3592.0]})
Monthly E-commerce sales volume
df_sales["sales_in_usd"].plot(kind='bar',
                              grid=True,
                              title="Monthly E-commerce sales")
We plot our monthly sales
[Figure: bar chart of monthly e-commerce sales]

So far so good!

Now that we have the sales volume per month in USD, one could ask: how much total sales volume have we generated so far?

Using cumsum(), which is available on both a Series (a single column) and a whole DataFrame, we can compute that metric over time.

In Python

# We compute the cumulative sum
df_sales["total_sales"] = df_sales["sales_in_usd"].cumsum()
Here is how to compute the cumulative sum of a DataFrame column

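The resulting DataFrame has a new total_sales column holding the running total: 303.0 in January, 894.0 in February, and so on up to 23467.0 in December.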
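As a quick sanity check, the last value of a cumulative sum should equal the plain sum of the column. The sketch below rebuilds the article's DataFrame so that it runs on its own; the names `month_ends`, `monthly_sales`, `last_total`, and `year_total` are introduced here for illustration only.

```python
import pandas as pd

# Same dates and monthly figures as in the article, rebuilt so the snippet is self-contained
month_ends = ['2021-01-31', '2021-02-28', '2021-03-31', '2021-04-30',
              '2021-05-31', '2021-06-30', '2021-07-31', '2021-08-31',
              '2021-09-30', '2021-10-31', '2021-11-30', '2021-12-31']
monthly_sales = [303.0, 591.0, 918.0, 1221.0, 1509.0, 1806.0,
                 2112.0, 2413.0, 2706.0, 3005.0, 3291.0, 3592.0]

df_sales = pd.DataFrame({"sales_in_usd": monthly_sales}, index=month_ends)
df_sales["total_sales"] = df_sales["sales_in_usd"].cumsum()

# The final running total should match the plain sum of the monthly column
last_total = df_sales["total_sales"].iloc[-1]
year_total = df_sales["sales_in_usd"].sum()
print(last_total, year_total)  # 23467.0 23467.0

# Inspect the first few rows of both columns
print(df_sales.head(3))
```

If the two numbers ever differ, the usual suspects are missing values or a column that was modified after the cumulative sum was computed.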

And if we plot it,

df_sales["total_sales"].plot(kind='bar', grid=True, title="Total E-commerce sales")
We plot the total sales
[Figure: bar chart of total e-commerce sales over time]

Here you are! You now know how to compute the cumulative sum of a column with Pandas in Python.

More on DataFrames

If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic:

Pandas - The Python You Need
We gathered the only Python essentials that you will probably ever need.