How to group a DataFrame by a specific column with Pandas using Python

The Pandas .groupby() method splits a DataFrame into groups based on the values of one or more columns, so that an aggregation such as a sum or a mean can be computed for each group.

So, why aggregate?

Here is a simple example.

Imagine you run an e-commerce store and want to know which of your products sold the most overall.

So we first need to group the rows by product, then call .sum() on the resulting groups to obtain the total sales amount per product.

Here is the code:

# Import the Pandas library
import pandas as pd

# We create our example dataframe
df = pd.DataFrame({"product" :  ["Stickers", "T-shirts", "Mug", "Stickers", "Jeans", "Mug"],
                   "sales_in_usd" : [10000, 2142, 3321, 11141, 12133, 3321],
                   "year" : [2020, 2020, 2020, 2021, 2021, 2021]})

# We print the total sales amount per product (all years combined)
print(df.groupby("product")["sales_in_usd"].sum())

# We can also plot it as a bar chart (this requires matplotlib to be installed)
df.groupby("product")["sales_in_usd"].sum().plot(kind='bar', title="total sales per product")
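
To answer the original question directly (which product sold the most overall), the grouped totals can be sorted or queried for their maximum. Here is a minimal sketch that rebuilds the same example data so it runs on its own; the variable names totals, ranked and best_seller are only illustrative.

```python
import pandas as pd

# Same example data as above, rebuilt so this snippet is self-contained
df = pd.DataFrame({"product": ["Stickers", "T-shirts", "Mug", "Stickers", "Jeans", "Mug"],
                   "sales_in_usd": [10000, 2142, 3321, 11141, 12133, 3321],
                   "year": [2020, 2020, 2020, 2021, 2021, 2021]})

# Total sales per product: a Series indexed by product name
totals = df.groupby("product")["sales_in_usd"].sum()

# Products ranked from highest to lowest total sales
ranked = totals.sort_values(ascending=False)

# idxmax() returns the index label (here, the product name) of the largest value
best_seller = totals.idxmax()

print(ranked)
print(best_seller, totals.max())
```

With this example data, Stickers comes out on top with 21141 USD across both years.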

There you have it! You now know how to group a DataFrame by a specific column with Pandas using Python.

