How to group by with multiple operations using Pandas in Python
Pandas provides the aggregate() method (alias: agg()), available on both DataFrames and GroupBy objects, to perform multiple aggregation operations on one or more columns.
When you pass it a dictionary, each key is a column name and each value is the operation to apply to that column.
The value can also be a list, which applies several operations to the same column.
Here is the code
One column
# Import the Pandas library
import pandas as pd
# We create our example dataframe
df = pd.DataFrame({"product": ["Stickers", "Jeans", "Mug", "Stickers", "Jeans", "Mug"],
                   "sales_in_usd": [10000, 24198, 1210, 13123, 31903, 7312],
                   "year": [2020, 2020, 2020, 2021, 2021, 2021]})
# We print the total sales amount and the average sales per product (all years combined)
# Operations are passed as strings: recent pandas versions warn when the built-in sum is passed
print(df.groupby("product").aggregate({"sales_in_usd": ["sum", "mean"]}))
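Passing a list of operations produces two-level column labels of the form (column, operation). The short sketch below, which reuses the same example data, shows one way to inspect those labels and pull out a single aggregated column. The variable names result and total_sales are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"product": ["Stickers", "Jeans", "Mug", "Stickers", "Jeans", "Mug"],
                   "sales_in_usd": [10000, 24198, 1210, 13123, 31903, 7312],
                   "year": [2020, 2020, 2020, 2021, 2021, 2021]})

result = df.groupby("product").agg({"sales_in_usd": ["sum", "mean"]})

# The columns are a MultiIndex: one (column, operation) pair per aggregation
print(result.columns.tolist())

# Select a single aggregated column with a (column, operation) tuple
total_sales = result[("sales_in_usd", "sum")]
print(total_sales)
```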
Multiple columns
# Import the Pandas library
import pandas as pd
# We create our example dataframe
df = pd.DataFrame({"product": ["Stickers", "Jeans", "Mug", "Stickers", "Jeans", "Mug"],
                   "sales_in_usd": [10000, 24198, 1210, 13123, 31903, 7312],
                   "cogs": [3420, 12345, 913, 5670, 16402, 6402],
                   "year": [2020, 2020, 2020, 2021, 2021, 2021]})
# We print the total and the average of both sales and COGS per product (all years combined)
# Operations are passed as strings: recent pandas versions warn when the built-in sum is passed
print(df.groupby("product").aggregate({"sales_in_usd": ["sum", "mean"],
                                       "cogs": ["sum", "mean"]}))
There you go! You now know how to group by and apply multiple operations with Pandas in Python.
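One variation worth knowing is named aggregation, where each output column gets its own name and the result has flat column labels instead of two levels. The sketch below applies it to the multiple-columns example. The output names (total_sales, avg_sales, total_cogs, avg_cogs) are arbitrary choices for illustration, not something the dictionary form produces.

```python
import pandas as pd

df = pd.DataFrame({"product": ["Stickers", "Jeans", "Mug", "Stickers", "Jeans", "Mug"],
                   "sales_in_usd": [10000, 24198, 1210, 13123, 31903, 7312],
                   "cogs": [3420, 12345, 913, 5670, 16402, 6402],
                   "year": [2020, 2020, 2020, 2021, 2021, 2021]})

# Named aggregation: output_name=(input_column, operation)
summary = df.groupby("product").agg(
    total_sales=("sales_in_usd", "sum"),
    avg_sales=("sales_in_usd", "mean"),
    total_cogs=("cogs", "sum"),
    avg_cogs=("cogs", "mean"),
)
print(summary)
```

Flat column names tend to be easier to work with if the result is exported or merged with another DataFrame afterwards.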