How to compare two DataFrames and show the differences with Pandas using Python
As a data scientist, you will often need to analyze and compare data.
A useful method provided by the Pandas library (since version 1.1.0) is DataFrame.compare().
It compares two DataFrames that have identical index and column labels, and returns only the values that differ between them.
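The "identically indexed and labeled" requirement is worth seeing in practice. The following is a minimal sketch with made-up values (the frames `left` and `right` and their numbers are hypothetical, not taken from this article): when the row labels differ, compare() raises a ValueError, and one possible workaround is to reindex one DataFrame onto the other's labels first.

```python
import pandas as pd

# Two hypothetical DataFrames whose row labels do not match ("France" vs "Italy")
left = pd.DataFrame({"GDP": [1, 2, 3]}, index=["Germany", "USA", "France"])
right = pd.DataFrame({"GDP": [1, 5, 3]}, index=["Germany", "USA", "Italy"])

# compare() refuses DataFrames that are not identically labeled
try:
    left.compare(right)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# One possible workaround: align `right` on the labels of `left`.
# Labels missing from `right` (here "France") become NaN after reindexing.
aligned_right = right.reindex(index=left.index, columns=left.columns)
diff = left.compare(aligned_right)
print(diff)
```

After reindexing, a row that exists in only one DataFrame shows up as a difference against NaN, which may or may not be what you want depending on your data.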
Here is the code:
# How to compare two DataFrames and show the differences

# Import pandas to work with DataFrames
import pandas as pd

# Create a sample DataFrame for 2020
df_year_2020 = pd.DataFrame({"Country": ["Germany", "USA", "France"],
                             "GDP": [3332000000,
                                     20839000000,
                                     2603000000],
                             "Year": [2020, 2020, 2020]})

# Create a second sample DataFrame for 2021
df_year_2021 = pd.DataFrame({"Country": ["Germany", "USA", "France"],
                             "GDP": [4218000000,
                                     22939000000,
                                     2785210000],
                             "Year": [2021, 2021, 2021]})

# Show only the values that differ: "self" is df_year_2020, "other" is df_year_2021.
# The "Country" column is identical in both DataFrames, so it is left out of the result.
print(df_year_2020.compare(df_year_2021))
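By default, compare() drops every row and column that is identical in both DataFrames. The sketch below shows how the optional `keep_shape`, `keep_equal` and `align_axis` parameters change the result. The frames `df_a` and `df_b` and their "Score" values are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Two small hypothetical DataFrames that differ in a single cell (row 1, "Score")
df_a = pd.DataFrame({"Country": ["Germany", "USA", "France"],
                     "Score": [1, 2, 3]})
df_b = pd.DataFrame({"Country": ["Germany", "USA", "France"],
                     "Score": [1, 5, 3]})

# Default: only the rows and columns containing a difference are kept
default_diff = df_a.compare(df_b)

# keep_shape=True: keep every row and column; equal values are shown as NaN
full_shape = df_a.compare(df_b, keep_shape=True)

# keep_equal=True: show the actual values instead of NaN where they are equal
full_values = df_a.compare(df_b, keep_shape=True, keep_equal=True)

# align_axis=0: stack "self" and "other" as rows instead of side-by-side columns
stacked = df_a.compare(df_b, align_axis=0)

print(default_diff)
print(full_shape)
print(full_values)
print(stacked)
```

The default output is usually the easiest to read for a handful of differences. `keep_shape=True` can help when the result has to line up with the original DataFrames.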
There you go! You now know how to compare two DataFrames and show the differences with Pandas using Python.
More on DataFrames
If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic.