How to drop duplicated rows in a DataFrame with Pandas using Python
To drop duplicated rows, you can use the DataFrame.drop_duplicates() method. By default, it compares all columns and keeps the first occurrence of each duplicated row.
The example
Imagine we have a DataFrame that contains two identical rows, and we want to eliminate one of the duplicates.
Here is the code
# To work with dataframes
import pandas as pd
# We create a sample dataframe; the rows at index 2 and 4 are identical
df = pd.DataFrame({"col1": [0, 10, 3, 11, 3, 12],
                   "col2": [1, 2, 2, 12, 2, 12]})
# We drop the duplicated rows (the first occurrence is kept) and reassign the result
df = df.drop_duplicates()
# We print the dataframe
print(df)
Here you are! The row at index 4, a copy of the row at index 2, is gone from the printed DataFrame. You now know how to drop duplicated rows in a DataFrame with Pandas using Python.
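If the default behavior is not what you need, the `keep` and `subset` parameters of `drop_duplicates()` change which rows count as duplicates and which copy survives. Here is a minimal sketch that reuses the same sample data; the variable names (`mask`, `last_kept`, `by_col2`, `none_kept`) are only illustrative.

```python
# To work with dataframes
import pandas as pd

# Same sample dataframe as above
df = pd.DataFrame({"col1": [0, 10, 3, 11, 3, 12],
                   "col2": [1, 2, 2, 12, 2, 12]})

# Boolean mask that is True for rows repeating an earlier row
mask = df.duplicated()

# Keep the last occurrence of each duplicated row instead of the first
last_kept = df.drop_duplicates(keep="last")

# Only compare "col2" when deciding which rows are duplicates
by_col2 = df.drop_duplicates(subset=["col2"])

# Drop every row that has a duplicate, keeping none of the copies
none_kept = df.drop_duplicates(keep=False)

print(mask)
print(last_kept)
print(by_col2)
print(none_kept)
```

Checking `df.duplicated()` first is a handy way to see what would be removed before actually dropping anything.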
More on DataFrames
If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic.