How to transform categorical text variables into integers using Pandas
This process converts each category in a text column into an integer code that an algorithm can use as a reference.
It is extremely useful when you want to feed this data into a machine learning algorithm, because most algorithms expect numeric input rather than text.
Here is how to do it:
import pandas as pd
df = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv')
# We transform text categorical variables into numerical variables
df["species_codes"] = pd.Categorical(df["species"]).codes
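To see which integer stands for which category, it can help to keep the categorical object around instead of only taking its codes. The sketch below is a minimal illustration under some assumptions: it uses a small inline DataFrame as a stand-in for the iris file (so it runs without a download), and the names `cat`, `code_to_label` and `species_restored` are made up for this example.

```python
import pandas as pd

# Hypothetical stand-in for the iris data, built inline so no download is needed
df = pd.DataFrame({"species": ["setosa", "virginica", "versicolor", "setosa"]})

# Build the categorical once so the codes and the categories stay in sync
cat = pd.Categorical(df["species"])
df["species_codes"] = cat.codes

# Code i corresponds to cat.categories[i]; for strings the categories are sorted
code_to_label = dict(enumerate(cat.categories))

# Map the integer codes back to the original text labels
df["species_restored"] = df["species_codes"].map(code_to_label)

print(df)
print(code_to_label)
```

Note that missing values in the column receive the code -1, so it is worth checking for them before passing the codes to a model.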
More on DataFrames
If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic: