How to plot categorical variables as color with Matplotlib and Pandas

Sometimes it is useful to add an extra dimension to a scatter plot by coloring the points, especially to see whether they form clusters by category.

To do so, you first have to transform the categories into numbers, and then pass those numbers as the color (the c argument) of the scatter plot.

Here is an example:
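As a quick illustration of the first step, here is a minimal sketch of how pandas turns text categories into integer codes. The species values below are made up for the illustration and are not the article's dataset.

```python
import pandas as pd

# Hypothetical sample values, for illustration only
species = pd.Series(["setosa", "virginica", "setosa", "versicolor"])

# pd.Categorical finds the unique labels (sorted) and assigns each one an integer
cat = pd.Categorical(species)
codes = list(cat.codes)            # one integer code per row
categories = list(cat.categories)  # position in this list == code

print(codes)
print(categories)
```

Each code is simply the position of the label in the sorted list of categories, so the same label always gets the same number and therefore the same color.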

# For our DataFrame
import pandas as pd

# In order to plot
import matplotlib.pyplot as plt

# We get our sample data from GitHub
df = pd.read_csv('')

# We transform text categorical variables into numerical variables
df["species_codes"] = pd.Categorical(df["species"]).codes

# We set up our figure
fig, axes = plt.subplots(1, 1, figsize=(6, 5))

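# We plot the scatter with c = our categorical variables
# (assumption: the column names "petal_length" and "sepal_length" are guessed from the axis labels below)
axes.scatter(df["petal_length"], df["sepal_length"], c=df["species_codes"])

# We plot the grid
axes.grid(True)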

# We add better labels
axes.set_xlabel("Petal Length")
axes.set_ylabel("Sepal Length")

# We set the title
axes.set_title("How to plot categorical variable as color with Matplotlib")

# We tidy things up
fig.tight_layout()

# We plot our data
plt.show()

Here is the result:

[Figure: scatter plot of sepal length against petal length, with points colored by species]
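For readers who want something that runs without the original CSV, here is a self-contained sketch of the same technique. It is not the article's exact code: the measurements are made up, the column names petal_length and sepal_length are assumptions, and the legend built with legend_elements() is an optional extra that maps each color back to its category name.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up measurements, for illustration only (not the article's dataset)
df = pd.DataFrame({
    "petal_length": [1.4, 1.5, 4.5, 4.7, 5.9, 6.1],
    "sepal_length": [5.1, 4.9, 6.4, 7.0, 7.1, 7.6],
    "species": ["setosa", "setosa", "versicolor",
                "versicolor", "virginica", "virginica"],
})

# Transform the text categories into integer codes
species_cat = pd.Categorical(df["species"])
df["species_codes"] = species_cat.codes

# Scatter plot, colored by the category code
fig, ax = plt.subplots(figsize=(6, 5))
scatter = ax.scatter(df["petal_length"], df["sepal_length"],
                     c=df["species_codes"], cmap="viridis")
ax.grid(True)
ax.set_xlabel("Petal Length")
ax.set_ylabel("Sepal Length")

# One legend handle per distinct code, relabeled with the category names
handles, _ = scatter.legend_elements()
ax.legend(handles, list(species_cat.categories), title="Species")

fig.tight_layout()
```

To see the figure, remove the matplotlib.use("Agg") line and call plt.show(), or keep it and write the image to disk with fig.savefig.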

