How to format a column to numeric with Pandas
It is highly probable that one day you'll end up with a column of integers that Pandas reads as strings.
Here are two ways to fix this problem.
Using .to_numeric()
By applying the pd.to_numeric() function to a column, we can convert all of its strings into numbers.
import pandas as pd
# We create the example dataframe
df = pd.DataFrame({"col1" : ['1','2','3','4','5','6','7']})
df["col1"] = pd.to_numeric(df["col1"])
print(df["col1"])
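To confirm that the conversion actually happened, it can help to compare the column's dtype before and after. This is a minimal sketch reusing the example column above; the variable names dtype_before and dtype_after are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"col1": ['1', '2', '3', '4', '5', '6', '7']})

# Before the conversion the column holds strings
# (shown as "object" in many Pandas versions)
dtype_before = df["col1"].dtype

df["col1"] = pd.to_numeric(df["col1"])

# After the conversion the column holds integers
dtype_after = df["col1"].dtype

print(dtype_before, "->", dtype_after)
```

Once the column is numeric, operations such as df["col1"].sum() add the values instead of concatenating strings.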
But this only works when every value in the column can be parsed as a number...
Using .apply()
Let's take another example:
import pandas as pd
# We create the example dataframe
df = pd.DataFrame({"col1" : ['1','2','3','4','Unknown','6','7']})
This time the column contains the string 'Unknown'.
Calling pd.to_numeric() on this column now raises a ValueError, because not every value is a number stored as a string: Pandas cannot parse 'Unknown'.
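A quick way to see the failure is to wrap the call in try/except. This is a minimal sketch, assuming a column with one non-numeric entry; the failed flag is a hypothetical name used only to record the outcome.

```python
import pandas as pd

df = pd.DataFrame({"col1": ['1', '2', '3', '4', 'Unknown', '6', '7']})

try:
    pd.to_numeric(df["col1"])
    failed = False
except ValueError as err:
    # Pandas reports which string it could not parse
    failed = True
    print(f"Conversion failed: {err}")
```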
To fix this problem, we can pass a small helper function to .apply().
import pandas as pd
def return_if_number(x):
    """Return x as an int, or None if it cannot be converted."""
    try:
        return int(x)
    except (ValueError, TypeError) as e:
        print(e)
        return None
# We create the example dataframe
df = pd.DataFrame({"col1" : ['1','2','3','4','Unknown','6','7']})
# Entries that cannot be converted come back as None (missing values)
df["col1"] = df["col1"].apply(return_if_number)
More on DataFrames
If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic, right here:
Pandas - The Python You Need
We gathered the only Python essentials that you will probably ever need.