How to convert a column to numeric with Pandas


Sooner or later, you will probably end up with a column of integers that Pandas reads as strings.

Here are two ways to fix this problem.

Using pd.to_numeric()

Applying the pd.to_numeric() function to a column converts all of its strings into numbers.

import pandas as pd

# We create the example dataframe
df = pd.DataFrame({"col1" : ['1','2','3','4','5','6','7']})

df["col1"] = pd.to_numeric(df["col1"])

print(df["col1"])
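To confirm that the conversion really happened, one option is to check the column type before and after. This is only a minimal sketch: is_numeric_dtype comes from pandas.api.types, and the variable names was_numeric and is_numeric are picked just for this illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Same example dataframe as above
df = pd.DataFrame({"col1": ['1', '2', '3', '4', '5', '6', '7']})

# Before the conversion the column holds strings, so it is not numeric
was_numeric = is_numeric_dtype(df["col1"])

df["col1"] = pd.to_numeric(df["col1"])

# After the conversion the column holds numbers
is_numeric = is_numeric_dtype(df["col1"])

print(was_numeric, is_numeric)

# Arithmetic now works on the column: 1 + 2 + ... + 7
print(df["col1"].sum())
```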

However, pd.to_numeric() fails when the column contains a value that isn't a number.

Using .apply()

Let's take another example:

import pandas as pd

# We create the example dataframe
df = pd.DataFrame({"col1" : ['1','2','3','4','Unkown','6','7']})

This time the column contains the string 'Unkown'.

Calling pd.to_numeric() on this column now raises a ValueError, because not every value is a number stored as a string: Pandas cannot parse 'Unkown'.
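The failure can be reproduced with a short sketch like the one below. The conversion_failed flag is only there for illustration, and the exact wording of the error message may differ between Pandas versions.

```python
import pandas as pd

# Same example dataframe, including the 'Unkown' string
df = pd.DataFrame({"col1": ['1', '2', '3', '4', 'Unkown', '6', '7']})

try:
    pd.to_numeric(df["col1"])
    conversion_failed = False
except ValueError as error:
    # Raised because 'Unkown' cannot be parsed as a number
    conversion_failed = True
    print(f"Conversion failed: {error}")
```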

To fix this problem, we can apply a custom function to each value with .apply().

import pandas as pd

def return_if_number(x):
    """Return x as an int, or None if it cannot be converted"""
    try:
        return int(x)
    except (ValueError, TypeError):
        # Not a valid integer string (for example 'Unkown')
        return None

# We create the example dataframe
df = pd.DataFrame({"col1" : ['1','2','3','4','Unkown','6','7']})

# Assign the result back, otherwise the column is left unchanged
df["col1"] = df["col1"].apply(return_if_number)

print(df["col1"])
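As an alternative that avoids writing a custom function, pd.to_numeric() accepts an errors="coerce" argument that turns unparseable values into NaN. The sketch below is not part of the original example. The names converted and as_integers are chosen only for illustration, and the last step assumes a Pandas version that supports the nullable "Int64" dtype.

```python
import pandas as pd

# Same example dataframe, including the 'Unkown' string
df = pd.DataFrame({"col1": ['1', '2', '3', '4', 'Unkown', '6', '7']})

# errors="coerce" replaces anything that cannot be parsed with NaN
converted = pd.to_numeric(df["col1"], errors="coerce")

# NaN makes the column floating point; the nullable "Int64" dtype
# keeps whole numbers while still marking the missing value
as_integers = converted.astype("Int64")

# Number of values that could not be converted
print(converted.isna().sum())
print(as_integers)
```

Compared with .apply(), this keeps the conversion inside Pandas, which is usually faster on large columns, but it gives less control over how each bad value is handled.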

More on DataFrames

If you want to know more about DataFrames and Pandas, check out the other articles I wrote on the topic:

Pandas - The Python You Need