How to compute correlation using Pandas


When analyzing data, you will often need to check how strongly two variables are correlated.

The DataFrame.corr() method returns the pairwise correlation between the columns of your dataframe. By default it computes the Pearson correlation coefficient.

# We import the libraries
import pandas as pd
import numpy as np

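# Define two variables that are correlated with each other
# using the multivariate normal function from numpy.
# The off-diagonal covariance must be non-zero, otherwise the variables are uncorrelated.
# Here the expected correlation is 5 / (sqrt(1) * sqrt(100)) = 0.5
df = pd.DataFrame(np.random.multivariate_normal(mean=[0, 0],
                                                cov=[[1, 5], [5, 100]],
                                                size=100),
                  columns=["col1", "col2"])

# We print the correlation matrix of the dataframe columns
print(df.corr())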
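If you only need a single coefficient rather than the full matrix, one option is Series.corr(), which compares one column against another. The sketch below is illustrative: the seed, the sample size of 1000, and the covariance values are assumptions chosen so that the result is reproducible and close to 0.5. It also shows the method parameter of DataFrame.corr(), here with the Spearman rank correlation.

```python
import numpy as np
import pandas as pd

# Assumed setup: a seeded generator so the example is reproducible
rng = np.random.default_rng(seed=0)
data = rng.multivariate_normal(mean=[0, 0],
                               cov=[[1, 5], [5, 100]],  # expected correlation: 0.5
                               size=1000)
df = pd.DataFrame(data, columns=["col1", "col2"])

# Pearson correlation between two specific columns (a single float)
pair_corr = df["col1"].corr(df["col2"])

# The same value read from the full correlation matrix
matrix_corr = df.corr().loc["col1", "col2"]

# Rank-based (Spearman) correlation, selected with the method parameter
spearman_corr = df.corr(method="spearman").loc["col1", "col2"]

print(pair_corr, spearman_corr)
```

Series.corr() is convenient when the dataframe has many columns and you care about just one pair. Spearman may be a better fit when the relationship is monotonic but not linear.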

That's it: you now know how to compute the correlation between two variables using pandas.