How to compute the covariance matrix of two Pandas DataFrame columns using Python

1 min

The covariance is a widely used statistical estimate to measure how much a variable does vary when the other increase or decreases.

Here is an example of when you plot two variables.

The covariance is also used to compute the correlation between two variables.

The Pandas library provides the DataFrame.cov() method that will compute the covariance for you.

Here is the code

# to work with dataframe
import pandas as pd

# We create our sample dataset with negative covariance
df = pd.DataFrame({"col1": range(10),
                   "col2": range(10)[::-1]})

# We plot to check the coviariance visually
df.plot(kind="scatter", x="col1", y="col2")

# We compute and print the covariance between the two variables
print(df[["col1", "col2"]].cov())
How to compute the covariance

Here you are! You should be by now the king of covariance in Python!