How to analyze the correlation between two variables

There are several ways to analyze the correlation between two variables in Python. Here are a few examples:

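Using numpy.corrcoef(): This function returns the Pearson correlation coefficients as a matrix. Given two 1-D arrays as input, it returns a 2x2 array in which the diagonal entries are 1 (each variable against itself) and the off-diagonal entries hold the correlation between the two variables.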

import numpy as np
x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]
print(np.corrcoef(x, y))
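
Because the result is a matrix, it is often convenient to pull out the single coefficient for the pair. A minimal sketch, reusing the same x and y as above (the variable names matrix and r are just illustrative):

```python
import numpy as np

x = [1, 2, 3, 4, 5]
y = [5, 4, 3, 2, 1]

# np.corrcoef returns a 2x2 matrix; the diagonal is each variable against itself
matrix = np.corrcoef(x, y)

# The off-diagonal entry [0, 1] (equal to [1, 0]) is the correlation between x and y
r = matrix[0, 1]
print(r)
```

Since y decreases by exactly one unit for every unit increase in x, the coefficient here is -1 (up to floating-point rounding).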

Using pandas.DataFrame.corr(): This method computes the pairwise correlation between the columns of a DataFrame and returns the result as a DataFrame. It uses the Pearson method by default.

import pandas as pd
df = pd.DataFrame({'x': x, 'y': y})
print(df.corr())
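
If only one pair of columns is of interest, or a rank-based coefficient is preferred, pandas covers both cases. The sketch below rebuilds the same DataFrame so it runs on its own; the names pearson_xy and spearman_matrix are illustrative:

```python
import pandas as pd

df = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [5, 4, 3, 2, 1]})

# Series.corr gives a single number for one pair of columns (Pearson by default)
pearson_xy = df['x'].corr(df['y'])
print(pearson_xy)

# The method argument of DataFrame.corr switches to a rank-based coefficient
spearman_matrix = df.corr(method='spearman')
print(spearman_matrix)
```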

Using scipy.stats.pearsonr(): This function returns the Pearson correlation coefficient and the p-value for testing non-correlation. It takes two arrays as input and returns a tuple of correlation coefficient and p-value.

from scipy.stats import pearsonr
corr, p_value = pearsonr(x, y)
print(corr)
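
The p-value is usually read against a significance threshold. The sketch below uses small made-up data and a threshold of 0.05; both the data and the name alpha are assumptions for illustration, not part of SciPy:

```python
from scipy.stats import pearsonr

# Made-up data: y follows x closely but not perfectly
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [2, 1, 4, 3, 6, 5, 8, 7]

corr, p_value = pearsonr(x, y)
print(corr, p_value)

# alpha is a conventional but arbitrary cut-off chosen for this example
alpha = 0.05
if p_value < alpha:
    print("The correlation is statistically significant at the chosen threshold")
else:
    print("No significant correlation detected at the chosen threshold")
```

A small p-value only says that a correlation this strong would be unlikely if the variables were uncorrelated; it does not measure how strong the relationship is.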

Using seaborn.pairplot(): This function quickly visualizes the relationships between multiple variables. It draws a grid of scatterplots for every pair of numeric columns, with each variable's own distribution on the diagonal.

import seaborn as sns
sns.pairplot(df)

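Using scipy.stats.spearmanr(): This function returns the Spearman rank-order correlation coefficient and the p-value for testing non-correlation. Because it works on ranks rather than raw values, it measures how well the relationship is described by a monotonic function, not only a straight line. It takes two arrays as input and returns a tuple of correlation coefficient and p-value.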

from scipy.stats import spearmanr
corr, p_value = spearmanr(x, y)
print(corr)
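
The difference from Pearson shows up when the relationship is monotonic but not linear. A small sketch with made-up data, where y is the cube of x (the variable names are illustrative):

```python
from scipy.stats import pearsonr, spearmanr

# Made-up data: y always increases with x, but not along a straight line
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

pearson_r, _ = pearsonr(x, y)
spearman_r, _ = spearmanr(x, y)

# Spearman compares ranks, so a strictly increasing relationship gives 1
print(spearman_r)
# Pearson measures linear association, so it falls below 1 here
print(pearson_r)
```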

In all the above examples, x and y can be Python lists, numpy arrays, or pandas Series containing the two variables whose correlation is to be analysed. The correlation coefficient ranges from -1 to 1, where -1 represents a perfect negative correlation, 0 represents no linear correlation (or, for Spearman, no monotonic one), and 1 represents a perfect positive correlation. Values closer to -1 or 1 indicate a stronger relationship.
