How to calculate correlation and covariance using pandas in Python
In this tutorial, you will learn how to write a program that calculates correlation and covariance using pandas in Python. Both can be computed easily with the built-in DataFrame methods corr() and cov().
corr():
Syntax : DataFrame.corr(method='pearson', min_periods=1)
Parameters : method : {'pearson', 'kendall', 'spearman'}
pearson : standard correlation coefficient
kendall : Kendall Tau correlation coefficient
spearman : Spearman rank correlation
min_periods : int, optional
Minimum number of observations required per pair of columns to have a valid result. Currently only available for Pearson and Spearman correlation.
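To see how the method parameter changes the result, here is a minimal sketch. The DataFrame df and its columns x and y are made-up illustration data, not part of the tutorial's dataset. Because y grows monotonically but not linearly with x, the Spearman rank correlation should be exactly 1 while the Pearson coefficient stays below 1.

```python
import pandas as pd

# Hypothetical sample data: y = x**3, a monotonic but non-linear relationship
df = pd.DataFrame({"x": [1, 2, 3, 4, 5], "y": [1, 8, 27, 64, 125]})

# Pearson measures linear association
pearson = df.corr(method="pearson")

# Spearman measures association between the ranks of the values
spearman = df.corr(method="spearman")

print("Pearson :", pearson.loc["x", "y"])
print("Spearman:", spearman.loc["x", "y"])
```

Passing method='kendall' works the same way, although depending on your pandas version it may require SciPy to be installed.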
cov():
Syntax: DataFrame.cov(min_periods=None)
min_periods : int, optional
Minimum number of observations required per pair of columns to have a valid result.
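The effect of min_periods is easiest to see on data with missing values. The sketch below uses a made-up DataFrame (df, with columns a, b and c, all hypothetical) in which columns a and c share only two complete rows. With the default setting that pair still gets a covariance, while min_periods=3 turns it into NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data with missing values (NaN)
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.0, np.nan, 8.0, 10.0],
    "c": [5.0, np.nan, np.nan, np.nan, 1.0],
})

# Default: each pair of columns uses whatever rows are complete for that pair
loose = df.cov()

# Require at least 3 complete rows per pair; pairs with fewer become NaN
strict = df.cov(min_periods=3)

print(loose)
print(strict)
```

Columns a and b share four complete rows, so their covariance is the same in both results. Only the pairs involving c are affected.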
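# Python program to calculate correlation and covariance
import pandas as pd

data = pd.DataFrame({
    'name': ['ravi', 'david', 'raju', 'david', 'kumar', 'teju'],
    'experience': [1, 2, 3, 4, 5, 2],
    'salary': [15000, 20000, 30000, 45389, 50000, 20000],
    'join_year': [2017, 2017, 2018, 2018, 2019, 2018]
})

# 'name' holds strings; recent pandas versions raise an error instead of
# silently skipping it, so keep only the numeric columns first
numeric = data.select_dtypes(include='number')

print(numeric.corr())  # pairwise Pearson correlation matrix
print(numeric.cov())   # pairwise sample covariance matrix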
Output:
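Correlation matrix:
            experience  join_year    salary
experience    1.000000   0.872402  0.983243
join_year     0.872402   1.000000  0.821615
salary        0.983243   0.821615  1.000000

Covariance matrix:
              experience    join_year        salary
experience      2.166667     0.966667  2.109077e+04
join_year       0.966667     0.566667  9.012967e+03
salary      21090.766667  9012.966667  2.123592e+08

Newer pandas versions keep the columns in the order they were defined, so the column order may differ from what is shown here.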
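The two results are related: the Pearson correlation of two columns is their covariance divided by the product of their standard deviations. The following sketch checks this for experience and salary using the same values as the program above. The variable names (cov_xy, std_x, std_y, manual_corr, pandas_corr) are chosen here for illustration only.

```python
import pandas as pd

# Same experience and salary values as in the tutorial's DataFrame
data = pd.DataFrame({
    "experience": [1, 2, 3, 4, 5, 2],
    "salary": [15000, 20000, 30000, 45389, 50000, 20000],
})

# Sample covariance between the two columns
cov_xy = data["experience"].cov(data["salary"])

# Sample standard deviations (pandas uses ddof=1 by default)
std_x = data["experience"].std()
std_y = data["salary"].std()

# Correlation computed by hand from the covariance
manual_corr = cov_xy / (std_x * std_y)

# Correlation computed directly by pandas
pandas_corr = data["experience"].corr(data["salary"])

print(manual_corr)
print(pandas_corr)
```

Both printed values should agree with the experience/salary entry of the correlation matrix, about 0.983.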