python - how to compute correlation-matrix with nans in data-matrix
--------------------------------------------------
Hire the world's top talent on demand or became one of them at Toptal: https://topt.al/25cXVn
and get $2,000 discount on your first invoice
--------------------------------------------------
Take control of your privacy with Proton's trusted, Swiss-based, secure services.
Choose what you need and safeguard your digital life:
Mail: https://go.getproton.me/SH1CU
VPN: https://go.getproton.me/SH1DI
Password Manager: https://go.getproton.me/SH1DJ
Drive: https://go.getproton.me/SH1CT
Music by Eric Matyas
https://www.soundimage.org
Track title: Hypnotic Puzzle3
--
Chapters
00:00 Python - How To Compute Correlation-Matrix With Nans In Data-Matrix
02:45 Accepted Answer Score 14
03:23 Thank you
--
Full question
https://stackoverflow.com/questions/2710...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #numpy #scipy #correlation
#avk47
Hire the world's top talent on demand or became one of them at Toptal: https://topt.al/25cXVn
and get $2,000 discount on your first invoice
--------------------------------------------------
Take control of your privacy with Proton's trusted, Swiss-based, secure services.
Choose what you need and safeguard your digital life:
Mail: https://go.getproton.me/SH1CU
VPN: https://go.getproton.me/SH1DI
Password Manager: https://go.getproton.me/SH1DJ
Drive: https://go.getproton.me/SH1CT
Music by Eric Matyas
https://www.soundimage.org
Track title: Hypnotic Puzzle3
--
Chapters
00:00 Python - How To Compute Correlation-Matrix With Nans In Data-Matrix
02:45 Accepted Answer Score 14
03:23 Thank you
--
Full question
https://stackoverflow.com/questions/2710...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #numpy #scipy #correlation
#avk47
ACCEPTED ANSWER
Score 14
I think the method you are looking for is corr() from pandas. For example, a dataframe as following. You can also refer to this question. How to efficiently get the correlation matrix (with p-values) of a data frame with NaN values?
import pandas as pd
df = pd.DataFrame({'A': [2, None, 1, -4, None, None, 3],
'B': [None, 1, None, None, 1, 3, None],
'C': [2, 1, None, 2, 2.1, 1, 0],
'D': [-2, 1.1, 3.2, 2, None, 1, None]})
df
A B C D 0 2 NaN 2 -2 1 NaN 1 1 1.1 2 1 NaN NaN 3.2 3 -4 NaN 2 2 4 NaN 1 2.1 NaN 5 NaN 3 1 1 6 3 NaN 0 NaN
rho = df.corr()
rho
A B C D A 1.000000 NaN -0.609994 -0.441784 B NaN 1.0 -0.500000 -1.000000 C -0.609994 -0.5 1.000000 -0.347928 D 0.041204 -1.0 -0.347928 1.000000