python - how to compute correlation-matrix with nans in data-matrix
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Cool Puzzler LoFi
--
Chapters
00:00 Python - How To Compute Correlation-Matrix With Nans In Data-Matrix
01:46 Accepted Answer Score 14
02:27 Thank you
--
Full question
https://stackoverflow.com/questions/2710...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #numpy #scipy #correlation
#avk47
ACCEPTED ANSWER
Score 14
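I think the method you are looking for is corr() from pandas. It computes the pairwise correlation of the columns and excludes NaN values, so each coefficient uses only the rows where both columns have data. For example, take the DataFrame below. You can also refer to the related question "How to efficiently get the correlation matrix (with p-values) of a data frame with NaN values?"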
import pandas as pd
df = pd.DataFrame({'A': [2, None, 1, -4, None, None, 3],
                   'B': [None, 1, None, None, 1, 3, None],
                   'C': [2, 1, None, 2, 2.1, 1, 0],
                   'D': [-2, 1.1, 3.2, 2, None, 1, None]})
df
     A    B    C    D
0    2  NaN    2   -2
1  NaN    1    1  1.1
2    1  NaN  NaN  3.2
3   -4  NaN    2    2
4  NaN    1  2.1  NaN
5  NaN    3    1    1
6    3  NaN    0  NaN
rho = df.corr()
rho
          A    B         C         D
A  1.000000  NaN -0.609994 -0.441784
B       NaN  1.0 -0.500000 -1.000000
C -0.609994 -0.5  1.000000 -0.347928
D -0.441784 -1.0 -0.347928  1.000000
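To make the NaN handling concrete, here is a minimal sketch of the same pairwise-deletion idea written with NumPy only. The names pairwise_corr and rho_np are made up for this illustration and are not part of pandas or NumPy, and the loop is meant to show the logic, not to be a fast implementation. A and B never have a value in the same row, which is why that entry stays NaN.

```python
import numpy as np
import pandas as pd


def pairwise_corr(data):
    """Pearson correlation between columns, using for each pair
    only the rows where both columns are non-NaN.
    (Hypothetical helper, written for illustration.)"""
    data = np.asarray(data, dtype=float)
    n_cols = data.shape[1]
    rho = np.full((n_cols, n_cols), np.nan)
    for i in range(n_cols):
        for j in range(i, n_cols):
            # Rows where both column i and column j have a value
            both = ~np.isnan(data[:, i]) & ~np.isnan(data[:, j])
            if both.sum() < 2:
                continue  # too few overlapping rows: leave NaN
            x, y = data[both, i], data[both, j]
            if x.std() == 0 or y.std() == 0:
                continue  # constant column: correlation undefined
            rho[i, j] = rho[j, i] = np.corrcoef(x, y)[0, 1]
    return rho


df = pd.DataFrame({'A': [2, None, 1, -4, None, None, 3],
                   'B': [None, 1, None, None, 1, 3, None],
                   'C': [2, 1, None, 2, 2.1, 1, 0],
                   'D': [-2, 1.1, 3.2, 2, None, 1, None]})

rho_np = pairwise_corr(df.to_numpy())
print(np.round(rho_np, 6))
```

On this DataFrame the result should agree with df.corr(). Note that the B/D entry of -1.0 rests on only two overlapping rows, so coefficients built from very few pairs deserve caution. corr() accepts a min_periods argument to require a minimum number of observations per pair.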