The Python Oracle

Preserve NaN values in pandas boolean comparisons

--

Chapters
00:00 Preserve NaN Values In Pandas Boolean Comparisons
01:03 Accepted Answer Score 9
01:21 Answer 2 Score 7
01:59 Thank you

--

Full question
https://stackoverflow.com/questions/4477...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas #boolean #missingdata



ACCEPTED ANSWER

Score 9


Let's use np.logical_and:

import numpy as np
import pandas as pd
df = pd.DataFrame({'A':[True, True, False, True, np.nan, np.nan], 
                   'B':[True, False, True, np.nan, np.nan, False]})

s = np.logical_and(df['A'],df['B'])
print(s)

Output:

0     True
1    False
2    False
3      NaN
4      NaN
5    False
Name: A, dtype: object
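
One caveat worth checking before relying on this: on object-dtype data, np.logical_and appears to follow Python's `and` semantics (return the left operand if it is falsy, otherwise the right one), and NaN is truthy. The sketch below, which uses made-up Series named `left` and `right` that are not part of the original answer, illustrates that NaN survives only when it ends up as the returned operand:

```python
import numpy as np
import pandas as pd

# Hypothetical inputs chosen to exercise the cases missing from the sample data.
left = pd.Series([np.nan, np.nan, True], dtype=object)
right = pd.Series([True, False, np.nan], dtype=object)

# Element-wise this behaves like `l and r` in plain Python:
#   nan and True   -> True   (NaN is truthy, so the right operand is returned)
#   nan and False  -> False
#   True and nan   -> nan
result = np.logical_and(left, right)
print(result.tolist())
```

So if a NaN on the left paired with True on the right should stay missing, this approach may not do what you want; the sample data above happens not to contain that combination.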



ANSWER 2

Score 7


pandas >= 1.0

pandas supports this operation directly, provided you use the nullable boolean dtype `boolean` (not to be confused with the traditional NumPy `bool` dtype, which cannot hold missing values).

# Setup
import numpy as np
import pandas as pd

df = pd.DataFrame({'A':[True, True, False, True, np.nan, np.nan], 
                   'B':[True, False, True, np.nan, np.nan, False]})

df.dtypes                                                                  

A    object
B    object
dtype: object

# A little shortcut to convert the data type to `boolean`
df2 = df.convert_dtypes()
df2.dtypes

A    boolean
B    boolean
dtype: object

df2['A'] & df2['B']                                                        

0     True
1    False
2    False
3     <NA>
4     <NA>
5    False
dtype: boolean
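
The result above follows three-valued (Kleene) logic: a missing value propagates only when the other operand cannot decide the answer. As a minimal sketch, assuming you build the columns directly with dtype="boolean" rather than converting them (the names `a`, `b`, `both`, and `either` are illustrative, not from the original answer):

```python
import pandas as pd

# Same values as columns A and B, created directly as nullable booleans.
a = pd.Series([True, True, False, True, pd.NA, pd.NA], dtype="boolean")
b = pd.Series([True, False, True, pd.NA, pd.NA, False], dtype="boolean")

# AND: False decides the result on its own, so <NA> & False is False,
# while True & <NA> stays <NA>.
both = a & b

# OR: True decides the result on its own, so True | <NA> is True,
# while <NA> | False stays <NA>.
either = a | b

print(both)
print(either)
```

Creating the data as `boolean` up front avoids the intermediate object columns, which may be preferable when you control how the DataFrame is built.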

In conclusion, please consider upgrading to pandas 1.0 or later :-)