The Python Oracle

How to identify certain rows within a specified range of columns?

Full question
https://stackoverflow.com/questions/4895...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas



ACCEPTED ANSWER

Score 6


I forgot to add .any().

The code below works.

df['HemoFacB'] = np.where(df[cols].isin(codes), 1, 0).any(axis=1)

The error suggests that I was trying to assign a result with too many items (one per column in cols, 6 in total) to a single column. .any() returns True if any of the values it reduces over is truthy, and False if all of them are falsy. Applying it along axis 1 (across the columns) at the end of the expression therefore collapses the 6 per-column results for each row into a single True/False value per row.
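To make the reduction concrete, here is a minimal sketch of the same idea. The sample data, the two-column cols list, and the codes list are hypothetical stand-ins, since the original question's data is not shown here. It also applies .any(axis=1) to the boolean DataFrame before np.where, which is one way to get 1/0 values instead of True/False.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real columns and codes come from the question.
df = pd.DataFrame({
    'KEY': ['1312', '1345', '5555'],
    'Month1': ['A', 'J', 'C'],
    'Month2': ['J', 'B', 'C'],
})
cols = ['Month1', 'Month2']   # assumed range of columns to search
codes = ['J', 'K']            # assumed codes to look for

# isin() returns a boolean DataFrame with the same shape as df[cols].
mask = df[cols].isin(codes)

# any(axis=1) reduces across the columns, leaving one boolean per row.
row_hit = mask.any(axis=1)

# Convert the per-row booleans to 1/0 and store them in a single column.
df['HemoFacB'] = np.where(row_hit, 1, 0)
print(df)
```

Without the .any(axis=1) step, the right-hand side would still have one column per entry in cols, which cannot be assigned to a single column.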




ANSWER 2

Score 1


Here's a solution that doesn't use numpy. I didn't include all of the fields, but the idea carries over to the full data. I also build the DataFrame last, after manipulating the dictionary, because I find that much easier.

import pandas as pd

# Column-oriented data: each dictionary value is one column.
mydict = {'KEY': ['1312', '1345', '5555', '5555', '5555'], 'Month1': [1, 'J', 3, 4, 'J']}

# zip(*values) yields one tuple per row; flag the rows that contain the code 'J'.
truth_list = []
for val in zip(*mydict.values()):
    if 'J' in val:
        truth_list.append(True)
    else:
        truth_list.append(False)

# Add the flags as a new column, then build the DataFrame last.
mydict.update({'HemoFacB': truth_list})

df = pd.DataFrame(mydict)
print(df)