How to identify certain rows within a specified range of columns?
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Ominous Technology Looping
--
Chapters
00:00 How To Identify Certain Rows Within A Specified Range Of Columns?
01:11 Accepted Answer Score 6
01:47 Answer 2 Score 1
02:16 Thank you
--
Full question
https://stackoverflow.com/questions/4895...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas
#avk47
ACCEPTED ANSWER
Score 6
I forgot to add .any().
The code below works.
df['HemoFacB'] = np.where(df[cols].isin(codes),1,0).any(1)
The error suggests that I was trying to assign six columns of results to a single column. .any() returns True if any element along the given axis is True, and False if all of them are False. Adding .any(1) to the end (that is, axis=1) therefore collapses the six per-column results in each row into one value per row.
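For illustration, here is a minimal sketch of the same row-wise reduction done in pandas alone. The column names, values, and codes below are placeholders I made up, since the original data is not shown. DataFrame.isin() already returns booleans, so the np.where step can be skipped under that assumption.

```python
import pandas as pd

# Hypothetical data: these column names and codes are placeholders,
# not the ones from the original question.
df = pd.DataFrame({
    "KEY": ["1312", "1345", "5555"],
    "Month1": ["A", "J", "B"],
    "Month2": ["B", "C", "J"],
})
cols = ["Month1", "Month2"]   # the range of columns to search
codes = ["J"]                 # the values that flag a row

# isin() gives a boolean DataFrame with one column per entry in cols.
mask = df[cols].isin(codes)

# any(axis=1) reduces each row to a single boolean.
df["HemoFacB"] = mask.any(axis=1)

# If 1/0 is preferred over True/False, cast the result to int.
df["HemoFacB_int"] = mask.any(axis=1).astype(int)

print(df)
```

With this sample data, only the first row contains none of the codes, so it is the only row flagged False.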
ANSWER 2
Score 1
Here's a solution that doesn't use numpy. I didn't use all of the fields, but the idea carries over. I also build the DataFrame last, after manipulating the dictionary, because I find that much easier.
import pandas as pd

# Each key is a column name; each list holds that column's values.
mydict = {'KEY': ['1312', '1345', '5555', '5555', '5555'], 'Month1': [1, 'J', 3, 4, 'J']}

truth_list = []
# zip(*mydict.values()) yields one tuple per row.
for val in zip(*mydict.values()):
    # Flag the row if any of its values is 'J'.
    if 'J' in val:
        truth_list.append(True)
    else:
        truth_list.append(False)

mydict.update({'HemoFacB': truth_list})
df = pd.DataFrame(mydict)
print(df)