Apply pandas function to column to create multiple new columns?
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Flying Over Ancient Lands
--
Chapters
00:00 Question
01:18 Accepted answer (Score 133)
01:50 Answer 2 (Score 277)
02:25 Answer 3 (Score 197)
02:43 Answer 4 (Score 94)
03:19 Thank you
--
Full question
https://stackoverflow.com/questions/1623...
Question links:
[this]: https://stackoverflow.com/questions/7837...
[v0.11.0]: https://github.com/pandas-dev/pandas/rel...
[df.assign()]: https://pandas.pydata.org/docs/reference...
[added in v0.16]: https://pandas.pydata.org/pandas-docs/st...
Accepted answer links:
https://ys-l.github.io/posts/2015/08/28/.../
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #pandas #merge #multiplecolumns #returntype
#avk47
ANSWER 1
Score 299
I usually do this using zip:
>>> import pandas as pd
>>> df = pd.DataFrame([[i] for i in range(10)], columns=['num'])
>>> df
   num
0    0
1    1
2    2
3    3
4    4
5    5
6    6
7    7
8    8
9    9
>>> def powers(x):
...     return x, x**2, x**3, x**4, x**5, x**6
...
>>> df['p1'], df['p2'], df['p3'], df['p4'], df['p5'], df['p6'] = \
...     zip(*df['num'].map(powers))
>>> df
   num  p1  p2   p3    p4     p5      p6
0    0   0   0    0     0      0       0
1    1   1   1    1     1      1       1
2    2   2   4    8    16     32      64
3    3   3   9   27    81    243     729
4    4   4  16   64   256   1024    4096
5    5   5  25  125   625   3125   15625
6    6   6  36  216  1296   7776   46656
7    7   7  49  343  2401  16807  117649
8    8   8  64  512  4096  32768  262144
9    9   9  81  729  6561  59049  531441
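The zip pattern above needs every target column spelled out on the left-hand side. A minimal sketch of a variant that builds all new columns from the mapped tuples in one step; the shortened three-value powers function and the cols list are illustrative assumptions, not part of the original answer:

```python
import pandas as pd

def powers(x):
    # Shortened version of the answer's function (assumption: three powers only).
    return x, x**2, x**3

df = pd.DataFrame({'num': range(5)})
cols = ['p1', 'p2', 'p3']  # hypothetical names for the new columns

# map() yields one tuple per row; tolist() turns the Series into a list of
# tuples, which the DataFrame constructor spreads across the named columns.
new_cols = pd.DataFrame(df['num'].map(powers).tolist(), columns=cols, index=df.index)

# join() aligns on the index, so the new columns line up with the original rows.
df = df.join(new_cols)
```

Passing index=df.index keeps the alignment correct even if df does not have a default 0..n-1 index.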
ACCEPTED ANSWER
Score 136
Building off of user1827356's answer, you can do the assignment in one pass using df.merge:
df.merge(df.textcol.apply(lambda s: pd.Series({'feature1':s+1, 'feature2':s-1})),
         left_index=True, right_index=True)
    textcol  feature1  feature2
0  0.772692  1.772692 -0.227308
1  0.857210  1.857210 -0.142790
2  0.065639  1.065639 -0.934361
3  0.819160  1.819160 -0.180840
4  0.088212  1.088212 -0.911788
EDIT: Be aware that this approach is slow and uses a lot of memory, since the lambda constructs a new pd.Series for every row. See https://ys-l.github.io/posts/2015/08/28/how-not-to-use-pandas-apply/
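When the per-row function can be written as plain column arithmetic, as with s+1 and s-1 here, the per-row pd.Series can be avoided entirely. A minimal sketch using df.assign() (mentioned in the question links); the sample values are made up, and this only applies when the computation vectorizes, not to arbitrary Python functions:

```python
import pandas as pd

df = pd.DataFrame({'textcol': [0.5, 1.5, 2.5]})  # made-up sample values

# Arithmetic runs on the whole column at once; no per-row pd.Series is built.
# assign() returns a new DataFrame and leaves df itself unchanged.
out = df.assign(feature1=df['textcol'] + 1, feature2=df['textcol'] - 1)
```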
ANSWER 3
Score 95
This is what I've done in the past:
import numpy as np
import pandas as pd
df = pd.DataFrame({'textcol' : np.random.rand(5)})
df
    textcol
0  0.626524
1  0.119967
2  0.803650
3  0.100880
4  0.017859
df.textcol.apply(lambda s: pd.Series({'feature1':s+1, 'feature2':s-1}))
   feature1  feature2
0  1.626524 -0.373476
1  1.119967 -0.880033
2  1.803650 -0.196350
3  1.100880 -0.899120
4  1.017859 -0.982141
Edit, for completeness: to attach the new columns to the original frame, use pd.concat:
pd.concat([df, df.textcol.apply(lambda s: pd.Series({'feature1':s+1, 'feature2':s-1}))], axis=1)
    textcol  feature1  feature2
0  0.626524  1.626524 -0.373476
1  0.119967  1.119967 -0.880033
2  0.803650  1.803650 -0.196350
3  0.100880  1.100880 -0.899120
4  0.017859  1.017859 -0.982141
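A related sketch, assuming pandas 0.23 or later, where DataFrame.apply accepts result_type='expand': the function returns a plain tuple and pandas spreads it across columns, so no pd.Series has to be built by hand for each row. The function name features and the sample values are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'textcol': [0.5, 1.5]})  # made-up sample values

def features(row):
    # Return a plain tuple instead of constructing a pd.Series.
    return row['textcol'] + 1, row['textcol'] - 1

# result_type='expand' turns each returned tuple into one row of a DataFrame.
feats = df.apply(features, axis=1, result_type='expand')
feats.columns = ['feature1', 'feature2']  # expanded columns are numbered 0, 1 by default

result = pd.concat([df, feats], axis=1)
```

This is still a row-by-row apply, so it is a convenience rather than a speed-up.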
ANSWER 4
Score 90
This is the correct and easiest way to accomplish this for 95% of use cases:
>>> df = pd.DataFrame(zip(*[range(5)]), columns=['num'])
>>> df
   num
0    0
1    1
2    2
3    3
4    4
>>> def example(x):
...     x['p1'] = x['num']**2
...     x['p2'] = x['num']**3
...     x['p3'] = x['num']**4
...     return x
...
>>> df = df.apply(example, axis=1)
>>> df
   num  p1  p2   p3
0    0   0   0    0
1    1   1   1    1
2    2   4   8   16
3    3   9  27   81
4    4  16  64  256
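For this particular example the row-wise apply is not strictly needed, because each new column is simple arithmetic on num. A minimal sketch of the same result computed column by column; the exponents mapping is an illustrative assumption, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'num': range(5)})

# New column name -> exponent, mirroring p1..p3 in the answer above.
exponents = {'p1': 2, 'p2': 3, 'p3': 4}

for name, exp in exponents.items():
    # Each assignment operates on the whole column at once.
    df[name] = df['num'] ** exp
```

The apply(..., axis=1) form remains useful when the per-row logic cannot be expressed as column operations.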