How to conditionally copy a substring into a new column of a pandas dataframe?
--------------------------------------------------
Rise to the top 3% as a developer or hire one of them at Toptal: https://topt.al/25cXVn
--------------------------------------------------
Music by Eric Matyas
https://www.soundimage.org
Track title: Magical Minnie Puzzles
--
Chapters
00:00 How To Conditionally Copy A Substring Into A New Column Of A Pandas Dataframe?
01:21 Accepted Answer Score 3
01:58 Thank you
--
Full question
https://stackoverflow.com/questions/4585...
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #string #pandas #dataframe #substring
#avk47
ACCEPTED ANSWER
Score 3
You can use pd.Series.str.extract:
In [737]: df
Out[737]:
         A                                   B
0    VALID       asdfafX'XextractthisY'Yeaaadf
1  INVALID         secondrowX'XsubtextY'Yelakj
2    VALID  secondrowX'XextractthistooY'Yelakj
In [745]: df['C'] = df[df.A == 'VALID'].B.str.extract("(?<=X'X)(.*?)(?=Y'Y)", expand=False)
In [746]: df
Out[746]:
         A                                   B               C
0    VALID       asdfafX'XextractthisY'Yeaaadf     extractthis
1  INVALID         secondrowX'XsubtextY'Yelakj             NaN
2    VALID  secondrowX'XextractthistooY'Yelakj  extractthistoo
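For a version that runs outside the IPython session, here is a minimal sketch that rebuilds the sample frame from the output above. The names `pattern` and `mask` are illustrative and not part of the original answer. Selecting with `df.loc[mask, "B"]` is equivalent to `df[df.A == 'VALID'].B`; on assignment pandas aligns on the index, so rows excluded by the mask end up as NaN.

```python
import pandas as pd

# Sample data copied from the session above.
df = pd.DataFrame({
    "A": ["VALID", "INVALID", "VALID"],
    "B": [
        "asdfafX'XextractthisY'Yeaaadf",
        "secondrowX'XsubtextY'Yelakj",
        "secondrowX'XextractthistooY'Yelakj",
    ],
})

# Capture the text between the X'X and Y'Y delimiters.
pattern = r"(?<=X'X)(.*?)(?=Y'Y)"

# Only rows marked VALID are passed to str.extract.
mask = df["A"] == "VALID"

# expand=False returns a Series (one capture group), which is
# aligned on the index when assigned; unmatched rows become NaN.
df["C"] = df.loc[mask, "B"].str.extract(pattern, expand=False)

print(df)
```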
The regex pattern is:
(?<=X'X)(.*?)(?=Y'Y)
(?<=X'X) is a lookbehind for X'X
(.*?) matches everything between the lookbehind and lookahead
(?=Y'Y) is a lookahead for Y'Y
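To see what the lookarounds contribute, the following sketch uses the standard `re` module on the first sample string and compares the pattern with a plain-delimiter variant, `X'X(.*?)Y'Y`. That variant is an illustration and not part of the original answer. The capture group is the same in both cases. Only the overall match differs, and since `str.extract` returns capture groups, either form should give the same column here.

```python
import re

text = "asdfafX'XextractthisY'Yeaaadf"

# Pattern from the answer: delimiters are asserted but not consumed.
with_lookarounds = re.search(r"(?<=X'X)(.*?)(?=Y'Y)", text)

# Illustrative variant: delimiters are consumed as part of the match.
plain_delimiters = re.search(r"X'X(.*?)Y'Y", text)

# Group 1 (the capture group) is identical for both patterns.
captured_lookaround = with_lookarounds.group(1)   # 'extractthis'
captured_plain = plain_delimiters.group(1)        # 'extractthis'

# Group 0 (the whole match) differs: lookarounds exclude the delimiters.
whole_lookaround = with_lookarounds.group(0)      # 'extractthis'
whole_plain = plain_delimiters.group(0)           # "X'XextractthisY'Y"

print(captured_lookaround, captured_plain)
print(whole_lookaround, whole_plain)
```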