Split string based on regex
--
Music by Eric Matyas
https://www.soundimage.org
Track title: Secret Catacombs
--
Chapters
00:00 Question
00:38 Accepted answer (Score 170)
00:52 Answer 2 (Score 70)
01:33 Answer 3 (Score 1)
01:46 Thank you
--
Full question
https://stackoverflow.com/questions/1320...
Accepted answer links:
[this demo]: http://ideone.com/qoaTqr
--
Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...
--
Tags
#python #regex #split
#avk47
ACCEPTED ANSWER
Score 170
ANSWER 2
Score 70
You could use a lookahead:
re.split(r'[ ](?=[A-Z]+\b)', input)
This splits at every space that is followed by a run of upper-case letters ending at a word boundary, i.e. by a word written entirely in capitals.
Note that the square brackets around the space are only for readability and can be omitted.
If it is enough that the first letter of a word is upper case (that is, if you also want to split in front of Hello), it gets even simpler:
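A minimal sketch of the lookahead split described above. The sample sentence and the variable name text are made up for illustration (the question's actual input is not shown here); text is used instead of input so the built-in function of that name is not shadowed.

```python
import re

# Hypothetical sample text, not taken from the original question.
text = "FOO is great BAR BAZ and Hello world"

# Split at each space that is followed by an all-uppercase word.
# The lookahead (?=...) only checks what follows; it consumes nothing,
# so the uppercase word itself stays in the result.
parts = re.split(r' (?=[A-Z]+\b)', text)
print(parts)

# The bracketed form of the space behaves identically.
same = re.split(r'[ ](?=[A-Z]+\b)', text)
print(parts == same)
```

Hello does not trigger a split here, because after the H there is no word boundary. A one-letter capital word such as I would trigger one, since [A-Z]+ accepts a single letter.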
re.split(r'[ ](?=[A-Z])', input)
This splits at every space that is followed by any upper-case letter.
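As a rough illustration of the simpler pattern, here is a sketch on a made-up sentence (the sample text and the name text are assumptions, not part of the original answer):

```python
import re

# Hypothetical sample text, not taken from the original question.
text = "FOO is great BAR and Hello world"

# Split at each space that is followed by any upper-case letter,
# so capitalized words such as Hello now start a new piece too.
parts = re.split(r' (?=[A-Z])', text)
print(parts)
```

Compared with the stricter [A-Z]+\b lookahead, this variant also splits in front of Hello, which may or may not be wanted depending on the data.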
ANSWER 3
Score 1
Your question contains the string literal "\b[A-Z]{2,}\b",
but without the r prefix Python interprets \b as a backspace character, not as a word boundary.
Use a raw string instead: r"\b[A-Z]{2,}\b".
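The difference between the two literals can be checked directly. This is a small sketch; the sample sentence and variable names are invented for the demonstration.

```python
import re

plain = "\b[A-Z]{2,}\b"   # here \b is the backspace character (U+0008)
raw = r"\b[A-Z]{2,}\b"    # here \b reaches the regex engine as a word boundary

# The plain literal is shorter because each \b collapsed into one character.
print(len(plain), len(raw))
print(plain[0] == "\x08")

# Hypothetical sample text.
sample = "say HELLO to NASA"
plain_matches = re.findall(plain, sample)  # looks for literal backspaces
raw_matches = re.findall(raw, sample)      # looks for all-caps words
print(plain_matches, raw_matches)
```

With the plain literal, the pattern can only match text that actually contains backspace characters, so ordinary input yields no matches.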