The Python Oracle

Detect allusions (e.g. very fuzzy matches) in the language of inaugural addresses

Full question
https://stackoverflow.com/questions/1449...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #text #nlp #nltk

ACCEPTED ANSWER

Score 2


If you want to use bigrams, you could loosen the definition a little by allowing a gap of one, two, or even three words between the two members of each pair. This should be affordable: allowing gaps of up to n words produces fewer than n + 1 times as many "bigrams" as the ordinary adjacent-word kind, and your corpus is pretty small. With gaps allowed, a "bigram" from your first paragraph could be, for example, (similar, inaugurals).
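
The gapped-bigram idea could be sketched roughly as below. This is only an illustration, not code from the answer: the function names gapped_bigrams and shared_bigrams, the max_gap parameter, and the lowercase-and-split tokenization are all assumptions, and a real comparison of speeches would likely want a proper tokenizer and some stop-word filtering.

```python
from collections import Counter


def gapped_bigrams(tokens, max_gap=3):
    """Yield (first, second) pairs with at most max_gap words skipped between them.

    max_gap=0 gives ordinary adjacent-word bigrams.
    """
    for i, first in enumerate(tokens):
        # The partner word may sit 1 .. max_gap + 1 positions to the right.
        for j in range(i + 1, min(i + max_gap + 2, len(tokens))):
            yield (first, tokens[j])


def shared_bigrams(text_a, text_b, max_gap=3):
    """Count gapped bigrams that occur in both texts (hypothetical helper).

    Tokenization is a naive lowercase whitespace split, which is an assumption.
    """
    a = Counter(gapped_bigrams(text_a.lower().split(), max_gap))
    b = Counter(gapped_bigrams(text_b.lower().split(), max_gap))
    return a & b  # multiset intersection keeps the smaller count of each pair


if __name__ == "__main__":
    # Invented example sentences, not taken from any inaugural address.
    common = shared_bigrams(
        "Similar phrases appear in many inaugurals",
        "Similar inaugurals borrow phrases",
    )
    for pair, count in common.items():
        print(pair, count)
```

Pairs that two speeches share under this looser definition are only candidates for allusions; they would still need to be ranked or inspected by hand.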