The Python Oracle

Reading data from a data frame based on conditions in a different data frame

Chapters
00:00 Question
01:08 Accepted answer (Score 1)
01:24 Answer 2 (Score 2)
02:02 Thank you

--

Full question
https://stackoverflow.com/questions/5063...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #pandas




ANSWER 1

Score 2


Set no as the index for words and then iterate over sentences using a list comprehension:

v = words.set_index('no')['word']
sentences = [
    ' '.join(v.loc[i:j]) for i, j in zip(sentences['start'], sentences['stop'])
]

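Or, independent of the index (plain positional slicing, assuming no counts from 1):

v = words['word'].tolist()
sentences = [
    # list slices exclude the end, so [i - 1:j] matches the inclusive v.loc[i:j] above
    ' '.join(v[i - 1:j]) for i, j in zip(sentences['start'], sentences['stop'])
]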

['cat in hat', 'the dog', 'in love ! <3']
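
To make the slicing concrete, here is a minimal sketch comparing the label-based and positional variants. The sample frames are made up to reproduce the output shown above; they are an assumption, not the data from the question. It assumes no is 1-based and that start/stop are both inclusive.

```python
import pandas as pd

# Hypothetical sample data (an assumption; the question's frames are not shown here).
words = pd.DataFrame({
    'no': range(1, 10),
    'word': ['cat', 'in', 'hat', 'the', 'dog', 'in', 'love', '!', '<3'],
})
sentences = pd.DataFrame({'start': [1, 4, 6], 'stop': [3, 5, 9]})

# Label-based: .loc slices include BOTH endpoints.
by_label = words.set_index('no')['word']
joined_loc = [
    ' '.join(by_label.loc[i:j])
    for i, j in zip(sentences['start'], sentences['stop'])
]

# Position-based: list slices exclude the end, so a 1-based inclusive
# [start, stop] range becomes [start - 1:stop].
as_list = words['word'].tolist()
joined_list = [
    ' '.join(as_list[i - 1:j])
    for i, j in zip(sentences['start'], sentences['stop'])
]

print(joined_loc)
print(joined_list == joined_loc)
```

Under these assumptions the two lists are identical. If stop were exclusive in the real data, both slices would need to end one position earlier.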

Saving to a file should be straightforward from here:

with open('file.txt', 'w') as f:
    for sent in sentences:
        f.write(sent + '\n')
        f.write('***\n')



ACCEPTED ANSWER

Score 1


One way to solve this:

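import pandas as pd
res = pd.DataFrame()
# iloc is 0-based and end-exclusive, hence start - 1 and stop
res['s'] = sentences.apply(lambda x: ' '.join(words.iloc[x['start'] - 1:x['stop']]['word']), axis=1)
# the keyword is lineterminator in pandas >= 1.5 (line_terminator was removed in 2.0)
res.to_csv('a.txt', index=False, header=False, lineterminator='\n***\n')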
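
Since the spelling of the to_csv line-ending keyword depends on the pandas version, a version-independent alternative is to write the file by hand. This is only a sketch: the helper name write_with_separator, the sample list, and the temporary output path are all hypothetical.

```python
import os
import tempfile


def write_with_separator(sentences, path, sep='***'):
    """Write each sentence on its own line, followed by a separator line."""
    with open(path, 'w', encoding='utf-8') as f:
        for sent in sentences:
            f.write(f'{sent}\n{sep}\n')


# Made-up input, for illustration only.
sample = ['cat in hat', 'the dog']
out_path = os.path.join(tempfile.mkdtemp(), 'a.txt')
write_with_separator(sample, out_path)

# Read the file back to inspect what was written.
with open(out_path, encoding='utf-8') as f:
    content = f.read()
print(content)
```

Unlike to_csv, this writes the strings verbatim, so sentences containing commas or quotes are not wrapped in CSV quoting.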