I have two dataframes as follows,
import pandas as pd
df = pd.DataFrame({'text':['I go to school','open the green door', 'go out and play'],
'pos':[['PRON','VERB','ADP','NOUN'],['VERB','DET','ADJ','NOUN'],['VERB','ADP','CCONJ','VERB']]})
df2 = pd.DataFrame({'verbs':['go','open','close','share','divide'],
'new_verbs':['went','opened','closed','shared','divided']})
I would like to replace the verbs in df.text with their past form in df2.new_verbs if the verbs are found in df2.verbs. and so far I have done the following,
df['text'] = df['text'].str.split() new_df = df.apply(pd.Series.explode) new_df = new_df.assign(new=lambda d: d['pos'].mask(d['pos'] == 'VERB', d['text'])) new_df.text[new_df.new.isin(df2.verbs)] = df2.new_verbs
but when I print out the result, not all verbs are correctly replaced. My desired output would be,
text pos new 0 I PRON PRON 0 went VERB go 0 to ADP ADP 0 school NOUN NOUN 1 opened VERB open 1 the DET DET 1 green ADJ ADJ 1 door NOUN NOUN 2 went VERB go 2 out ADP ADP 2 and CCONJ CCONJ 2 play VERB play
Answers:
Thank you for visiting the Q&A section on Magenaut. Please note that all the answers may not help you solve the issue immediately. So please treat them as advisements. If you found the post helpful (or not), leave a comment & I’ll get back to you as soon as possible.
Method 1
You can use a regex for that:
import re
regex = '|'.join(map(re.escape, df2['verbs']))
s = df2.set_index('verbs')['new_verbs']
df['text'] = df['text'].str.replace(regex, lambda m: s.get(m.group(), m),
regex=True)
output (here as column text2 for clarity):
text pos text2 0 I go to school [PRON, VERB, ADP, NOUN] I went to school 1 open the green door [VERB, DET, ADJ, NOUN] opened the green door 2 go out and play [VERB, ADP, CCONJ, VERB] went out and play
Method 2
For smaller lists, you can use pandas replace and a dictionary like this:
verbs_map = dict(zip(df2.verbs, df2.new_verbs))
new_df.text.replace(verbs_map)
Basically, dict(zip(df2.verbs, df2.new_verbs) creates a new dictionary that maps old verbs to their new (past tense) verbs, e.g. {'go' : 'went' , 'close' : 'closed', ...}.
All methods was sourced from stackoverflow.com or stackexchange.com, is licensed under cc by-sa 2.5, cc by-sa 3.0 and cc by-sa 4.0