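I have text in one column and a corresponding dictionary in another column. I have tokenized the text and want to replace every token that matches a key in the corresponding dictionary. The text and the dictionary are specific to each record of the pandas DataFrame.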
import pandas as pd
data =[['1','i love mangoes',{'love':'hate'}],['2', 'its been a long time we have not met',{'met':'meet'}],['3','i got a call from one of our friends',{'call':'phone call','one':'couple of'}]]
df = pd.DataFrame(data, columns = ['id', 'text','dictionary'])
The final output DataFrame should be:
data = [['1', 'i hate mangoes'], ['2', 'its been a long time we have not meet'], ['3', 'i got a phone call from couple of of our friends']]
df = pd.DataFrame(data, columns=['id', 'modified_text'])
I am using Python 3 on a Windows machine.
I padded the keys and values with spaces so that only whole words are matched, not parts of words:
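def replace(text, mapping):
    # Replace every dictionary key that occurs as a space-delimited word in text.
    new_s = text
    for key in mapping:
        # Pad with spaces so that 'one' does not match inside 'phone'.
        k = ' ' + key + ' '
        val = ' ' + mapping[key] + ' '
        new_s = new_s.replace(k, val)
    return new_s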
df_out = (
    df.assign(modified_text=lambda f: f.apply(
        lambda row: replace(row.text, row.dictionary), axis=1))
    [['id', 'modified_text']]
)
print(df_out)
   id                                     modified_text
0   1                                    i hate mangoes
1   2              its been a long time we have not met
2   3  i got a phone call from couple of of our friends
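Note that row 2 in the output above still reads "met": the padded key ' met ' never matches a word at the very start or end of the text, so the result differs from the expected "meet". One possible alternative, offered as a minimal sketch and not as the method of the answer above, is to match on regex word boundaries (\b) instead of literal spaces. The names replace_words and records are hypothetical, and the sketch assumes every key consists only of word characters (letters, digits, underscore), since \b behaves differently next to punctuation.

```python
import re

def replace_words(text, mapping):
    """Replace whole words in text that appear as keys in mapping."""
    if not mapping:
        return text
    # Try longer keys first so a longer key wins over a shorter key it starts with.
    keys = sorted(mapping, key=len, reverse=True)
    # \b anchors each match at a word boundary, including the start and end of text.
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b')
    # One pass over the text: inserted replacements are not scanned again.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# Hypothetical stand-in for the 'text' and 'dictionary' columns of the question.
records = [
    ('i love mangoes', {'love': 'hate'}),
    ('its been a long time we have not met', {'met': 'meet'}),
    ('i got a call from one of our friends', {'call': 'phone call', 'one': 'couple of'}),
]
modified = [replace_words(text, mapping) for text, mapping in records]
for line in modified:
    print(line)
# i hate mangoes
# its been a long time we have not meet
# i got a phone call from couple of of our friends
```

With the DataFrame from the question, the same function could be applied row by row, for example with df['modified_text'] = [replace_words(t, d) for t, d in zip(df['text'], df['dictionary'])].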