Replace text in one column using the dictionary in another column

巴苏杰夫

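I have text in one column and a corresponding dictionary in another column. I have tokenized the text and want to replace every token that matches a key in the corresponding dictionary. The text and the dictionary are specific to each record of the pandas DataFrame.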

import pandas as pd

data =[['1','i love mangoes',{'love':'hate'}],['2', 'its been a long time we have not met',{'met':'meet'}],['3','i got a call from one of our friends',{'call':'phone call','one':'couple of'}]]

df = pd.DataFrame(data, columns = ['id', 'text','dictionary']) 

The final output DataFrame should be

data =[['1','i hate mangoes'],['2', 'its been a long time we have not meet'],['3','i got a phone call from couple of of our friends']]
df = pd.DataFrame(data, columns =['id', 'modified_text'])

I am using Python 3 on a Windows machine.

贺拉斯

I padded the keys and values with spaces to distinguish whole words from partial words:

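def replace(text, mapping):
    # Pad each key and value with spaces so that only whole words match.
    # A word at the very start or end of the text has no surrounding
    # space on that side, so it is left unchanged.
    new_s = text
    for key in mapping:
        k = ' '+key+' '
        val = ' '+mapping[key]+' '
        new_s = new_s.replace(k, val)
    return new_s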

df_out = (df.assign(modified_text=lambda f: 
                    f.apply(lambda row: replace(row.text, row.dictionary), axis=1))
          [['id', 'modified_text']])

print(df_out)
  id                                     modified_text
0  1                                    i hate mangoes
1  2              its been a long time we have not met
2  3  i got a phone call from couple of of our friends
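
In the output above, row 2 still reads "met": the padded key ' met ' needs a space on both sides, and the word sits at the very end of the text. One possible alternative, not part of the original answer, is to match whole words with regular-expression word boundaries and do all replacements in a single pass. The helper name `replace_tokens` and the plain `records` list below are hypothetical and used only for illustration. The sketch assumes every key starts and ends with a letter, digit, or underscore, since `\b` only marks a boundary next to such characters.

```python
import re

def replace_tokens(text, mapping):
    """Replace whole-word occurrences of the mapping's keys in one pass."""
    if not mapping:
        return text
    # Try longer keys first so a longer key wins over a shorter prefix.
    keys = sorted(mapping, key=len, reverse=True)
    # \b restricts matches to whole words, including at the start or end of the text.
    pattern = re.compile(r'\b(?:' + '|'.join(re.escape(k) for k in keys) + r')\b')
    # Each match is looked up once, so replaced text is never replaced again.
    return pattern.sub(lambda m: mapping[m.group(0)], text)

# The same records as in the question, as plain tuples (id, text, dictionary).
records = [
    ('1', 'i love mangoes', {'love': 'hate'}),
    ('2', 'its been a long time we have not met', {'met': 'meet'}),
    ('3', 'i got a call from one of our friends',
     {'call': 'phone call', 'one': 'couple of'}),
]

results = [(rec_id, replace_tokens(text, mapping))
           for rec_id, text, mapping in records]
for row in results:
    print(row)
```

With the DataFrame from the question, the same function can be applied row by row, for example `df['modified_text'] = [replace_tokens(t, d) for t, d in zip(df['text'], df['dictionary'])]`, followed by `df[['id', 'modified_text']]`.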
