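I created a DataFrame df with these names. I am trying to extract a substring from a column using set operations, but I cannot extract names made up of multiple words (multi-word strings); only single-word names are extracted. Please check the output I get against the output I expect, and suggest a working solution.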
import pandas as pd
import numpy as np

df = pd.DataFrame({"Names": ["This is Santhosh", "This is Sneha Alphonse Shaji", "This is Vikram Karthi"]})
df

Name_set = {'Santhosh', 'Sneha Alphonse Shaji', 'Vikram Karthi'}

def sub(x):
    # Split the row into single words, then intersect with the known names.
    # Only names that are exactly one word can survive this intersection.
    df_words = set(x.split(' '))
    extract_words = Name_set.intersection(df_words)
    return ' '.join(extract_words)

df['Extracted Names'] = df.Names.apply(sub)
df
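To see why the attempt above only recovers single-word names, it may help to run the intersection by hand. This is a minimal sketch using the same names as the question; the helper name split_and_intersect is made up for illustration.

```python
name_set = {'Santhosh', 'Sneha Alphonse Shaji', 'Vikram Karthi'}

def split_and_intersect(text):
    # Splitting on spaces produces single-word tokens only, e.g.
    # {'This', 'is', 'Sneha', 'Alphonse', 'Shaji'}.
    tokens = set(text.split(' '))
    # A multi-word name such as 'Sneha Alphonse Shaji' is never one of
    # those tokens, so it cannot appear in the intersection.
    return name_set.intersection(tokens)

single = split_and_intersect("This is Santhosh")
multi = split_and_intersect("This is Sneha Alphonse Shaji")

print(single)  # {'Santhosh'}
print(multi)   # set()
```

The set intersection compares whole elements, so it can only match a name that equals one of the split words exactly.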
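Answer: test each name as a substring of the row's text instead of intersecting it with the split words.

import pandas as pd
import numpy as np

df = pd.DataFrame({"Names": ["This is Santhosh", "This is Sneha Alphonse Shaji", "This is Vikram Karthi"]})
df

Name_set = ['Santhosh', 'Sneha Alphonse Shaji', 'Vikram Karthi']

def sub(x):
    # Keep every known name that occurs anywhere inside the row's text.
    ans = [y for y in Name_set if y in x]
    return ' '.join(ans)

df['Extracted Names'] = df.Names.apply(sub)
df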
                          Names       Extracted Names
0              This is Santhosh              Santhosh
1  This is Sneha Alphonse Shaji  Sneha Alphonse Shaji
2         This is Vikram Karthi         Vikram Karthi
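As a possible alternative to apply, the same lookup can be sketched with the vectorized Series.str.extract and a regular expression built from the names. This is not the approach given in the answer, only a variant under these assumptions: each row contains at most one name (str.extract returns the first match only), and rows without a match become NaN. The variable names names and pattern are chosen for illustration.

```python
import re
import pandas as pd

df = pd.DataFrame({"Names": ["This is Santhosh",
                             "This is Sneha Alphonse Shaji",
                             "This is Vikram Karthi"]})
names = ['Santhosh', 'Sneha Alphonse Shaji', 'Vikram Karthi']

# Escape each name so it is matched literally, and try longer names first
# so a short name cannot shadow a longer one that starts the same way.
ordered = sorted(names, key=len, reverse=True)
pattern = '(' + '|'.join(re.escape(n) for n in ordered) + ')'

# With a single capture group and expand=False, str.extract returns a Series.
df['Extracted Names'] = df['Names'].str.extract(pattern, expand=False)
print(df)
```

If a row may contain several names, Series.str.findall with the same pattern returns a list of all matches per row instead of only the first one.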