I have the following list of elements:
A= ['loans','s-class','veyron','trump','rihana','drake','election']
I also have a pandas DataFrame B with the columns category and words, where words is a comma-separated string:
category         words
audi             a4, a6
bugatti          veyron, chiron
mercedez         s-class, e-class
dslr             canon, nikon
apple            iphone,macbook,ipod
finance          sales,loans,sales price
politics         trump, election, votes
entertainment    spiderman,thor, ironmen
music            beiber, rihana,drake
........         ..............
.........        .........
I want to look up every element of list A in the words column and collect the corresponding category in a new list. So the expected output would be:
matched_categories=['finance','mercedez','bugatti','politics','music','music','politics']
Filter with boolean indexing, then use iat to select the first matching value (df below is the DataFrame called B in the question):
# works only if every value in A has at least one match
matched_categories = [df.loc[df['words'].str.contains(x), 'category'].iat[0] for x in A]
print (matched_categories)
['finance', 'mercedez', 'bugatti', 'politics', 'music', 'music', 'politics']
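For reference, here is a self-contained sketch of the same lookup that can be run as-is. The DataFrame is rebuilt from only the rows shown in the question (the elided rows are left out), and regex=False is an addition of mine, not part of the original answer: str.contains treats its pattern as a regular expression by default, so passing regex=False keeps each word a literal substring.

```python
import pandas as pd

# Sample data copied from the rows visible in the question (elided rows omitted)
df = pd.DataFrame({
    'category': ['audi', 'bugatti', 'mercedez', 'dslr', 'apple',
                 'finance', 'politics', 'entertainment', 'music'],
    'words': ['a4, a6', 'veyron, chiron', 's-class, e-class', 'canon, nikon',
              'iphone,macbook,ipod', 'sales,loans,sales price',
              'trump, election, votes', 'spiderman,thor, ironmen',
              'beiber, rihana,drake'],
})

A = ['loans', 's-class', 'veyron', 'trump', 'rihana', 'drake', 'election']

# For each word: boolean mask of rows whose 'words' string contains it,
# then take the first matching category with iat[0].
# regex=False (my addition) makes the match a plain substring test.
matched_categories = [
    df.loc[df['words'].str.contains(x, regex=False), 'category'].iat[0]
    for x in A
]
print(matched_categories)
# ['finance', 'mercedez', 'bugatti', 'politics', 'music', 'music', 'politics']
```

Note that iat[0] raises an IndexError when a word has no match at all, which is why this version only suits the case where every value is found.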
If some values may have no match, a more general solution returns 'not matched' for them:
# 'aaa' added at the end as a value with no match
A = ['loans','s-class','veyron','trump','rihana','drake','election','aaa']
matched_categories = [next(iter(df.loc[df['words'].str.contains(x),'category']),'not matched')
for x in A]
print (matched_categories)
['finance', 'mercedez', 'bugatti', 'politics', 'music', 'music', 'politics', 'not matched']
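Because str.contains is a substring test, a word such as 'sales' would also match the entry 'sales price'. If exact word matches are wanted instead, one possible variant (a sketch of mine, not part of the original answer) is to split the words column into one row per word and build a plain dictionary. It assumes pandas 0.25 or later for explode, and the name word_to_category is just illustrative.

```python
import pandas as pd

# Sample data copied from the rows visible in the question (elided rows omitted)
df = pd.DataFrame({
    'category': ['audi', 'bugatti', 'mercedez', 'dslr', 'apple',
                 'finance', 'politics', 'entertainment', 'music'],
    'words': ['a4, a6', 'veyron, chiron', 's-class, e-class', 'canon, nikon',
              'iphone,macbook,ipod', 'sales,loans,sales price',
              'trump, election, votes', 'spiderman,thor, ironmen',
              'beiber, rihana,drake'],
})

# Split each comma-separated string into a list, then give every word its own row
exploded = df.assign(words=df['words'].str.split(',')).explode('words')
# Remove the stray spaces left around words such as ' a6'
exploded['words'] = exploded['words'].str.strip()

# Build a word -> category dict; if a word occurs in several rows, the first one wins
word_to_category = (
    exploded.drop_duplicates('words')
            .set_index('words')['category']
            .to_dict()
)

A = ['loans', 's-class', 'veyron', 'trump', 'rihana', 'drake', 'election', 'aaa']

# dict.get supplies the fallback for words that are not present at all
matched_categories = [word_to_category.get(x, 'not matched') for x in A]
print(matched_categories)
# ['finance', 'mercedez', 'bugatti', 'politics', 'music', 'music', 'politics', 'not matched']
```

The dictionary is built once, so each lookup afterwards is a hash lookup rather than a scan of the whole column, which may matter if A is long.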