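I want to count how many times a word is repeated in a review string
I am reading a CSV file and storing it in a pandas DataFrame using the following line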
import pandas as pd
reviews = pd.read_csv("amazon_baby.csv")
The code in the following lines works when I apply it to a single review.
print reviews["review"][1]
a = reviews["review"][1].split("disappointed")
print a
b = len(a)
print b
The output of the lines above is
it came early and was not disappointed. i love planet wise bags and now my wipe holder. it keps my osocozy wipes moist and does not leak. highly recommend it.
['it came early and was not ', '. i love planet wise bags and now my wipe holder. it keps my osocozy wipes moist and does not leak. highly recommend it.']
2
When I apply the same logic to the entire DataFrame using the following line, I get an error.
reviews['disappointed'] = len(reviews["review"].split("disappointed"))-1
Error message:
Traceback (most recent call last):
  File "C:/Users/gouta/PycharmProjects/MLCourse1/Classifier.py", line 12, in <module>
    reviews['disappointed'] = len(reviews["review"].split("disappointed"))-1
  File "C:\Users\gouta\Anaconda2\lib\site-packages\pandas\core\generic.py", line 2360, in __getattr__
    (type(self).__name__, name))
AttributeError: 'Series' object has no attribute 'split'
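You are trying to split the entire review column of the DataFrame, which is the Series named in the error message; a Series has no split method of its own. What you want is to apply a function to each row of the DataFrame, which you can do by calling apply on the DataFrame: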
f = lambda x: len(x["review"].split("disappointed")) -1
reviews["disappointed"] = reviews.apply(f, axis=1)
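To make the row-wise approach easy to try without the CSV file, here is a minimal sketch. The three-row DataFrame is made-up stand-in data, not taken from amazon_baby.csv, and the column name disappointed_vec is only for illustration. It also shows Series.str.count as a possible vectorized alternative, assuming every value in the review column is a string. Both versions count substring matches, so a word such as "disappointedly" would be counted too.

```python
import pandas as pd

# Hypothetical stand-in for amazon_baby.csv; only the "review" column matters here.
reviews = pd.DataFrame({
    "review": [
        "it came early and was not disappointed. highly recommend it.",
        "disappointed with the lid, and disappointed with the price.",
        "works fine.",
    ]
})

# Row-wise version, as in the answer: split on the word and count the gaps.
reviews["disappointed"] = reviews.apply(
    lambda row: len(row["review"].split("disappointed")) - 1, axis=1
)

# Vectorized alternative: Series.str.count counts pattern matches per element.
# Note that the pattern is interpreted as a regular expression.
reviews["disappointed_vec"] = reviews["review"].str.count("disappointed")

print(reviews[["disappointed", "disappointed_vec"]])
```

If the real column contains missing values, the lambda would fail on them, since a missing value is not a string; filling them first (for example with an empty string) is one way around that.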