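I have the following code, a very simple model built on BaseEstimator and ClassifierMixin from sklearn in Python. It is meant to report a predicted score (y) for a city (X). As a simple model, I just want it to return the city's mean score as its prediction when it is called with a city.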
class MeanClassifier(BaseEstimator, ClassifierMixin):
    def __init__(self):
        self.cityid_ = []
        self.cntX = []
    def X3(self, X):
        self.cityid_, idx = np.unique(X, return_inverse = True)
        self.cntX = map(list(self.cityid_).index, X)
        return self.cntX
    def fit(self, X, y):
        self.meanclasses_, meanindicies = np.unique(y, return_inverse = True)
        self.cityid_, idx = np.unique(X, return_inverse = True)
        self.df = pd.DataFrame({"X":X, "y":y})
        self.mean_ = self.df.groupby(['X'].mean())
    def predict(self, X):
        return self.df['y']['X']
To use the class, I create B, where city is the list of cities that serves as X and stars serves as y.
B = MeanClassifier()
asncityid = city
B.fit(asncityid, stars)
pred = B.predict(asncityid[2]) #use the third city in the city list for prediction
print(pred)
When I run this code, I get the following error:
`File "ml2_cp.py", line 66, in <module>
pred = B.predict(asncityid[2])
File "ml2_cp.py", line 58, in predict
return self.df['y']['X'] ## using sklearn requires all X inputs
File "/opt/conda/lib/python2.7/site-packages/pandas/core/series.py", line 583, in __getitem__
result = self.index.get_value(self, key)
File "/opt/conda/lib/python2.7/site-packages/pandas/indexes/base.py", line 1980, in get_value
tz=getattr(series.dtype, 'tz', None))
File "pandas/index.pyx", line 103, in pandas.index.IndexEngine.get_value (pandas/index.c:3332)
File "pandas/index.pyx", line 111, in pandas.index.IndexEngine.get_value (pandas/index.c:3035)
File "pandas/index.pyx", line 161, in pandas.index.IndexEngine.get_loc (pandas/index.c:4084)
KeyError: 'X'`
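However, I am quite confused about how def predict(self, X) should handle the whole list X. I am sure the way I have written it is incorrect, since I also have y in there. Please let me know of any possible solutions; I am happy to explain my code and question further if anything is unclear. Thank you very much.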
I think maybe you want
self.mean_ = self.df.groupby(['X']).mean()
instead of
self.mean_ = self.df.groupby(['X'].mean())
and
return self.mean_.ix[X].values
instead of
return self.df['y']['X']
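For reference, here is a minimal sketch of how the two fixes could fit together on a current pandas version. It is not the asker's exact code. It rests on a few assumptions: `.ix` was removed in pandas 1.0, so the lookup uses label-based `reindex` instead; cities not seen during `fit` fall back to the overall mean (my own choice, the question does not say what should happen); and the sample cities and scores are made up for illustration. The mixin is listed before BaseEstimator, which is the order the scikit-learn developer guide recommends.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, ClassifierMixin


class MeanClassifier(ClassifierMixin, BaseEstimator):
    """Predicts, for each city, the mean score seen for that city in fit()."""

    def fit(self, X, y):
        df = pd.DataFrame({"X": np.asarray(X), "y": np.asarray(y, dtype=float)})
        # Parentheses matter: group by X first, then take the mean of y.
        self.mean_ = df.groupby("X")["y"].mean()
        # Assumption: cities never seen in fit() get the overall mean.
        self.global_mean_ = df["y"].mean()
        return self

    def predict(self, X):
        # Accept either a single city or a list of cities.
        X = np.atleast_1d(X)
        # Look up each city's mean by label; unseen cities become NaN,
        # which is then replaced by the overall mean.
        return self.mean_.reindex(X).fillna(self.global_mean_).to_numpy()


# Hypothetical data standing in for the asker's `city` and `stars` lists.
city = ["Phoenix", "Las Vegas", "Phoenix", "Madison"]
stars = [1, 3, 3, 5]

B = MeanClassifier().fit(city, stars)
print(B.predict(city[2]))   # mean score of the third city in the list
print(B.predict(city))      # one prediction per input city
```

The key point is that `predict` never touches `y`: it only looks up the per-city means that `fit` stored in `self.mean_`.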