OneHotEncoder: ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

Derrick Tay

from sklearn.preprocessing import OneHotEncoder

df.LotFrontage = df.LotFrontage.fillna(value = 0)
categorical_mask = (df.dtypes == "object")
categorical_columns = df.columns[categorical_mask].tolist()
ohe = OneHotEncoder(categories = categorical_mask, sparse = False)
df_encoded = ohe.fit_transform(df)
print(df_encoded[:5, :])

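Error:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().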

May I know what is wrong with my code?

Here is a snippet of the data:

[image of the df.head() output, not reproduced here]

and

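You can't use the categories parameter of OneHotEncoder to select which features to encode. It expects the allowed values for each column, not a column selector, which is why passing a boolean Series fails. To encode only some columns you need a ColumnTransformer. Try this: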

from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

df.LotFrontage = df.LotFrontage.fillna(value = 0)
categorical_features = df.select_dtypes("object").columns

column_trans = ColumnTransformer(
    [
        ("onehot_categorical", OneHotEncoder(), categorical_features),
    ],
    remainder="passthrough",  # or drop if you don't want the non-categoricals at all...
)
df_encoded = column_trans.fit_transform(df)
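
To see what the snippet above produces, here is a minimal self-contained sketch. The toy DataFrame and its values are made up for illustration (only the LotFrontage column name comes from the question), and sparse_threshold=0 is added purely so the result comes back as a dense array that is easy to print.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data standing in for the question's DataFrame.
df = pd.DataFrame(
    {
        "LotFrontage": [65.0, None, 80.0, 60.0],
        "MSZoning": ["RL", "RM", "RL", "FV"],
        "Street": ["Pave", "Pave", "Grvl", "Pave"],
    }
)
# Keep the text columns as object dtype explicitly, so that
# select_dtypes("object") picks them up as in the answer.
df = df.astype({"MSZoning": object, "Street": object})
df["LotFrontage"] = df["LotFrontage"].fillna(0)

categorical_features = df.select_dtypes("object").columns.tolist()

column_trans = ColumnTransformer(
    [("onehot_categorical", OneHotEncoder(), categorical_features)],
    remainder="passthrough",  # keep LotFrontage as-is
    sparse_threshold=0,       # always return a dense array
)
encoded = column_trans.fit_transform(df)

# One-hot columns come first (MSZoning: FV, RL, RM; Street: Grvl, Pave),
# followed by the passed-through LotFrontage column.
print(encoded.shape)
print(encoded[0])
```

The encoded columns appear in the order the transformers are listed, with the remainder columns appended at the end, so the original column order is not preserved.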

Note that, according to the docs, the categories parameter is:

categories : 'auto' or a list of array-like, default='auto'
    Categories (unique values) per feature:

    'auto' : Determine categories automatically from the training data.

    list : categories[i] holds the categories expected in the ith column. The passed categories should not mix strings and numeric values within a single feature, and should be sorted in case of numeric values.

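So it should contain every possible category or level for each categorical feature. You might use this when you know all the possible levels but suspect your training data might omit some of them. In your case, I don't think you need it; 'auto' (the default) will do fine.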
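
As a rough illustration of when an explicit categories list matters, the sketch below fits one encoder with the default 'auto' and one with a hand-written list. The Street column and its two levels are assumptions made up for the example; the training data deliberately omits one level.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training data in which the level "Grvl" never occurs.
train = pd.DataFrame({"Street": ["Pave", "Pave", "Pave"]}, dtype=object)

# Default: categories are inferred from the data, so only "Pave" is known.
auto_enc = OneHotEncoder()
auto_cols = auto_enc.fit_transform(train).shape[1]

# Explicit list: one inner list per column, naming every expected level.
explicit_enc = OneHotEncoder(categories=[["Grvl", "Pave"]])
explicit_cols = explicit_enc.fit_transform(train).shape[1]

print(auto_cols, explicit_cols)

# The explicitly configured encoder can encode "Grvl" later on,
# even though it was absent from the training data.
unseen = pd.DataFrame({"Street": ["Grvl"]}, dtype=object)
print(explicit_enc.transform(unseen).toarray())
```

With 'auto', the same unseen value would raise an error at transform time unless handle_unknown is changed from its default.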

Edited on 2021-01-24
