Python function on a DataFrame does not return the expected result

I wrote the following function to convert a variable into dummy variables:

import pandas as pd

def convert_to_dummies(df, column):
    dummies = pd.get_dummies(df[column])
    df = pd.concat([df, dummies], axis=1)
    df = df.drop(column, axis=1) #when dropping column don't forget "axis=1"

    return df

However, when I apply it to the categorical variables in df:

for col in ['col1', 'col2', ....]:
    convert_to_dummies(df, col)

* 'col1', 'col2', ... are categorical columns in df.

I get the original df back, and none of the categorical variables have been converted to dummy variables. What am I doing wrong?

jezrael

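You need to assign the output back. The function builds and returns a new DataFrame (both pd.concat and drop return new objects), so the df you pass in is left unchanged: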

for col in ['col1', 'col2', ....]:
    df = convert_to_dummies(df, col)
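
To make the reason concrete, here is a minimal sketch of what happens with and without the reassignment. The names original, unchanged_columns, converted and converted_columns are made up for this illustration; the function body is the one from the question.

```python
import pandas as pd

def convert_to_dummies(df, column):
    dummies = pd.get_dummies(df[column])
    # pd.concat returns a new object; this only rebinds the local name df
    df = pd.concat([df, dummies], axis=1)
    df = df.drop(column, axis=1)
    return df

original = pd.DataFrame({'col1': list('aba')})

# Return value is discarded, so the caller's DataFrame is untouched
convert_to_dummies(original, 'col1')
unchanged_columns = list(original.columns)

# Capturing the return value gives the converted frame
converted = convert_to_dummies(original, 'col1')
converted_columns = list(converted.columns)

print(unchanged_columns)
print(converted_columns)
```

Inside the function, df is just a local name. Rebinding it to the result of pd.concat does not change the object the caller still holds, which is why the loop in the question appears to do nothing.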

Sample:

df = pd.DataFrame({'col1':list('abcdef'),
                   'col2':list('abadec'),
                   'col3':list('aaadee'),
                   'col4':list('aabbcc')})

print (df)
  col1 col2 col3 col4
0    a    a    a    a
1    b    b    a    a
2    c    a    a    b
3    d    d    d    b
4    e    e    e    c
5    f    c    e    c

for col in ['col1', 'col2']:
    df = convert_to_dummies(df, col)

print (df)
  col3 col4  a  b  c  d  e  f  a  b  c  d  e
0    a    a  1  0  0  0  0  0  1  0  0  0  0
1    a    a  0  1  0  0  0  0  0  1  0  0  0
2    a    b  0  0  1  0  0  0  1  0  0  0  0
3    d    b  0  0  0  1  0  0  0  0  0  1  0
4    e    c  0  0  0  0  1  0  0  0  0  0  1
5    e    c  0  0  0  0  0  1  0  0  1  0  0
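
Note that the looped result contains repeated column labels (a to e appear twice). A small sketch of what that implies, assuming the same col1 and col2 values as the sample; looped, has_duplicates and selected are names invented for the illustration:

```python
import pandas as pd

df = pd.DataFrame({'col1': list('abcdef'),
                   'col2': list('abadec')})

# Same effect as running the loop: two dummy blocks side by side
looped = pd.concat([pd.get_dummies(df['col1']),
                    pd.get_dummies(df['col2'])], axis=1)

# True when at least one column label occurs more than once
has_duplicates = looped.columns.duplicated().any()

# Selecting a duplicated label returns a DataFrame, not a Series
selected = looped['a']
print(has_duplicates, selected.shape)
```

With duplicated labels, looped['a'] yields both 'a' columns at once, which can be surprising in later processing.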

If you need unique dummy columns (one per category value across all selected columns), it is better to drop the loop:

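def convert_to_dummies_cols(df, cols):
    # create all dummies at once for the selected columns;
    # dtype=int gives 0/1 values (pandas 2.x returns booleans by default)
    dummies = pd.get_dummies(df[cols], prefix='', prefix_sep='', dtype=int)
    # merge same-named dummy columns by taking the max of each group;
    # transposing avoids groupby(axis=1), which is deprecated in recent pandas
    dummies = dummies.T.groupby(level=0).max().T
    # add to the original df and drop the source columns
    df = pd.concat([df, dummies], axis=1)
    df = df.drop(cols, axis=1)
    return df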


#parameter is list of columns for dummies
df = convert_to_dummies_cols(df, ['col1', 'col2'])
print (df)
  col3 col4  a  b  c  d  e  f
0    a    a  1  0  0  0  0  0
1    a    a  0  1  0  0  0  0
2    a    b  1  0  1  0  0  0
3    d    b  0  0  0  1  0  0
4    e    c  0  0  0  0  1  0
5    e    c  0  0  1  0  0  1
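
If separate dummy columns per source column are acceptable, a possible alternative is to let pd.get_dummies handle selection, prefixing and dropping in a single call through its columns parameter. This is a sketch of that alternative, not the approach used above; encoded and encoded_columns are names chosen for the illustration.

```python
import pandas as pd

df = pd.DataFrame({'col1': list('abcdef'),
                   'col2': list('abadec'),
                   'col3': list('aaadee'),
                   'col4': list('aabbcc')})

# Encode only the listed columns; the originals are dropped and each
# dummy is named <column>_<value>, so labels stay unique
encoded = pd.get_dummies(df, columns=['col1', 'col2'], dtype=int)
encoded_columns = list(encoded.columns)
print(encoded_columns)
```

Unlike convert_to_dummies_cols, this keeps col1_a and col2_a as distinct indicators rather than merging them into one 'a' column, so which one to use depends on whether the source column matters.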
