Indexing a DataFrame with a MultiIndex

JNevens

I have a large pandas DataFrame that I need to fill.

Here is my code:

trains = np.arange(1, 101) 
#The above are example values, it's actually 900 integers between 1 and 20000
tresholds = np.arange(10, 70, 10)
tuples = []
for i in trains:
    for j in tresholds:
        tuples.append((i, j))

index = pd.MultiIndex.from_tuples(tuples, names=['trains', 'tresholds'])
df = pd.DataFrame(np.zeros((len(index), len(trains))), index=index, columns=trains, dtype=float)

metrics = dict()
for i in trains:
    m = binary_metric_train(True, i) 
    #Above function returns a binary array of length 35
    #Example: [1, 0, 0, 1, ...]
    metrics[i] = m

for i in trains:
    for j in tresholds:
        trA = binary_metric_train(True, i, tresh=j)
        for k in trains:
            if k != i:
                trB = metrics[k]
                corr = abs(pearsonr(trA, trB)[0])
                df[k][i][j] = corr
            else:
                df[k][i][j] = np.nan

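My problem is that when this code finally finishes computing, my DataFrame df still contains only zeros. Not even the NaN values are inserted. I think my indexing is correct. I have also tested my binary_metric_train function separately, and it does return an array of length 35.

Can anyone spot what I am missing here?

Edit: For clarity, this DataFrame looks like this: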

                    1   2   3   4   5   ...
trains  tresholds
     1         10
               20
               30
               40
               50
               60
     2         10
               20
               30
               40
               50
               60
   ...

Matt

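As @EdChum pointed out, you should take a look at how pandas indexing works. Here is some test data for illustration purposes, which should clear things up.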

import numpy as np
import pandas as pd

trains     = [ 1,  1,  1,  2,  2,  2]
thresholds = [10, 20, 30, 10, 20, 30]
data       = [ 1,  0,  1,  0,  1,  0]
df = pd.DataFrame({
    'trains'     : trains,
    'thresholds' : thresholds,
    'C1'         : data,
    'C2'         : data
}).set_index(['trains', 'thresholds'])

print(df)
# .ix was removed in pandas 1.0; use .iloc for positions and .loc for labels
df.iloc[df.index.get_loc((2, 30)), 0] = 3 # using column position
# or...
df.loc[(2, 30), 'C1'] = 3 # using column name
# but not...
df.loc[(2, 30), 1] = 3 # creates a new column
print(df)

Output, showing the DataFrame before and after the modification:

                   C1  C2
trains thresholds        
1      10           1   1
       20           0   0
       30           1   1
2      10           0   0
       20           1   1
       30           0   0
                   C1  C2   1
trains thresholds            
1      10           1   1 NaN
       20           0   0 NaN
       30           1   1 NaN
2      10           0   0 NaN
       20           1   1 NaN
       30           3   0   3
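
Applied to the loop in the question, the same idea would look roughly like the sketch below. Chained indexing such as df[k][i][j] = corr can end up assigning to a temporary copy instead of df itself, which would explain why the zeros never change; a single .loc call with the (train, treshold) tuple as the row label and k as the column label avoids that. Note that fake_corr is a hypothetical stand-in for abs(pearsonr(trA, trB)[0]), since binary_metric_train is not shown, and the small trains and tresholds ranges are only example values.

```python
import numpy as np
import pandas as pd

# Small example values; the question uses about 900 train ids
trains = np.arange(1, 4)
tresholds = np.arange(10, 40, 10)  # spelling follows the question

# from_product builds the same index as the nested loop over tuples
index = pd.MultiIndex.from_product([trains, tresholds],
                                   names=['trains', 'tresholds'])
df = pd.DataFrame(np.zeros((len(index), len(trains))),
                  index=index, columns=trains, dtype=float)

def fake_corr(i, j, k):
    # Hypothetical placeholder for abs(pearsonr(trA, trB)[0])
    return (i + k) / j

for i in trains:
    for j in tresholds:
        for k in trains:
            value = np.nan if k == i else fake_corr(i, j, k)
            # One .loc call: row label is the (i, j) tuple, column label is k
            df.loc[(i, j), k] = value

print(df)
```

With this form, the diagonal entries (k equal to i) should come out as NaN and the remaining cells should hold the computed values instead of zeros.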
