After sorting by one level, how do I renumber the other level of a MultiIndex? This is the sorted DataFrame:
+--------+---+------+
| | | text |
+--------+---+------+
| letter | | |
+--------+---+------+
| a | 0 | blah |
+--------+---+------+
| | 3 | blah |
+--------+---+------+
| | 6 | blah |
+--------+---+------+
| b | 1 | blah |
+--------+---+------+
| | 4 | blah |
+--------+---+------+
| | 7 | blah |
+--------+---+------+
| c | 2 | blah |
+--------+---+------+
| | 5 | blah |
+--------+---+------+
| | 8 | blah |
+--------+---+------+
This is what I want (though possibly with the original index kept in its own column):
+--------+---+------+
| | | text |
+--------+---+------+
| letter | | |
+--------+---+------+
| a | 0 | blah |
+--------+---+------+
| | 1 | blah |
+--------+---+------+
| | 2 | blah |
+--------+---+------+
| b | 0 | blah |
+--------+---+------+
| | 1 | blah |
+--------+---+------+
| | 2 | blah |
+--------+---+------+
| c | 0 | blah |
+--------+---+------+
| | 1 | blah |
+--------+---+------+
| | 2 | blah |
+--------+---+------+
I searched for an answer and tried writing several different pieces of code, but I am stuck.
Code to reproduce the first table above:
import pandas as pd
df = pd.DataFrame({'letter': ['a', 'b', 'c'] * 3, 'text': ['blah'] * 9})
df.set_index(keys='letter', append=True, inplace=True)
df = df.reorder_levels(order=[1, 0])
df.sort_index(level=0, inplace=True)
print(df)
Here is what I did:
df["new_index"] = df.groupby("letter").cumcount()
df
This gives you:
text new_index
letter
a 0 blah 0
3 blah 1
6 blah 2
b 1 blah 0
4 blah 1
7 blah 2
c 2 blah 0
5 blah 1
8 blah 2
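To see why this works, here is a minimal sketch of what cumcount does on its own: it numbers the rows inside each group from 0, in the order they appear. The frame toy and the column name position are made up for illustration and are not part of the question.

```python
import pandas as pd

# Hypothetical toy frame, deliberately left unsorted
toy = pd.DataFrame({'letter': ['b', 'a', 'b', 'a', 'b']})

# Each row gets its 0-based position within its own 'letter' group
toy['position'] = toy.groupby('letter').cumcount()
print(toy)
```

Because the count follows row order, the DataFrame needs to be sorted first (as it is in the question) for the new numbers to line up with the sorted index.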
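You can then reset the index and set letter and new_index as the new index levels. The original, unnamed level is kept as the column level_1: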
df.reset_index().set_index(["letter","new_index"])
level_1 text
letter new_index
a 0 0 blah
1 3 blah
2 6 blah
b 0 1 blah
1 4 blah
2 7 blah
c 0 2 blah
1 5 blah
2 8 blah
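Putting the steps together, the following is one possible sketch rather than the only way to do it. It assumes you want the old numbering kept under a readable name, so the unnamed level is given the name original_index (a name chosen here, not from the question) instead of relying on the default level_1, and the result is assigned to a new variable, result.

```python
import pandas as pd

# Rebuild the sorted DataFrame from the question
df = pd.DataFrame({'letter': ['a', 'b', 'c'] * 3, 'text': ['blah'] * 9})
df = df.set_index('letter', append=True).reorder_levels([1, 0]).sort_index(level=0)

# Name the unnamed level so it becomes a readable column later
# ('original_index' is an illustrative name)
df.index.names = ['letter', 'original_index']

# Number the rows within each letter, starting at 0
df['new_index'] = df.groupby(level='letter').cumcount()

# Move the old numbering into a column and append the new numbering to the index
result = df.reset_index(level='original_index').set_index('new_index', append=True)
print(result)
```

Note that reset_index and set_index return new objects, so the result has to be assigned (or the calls made with inplace=True) for the change to stick.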