I have a pandas DataFrame of the general form (let's call it df
)
A C D E F
0 0.4 0.1 0.2 0 0.7
1 0 0.2 0 0.5 0.6
2 0.1 0.5 0 0 0
3 0 0 0 0.8 0
4 0.8 0.1 0 0 0
5 0 0.9 0 0 0
Though each axis will be longer (21 columns, varying index length).
I need to find the largest number in the entire DataFrame, and extract the index and column name, then the next largest (if not in same index) and again find index and column name, and so on.
I can of course extract each index as a series (pd.Index.to_series
) or list, and do it item-wise. But I'm thinking there's a more efficient way using the dataframes (since I cannot predict the maximum index size this will eventually be used on), that I just can't seem to find/think of.
I've found plenty of answer using df.max()
or df.idmax()
, to get the maximum, if I only needed that.
Use the following:
df.stack().reset_index().sort_values(by=0, ascending=False)
Out[168]:
level_0 level_1 0
26 5 C 0.9
18 3 E 0.8
20 4 A 0.8
4 0 F 0.7
9 1 F 0.6
8 1 E 0.5
11 2 C 0.5
0 0 A 0.4
6 1 C 0.2
2 0 D 0.2
10 2 A 0.1
1 0 C 0.1
21 4 C 0.1
27 5 D 0.0
28 5 E 0.0
25 5 A 0.0
24 4 F 0.0
23 4 E 0.0
22 4 D 0.0
15 3 A 0.0
19 3 F 0.0
17 3 D 0.0
16 3 C 0.0
14 2 F 0.0
13 2 E 0.0
12 2 D 0.0
7 1 D 0.0
5 1 A 0.0
3 0 E 0.0
29 5 F 0.0
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments