I have a dataframe df with 40000 rows:
0 bin
0 4.506840 4-5
1 4.506840 4-5
2 4.444245 4-5
3 4.485975 4-5
4 4.527705 4-5
... ... ...
39995 6.572475 6-7
39996 6.697665 6-7
39997 6.322095 6-7
39998 6.322095 6-7
39999 6.676800 6-7
It stores for every number in column '0' the interval (bin) it belongs to. I want to convert it to a dict by:
dict(zip(df[0],df.bin))
to get an output like:
{4.506840: '4-5', 4.506840: '4-5', 4.444245: '4-5', ... }
so I want to store every value from '0' and the bin it belongs to. Somehow my dict has a length of 340, not 40000, so it doesn't store all of the rows. My question is: why? And how do I get all 40000 rows in the dict? Cheers!
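Your df[0] contains duplicate values, and a Python dictionary cannot hold the same key twice, so each repeated value overwrites the earlier entry and only one entry per distinct value remains (340 in your case). To keep every row, you can collect the bins in a list per key: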
result = {}
for i_0, i_bin in zip(df[0], df.bin):
    # create the list the first time a value is seen
    if i_0 not in result:
        result[i_0] = []
    result[i_0].append(i_bin)
output:
{
    4.50684: ['4-5', '4-5'],
    4.444245: ['4-5'],
    ...
}
It depends on what you want to achieve, but this is one way to preserve all the values.
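To see the effect on a small scale, here is a minimal sketch in plain Python. The lists `values` and `bins` are made-up stand-ins for df[0] and df.bin, and `collections.defaultdict` is just an alternative to the explicit `if` check, not something the question requires.

```python
from collections import defaultdict

# Hypothetical sample standing in for df[0] and df.bin
values = [4.506840, 4.506840, 4.444245, 6.322095, 6.322095]
bins = ["4-5", "4-5", "4-5", "6-7", "6-7"]

# A plain dict keeps one entry per distinct key;
# a repeated key overwrites the earlier entry.
flat = dict(zip(values, bins))
print(len(values), len(flat))

# defaultdict(list) creates the empty list on first access,
# so every row ends up in the list that belongs to its key.
grouped = defaultdict(list)
for value, label in zip(values, bins):
    grouped[value].append(label)

total = sum(len(labels) for labels in grouped.values())
print(dict(grouped))
print(total)
```

The number of keys equals the number of distinct values, while the lists together hold one element per original row.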
Edit:
As per @anky's comment, you can use a pandas aggregation to do the same without the loop, which should also perform better:
df.groupby(0)['bin'].agg(list).to_dict()
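As a quick sanity check, the sketch below runs the same aggregation on a made-up five-row frame with the same column layout as df (the data is an assumption for illustration) and compares the sizes against the plain dict(zip(...)) approach.

```python
import pandas as pd

# Hypothetical five-row sample with the same columns as df
df = pd.DataFrame({
    0: [4.506840, 4.506840, 4.444245, 6.322095, 6.322095],
    "bin": ["4-5", "4-5", "4-5", "6-7", "6-7"],
})

# Plain dict: one entry per distinct value in column 0
plain = dict(zip(df[0], df["bin"]))

# Grouped dict: one list of bins per distinct value
grouped = df.groupby(0)["bin"].agg(list).to_dict()

stored = sum(len(labels) for labels in grouped.values())
print(len(plain), df[0].nunique())  # both count the distinct values
print(stored, len(df))              # both count the rows
print(grouped)
```

If the two numbers on the second printed line match on your real df, no row was lost.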