Filter pandas multi index data frame based on index column values

sam

I have the below Muti index data frame,

|  shape  |  colour  |        data        |
|                    |  d1  |  d2  |  d3  |
-------------------------------------------
| circle  |  green   |  2   |  4   |  9   |
| circle  |  red     |  -9  |  3   |  1   |
| square  |  orange  |  5   |  -6  |  2   |
| square  |  yellow  |  9   |  8   |  2   |

created with the following,

header = [ "shape", "colour", "label", "data" ]  
data = [      
[ "circle",  "green",  "d1", 2  ],
[ "circle",  "green",  "d2", 4  ],
[ "circle",  "green",  "d3", 9  ],
[ "square",  "orange", "d1", 5  ],
[ "square",  "orange", "d2", -6 ],
[ "square",  "orange", "d3", 2  ],
[ "circle",  "red",    "d1", -9 ],
[ "circle",  "red",    "d2", 3  ],
[ "circle",  "red",    "d3", 1  ],
[ "square",  "yellow", "d1", 9  ],
[ "square",  "yellow", "d2", 8  ],
[ "square",  "yellow", "d3", 2  ],
]

raw = pd.DataFrame(data, columns=header)
df = raw.pivot(index=["shape", "colour"], columns=["label"], values=["data"])

and I want to filter the data values based on the colour. It is easy with a set value, I can do, df[(abs(df["data"]) > 2)] but I do not know/cannot find how to do it when the threshold depends on the colour. With the following filter,

filter = {
    "red": {"threshold": 5},
    "green": {"threshold": 5},
    "yellow": {"threshold": 7},
    "orange": {"threshold": 1},
}

The desired output would be,

|  shape  |  colour  |        data        |
|                    |  d1  |  d2  |  d3  |
-------------------------------------------
| circle  |  green   | Nan  | Nan  |  9   |
| circle  |  red     |  -9  | Nan  | Nan  |
| square  |  orange  |  5   |  -6  |  2   |
| square  |  yellow  |  9   |  8   | Nan  |

I tried,

df.apply(lambda r: r["data"] if abs(r["data"]) > filter[r.index.get_level_values(1)].get("threshold", 2) else 0)

and

df[(abs(df["data"]) > filter[df.index.get_level_values(1)].get("threshold", 0))]

to no avail.

Vitalizzare

From the filter dictionary create DataFrame with colours as index. Then reindex the latter with the df index's 'colour' level. Compare df to the obtained values (an individual threshold for each row depending on colour):

threshold = pd.DataFrame.from_dict(filter, orient='index') 
mask = df.abs() > threshold.reindex(df.index.get_level_values('colour')).values
print(df[mask])

Output:

              data          
label           d1   d2   d3
shape  colour               
circle green   NaN  NaN  9.0
       red    -9.0  NaN  NaN
square orange  5.0 -6.0  2.0
       yellow  9.0  8.0  NaN

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at
0

Comments

0 comments
Login to comment

Related

How to filter values from a multi index Pandas data frame

Getting column values from multi index data frame pandas

Adding a column based on index to a data frame in Pandas

Pandas: replace values of one data frame with values of another data frame based on index and column

Group index values based on other index values in pandas Data Frame

Filter rows from a data frame based on the highest index and values from column

replacing the values of a matrix cells in a data frame based on the index and column values

Assigning values to a multi-index pandas data frame

Select rows of the multi-level index data frame with index values equal to columns in another data frame in pandas

R add index column to data frame based on row values

Getting values out of a data.frame based on an index column in R

pandas - How to append a column to data frame by matching index values?

Use column with duplicated values as data frame index in Pandas

Move index values into column names in pandas Data Frame

How to create a new column in a pandas data frame for values with the same index

Pandas filter multi index based on statistic

set multi index of an existing data frame in pandas

Pivot with multi index in Pandas data frame

Filtering in a multi index pandas data frame

Filter data frame based on index value in Python

Filter index values in koalas Data frame

Get column names as index in multi index data frame

convert single index pandas data frame to multi-index

Pandas - Filter multi-index by condition on all values within index

Select column and row values with an index in data frame

Pandas data frame fill null values with index

Index pandas data frame by index

Using column values of a data frame to index rows of a multiindex data frame

How to get values based on column index in pandas?