I have a dataframe which can be constructed as:
df = pd.DataFrame({'A': [1, 4, 6, 3, 2, 3, 6, 8],
'B': [4, 7, 1, 5, 6, 8, 3, 9],
'C': [1, 5, 3, 1, 6, 8, 9, 0],
'D': [6, 3, 7, 8, 9, 4, 2, 1]})
The df
looks like:
A B C D
0 1 4 1 6
1 4 7 5 3
2 6 1 3 7
3 3 5 1 8
4 2 6 6 9
5 3 8 8 4
6 6 3 9 2
7 8 9 0 1
And there are 2 other variables which are to be used in substitution of values in the df:
mx = pd.core.series.Series([7, 8, 8, 7], index=["A", "B", "C", "D"])
dm = pd.core.series.Series([5, 8, 6, 4], index=["A", "B", "C", "D"])
PROBLEM: I want to replace all the values from the dataframe greater than the corresponding value in dm
but less than that in mx
with the values from dm
. In other words, let's say for "D", I want to replace all the values between 4 and 7 with 4.
So the expected output would look something like:
A B C D
0 1 4 1 4
1 4 7 5 3
2 5 1 3 4
3 3 5 1 8
4 2 6 6 9
5 3 8 6 4
6 5 3 9 2
7 8 9 0 1
I have tried using df.apply
and df.update
but I'm unable to make the condition. It always throws a ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Is there an efficient way to achieve this? Any help would be appreciated.
Use DataFrame.mask
with compare DataFrame by Series
by DataFrame.le
and DataFrame.ge
, chained mask by &
for bitwise AND
and replace by Series
with parameter axis=1
:
df = df.mask(df.ge(dm) & df.le(mx), dm, axis=1)
print (df)
A B C D
0 1 4 1 4
1 4 7 5 3
2 5 1 3 4
3 3 5 1 8
4 2 6 6 9
5 3 8 6 4
6 5 3 9 2
7 8 9 0 1
Collected from the Internet
Please contact [email protected] to delete if infringement.
Comments