I'm trying to calculate local maxima and minima for a series of data: if the current row's value is greater than (or lower than) both the following and the preceding rows, set minmax to the current value; otherwise set it to NaN. Is there a more elegant way to do it than this?
import pandas as pd
import numpy as np

rng = pd.date_range('1/1/2014', periods=10, freq='5min')
s = pd.Series([1, 2, 3, 2, 1, 2, 3, 5, 7, 4], index=rng)
df = pd.DataFrame(s, columns=['val'])
df.index.name = "dt"
df['minmax'] = np.nan  # np.NaN was removed in NumPy 2.0

for i in range(len(df.index)):
    # The first and last rows have only one neighbour, so skip them.
    if i == 0 or i == len(df.index) - 1:
        continue
    cur = df['val'].iloc[i]
    prev = df['val'].iloc[i - 1]
    nxt = df['val'].iloc[i + 1]
    # Local maximum or local minimum (ties count, because of >= and <=).
    if (cur >= prev and cur >= nxt) or (cur <= prev and cur <= nxt):
        # Assign through one indexer; chained df['minmax'][i] = ... is unreliable.
        df.loc[df.index[i], 'minmax'] = cur

print(df)
Result is:
                     val  minmax
dt
2014-01-01 00:00:00    1     NaN
2014-01-01 00:05:00    2     NaN
2014-01-01 00:10:00    3       3
2014-01-01 00:15:00    2     NaN
2014-01-01 00:20:00    1       1
2014-01-01 00:25:00    2     NaN
2014-01-01 00:30:00    3     NaN
2014-01-01 00:35:00    5     NaN
2014-01-01 00:40:00    7       7
2014-01-01 00:45:00    4     NaN
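We can use shift and where to decide which values to keep. Importantly, we have to use the bitwise operators & and | (not and / or) when combining comparisons of Series, and each comparison needs its own parentheses because & and | bind more tightly than < and >. shift returns a Series or DataFrame shifted by 1 row (the default) or by the number of rows passed.
where takes a boolean condition as its first argument; the second argument, NaN here, is the value assigned wherever the condition is False. NaN is also the default for that argument, so passing it is optional.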
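To make the two building blocks concrete, here is a minimal sketch on a throwaway Series. The name demo and its values are made up for illustration and are not part of the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, only for illustration.
demo = pd.Series([1, 3, 2])

# shift(1) moves values down one row, so each row sees its predecessor;
# shift(-1) moves them up, so each row sees its successor.
prev_vals = demo.shift(1)    # NaN, 1.0, 3.0
next_vals = demo.shift(-1)   # 3.0, 2.0, NaN

# Element-wise conditions are combined with & and |, not `and` / `or`.
is_peak = (demo > prev_vals) & (demo > next_vals)

# where() keeps values where the condition is True and puts NaN elsewhere.
peaks = demo.where(is_peak, np.nan)
print(peaks.tolist())  # [nan, 3.0, nan]
```

Comparisons against NaN evaluate to False, which is why the first and last rows drop out without any special handling.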
In [81]:
prev, nxt = df['val'].shift(1), df['val'].shift(-1)
df['minmax'] = df['val'].where(((df['val'] < prev) & (df['val'] < nxt)) | ((df['val'] > prev) & (df['val'] > nxt)), np.nan)
df
Out[81]:
                     val  minmax
dt
2014-01-01 00:00:00    1     NaN
2014-01-01 00:05:00    2     NaN
2014-01-01 00:10:00    3       3
2014-01-01 00:15:00    2     NaN
2014-01-01 00:20:00    1       1
2014-01-01 00:25:00    2     NaN
2014-01-01 00:30:00    3     NaN
2014-01-01 00:35:00    5     NaN
2014-01-01 00:40:00    7       7
2014-01-01 00:45:00    4     NaN
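One difference worth noting: the loop in the question compares with >= and <=, while the where expression uses strict < and >. The sample data has no equal neighbours, so both give the same result here, but they diverge on plateaus. Below is a sketch of a helper that makes the choice explicit. The name local_extrema and the strict flag are assumptions for illustration, not part of pandas.

```python
import pandas as pd

def local_extrema(values: pd.Series, strict: bool = True) -> pd.Series:
    """Keep values that are local minima or maxima; NaN elsewhere.

    Hypothetical helper: `strict` chooses between < / > and <= / >=.
    """
    prev_vals = values.shift(1)
    next_vals = values.shift(-1)
    if strict:
        is_max = (values > prev_vals) & (values > next_vals)
        is_min = (values < prev_vals) & (values < next_vals)
    else:
        is_max = (values >= prev_vals) & (values >= next_vals)
        is_min = (values <= prev_vals) & (values <= next_vals)
    # The first and last rows compare against NaN, which is always False,
    # so the endpoints are never marked.
    return values.where(is_max | is_min)

s = pd.Series([1, 2, 3, 2, 1, 2, 3, 5, 7, 4])
print(local_extrema(s).dropna().tolist())  # [3.0, 1.0, 7.0]

plateau = pd.Series([1, 2, 2, 1])
print(local_extrema(plateau).dropna().tolist())                # []
print(local_extrema(plateau, strict=False).dropna().tolist())  # [2.0, 2.0]
```

With strict=False the helper mirrors the comparisons in the question's loop, so both points of a flat top are kept. With strict=True neither is.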