I have a pandas DataFrame with 2 columns: a time-series date and a value.
Input data frame:
date | Value |
---|---|
2021-05-01 | -2 |
2021-05-02 | 3 |
2021-05-03 | 5 |
2021-05-04 | 4 |
2021-05-05 | 6 |
2021-05-06 | -3 |
2021-05-07 | -8 |
2021-05-08 | -1 |
2021-05-09 | 5 |
2021-05-10 | 4 |
2021-05-11 | 5 |
2021-05-12 | 1 |
2021-05-13 | -1 |
2021-05-14 | -2 |
2021-05-15 | -1 |
I need to extract 2 data frames from this one. Each should contain one consecutive run of positive values, in the original order, together with the single negative row immediately before the run and the single negative row immediately after it.
My expected outputs are below.
Output data frame 1:
date | Value |
---|---|
2021-05-01 | -2 |
2021-05-02 | 3 |
2021-05-03 | 5 |
2021-05-04 | 4 |
2021-05-05 | 6 |
2021-05-06 | -3 |
Output data frame 2:
date | Value |
---|---|
2021-05-08 | -1 |
2021-05-09 | 5 |
2021-05-10 | 4 |
2021-05-11 | 5 |
2021-05-12 | 1 |
2021-05-13 | -1 |
Any suggestions on how to do this in the most efficient manner? This is sample data, but I might have a much longer series to consider.
You can build boolean masks for the negative values that sit directly before or after a non-negative value,
then create a list of DataFrames with a list comprehension:
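# Negative rows
m0 = df['Value'].lt(0)
# Negative rows immediately before a non-negative value (start of a group)
m1 = m0 & df['Value'].shift(-1).ge(0)
# Negative rows immediately after a non-negative value (end of a group)
m2 = m0 & df['Value'].shift().ge(0)
# Each group-starting row increments the group id
df['g'] = m1.cumsum()
# Keep the non-negative rows plus their bordering negative rows
df2 = df[m1 | m2 | ~m0].copy()
dfs = [g.drop('g', axis=1) for i, g in df2.groupby('g')]
print(dfs)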
[         date  Value
0  2021-05-01     -2
1  2021-05-02      3
2  2021-05-03      5
3  2021-05-04      4
4  2021-05-05      6
5  2021-05-06     -3,           date  Value
7   2021-05-08     -1
8   2021-05-09      5
9   2021-05-10      4
10  2021-05-11      5
11  2021-05-12      1
12  2021-05-13     -1]
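One caveat with grouping by `cumsum`: when a single negative row sits between two positive runs, that row is assigned only to the later group, so the earlier group loses its trailing negative row. If that case can occur in your data, slicing by position may be an alternative. The following is only a minimal sketch and not part of the original answer: the helper name `split_positive_runs` is hypothetical, it treats 0 as non-negative in the same way as the `ge(0)` masks, and it assumes the dates can be built with `pd.date_range`.

```python
import numpy as np
import pandas as pd


def split_positive_runs(df, col="Value"):
    """Return one DataFrame per run of non-negative values, widened by the
    neighbouring row on each side where such a row exists (hypothetical helper)."""
    pos = df[col].ge(0).to_numpy()
    # Pad with False so runs touching either end of the frame are detected.
    padded = np.concatenate(([False], pos, [False])).astype(int)
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)   # position of the first row of each run
    ends = np.flatnonzero(change == -1)    # position just past the last row of each run
    n = len(df)
    # Widen each slice by one row on both sides, clipped to the frame bounds.
    return [df.iloc[max(s - 1, 0):min(e + 1, n)] for s, e in zip(starts, ends)]


# Sample data from the question (dates built with date_range as an assumption).
df = pd.DataFrame({
    "date": pd.date_range("2021-05-01", periods=15, freq="D"),
    "Value": [-2, 3, 5, 4, 6, -3, -8, -1, 5, 4, 5, 1, -1, -2, -1],
})

pieces = split_positive_runs(df)
for piece in pieces:
    print(piece)
```

Because the row before a run's start and the row after its end are negative by construction, no extra mask is needed, and a negative row shared by two runs appears in both pieces. The pieces are slices of `df`, so call `.copy()` on them before modifying them.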