Detecting rapid increases in a time series

金属规则211

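I have a time series, and I want to detect when the values increase rapidly and mark (perhaps with color) the time range over which the increase happens. For example, in the plot below, I want to pinpoint the peak in a specific time window on the x-axis (e.g. 2018-05-22).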

[Figure: plot of the time series]

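I can find the step changes that fall within 20% of the largest one, but that doesn't help me. I want to focus on the rapid increases.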

# absolute change between consecutive values of the 'TS' column
d = pd.Series(df['TS'].values[1:] - df['TS'].values[:-1], index=df['TS'].index[:-1]).abs()
threshold = 0.8
m = d.max()
print(d > m * threshold)

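For example, ruptures does something visually similar. Here is an example that uses randomly generated data, together with the plot it produces: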

import matplotlib.pyplot as plt
import ruptures as rpt

# generate signal
n_samples, dim, sigma = 1000, 3, 4
n_bkps = 4  # number of breakpoints
signal, bkps = rpt.pw_constant(n_samples, dim, n_bkps, noise_std=sigma)


# detection
algo = rpt.Pelt(model="rbf").fit(signal)
result = algo.predict(pen=10)

# display
rpt.display(signal, bkps, result)
plt.show()

[Figure: output of rpt.display for the generated signal]

野蛮的亨利

You could try differencing the series to get the amount the series changed at each step. This is a standard method in time-series analysis. Here is a good discussion of the theory behind the idea.

In practice, pandas gives us a simple way to difference the series via the .diff() method.

df = pd.DataFrame({'a': [10, 20, 30, 40, 50, 60],
                   'b': [1, 1000, 20000, 3, 50, 80],
                   'c': [1000, 4, 97, 16, 2500, 36]})
df.diff()

      a        b       c
0   NaN      NaN     NaN
1  10.0    999.0  -996.0
2  10.0  19000.0    93.0
3  10.0 -19997.0   -81.0
4  10.0     47.0  2484.0
5  10.0     30.0 -2464.0

This subtracts the previous item in the series from the current one (which is why the first observation is now NaN). You can add a feature to your data that is the first-differenced series:

df['a_first_difference'] = df['a'].diff()

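The values in this differenced series then follow a distribution of their own. Finding all values in it that exceed some threshold may give you a way to measure "rapid increase", by identifying the periods that changed the most relative to the previous value. You can also have .diff() report the change relative to the next value: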

df.diff(periods=-1)

      a        b       c
0 -10.0   -999.0   996.0
1 -10.0 -19000.0   -93.0
2 -10.0  19997.0    81.0
3 -10.0    -47.0 -2484.0
4 -10.0    -30.0  2464.0
5   NaN      NaN     NaN

This helps identify the periods that are immediately followed by a large change.
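To turn the thresholding idea into time ranges, one option is to flag the steps whose difference exceeds a threshold and then group consecutive flagged steps into runs. The following is only a minimal sketch: the function name rapid_increase_ranges, the sample data, and the threshold of 40 are all made up for illustration, and a sensible threshold for real data would have to come from the distribution of the differences.

```python
import pandas as pd


def rapid_increase_ranges(series, threshold):
    """Return (start, end) index labels for each run of consecutive steps
    whose increase over the previous value exceeds `threshold`.

    Hypothetical helper for illustration; not part of pandas.
    """
    diffs = series.diff()
    # NaN > threshold is False, so the first observation is never flagged.
    rising = diffs > threshold
    # Give every run of identical consecutive flags its own id.
    run_id = rising.astype(int).diff().ne(0).cumsum()
    flagged = rising[rising]
    ranges = []
    for _, run in flagged.groupby(run_id[rising]):
        # Each label marks the timestamp at which the jump has arrived.
        ranges.append((run.index[0], run.index[-1]))
    return ranges


# Made-up daily data with two bursts of growth.
ts = pd.Series(
    [1, 2, 3, 50, 120, 121, 122, 300, 301],
    index=pd.date_range("2018-05-18", periods=9, freq="D"),
)
ranges = rapid_increase_ranges(ts, threshold=40)
for start, end in ranges:
    print(start.date(), "to", end.date())
```

On this sample the flagged ranges are 2018-05-21 through 2018-05-22 and the single day 2018-05-25. Each (start, end) pair could then be passed to something like matplotlib's Axes.axvspan to shade the range on the plot.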
