Upsampling a time series in pandas with interpolation

Make42

I have a series:

import pandas as pd
index = pd.date_range('1/1/2000', periods=9, freq='0.9S')
series = pd.Series(range(9), index=index)

>>> series
2000-01-01 00:00:00.000    0
2000-01-01 00:00:00.900    1
2000-01-01 00:00:01.800    2
2000-01-01 00:00:02.700    3
2000-01-01 00:00:03.600    4
2000-01-01 00:00:04.500    5
2000-01-01 00:00:05.400    6
2000-01-01 00:00:06.300    7
2000-01-01 00:00:07.200    8
Freq: 900L, dtype: int64

Now I get

>>> series.resample(rule='0.5S').head(100)
2000-01-01 00:00:00.000    0.0
2000-01-01 00:00:00.500    1.0
2000-01-01 00:00:01.000    NaN
2000-01-01 00:00:01.500    2.0
2000-01-01 00:00:02.000    NaN
2000-01-01 00:00:02.500    3.0
2000-01-01 00:00:03.000    NaN
2000-01-01 00:00:03.500    4.0
2000-01-01 00:00:04.000    NaN
2000-01-01 00:00:04.500    5.0
2000-01-01 00:00:05.000    6.0
2000-01-01 00:00:05.500    NaN
2000-01-01 00:00:06.000    7.0
2000-01-01 00:00:06.500    NaN
2000-01-01 00:00:07.000    8.0
Freq: 500L, dtype: float64

as I expected, but I get

>>> series.resample(rule='0.5S').interpolate(method='linear')
2000-01-01 00:00:00.000    0.000000
2000-01-01 00:00:00.500    0.555556
2000-01-01 00:00:01.000    1.111111
2000-01-01 00:00:01.500    1.666667
2000-01-01 00:00:02.000    2.222222
2000-01-01 00:00:02.500    2.777778
2000-01-01 00:00:03.000    3.333333
2000-01-01 00:00:03.500    3.888889
2000-01-01 00:00:04.000    4.444444
2000-01-01 00:00:04.500    5.000000
2000-01-01 00:00:05.000    5.000000
2000-01-01 00:00:05.500    5.000000
2000-01-01 00:00:06.000    5.000000
2000-01-01 00:00:06.500    5.000000
2000-01-01 00:00:07.000    5.000000
Freq: 500L, dtype: float64

I would have expected the value at the last timestamp to still be 8.0, and the value at 6.5 s to still be 7.0. What is going on?

Make42

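One way to get this at least partly right is to call mean() between the two methods. For my real data the results were not very good, and I had more success with scipy's interp1d.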

>>> series.resample(rule='0.5S').mean().interpolate(method='linear')
2000-01-01 00:00:00.000    0.0
2000-01-01 00:00:00.500    1.0
2000-01-01 00:00:01.000    1.5
2000-01-01 00:00:01.500    2.0
2000-01-01 00:00:02.000    2.5
2000-01-01 00:00:02.500    3.0
2000-01-01 00:00:03.000    3.5
2000-01-01 00:00:03.500    4.0
2000-01-01 00:00:04.000    4.5
2000-01-01 00:00:04.500    5.0
2000-01-01 00:00:05.000    6.0
2000-01-01 00:00:05.500    6.5
2000-01-01 00:00:06.000    7.0
2000-01-01 00:00:06.500    7.5
2000-01-01 00:00:07.000    8.0
Freq: 500L, dtype: float64
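
The output in the question is consistent with interpolate() seeing only the samples that fall exactly on the 500 ms grid, here 00:00:00.000 and 00:00:04.500: the values rise linearly from 0 to 5 and then stay at 5. A minimal sketch of one way around that, assuming the goal is plain linear interpolation in time, is to reindex onto the union of the original timestamps and the target grid, interpolate with method='time', and keep only the grid points. The names target, combined and upsampled are illustrative, and the frequencies are written as '900ms' and '500ms' instead of '0.9S' and '0.5S'.

```python
import pandas as pd

index = pd.date_range('2000-01-01', periods=9, freq='900ms')
series = pd.Series(range(9), index=index, dtype='float64')

# Diagnosis: asfreq keeps only the samples that sit exactly on the new grid.
on_grid = series.asfreq('500ms')
kept = on_grid.dropna()          # only the samples at 0.0 s and 4.5 s remain
plateau = on_grid.interpolate(method='linear')  # reproduces the 5.0 plateau

# Sketch of a fix: interpolate over original samples and grid points together.
target = pd.date_range(series.index.min(), series.index.max(), freq='500ms')
combined = series.reindex(series.index.union(target))

# method='time' weights by the actual time distance between timestamps.
upsampled = combined.interpolate(method='time').reindex(target)

print(upsampled)
```

With this approach the value at 00:00:07.000 comes out near 7.78 rather than 8.0, because the sample with value 8 sits at 00:00:07.200.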

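The answer mentions scipy's interp1d without showing it. The following is only a sketch of how that might look, not the code actually used for the real data: timestamps are converted to seconds elapsed since the first sample, and the names x_old, x_new and result are illustrative.

```python
import numpy as np
import pandas as pd
from scipy.interpolate import interp1d

index = pd.date_range('2000-01-01', periods=9, freq='900ms')
series = pd.Series(range(9), index=index, dtype='float64')

# Target grid every 500 ms; it stays inside the range of the original data,
# so interp1d does not need to extrapolate.
target = pd.date_range(index[0], index[-1], freq='500ms')

# interp1d works on plain numbers, so express both time axes in seconds.
x_old = np.asarray((series.index - index[0]).total_seconds())
x_new = np.asarray((target - index[0]).total_seconds())

f = interp1d(x_old, series.to_numpy(), kind='linear')
result = pd.Series(f(x_new), index=target)

print(result)
```

Other values of kind, such as 'cubic', can be tried the same way if linear interpolation is too coarse for the data.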