I have a dataset:
x y
A1 start
A2 ID
A3 delete
A4 start
A5 ID
A6 delete
A7 ID
A8 delete
A9 start
A10 ID
A11 delete
A12 delete
A13 start
A14 ID
A15 start
A16 delete
A17 ID
A18 delete
A19 delete
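As you can see, there is a pattern in column y: "delete" normally follows "ID". But there are exceptions: at A12, "delete" follows "delete"; at A16, "delete" follows "start"; and at A19, "delete" follows "delete". How can I subset only those "delete" rows that do not come directly after "ID"? The desired result is: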
x y
A12 delete
A16 delete
A19 delete
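You can use lag from dplyr. lag(y) returns the value of y from the previous row, so the condition below keeps the "delete" rows whose preceding value is not "ID":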
library(dplyr)
df %>% filter(y == 'delete' & lag(y) != 'ID')
# x y
#1 A12 delete
#2 A16 delete
#3 A19 delete
The equivalent in data.table uses shift:
library(data.table)
setDT(df)[y == 'delete' & shift(y) != 'ID']
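For a reproducible check that does not depend on either package, the same condition can be sketched in base R. This is only an illustration under a few assumptions: the data frame is rebuilt by hand from the question, y is stored as character rather than factor, and prev is a made-up helper name that does not appear in the original answer.

```r
# Rebuild the example data from the question (y stored as character).
df <- data.frame(
  x = paste0("A", 1:19),
  y = c("start", "ID", "delete", "start", "ID", "delete", "ID", "delete",
        "start", "ID", "delete", "delete", "start", "ID", "start", "delete",
        "ID", "delete", "delete"),
  stringsAsFactors = FALSE
)

# prev (hypothetical helper name) holds the value of y from the previous row.
# The first row has no predecessor, so it gets NA, as with lag() and shift().
prev <- c(NA, head(df$y, -1))

# Keep rows where y is "delete" and the previous value is known and not "ID".
result <- df[df$y == "delete" & !is.na(prev) & prev != "ID", ]
print(result)
```

The explicit !is.na(prev) check mirrors what the two answers do implicitly: both filter() and data.table's row filter drop rows where the condition evaluates to NA, so a "delete" in the very first row would not be returned. If such a row should be kept, that check would need to be changed.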