I have a data frame structured as follows:
example <- data.frame(id = c(1,1,1,1,1,1,1,2,2,2,2,2),
                      event = c("email","email","email","draw","email","email","draw","email","email","email","email","draw"),
                      date = c("2020-03-01","2020-06-01","2020-07-15","2020-07-28","2020-08-07","2020-09-01","2020-09-15","2020-05-22","2020-06-15","2020-07-13","2020-07-15","2020-07-31"))
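Within each id, I am trying to filter out the emails that do not fall within the 30 days before a draw event, as indicated in the event column. This is the result I want: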
desiredResult <- data.frame(id = c(1,1,1,1,2,2,2),
                            event = c("email","draw","email","draw","email","email","draw"),
                            date = c("2020-07-15","2020-07-28","2020-09-01","2020-09-15","2020-07-13","2020-07-15","2020-07-31"))
I only need to keep the events that occurred within the 30 days before each draw. I'm not sure how to implement this.
For each id, we can select the rows whose date falls within the 30 days up to and including a row where event = "draw".
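library(dplyr)

example %>%
  mutate(date = as.Date(date)) %>%   # convert the character dates to Date
  group_by(id) %>%                   # evaluate each id separately
  # for every draw date, flag the rows dated within the 30 days up to and
  # including it, then combine the flags with OR
  filter(Reduce(`|`, purrr::map(date[event == 'draw'],
                                ~between(date, .x - 30, .x))))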
#     id event date
#  <dbl> <chr> <date>
#1     1 email 2020-07-15
#2     1 draw  2020-07-28
#3     1 email 2020-09-01
#4     1 draw  2020-09-15
#5     2 email 2020-07-13
#6     2 email 2020-07-15
#7     2 draw  2020-07-31
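
For anyone who finds the Reduce/map step hard to read, here is a minimal base R sketch of the same idea, with no dplyr or purrr. The helper name in_draw_window and its window argument are hypothetical, introduced only for illustration. It assumes a row should be kept when its date lies within the 30 days up to and including any draw date in the same id.

```r
# Same data as in the question, with the dates converted to Date up front
example <- data.frame(
  id = c(1,1,1,1,1,1,1,2,2,2,2,2),
  event = c("email","email","email","draw","email","email","draw",
            "email","email","email","email","draw"),
  date = as.Date(c("2020-03-01","2020-06-01","2020-07-15","2020-07-28",
                   "2020-08-07","2020-09-01","2020-09-15","2020-05-22",
                   "2020-06-15","2020-07-13","2020-07-15","2020-07-31"))
)

# Hypothetical helper: TRUE for each row dated within `window` days
# before (or on) any draw date in the same group
in_draw_window <- function(date, event, window = 30) {
  draws <- date[event == "draw"]
  keep <- rep(FALSE, length(date))
  # index into `draws` rather than looping over it directly,
  # because for() over a Date vector drops the Date class
  for (i in seq_along(draws)) {
    keep <- keep | (date >= draws[i] - window & date <= draws[i])
  }
  keep
}

# Apply the helper one id at a time and collect the row flags
keep <- logical(nrow(example))
for (g in unique(example$id)) {
  rows <- example$id == g
  keep[rows] <- in_draw_window(example$date[rows], example$event[rows])
}

result <- example[keep, ]
print(result)
```

In this sketch an id with no draw rows simply has none of its rows kept, since the flags start out as all FALSE. On the example data it should select the same seven rows as the desired result.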