干杯,我有两个具有以下结构的数据帧。
DF1:
Airlines HeadQ Date Cost_Index
American PHX 07-31-2016 220
American ATL 08-31-2016 150
American ATL 10-31-2016 150
Delta ATL 10-31-2016 180
American ATL 08-31-2017 200
第二数据帧DF2具有以下结构:
DF2:
Airlines HeadQ Date
American ATL 09-30-2016
Delta ATL 03-31-2017
现在查找数据帧DF1和DF2,我想将DF1更改为以下数据帧。
DF1:
Airlines HeadQ Date Cost_Index
American PHX 07-31-2016 220
American ATL 08-31-2016 0
American ATL 10-31-2016 150
Delta ATL 10-31-2016 180
American ATL 08-31-2017 200
条件是,从DF2查找DF1的Airlines和HeadQ,如果DF1 $ Date <DF2 $ Date,则将Cost_Index设置为0,否则继续使用Cost_Index。
我尝试了以下操作,但均未成功:
DF1$Cost_Index <- ifelse(DF1$Airlines == DF2$Airlines & DF1$HeadQ == DF2$HeadQ
& DF1$Date < DF2$Date, 0, DF1$Cost_Index)
Warning:
1: In DF1$Airlines == DF2$Airlines : longer object
length is not a multiple of shorter object length".
2: In<=.default(DF1$Date, DF2$Date) : longer object length is not a
multiple of shorter object length
DF1:
Airlines HeadQ Date Cost_Index
American PHX 07-31-2016 220
American ATL 08-31-2016 0
American ATL 10-31-2016 0
Delta ATL 10-31-2016 0
American ATL 08-31-2017 200
谁能指出我正确的方向?
笔记:
str(DF1$Date): Date, format: "2016-10-31"
str(DF2$Date): Date, format: "2016-08-31"
你可以做一个左连接的DF1
,并DF2
在Airlines
与HeadQ
列,然后更改的价值Cost_Index
基础上,如果Date.x
(从日DF1
)小于Date.y
(日期从DF2)。随着dplyr
语法(如果额外的检查Date.y
是NA
按照你的逻辑,如果有在DF2不匹配,则应用于此Cost_Index
应保持不变):
library(dplyr)
# convert the Date column to Date class for comparison
df1$Date <- as.Date(df1$Date, "%m-%d-%Y")
df2$Date <- as.Date(df2$Date, "%m-%d-%Y")
df1 %>%
left_join(df2, by = c("Airlines", "HeadQ")) %>%
mutate(Cost_Index = ifelse(Date.x < Date.y & !is.na(Date.y), 0, Cost_Index),
Date = Date.x) %>%
select(-Date.x, -Date.y)
# Airlines HeadQ Cost_Index Date
# 1 American PHX 220 2016-07-31
# 2 American ATL 0 2016-08-31
# 3 American ATL 150 2016-10-31
# 4 Delta ATL 180 2016-10-31
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句