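I am trying to create a logical variable in a data.table by comparing one string column against two or more other string columns, and I need NA values to be ignored.
Sample data for D2: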
library(data.table)
# dput() output with the non-parseable .internal.selfref pointer dropped; setDT() restores the data.table class
D2 <- setDT(structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1",
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1",
"char1", "char1")), row.names = c(NA, -3L), class = "data.frame"))
I tried the following suggested solution:
D2[, Match := apply(sapply(.SD, `==`, D2[, "var1"]), 1, any), .SDcols =
c("var2", "var3")]
The result for a003 is TRUE, but it should be FALSE because var1 and var3 do not match:
structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1",
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1",
"char1", "char1"), Match = c(TRUE, TRUE, TRUE)), row.names = c(NA,
-3L), class = c("data.table", "data.frame"), .internal.selfref = <pointer:
0x0000015eb1261ef0>)
Desired result:
structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1",
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1",
"char1", "char1"), Match = c(TRUE, TRUE, FALSE)), row.names = c(NA,
-3L), class = c("data.table", "data.frame"), .internal.selfref = <pointer:
0x0000015eb1261ef0>)
How about the following:
setDT(D1)
D1[, Match := apply(sapply(.SD, `==`, D1[, "var1"]), 1, any), .SDcols = c("var2", "var3")]
D1
#ID var1 var2 var3 Match
#1: a001 char1 char1 char1 TRUE
#2: a002 char1 <NA> char1 TRUE
#3: a003 char2 char1 char1 FALSE
Explanation: we compare the columns of the data.table selected via .SDcols against D1[, "var1"]; if any of them matches, the result is TRUE, otherwise FALSE.
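To see why any and all give different answers here, and what na.rm changes, the per-row comparisons can be reproduced by hand. The following is a minimal base R sketch using the values of rows a003 and a002 from the question's sample data; the names cmp_a003 and cmp_a002 are illustrative only and do not appear in the original code.

```r
# Row a003 of the question's data: var1 = "char2", var2 = "char2", var3 = "char1"
cmp_a003 <- c(var2 = "char2" == "char2", var3 = "char1" == "char2")
any(cmp_a003)  # TRUE: a single matching column is enough
all(cmp_a003)  # FALSE: var3 differs from var1

# Row a002: var2 is NA, so that comparison yields NA rather than TRUE/FALSE
cmp_a002 <- c(var2 = NA_character_ == "char1", var3 = "char1" == "char1")
all(cmp_a002)                # NA: the missing value propagates
all(cmp_a002, na.rm = TRUE)  # TRUE: NA is dropped before testing
```

This suggests that any explains the unexpected TRUE for a003, while all combined with na.rm = TRUE matches the desired output.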
Following your comment, you can require every non-NA column to match instead:
setDT(D1)
D1[, Match := apply(sapply(.SD, `==`, D1[, "var1"]), 1, all, na.rm = TRUE), .SDcols = c("var2", "var3")]
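As a possible alternative, the same "all non-NA columns must equal var1" rule can be written without apply by combining column-wise logical vectors. This is only a sketch, not the answerer's method: it rebuilds the question's sample data as D2, and the helper name cols is an assumption introduced for illustration. It also assumes var1 itself is never NA.

```r
library(data.table)

# Sample data from the question
D2 <- data.table(
  ID   = c("a001", "a002", "a003"),
  var1 = c("char1", "char1", "char2"),
  var2 = c("char1", NA, "char2"),
  var3 = c("char1", "char1", "char1")
)

cols <- c("var2", "var3")  # hypothetical name: columns compared against var1

# For each column: TRUE where the value is NA (ignored) or equals var1.
# Reduce(`&`, ...) then requires that condition to hold in every column.
D2[, Match := Reduce(`&`, lapply(.SD, function(x) is.na(x) | x == var1)),
   .SDcols = cols]

D2$Match
```

Because is.na(x) | x == var1 is evaluated on whole columns, this avoids building an intermediate matrix and looping over rows, which may matter on larger tables.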