R: Compare strings in multiple columns with the string in a single column

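I am trying to create a logical variable in a data.table by comparing one string column with two or more other string columns, and I need to ignore NA values.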

Sample data for D2:

library(data.table)

# Same values as the original dput() output; the .internal.selfref pointer
# is dropped because it cannot be parsed back into R.
D2 <- data.table(
  ID   = c("a001", "a002", "a003"),
  var1 = c("char1", "char1", "char2"),
  var2 = c("char1", NA, "char2"),
  var3 = c("char1", "char1", "char1")
)

I tried the following suggested solution:

D2[, Match := apply(sapply(.SD, `==`, D2[, "var1"]), 1, any),
   .SDcols = c("var2", "var3")]

The result for a003 is TRUE, but it should be FALSE because var1 and var3 do not match:

structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1", 
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1", 
"char1", "char1"), Match = c(TRUE, TRUE, TRUE)), row.names = c(NA, 
-3L), class = c("data.table", "data.frame"))

Desired result:

structure(list(ID = c("a001", "a002", "a003"), var1 = c("char1", 
"char1", "char2"), var2 = c("char1", NA, "char2"), var3 = c("char1", 
"char1", "char1"), Match = c(TRUE, TRUE, FALSE)), row.names = c(NA, 
-3L), class = c("data.table", "data.frame"))

Maurits Evers

How about the following:

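library(data.table)
setDT(D1)  # convert D1 to a data.table by reference
# Compare var2 and var3 with var1 element-wise, then reduce each row with any()
D1[, Match := apply(sapply(.SD, `==`, D1[, "var1"]), 1, any), .SDcols = c("var2", "var3")]
D1
#     ID  var1  var2  var3 Match
#1: a001 char1 char1 char1  TRUE
#2: a002 char1  <NA> char1  TRUE
#3: a003 char2 char1 char1 FALSE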

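Explanation: we compare the columns of the data.table selected via .SDcols, element by element, with D1[, "var1"]; if any of the comparisons in a row is a match, the result is TRUE, otherwise FALSE.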
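To make the intermediate step concrete, here is a minimal sketch that builds the comparison matrix by hand and reduces it row by row. It uses the question's sample values in a hypothetical table named `DT`, and it compares against the plain vector `DT$var1` rather than `DT[, "var1"]`, which is a simplification assumed here and not part of the answer.

```r
library(data.table)

# Hypothetical table holding the question's sample values
DT <- data.table(
  ID   = c("a001", "a002", "a003"),
  var1 = c("char1", "char1", "char2"),
  var2 = c("char1", NA, "char2"),
  var3 = c("char1", "char1", "char1")
)
cols <- c("var2", "var3")

# One logical column per compared column; NA where the compared value is NA
m <- sapply(DT[, ..cols], `==`, DT$var1)
m
#      var2  var3
# [1,] TRUE  TRUE
# [2,]   NA  TRUE
# [3,] TRUE FALSE

# Row-wise reduction: any() needs one match, all() needs every comparison to match
any_match <- apply(m, 1, any)
all_match <- apply(m, 1, all, na.rm = TRUE)
```

With these values `any()` gives TRUE for all three rows, because a003 matches on var2, while `all(..., na.rm = TRUE)` gives TRUE, TRUE, FALSE. Note that `any(NA, FALSE)` evaluates to NA, so without `na.rm = TRUE` a row with an NA and no match would yield NA rather than FALSE.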


Update

Based on your comment, you can use all() instead of any() and drop the NA comparisons:

setDT(D1)
# all() requires every non-NA comparison in the row to match; na.rm = TRUE ignores NAs
D1[, Match := apply(sapply(.SD, `==`, D1[, "var1"]), 1, all, na.rm = TRUE), .SDcols = c("var2", "var3")]
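As a check, the sketch below applies the `all(..., na.rm = TRUE)` approach to the sample data from the question and adds a vectorised variant that avoids `apply()`. The column name `Match2` and the `Reduce()` formulation are illustrative assumptions, not part of the original answer, and `var1` is referenced directly inside `j` instead of `D2[, "var1"]`.

```r
library(data.table)

# Sample data from the question
D2 <- data.table(
  ID   = c("a001", "a002", "a003"),
  var1 = c("char1", "char1", "char2"),
  var2 = c("char1", NA, "char2"),
  var3 = c("char1", "char1", "char1")
)
cols <- c("var2", "var3")

# Row-wise all(), ignoring comparisons that evaluate to NA
D2[, Match := apply(sapply(.SD, `==`, var1), 1, all, na.rm = TRUE), .SDcols = cols]
D2$Match
# [1]  TRUE  TRUE FALSE

# Vectorised variant: an NA in a compared column counts as "no mismatch"
D2[, Match2 := Reduce(`&`, lapply(.SD, function(x) is.na(x) | x == var1)), .SDcols = cols]
identical(D2$Match, D2$Match2)
# [1] TRUE
```

One caveat with `na.rm = TRUE`: a row in which every compared column is NA yields TRUE, because `all()` of an empty logical vector is TRUE. If such rows should count as non-matches, they need separate handling.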
