我希望有人可以帮助我:)
我有一个约1000列的数据框。在其中,我有这样命名的列:X1,X2,X3,X4,X5,X6等... Y1,Y2,Y3,Y4,Y5,Y6等...
df <- data.frame("X1" = c("Yes","No","Yes","NA","NA","NA","Yes","No","Yes","NA","NA","NA","NA"),
"X2" = c("Yes","NA","NA","NA","NA","Yes","NA","NA","NA","NA","Yes","NA","NA"),
"X3" = c("Yes","NA","NA","NA","Yes","No","Yes","NA","Yes","NA","NA","NA", "Yes"),
"X4" = c("Yes","No","Yes","NA","NA","NA","Yes","No","Yes","NA","NA","NA","NA"),
"X5" = c("Yes","NA","NA","NA","NA","Yes","NA","NA","NA","NA","Yes","NA","NA"),
"X6" = c("Yes","NA","NA","NA","Yes","No","Yes","NA","Yes","NA","NA","NA", "Yes"),
"Y1" = c("Yes","No","Yes","NA","NA","NA","Yes","No","Yes","NA","NA","NA","NA"),
"Y2" = c("Yes","NA","NA","NA","NA","Yes","NA","NA","NA","NA","Yes","NA","NA"),
"Y3" = c("Yes","NA","NA","NA","Yes","No","Yes","NA","Yes","NA","NA","NA", "Yes"),
"Y4" = c("Yes","No","Yes","NA","NA","NA","Yes","No","Yes","NA","NA","NA","NA"),
"Y5" = c("Yes","NA","NA","NA","NA","Yes","NA","NA","NA","NA","Yes","NA","NA"),
"Y6" = c("Yes","NA","NA","NA","Yes","No","Yes","NA","Yes","NA","NA","NA", "Yes"))
在某些列中,我将“ Yes”替换为1,将“ No”替换为0,并将其他任何内容替换为NA。
我已经试过了:
names = c("X","Y")
for (name in names){
try(
for (j in 1:6){
j <- toString(j)
colname <- paste(name , j, sep="")
df$colname <- gsub("Yes", as.integer(1), df$colname)
df$colname <- gsub("No", as.integer(0), df$colname)
})}
但是,这不起作用,并引发错误消息:
Error in `$<-.data.frame`(`*tmp*`, "colname", value = character(0)) : replacement has 0 rows, data has 13
我的第一个问题是:为什么列名引用不正确?
第二个问题是:如何用“ NA”替换这些列中非0或1的任何内容?
这可能是我忽略的非常简单的事情,但是我不太清楚该怎么做。任何帮助将不胜感激。
在此先感谢,Rich
我不会在这里使用循环或gsub,您可以使用以下代码:
df[] <- lapply(df, function(x) x <- car::recode(x, "'Yes'=1; 'No'=0; 'NA'=NA"))
这将遍历数据框中的每一列,并根据需要重新编码值。如果将来获得更多价值,这也更容易扩展。
如果只需要某些列,则可以这样修改它:
df[, col_list] <- lapply(df[, col_list], function(x) x <- car::recode(x, "'Yes'=1; 'No'=0; 'NA'=NA"))
col_list
您要更改的变量的向量在哪里。您可以使用以下命令为他们grepcol_list <- grep('^X|Y', names(df), value = T)
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句