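I want to replace the levels in df1 that match lab_pt in the data frame lookup_df with the corresponding level from the second column of lookup_df (here, lab_en), while leaving every other value unchanged. Many thanks!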
--
Main data frame
df1 <- data.frame(
num_var = sample(200, 15),
col1 = rep(c("onda","estrela","rato","caneta","ceu"), 3),
col2 = rep(c("muro","gato","pa","rato","ceu"), 3),
col3 = rep(c("surf","onda","dente","onda","sei"), 3),
col4 = rep(c("onda","casa",NA,"nao","net"), 3))
Lookup data frame
lookup_df <- data.frame(
lab_pt = c("onda","estrela","rato","caneta","ceu"),
lab_en = c("wave","star","rat","pen","sky"))
I have tried the code below. It does the job, but values without a match are converted to NA, which I do not want.
rownames(lookup_df) <- lookup_df$lab_pt
apply(df1[,2:ncol(df1)], 2, function(x) lookup_df[as.character(x),]$lab_en)
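The NA values come from the row-name indexing itself: asking a data frame for a row name that does not exist returns a row of NA. A minimal sketch on a toy vector (the names `lk` and `x` are made up for illustration and are not part of the data above):

```r
# Toy lookup table indexed by row name (hypothetical names, illustration only)
lk <- data.frame(lab_pt = c("onda", "rato"),
                 lab_en = c("wave", "rat"),
                 stringsAsFactors = FALSE)
rownames(lk) <- lk$lab_pt

x <- c("onda", "muro", "rato")

# "muro" is not a row name of lk, so the lookup returns NA for it
res <- lk[x, ]$lab_en
print(res)
```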
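The post linked here is very similar, but in that case every level could be matched, unlike in mine. Many thanks! Related post: "Replace values in a dataframe based on lookup table"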
library(tibble) # provides tibble()
# Fake dataframe
df1 <- tibble(
num_var = sample(200, 15),
col1 = rep(c("onda","estrela","rato","caneta","ceu"), 3),
col2 = rep(c("muro","gato","pa","rato","ceu"), 3),
col3 = rep(c("surf","onda","dente","onda","sei"), 3),
col4 = rep(c("onda","casa",NA,"nao","net"), 3))
# Lookup dictionary dataframe
lookup_dat <- tibble(
lab_pt = c("onda","estrela","rato","caneta","ceu"),
lab_en = c("wave","star","rat","pen","sky"))
#******************************************************************
#
# Translation by replacement of lookup dictionary
# Developed to generate Rmd report with labels of plots in different languages
replace_level <- function(df, lookup_df, col_langu_in, col_langu_out) {
  # Replace the levels of df that match lookup_df[[col_langu_in]] with the
  # value on the same row of lookup_df[[col_langu_out]].
  # Levels without a match are left unchanged.
  # !!!! col_langu_in and col_langu_out must be quoted column names
  # 1) Build a dictionary-style named vector from the reference df (2 cols)
  lookup_vec <- setNames(as.character(lookup_df[[col_langu_out]]),
                         lookup_df[[col_langu_in]])
  # 2) Iterate over the column names of the main df
  for (i in names(df)) { # select cols?: names(df)[sapply(df, is.factor)]
    # Convert character columns to factor before the translation
    if (is.character(df[[i]])) { df[[i]] <- as.factor(df[[i]]) }
    # Skip columns that have no levels (e.g. numeric ones)
    if (!is.factor(df[[i]])) next
    # 3) Index of the levels of this column that appear in the dictionary
    index_match <- which(levels(df[[i]]) %in% names(lookup_vec))
    # 4) Replace the matching levels, found in step 3), with their translation
    levels(df[[i]])[index_match] <-
      lookup_vec[levels(df[[i]])[index_match]]
  }
  return(df)
}
# test here
replace_level(df1, lookup_dat, "lab_pt", "lab_en")
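If the character columns should stay character instead of becoming factors, one possible alternative is a base R sketch built on `match()`. It is not part of the function above. The helper name `translate_keep` and the small data frames `df_small` and `lk` are hypothetical and only serve the illustration.

```r
# Translate a vector, leaving values without a match (and NA) untouched
translate_keep <- function(x, from, to) {
  x <- as.character(x)
  idx <- match(x, as.character(from))          # position in the lookup, NA when absent
  ifelse(is.na(idx), x, as.character(to)[idx]) # fall back to the original value
}

# Small hypothetical data, illustration only
df_small <- data.frame(num_var = 1:3,
                       col1 = c("onda", "muro", NA),
                       stringsAsFactors = FALSE)
lk <- data.frame(lab_pt = c("onda", "rato"),
                 lab_en = c("wave", "rat"),
                 stringsAsFactors = FALSE)

# Apply the translation to the character columns only
char_cols <- names(df_small)[sapply(df_small, is.character)]
df_small[char_cols] <- lapply(df_small[char_cols], translate_keep,
                              from = lk$lab_pt, to = lk$lab_en)
print(df_small)
```

Here "onda" becomes "wave", while "muro" and the NA entry are returned as they were, and the numeric column is not touched.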