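I often work with tables full of special characters (e.g. á, ľ, š, č, ť, ž, ý, í, é). I found a very useful function, mgsub, which performs multiple string replacements simultaneously. It works well on my vectors, but I am struggling to apply it to a whole data frame.
The function mgsub works like this: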
library(mgsub)
mgsub::mgsub("...A čo i tam dušu dáš v tom boji divokom: Mor ty len, a voľ nebyť, ako byť otrokom.",
pattern = c(".","A","č","š","á",":",",","ľ","ť","M"," "),
replacement = c("","a","c","s","a","","","","t","m",""), fixed = TRUE)
[1] "acoitamdusudasvtombojidivokommortylenavonebytakobytotrokom"
But how do I apply this function to a whole data.frame? For example, to this data.frame:
my.df <- data.frame(v1 = c("...A čo i tam dušu","dáš v tom boji"),
v2 = c("divokom:","Mor ty len,"),
v3 = c("a voľ nebyť,","ako byť otrokom."))
                  v1          v2               v3
1 ...A čo i tam dušu    divokom:     a voľ nebyť,
2     dáš v tom boji Mor ty len, ako byť otrokom.
I tried it with lapply, but it only gives an error:
data.frame(lapply(my.df, mgsub::mgsub,
pattern = c(".","A","č","š","á",":",",","ľ","ť","M"," "),
replacement = c("","a","c","s","a","","","","t","m",""), fixed = TRUE))
Error in nchar(string) : 'nchar()' requires a character vector
Any suggestions are welcome.
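The problem is that the columns are of class factor, while mgsub needs character input. According to ?mgsub:
string - a character vector where replacements are sought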
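To confirm the diagnosis, it can help to check the column classes and reproduce the error with base R alone. The sketch below assumes the data frame was built with factors enabled, so stringsAsFactors = TRUE is set explicitly here; the variable names col_classes, msg and n are only illustrative.

```r
# Recreate the data frame with factor conversion switched on explicitly
# (assumption: this mirrors the session in which the error occurred).
my.df <- data.frame(v1 = c("...A čo i tam dušu", "dáš v tom boji"),
                    v2 = c("divokom:", "Mor ty len,"),
                    v3 = c("a voľ nebyť,", "ako byť otrokom."),
                    stringsAsFactors = TRUE)

# Inspect the class of every column.
col_classes <- sapply(my.df, class)
print(col_classes)

# nchar() rejects factors; capture the message instead of stopping the script.
msg <- tryCatch(nchar(my.df$v1), error = function(e) conditionMessage(e))
print(msg)

# After coercing to character, the same call succeeds.
n <- nchar(as.character(my.df$v1))
print(n)
```

If col_classes reports "character" instead of "factor", the columns are already strings and the error has a different cause.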
Convert all columns to character class:
my.df[] <- lapply(my.df, as.character)
or use type.convert:
my.df <- type.convert(my.df, as.is = TRUE)
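or pass stringsAsFactors = FALSE when creating the data.frame, because the default for data.frame was stringsAsFactors = TRUE in R versions before 4.0.0 (from R 4.0.0 onward the default is FALSE):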
my.df <- data.frame(v1 = c("...A čo i tam dušu","dáš v tom boji"),
v2 = c("divokom:","Mor ty len,"),
v3 = c("a voľ nebyť,","ako byť otrokom."),
stringsAsFactors = FALSE)
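Once the columns are character, the replacement can be applied column by column. The following is a minimal sketch, not part of the mgsub package: apply_to_text_columns is a hypothetical helper that coerces factor and character columns to character, passes them through a supplied function, and leaves other column types untouched. The mgsub call is guarded so the sketch still runs when the package is not installed.

```r
# Hypothetical helper (an assumption for illustration, not an mgsub function):
# apply string function `f` to every factor or character column of `df`.
apply_to_text_columns <- function(df, f) {
  is_text <- vapply(df,
                    function(col) is.factor(col) || is.character(col),
                    logical(1))
  # Coerce to character first so functions like mgsub() accept the input.
  df[is_text] <- lapply(df[is_text], function(col) f(as.character(col)))
  df
}

my.df <- data.frame(v1 = c("...A čo i tam dušu", "dáš v tom boji"),
                    v2 = c("divokom:", "Mor ty len,"),
                    v3 = c("a voľ nebyť,", "ako byť otrokom."),
                    stringsAsFactors = TRUE)

pattern     <- c(".", "A", "č", "š", "á", ":", ",", "ľ", "ť", "M", " ")
replacement <- c("",  "a", "c", "s", "a", "",  "",  "",  "t", "m", "")

# Only run the mgsub step when the package is available.
if (requireNamespace("mgsub", quietly = TRUE)) {
  cleaned <- apply_to_text_columns(
    my.df,
    function(x) mgsub::mgsub(x, pattern = pattern,
                             replacement = replacement, fixed = TRUE)
  )
  print(cleaned)
}
```

Wrapping the call in an anonymous function keeps the helper independent of any particular replacement function, so a base-R function such as gsub can be swapped in the same way.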