I'm having trouble removing \" from my text.
Here is an example of the data I have:
Date Text
15/03/2015 \"My name is Jane. I \" am a girl.
20/03/2015 Hi, \"I am bored\". Are you\"?
I would like to get this output (with the \" removed):
Date Text
15/03/2015 My name is Jane. I am a girl.
20/03/2015 Hi, I am bored. Are you?
Here is one of the code snippets I tried:
library(tm)

text <- c(" \"My name is Jane. I \" am a girl.",
          "Hi, \"I am bored\". Are you\"? ")
# Keep only letters, digits, whitespace, and the characters ? | . ,
text <- gsub("[^[:alnum:][:space:]?|.|,]", "", text, perl = TRUE)
# Build a corpus from the files in the Demo/Corpus folder
cname <- file.path("~", "Desktop", "Demo", "Corpus")
length(dir(cname))
dir(cname)
a <- Corpus(DirSource(cname))
test <- DocumentTermMatrix(a)
findFreqTerms(test)
The output I get is:
[1]\"My
[2]name
[3]is
[4]Jane
[5]I
[6]\"
[7]am
[8]a
[9]girl.
[10]Hi,
[11]\"I
[12]am
[13]bored\".
[14]Are
[15]you\"?
You need to escape the backslash and the quote. Maybe try:
text <- c(" \"My name is Jane. I \" am a girl.",
"Hi, \"I am bored\". Are you\"? ")
output <- gsub("\\\"","",text)
output
[1] " My name is Jane. I am a girl." "Hi, I am bored. Are you? "
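Note that the output above still has leading and trailing spaces, and removing the quote after "I" leaves two spaces in its place. Here is a minimal sketch of one way to get exactly the output asked for in the question, using base R only. It relies on the fact that \" inside an R string literal is a single double-quote character, so a fixed (non-regex) match is enough. The helper name strip_quotes is made up for this illustration and does not come from any package.

```r
# Sample text from the question; \" inside an R string literal is one " character.
text <- c(" \"My name is Jane. I \" am a girl.",
          "Hi, \"I am bored\". Are you\"? ")

# strip_quotes is a hypothetical helper name used only for this sketch.
strip_quotes <- function(x) {
  x <- gsub("\"", "", x, fixed = TRUE)  # drop every literal double quote
  x <- gsub("[[:space:]]+", " ", x)     # collapse runs of whitespace left behind
  trimws(x)                             # remove leading and trailing whitespace
}

cleaned <- strip_quotes(text)
print(cleaned)
```

If the text lives in a tm corpus rather than a character vector, one option is probably to wrap the same function in content_transformer() and apply it with tm_map() before building the DocumentTermMatrix. In the question's code the gsub() result is never applied to the corpus read from disk, which may be why the quotes still show up in the terms.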