I have a vector like this:
> vector <- c("1: this is the first sentence2: this is the second sentence", "1: this is the first sentence2: this is the second sentence3: this is the third sentence")
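> vector
[1] "1: this is the first sentence2: this is the second sentence"
[2] "1: this is the first sentence2: this is the second sentence3: this is the third sentence"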
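I want to split each string into parts while keeping the delimiters. What I am struggling with is that the resulting vectors have different lengths. I would like an output data frame like this: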
first_vector second_vector third_vector
1 1: this is the first sentence 2: this is the second sentence <NA>
2 1: this is the first sentence 2: this is the second sentence 3: this is the third sentence
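I have tried various splitting functions from different packages but have not managed to get the desired result. Any help is much appreciated. Thanks.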
You can do this with str_extract_all() from the stringr package:
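library(stringr)

# Extract each "digit: text" piece; the delimiter stays with its sentence
res <- str_extract_all(vector, "\\d:[^\\d]*")
# Pad shorter results with NA so every row has the same length
maxlen <- max(lengths(res))
res <- t(sapply(res, function(x) c(x, rep(NA, maxlen - length(x)))))
colnames(res) <- c("first_vector", "second_vector", "third_vector")
res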
# first_vector second_vector third_vector
# [1,] "1: this is the first sentence" "2: this is the second sentence" NA
# [2,] "1: this is the first sentence" "2: this is the second sentence" "3: this is the third sentence"
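The result above is a character matrix, while the question asks for a data frame. As a rough sketch of one way to get there without stringr, the same extraction can be done with base R's regmatches() and gregexpr(). The names pieces, padded and res_df, and the generated part_1, part_2, ... column names, are placeholders chosen for illustration; they do not come from the question. The pattern assumes single-digit labels and sentences that contain no digits, as in the example data.

```r
vector <- c(
  "1: this is the first sentence2: this is the second sentence",
  "1: this is the first sentence2: this is the second sentence3: this is the third sentence"
)

# Base R counterpart of str_extract_all(): each match is one digit, a colon,
# and everything up to (but not including) the next digit.
pieces <- regmatches(vector, gregexpr("[0-9]:[^0-9]*", vector))

# Pad every element to the longest length; `length<-` fills the gap with NA.
maxlen <- max(lengths(pieces))
padded <- lapply(pieces, `length<-`, maxlen)

# Bind the rows into a matrix, then convert to a data frame.
# Column names are generated (hypothetical), so any number of parts works.
res_df <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
names(res_df) <- paste0("part_", seq_len(maxlen))
res_df
```

Generating the column names from maxlen avoids hard-coding three names, which would fail if a string had a fourth part. If the sentences themselves can contain digits, the pattern would need to be adapted.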