R: How do I split a vector of strings with differing numbers of elements into multiple vectors?

赫尔比施

I have a vector like this:

> vector <- c("1: this is the first sentence2: this is the second sentence", "1: this is the first sentence2: this is the second sentence3: this is the third sentence")

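> vector
[1] "1: this is the first sentence2: this is the second sentence"
[2] "1: this is the first sentence2: this is the second sentence3: this is the third sentence"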

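I want to split each string into parts while keeping the delimiters (the "1:", "2:", ... markers). What I'm struggling with is that the resulting vectors have different lengths. I would like an output data frame like this: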

                   first_vector                  second_vector                  third_vector
1 1: this is the first sentence 2: this is the second sentence                          <NA>
2 1: this is the first sentence 2: this is the second sentence 3: this is the third sentence

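I have tried various splitting functions from different packages, but haven't managed to get the desired result. Any help is much appreciated. Thanks.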

Dave Armstrong

You can do this with str_extract_all() from the stringr package:

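library(stringr)

# Match a digit, a colon, then everything up to the next digit
# (this assumes the sentences themselves contain no digits)
res <- str_extract_all(vector, "\\d:[^\\d]*")
# Length of the longest result, used to pad the shorter ones with NA
maxlen <- max(sapply(res, length))
res <- t(sapply(res, function(x)c(x, rep(NA, maxlen-length(x)))))
colnames(res) <- c("first_vector", "second_vector", "third_vector")
res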
#      first_vector                    second_vector                    third_vector                   
# [1,] "1: this is the first sentence" "2: this is the second sentence" NA                             
# [2,] "1: this is the first sentence" "2: this is the second sentence" "3: this is the third sentence"
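
If you would rather avoid the extra package, or want a data frame instead of a matrix, here is a minimal base R sketch of the same idea. The helper name split_numbered() and the generic column names part_1, part_2, ... are my own choices for illustration, not part of the question or of any package. It assumes every piece starts with digits followed by a colon, and that no such pattern occurs inside a sentence (a time like "10:30" would be split too).

```r
# Example input from the question (named `x` here to avoid shadowing base::vector)
x <- c(
  "1: this is the first sentence2: this is the second sentence",
  "1: this is the first sentence2: this is the second sentence3: this is the third sentence"
)

# Hypothetical helper: split each string in front of every "<digits>:" marker
# and pad the shorter results with NA so they fit into one data frame.
split_numbered <- function(x) {
  # Put a newline in front of each marker, then split on that newline
  marked <- gsub("([0-9]+:)", "\n\\1", x)
  parts <- strsplit(marked, "\n", fixed = TRUE)
  # Drop the empty piece that appears before the first marker
  parts <- lapply(parts, function(p) p[nzchar(p)])
  maxlen <- max(lengths(parts))
  # `length<-` extends each vector to maxlen, filling with NA
  padded <- lapply(parts, `length<-`, maxlen)
  out <- as.data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)
  names(out) <- paste0("part_", seq_len(maxlen))
  out
}

res_df <- split_numbered(x)
res_df
```

Because the column names are generated from the longest result, this does not need to know in advance how many pieces there are; rename the columns afterwards if you want first_vector, second_vector and so on.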
