str_extract() and summarise() give me an NA row

pkpto39

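This should be pretty simple, since I only want to verify what I am seeing.

I am trying to use str_extract() to pull words of interest out of a column in a data frame and then count how often each word occurs. When I do this, though, the resulting data frame has NA listed in one of its rows. This confuses me because I don't know what causes it, or whether it points to an error in my code. I'm not sure how to fix it.

Also, note that the last item in words is "the table is light", which in this example contains two of the words of interest. I did this on purpose, because I want to make sure it gets counted twice.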

library(tidyverse)

df <- data.frame(words =c("paper book", "food press", "computer monitor", "my fancy speakers",
                 "my two dogs", "the old couch", "the new couch", "loud speakers", 
                 "wasted paper", "put the dishes away", "set the table", "put it on the table", 
                 "lets go to church", "turn out the lights", "why are the lights on",
                 "the table is light"))

keep <- c("dogs|paper|table|light|couch")

new_df <- df %>% 
  mutate(Subject = str_extract(words, keep), n = n()) %>% 
  group_by(Subject)%>%
  summarise(`Word Count` = length(Subject))

This is what I am getting right now:

 Subject `Word Count`
  <chr>          <int>
1 couch              2
2 dogs               1
3 light              2
4 paper              2
5 table              3
6 NA                 6

So my question is: what is causing the NA row in Subject? Is it all of the other records?

Ronak Shah

The NAs appear in rows where none of the words in keep occur, so there is nothing to extract.

library(dplyr)
library(stringr)

df %>%  mutate(Subject = str_extract(words, keep))

#                   words Subject
#1             paper book   paper
#2             food press    <NA>
#3       computer monitor    <NA>
#4      my fancy speakers    <NA>
#5            my two dogs    dogs
#6          the old couch   couch
#7          the new couch   couch
#8          loud speakers    <NA>
#9           wasted paper   paper
#10   put the dishes away    <NA>
#11         set the table   table
#12   put it on the table   table
#13     lets go to church    <NA>
#14   turn out the lights   light
#15 why are the lights on   light
#16    the table is light   table

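For example, the second row, 'food press', contains none of the values in "dogs|paper|table|light|couch", so str_extract() returns NA for it. The six rows with no match are then grouped together by group_by(Subject), which is where the NA row with a count of 6 comes from.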
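If the goal is a table without the NA row, and one in which "the table is light" contributes to both table and light, the following is a minimal sketch of one way to get there. It rests on a few assumptions that are not part of the answer above: it loads tidyr for unnest(), it uses count() in place of group_by() plus summarise(), and the names first_match and all_matches are made up for illustration. Note that str_extract() only ever returns the first match in each string, which is why the last row was counted as table only.

```r
library(dplyr)
library(stringr)
library(tidyr)   # assumption: needed here only for unnest()

df <- data.frame(words = c("paper book", "food press", "computer monitor", "my fancy speakers",
                           "my two dogs", "the old couch", "the new couch", "loud speakers",
                           "wasted paper", "put the dishes away", "set the table", "put it on the table",
                           "lets go to church", "turn out the lights", "why are the lights on",
                           "the table is light"))

keep <- "dogs|paper|table|light|couch"

# Option 1: keep only the first match per row, and drop rows with no match
# so that the NA group never reaches the summary.
first_match <- df %>%
  mutate(Subject = str_extract(words, keep)) %>%
  filter(!is.na(Subject)) %>%
  count(Subject, name = "Word Count")

# Option 2: extract every match per row. str_extract_all() returns a list
# column; unnest() gives one row per match and drops rows with zero matches.
all_matches <- df %>%
  mutate(Subject = str_extract_all(words, keep)) %>%
  unnest(Subject) %>%
  count(Subject, name = "Word Count")

print(first_match)  # light is counted 2 times, table 3 times
print(all_matches)  # light is counted 3 times, table 3 times
```

With the second option, "the table is light" adds one to table and one to light, so light goes from 2 to 3 while the other counts stay the same. The pattern still matches substrings, so "lights" counts as light in both versions; wrapping the alternatives in word boundaries would change that, if that is not what you want.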


Edited on 2021-01-26
