Defining and classifying separate networks in R


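I have a problem that I can't optimise. I'm sure igraph or tidygraph must already have this functionality, or that there is a better way to do it. I'm using R and igraph for this, but tidygraph could probably do the job as well.

Problem: how do I split a list of over two million edges (node 1 - links to - node 2) into its separate, independent networks, and then label each network with the category of its highest-weighted node?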

Data:

Edges:

1 2
3 4
5 6
7 6
8 6

This creates 3 networks. NB: in the real data we have loops and multiple edges going to and from nodes (this is why I'm using igraph, as it deals with these easily).
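
As a small illustration of the note above, here is a sketch on made-up edges (not the real data). The self-loop on node 6 and the repeated 7-6 edge are hypothetical additions to the example edge list; they leave the number of connected components unchanged.

```r
library(igraph)

# Example edge list plus a hypothetical self-loop (6-6) and a repeated edge (7-6)
from <- c(1, 3, 5, 7, 8, 6, 7)
to   <- c(2, 4, 6, 6, 6, 6, 6)
edges_multi <- data.frame(from, to)

g_multi <- graph_from_data_frame(edges_multi, directed = FALSE)

# Confirm that the graph really contains a loop and a multi-edge
has_loop  <- any(which_loop(g_multi))
has_multi <- any(which_multiple(g_multi))

# Connected components are unaffected by loops and repeated edges
comp <- components(g_multi)
comp$no      # number of separate networks
comp$csize   # number of nodes in each network
```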

[Image: Networks]

Data: node categories:

id  cat               weight
1   traffic accident  10
2   abuse             50
3   abuse             50
4   speeding          5
5   murder            100
6   abuse             50
7   speeding          5
8   abuse             50

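Final table: the final table lists each node with its category and labels each network with the largest node weight found in it.

id  idcat             networkid  networkcat
1   traffic accident  1          50
2   abuse             1          50
3   abuse             2          50
4   speeding          2          50
5   murder            3          100
6   abuse             3          100
7   speeding          3          100
8   abuse             3          100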

Current iterative solution and code: if there isn't a better solution, can we at least speed up the iteration?

library(tidyverse) # attaches dplyr and purrr
library(igraph)
library(purrr) #might be an answer
library(tidygraph) #might be an answer

from <- c(1,3,5,7,8)
to <- c(2,4,6,6,6)
edges <- data.frame(from,to)

id <- c(1,2,3,4,5,6,7,8)
cat <- c("traffic accident","abuse","abuse","speeding","murder","abuse","speeding","abuse")
weight <- c(10,50,50,5,100,50,5,50)

details <- data.frame(id,cat,weight) 

# we could add the vertex details here as well, with
# g <- graph_from_data_frame(edges, vertices = details), but we join these in later
g <- graph_from_data_frame(edges)
plot(g)

dg <- decompose(g)# decomposing the network defines the separate networks 

networks <- data.frame(id=as.integer(),
                   network_id=as.integer())

for (i in seq_along(dg)) { # this is likely too many to do at once. As the networks are already defined we can split this into chunks. There is a case here for parallelisation
  n <- dg[[i]][1] %>% # using the decomposed list of graphs from igraph. There is an issue here as the list comes back with the node as an index. I can't find an easier way to get this out
    as.data.frame() %>% # I can't work a way to bring out the data without changing to df and then using row names
    row.names() %>% # and this returns a vector
    as.data.frame() %>% 
    rename(id=1) %>% 
    mutate(network_id = i,
           id=as.integer(id))

  networks <- bind_rows(n, networks)
}  

networks <- networks %>% 
  inner_join(details) # one way to bring in details

n_weight <- networks %>%
  group_by(network_id) %>% 
  summarise(network_weight=max(weight))

networks <- networks %>% 
  inner_join(n_weight)

networks # final answer

filtered_n_id <- networks %>% 
  filter(network_weight==100) %>% 
  select(network_id) %>% 
  distinct() # this brings out just the network IDs of whatever we happen to want

filtered_n <- networks %>% 
  filter(network_id %in% filtered_n_id$network_id)

edges %>% 
  filter(from %in% filtered_n$id | to %in% filtered_n$id ) %>% 
  graph_from_data_frame() %>% 
  plot() # returns only the network/s that we want to view
G5W

Here is a solution using only igraph and base R.

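# membership is named by vertex name; look it up by id so it lines up with the id vector
# (the vertex order of g need not match the order of id)
networkid <- unname(components(g)$membership[as.character(id)])
Table <- aggregate(weight, list(networkid), max) # maximum weight per network
networkcat <- Table$x[networkid]
Final <- data.frame(id, idcat=cat, networkid, networkcat)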

Final
  id            idcat networkid networkcat
1  1 traffic accident         1         50
2  2            abuse         1         50
3  3            abuse         2         50
4  4         speeding         2         50
5  5           murder         3        100
6  6            abuse         3        100
7  7         speeding         3        100
8  8            abuse         3        100
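
The same idea can be wrapped in a function that does not depend on the ids being 1..n or on the vertex order of the graph. This is only a sketch under stated assumptions, not part of the answer above: `label_networks()` and the `network_id` / `network_weight` column names are hypothetical, and `details` is assumed to hold one row per node with `id` and `weight` columns. It also uses `induced_subgraph()` as one possible way to keep only the networks of interest, in place of re-filtering the edge list.

```r
library(igraph)

# Hypothetical helper: label every node with its network and that network's max weight
label_networks <- function(edges, details) {
  # The first column of `details` is used as the vertex name
  g <- graph_from_data_frame(edges, directed = FALSE, vertices = details)
  # membership is named by vertex name, so look it up by id rather than by position
  membership <- components(g)$membership
  details$network_id <- unname(membership[as.character(details$id)])
  # ave() repeats the per-network maximum for every row of that network
  details$network_weight <- ave(details$weight, details$network_id, FUN = max)
  list(graph = g, table = details)
}

# The example data from the question
edges <- data.frame(from = c(1, 3, 5, 7, 8), to = c(2, 4, 6, 6, 6))
details <- data.frame(
  id = 1:8,
  cat = c("traffic accident", "abuse", "abuse", "speeding",
          "murder", "abuse", "speeding", "abuse"),
  weight = c(10, 50, 50, 5, 100, 50, 5, 50)
)

res <- label_networks(edges, details)
res$table

# Keep only the networks whose maximum weight is 100
keep_ids <- res$table$id[res$table$network_weight == 100]
g_sub <- induced_subgraph(res$graph, as.character(keep_ids))
```

Because `components()` is called once on the whole graph, no loop over the decomposed networks is needed; whether this is fast enough for two million edges would have to be checked on the real data.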
