Ordering a data frame/bar chart by 2-coordinate data points - ggplot2

Harry Smith

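My data is summarized by 2-coordinate data points (e.g. [0,2]). However, my data frame and bar chart are sorted alphabetically, even though the coordinate is a factor.

Default data frame/ggplot behavior: [0,1], [0,13], [0,2]. What I want: [0,1], [0,2], [0,13].

The coordinate variable is created by pasting together the numbers from 2 columns: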

mutate(swimlane_coord = factor(paste0("[", sl_subsection_index, ",", sl_element_index, "]")))

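where sl_subsection_index and sl_element_index are both integers.

Any combination of coordinates can occur, so I want to avoid defining the factor levels by hand.

Here is a sample of the data: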

structure(list(application_type1 = c("SamsungTV", "SamsungTV", 
"SamsungTV", "SamsungTV", "SamsungTV", "SamsungTV", "SamsungTV", 
"SamsungTV", "SamsungTV", "SamsungTV", "SamsungTV", "SamsungTV", 
"SamsungTV", "SamsungTV"), variant_uuid = structure(c(1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Control", 
"BackNav"), class = "factor"), allStreamSec = c("curatedCatalog", 
"curatedCatalog", "curatedCatalog", "curatedCatalog", "curatedCatalog", 
"curatedCatalog", "curatedCatalog", "curatedCatalog", "curatedCatalog", 
"curatedCatalog", "curatedCatalog", "curatedCatalog", "curatedCatalog", 
"curatedCatalog"), swimlane_coord = structure(c(1L, 2L, 8L, 9L, 
10L, 21L, 1L, 2L, 8L, 9L, 10L, 11L, 25L, 29L), .Label = c("[0,0]", 
"[0,1]", "[0,10]", "[0,11]", "[0,12]", "[0,13]", "[0,14]", "[0,2]", 
"[0,3]", "[0,4]", "[0,5]", "[0,6]", "[0,7]", "[0,8]", "[0,9]", 
"[1,0]", "[1,1]", "[1,3]", "[1,4]", "[1,5]", "[1,7]", "[2,0]", 
"[2,11]", "[3,1]", "[3,11]", "[3,2]", "[3,5]", "[3,6]", "[3,7]", 
"[3,8]"), class = "factor"), ESPerVisitBySL = c(1.775, 1.83333333333333, 
0.976190476190476, 0.966666666666667, 1.08333333333333, 1, 1.33333333333333, 
1.45161290322581, 1.68965517241379, 1.44827586206897, 1.5, 1, 
1, 1), UESPerVisitBySL = c(13, 16.4, 8.80952380952381, 8.4, 9.33333333333333, 
1, 11.5555555555556, 17.741935483871, 16.3448275862069, 8.10344827586207, 
15.3571428571429, 6, 7, 2)), row.names = c(NA, -14L), groups = structure(list(
    application_type1 = c("SamsungTV", "SamsungTV"), variant_uuid = structure(1:2, .Label = c("Control", 
    "BackNav"), class = "factor"), allStreamSec = c("curatedCatalog", 
    "curatedCatalog"), .rows = structure(list(1:6, 7:14), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, -2L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

Note that [3,11] comes before [3,2].

The only packages I have loaded are tidyverse and data.table.

Thanks, Harry

Stefan

To achieve your desired result you could

arrange your data frame by sl_subsection_index and sl_element_index
after doing so, set the order of the swimlane_coord levels using forcats::fct_inorder

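library(ggplot2)
library(dplyr)
library(forcats)

# d is assumed to be the sample data frame from the question
d %>%
  # drop the grouping so the later steps apply to the whole table
  ungroup() %>%
  # recover the two indices from the "[a,b]" label (still character here)
  mutate(
    sl_subsection_index = gsub("^\\[(\\d+),\\d+\\]$", "\\1", swimlane_coord),
    sl_element_index = gsub("^\\[\\d+,(\\d+)\\]$", "\\1", swimlane_coord)
  ) %>%
  # sort numerically rather than alphabetically
  arrange(as.integer(sl_subsection_index), as.integer(sl_element_index)) %>%
  # set the factor levels to their order of first appearance
  mutate(swimlane_coord = forcats::fct_inorder(factor(swimlane_coord))) %>%
  ggplot(aes(swimlane_coord)) +
  geom_bar()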
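
If the two integer columns are still available when swimlane_coord is created, the levels could also be put in numeric order at that point, so nothing has to be parsed back out of the label. The following is only a base R sketch of that idea: coord_factor is a hypothetical helper name, and the toy data frame is made up for illustration.

```r
# Hypothetical helper (not part of any package): build the "[a,b]" label
# and order its factor levels by the two integers.
coord_factor <- function(subsection, element) {
  label <- paste0("[", subsection, ",", element, "]")
  # order() sorts by the first integer, then by the second;
  # unique() keeps one entry per coordinate, in that numeric order.
  lvls <- unique(label[order(subsection, element)])
  factor(label, levels = lvls)
}

# Toy data, made up for illustration and deliberately unsorted.
toy <- data.frame(
  sl_subsection_index = c(0L, 3L, 0L, 3L, 0L),
  sl_element_index    = c(13L, 11L, 2L, 2L, 1L)
)
toy$swimlane_coord <- coord_factor(toy$sl_subsection_index,
                                   toy$sl_element_index)

levels(toy$swimlane_coord)
# "[0,1]"  "[0,2]"  "[0,13]" "[3,2]"  "[3,11]"
```

The row order of the data is left untouched; only the levels change, which is what ggplot2 uses to order a discrete axis. Inside a dplyr pipeline the same helper could be called as mutate(swimlane_coord = coord_factor(sl_subsection_index, sl_element_index)), preferably after ungroup() so that the levels are computed over the whole table rather than per group.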