我正在使用循环绘制多个基因的散点图。每个基因产生多个png文件。每个基因/ png文件包含两个散点图:左侧为Group1,右侧为Group2。每个组都包含健康和不健康的样本。到目前为止,我已经成功地成功导出了代码。
但是,我现在要做的是为每个健康和不健康的组在每个“时间点”的x轴上添加样本数。这基于“样本”列。对于每个时间点,它应显示为“((健康状态下的样品数量,不健康状态下的样品数量)”)。谁能帮助我实现这一目标?
我当前的2个基因的示例数据框'data'如下:
Biomarkers TimePoint Group Scale Readings Condition samples
Gene1 52.5 Group1 25 0.027 Healthy 33
Gene1 52.5 Group2 25 0.024 Healthy 35
Gene1 57.5 Group1 25 0.029 Healthy 39
Gene1 57.5 Group2 25 0.023 Healthy 46
Gene1 62.5 Group1 25 0.030 Healthy 38
Gene1 62.5 Group2 25 0.024 Healthy 42
Gene1 67.5 Group1 25 0.033 Healthy 23
Gene1 67.5 Group2 25 0.026 Healthy 16
Gene2 52.5 Group1 25 0.051 Healthy 33
Gene2 52.5 Group2 25 0.046 Healthy 35
Gene2 57.5 Group1 25 0.052 Healthy 39
Gene2 57.5 Group2 25 0.048 Healthy 46
Gene2 62.5 Group1 25 0.049 Healthy 38
Gene2 62.5 Group2 25 0.051 Healthy 42
Gene2 67.5 Group1 25 0.051 Healthy 23
Gene2 67.5 Group2 25 0.052 Healthy 16
Gene1 52.5 Group1 25.01 0.026 Unhealthy 41
Gene1 52.5 Group2 25.01 0.023 Unhealthy 57
Gene1 57.5 Group1 25.01 0.027 Unhealthy 79
Gene1 57.5 Group2 25.01 0.024 Unhealthy 70
Gene1 62.5 Group1 25.01 0.030 Unhealthy 93
Gene1 62.5 Group2 25.01 0.025 Unhealthy 84
Gene1 67.5 Group1 25.01 0.033 Unhealthy 98
Gene1 67.5 Group2 25.01 0.022 Unhealthy 64
Gene2 52.5 Group1 25.01 0.043 Unhealthy 36
Gene2 52.5 Group2 25.01 0.044 Unhealthy 57
Gene2 57.5 Group1 25.01 0.043 Unhealthy 79
Gene2 57.5 Group2 25.01 0.043 Unhealthy 70
Gene2 62.5 Group1 25.01 0.043 Unhealthy 93
Gene2 62.5 Group2 25.01 0.044 Unhealthy 84
Gene2 67.5 Group1 25.01 0.044 Unhealthy 98
Gene2 67.5 Group2 25.01 0.044 Unhealthy 64
Gene1 52.5 Group1 50 0.035 Healthy 33
Gene1 52.5 Group2 50 0.029 Healthy 35
Gene1 57.5 Group1 50 0.039 Healthy 39
Gene1 57.5 Group2 50 0.031 Healthy 46
Gene1 62.5 Group1 50 0.038 Healthy 38
Gene1 62.5 Group2 50 0.030 Healthy 42
Gene1 67.5 Group1 50 0.040 Healthy 23
Gene1 67.5 Group2 50 0.035 Healthy 16
Gene2 52.5 Group1 50 0.058 Healthy 33
Gene2 52.5 Group2 50 0.053 Healthy 35
Gene2 57.5 Group1 50 0.059 Healthy 39
Gene2 57.5 Group2 50 0.056 Healthy 46
Gene2 62.5 Group1 50 0.057 Healthy 38
Gene2 62.5 Group2 50 0.058 Healthy 42
Gene2 67.5 Group1 50 0.061 Healthy 23
Gene2 67.5 Group2 50 0.058 Healthy 16
Gene1 52.5 Group1 50.01 0.038 Unhealthy 41
Gene1 52.5 Group2 50.01 0.030 Unhealthy 57
Gene1 57.5 Group1 50.01 0.038 Unhealthy 79
Gene1 57.5 Group2 50.01 0.031 Unhealthy 70
Gene1 62.5 Group1 50.01 0.040 Unhealthy 93
Gene1 62.5 Group2 50.01 0.032 Unhealthy 84
Gene1 67.5 Group1 50.01 0.043 Unhealthy 98
Gene1 67.5 Group2 50.01 0.033 Unhealthy 64
Gene2 52.5 Group1 50.01 0.052 Unhealthy 36
Gene2 52.5 Group2 50.01 0.051 Unhealthy 57
Gene2 57.5 Group1 50.01 0.052 Unhealthy 79
Gene2 57.5 Group2 50.01 0.051 Unhealthy 70
Gene2 62.5 Group1 50.01 0.052 Unhealthy 93
Gene2 62.5 Group2 50.01 0.052 Unhealthy 84
Gene2 67.5 Group1 50.01 0.053 Unhealthy 98
Gene2 67.5 Group2 50.01 0.051 Unhealthy 64
Gene1 52.5 Group1 75 0.045 Healthy 33
Gene1 52.5 Group2 75 0.038 Healthy 35
Gene1 57.5 Group1 75 0.048 Healthy 39
Gene1 57.5 Group2 75 0.041 Healthy 46
Gene1 62.5 Group1 75 0.047 Healthy 38
Gene1 62.5 Group2 75 0.040 Healthy 42
Gene1 67.5 Group1 75 0.050 Healthy 23
Gene1 67.5 Group2 75 0.043 Healthy 16
Gene2 52.5 Group1 75 0.066 Healthy 33
Gene2 52.5 Group2 75 0.064 Healthy 35
Gene2 57.5 Group1 75 0.065 Healthy 39
Gene2 57.5 Group2 75 0.064 Healthy 46
Gene2 62.5 Group1 75 0.068 Healthy 38
Gene2 62.5 Group2 75 0.071 Healthy 42
Gene2 67.5 Group1 75 0.070 Healthy 23
Gene2 67.5 Group2 75 0.071 Healthy 16
Gene1 52.5 Group1 75.01 0.057 Unhealthy 41
Gene1 52.5 Group2 75.01 0.041 Unhealthy 57
Gene1 57.5 Group1 75.01 0.056 Unhealthy 79
Gene1 57.5 Group2 75.01 0.040 Unhealthy 70
Gene1 62.5 Group1 75.01 0.057 Unhealthy 93
Gene1 62.5 Group2 75.01 0.043 Unhealthy 84
Gene1 67.5 Group1 75.01 0.059 Unhealthy 98
Gene1 67.5 Group2 75.01 0.046 Unhealthy 64
Gene2 52.5 Group1 75.01 0.063 Unhealthy 36
Gene2 52.5 Group2 75.01 0.060 Unhealthy 57
Gene2 57.5 Group1 75.01 0.061 Unhealthy 79
Gene2 57.5 Group2 75.01 0.062 Unhealthy 70
Gene2 62.5 Group1 75.01 0.062 Unhealthy 93
Gene2 62.5 Group2 75.01 0.062 Unhealthy 84
Gene2 67.5 Group1 75.01 0.061 Unhealthy 98
Gene2 67.5 Group2 75.01 0.060 Unhealthy 64
Gene1 52.5 Group1 100 0.056 Healthy 33
Gene1 52.5 Group2 100 0.046 Healthy 35
Gene1 57.5 Group1 100 0.063 Healthy 39
Gene1 57.5 Group2 100 0.048 Healthy 46
Gene1 62.5 Group1 100 0.060 Healthy 38
Gene1 62.5 Group2 100 0.052 Healthy 42
Gene1 67.5 Group1 100 0.064 Healthy 23
Gene1 67.5 Group2 100 0.055 Healthy 16
Gene2 52.5 Group1 100 0.082 Healthy 33
Gene2 52.5 Group2 100 0.074 Healthy 35
Gene2 57.5 Group1 100 0.070 Healthy 39
Gene2 57.5 Group2 100 0.075 Healthy 46
Gene2 62.5 Group1 100 0.074 Healthy 38
Gene2 62.5 Group2 100 0.078 Healthy 42
Gene2 67.5 Group1 100 0.080 Healthy 23
Gene2 67.5 Group2 100 0.075 Healthy 16
Gene1 52.5 Group1 100.01 0.090 Unhealthy 41
Gene1 52.5 Group2 100.01 0.060 Unhealthy 57
Gene1 57.5 Group1 100.01 0.093 Unhealthy 79
Gene1 57.5 Group2 100.01 0.053 Unhealthy 70
Gene1 62.5 Group1 100.01 0.089 Unhealthy 93
Gene1 62.5 Group2 100.01 0.057 Unhealthy 84
Gene1 67.5 Group1 100.01 0.089 Unhealthy 98
Gene1 67.5 Group2 100.01 0.065 Unhealthy 64
Gene2 52.5 Group1 100.01 0.074 Unhealthy 36
Gene2 52.5 Group2 100.01 0.074 Unhealthy 57
Gene2 57.5 Group1 100.01 0.077 Unhealthy 79
Gene2 57.5 Group2 100.01 0.078 Unhealthy 70
Gene2 62.5 Group1 100.01 0.073 Unhealthy 93
Gene2 62.5 Group2 100.01 0.073 Unhealthy 84
Gene2 67.5 Group1 100.01 0.072 Unhealthy 98
Gene2 67.5 Group2 100.01 0.074 Unhealthy 64
我的数据的Dput是:
dput(data)
structure(list(Biomarkers = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Gene1",
"Gene2"), class = "factor"), TimePoint = c(52.5, 52.5, 57.5,
57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5,
67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5,
52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5,
62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5,
67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5,
57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5,
62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5,
52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5,
57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5,
67.5, 67.5, 52.5, 52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5,
52.5, 57.5, 57.5, 62.5, 62.5, 67.5, 67.5, 52.5, 52.5, 57.5, 57.5,
62.5, 62.5, 67.5, 67.5), Group = structure(c(1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L,
1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L, 1L, 2L), .Label = c("Group1",
"Group2"), class = "factor"), Scale = c(25, 25, 25, 25, 25, 25,
25, 25, 25, 25, 25, 25, 25, 25, 25, 25, 25.01, 25.01, 25.01,
25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01, 25.01,
25.01, 25.01, 25.01, 25.01, 50, 50, 50, 50, 50, 50, 50, 50, 50,
50, 50, 50, 50, 50, 50, 50, 50.01, 50.01, 50.01, 50.01, 50.01,
50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01, 50.01,
50.01, 50.01, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75, 75,
75, 75, 75, 75, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01,
75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01, 75.01,
100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100, 100,
100, 100, 100, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01,
100.01, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01, 100.01,
100.01, 100.01), Readings = c(0.027, 0.024, 0.029, 0.023, 0.03,
0.024, 0.033, 0.026, 0.051, 0.046, 0.052, 0.048, 0.049, 0.051,
0.051, 0.052, 0.026, 0.023, 0.027, 0.024, 0.03, 0.025, 0.033,
0.022, 0.043, 0.044, 0.043, 0.043, 0.043, 0.044, 0.044, 0.044,
0.035, 0.029, 0.039, 0.031, 0.038, 0.03, 0.04, 0.035, 0.058,
0.053, 0.059, 0.056, 0.057, 0.058, 0.061, 0.058, 0.038, 0.03,
0.038, 0.031, 0.04, 0.032, 0.043, 0.033, 0.052, 0.051, 0.052,
0.051, 0.052, 0.052, 0.053, 0.051, 0.045, 0.038, 0.048, 0.041,
0.047, 0.04, 0.05, 0.043, 0.066, 0.064, 0.065, 0.064, 0.068,
0.071, 0.07, 0.071, 0.057, 0.041, 0.056, 0.04, 0.057, 0.043,
0.059, 0.046, 0.063, 0.06, 0.061, 0.062, 0.062, 0.062, 0.061,
0.06, 0.056, 0.046, 0.063, 0.048, 0.06, 0.052, 0.064, 0.055,
0.082, 0.074, 0.07, 0.075, 0.074, 0.078, 0.08, 0.075, 0.09, 0.06,
0.093, 0.053, 0.089, 0.057, 0.089, 0.065, 0.074, 0.074, 0.077,
0.078, 0.073, 0.073, 0.072, 0.074), Condition = structure(c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Healthy",
"Unhealthy"), class = "factor"), samples = c(33L, 35L, 39L, 46L,
38L, 42L, 23L, 16L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 41L,
57L, 79L, 70L, 93L, 84L, 98L, 64L, 36L, 57L, 79L, 70L, 93L, 84L,
98L, 64L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 33L, 35L, 39L,
46L, 38L, 42L, 23L, 16L, 41L, 57L, 79L, 70L, 93L, 84L, 98L, 64L,
36L, 57L, 79L, 70L, 93L, 84L, 98L, 64L, 33L, 35L, 39L, 46L, 38L,
42L, 23L, 16L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 41L, 57L,
79L, 70L, 93L, 84L, 98L, 64L, 36L, 57L, 79L, 70L, 93L, 84L, 98L,
64L, 33L, 35L, 39L, 46L, 38L, 42L, 23L, 16L, 33L, 35L, 39L, 46L,
38L, 42L, 23L, 16L, 41L, 57L, 79L, 70L, 93L, 84L, 98L, 64L, 36L,
57L, 79L, 70L, 93L, 84L, 98L, 64L)), class = "data.frame", row.names = c(NA,
-128L))
我现在拥有的代码是这样的:
# Load libraries
library(ggplot2)
library(magrittr)
library(dplyr)
library(gridExtra)
library(grid)
proc_plot <- function(sub) {
data_Group1 <- sub[sub$Group == "Group1", ]
data_Group2 <- sub[sub$Group == "Group2", ]
min_rdg <- min(data_Group1$Readings, data_Group2$Readings)
max_rdg <- max(data_Group1$Readings, data_Group2$Readings)
# Group1
graph_Group1 <- ggplot(data_Group1, aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title="Group1", x="Time point", y="Readings") +
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1", "2", "3", "4")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale),size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
# Group2
graph_Group2 <- ggplot(data_Group2, aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title="Group2", x="Time point", y="Readings") +
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1", "2", "3", "4")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale), size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
png (paste0("ScatterPlot_", sub$Biomarkers[[1]], ".png"), height=600, width=1111)
output <- grid.arrange(graph_Group1, graph_Group2, nrow = 1,
top=textGrob(sub$Biomarkers[[1]], gp=gpar(fontsize=20)))
dev.off()
return(output)
}
# BUILD PLOT LIST AND PNG FILES
plot_list <- by(data, data$Biomarkers, proc_plot)
dev.off()
grid.draw(plot_list$Gene1)
dev.off()
grid.draw(plot_list$Gene2)
我还在下面附加了Gene1的示例png文件。我已经用红色手动添加了数字以突出显示并显示出每个基因/ png文件确实需要的数字(但黑色)。
任何帮助表示赞赏。感谢您。
您可以\n
在标签中使用换行符。例如,
scale_x_continuous(breaks = c(52.5, 57.5, 62.5, 67.5),
labels = c("1\n(33, 41)", "2\n(39, 79)", "3\n(38, 93)", "4\n(23, 98)"))
您可以通过编程方式执行以下操作:
lab_df = data_Group1 %>% group_by(TimePoint) %>%
summarize(label = sprintf("(%s, %s)", first(samples[Condition == "Healthy"]), first(samples[Condition == "Unhealthy"])))
lab_df
# # A tibble: 4 x 2
# TimePoint label
# <dbl> <chr>
# 1 52.5 (33, 41)
# 2 57.5 (39, 79)
# 3 62.5 (38, 93)
# 4 67.5 (23, 98)
ggplot(...) + ... +
scale_x_continuous(
breaks = lab_df$TimePoint,
labels = paste(1:nrow(lab_df), lab_df$label, sep = "\n")
)
全方位服务解决方案。简化为使用for
循环而不是单独处理组,而是通过程序处理标签。
proc_plot <- function(sub) {
lab_df = sub %>% group_by(TimePoint, Group) %>%
summarize(label = sprintf(
"(%s, %s)",
first(samples[Condition == "Healthy"]),
first(samples[Condition == "Unhealthy"])
)) %>%
arrange(Group, TimePoint) # make sure things are in order
min_rdg <- min(sub$Readings)
max_rdg <- max(sub$Readings)
graphs = list()
for (i in unique(sub$Group)) {
this_lab = lab_df[lab_df$Group == i, ]
graphs[[i]] = ggplot(sub[sub$Group == i, ], aes(x = TimePoint, y = Readings, group = Scale)) +
labs(title = i, x = "Time point", y = "Readings") +
scale_x_continuous(breaks = this_lab$TimePoint,
labels = paste(1:nrow(this_lab), this_lab$label, sep = "\n")) +
geom_line(aes(color = Scale, linetype=Condition), na.rm = TRUE, size = 0.8) +
geom_point(aes(color = Scale),size = 2.5, na.rm = TRUE) +
scale_color_continuous(name = "Scale", breaks = c(25, 50, 75, 100)) +
scale_y_continuous(limits = c(min_rdg, max_rdg)) +
theme(legend.key.height = unit(2.3, "cm"))
}
png (paste0("ScatterPlot_", sub$Biomarkers[[1]], ".png"), height=600, width=1111)
output <- grid.arrange(grobs = graphs, nrow = 1,
top = textGrob(sub$Biomarkers[[1]], gp = gpar(fontsize = 20)))
dev.off()
return(output)
}
proc_plot(sub[sub$Biomarkers == "Gene1", ])
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句