Here is my code. The data set is artificially generated to simulate data similar to my actual problem.
Code:
library(ggplot2)
DataSet1 <- data.frame("Cat" = rep("A",10000), "Bin" = rep(c(-49:50),100),
"Value" = c(seq(0,4.9, by=0.1),
seq(4.9,0, by=-0.1)) * rep(rnorm(100,50,1),100))
DataSet2 <- data.frame("Cat" = rep("B",10000), "Bin" = rep(c(-49:50),100),
"Value" = c(seq(0,4.9, by=0.1),
seq(4.9,0, by=-0.1)) * rep(rnorm(100,75,1),100))
DataSet3 <- data.frame("Cat" = rep("C",10000), "Bin" = rep(c(-49:50),100),
"Value" = c(seq(0,4.9, by=0.1),
seq(4.9,0, by=-0.1)) * rep(rnorm(100,100,1),100))
DataSet <- rbind(DataSet1, DataSet2, DataSet3)
d <- ggplot(data = DataSet, aes(Bin, Value, color = Cat))
# fun.y was renamed to fun in ggplot2 3.3.0; on older versions use fun.y = sum
d + stat_summary(fun = sum, geom = 'step', size = 1)
My result:
What I want to do:
Normalize each of these curves, i.e., divide the sum in each bin by the total Value for that curve, so that each curve's values add up to 1.
As far as I am aware, stat_summary is not meant to operate over all values of x and y simultaneously, so this type of per-group summary isn't possible strictly within ggplot. In cases such as this, it's usually best to compute your summary ahead of time and then plot that. Using dplyr to make summarization easy:
library(dplyr)
DataSet <- DataSet %>%
group_by(Cat, Bin) %>%
summarize(Value = sum(Value)) %>%
group_by(Cat) %>%
mutate(Value = Value / sum(Value))
d <- ggplot(data = DataSet, aes(Bin, Value, color = Cat))
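# The data now has one row per Cat/Bin, so mean just passes each value through.
# fun.y was renamed to fun in ggplot2 3.3.0; on older versions use fun.y = mean
d + stat_summary(fun = mean, geom = 'step', size = 1)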
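If you want to sanity-check the normalization without dplyr, here is a minimal base R sketch of the same two steps (sum per bin, then divide by the per-category total). The toy data frame and the names toy, binned, Share and totals are made up for illustration and are not from the question; treat it as one possible way to check the result, not the only one.

```r
# Hypothetical toy data (not the question's data):
# two categories, three bins, two rows per bin.
toy <- data.frame(
  Cat   = rep(c("A", "B"), each = 6),
  Bin   = rep(rep(1:3, each = 2), times = 2),
  Value = c(1, 1, 2, 2, 4, 4,
            3, 3, 3, 3, 6, 6)
)

# Step 1: sum Value within each (Cat, Bin) pair.
binned <- aggregate(Value ~ Cat + Bin, data = toy, FUN = sum)

# Step 2: divide each bin sum by the total for its category.
# ave() returns the per-category total repeated for every row of that category.
binned$Share <- binned$Value / ave(binned$Value, binned$Cat, FUN = sum)

# After normalization, the shares within each category should add up to 1.
totals <- tapply(binned$Share, binned$Cat, sum)
print(totals)
```

The same check applies to the dplyr result above: summing the normalized Value column within each Cat should give 1 for every category.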