I'm new to R, so this is probably obvious.
My code so far:
rm(list=ls())
kdata = read.table("data_fra_klassen_v20.txt", header = TRUE)
library(openxlsx)
kdata = read.xlsx("data_fra_klassen_v20.xlsx")
head(kdata)
This is the dataset:
gender shoe height colour
Man 43 176 Green
Woman 36 166 Brown
Man 43 182 Other
Man 36 151 Brown
Woman 43 183 Blue
Man 44 184 Blue
Woman 38 164 Brown
Woman 37 160 Brown
Man 41 175 Brown
I'm trying to find the mean and the median for each gender.
I was thinking maybe something like this:
heightmen = kdata$height[kdata$gender=="Man"]
mean(heightmen)
But it doesn't seem to find any values.
You can use the dplyr package in R to do this:
Using mutate:
library(dplyr)
df %>%
  group_by(gender) %>%
  mutate(mean_height = mean(height)) %>%
  mutate(median_height = median(height)) %>%
  select(gender, mean_height, median_height) %>%
  unique()
Or using summarise:
df %>%
  group_by(gender) %>%
  summarise(mean_height = mean(height), median_height = median(height))
# A tibble: 2 x 3
# Groups:   gender [2]
#   gender mean_height median_height
#   <fct>        <dbl>         <dbl>
# 1 Man           174.           176
# 2 Woman         168.           165
df <- structure(list(gender = structure(c(1L, 2L, 1L, 1L, 2L, 1L, 2L, 2L, 1L), .Label = c("Man", "Woman"), class = "factor"), shoe = c(43L,36L, 43L, 36L, 43L, 44L, 38L, 37L, 41L), height = c(176L, 166L,182L, 151L, 183L, 184L, 164L, 160L, 175L), colour = structure(c(3L,2L, 4L, 2L, 1L, 1L, 2L, 2L, 2L), .Label = c("Blue", "Brown", "Green", "Other"), class = "factor")), class = "data.frame", row.names = c(NA,-9L))
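For comparison, here is a minimal base R sketch of the same calculation, without dplyr. It rebuilds only the gender and height columns so it runs on its own; the names height_men, mean_by_gender and median_by_gender are made up for illustration. The subsetting idea from the question does work on this data. If it returns nothing on the real file, one possible cause (an assumption, since the question doesn't show it) is that the stored labels differ from "Man", for example through stray whitespace or different capitalisation. table() shows what is actually in the column.

```r
# Same gender/height values as in df, rebuilt so the snippet is self-contained
df <- data.frame(
  gender = c("Man", "Woman", "Man", "Man", "Woman", "Man", "Woman", "Woman", "Man"),
  height = c(176, 166, 182, 151, 183, 184, 164, 160, 175)
)

# Inspect the labels actually stored in the column before comparing to "Man"
table(df$gender)

# The subsetting approach from the question (height_men is an illustrative name)
height_men <- df$height[df$gender == "Man"]
mean(height_men)
median(height_men)

# Both genders at once: apply mean/median to height within each gender group
mean_by_gender <- tapply(df$height, df$gender, mean)
median_by_gender <- tapply(df$height, df$gender, median)
mean_by_gender
median_by_gender
```

If table() shows labels with extra spaces, something like df$gender <- trimws(df$gender) before the comparison may help.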