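I am trying to reorder the data frames in a list according to a specific sequence of names. These are the results of a complex ANOVA, which I have tried to replicate using the Cars93 dataset. I need to sort the final ANOVA results by a specific sequence of manufacturer names. For example, I want every data frame in the list sorted in a specific order of names: "Eagle", "Acura", "Buick", "Chevy", "Cadillac", "Dodge", "Chrysler", "Hyundai", "Ford", ... and so on.
anova_carmanuf_letters is the target/final list of data frames produced by this analysis. Within this list, I need to reorder the results by the names in the first column, "Manufacturer". This has to follow a specific order. Ideally it would be an order of my choosing, e.g. "Eagle", "Acura", "Buick", ... and so on. But even alphabetical order would be great.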
#Copy-paste and run this code in RStudio to end up with the anova_carmanuf_letters list with dataframes.
#libraries for ANOVA, including multiple comparisons, linear models and least square means.
library(FSA)
library(car)
library(multcomp)
library(lsmeans)
library(multcompView)
#Using the Cars93 dataset (shipped with the MASS package), and keeping specific columns
library(MASS)
carmanuf <- subset.data.frame(Cars93, select = c("Manufacturer", "Min.Price", "Price", "Max.Price", "MPG.city", "MPG.highway", "Cylinders", "EngineSize", "Horsepower", "RPM"))
#Names of columns with data to run ANOVA on
datanames <- names(carmanuf)[2:10]
#Using lapply to run least squares means, tukey post-hoc, etc on all parameters
model_carmanuf <- lapply(datanames, function(x) {
lm(substitute(i ~ Manufacturer, list(i = as.name(x))), data = carmanuf)})
ls_carmanuf <- lapply(model_carmanuf, function(model_carmanuf)
lsmeans(model_carmanuf, pairwise ~ Manufacturer, adjust = "tukey"))
anova_carmanuf_letters <- lapply(ls_carmanuf, function(ls_carmanuf) cld(ls_carmanuf[[1]], alpha = .05, Letters = letters, adjust = "tukey"))
#I am able to reorder one data frame at a time using the following code, and changing the [[1]] to [[2]] and so on. But it would be great if I could use lapply or a for loop to do it for all the data frames contained within this list, as my actual analysis has ~90 such data frames in a list.
anova_carmanuf_letters[[1]]$Manufacturer <- factor(anova_carmanuf_letters[[1]]$Manufacturer, levels = c("Suzuki", "Geo", "Saturn", "Hyundai", "Subaru", "Plymouth", "Ford", "Dodge", "Eagle", "Honda", "Pontiac", "Toyota", "Mercury", "Nissan", "Mitsubishi", "Chevrolet", "Mazda", "Volkswagen", "Oldsmobile", "Chrylser", "Saab", "Buick", "Acura", "Chrysler", "Volvo", "BMW", "Audi", "Lexus", "Lincoln", "Cadillac", "Mercedes-Benz", "Infiniti"))
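Because the output is a list of data frames, specifically a list of objects of class "summary_emm", which extends data.frame, you can run ordinary data frame operations on it: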
str(anova_carmanuf_letters[[1]])
# Classes ‘summary_emm’ and 'data.frame': 32 obs. of 7 variables:
# $ Manufacturer: Factor w/ 32 levels "Suzuki","Geo",..: 1 2 3 4 5 6 7 8 9 10 ...
# $ lsmean : num 7.3 9.1 9.2 9.32 11.37 ...
# $ SE : num 5.93 4.19 5.93 2.96 3.42 ...
# $ df : num 61 61 61 61 61 61 61 61 61 61 ...
# $ lower.CL : num -12.2807 -4.7457 -10.3807 -0.4654 0.0617 ...
# $ upper.CL : num 26.9 22.9 28.8 19.1 22.7 ...
# $ .group : chr " abcd " " ab " " abcde " " a " ...
# - attr(*, "estName")= chr "lsmean"
# - attr(*, "clNames")= chr "lower.CL" "upper.CL"
# - attr(*, "pri.vars")= chr "Manufacturer"
# - attr(*, "adjust")= chr "tukey"
# - attr(*, "side")= num 0
# - attr(*, "delta")= num 0
# - attr(*, "type")= chr "link"
# - attr(*, "mesg")= chr "Confidence level used: 0.95" "Conf-level adjustment: sidak method for 32 estimates" "P value adjustment: tukey method for comparing a family of 32 estimates" "significance level used: alpha = 0.05"
Specifically, consider lapply to adjust the levels of the Manufacturer factor.
myvars <- c("Suzuki", "Geo", "Saturn", "Hyundai", "Subaru", "Plymouth", "Ford",
"Dodge", "Eagle", "Honda", "Pontiac", "Toyota", "Mercury", "Nissan",
"Mitsubishi", "Chevrolet", "Mazda", "Volkswagen", "Oldsmobile",
"Chrylser", "Saab", "Buick", "Acura", "Chrysler", "Volvo", "BMW",
"Audi", "Lexus", "Lincoln", "Cadillac", "Mercedes-Benz", "Infiniti")
anova_carmanuf_letters <- lapply(anova_carmanuf_letters, function(df)
within(df, Manufacturer <- factor(Manufacturer, levels=myvars))
)
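However, note that the command above does no visible sorting; it only changes the levels of the factor column. Afterwards, order the rows by the new levels (you could even combine the two lapply calls into one):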
anova_carmanuf_letters <- lapply(anova_carmanuf_letters, function(df)
df[order(df$Manufacturer),])
anova_carmanuf_letters[[1]]
# Manufacturer lsmean SE df lower.CL upper.CL .group
# 29 Suzuki 7.30000 5.927581 61 -12.28072477 26.88072 abcd
# 12 Geo 9.10000 4.191433 61 -4.74566327 22.94566 ab
# 27 Saturn 9.20000 5.927581 61 -10.38072477 28.78072 abcde
# 14 Hyundai 9.32500 2.963790 61 -0.46536239 19.11536 a
# 28 Subaru 11.36667 3.422290 61 0.06172995 22.67160 ab
# 24 Plymouth 11.40000 5.927581 61 -8.18072477 30.98072 abcde
# 11 Ford 12.43750 2.095716 61 5.51466837 19.36033 a
# 9 Dodge 12.51667 2.419925 61 4.52286925 20.51046 ab
# 10 Eagle 12.70000 4.191433 61 -1.14566327 26.54566 abcd
# 13 Honda 13.06667 3.422290 61 1.76172995 24.37160 abc
# 25 Pontiac 13.28000 2.650895 61 4.52323367 22.03677 ab
# 30 Toyota 14.02500 2.963790 61 4.23463761 23.81536 abc
# 20 Mercury 14.10000 4.191433 61 0.25433673 27.94566 abcde
# 22 Nissan 14.85000 2.963790 61 5.05963761 24.64036 abc
# 21 Mitsubishi 15.05000 4.191433 61 1.20433673 28.89566 abcde
# 6 Chevrolet 16.08750 2.095716 61 9.16466837 23.01033 abc
# 18 Mazda 16.34000 2.650895 61 7.58323367 25.09677 abcd
# 31 Volkswagen 16.45000 2.963790 61 6.65963761 26.24036 abcde
# 23 Oldsmobile 16.55000 2.963790 61 6.75963761 26.34036 abcde
# 7 Chrylser 18.40000 5.927581 61 -1.18072477 37.98072 abcdef
# 26 Saab 20.30000 5.927581 61 0.71927523 39.88072 abcdef
# 4 Buick 20.75000 2.963790 61 10.95963761 30.54036 abcdef
# 1 Acura 21.05000 4.191433 61 7.20433673 34.89566 abcdef
# 8 Chrysler 22.00000 4.191433 61 8.15433673 35.84566 abcdef
# 32 Volvo 23.30000 4.191433 61 9.45433673 37.14566 abcdef
# 3 BMW 23.70000 5.927581 61 4.11927523 43.28072 abcdef
# 2 Audi 28.35000 4.191433 61 14.50433673 42.19566 abcdef
# 16 Lexus 31.10000 4.191433 61 17.25433673 44.94566 bcdef
# 17 Lincoln 33.85000 4.191433 61 20.00433673 47.69566 cdef
# 5 Cadillac 35.25000 4.191433 61 21.40433673 49.09566 def
# 19 Mercedes-Benz 36.40000 4.191433 61 22.55433673 50.24566 ef
# 15 Infiniti 45.40000 5.927581 61 25.81927523 64.98072 f
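The re-leveling and ordering steps can also be folded into a single lapply call. Below is a minimal sketch on toy data; the list `toy_list` and the three-name order `wanted` are made up for illustration and are not taken from the Cars93 results.

```r
# Hypothetical toy data standing in for anova_carmanuf_letters
toy_list <- list(
  data.frame(Manufacturer = c("Acura", "Buick", "Eagle"),
             lsmean = c(21.05, 20.75, 12.70)),
  data.frame(Manufacturer = c("Buick", "Eagle", "Acura"),
             lsmean = c(1, 2, 3))
)

# Desired row order (illustrative only)
wanted <- c("Eagle", "Acura", "Buick")

reordered <- lapply(toy_list, function(df) {
  # Step 1: set the factor levels to the desired sequence
  df$Manufacturer <- factor(df$Manufacturer, levels = wanted)
  # Step 2: sort the rows by those levels
  df[order(df$Manufacturer), ]
})

as.character(reordered[[1]]$Manufacturer)
```

One thing to watch: any name that is missing from `levels` (for example "Chevy" when the data contains "Chevrolet") becomes NA in the factor, and order() places NA rows last by default, so it is worth checking the level vector against the actual names in the data.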
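To keep only the columns you need and rename them to the corresponding original column names, use Map to loop elementwise over all 9 items of datanames and the 9 data frames: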
new_anova_carmanuf_letters <- Map(function(df, nm)
setNames(df[c("Manufacturer", ".group")],
c("Manufacturer", nm)),
df = anova_carmanuf_letters, nm = datanames)
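To see what the Map call does in isolation, here is a small sketch on toy data. The data frames in `toy_list` and the names in `toy_names` are hypothetical stand-ins for anova_carmanuf_letters and datanames.

```r
# Hypothetical stand-ins for the list of results and the response names
toy_list <- list(
  data.frame(Manufacturer = c("Eagle", "Acura"),
             lsmean = c(12.70, 21.05),
             .group = c(" a", " ab")),
  data.frame(Manufacturer = c("Eagle", "Acura"),
             lsmean = c(15.0, 30.0),
             .group = c(" a", " b"))
)
toy_names <- c("Min.Price", "Price")

# Map pairs the i-th data frame with the i-th name
renamed <- Map(function(df, nm)
  setNames(df[c("Manufacturer", ".group")],
           c("Manufacturer", nm)),
  toy_list, toy_names)

names(renamed[[1]])
names(renamed[[2]])
```

Each element keeps only the Manufacturer column and the letter groupings, with the grouping column renamed after the response variable it belongs to.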