Reorder the names in a column across a list of data frames according to a specific sequence of names

Neil

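I am trying to reorder the data frames in a list according to a specific sequence of names. These are the results of a complex ANOVA, which I have tried to replicate using the Cars93 dataset. I need to order the final ANOVA results according to a specific sequence of manufacturer names. For example, I would like every data frame in the list to follow a specific order of names: "Eagle", "Acura", "Buick", "Chevy", "Cadillac", "Dodge", "Chrysler", "Hyundai", "Ford", and so on.

anova_carmanuf_letters is the target/final list of data frames produced by this analysis. Within this list, I need to reorder the results according to the names in the first column, "Manufacturer". This needs to follow a specific order. Ideally it would be an order of my choosing, e.g. "Eagle", "Acura", "Buick", and so on, but even alphabetical order would be great.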

#Copy-paste and run this code in RStudio to end up with the anova_carmanuf_letters list with dataframes.
#libraries for ANOVA, including multiple comparisons, linear models and least square means.
library(MASS) #provides the Cars93 dataset
library(FSA)
library(car)
library(multcomp)
library(lsmeans)
library(multcompView)

#Using the Cars93 dataset, and keeping specific columns
carmanuf <- subset.data.frame(Cars93, select = c("Manufacturer", "Min.Price", "Price", "Max.Price", "MPG.city", "MPG.highway", "Cylinders", "EngineSize", "Horsepower", "RPM"))

#Names of columns with data to run ANOVA on
datanames <- names(carmanuf)[2:10]

#Using lapply to run least squares means, tukey post-hoc, etc on all parameters
model_carmanuf <- lapply(datanames, function(x) {
  lm(substitute(i ~ Manufacturer, list(i = as.name(x))), data = carmanuf)})

ls_carmanuf <- lapply(model_carmanuf, function(model_carmanuf) 
lsmeans(model_carmanuf, pairwise ~ Manufacturer, adjust = "tukey"))

anova_carmanuf_letters <- lapply(ls_carmanuf, function(ls_carmanuf) cld(ls_carmanuf[[1]], alpha = .05, Letters = letters, adjust = "tukey"))

#I am able to reorder one data frame at a time using the following code, and changing the [[1]] to [[2]] and so on. But it would be great if I could use lapply or a for loop to do it for all the data frames contained within this list, as my actual analysis has ~90 such data frames in a list.
anova_carmanuf_letters[[1]]$Manufacturer <- factor(anova_carmanuf_letters[[1]]$Manufacturer, levels = c("Suzuki", "Geo", "Saturn", "Hyundai", "Subaru", "Plymouth", "Ford", "Dodge", "Eagle", "Honda", "Pontiac", "Toyota", "Mercury", "Nissan", "Mitsubishi", "Chevrolet", "Mazda",  "Volkswagen", "Oldsmobile", "Chrylser", "Saab", "Buick",  "Acura", "Chrysler", "Volvo", "BMW", "Audi", "Lexus", "Lincoln", "Cadillac", "Mercedes-Benz", "Infiniti"))


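Parfait

Because the output is a list of data frames, specifically a list of "summary_emm" objects that extend the data.frame class, you can run ordinary data frame operations on it: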

str(anova_carmanuf_letters[[1]])
# Classes ‘summary_emm’ and 'data.frame':   32 obs. of  7 variables:
# $ Manufacturer: Factor w/ 32 levels "Suzuki","Geo",..: 1 2 3 4 5 6 7 8 9 10 ...
# $ lsmean      : num  7.3 9.1 9.2 9.32 11.37 ...
# $ SE          : num  5.93 4.19 5.93 2.96 3.42 ...
# $ df          : num  61 61 61 61 61 61 61 61 61 61 ...
# $ lower.CL    : num  -12.2807 -4.7457 -10.3807 -0.4654 0.0617 ...
# $ upper.CL    : num  26.9 22.9 28.8 19.1 22.7 ...
# $ .group      : chr  " abcd  " " ab    " " abcde " " a     " ...
# - attr(*, "estName")= chr "lsmean"
# - attr(*, "clNames")= chr  "lower.CL" "upper.CL"
# - attr(*, "pri.vars")= chr "Manufacturer"
# - attr(*, "adjust")= chr "tukey"
# - attr(*, "side")= num 0
# - attr(*, "delta")= num 0
# - attr(*, "type")= chr "link"
# - attr(*, "mesg")= chr  "Confidence level used: 0.95" "Conf-level adjustment: sidak method for 32 estimates" "P value adjustment: tukey method for comparing a family of 32 estimates" "significance level used: alpha = 0.05"

Specifically, consider using lapply to adjust the levels of the Manufacturer factor.

myvars <- c("Suzuki", "Geo", "Saturn", "Hyundai", "Subaru", "Plymouth", "Ford",
            "Dodge", "Eagle", "Honda", "Pontiac", "Toyota", "Mercury", "Nissan", 
            "Mitsubishi", "Chevrolet", "Mazda",  "Volkswagen", "Oldsmobile", 
            "Chrylser", "Saab", "Buick",  "Acura", "Chrysler", "Volvo", "BMW", 
            "Audi", "Lexus", "Lincoln", "Cadillac", "Mercedes-Benz", "Infiniti")

anova_carmanuf_letters <- lapply(anova_carmanuf_letters, function(df) 
            within(df, Manufacturer <- factor(Manufacturer, levels=myvars))
) 

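However, note that the command above does not visibly sort the rows; it only changes the levels of the factor column. Afterwards, use order to sort the rows by the new levels (the two lapply calls could even be combined):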

anova_carmanuf_letters <- lapply(anova_carmanuf_letters, function(df) 
            df[order(df$Manufacturer),])


anova_carmanuf_letters[[1]]

# Manufacturer   lsmean       SE df     lower.CL upper.CL  .group
# 29        Suzuki  7.30000 5.927581 61 -12.28072477 26.88072  abcd  
# 12           Geo  9.10000 4.191433 61  -4.74566327 22.94566  ab    
# 27        Saturn  9.20000 5.927581 61 -10.38072477 28.78072  abcde 
# 14       Hyundai  9.32500 2.963790 61  -0.46536239 19.11536  a     
# 28        Subaru 11.36667 3.422290 61   0.06172995 22.67160  ab    
# 24      Plymouth 11.40000 5.927581 61  -8.18072477 30.98072  abcde 
# 11          Ford 12.43750 2.095716 61   5.51466837 19.36033  a     
# 9          Dodge 12.51667 2.419925 61   4.52286925 20.51046  ab    
# 10         Eagle 12.70000 4.191433 61  -1.14566327 26.54566  abcd  
# 13         Honda 13.06667 3.422290 61   1.76172995 24.37160  abc   
# 25       Pontiac 13.28000 2.650895 61   4.52323367 22.03677  ab    
# 30        Toyota 14.02500 2.963790 61   4.23463761 23.81536  abc   
# 20       Mercury 14.10000 4.191433 61   0.25433673 27.94566  abcde 
# 22        Nissan 14.85000 2.963790 61   5.05963761 24.64036  abc   
# 21    Mitsubishi 15.05000 4.191433 61   1.20433673 28.89566  abcde 
# 6      Chevrolet 16.08750 2.095716 61   9.16466837 23.01033  abc   
# 18         Mazda 16.34000 2.650895 61   7.58323367 25.09677  abcd  
# 31    Volkswagen 16.45000 2.963790 61   6.65963761 26.24036  abcde 
# 23    Oldsmobile 16.55000 2.963790 61   6.75963761 26.34036  abcde 
# 7       Chrylser 18.40000 5.927581 61  -1.18072477 37.98072  abcdef
# 26          Saab 20.30000 5.927581 61   0.71927523 39.88072  abcdef
# 4          Buick 20.75000 2.963790 61  10.95963761 30.54036  abcdef
# 1          Acura 21.05000 4.191433 61   7.20433673 34.89566  abcdef
# 8       Chrysler 22.00000 4.191433 61   8.15433673 35.84566  abcdef
# 32         Volvo 23.30000 4.191433 61   9.45433673 37.14566  abcdef
# 3            BMW 23.70000 5.927581 61   4.11927523 43.28072  abcdef
# 2           Audi 28.35000 4.191433 61  14.50433673 42.19566  abcdef
# 16         Lexus 31.10000 4.191433 61  17.25433673 44.94566   bcdef
# 17       Lincoln 33.85000 4.191433 61  20.00433673 47.69566    cdef
# 5       Cadillac 35.25000 4.191433 61  21.40433673 49.09566     def
# 19 Mercedes-Benz 36.40000 4.191433 61  22.55433673 50.24566      ef
# 15      Infiniti 45.40000 5.927581 61  25.81927523 64.98072       f
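As a rough sketch of how the releveling and the sorting can be merged into a single lapply pass, the following uses small made-up data frames rather than the real cld() output. The names toy_list, toy_levels and reorder_by_levels are hypothetical and only serve the illustration. The same helper also covers the alphabetical fallback mentioned in the question.

```r
# Toy stand-ins for the real list; the values are made up for illustration.
toy_list <- list(
  data.frame(Manufacturer = c("Acura", "Buick", "Eagle"), lsmean = c(21.05, 20.75, 12.70)),
  data.frame(Manufacturer = c("Buick", "Eagle", "Acura"), lsmean = c(1, 2, 3))
)

# Hypothetical preferred order of names.
toy_levels <- c("Eagle", "Acura", "Buick")

# Hypothetical helper: reset the factor levels, then sort the rows by them.
reorder_by_levels <- function(df, lv) {
  df$Manufacturer <- factor(df$Manufacturer, levels = lv)
  df[order(df$Manufacturer), ]
}

# Custom order in one pass over the list.
toy_custom <- lapply(toy_list, reorder_by_levels, lv = toy_levels)

# Alphabetical order: derive the levels from the names present in each data frame.
toy_alpha <- lapply(toy_list, function(df)
  reorder_by_levels(df, sort(unique(as.character(df$Manufacturer)))))
```

One caveat with this approach: any name that is missing from the levels vector becomes NA in the factor and is sorted to the end, so it may be worth checking the vector against the data with setdiff() first.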

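To keep only the needed columns and rename .group to the corresponding original column name, use Map to loop elementwise through all 9 datanames and 9 data frames: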

new_anova_carmanuf_letters <- Map(function(df, nm) 
       setNames(df[c("Manufacturer", ".group")], 
                c("Manufacturer", nm)), 
       df = anova_carmanuf_letters, nm = datanames)
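To make the elementwise pairing easier to see, here is a minimal toy version of the same Map call. The objects toy_frames and toy_names are hypothetical stand-ins for the list of data frames and for datanames, and the values are invented.

```r
# Hypothetical toy inputs, not the real cld() output.
toy_frames <- list(
  data.frame(Manufacturer = c("Eagle", "Acura"), lsmean = c(12.7, 21.05),
             .group = c(" a", " b")),
  data.frame(Manufacturer = c("Eagle", "Acura"), lsmean = c(170, 140),
             .group = c(" a", " a"))
)
toy_names <- c("Price", "Horsepower")

# Map pairs the i-th data frame with the i-th name: keep two columns and
# rename .group after the variable that the data frame summarises.
toy_renamed <- Map(function(df, nm)
  setNames(df[c("Manufacturer", ".group")], c("Manufacturer", nm)),
  toy_frames, toy_names)

# Optionally label the list elements as well.
names(toy_renamed) <- toy_names
```

Here toy_renamed$Price has the columns Manufacturer and Price, and toy_renamed$Horsepower has Manufacturer and Horsepower, which assumes the two inputs have the same length and are in matching order.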
