Reorder the names in a column across a list of data frames according to a specific sequence of names

Posted by Neil in Dev

Neil

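I am trying to reorder the data frames in a list according to a specific sequence of names. These are the results of a complex ANOVA, which I have tried to replicate using the Cars93 dataset. I need to order the final ANOVA results by a specific sequence of manufacturer names. For example, I want every data frame in the list sorted by a specific order of names: "Eagle", "Acura", "Buick", "Chevy", "Cadillac", "Dodge", "Chrysler", "Hyundai", "Ford", and so on.

anova_carmanuf_letters is the target/final list of data frames produced by this analysis. In this list, I need to reorder the results by the names in the first column, "Manufacturer". This has to follow a specific order. Ideally it would be an order of my choosing, e.g. "Eagle", "Acura", "Buick", and so on, but even alphabetical order would be great.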

#Copy-paste and run this code in RStudio to end up with the anova_carmanuf_letters list with dataframes.
#libraries for ANOVA, including multiple comparisons, linear models and least square means.
library(FSA)
library(car)
library(multcomp)
library(lsmeans)
library(multcompView)

#Using the Cars93 dataset (from the MASS package), and keeping specific columns
carmanuf <- subset.data.frame(Cars93, select = c("Manufacturer", "Min.Price", "Price", "Max.Price", "MPG.city", "MPG.highway", "Cylinders", "EngineSize", "Horsepower", "RPM"))

#Names of columns with data to run ANOVA on
datanames <- names(carmanuf)[2:10]

#Using lapply to run least squares means, tukey post-hoc, etc on all parameters
model_carmanuf <- lapply(datanames, function(x) {
  lm(substitute(i ~ Manufacturer, list(i = as.name(x))), data = carmanuf)})

ls_carmanuf <- lapply(model_carmanuf, function(model_carmanuf) 
lsmeans(model_carmanuf, pairwise ~ Manufacturer, adjust = "tukey"))

anova_carmanuf_letters <- lapply(ls_carmanuf, function(ls_carmanuf) cld(ls_carmanuf[[1]], alpha = .05, Letters = letters, adjust = "tukey"))

#I am able to reorder one data frame at a time using the following code, and changing the [[1]] to [[2]] and so on. But it would be great if I could use lapply or a for loop to do it for all the data frames contained within this list, as my actual analysis has ~90 such data frames in a list.
anova_carmanuf_letters[[1]]$Manufacturer <- factor(anova_carmanuf_letters[[1]]$Manufacturer, levels = c("Suzuki", "Geo", "Saturn", "Hyundai", "Subaru", "Plymouth", "Ford", "Dodge", "Eagle", "Honda", "Pontiac", "Toyota", "Mercury", "Nissan", "Mitsubishi", "Chevrolet", "Mazda",  "Volkswagen", "Oldsmobile", "Chrylser", "Saab", "Buick",  "Acura", "Chrysler", "Volvo", "BMW", "Audi", "Lexus", "Lincoln", "Cadillac", "Mercedes-Benz", "Infiniti"))

Parfait

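Since the output is a list of data frames, specifically a list of "summary_emm" class objects that extend data.frame, you can run ordinary data frame operations: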

str(anova_carmanuf_letters[[1]])
# Classes ‘summary_emm’ and 'data.frame':   32 obs. of  7 variables:
# $ Manufacturer: Factor w/ 32 levels "Suzuki","Geo",..: 1 2 3 4 5 6 7 8 9 10 ...
# $ lsmean      : num  7.3 9.1 9.2 9.32 11.37 ...
# $ SE          : num  5.93 4.19 5.93 2.96 3.42 ...
# $ df          : num  61 61 61 61 61 61 61 61 61 61 ...
# $ lower.CL    : num  -12.2807 -4.7457 -10.3807 -0.4654 0.0617 ...
# $ upper.CL    : num  26.9 22.9 28.8 19.1 22.7 ...
# $ .group      : chr  " abcd  " " ab    " " abcde " " a     " ...
# - attr(*, "estName")= chr "lsmean"
# - attr(*, "clNames")= chr  "lower.CL" "upper.CL"
# - attr(*, "pri.vars")= chr "Manufacturer"
# - attr(*, "adjust")= chr "tukey"
# - attr(*, "side")= num 0
# - attr(*, "delta")= num 0
# - attr(*, "type")= chr "link"
# - attr(*, "mesg")= chr  "Confidence level used: 0.95" "Conf-level adjustment: sidak method for 32 estimates" "P value adjustment: tukey method for comparing a family of 32 estimates" "significance level used: alpha = 0.05"

Specifically, consider lapply to adjust the levels of the Manufacturer factor.

myvars <- c("Suzuki", "Geo", "Saturn", "Hyundai", "Subaru", "Plymouth", "Ford",
            "Dodge", "Eagle", "Honda", "Pontiac", "Toyota", "Mercury", "Nissan", 
            "Mitsubishi", "Chevrolet", "Mazda",  "Volkswagen", "Oldsmobile", 
            "Chrylser", "Saab", "Buick",  "Acura", "Chrysler", "Volvo", "BMW", 
            "Audi", "Lexus", "Lincoln", "Cadillac", "Mercedes-Benz", "Infiniti")

anova_carmanuf_letters <- lapply(anova_carmanuf_letters, function(df) 
            within(df, Manufacturer <- factor(Manufacturer, levels=myvars))
)
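One caveat worth checking before releveling: factor() turns any value that is absent from levels into NA. That seems to be why the vector above keeps the "Chrylser" spelling; the output further down has a row for it, so it appears to be a genuine level in the data. The sketch below uses a made-up three-element vector (x, lev_short and missing_names are illustrative names, not part of the original code) to show the effect and a simple setdiff check.

```r
# Made-up values; "Chrylser" mimics the misspelled level seen in the data
x <- c("Eagle", "Chrylser", "Buick")

# A level vector that lacks the misspelled entry
lev_short <- c("Eagle", "Buick", "Chrysler")

# The unmatched element silently becomes NA
f <- factor(x, levels = lev_short)
print(f)

# List names that would be lost, before releveling
missing_names <- setdiff(unique(x), lev_short)
print(missing_names)
```

Running the same setdiff check against each data frame's Manufacturer column and myvars should return an empty character vector if nothing would be dropped.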

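However, note that releveling with factor() does not visibly reorder the rows; it only changes the levels of the factor column. Afterwards, run order on the new levels to sort the rows. You could even combine the two lapply calls: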

anova_carmanuf_letters <- lapply(anova_carmanuf_letters, function(df) 
            df[order(df$Manufacturer),])


anova_carmanuf_letters[[1]]

# Manufacturer   lsmean       SE df     lower.CL upper.CL  .group
# 29        Suzuki  7.30000 5.927581 61 -12.28072477 26.88072  abcd  
# 12           Geo  9.10000 4.191433 61  -4.74566327 22.94566  ab    
# 27        Saturn  9.20000 5.927581 61 -10.38072477 28.78072  abcde 
# 14       Hyundai  9.32500 2.963790 61  -0.46536239 19.11536  a     
# 28        Subaru 11.36667 3.422290 61   0.06172995 22.67160  ab    
# 24      Plymouth 11.40000 5.927581 61  -8.18072477 30.98072  abcde 
# 11          Ford 12.43750 2.095716 61   5.51466837 19.36033  a     
# 9          Dodge 12.51667 2.419925 61   4.52286925 20.51046  ab    
# 10         Eagle 12.70000 4.191433 61  -1.14566327 26.54566  abcd  
# 13         Honda 13.06667 3.422290 61   1.76172995 24.37160  abc   
# 25       Pontiac 13.28000 2.650895 61   4.52323367 22.03677  ab    
# 30        Toyota 14.02500 2.963790 61   4.23463761 23.81536  abc   
# 20       Mercury 14.10000 4.191433 61   0.25433673 27.94566  abcde 
# 22        Nissan 14.85000 2.963790 61   5.05963761 24.64036  abc   
# 21    Mitsubishi 15.05000 4.191433 61   1.20433673 28.89566  abcde 
# 6      Chevrolet 16.08750 2.095716 61   9.16466837 23.01033  abc   
# 18         Mazda 16.34000 2.650895 61   7.58323367 25.09677  abcd  
# 31    Volkswagen 16.45000 2.963790 61   6.65963761 26.24036  abcde 
# 23    Oldsmobile 16.55000 2.963790 61   6.75963761 26.34036  abcde 
# 7       Chrylser 18.40000 5.927581 61  -1.18072477 37.98072  abcdef
# 26          Saab 20.30000 5.927581 61   0.71927523 39.88072  abcdef
# 4          Buick 20.75000 2.963790 61  10.95963761 30.54036  abcdef
# 1          Acura 21.05000 4.191433 61   7.20433673 34.89566  abcdef
# 8       Chrysler 22.00000 4.191433 61   8.15433673 35.84566  abcdef
# 32         Volvo 23.30000 4.191433 61   9.45433673 37.14566  abcdef
# 3            BMW 23.70000 5.927581 61   4.11927523 43.28072  abcdef
# 2           Audi 28.35000 4.191433 61  14.50433673 42.19566  abcdef
# 16         Lexus 31.10000 4.191433 61  17.25433673 44.94566   bcdef
# 17       Lincoln 33.85000 4.191433 61  20.00433673 47.69566    cdef
# 5       Cadillac 35.25000 4.191433 61  21.40433673 49.09566     def
# 19 Mercedes-Benz 36.40000 4.191433 61  22.55433673 50.24566      ef
# 15      Infiniti 45.40000 5.927581 61  25.81927523 64.98072       f
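As a sketch of combining the releveling and the sorting into a single lapply pass, the example below uses a small toy list with made-up lsmean values in place of anova_carmanuf_letters. The helper name reorder_by_levels and the objects toy_list, wanted and sorted_list are hypothetical and only for illustration.

```r
# Toy list standing in for anova_carmanuf_letters (values are made up)
toy_list <- list(
  data.frame(Manufacturer = factor(c("Acura", "Buick", "Eagle")),
             lsmean = c(3, 2, 1)),
  data.frame(Manufacturer = factor(c("Buick", "Eagle", "Acura")),
             lsmean = c(5, 4, 6))
)

# Desired order of manufacturer names
wanted <- c("Eagle", "Acura", "Buick")

# Hypothetical helper: set the factor levels, then sort rows by them
reorder_by_levels <- function(df, lev) {
  df$Manufacturer <- factor(df$Manufacturer, levels = lev)
  df[order(df$Manufacturer), ]
}

sorted_list <- lapply(toy_list, reorder_by_levels, lev = wanted)
print(sorted_list[[2]])
```

If plain alphabetical order is enough, sorting on the character values, e.g. df[order(as.character(df$Manufacturer)), ], should work without touching the levels at all.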

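To keep only the needed columns and rename them to the corresponding original column names, use Map to loop elementwise through all 9 datanames items and the 9 data frames: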

new_anova_carmanuf_letters <- Map(function(df, nm) 
       setNames(df[c("Manufacturer", ".group")], 
                c("Manufacturer", nm)), 
       df = anova_carmanuf_letters, nm = datanames)
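To make the elementwise pairing concrete, here is a toy version of the same Map call with two made-up data frames and two names. toy_list, toy_names and renamed are illustrative objects, and the .group letters are invented.

```r
# Two toy data frames shaped like the cld() output (values are made up)
toy_list <- list(
  data.frame(Manufacturer = c("Eagle", "Acura"), lsmean = c(1, 2),
             .group = c(" a", " b"), stringsAsFactors = FALSE),
  data.frame(Manufacturer = c("Eagle", "Acura"), lsmean = c(3, 4),
             .group = c(" ab", " a"), stringsAsFactors = FALSE)
)

# One name per data frame, playing the role of datanames
toy_names <- c("Price", "Horsepower")

# Pair the i-th data frame with the i-th name, keep two columns, rename .group
renamed <- Map(function(df, nm)
  setNames(df[c("Manufacturer", ".group")], c("Manufacturer", nm)),
  df = toy_list, nm = toy_names)

print(names(renamed[[1]]))
print(names(renamed[[2]]))
```

Each resulting data frame carries the significance letters under the name of the variable it was computed for, which makes the frames easier to tell apart once they are pulled out of the list.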


Edited on 2021-07-27
