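I have an analysis script that processes batches of data that are similar in structure but have different column names. I need to preserve the column names for later ETL scripts, but I want to do some processing along these lines: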
library(matrixStats)  # colSds is not in base R; matrixStats provides one

results <- data.frame()
for (name in names(data[[1]])) {
  # Start by combining each column into a single matrix
  working <- lapply(data, function(item) item[[name]])
  working <- matrix(unlist(working), ncol = 50, byrow = TRUE)
  # Dump the data for the archive
  write.csv(working, file = paste0(PATH, prefix, name, '.csv'), row.names = FALSE)
  # Calculate the mean and SD for each year, bind to the results
  df <- data.frame(colMeans(working), colSds(working))
  names(df) <- c(paste0(name, '.mean'), paste0(name, '.sd'))
  # Combine the working df with the processing one
}
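As the last comment in the example says, how do I combine the data frames? I have tried rbind and rbind.fill, but neither works. The data files may contain anywhere from 10 to 100 different column names.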
This was probably more a matter of searching for the right keywords, but cbind turns out to be the function to use, together with a matrix:
# Allocate for the number of rows needed
results <- matrix(nrow = rows)
for (name in names(data[[1]])) {
# Data processing
# Append the results to the working data
results <- cbind(results, df)
}
# Drop the first placeholder column created upon allocation
results <- results[, -1];
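The obvious caveat is that every column must have the same number of rows; beyond that, it is just a matter of appending columns to the matrix.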
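For reference, here is a minimal self-contained sketch of the same idea on made-up data. The toy `data` list, the column names `alpha` and `beta`, and `n_years` are assumptions for illustration only. It also differs from the snippet above in two ways: it collects the per-column summaries in a list and binds them once with `do.call(cbind, ...)`, which avoids the placeholder column, and it uses `apply(working, 2, sd)` in place of `colSds` so that only base R is needed.

```r
# Toy stand-in for the batch: 3 items, each with 2 named columns of 4 values.
# (All names and values here are made up for illustration.)
set.seed(1)
n_years <- 4
data <- lapply(1:3, function(i) {
  data.frame(alpha = rnorm(n_years), beta = runif(n_years))
})

pieces <- list()
for (name in names(data[[1]])) {
  # One row per item, one column per year
  working <- lapply(data, function(item) item[[name]])
  working <- matrix(unlist(working), ncol = n_years, byrow = TRUE)

  # Per-year mean and SD; apply(..., 2, sd) is a base-R substitute for colSds
  df <- data.frame(colMeans(working), apply(working, 2, sd))
  names(df) <- paste0(name, c(".mean", ".sd"))

  pieces[[name]] <- df
}

# Bind all summaries side by side; every piece has n_years rows.
# unname() keeps the list names from being prefixed onto the column names.
results <- do.call(cbind, unname(pieces))
print(names(results))
```

Binding once at the end also avoids growing `results` inside the loop, which may matter when there are close to 100 columns, though for data of this size either approach should be fine.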