Applying normalization

Ben A

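I'm working on a project for a colleague to normalize GC data and convert it from mol% to mass%.

Edit: I am doing row-wise normalization, i.e. at each time the species in norm1 should sum to 100 (although each one is then multiplied by its mass, so they no longer sum to 100). As a for loop, it would be equivalent to a very heavy: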

for (i in seq_len(nrow(Nmass))) {
   # row total, taken before any value in the row is overwritten
   total = sum(Nmass[i, norm1])
   for (species in norm1) {
      Nmass[[species]][i] = Fmolwt[species, 'weight'] * Nmass[[species]][i] / total
   }
}

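I imported the CSV files; they are arranged with species names as columns and injection times as rows (I'm working with dummy data, so everything is currently zero).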

> Nmass[1:5,c("Time",norm1)]
# A tibble: 5 x 13
  Time                HTFeed_Methane HTFeed_Ethane HTFeed_Ethylene HTFeed_Propane HTFeed_Propylene `HTFeed_iso-butane` `HTFee~ `HTFeed~ `HTFe~ HTFee~ `HTFee~ `HTFee~
  <dttm>                       <dbl>         <dbl>           <dbl>          <dbl>            <dbl>               <dbl>   <dbl>    <dbl>  <dbl>  <dbl>   <dbl>   <dbl>
1 2019-10-06 13:02:00              0             0               0              0                0                   0       0        0      0      0       0       0
2 2019-10-06 13:17:00              0             0               0              0                0                   0       0        0      0      0       0       0
3 2019-10-06 13:32:00              0             0               0              0                0                   0       0        0      0      0       0       0
4 2019-10-06 13:47:00              0             0               0              0                0                   0       0        0      0      0       0       0
5 2019-10-06 14:02:00              0             0               0              0                0                   0       0        0      0      0       0       0

I have a routine that works:

norm1 = c('HTFeed_Methane','HTFeed_Ethane','HTFeed_Ethylene','HTFeed_Propane','HTFeed_Propylene','HTFeed_iso-butane','HTFeed_n-Butane',
        'HTFeed_trans-2-butene','HTFeed_1-Butene','HTFeed_Isobutylene','HTFeed_cis-2-butene','HTFeed_1,3-Butadiene')

Nmass[,norm1] = as.data.frame(apply(Nmass[,norm1], 2, function(x) x/sum(x)))

But when I try to implement the mass conversion using a prebuilt list of molar masses by species:

Fmolwt = data.frame(c(16.04,30.07,28.05,44.9,42.08,58.12,58.12,56.11,56.11,56.11,56.11,54.1))
colnames(Fmolwt)[1] = 'weight'
rownames(Fmolwt) = c('HTFeed_Methane','HTFeed_Ethane','HTFeed_Ethylene','HTFeed_Propane','HTFeed_Propylene','HTFeed_iso-butane',
                    'HTFeed_n-Butane','HTFeed_trans-2-butene','HTFeed_1-Butene','HTFeed_Isobutylene','HTFeed_cis-2-butene','HTFeed_1,3-Butadiene')

the routine becomes (I think):

Nmass[,norm1] = as.data.frame(apply(Nmass[,norm1], 2, function(x) x*Fmolwt[x,]/sum(x)))

I get an error about differing dimensions.

Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 0, 3696
In addition: Warning messages:
1: In x * Fmolwt[x, ] :
  longer object length is not a multiple of shorter object length
2: In x * Fmolwt[x, ] :
  longer object length is not a multiple of shorter object length
3: In x * Fmolwt[x, ] :
  longer object length is not a multiple of shorter object length
4: In x * Fmolwt[x, ] :
  longer object length is not a multiple of shorter object length
5: In x * Fmolwt[x, ] :
  longer object length is not a multiple of shorter object length
6: In x * Fmolwt[x, ] :
  longer object length is not a multiple of shorter object length
7: In x * Fmolwt[x, ] :

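I expect this is because the apply statement tries to pull in all the molecular weights named in norm1 at once.

Can I do this the way I'm attempting, or do I need to write out a for loop?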

StupidWolf

You have an error here:

Nmass[,norm1] = as.data.frame(apply(Nmass[,norm1], 2, function(x) x*Fmolwt[x,]/sum(x)))

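With apply(..,2,..), x is one column of values, whereas from what I gather you need a row-wise operation. Secondly, Fmolwt[x,] gives an error because you are indexing Fmolwt with the data values in x, not with the column names that match the row names of Fmolwt.

I simulated some data that look like yours, for illustration: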
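To make the difference between the two margins concrete, here is a small sketch on a toy matrix. The matrix m and the weight vector w are made up for illustration and are not part of the question's data.

```r
# Toy data: 2 time points (rows) x 3 species (columns); all names are hypothetical
m <- matrix(c(1, 3, 1, 3, 2, 2), nrow = 2,
            dimnames = list(c("t1", "t2"), c("A", "B", "C")))
w <- c(A = 10, B = 20, C = 30)   # hypothetical molar masses, named by species

# MARGIN = 2: the function sees one column (one species across all times)
by_col <- apply(m, 2, function(x) x / sum(x))

# MARGIN = 1: the function sees one row (all species at one time);
# apply() returns the per-row results as columns, hence the transpose
by_row <- t(apply(m, 1, function(x) x / sum(x)))

colSums(by_col)   # every column sums to 1
rowSums(by_row)   # every row sums to 1

# Look the weights up by species name, not by the data values
w[colnames(m)]
```

Only the MARGIN = 1 version normalizes each injection time separately, which is what the question asks for.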

set.seed(1234)

norm1 = c('HTFeed_Methane','HTFeed_Ethane','HTFeed_Ethylene',
'HTFeed_Propane','HTFeed_Propylene','HTFeed_iso-butane',
'HTFeed_n-Butane','HTFeed_trans-2-butene',
'HTFeed_1-Butene','HTFeed_Isobutylene','HTFeed_cis-2-butene',
'HTFeed_1,3-Butadiene')

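# 120 draws give 10 rows x 12 columns; data.frame() below recycles
# these 10 rows to match the 100 time stamps
values <- matrix(abs(rnorm(120,1000,100)),ncol=12)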
colnames(values) = norm1

ts <- seq(as.POSIXct("2017-01-01", tz = "UTC"),
    as.POSIXct("2017-01-02", tz = "UTC"),
    length.out = 100)

Nmass = data.frame(Time=ts,values,check.names=F)

Fmolwt = data.frame(c(16.04,30.07,28.05,44.9,42.08,58.12,58.12,
56.11,56.11,56.11,56.11,54.1))
colnames(Fmolwt)[1] = 'weight'
rownames(Fmolwt) = c('HTFeed_Methane','HTFeed_Ethane','HTFeed_Ethylene',
'HTFeed_Propane','HTFeed_Propylene',
'HTFeed_iso-butane','HTFeed_n-Butane','HTFeed_trans-2-butene',
'HTFeed_1-Butene','HTFeed_Isobutylene','HTFeed_cis-2-butene',
'HTFeed_1,3-Butadiene')

What the simulated data look like:

> head(Nmass,2)
                 Time HTFeed_Methane HTFeed_Ethane HTFeed_Ethylene
1 2017-01-01 00:00:00       879.2934      952.2807       1013.4088
2 2017-01-01 00:14:32      1027.7429      900.1614        950.9314
  HTFeed_Propane HTFeed_Propylene HTFeed_iso-butane HTFeed_n-Butane
1      1110.2298        1144.9496          819.3969        1065.659
2       952.4407         893.1357          941.7924        1254.899
  HTFeed_trans-2-butene HTFeed_1-Butene HTFeed_Isobutylene HTFeed_cis-2-butene
1             1000.6893        982.2210           994.6841           1041.4524
2              954.4531        983.0006          1025.5196            952.5282
  HTFeed_1,3-Butadiene
1             980.4065
2             935.0930

As a first step, take the first row as an example: normalize it (by its total) and then multiply by the corresponding masses. For row 1, run:

Fmolwt[norm1,]*Nmass[1,norm1]/sum(Nmass[1,norm1])

which gives you the following result:

  HTFeed_Methane HTFeed_Ethane HTFeed_Ethylene HTFeed_Propane HTFeed_Propylene
1       1.176825      2.389309        2.371873       4.159423         4.020092
  HTFeed_iso-butane HTFeed_n-Butane HTFeed_trans-2-butene HTFeed_1-Butene
1          3.973688        5.167942              4.685041        4.598576
  HTFeed_Isobutylene HTFeed_cis-2-butene HTFeed_1,3-Butadiene
1           4.656926            4.875886             4.425653

If you want to use built-in R functions, the simplest is apply, which you have already used:

results = t(apply(Nmass[,norm1],1,function(x){
      Fmolwt[norm1,]*x/sum(x)
    }))

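As in the single-row case above, x is one row of Nmass[,norm1], so we compute x/sum(x) to normalize and then multiply by Fmolwt[norm1,]. The values line up because both are indexed in the order given by norm1. apply returns each row's result as a column, so we transpose to get back the same dimensions as Nmass[,norm1], hence t(apply(..)).

If we look at the first row, it gives the same output as the example above: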
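As an alternative sketch, the same row-wise calculation can be written without apply, using rowSums and sweep from base R. The toy objects x and mw below are hypothetical stand-ins for Nmass[,norm1] and Fmolwt.

```r
# Toy stand-ins for Nmass[, norm1] and Fmolwt (hypothetical names and values)
x <- matrix(c(1, 3, 1, 3, 2, 2), nrow = 2,
            dimnames = list(NULL, c("A", "B", "C")))
mw <- data.frame(weight = c(10, 20, 30), row.names = c("A", "B", "C"))

# Row-wise normalization: element [i, j] is divided by the total of row i
frac <- x / rowSums(x)

# Multiply every column by the weight of the matching species
weighted <- sweep(frac, 2, mw[colnames(x), "weight"], `*`)

# Cross-check against the apply() formulation
via_apply <- t(apply(x, 1, function(r) mw[colnames(x), "weight"] * r / sum(r)))
all.equal(weighted, via_apply, check.attributes = FALSE)
```

On the real data, x would presumably be as.matrix(Nmass[,norm1]) and mw would be Fmolwt. Indexing the weights by colnames(x) keeps the species in the same order as the data columns.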

> results[1,]
       HTFeed_Methane         HTFeed_Ethane       HTFeed_Ethylene 
             1.176825              2.389309              2.371873 
       HTFeed_Propane      HTFeed_Propylene     HTFeed_iso-butane 
             4.159423              4.020092              3.973688 
      HTFeed_n-Butane HTFeed_trans-2-butene       HTFeed_1-Butene 
             5.167942              4.685041              4.598576 
   HTFeed_Isobutylene   HTFeed_cis-2-butene  HTFeed_1,3-Butadiene 
             4.656926              4.875886              4.425653

So if you want to put the results back, do

Nmass[,norm1] = results
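As the question notes, the weighted values no longer sum to 100 per row. If the end goal is a true mass% that does sum to 100 at each time, one extra division by the row total of the weighted values should do it, i.e. mass%_i = 100 * x_i * M_i / sum_j(x_j * M_j). The sketch below assumes that is what is wanted; the objects molpct and mw are hypothetical toy data.

```r
# Hypothetical toy data: mol% per species, one row per injection
molpct <- matrix(c(50, 20, 30, 30, 20, 50), nrow = 2,
                 dimnames = list(NULL, c("A", "B", "C")))
mw <- c(A = 10, B = 20, C = 30)   # hypothetical molar masses

# x_i * M_i for every cell
weighted <- sweep(molpct, 2, mw[colnames(molpct)], `*`)

# Renormalize each row so the mass percentages sum to 100
masspct <- 100 * weighted / rowSums(weighted)

rowSums(masspct)   # 100 for every row
```

Because the row total of the mole values cancels in this ratio, the initial x/sum(x) step is not needed when the result is renormalized like this.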
