Counting document frequency in a multidimensional array in PHP

邦杰科

I have an array like this:

 Array ( 
        [0] => Array ( [id_doc] => 1 [term] => curi ) 
        [1] => Array ( [id_doc] => 1 [term] => tidur ) 
        [2] => Array ( [id_doc] => 1 [term] => kamar ) 
        [3] => Array ( [id_doc] => 2 [term] => curi ) 
        [4] => Array ( [id_doc] => 2 [term] => cela ) 
        [5] => Array ( [id_doc] => 2 [term] => hukum ) 
        [6] => Array ( [id_doc] => 3 [term] => nyanyi ) 
        [7] => Array ( [id_doc] => 3 [term] => dangdut ) 
        [8] => Array ( [id_doc] => 3 [term] => curi )   
    ) 

How can I get the document frequency of each term across these documents? I want output like this:

Array ( 
        [0] => Array ( [id_doc] => 1 [term] => curi [doc_frequency] => 3 ) 
        [1] => Array ( [id_doc] => 1 [term] => tidur [doc_frequency] => 1 ) 
        [2] => Array ( [id_doc] => 1 [term] => kamar [doc_frequency] => 1 ) 
        [3] => Array ( [id_doc] => 2 [term] => curi [doc_frequency] => 3 ) 
        [4] => Array ( [id_doc] => 2 [term] => cela [doc_frequency] => 1 ) 
        [5] => Array ( [id_doc] => 2 [term] => hukum [doc_frequency] => 1 ) 
        [6] => Array ( [id_doc] => 3 [term] => nyanyi [doc_frequency] => 1 ) 
        [7] => Array ( [id_doc] => 3 [term] => dangdut [doc_frequency] => 1 ) 
        [8] => Array ( [id_doc] => 3 [term] => curi [doc_frequency] => 3 )  
    ) 

So the term "curi" has a document frequency of 3, because it appears in 3 documents. I tried this:

$count_df = array_count_values(array_map(function($item) {
   return $item['term'];
}, $dokumen_frek));
print_r($count_df);

But the result is only one count per term, not the per-row array I want:

Array ( 
[curi] => 3 
[tidur] => 1 
[kamar] => 1 
[cela] => 1 
[hukum] => 1 
[nyanyi] => 1 
[dangdut] => 1 
)

splash58

Use the array_count_values function to build the per-term counts, then attach the matching count to each row:

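// Count how many times each term occurs across all rows
$terms = array_count_values(array_column($arr, 'term'));

// Attach the count to every row (by reference, so $arr itself is modified)
foreach ($arr as &$x) {
   $x['doc_frequency'] = $terms[$x['term']];
}
unset($x); // break the reference left by the foreach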
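Note that array_count_values counts rows, so it matches the document frequency only when each term appears at most once per document, as in the sample data. If a term can repeat within one document, a variant that counts distinct id_doc values per term may be closer to what is wanted. The following is a minimal sketch under that assumption; the helper name documentFrequencies and the duplicated "curi" row are made up for illustration and are not part of the original question or answer.

```php
<?php
// Rows in the question's shape. The second "curi" in document 1 is an
// added assumption, used to show where row counts and document counts differ.
$arr = [
    ['id_doc' => 1, 'term' => 'curi'],
    ['id_doc' => 1, 'term' => 'curi'],
    ['id_doc' => 1, 'term' => 'tidur'],
    ['id_doc' => 2, 'term' => 'curi'],
    ['id_doc' => 3, 'term' => 'curi'],
];

// Hypothetical helper: for each term, count the distinct id_doc values
// it appears in.
function documentFrequencies(array $rows): array
{
    $seen = []; // $seen[term][id_doc] = true
    foreach ($rows as $row) {
        $seen[$row['term']][$row['id_doc']] = true;
    }
    // With a single input array, array_map keeps the term keys.
    return array_map('count', $seen);
}

$df = documentFrequencies($arr);

// Attach the document frequency to every row.
foreach ($arr as &$row) {
    $row['doc_frequency'] = $df[$row['term']];
}
unset($row); // break the reference left by the foreach

print_r($arr);
```

Here "curi" gets a doc_frequency of 3 (documents 1, 2 and 3), whereas array_count_values would report 4 because it counts the repeated row as well.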

