We have data arriving in the following structure:
entity_id  entity_value  category_id  category_weight  group_id  group_weight
1          100           11           6                101       4
1          100           11           6                102       3
1          100           12           5                102       3
1          100           12           5                103       2
1          100           13           6                101       4
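An entity can belong to any number of categories and any number of groups, in any combination; there is no implied relationship between category membership and group membership.
The data is redundant but consistent; if one row shows category 11 with a weight of 6, every row shows category 11 with a weight of 6. The same applies to groups and their weights.
A row is uniquely identified by {entity_id, category_id, group_id}.
The aim is to apportion the entity's value across all of its rows according to the various weights: first by category, then by group.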
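Step 1: Apportion by category
Entity 1 is associated with 3 categories {11, 12, 13}, with weights {6, 5, 6}
Apportion 100 * (6 / (6 + 5 + 6)) to category 11 => 35.29
Apportion 100 * (5 / (6 + 5 + 6)) to category 12 => 29.41
Apportion 100 * (6 / (6 + 5 + 6)) to category 13 => 35.29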
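Step 2: Apportion the results by group
Entity 1, category 11 is associated with groups {101, 102}, with weights {4, 3}
Apportion 35.29 * (4 / (4 + 3)) to group 101 => 20.17
Apportion 35.29 * (3 / (4 + 3)) to group 102 => 15.12
Entity 1, category 12 is associated with groups {102, 103}, with weights {3, 2}
Apportion 29.41 * (3 / (3 + 2)) to group 102 => 17.65
Apportion 29.41 * (2 / (3 + 2)) to group 103 => 11.76
Entity 1, category 13 is associated with group {101}, with weight {4}
Apportion 35.29 * (4 / (4)) to group 101 => 35.29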
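To make the two steps concrete, here is a minimal Python sketch of the same arithmetic on the sample rows. It is only an illustration of the intended result, not part of the SQL; the function name `apportion` and the dictionary layout are my own assumptions. It works with unrounded intermediates, so a final digit can differ slightly from the hand-rounded figures above.

```python
from collections import defaultdict

# Sample rows from the question:
# (entity_id, entity_value, category_id, category_weight, group_id, group_weight)
rows = [
    (1, 100, 11, 6, 101, 4),
    (1, 100, 11, 6, 102, 3),
    (1, 100, 12, 5, 102, 3),
    (1, 100, 12, 5, 103, 2),
    (1, 100, 13, 6, 101, 4),
]


def apportion(rows):
    """Hypothetical helper: apportion each entity's value by category, then by group."""
    # Step 1 input: each category's weight, counted once per entity
    # even though it is repeated on several rows.
    cat_weights = defaultdict(dict)    # entity_id -> {category_id: weight}
    # Step 2 input: total group weight within each (entity, category).
    group_totals = defaultdict(float)  # (entity_id, category_id) -> sum of group weights
    for e, _value, c, cw, _g, gw in rows:
        cat_weights[e][c] = cw
        group_totals[(e, c)] += gw

    result = {}
    for e, value, c, cw, g, gw in rows:
        cat_share = cw / sum(cat_weights[e].values())  # step 1
        group_share = gw / group_totals[(e, c)]        # step 2
        result[(e, c, g)] = value * cat_share * group_share
    return result


apportioned = apportion(rows)
for key, val in sorted(apportioned.items()):
    print(key, round(val, 2))
```

The per-row values for an entity should add back up to the original entity_value, which is a handy sanity check for any SQL version of the same logic.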
I can do step 2 with window functions. Nice and tidy, no self-join.
Step 1, however, seems to need sub-queries and a self-join.
For example... http://sqlfiddle.com/#!18/be890/1
SELECT
    sample.entity_id,
    sample.category_id,
    sample.group_id,
    sample.entity_value AS original_value,
    sample.entity_value
        * (sample.category_weight / entity.total_category_weight)
        * (sample.group_weight / SUM(sample.group_weight) OVER (PARTITION BY sample.entity_id, sample.category_id))
        AS apportioned_value
FROM
    (
        -- Total category weight per entity, each category counted once
        SELECT
            entity_id,
            SUM(category_weight) AS total_category_weight
        FROM
            (
                -- Collapse the redundant rows to one weight per (entity, category)
                SELECT
                    entity_id,
                    category_id,
                    MAX(category_weight) AS category_weight
                FROM
                    sample
                GROUP BY
                    entity_id,
                    category_id
            )
            entity_category
        GROUP BY
            entity_id
    )
    entity
INNER JOIN
    sample
        ON sample.entity_id = entity.entity_id
Is there a tidier way that doesn't need a self-join?
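-- Window functions only, no self-join.
-- Note: if the weight columns are integer types, the divisions below may
-- truncate; cast to a decimal or float type first in that case.
SELECT
    entity_id,
    category_id,
    group_id,
    entity_value AS original_value,
    entity_value
        * (category_weight / SUM(scaled_cat_weight) OVER (PARTITION BY entity_id))
        * (group_weight    / SUM(group_weight)      OVER (PARTITION BY entity_id, category_id))
        AS apportioned_value
FROM
    (
        -- Spread each category's weight evenly over its rows, so that summing
        -- scaled_cat_weight across an entity counts every category exactly once.
        SELECT
            *,
            category_weight / COUNT(*) OVER (PARTITION BY entity_id, category_id) AS scaled_cat_weight
        FROM
            sample
    )
    scaled
ORDER BY
    entity_id,
    category_id,
    group_id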
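As a quick way to check the window-only query without a SQL Server instance, the sketch below runs it against the sample rows in an in-memory SQLite database from Python. This is an assumption-laden stand-in: it needs SQLite 3.25 or later for window functions, and the weight columns are declared REAL here (my choice, not taken from the fiddle) so the divisions are not truncated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: the original fiddle's column types are not shown in the post.
conn.execute("""
    CREATE TABLE sample (
        entity_id       INTEGER,
        entity_value    REAL,
        category_id     INTEGER,
        category_weight REAL,
        group_id        INTEGER,
        group_weight    REAL
    )
""")
conn.executemany(
    "INSERT INTO sample VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 100, 11, 6, 101, 4),
        (1, 100, 11, 6, 102, 3),
        (1, 100, 12, 5, 102, 3),
        (1, 100, 12, 5, 103, 2),
        (1, 100, 13, 6, 101, 4),
    ],
)

# The window-only query: scaled_cat_weight splits each category's weight
# across its rows, so the per-entity SUM counts each category once.
query = """
    SELECT
        entity_id,
        category_id,
        group_id,
        entity_value AS original_value,
        entity_value
            * (category_weight / SUM(scaled_cat_weight) OVER (PARTITION BY entity_id))
            * (group_weight / SUM(group_weight) OVER (PARTITION BY entity_id, category_id))
            AS apportioned_value
    FROM
        (
            SELECT
                *,
                category_weight / COUNT(*) OVER (PARTITION BY entity_id, category_id)
                    AS scaled_cat_weight
            FROM sample
        ) scaled
    ORDER BY entity_id, category_id, group_id
"""

results = conn.execute(query).fetchall()
for row in results:
    print(row)

# The apportioned values for the entity should sum back to its original value.
total = sum(row[4] for row in results)
print("total:", round(total, 6))
```

For the sample data, category 11 contributes 3 + 3, category 12 contributes 2.5 + 2.5, and category 13 contributes 6, so the per-entity sum of scaled_cat_weight is 17, the same denominator used in step 1.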