SQL:psql中的逻辑问题

内德里戈

我正在尝试开发一个查询来标识具有多个客户端ID的客户端。客户ID在第1列,第19列和第20列中包含唯一的个人标识符,您可以将它们视为某种社会保险号(我们将其称为SSN.19和SSN.20)

我的第一个想法是寻找具有匹配SSN但具有不同客户端ID的每一行,如下所示:

SELECT  
    a."5", a."3"||' '||a."4" as "3+4", a."19", a."20", a."21", a."1", 
    b."1", a."8"
FROM
    "clients_1" AS a,
    "clients_1" AS b 
WHERE a."19"=b."19" and a."20"=b."20" and a."1"<b."1" and a."1"='Value';

但是,它返回了0行。为了检查该表是否确实没有重复项,我执行了以下查询:

select distinct "19" as hk, count("19") as dl from "clients_1" group by "19" order by dl desc;

select distinct "20" as hk, count("20") as dl from "clients_1" group by "20" order by dl desc;

事实证明,在此特定表上,没有客户端具有与之关联的SSN19,但是表中有多个重复的SSN20。因此,我执行以下查询来查找具有多个ID的客户端:

SELECT  
    a."5", a."3"||' '||a."4" as "3+4", a."20", a."21", a."1", 
    b."1", a."8"
FROM
    "clients_1" AS a,
    "clients_1" AS b 
WHERE a."20"=b."20" and a."1"<b."1" and a."7"='Value';

此表返回了一个表,其中包含多个具有不同ID但具有相同SSN20的客户端。之后,我开始考虑一种针对客户同时具有SSN19和SSN20或其中之一的情况将查询通用化的方法,因此我想到了以下几点:

SELECT                
    a."5", a."3"||' '||a."4" as "3+4", a."19", a."20", a."21", a."1", 
    b."1", a."8"
FROM
    "clients_1" AS a,
    "clients_1" AS b 
WHERE ((a."19"=b."19" and a."19" is not null) or (a."20"=b."20" and a."20" is not null)) and a."1"<b."1" and a."7"='Value';

但是,此查询要花很长时间,我让查询运行了大约20分钟,但什么也没回来,而之前的尝试最多花费了2分钟。我究竟做错了什么?

内维尔

我相信的东西一样,这将是更好的表演,给你更多的灵活性:

SELECT
    *
FROM
    (
        SELECT
            COUNT(*) OVER (PARTITION BY "19") as 19_matches,
            COUNT(*) OVER (PARTITION BY "20") as 20_matches,
            COUNT(*) OVER (PARTITION BY "19","20") as both_matches,
            clients_1.*
        FROM 
            clients_1
        WHERE "7" = 'value'
    )
WHERE 19_matches > 1 OR 20_matches > 1 or both_matches > 1
ORDER BY "19","20"

本文收集自互联网,转载请注明来源。

如有侵权,请联系 [email protected] 删除。

编辑于
0

我来说两句

0 条评论
登录 后参与评论

相关文章