下面显示了我的样本数据集
PatientID PatientName
XXX-037070002 Riger, Jens^Wicki
XXX-037070002 Riger^Wicki
XXX-10052 Weier,Nicole^Peggy
XXX-10052 Weier,Nicole^Peppy
XXX-23310 Rodem^Sieglinde
XXX-23310 Sauberger, Birgit^Finja
XXX-23343 Je, Ronny^Wilma
XXX-23343 Jer, Ronny^Wilma
XXX-2349 Kel,Andy^Juka
XXX-2349 Kel^Juka
XXX-2998 Hel, Frank
XXX-2998 Hel,Frank^Fenris
XXX-3188 Mey, Marion
XXX-3188 Mey, Marion^Paula
XXX-3188 Schulz^Roma
XXX-3218 Böntgen-Simnet,Dr. Regine^Cara
XXX-3218 Simnet,Dr. Regine^Cara
XXX-3826 Mertes, Bernd Uwe^Ellie
XXX-3826 Mertes,Bernd^Ellie
XXX-3826 Mertes^Ellie
这是我从上一个请求中得到的查询:
with d as
(
select distinct
patid,
patname
from dicomstudys
)
select *
from d
where d.patid in
(
select d.patid
from d
group by d.patid
having count(*) > 1
)
现在,我要调整查询,使其仅以下数据得到输出:
PatientID PatientName
XXX-23310 Rodem^Sieglinde
XXX-23310 Sauberger, Birgit^Finja
XXX-23343 Je, Ronny^Wilma
XXX-23343 Jer, Ronny^Wilma
XXX-3188 Mey, Marion
XXX-3188 Mey, Marion^Paula
XXX-3188 Schulz^Roma
XXX-3218 Böntgen-Simnet,Dr. Regine^Cara
XXX-3218 Simnet,Dr. Regine^Cara
姓氏以','或'^'分隔。如果相同的PatientID的姓氏相同,那么我不希望它们被显示。我尝试摆弄一个带有CHARINDEX命令和其他命令的子选择语句,但是由于请求的复杂性,我的SQL语法知识非常有限。
另请注意,对于XXX-3188,该案例包含两个姓氏相同的数据集,但另一个具有完整的PatientName的数据集,因此需要将其包含在输出中。
尝试这个:
DECLARE @DataSource TABLE
(
[ID] VARCHAR(32)
,[Name] VARCHAR(256)
);
INSERT INTO @DataSource ([ID], [Name])
VALUES ('XXX-037070002', 'Riger, Jens^Wicki')
,('XXX-037070002', 'Riger^Wicki')
,('XXX-10052', 'Weier,Nicole^Peggy')
,('XXX-10052', 'Weier,Nicole^Peppy')
,('XXX-23310', 'Rodem^Sieglinde')
,('XXX-23310', 'Sauberger, Birgit^Finja')
,('XXX-23343', 'Je, Ronny^Wilma')
,('XXX-23343', 'Jer, Ronny^Wilma')
,('XXX-2349', 'Kel,Andy^Juka')
,('XXX-2349', 'Kel^Juka')
,('XXX-2998', 'Hel, Frank')
,('XXX-2998', 'Hel,Frank^Fenris')
,('XXX-3188', 'Mey, Marion')
,('XXX-3188', 'Mey, Marion^Paula')
,('XXX-3188', 'Schulz^Roma')
,('XXX-3218', 'Böntgen-Simnet,Dr. Regine^Cara')
,('XXX-3218', 'Simnet,Dr. Regine^Cara')
,('XXX-3826', 'Mertes, Bernd Uwe^Ellie')
,('XXX-3826', 'Mertes,Bernd^Ellie')
,('XXX-3826', 'Mertes^Ellie');
WITH DataSource AS
(
SELECT [ID]
,[Name]
,COUNT(*) OVER (PARTITION BY [ID], LTRIM(RTRIM(SUBSTRING([Name], 0, CHARINDEX(',', REPLACE([Name], '^', ',')))))) AS [ID_Name_Count]
,COUNT(*) OVER (PARTITION BY [ID]) AS [ID_Count]
,LTRIM(RTRIM(SUBSTRING([Name], 0, CHARINDEX(',', REPLACE([Name], '^', ','))))) AS [FamilyName]
FROM @DataSource
)
SELECT [ID]
,[Name]
FROM DataSource
WHERE [ID_Name_Count] = 1
AND [ID_Count] = 2
OR [ID] IN
(
SELECT [ID]
FROM DataSource
GROUP BY [ID]
HAVING COUNT(DISTINCT [FamilyName]) > 1
);
解决方案非常简单。以下是有趣的部分:
^
用,
,以简化姓氏提取ID
和提取姓氏和计算次数last name
id-last name
对,然后添加ID并添加一个以上的唯一姓氏(您的特殊情况)id
2
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句