Write SQL to identify multiple subgroupings within a grouping

Chad Published at Dev

Chad

I have a program that summarizes non-normalized data in one table and moves it to another and we frequently get a duplicate key violation on the insert due to bad data. I want to create a report for the users to help them identify the cause of the error.

For example, consider the following contrived simple SQL which summarizes data in the table Companies and inserts it into CompanySum, which has a primary key of State/Zone. In order for the INSERT not to fail, there cannot be more than one distinct combinations of Company/Code for every unique primary key State/Zone combination. If there is, we want the insert to fail so that the data can be corrected.

INSERT INTO CompanySum
(
    [State] 
    ,[Zone] 
    ,[Company]
    ,[Code] 
    ,[Revenue] 
)
SELECT 
    --Keys of target
    [State] 
    ,[Zone] 

    --We are expecting to have one distinct combination of these fields per key grouping
    ,[Company]
    ,[Code] 

    --Aggregate
    ,SUM([Revenue])
    
FROM COMPANIES

GROUP BY
    [State] 
    ,[Zone] 
    ,[Company]
    ,[Code]

I would like to create a report to help the users easily identify and correct the data so that there is only one distinct Company/Code combination within a State/Zone. For each distinct State/Zone value, I would like to identify the distinct Company/Code combinations within the State/Zone. If there are more than one Company/Code combinations within a State/Zone, I would like all of the records in the State/Zone to be displayed in the output. For example, here is the sample input and desired output:

Data:

RecordNumber    State   Zone    Company         Code    Revenue
------------    -----   ----    -------         ----    --------
1               CT      B       State of CT     65453    10
2               CT      B       State of CT     65453     3
3               CT      B       Travelers       33443    20
4               CT      C       Cigna           45678    24
5               CT      C       Cigna           45678   234
6               MI      A       GM              48089   100
7               MI      A       GM              54555   200
8               MI      B       Chrysler        43434    44


Desired Output:

RecordNumber    State   Zone    Company         Code    Revenue
------------    -----   ----    -------         ----    --------
1               CT      B       State of CT     65453     10
2               CT      B       State of CT     65453      3
3               CT      B       Travelers       33443     20
6               MI      A       GM              48089    100
7               MI      A       GM              54555    200

Here is the DDL and DML needed to create this test scenario

CREATE TABLE [dbo].[Companies](
    [RecordNumber] [int] NULL,
    [State] [char](2) NOT NULL,
    [Zone] [varchar](30) NOT NULL,
    [Company] [varchar](30) NOT NULL,
    [Code] [varchar](30) NOT NULL,
    [Revenue] [numeric](9, 1) NULL
) ON [PRIMARY]


CREATE TABLE [dbo].[CompanySum](
    [State] [char](2) NOT NULL,
    [Zone] [varchar](30) NOT NULL,
    [Company] [varchar](30) NOT NULL,
    [Code] [varchar](30) NOT NULL,
    [Revenue] [numeric](9, 1) NULL,
 CONSTRAINT [PK_CompanySum] PRIMARY KEY CLUSTERED 
(
    [State] ASC,
    [Zone] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]


DELETE FROM [dbo].[Companies]
GO

INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (1, N'CT', N'B', N'State of CT', N'65453', CAST(10.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (2, N'CT', N'B', N'State of CT', N'65453', CAST(3.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (3, N'CT', N'B', N'Travelers', N'33443', CAST(20.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (4, N'CT', N'C', N'Cigna', N'45678', CAST(24.0 AS Numeric(9, 1)))
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (5, N'CT', N'C', N'Cigna', N'45678', CAST(234.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (6, N'MI', N'A', N'GM', N'48089', CAST(100.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (7, N'MI', N'A', N'GM', N'54555', CAST(200.0 AS Numeric(9, 1)))
GO
INSERT [dbo].[Companies] ([RecordNumber], [State], [Zone], [Company], [Code], [Revenue]) VALUES (8, N'MI', N'B', N'Chrysler', N'43434', CAST(44.0 AS Numeric(9, 1)))
GO

This is a hopefully better re-construction of a previous post of mine SQL to return unique combinations of non key columns within a set of key columns where I am trying to help clarify the question and provide a simple working example that readers can use.

Please see this SQL Fiddle:

http://sqlfiddle.com/#!18/d0141/1

Marcus Vinicius Pompeu

Is this a solution?

Fiddle: http://sqlfiddle.com/#!18/12e9a0/9

select c.*
from
    Companies c
        inner join (
            select State, Zone
            from Companies
            group by State, Zone
            having count(distinct Company + Code) > 1
        ) as dup_state_zone
        on(
                c.State = dup_state_zone.State
            and c.Zone  = dup_state_zone.Zone
        )

Edited - Fix the having clause, with a little cheat...

Collected from the Internet

Please contact [email protected] to delete if infringement.

edited at2021-06-14

Comments

0 comments

TOP Ranking

Article

Write SQL to identify multiple subgroupings within a grouping

Write SQL to identify multiple subgroupings within a grouping

pump.io port in URL

Failed to listen on localhost:8000 (reason: Cannot assign requested address)

How to import an asset in swift using Bundle.main.path() in a react-native native module

How to use HttpClient with ANY ssl cert, no matter how "bad" it is

Can't pre-populate phone number and message body in SMS link on iPhones when SMS app is not running in the background

Modbus Python Schneider PM5300

What is the exact difference between “ use_all_dns_ips” and "resolve_canonical_bootstrap_servers_only” in client.dns.lookup options?

Loopback Error: connect ECONNREFUSED 127.0.0.1:3306 (MAMP)

BigQuery - concatenate ignoring NULL

Is there an option for a Simulink Scope to display the layout in single column?

Spring Boot JPA PostgreSQL Web App - Internal Authentication Error

How to define a specific version of macOS in C

MERGE with DELETE on target with partial match on source?

Apache rewrite or susbstitute rule for bugzilla HTTP 301 redirect

Soundcloud API Authentication | NodeWebkit, redirect uri and local file system

express js can't redirect user

UWP access denied

How to Set Particular Area/Region Selected MapView

split column by delimiter and deleting expanded column

Center buttons and brand in Bootstrap

How to design a xml file to display in more screen