Select unique combinations of two columns, showing only rows where the first column's value is not unique

Anh-Duc

I have an order line table that looks like this:

ID  Order ID  Product Reference  Variant
1   1         Banana             Green
2   1         Banana             Yellow
3   2         Apple              Green
4   2         Banana             Brown
5   3         Apple              Red
6   3         Apple              Yellow
7   4         Apple              Yellow
8   4         Banana             Green
9   4         Banana             Yellow
10  4         Pear               Green
11  4         Pear               Green
12  4         Pear               Green

I want to know how often people place an order that combines different fruit products. For each such order, I want the orderId and the productReference values that were combined.

I only care about the product, not the variant.

I would imagine the desired output looking like this - a simple table that gives insight into which product combinations are ordered:

Order ID  Product
2         Banana
2         Apple
4         Banana
4         Apple
4         Pear

I just need output showing that the combinations Banana+Apple and Banana+Apple+Pear occur, so I can get more insight into how often this happens. We expect most of our customers to order only Apple, only Banana or only Pear products, but that assumption needs to be verified.

Problem

I get stuck after the first step:

select orderId, productReference, count(*) as amount
from OrderLines
group by orderId, productReference

This outputs:

Order ID  Product Reference  amount
1         Banana             2
2         Apple              1
2         Banana             1
3         Apple              2
4         Apple              1
4         Banana             2
4         Pear               3

I just don't know how to move on from this step to get the data I want.

Serg

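You can use a window function, count() over(). Window functions are evaluated after the group by, so count(productReference) over(partition by orderId) counts the distinct products in each order. Filtering on np > 1 then keeps only the orders that combine more than one product.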

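select *
from
(
    select orderId, productReference, count(*) as amount
       -- evaluated after the group by: one row per product, so np = products per order
       , count(productReference) over(partition by orderId) as np
    from OrderLines
    group by orderId, productReference
) t
-- keep only orders that contain more than one distinct product
where np > 1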
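
To check the query against the sample data from the question, here is a minimal sketch that loads the rows into an in-memory SQLite database through Python's sqlite3 module and runs the same windowed count. The choice of SQLite and Python, the column types, and the final step that tallies each combination in Python are assumptions for illustration, not part of the original answer. Window functions need SQLite 3.25 or newer.

```python
import sqlite3
from collections import Counter

# Sample rows from the question: (id, orderId, productReference, variant).
ROWS = [
    (1, 1, "Banana", "Green"),
    (2, 1, "Banana", "Yellow"),
    (3, 2, "Apple", "Green"),
    (4, 2, "Banana", "Brown"),
    (5, 3, "Apple", "Red"),
    (6, 3, "Apple", "Yellow"),
    (7, 4, "Apple", "Yellow"),
    (8, 4, "Banana", "Green"),
    (9, 4, "Banana", "Yellow"),
    (10, 4, "Pear", "Green"),
    (11, 4, "Pear", "Green"),
    (12, 4, "Pear", "Green"),
]

# Column types are assumed; the question does not show the table definition.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table OrderLines ("
    "id integer primary key, orderId integer, "
    "productReference text, variant text)"
)
conn.executemany("insert into OrderLines values (?, ?, ?, ?)", ROWS)

# Same technique as the answer: the window count runs after the group by,
# so np is the number of distinct products in each order.
MIXED_ORDERS_SQL = """
select orderId, productReference
from (
    select orderId, productReference,
           count(productReference) over (partition by orderId) as np
    from OrderLines
    group by orderId, productReference
) t
where np > 1
order by orderId, productReference
"""
mixed = conn.execute(MIXED_ORDERS_SQL).fetchall()

# Collect the products of each mixed order, then count how often each
# combination occurs. Products arrive sorted because of the order by above.
products_by_order = {}
for order_id, product in mixed:
    products_by_order.setdefault(order_id, []).append(product)
frequency = Counter(" + ".join(p) for p in products_by_order.values())

for order_id, product in mixed:
    print(order_id, product)
for combo, times in frequency.items():
    print(combo, times)
```

On the sample data this lists orders 2 and 4 only, and each of the two combinations (Apple + Banana, Apple + Banana + Pear) occurs once. Orders 1 and 3 are dropped because they contain a single product in several variants.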
