Group by and having trouble understanding

jedu Published at Dev

I was looking at some SQL queries in an Access database that I did not write.

One of the queries goes something like this:

select column1 from table1 group by column1 having count(*)>1

The purpose of this query is to find the values in column1 that appear more than once. I can verify that the query works correctly and returns the column values that appear more than once.

However, I do not understand why this query works. As I understand it, using group by removes duplicate values. For instance, if column1 contained

    column1
    apple
    mango
    mango

Doing group by (column1) will result in

    column1
    apple
    mango

At this point, if we perform having count(*)>1 or having count(column1)>1, it should return no result because group by has already removed the duplicate field. But clearly, I am wrong as the above SQL statement does give the accurate result.

Would you please let me know the problem in my understanding?

Edit 1:

Besides the accepted answer, I found that this article, which deals with the order of SQL operations, really helped my understanding.

Gordon Linoff

You are misunderstanding how HAVING works. One way to think about it is in terms of a subquery. Your query is equivalent to:

select column1
from (select column1, count(*) as cnt
      from table1
      group by column1
     ) as t
where cnt > 1;

That is, having filters an aggregation query after the aggregation has taken place. However, the aggregation functions are applied per group. So count(*) is counting the number of rows in each group. That is why it is identifying duplicates.
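
To see the per-group counts directly, here is a minimal sketch using the sample data from the question. It runs the SQL through Python's built-in sqlite3 module rather than Access, which is an assumption made only so the example is self-contained. The table and column names come from the question, and the alias cnt is illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for the Access table (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 TEXT)")
conn.executemany(
    "INSERT INTO table1 (column1) VALUES (?)",
    [("apple",), ("mango",), ("mango",)],
)

# Step 1: GROUP BY produces one row per distinct value, but each group
# still knows how many source rows went into it.
per_group = conn.execute(
    "SELECT column1, COUNT(*) AS cnt FROM table1 "
    "GROUP BY column1 ORDER BY column1"
).fetchall()

# Step 2: HAVING filters the groups using that per-group aggregate.
with_having = conn.execute(
    "SELECT column1 FROM table1 GROUP BY column1 HAVING COUNT(*) > 1"
).fetchall()

# The same filter written as a subquery with WHERE on the computed count.
with_subquery = conn.execute(
    "SELECT column1 FROM ("
    "  SELECT column1, COUNT(*) AS cnt FROM table1 GROUP BY column1"
    ") AS t WHERE cnt > 1"
).fetchall()

print(per_group)      # [('apple', 1), ('mango', 2)]
print(with_having)    # [('mango',)]
print(with_subquery)  # [('mango',)]
conn.close()
```

The first result shows that grouping does not discard the duplicate rows before counting: the mango group has a count of 2, and that count is what HAVING compares against 1.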

Edited at 2020-12-3
