IT

하나의 열에 대해서만 DISTINCT

lottoking 2020. 6. 18. 07:31
반응형

하나의 열에 대해서만 DISTINCT


다음과 같은 쿼리가 있다고 가정 해 봅시다.

SELECT ID, Email, ProductName, ProductModel FROM Products

중복 이메일을 반환하지 않도록 수정하려면 어떻게해야합니까?

즉, 여러 행에 동일한 전자 메일이 포함 된 경우 결과에 해당 행 중 하나만 (바람직하게는 마지막 행) 만 포함 시키려고합니다. 다른 열에는 중복이 허용되어야합니다.

은 전체 행에서 좋아 DISTINCT하고 GROUP BY작동하는 것으로 보입니다. 그래서 어떻게 접근 해야할지 모르겠습니다.


SQL Server 2005 이상을 사용하는 경우 다음을 사용하십시오.

SELECT *
  FROM (
                SELECT  ID, 
                        Email, 
                        ProductName, 
                        ProductModel,
                        ROW_NUMBER() OVER(PARTITION BY Email ORDER BY ID DESC) rn
                    FROM Products
              ) a
WHERE rn = 1

편집 : where 절을 사용하는 예 :

SELECT *
  FROM (
                SELECT  ID, 
                        Email, 
                        ProductName, 
                        ProductModel,
                        ROW_NUMBER() OVER(PARTITION BY Email ORDER BY ID DESC) rn
                    FROM Products
                   WHERE ProductModel = 2
                     AND ProductName LIKE 'CYBER%'

              ) a
WHERE rn = 1

이것은 SQL Server 2005+를 가정하고 "마지막"에 대한 정의는 주어진 전자 메일의 최대 PK입니다

;WITH CTE AS
(
SELECT ID, 
       Email, 
       ProductName, 
       ProductModel, 
       ROW_NUMBER() OVER (PARTITION BY Email ORDER BY ID DESC) AS RowNumber 
FROM   Products
)
SELECT ID, 
       Email, 
       ProductName, 
       ProductModel
FROM CTE 
WHERE RowNumber = 1


사용할 때 DISTINCT열이 아닌 별도의 행으로 생각하십시오. 열이 정확히 일치하지 않는 행만 반환합니다.

SELECT DISTINCT ID, Email, ProductName, ProductModel
FROM Products

----------------------
1 | something@something.com | ProductName1 | ProductModel1
2 | something@something.com | ProductName1 | ProductModel1

The query would return both rows because the ID column is different. I'm assuming that the ID column is an IDENTITY column that is incrementing, if you want to return the last then I recommend something like this:

SELECT DISTINCT TOP 1 ID, Email, ProductName, ProductModel
FROM Products
ORDER BY ID DESC

The TOP 1 will return only the first record, by ordering it by the ID descending it will return the results with the last row first. This will give you the last record.


You can over that by using GROUP BY function

SELECT ID, Email, ProductName, ProductModel FROM Products GROUP BY Email


For Access, you can use the SQL Select query I present here:

For example you have this table:

CLIENTE|| NOMBRES || MAIL

888 || T800 ARNOLD || t800.arnold@cyberdyne.com

123 || JOHN CONNOR || s.connor@skynet.com

125 || SARAH CONNOR ||s.connor@skynet.com

And you need to select only distinct mails. You can do it with this:

SQL SELECT:

SELECT MAX(p.CLIENTE) AS ID_CLIENTE
, (SELECT TOP 1 x.NOMBRES 
    FROM Rep_Pre_Ene_MUESTRA AS x 
    WHERE x.MAIL=p.MAIL 
     AND x.CLIENTE=(SELECT MAX(l.CLIENTE) FROM Rep_Pre_Ene_MUESTRA AS l WHERE x.MAIL=l.MAIL)) AS NOMBRE, 
p.MAIL
FROM Rep_Pre_Ene_MUESTRA AS p
GROUP BY p.MAIL;

You can use this to select the maximum ID, the correspondent name to that maximum ID , you can add any other attribute that way. Then at the end you put the distinct column to filter and you only group it with that last distinct column.

This will bring you the maximum ID with the correspondent data, you can use min or any other functions and you replicate that function to the sub-queries.

This select will return:

CLIENTE|| NOMBRES || MAIL

888 || T800 ARNOLD || t800.arnold@cyberdyne.com

125 || SARAH CONNOR ||s.connor@skynet.com

Remember to index the columns you select and the distinct column must have not numeric data all in upper case or in lower case, or else it won't work. This will work with only one registered mail as well. Happy coding!!!


Try This

;With Tab AS (SELECT DISTINCT Email FROM  Products)
SELECT Email,ROW_NUMBER() OVER(ORDER BY Email ASC) AS  Id FROM Tab
ORDER BY Email ASC

The reason DISTINCT and GROUP BY work on entire rows is that your query returns entire rows.

To help you understand: Try to write out by hand what the query should return and you will see that it is ambiguous what to put in the non-duplicated columns.

If you literally don't care what is in the other columns, don't return them. Returning a random row for each e-mail address seems a little useless to me.


Try this:

SELECT ID, Email, ProductName, ProductModel FROM Products WHERE ID IN (SELECT MAX(ID) FROM Products GROUP BY Email)

참고URL : https://stackoverflow.com/questions/5021693/distinct-for-only-one-column

반응형