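I wrote a query to update a flag, using a CASE statement to determine the value. However, when I run the query as an UPDATE statement, only about half of the expected rows are updated. More interestingly, I previously ran exactly the same UPDATE query against the same data and it worked as expected (comparing the old results against the new ones is what prompted me to investigate).
I tried a SELECT query using the same CASE statement and got the correct results, but switching it back to an UPDATE only updates about half of the records.
Moving the conditions into the WHERE clause fixes the problem. It seems to be the CASE statement in the SET part that causes it, but I don't know why. I'd like to understand this so I can avoid the mistake in future.
Original code: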
UPDATE D
SET PUBLISH_FLAG =
CASE WHEN
MAPPED_CAT NOT IN(1,2,3)
AND SRC != '999'
AND RECEIVED_DATE is not null
AND RECEIVED_DATE <= D.CENSUS_DATE
AND SCHEDULED_FLAG = 'N'
THEN 'Y'
ELSE 'N'
END
FROM TBL_DATA D
INNER JOIN TBL_PUBLISH V
ON D.ID = V.ID
AND D.CENSUS_DATE = V.CENSUS_DATE
AND D.VERSION_NUMBER = V.VERSION_NUMBER
LEFT JOIN TBL_CAT_MAP C
ON D.SRC_CATEGORY = C.SOURCE_CAT
Working code:
UPDATE D
SET PUBLISH_FLAG = 'Y'
FROM TBL_DATA D
INNER JOIN TBL_PUBLISH V
ON D.ID = V.ID
AND D.CENSUS_DATE = V.CENSUS_DATE
AND D.VERSION_NUMBER = V.VERSION_NUMBER
LEFT JOIN TBL_CAT_MAP C
ON D.SRC_CATEGORY = C.SOURCE_CAT
WHERE
MAPPED_CAT NOT IN(1,2,3)
AND SRC != '999'
AND RECEIVED_DATE is not null
AND RECEIVED_DATE <= D.CENSUS_DATE
AND SCHEDULED_FLAG = 'N'
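I thought both should produce exactly the same result. What am I missing?
To help illustrate, the query below shows the two side by side: the PUBLISH_FLAG column (updated using either my original code or PSK's answer) has 10162 'Y' values (the rest are 'N'), while the pub_2 column has the correct 18917 'Y' values.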
SELECT
PUBLISH_FLAG,
CASE WHEN
MAPPED_CAT NOT IN(1,2,3)
AND SRC != '999'
AND RECEIVED_DATE is not null
AND RECEIVED_DATE <= D.CENSUS_DATE
AND SCHEDULED_FLAG = 'N'
THEN 'Y'
ELSE 'N'
END as pub_2
FROM TBL_DATA D
INNER JOIN TBL_PUBLISH V
ON D.ID = V.ID
AND D.CENSUS_DATE = V.CENSUS_DATE
AND D.VERSION_NUMBER = V.VERSION_NUMBER
LEFT JOIN TBL_CAT_MAP C
ON D.SRC_CATEGORY = C.SOURCE_CAT
WHERE
CASE WHEN
MAPPED_CAT NOT IN(1,2,3)
AND SRC != '999'
AND RECEIVED_DATE is not null
AND RECEIVED_DATE <= D.CENSUS_DATE
AND SCHEDULED_FLAG = 'N'
THEN 'Y'
ELSE 'N'
END = 'Y'
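Your first query is definitely not the same as the second. In fact, from what I can see here, I would argue that your UPDATE with CASE is the correct one, because it updates both sides of the flag. The other query, with the WHERE clause, never sets the flag to 'N' where it should. How exactly did you determine the expected "correct" number of updates? I think you expect the UPDATE statement to update as many rows as the SELECT statement returns, but that is not always the case. Depending on your filters, the JOINs you are making may return more than one row per TBL_DATA row (a partial cartesian product).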
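One way to check whether that is happening is to count how many joined rows each TBL_DATA row produces and whether the CASE expression yields both 'Y' and 'N' for it. The query below is only a sketch: it assumes that (ID, CENSUS_DATE, VERSION_NUMBER) identifies a single row in TBL_DATA, which the question does not confirm, and it reuses the unqualified column names exactly as they appear in the original query.

```sql
-- Assumption: (ID, CENSUS_DATE, VERSION_NUMBER) identifies one row in TBL_DATA.
-- Returns every such row for which the joins produce more than one row
-- AND the CASE expression gives conflicting flag values.
SELECT
    D.ID,
    D.CENSUS_DATE,
    D.VERSION_NUMBER,
    COUNT(*)      AS joined_rows,
    MIN(X.pub_2)  AS min_flag,
    MAX(X.pub_2)  AS max_flag
FROM TBL_DATA D
INNER JOIN TBL_PUBLISH V
    ON D.ID = V.ID
    AND D.CENSUS_DATE = V.CENSUS_DATE
    AND D.VERSION_NUMBER = V.VERSION_NUMBER
LEFT JOIN TBL_CAT_MAP C
    ON D.SRC_CATEGORY = C.SOURCE_CAT
CROSS APPLY (
    -- Same CASE expression as in the original UPDATE, computed once per joined row.
    SELECT CASE WHEN
            MAPPED_CAT NOT IN (1,2,3)
            AND SRC != '999'
            AND RECEIVED_DATE IS NOT NULL
            AND RECEIVED_DATE <= D.CENSUS_DATE
            AND SCHEDULED_FLAG = 'N'
        THEN 'Y' ELSE 'N' END AS pub_2
) X
GROUP BY D.ID, D.CENSUS_DATE, D.VERSION_NUMBER
HAVING COUNT(*) > 1
   AND MIN(X.pub_2) <> MAX(X.pub_2);
```

If this returns rows, the UPDATE with CASE has more than one candidate value for each of them, and SQL Server documents the result of such a non-deterministic UPDATE ... FROM as undefined. That would also explain why the same statement could give a different count on an earlier run.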
Consider the following script, which reproduces the effect with two small temp tables.
CREATE TABLE #table1 (Field_1 INT, Field_2 VARCHAR(MAX))
INSERT INTO
#table1
VALUES
(1, 'Item A'),
(2, 'Item B'),
(3, 'Item C'),
(4, 'Item D'),
(5, 'Item E')
CREATE TABLE #table2 (Field_1 INT, Field_2 VARCHAR(MAX))
INSERT INTO
#table2
VALUES
(1, 'Item A'),
(1, 'Item B'),
(2, 'Item B'),
(2, 'Item C'),
(3, NULL)
-- This produces 7 rows:
SELECT
*
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]
-- This updates 1 row. This is akin to your second query. Only one flag value is changed.
-- You would still have to write an UPDATE statement for the 'N' flag update.
UPDATE
#table1
SET
#table1.[Field_2] = 'Y'
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]
WHERE
#table2.[Field_2] = 'Item C'
-- Because your UPDATE statement only updates the values to 'Y' where a condition matches, only one record is changed here.
-- The others are left untouched.
SELECT
*
FROM
#table1
-- Now what happens if we perform the reverse UPDATE.
UPDATE
#table1
SET
#table1.[Field_2] = 'N'
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]
WHERE
NOT (#table2.[Field_2] = 'Item C')
-- First of all we notice that we are not dealing with NULL values at all so only two records get changed to 'N'.
-- The first record gets changed because it does not have a match on 'Item C'.
-- The second record also gets changed because it does not have a match on 'Item C', i.e. there is at least one record without an 'Item C' match.
-- The last three records have either no match in the JOIN or are NULL in #table2. Meaning they are not updated.
-- This is why I'm more a fan of your CASE query, because in theory it should deal with setting everything to the correct value.
SELECT
*
FROM
#table1
-- Let's see what would happen with a CASE statement.
-- Since our JOIN returns more than one row per #table1 row, there are multiple candidates for #table1.[Field_1] = 2: it can be updated to either 'N' or 'Y'.
-- T-SQL picks one of them ('N' in this run). You will see that after the UPDATE.
SELECT
*, CASE WHEN #table2.[Field_2] = 'Item C' THEN 'Y' ELSE 'N' END
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]
-- This updates 5 rows, maybe you would have expected 7 here based on the above SELECT statement?
-- Notice that it sets every row to 'N' here: the CASE deals with both sides.
-- The value is either 'Y' or 'N', so the UPDATE touches every record the JOIN returns.
-- This is in contrast with an UPDATE that uses WHERE, which only touches one side; because of JOIN clauses and NULL values
-- it's entirely possible that two such UPDATE statements do not cover the entire table if written incorrectly.
-- You would have to write an UPDATE statement like this one, which comes after the first.
--UPDATE
-- #table1
--SET
-- #table1.[Field_2] = 'N'
--FROM
-- #table1
--LEFT JOIN
-- #table2 ON #table1.[Field_1] = #table2.[Field_1]
--WHERE
-- #table1.[Field_2] <> 'Y' OR #table1.[Field_2] IS NULL
-- In conclusion this means that if you want to be absolutely sure you have updated all values to their correct setting: use CASE.
-- But if you only care about setting 'Y' to the correct value: don't use CASE.
-- If you do use CASE, make sure you are definitely performing your JOIN correctly and you are calculating the correct value for both sides.
UPDATE
#table1
SET
#table1.[Field_2] = CASE WHEN #table2.[Field_2] = 'Item C' THEN 'Y' ELSE 'N' END
FROM
#table1
LEFT JOIN
#table2 ON #table1.[Field_1] = #table2.[Field_1]
SELECT
*
FROM
#table1
DROP TABLE #table1
DROP TABLE #table2
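If the intended rule is "set 'Y' when at least one matching row qualifies, otherwise 'N'" (an assumption on my part; the question does not state the rule), the ambiguity can be removed by not joining the one-to-many table in the FROM clause and testing it with EXISTS instead. The sketch below recreates the same two temp tables, since the script above drops them; the aliases t1 and t2 are only illustrative.

```sql
CREATE TABLE #table1 (Field_1 INT, Field_2 VARCHAR(MAX));
INSERT INTO #table1
VALUES (1, 'Item A'), (2, 'Item B'), (3, 'Item C'), (4, 'Item D'), (5, 'Item E');

CREATE TABLE #table2 (Field_1 INT, Field_2 VARCHAR(MAX));
INSERT INTO #table2
VALUES (1, 'Item A'), (1, 'Item B'), (2, 'Item B'), (2, 'Item C'), (3, NULL);

-- Each #table1 row is evaluated exactly once, so there is only one possible value per row.
-- 'Y' if at least one matching #table2 row is 'Item C', otherwise 'N'
-- (this also covers rows with no match and rows whose match is NULL).
UPDATE t1
SET t1.Field_2 = CASE WHEN EXISTS (
                         SELECT 1
                         FROM #table2 t2
                         WHERE t2.Field_1 = t1.Field_1
                           AND t2.Field_2 = 'Item C'
                     ) THEN 'Y' ELSE 'N' END
FROM #table1 t1;

-- Row 2 is now 'Y'; rows 1, 3, 4 and 5 are 'N'.
SELECT * FROM #table1 ORDER BY Field_1;

DROP TABLE #table1;
DROP TABLE #table2;
```

Unlike the CASE over the joined rows, this version gives the same result on every run, because the choice between 'Y' and 'N' no longer depends on which joined row the engine happens to use.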