基本上,我有3个表:customer_profiles_lib,customer_profiles_tmp和customer_duplicates_tmp。
我想检查是否来自customer_profiles_lib的每个记录都在customer_profiles_tmp中。如果不是,则将它们插入到customer_profiles_tmp中。如果是,则将它们插入到customer_duplicates_tmp中。
我在一个过程中尝试了此操作,但是我有900万条记录要处理,而且运行速度太慢了……这就是我的经验:
CREATE DEFINER=`company`@`%` PROCEDURE `customerImport`()
BEGIN
DECLARE unique_id INT;
DECLARE fin INT;
DECLARE curs CURSOR FOR SELECT customer_id AS unique_id FROM customer_profiles_lib;
DECLARE CONTINUE HANDLER FOR NOT FOUND SET fin = 1;
OPEN curs;
SET fin = 0;
REPEAT
FETCH curs INTO unique_id;
IF (SELECT EXISTS (SELECT customer_id FROM customer_profiles_tmp WHERE customer_id = unique_id)) THEN
SELECT unique_id AS 'ADDING';
INSERT IGNORE INTO customer_duplicates_tmp (first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number)
SELECT first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number FROM customer_profiles_lib WHERE customer_id = unique_id ORDER BY customer_profile_id DESC LIMIT 1;
ELSE
SELECT unique_id AS 'SKIPPING';
INSERT IGNORE INTO customer_profiles_tmp (first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number)
SELECT first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number FROM customer_profiles_lib WHERE customer_id = unique_id ORDER BY customer_profile_id DESC LIMIT 1;
END IF;
UNTIL fin END REPEAT;
CLOSE curs;
END
这种方法需要1个小时,并且可以插入,但不会在我的customer_duplicates_tmp表中放入任何内容。
INSERT IGNORE INTO customer_profiles_tmp (
first,
last,
address_1,
address_2,
city,
state,
zipcode,
email,
customer_id,
phone,
store_number
)
SELECT
tmp.first,
tmp.last,
tmp.address_1,
tmp.address_2,
tmp.city,
tmp.state,
tmp.zipcode,
tmp.email,
tmp.customer_id,
tmp.phone,
tmp.store_number
FROM customer_profiles_lib AS tmp;
感谢您的帮助!
似乎整个RBAR过程可以用两个SQL语句代替,从而显着提高了性能:
INSERT IGNORE INTO customer_duplicates_tmp
(first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number)
SELECT
first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number
FROM customer_profiles_lib
WHERE customer_id IN (SELECT customer_id FROM customer_profiles_tmp);
INSERT IGNORE INTO customer_profiles_tmp
(first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number)
SELECT
first, last, address_1, address_2, city, state, zipcode, email, customer_id, phone, store_number
FROM customer_profiles_lib
WHERE customer_id NOT IN (SELECT customer_id FROM customer_profiles_tmp);
本文收集自互联网,转载请注明来源。
如有侵权,请联系 [email protected] 删除。
我来说两句