[I've posted this on Stack Overflow:
https://stackoverflow.com/questions/63194116/upsert-with-non-unique-index]
I need to implement concurrent-safe UPSERT using a non-unique key and
avoid unnecessary auto-increment of ID.
Traditional INSERT ... ON DUPLICATE KEY [1] doesn't work for me, so I'm
performing:
INSERT INTO table (col1, col2, col3, col4, col5)
SELECT 1, 2, 'value3', 'value4', 'value5'
WHERE NOT EXISTS (SELECT 1
FROM table
WHERE col3 = 'value3'
AND col4 = 'value4'
AND col5 = 'value5')
then if it results in no row inserted, I'm performing:
UPDATE table
SET col1 = col1 + 1,
col2 = MAX(col2, 2)
WHERE col3 = 'value3'
AND col4 = 'value4'
AND col5 = 'value5'
There's an index:
CREATE INDEX ON table (col3, col4, col5)
It is non-unique as there are legacy data that does not allow me to
declare it unique. Newer records, however, should not have duplicated
(col3, col4, col5) rows.
Unsurprisingly, using the given INSERT statement I'm getting mixed
results trying to execute it concurrently from two sessions. I can see
the second session blocking until the first one commits its transaction,
but then the second transaction is also able to insert a new row
sometimes (or sometimes it achieves the expected of avoiding to insert a
duplicate (col3, col4, col5) row).
I'm currently performing manual unique-check after the insert:
SELECT COUNT(1)
FROM table
WHERE col3 = 'value3'
AND col4 = 'value4'
AND col5 = 'value5'
but I've also tried:
INSERT INTO table (col1, col2, col3, col4, col5)
SELECT 1, 2, 'value3', 'value4', 'value5'
WHERE NOT EXISTS (SELECT 1
FROM table
WHERE col3 = 'value3'
AND col4 = 'value4'
AND col5 = 'value5'
FOR UPDATE)
which appears to work with the examples I'm always getting a duplicate
(col3, col4, col5) row, otherwise. Is the given FOR UPDATE usage
reliable for the purpose of ensuring no duplicate (col3, col4, col5) row
will be inserted?
I'm using READ-COMMITTED transaction isolation.
[1]
https://dev.mysql.com/doc/refman/5.7/en/insert-on-duplicate.html
--
Stanimir