MySQL LOAD DATA INFILE with ON DUPLICATE KEY UPDATE

For loading huge amounts of data into MySQL, LOAD DATA INFILE is by far the fastest option. Unfortunately, while it can be made to behave like INSERT IGNORE or REPLACE (through its IGNORE and REPLACE modifiers), ON DUPLICATE KEY UPDATE is not currently supported.

However, ON DUPLICATE KEY UPDATE has advantages over REPLACE. The latter does a delete and an insert when a duplicate exists, which adds index-maintenance overhead. Also, auto-increment ids do not stay the same on a replace.
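
The id change can be seen in a small experiment. The sketch below uses SQLite through Python's standard sqlite3 module as a stand-in, only because it needs no server. SQLite's REPLACE and its ON CONFLICT ... DO UPDATE clause (available from SQLite 3.24) are analogues of the MySQL statements, not the same syntax, and the table and column names are made up for illustration.

```python
import sqlite3


def ids_after_replace_and_upsert():
    """Return (id of row 'a' after REPLACE, id of row 'b' after an upsert)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "code TEXT UNIQUE, qty INTEGER)"
    )
    conn.execute("INSERT INTO items (code, qty) VALUES ('a', 1)")  # gets id 1
    conn.execute("INSERT INTO items (code, qty) VALUES ('b', 1)")  # gets id 2

    # REPLACE: the conflicting row is deleted and a fresh row is inserted.
    conn.execute("REPLACE INTO items (code, qty) VALUES ('a', 2)")

    # Upsert (SQLite's counterpart of ON DUPLICATE KEY UPDATE):
    # the existing row is updated in place.
    conn.execute(
        "INSERT INTO items (code, qty) VALUES ('b', 2) "
        "ON CONFLICT(code) DO UPDATE SET qty = excluded.qty"
    )

    id_a = conn.execute("SELECT id FROM items WHERE code = 'a'").fetchone()[0]
    id_b = conn.execute("SELECT id FROM items WHERE code = 'b'").fetchone()[0]
    conn.close()
    return id_a, id_b
```

Row 'a' comes back with a new id after the REPLACE, while row 'b' keeps its original id after the upsert. Any foreign keys pointing at the replaced row would be left dangling or cascaded, which is the practical reason to prefer an in-place update.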

How can ON DUPLICATE KEY UPDATE be emulated when using LOAD DATA INFILE?

Answers:

Method 1

These steps can be used to emulate this functionality:

  1. Create a new temporary table.
    CREATE TEMPORARY TABLE temporary_table LIKE target_table;
  2. Optionally, drop all indices from the temporary table to speed things up.
    SHOW INDEX FROM temporary_table;
    DROP INDEX `PRIMARY` ON temporary_table;
    DROP INDEX `some_other_index` ON temporary_table;
  3. Load the CSV into the temporary table.
    LOAD DATA INFILE 'your_file.csv'
    INTO TABLE temporary_table
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    (field1, field2);
  4. Copy the data using ON DUPLICATE KEY UPDATE.
    SHOW COLUMNS FROM target_table;
    INSERT INTO target_table
    SELECT * FROM temporary_table
    ON DUPLICATE KEY UPDATE field1 = VALUES(field1), field2 = VALUES(field2);
  5. Remove the temporary table.
    DROP TEMPORARY TABLE temporary_table;

Using SHOW INDEX FROM and SHOW COLUMNS FROM, this process can be automated for any given table.
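
One way the automation might look is a small Python helper that builds the statement sequence from the column and index names. This is a sketch under assumptions, not part of the original answer: the function names are hypothetical, the column list is assumed to come from the Field values of SHOW COLUMNS, and the index list from the distinct Key_name values of SHOW INDEX. It only generates SQL strings; executing them with a MySQL client library is left out. Note that VALUES() in ON DUPLICATE KEY UPDATE is deprecated as of MySQL 8.0.20, so newer servers may emit a warning.

```python
def quote_ident(name):
    """Backtick-quote a MySQL identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def build_upsert_statements(target, columns, index_names, csv_path,
                            tmp="temporary_table"):
    """Return the five-step sequence from Method 1 as a list of SQL strings.

    columns     -- column names (hypothetically, Field from SHOW COLUMNS)
    index_names -- index names (hypothetically, Key_name from SHOW INDEX)
    """
    target_q, tmp_q = quote_ident(target), quote_ident(tmp)
    cols = ", ".join(quote_ident(c) for c in columns)
    updates = ", ".join(
        f"{quote_ident(c)} = VALUES({quote_ident(c)})" for c in columns
    )
    # The path is embedded as a string literal: pass trusted paths only.
    path = csv_path.replace("\\", "\\\\").replace("'", "\\'")

    # Step 1: temporary table with the same structure as the target.
    stmts = [f"CREATE TEMPORARY TABLE {tmp_q} LIKE {target_q}"]
    # Step 2: drop each index on the temporary table.
    stmts += [f"DROP INDEX {quote_ident(i)} ON {tmp_q}" for i in index_names]
    # Step 3: bulk-load the CSV.
    stmts.append(
        f"LOAD DATA INFILE '{path}' INTO TABLE {tmp_q} "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        f"({cols})"
    )
    # Step 4: copy into the target, updating rows that already exist.
    stmts.append(
        f"INSERT INTO {target_q} ({cols}) SELECT {cols} FROM {tmp_q} "
        f"ON DUPLICATE KEY UPDATE {updates}"
    )
    # Step 5: clean up.
    stmts.append(f"DROP TEMPORARY TABLE {tmp_q}")
    return stmts
```

Calling build_upsert_statements("target_table", ["field1", "field2"], ["PRIMARY", "some_other_index"], "your_file.csv") yields the same statements as the manual steps, with identifiers backtick-quoted. Naming the columns explicitly in the INSERT ... SELECT, rather than using SELECT *, keeps the copy correct even if the CSV covers only some of the target's columns.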

Method 2

The first two steps in the answer shared by Jan can be replaced with a single query.

For steps 1 and 2, we can create a new table with the same column structure as the target table but without any indexes:

CREATE TEMPORARY TABLE temporary_table SELECT * FROM target_table WHERE 1=0;

This replaces:

  1. Create a new temporary table.
    CREATE TEMPORARY TABLE temporary_table LIKE target_table;
  2. Optionally, drop all indices from the temporary table to speed things up.
    SHOW INDEX FROM temporary_table;
    DROP INDEX `PRIMARY` ON temporary_table;
    DROP INDEX `some_other_index` ON temporary_table;

Method 3

Non-LOCAL Versus LOCAL Operation

The LOCAL modifier affects these aspects of LOAD DATA, compared to non-LOCAL operation:

  • It changes the expected location of the input file; see Input File Location.
  • It changes the statement security requirements; see Security Requirements.
  • It has the same effect as the IGNORE modifier on the interpretation of input file contents and error handling; see Duplicate-Key and Error Handling, and Column Value Assignment.

LOCAL works only if the server and your client both have been configured to permit it. For example, if mysqld was started with the local_infile system variable disabled, LOCAL produces an error. See Section 6.1.6, “Security Considerations for LOAD DATA LOCAL”.


All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
