MySQL Big DELETEs

Table of Contents

The Problem
Why it is a Problem
InnoDB and undo
Solutions
PARTITION
Deleting in Chunks
SBR vs RBR
InnoDB Chunking Recommendation
Reclaiming the disk space
Another use case
Non-deterministic Replication
Replication and KILL
Iterating through a compound key
Postlog
Brought to you by Rick James

The Problem


How to DELETE lots of rows from a large table? Here is an example of purging items older than 30 days:
   DELETE FROM tbl WHERE ts < DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
If there are millions of rows in the table, this statement may take minutes, maybe hours.

Any suggestions on how to speed this up?

Why it is a Problem


    ⚈  MyISAM will lock the table during the entire operation, thereby nothing else can be done with the table.
    ⚈  InnoDB won't lock the table, but it will chew up a lot of resources, leading to sluggishness.
    ⚈  InnoDB has to write the undo information to its transaction logs; this significantly increases the I/O required.
    ⚈  Replication, being asynchronous, will effectively be delayed (on Slaves) while the DELETE is running.

InnoDB and undo


To be ready for a crash, a transactional engine such as InnoDB, will record what it is doing to a log file. To make that somewhat less costly, the log file is sequentially written. If the log files you have (there are usually 2) fill up because the delete is really big, then the undo information spills into the actual data blocks, leading to even more I/O.

Deleting in chunks avoids some of this excess overhead.

Limited benchmarking of total delete elapsed time show two observations:

* Total delete time approximately doubles above some 'chunk' size (as opposed to below that threshold). I do not have a formula relating the log file size with the threshold cutoff. * Chunk size below several hundred rows is slower. This is probably because the overhead of starting/ending each chunk dominates the timing.

Solutions


    ⚈  PARTITION -- Requires 5.1 and some careful setup, but is excellent for purging a time-base series.
    ⚈  DELETE in chunks -- Carefully walk through the table N rows at a time.

PARTITION


The idea here is to have a sliding window of partitions. Let's say you need to purge news articles after 30 days. The "partition key" would be the datetime (or timestamp) that is to be used for purging, and the PARTITIONs would be "range". Every night, a cron job would come along and build a new partition for the next day, and drop the oldest partition.

Dropping a partition is essentially instantaneous, much faster than deleting that many rows. However, you must design the table so that the entire partition can be dropped. That is, you cannot have some items living longer than others.

PARTITION tables have a lot of restrictions, some are rather weird. You can either have no UNIQUE (or PRIMARY) key on the table, or every UNIQUE key must include the partition key. In this use case, the partition key is the datetime. It should not be the first part of the PRIMARY KEY (if you have a PRIMARY KEY).

You can PARTITION InnoDB or MyISAM tables.

Since two news articles could have the same timestamp, you cannot assume the partition key is sufficient for uniqueness of the PRIMARY KEY, so you need to find something else to help with that.

Reference implementation for Partition maintenance

PARTITIONing requires MySQL 5.1. MySQL docs on PARTITION

Deleting in Chunks


(This discussion applies to both MyISAM and InnoDB.)

When deleting in chunks, be sure to avoid doing a table scan. The code below is good at that; it scans no more than 1001 rows in any one query. (The 1000 is tunable.)

Assuming you have news articles that need to be purged, and you have a schema something like
   CREATE TABLE tbl
      id INT UNSIGNED NOT NULL AUTO_INCREMENT,
      ts TIMESTAMP,
      ...
      PRIMARY KEY(id)
Then, this pseudo-code is a good way to delete the rows older than 30 days:
   @a = 0
   LOOP
      DELETE FROM tbl
         WHERE id BETWEEN @a AND @a+999
           AND ts < DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
      SET @a = @a + 1000
      sleep 1  -- be a nice guy
   UNTIL end of table
Notes:
    ⚈  It uses the PK instead of the secondary key. This gives much better locality of disk hits, especially for InnoDB.
    ⚈  You could (should?) do something to avoid walking through recent days but doing nothing. Caution -- the code for this could be costly.
    ⚈  The 1000 should be tweaked so that the DELETE usually takes under, say, one second.
    ⚈  No INDEX on ts is needed. (This helps INSERTs a little.)
    ⚈  If your PRIMARY KEY is compound, the code gets messier.
    ⚈  This code will not work without a numeric PRIMARY or UNIQUE key.
    ⚈  Read on, we'll develop messier code to deal with most of these caveats.

If there are big gaps in id (and there will after the first purge), then
   @a = SELECT MIN(id) FROM tbl
   LOOP
      SELECT @z := id FROM tbl WHERE id >= @a ORDER BY id LIMIT 1000,1
      If @z is null
         exit LOOP  -- last chunk
      DELETE FROM tbl
         WHERE id >= @a
           AND id <  @z
           AND ts < DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
      SET @a = @z
      sleep 1  -- be a nice guy, especially in replication
   ENDLOOP
   # Last chunk:
   DELETE FROM tbl
      WHERE id >= @a
        AND ts < DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
That code works whether id is numeric or character, and it mostly works even if id is not UNIQUE. With a non-unique key, the risk is that you could be caught in a loop whenever @z==@a. That can be detected and fixed thus:
   ...
      SELECT @z := id FROM tbl WHERE id >= @a ORDER BY id LIMIT 1000,1
      If @z == @a
         SELECT @z := id FROM tbl WHERE id > @a ORDER BY id LIMIT 1
   ...
The drawback is that there could be more than 1000 items with a single id. In most practical cases, that is unlikely.

If you do not have a primary (or unique) key defined on the table, and you have an INDEX on ts, then consider
   LOOP
      DELETE FROM tbl
         WHERE ts < DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
         ORDER BY ts   -- to use the index, and to make it deterministic
         LIMIT 1000
   UNTIL no rows deleted
This technique is NOT recommended because the LIMIT leads to a warning on replication about it being non-deterministic (discussed below).

SBR vs RBR


TBD -- "Row Based Replication" may impact this discussion.

InnoDB Chunking Recommendation


    ⚈  Have a 'reasonable' size for innodb_log_file_size.
    ⚈  Use AUTOCOMMIT=1 for the session doing the deletions.
    ⚈  Pick about 1000 rows for the chunk size.
    ⚈  Adjust the row count down if asynchronous replication (Statement Based) causes too much delay on the Slaves or hogs the table too much.

Reclaiming the disk space


This is costly.

MyISAM leaves gaps in the table (.MYD file); OPTIMIZE TABLE will reclaim the freed space after a big delete. But it may take a long time.

InnoDB is block-structured, organized in a BTree on the PRIMARY KEY. An isolated deleted row leaves a block less full. A lot of deleted rows can lead to coalescing of adjacent blocks. (Blocks are normally 16KB.)

In InnoDB, there is no practical way to reclaim the freed space from ibdata1, other than to reuse the freed blocks eventually.

The only option with innodb_file_per_table = 0 is to dump ALL tables, remove ibdata*, restart, and reload. That is rarely worth the effort and time.

InnoDB, even with innodb_file_per_table = 1, won't give space back to the OS, but at least it is only one table to rebuild with. In this case, something like this should work:
   CREATE TABLE new LIKE main;
   INSERT INTO new SELECT * FROM main;  -- This could take a long time
   RENAME TABLE main TO old, new TO main;
   DROP TABLE old;   -- Space freed up here

Another use case


Given:
   CREATE TABLE tbl (
      seq INT UNSIGNED AUTO_INCREMENT NOT NULL,
      channel ...,
      ts TIMESTAMP NOT NULL,
        ...
      PRIMARY KEY (seq)
   ) ENGINE = InnoDB;
And you want to prune old rows from tbl, one channel at a time (possibly with different threshholds).
   $cutoff = 0
   while(true) {   // or run via cron frequently enough
      $cutoff = SELECT seq  FROM tbl
                     WHERE seq > $cutoff
                     ORDER BY seq  LIMIT 1000, 1
      DELETE * FROM tbl
         WHERE channel = ...
           AND seq < $cutoff
           AND ts < DATE_SUB(NOW(), INTERVAL 7 DAY)
      $ct = RowsAffected()
      quit if $ct == 0
   }

    ⚈  The SELECT finds the thousandth record's id.
    ⚈  The technique won't hit more than 1000 rows in a pass, therefore "nice" on the system. (Tune the "1000" if needed.)
    ⚈  It is possible that the 1000 oldest records are for some other "channel", thereby hiding rows that need deleting. If so, add INDEX (channel, seq).
    ⚈  "INDEX (channel, seq)" should be avoided if unnecessary -- in huge tables, extra indexes _may_ be an I/O burden.
    ⚈  This technique works quite well without the "channel" stuff.

Non-deterministic Replication


Any UPDATE, DELETE, etc with LIMIT that is replicated to slaves (via Statement Based Replication) _may_ cause inconsistencies between the Master and Slaves. This is because the actual order of the records discovered for updating/deleting may be different on the slave, thereby leading to a different subset being modified. To be safe, add ORDER BY to such statements. Moreover, be sure the ORDER BY is deterministic -- that is, the fields/expressions in the ORDER BY are unique.

An example of an ORDER BY that does not quite work: Assume there are multiple rows for each 'date':
   DELETE * FROM tbl ORDER BY date LIMIT 111
Given that id is the PRIMARY KEY (or UNIQUE), this will be safe:
   DELETE * FROM tbl ORDER BY date, id LIMIT 111

Unfortunately, even with the ORDER BY, MySQL has a deficiency that leads to a bogus warning in mysqld.err. See
Spurious "Statement is not safe to log in statement format." warnings

Replication and KILL


If you KILL a DELETE (or any? query) on the Master in the middle of its execution, what will be Replicated?

If it is InnoDB, the query should be rolled back. (Exceptions??)

In MyISAM, rows are DELETEd as the statement is executed, and there is no provision for ROLLBACK. Some of the rows will be deleted, some won't. You probably have no clue of how much was deleted. In a single server, simply run the delete again. The delete is put into the binlog, but with error 1317. Since Replication is supposed to keep the Master and Slave in sync, and since it has no clue of how to do that, Replication stops and waits for manual intervention. In a HA (High Available) system using Replication, this is a minor disaster. Meanwhile, you need to go to each Slave(s) and verify that it is stuck for this reason, then do
   SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
   START SLAVE;
Then (presumably) reexecuting the DELETE will finish the aborted task.

Iterating through a compound key


To perform the chunked deletes recommended above, you need a way to walk through the PRIMARY KEY. This can be difficult if the PK has more than one column in it. To that end I have a perl library that will let you walk through a table 'nicely'. It effectively gives you a WHERE clause for each chunk. Those clauses are designed to encompass at most N rows (sometimes fewer). It will work for any UNIQUE key and even non-UNIQUE keys (but then it may violate the 'at most N').

Here is a hint of the efficient way to do compound GT: Assume that you left off at ($g, $s):
   INDEX(Genus, species) -- or INDEX(Genus)
   WHERE Genus >= '$g' AND ( species  > '$s' OR Genus > '$g' )
   ORDER BY Genus, species

Postlog


Posted, 2010


Contact me by posting a question at
MySQL Forums :: Performance
-- Rick James

Rick's MySQL Documents

MySQL Documents by Rick James

Tips, Debugging, HowTos, Optimizations, etc...

Rick's RoTs (Rules of Thumb -- lots of tips)
Memory Allocation (caching, etc)
Character Set and Collation problem solver
Converting from MyISAM to InnoDB -- includes differences between them
Big DELETEs - how to optimize
Compound INDEXes plus other insights into the mysteries of INDEXing
Partition Maintenance (DROP+REORG) for time series
Entity-Attribute-Value -- a common, poorly performing, design patter; plus an alternative
Find the nearest 10 pizza parlors (efficient searching on Latitude + Longitude)
Alter of a Huge table
Latest 10 news articles -- how to optimize the schema and code for such
Pagination, not with OFFSET, LIMIT
Data Warehouse techniques (esp., Summary Tables)
Techniques on efficiently finding a random row (On beyond ORDER BY RAND())
GUID/UUID Performance (type 1 only)
IP Range Table Performance
MySQL Limits
Galera Limitations (with Percona XtraDB Cluster / MariaDB)
Rollup Unique User Counts
Best of MySQL Forum