We needed to clean up (delete) some archived keys from Redis. A lot of them.

Prepare All The Keys To Delete

The first step is to create a file with all of the keys to be deleted. We exported this from our database; it's just a text file with one key per line, and it looks like this:

orderkey:15765:2PEObcEW5I1fesMCiUMi
orderkey:14237:im92SiPc7kvOVdwtGOEW
orderkey:28068:sWVVsGa8lVO0MvscdZke
orderkey:12751:CaszWC6DKrs08InOEaqU
...

The result file, which I'll call all_redis_keys.txt, contains 92,402,859 keys (3.1 GB).
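Before going further it's worth confirming that the line count and size of the export match what you expect. A small self-contained sketch (sample_keys.txt and the three fake keys are made up for illustration; on the real file you'd just run the wc commands):

```shell
# Stand-in for the real export: three fake keys, one per line.
printf 'orderkey:1:aaa\norderkey:2:bbb\norderkey:3:ccc\n' > sample_keys.txt

# Line count = number of keys; byte count = file size.
wc -l sample_keys.txt
wc -c sample_keys.txt
```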

Dry Run

Trying to execute one Redis request per key would be far too slow. Luckily DEL accepts multiple keys in a single command and returns the number of keys deleted.

Before we unleash this on Redis we want to make sure that it's going to work. Fortunately we can substitute the DEL command with the EXISTS command, which returns the number of keys that exist (i.e. the number of keys that would have been deleted).

Let's run through the first 1,000,000 keys in batches of 10,000 and output the counts to another file:

$ head -n 1000000 all_redis_keys.txt | xargs -n 10000 redis-cli EXISTS | tee count.txt

The tee command prints to stdout as well as writing to count.txt. If this is too noisy for you, you can write directly to the file with > count.txt instead.
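To see exactly how xargs groups the keys into batched commands, you can substitute echo for redis-cli. Each output line below is the argument list of one command invocation (toy data, batches of 3):

```shell
# Ten fake keys, batched 3 at a time; echo stands in for redis-cli EXISTS.
seq 1 10 | sed 's/^/key:/' | xargs -n 3 echo
```

The last batch is smaller than the rest, which is fine: both EXISTS and DEL accept any number of keys.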

The count.txt file (and stdout) looks like this:

5897
5899
5903
...

We can add up all of these numbers:

$ awk '{s+=$1} END {print s}' count.txt

The sum tells us that 589,742 keys would have been deleted.
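On a toy count file the same one-liner behaves like this (using the three sample batch counts from above):

```shell
# Sum a column of per-batch counts, one number per line.
printf '5897\n5899\n5903\n' | awk '{s+=$1} END {print s}'
# 17699
```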

The Real Run

$ time cat all_redis_keys.txt | xargs -n 10000 redis-cli DEL | tee count.txt
...

real 104m33.7s
user 3m34.005s
sys 0m48.68s

Now sum up the number of keys that were deleted:

$ awk '{s+=$1} END {print s}' count.txt
90646006

Over 90 million keys deleted in less than two hours (about 14,450 per second).
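The per-second rate can be recomputed from the totals above: 90,646,006 keys over 104m33.7s, i.e. 6,273.7 seconds:

```shell
# keys deleted / elapsed seconds, rounded to the nearest whole key.
awk 'BEGIN { printf "%.0f\n", 90646006 / (104 * 60 + 33.7) }'
# 14449
```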