In an attempt to speed up my IMAP access, I moved old emails into folders based on year. This speed-up badly backfired when IMAP/mdir decided to make 7 copies of each email in the same folder. I ended up with 27,000 emails in my 2006 folder!
With my mailbox quota full I needed a quick solution… and couldn’t find one! Thunderbird has a plugin that will search for and delete duplicate messages, but it runs over IMAP, and the server was crippled trying to handle all the requests.
Using Google, I stumbled across this solution for finding and deleting duplicate messages using reformail. After getting reformail installed, though, I found it to be very slow, and the number of messages it wanted to delete didn’t add up, so I had to abandon this approach.
In the end I decided to write my own PHP script that cycles through a specific mail directory, finds duplicate messages based on the Message-Id header (or a checksum of the email if there is no Message-Id), and then deletes the unnecessary duplicates. It worked a treat, and went through the 27,000 emails in less than 5 minutes! If anybody wants the code, it’s below!
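The approach itself is simple enough to sketch in a few lines. The following is a minimal Python illustration of the idea, not the PHP script: it assumes a Maildir-style folder with `cur/` and `new/` subdirectories holding one message per file, and the function names (`message_key`, `find_duplicates`, `remove_duplicates`) are made up for this example. It keeps the first message seen for each key and, by default, only reports the rest without deleting anything.

```python
import hashlib
import os
from email.parser import BytesParser


def message_key(path):
    """Return a key for one message file: its Message-Id, else a SHA-1 of the file."""
    with open(path, "rb") as fh:
        raw = fh.read()
    # Header lookup is case-insensitive, so "Message-ID" matches too.
    headers = BytesParser().parsebytes(raw, headersonly=True)
    message_id = str(headers.get("Message-Id") or "").strip()
    if message_id:
        return "id:" + message_id
    # No Message-Id header: fall back to a checksum of the whole file.
    return "sha1:" + hashlib.sha1(raw).hexdigest()


def find_duplicates(folder):
    """Return paths of messages whose key was already seen in this folder."""
    seen = set()
    duplicates = []
    # Assumption: Maildir layout, with messages stored in cur/ and new/.
    for sub in ("cur", "new"):
        directory = os.path.join(folder, sub)
        if not os.path.isdir(directory):
            continue
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue
            key = message_key(path)
            if key in seen:
                duplicates.append(path)  # a copy of something already kept
            else:
                seen.add(key)  # first occurrence is the one that stays
    return duplicates


def remove_duplicates(folder, dry_run=True):
    """List duplicate files; delete them only when dry_run is False."""
    duplicates = find_duplicates(folder)
    if not dry_run:
        for path in duplicates:
            os.remove(path)
    return duplicates
```

Calling `remove_duplicates("/path/to/folder")` returns the list of files that would be deleted, which makes it easy to check that the count looks sane before running it again with `dry_run=False`. Working directly on the files avoids sending every request through the IMAP server.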