Tag Archives: SQL

SQL UPDATE() updating millions of records nicely

Ever needed to update millions of records before ?
New to tackling this ?
I had this experience when our database contained 17 million records when 15.5 million were junk and could be disgarded. My mission was flag the unwanted records and remove them.
This blog is simply about how to go about updating a table with millions of records. Fear not there is an approach that wont lock your database for hours, or expontentially increase the size of your transaction log file. Its known as Chunked updates.

original source explained here http://www.sqlusa.com/bestpractices2005/hugeupdate/

In my case I needed to update 17 million records based on a cross reference match of each record against another table containing 5.5K records, a single update statement would have takens hours and hours.
The chunked approach on an optimised indexed set of tables took just 35 minutes on a single license quad core proc SQL server 2005 standard edition.
Heres is the SQL statement

So you can see from the code i was chunking in 500000 rows segments, tune this to your requirements I guessed and struck it lucky.

For those interested, yes i did take steps to prevent this from occuring again. I created an insert trigger that cross references the User Agent table flagging the record. A Job then does the clean up of periodically archiving the unwanted data into a redundant database for historical reference.