
Oh! Bad performance. December 5, 2008

Posted by msrviking in Performance tuning.

A few months ago, one of the team members came to me asking for help with “How do we improve the performance of large data (bulk) operations using BULK INSERT?”. After a couple of hours of thinking and analyzing what was going on in the environment and in the scripts, here is what I came up with as recommendations (rough T-SQL sketches for most of these follow the list).

· Disable foreign key constraints

· Truncate instead of delete. DELETE is a fully logged, row-by-row operation, while TRUNCATE is minimally logged (only the page deallocations are recorded). The disadvantage is that recovery up to the point of failure may only be possible from the last backup.

· Disable AUTO CREATE and AUTO UPDATE STATISTICS

· Optionally disable the constraints (NULL/NOT NULL, CHECK constraints) and triggers

· Drop the table’s indexes (except the clustered index) to improve performance of large bulk imports, and recreate them after the load.

· Use the TABLOCK hint for a non-parallel load operation

· Table locking is not required to bulk-import data into a table from multiple clients in parallel, but using it can improve performance of the parallel load operation

· When indexes exist on the table, you cannot perform a parallel load operation using the TABLOCK option

· If you have control over the creation of your input files, make their size evenly divisible by the number of load threads you want to run in parallel. To use the processors most efficiently, distribute the data evenly among the clients, and make sure the data files are similar in size if you want to import them in parallel from multiple clients into an instance of SQL Server.

· For a parallel load operation, split the files into equal sizes

· Insert the n million rows in smaller batches, say 10,000 rows each. Use ROWS_PER_BATCH (e.g., ROWS_PER_BATCH = 2500) and the FIRSTROW and LASTROW options of BULK INSERT

· For bulk-import operations, minimal logging is more efficient than full logging and reduces the possibility that a bulk-import operation will fill the log space. To minimally log a bulk-import operation on a database that normally uses the full recovery model, first switch the database to the bulk-logged recovery model. After bulk importing the data, switch the recovery model back to the full recovery model.

· Presort the input files and create the clustered index before loading the records. If the files are presorted, it should shorten the index-creation step.

· Run the BULK INSERT from other client machines so that SQL Server Agent and SQL Server do not have to read the files directly. Move the input files to another computer to see what the load pattern looks like with the network involved.

· Database configuration – if the data or log files autogrow by a percentage, growth is inefficient. Configure the growth increments in MB instead, and create the database with enough reserved space up front.
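Here is a rough T-SQL sketch of emptying the target and disabling the constraints and triggers around the load; dbo.TargetTable is a placeholder name, not the actual table from the environment.

-- Before the load: empty the target and disable constraints/triggers (dbo.TargetTable is a placeholder)
-- TRUNCATE only works if no foreign key in another table references dbo.TargetTable
TRUNCATE TABLE dbo.TargetTable;
ALTER TABLE dbo.TargetTable NOCHECK CONSTRAINT ALL;   -- disables FK and CHECK constraints on this table
DISABLE TRIGGER ALL ON dbo.TargetTable;

-- ... run the bulk import here ...

-- After the load: re-enable and re-validate so the constraints are trusted again
ALTER TABLE dbo.TargetTable WITH CHECK CHECK CONSTRAINT ALL;
ENABLE TRIGGER ALL ON dbo.TargetTable;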
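For the statistics settings, something like this should do it at the database level; MyBulkDb is a placeholder database name.

-- Turn automatic statistics off for the duration of the load (MyBulkDb is a placeholder)
ALTER DATABASE MyBulkDb SET AUTO_CREATE_STATISTICS OFF;
ALTER DATABASE MyBulkDb SET AUTO_UPDATE_STATISTICS OFF;

-- ... run the bulk import here ...

-- Turn them back on and refresh the statistics on the loaded table
ALTER DATABASE MyBulkDb SET AUTO_CREATE_STATISTICS ON;
ALTER DATABASE MyBulkDb SET AUTO_UPDATE_STATISTICS ON;
UPDATE STATISTICS dbo.TargetTable;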
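Dropping the nonclustered indexes and recreating them after the load could look like this; the index and column names are made up for illustration.

-- Script out the nonclustered index definitions first, then drop them (names are placeholders)
DROP INDEX IX_TargetTable_LoadDate ON dbo.TargetTable;

-- ... run the bulk import here ...

-- Recreate the nonclustered index once the data is in
CREATE NONCLUSTERED INDEX IX_TargetTable_LoadDate ON dbo.TargetTable (LoadDate);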
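A plain BULK INSERT with the TABLOCK hint, assuming a comma-delimited file; the path and terminators are placeholders and would have to match the real input files.

-- Single (non-parallel) load with a table-level lock to cut down locking overhead
BULK INSERT dbo.TargetTable
FROM 'D:\loads\datafile_1.txt'
WITH (
    TABLOCK,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);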
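For loading in smaller groups, BATCHSIZE commits every n rows, and FIRSTROW/LASTROW let each client take its own slice of the file; ROWS_PER_BATCH is the hint mentioned in the list above. The numbers here are only examples.

-- Commit every 10,000 rows, and load only rows 1..1,000,000 from this client
BULK INSERT dbo.TargetTable
FROM 'D:\loads\datafile_1.txt'
WITH (
    BATCHSIZE = 10000,
    FIRSTROW = 1,
    LASTROW = 1000000,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);

-- Alternatively, load the whole file as one batch and give the optimizer a row-count hint
BULK INSERT dbo.TargetTable
FROM 'D:\loads\datafile_1.txt'
WITH (
    ROWS_PER_BATCH = 2500,
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);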
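Switching the recovery model around the load is straightforward; a log backup afterwards is a good idea since a bulk-logged log backup cannot be restored to a point in time. MyBulkDb and the backup path are placeholders.

-- Switch to bulk-logged so the import is minimally logged
ALTER DATABASE MyBulkDb SET RECOVERY BULK_LOGGED;

-- ... run the bulk import here ...

-- Switch back to full recovery and take a log backup to resume the point-in-time chain
ALTER DATABASE MyBulkDb SET RECOVERY FULL;
BACKUP LOG MyBulkDb TO DISK = 'D:\backups\MyBulkDb_log.trn';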
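If the input file is presorted on the clustered index key, the ORDER hint tells BULK INSERT the data is already in that order so it can skip the sort; OrderDateKey is a made-up clustered key column.

-- Data file is presorted on the clustered index key (column name is a placeholder)
BULK INSERT dbo.TargetTable
FROM 'D:\loads\datafile_sorted.txt'
WITH (
    TABLOCK,
    ORDER (OrderDateKey ASC),
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);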
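And for the database configuration point, reserve space up front and grow in fixed MB increments rather than by percentage; all names, sizes, and paths below are only illustrative.

-- Create the database with enough reserved space and MB-based autogrowth (all values are placeholders)
CREATE DATABASE MyBulkDb
ON PRIMARY
(
    NAME = MyBulkDb_data,
    FILENAME = 'D:\data\MyBulkDb.mdf',
    SIZE = 20480MB,
    FILEGROWTH = 1024MB
)
LOG ON
(
    NAME = MyBulkDb_log,
    FILENAME = 'E:\log\MyBulkDb.ldf',
    SIZE = 4096MB,
    FILEGROWTH = 512MB
);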

Why don’t you guys throw your thoughts into the ring too?
