
SQL Server 2005 SP3 is here! December 16, 2008

Posted by msrviking in DB Administration, SQL Server (SP's).

At last! The one we have all been waiting for is out for download. This service pack, SQL Server 2005 SP3, consolidates all the cumulative update (CU) packages and fixes released so far.

What has improved, quoting from “What’s New in SQL Server 2005 SP3”, spans the Database Engine, Notification Services, Replication, and Reporting Services.

Check this link for the details of what has improved, and download SP3 from here.

By the way, please don’t apply service packs directly to production or other application-critical environments. Try them in a sandbox first to watch the behavior, test them, and only then deploy to the appropriate environments.

Cheers!


SQL Server 2005 BOL (December 2008) December 16, 2008

Posted by msrviking in Technical Documentation.

Guys,

The SQL Server 2005 BOL (Dec ’08) version is available for download at this link. Keep yourself updated with the latest documentation.

Cheers!

AWS is for public beta testing December 9, 2008

Posted by msrviking in General.

Amazon Web Services’ SimpleDB is out for public beta testing. Here is what it is all about in short; learn more from this link.

A short description from AWS website:

“Amazon SimpleDB is a web service providing the core database functions of data indexing and querying. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud, making web-scale computing easier and more cost-effective for developers.

A traditional, clustered relational database requires a sizable upfront capital outlay, is complex to design, and often requires a DBA to maintain and administer. Amazon SimpleDB is dramatically simpler, requiring no schema, automatically indexing your data and providing a simple API for storage and access. This approach eliminates the administrative burden of data modeling, index maintenance, and performance tuning. Developers gain access to this functionality within Amazon’s proven computing environment, are able to scale instantly, and pay only for what they use.”

Oh! Bad performance. December 5, 2008

Posted by msrviking in Performance tuning.

A few months ago, one of the teams came to me asking for help: “How do we enhance the performance of large (bulk) data operations using BULK INSERT?” After a couple of hours of thinking and analyzing what was going on in the environment and in the scripts, here is what I came up with as recommendations.

· Disable foreign key constraints

· Truncate instead of delete. DELETE is a fully logged operation, whereas TRUNCATE is minimally logged (only the page deallocations are logged). The disadvantage is that recovery up to the point of failure may not be possible; you can recover only from the last backup.

· Disable AUTO CREATE and AUTO UPDATE STATISTICS

· Optionally disable the constraints (null, not null, check constraints), triggers

· Drop nonclustered indexes to improve performance for large bulk imports, keeping the clustered index in place.

· Use the TABLOCK hint for non-parallel operations

· Table locking is not necessary to bulk-import data into a table from multiple clients in parallel, but doing so can improve the performance of a parallel load operation

· When indexes exist on a table, you cannot perform a parallel load operation using the TABLOCK option

· For parallel load operations, split the input files into equal sizes. If you have control over the creation of your input files, make them of a size that is evenly divisible by the number of load threads you want to run in parallel. To use the processors most efficiently, distribute the data evenly among the clients: make sure the data files are similar in size if you want to import them in parallel from multiple clients into an instance of SQL Server.

· Insert n million rows in smaller groups, say 10,000 rows each (for example). Use ROWS_PER_BATCH (e.g., = 2500) along with the FIRSTROW and LASTROW options of BULK INSERT

· For bulk-import operations, minimal logging is more efficient than full logging and reduces the possibility that a bulk-import operation will fill the log space. To minimally log a bulk-import operation on a database that normally uses the full recovery model, first switch the database to the bulk-logged recovery model. After bulk importing the data, switch back to the full recovery model.

· Presort the input files and create the clustered index first, before loading the records. If the data is presorted, this should shorten the index-creation step.

· Run the BULK INSERT from other client machines to remove the need for SQL Server Agent and SQL Server to handle reading the files directly. Move the input files to another computer to see what the load pattern looks like with the network involved.

· Database configuration – growing the data or log files by a percentage is inefficient. Configure growth in fixed MB increments instead, and create the database with enough reserved space
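The recommendations above can be sketched in T-SQL. This is a hedged outline, not a production script: the database, table, and file names below are hypothetical, and which constraints, triggers, and indexes you should disable depends entirely on your schema.

```sql
-- Hypothetical names throughout (MyDB, dbo.SalesStaging, the file path).

-- 1. Switch to bulk-logged recovery so the import is minimally logged.
ALTER DATABASE MyDB SET RECOVERY BULK_LOGGED;

-- 2. Disable constraints and triggers on the target table.
ALTER TABLE dbo.SalesStaging NOCHECK CONSTRAINT ALL;
ALTER TABLE dbo.SalesStaging DISABLE TRIGGER ALL;

-- 3. Bulk import with a table lock and a batch-size hint;
--    FIRSTROW = 2 skips a header row in the data file.
BULK INSERT dbo.SalesStaging
FROM 'D:\loads\sales_01.dat'
WITH (TABLOCK, ROWS_PER_BATCH = 10000, FIRSTROW = 2);

-- 4. Re-enable and re-validate the constraints and triggers.
ALTER TABLE dbo.SalesStaging WITH CHECK CHECK CONSTRAINT ALL;
ALTER TABLE dbo.SalesStaging ENABLE TRIGGER ALL;

-- 5. Switch back to full recovery; take a log backup afterward
--    to restart the log backup chain.
ALTER DATABASE MyDB SET RECOVERY FULL;
```

Dropping and recreating nonclustered indexes around step 3 follows the same pattern, and for parallel loads you would run step 3 concurrently from several clients against similarly sized files.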

Why don’t you throw your thoughts into the ring too?

Another whitepaper! December 5, 2008

Posted by msrviking in Technical Documentation.

There is another white paper from the stable of the SQLCAT team. Check this link and see for yourself what it is about.

Great work SQLCAT team.

Cheers!

Visit this, that, these…. December 5, 2008

Posted by msrviking in General.

Guys,

I was thinking of writing this entry for those who read my blog quite often.

This morning when I got out of bed, I asked myself what I would be posting on my blog today. And there, as usual, my brain started working – do this, do that – and hey, by the way, why not say visit this, that, and these too…

There are several bloggers (a special mention to my favorite bloggers) who are industry experts on SQL Server and who write prolifically. I definitely don’t mind sharing links on my blog to these guys, who are great MVPs (and, with due respect, to those who are not MVPs), so that we can learn a lot more from others’ experiences too.

With this note, I intend to say that you may see, or probably are already seeing, links to other bloggers alongside my own posts. So please go ahead and read them to fill in information you would probably love to have.

Okay, speech over; here are a bunch of links you will want to visit today.

1) Buck Woody has wonderful entries, and here is another one you will want to visit.

2) James L has a good post, and it is also the one that inspired me to write the speech above (don’t quote me anywhere :)).

Happy Reading, cheers!

SQL Server 2005 DB Availability Features December 4, 2008

Posted by msrviking in High Availability.

Some time ago I had to come up with a document for my IT shop describing and comparing the different SQL Server 2005 high availability features.

With inputs from several SQL Server websites, MSDN, blog posts from my favorite bloggers, and other resources, I managed to put together what is here. At the end, I have included a table that serves as a checklist of all the available features.

What is availability?

The simple definition of the availability of a system is the amount of time the system is up in a given year. Hence, if a system needs to be available 99.99% of the time, it can’t be down for more than about 52 minutes in a year (0.01% of 525,600 minutes ≈ 52.6 minutes).

What is high availability?

A highly available system is one that can potentially nullify the effects of a failure and maintain availability to such a degree that end users are not affected by the failure.

High availability in the SQL Server world revolves around these questions:

Ø Are the instances up and running?

Ø Are all databases always available for the business?

Ø What if we bring the instance or database down as part of planned downtime, will the business tolerate this planned downtime?

Ø What if the instance or database crashes as part of an unplanned downtime, will the business tolerate such unplanned downtime?

In order to keep the instance or database up and running, we have the following high availability features in SQL Server 2005:

Ø Database mirroring (New in SQL Server 2005)

Ø Failover clustering

Ø Replication

Ø Log shipping

Scalability vs. High Availability:

Availability and scalability may seem to be conceptually related, but they are actually not. With high availability, we guarantee a certain level of uptime, whereas scalability is primarily about getting better performance through the utilization of more resources.

In scalability, we have two different solutions for achieving better performance and balancing the load on SQL Server – scaling up and scaling out. Scaling up means increasing resources such as memory, CPU, and disk controllers, and happens at the hardware level. Scaling out means using additional servers to distribute the load. This document focuses on availability options; a subsequent document will highlight scalability options in SQL Server 2005.

Description of High Availability options

Database mirroring

In database mirroring, an originating SQL Server 2005 instance continuously sends a database’s transaction log records to a copy of the database on another standby SQL Server instance. The originating database and server have the role of principal, and the receiving database and server have the role of mirror. The principal and mirror servers are separate instances of SQL Server 2005.

In all SQL Server databases, data changes are recorded in the transaction log before any changes to actual data pages are made. The transaction log records are placed first in a database’s log buffer in memory, and then flushed to disk (or ‘hardened’) as quickly as possible. In database mirroring, as the principal server writes the principal database’s log buffer to disk, it simultaneously sends that block of log records to the mirror instance. When the mirror server receives a block of log records, it places the log records first into the mirror database’s log buffer and then hardens them to disk as quickly as possible.

The principal and mirror servers are each considered a partner in the database mirroring session. A database mirroring session consists of a relationship between the partner servers when they mirror a database from one partner to another. A given partner server may have the principal role for one database and a mirror role for a different database.

In addition to the two partner servers (principal and mirror), a database mirroring session may have an optional third server, called the witness. The witness server’s role is to enable automatic failover: when database mirroring is used for high availability and the principal server suddenly fails, if the mirror server has confirmation from the witness, it can automatically take on the role of principal and make its database available within a few seconds. In such a mode, when using ADO.NET or the SQL Native Client, the application can fail over transparently using a connection string that looks, for example, like:

Data Source=myServerAddress; Failover Partner=myMirrorServer; Initial Catalog=myDataBase; Integrated Security=True;

There are two types of transfer mechanism for a database mirroring session: Synchronous and Asynchronous.

Synchronous transfer: The principal server hardens its principal database log records to disk; it also sends them to the mirror. The principal then waits for a response from the mirror server. The mirror responds when it has hardened those same log records to the mirror’s log disk.

Asynchronous transfer: The principal server will not wait for an acknowledgment from the mirror that the mirror has hardened a block of transaction records. The mirror will attempt to keep up with the principal, by recording transactions as quickly as possible, but some transactions may be lost if the principal suddenly fails and you force the mirror into service.
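Once a mirroring session is established, switching between the two transfer mechanisms is a single statement on the principal. A minimal sketch, assuming an existing mirroring session and a hypothetical database name MyDatabase:

```sql
-- Synchronous transfer (high-safety mode): the principal waits for the
-- mirror to harden each log block before acknowledging commits.
ALTER DATABASE MyDatabase SET PARTNER SAFETY FULL;

-- Asynchronous transfer (high-performance mode): the principal does not
-- wait for the mirror; some transactions may be lost on forced failover.
ALTER DATABASE MyDatabase SET PARTNER SAFETY OFF;
```

Note that automatic failover requires synchronous transfer (SAFETY FULL) plus a witness; in asynchronous mode, failover is a manual, forced operation.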

Failover clustering

A failover cluster consists of a set of redundant servers (called nodes) that share an external disk system. Clustering relies on the clustering capability of the Windows operating system, so Windows must be configured for clustering. The Windows 2003 server editions that support clustering are Enterprise and Datacenter. The mechanism also relies on a quorum disk shared by the nodes; the database files themselves reside on shared storage, typically a shared disk array.

During a cluster failover, a virtual SQL Server instance moves from one node to another. As a result, a cluster failover appears to external applications as if the virtual SQL Server instance is briefly unavailable (usually for less than a minute), then available again. The instance seemingly just stops and restarts. Behind the scenes, when a SQL Server instance on one physical server becomes unavailable, Windows closes the database files that the instance had open on the shared disk, starts another SQL Server instance on another physical server, opens the same data files, and takes over the virtual server name and virtual IP address of the failed instance. Open connections are closed during the failover; committed transactions are rolled forward during the redo phase of database recovery on the secondary node, and the undo phase rolls back uncommitted transactions.

There are two types of failover clustering in Windows: Active/Passive and Active/Active.

Active/Passive means that your cluster has an active node and a passive node. If the active node fails, its defined resources shift to the passive node, which then becomes active. The passive node is not accessible unless a failure occurs and the resources shift.

In Active/Active clustering, both nodes are accessible and active. If a node fails, its resources shift to the other active node, which then carries the load of both nodes.

Transactional Replication

Of the various types of replication supported by SQL Server, transactional replication is used for high availability primarily because it can guarantee transactional consistency between the publisher and subscribers, and because its performance for continuous replication is much better than that of the other replication methods. The most common high availability scenario for transactional replication occurs when you copy data from one database (the publisher) to one or more subscribers through a distribution server. If the publisher fails, one of the subscribers can become a read/write server, accept data updates, and even become a publisher to the other subscribers.
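As a rough illustration of the setup involved, here is a hedged sketch of enabling a database for transactional replication and creating a continuous publication. The database, publication, and table names are hypothetical, many optional parameters are omitted, and it assumes a Distributor has already been configured:

```sql
-- Hypothetical names; assumes the Distributor is already set up.
-- Enable the database for publishing.
EXEC sp_replicationdboption
    @dbname = N'MyDB', @optname = N'publish', @value = N'true';

-- Create a transactional publication that replicates continuously.
EXEC sp_addpublication
    @publication = N'MyPub', @repl_freq = N'continuous';

-- Add a table (an "article") to the publication.
EXEC sp_addarticle
    @publication = N'MyPub', @article = N'Orders',
    @source_object = N'Orders';
```

Subscribers are then added with sp_addsubscription, and the Log Reader and Distribution Agents move the committed transactions continuously from publisher to subscribers.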

Log shipping

Log shipping allows you to automatically send transaction log backups from one database (known as the primary database) to a secondary database on another server (known as the secondary server). At the secondary server, these transaction log backups are restored to the secondary database, keeping it closely synchronized with the primary database. An optional third server, known as the monitor server, records the history and status of backup and restore operations, and optionally raises alerts if these operations fail to occur as scheduled.
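At its core, log shipping automates a backup/copy/restore cycle. The sketch below shows the two essential statements with hypothetical database and share names; the real feature wraps these in scheduled SQL Server Agent jobs:

```sql
-- On the primary server: back up the transaction log to a network share.
BACKUP LOG MyDatabase
TO DISK = N'\\backupserver\logs\MyDatabase_001.trn';

-- On the secondary server: restore the backup, leaving the database
-- ready to accept the next log restore. Use WITH STANDBY = '<undo file>'
-- instead of NORECOVERY if you want read-only access between restores.
RESTORE LOG MyDatabase
FROM DISK = N'\\backupserver\logs\MyDatabase_001.trn'
WITH NORECOVERY;
```

A failover is then manual: restore the final log backup WITH RECOVERY on the secondary and redirect clients to it.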

When and why should you use the availability features below?

If your answer to any of the questions below is “Yes”, then that availability feature is for you.

Database mirroring:

1) Do you need a quick (<3-5 sec) automatic failover?

2) Do you want to avoid the hardware expense of a cluster?

3) Are you okay with a small performance overhead due to mirroring in the high protection mode, or with a small potential data loss in the high performance mode?

4) Do you need to protect individual databases (as opposed to all databases on an instance)?

5) Do you want an automatic application redirection to standby?

6) Do you want to have standby over WAN?

7) Do you want to access standby for read-only access to a snapshot of the data?

Failover clustering:

1) Do you have hardware listed in the MS Windows 2003 HCL, such as a large shared cluster disk array, dual NICs, a high-capacity network, and high-capacity CPU and RAM?

2) Do you need databases to fail over without any performance overhead?

3) Are you okay with a downtime of about 1-2 minutes during failover?

4) Do you need all databases of an instance to be highly available?

5) Do you need a scalable solution for database availability?

Transactional replication and Log shipping:

1) Can you tolerate the database availability delays that result from a manual failover?

2) Are you okay with failover not being completely transparent to clients?

3) Do you need to spread the standby servers across your WAN?

4) Do you want to balance the load across servers by accessing a read-only copy of the primary database?

Checklist of availability features

| Category | Availability feature | Database Mirroring | Failover Clustering | Log Shipping | Transactional Replication |
|---|---|---|---|---|---|
| Failover characteristics | Standby type | Hot | Hot | Warm | Hot |
| | Failure detection | Yes | Yes | No | No |
| | Automatic failover | Yes | Yes | No | No |
| | Masks disk failure | Yes | No | Yes | Yes |
| | Masks SQL process failure | Yes | Yes | Yes | Yes |
| | Masks other process failure | No | Yes | No | No |
| | Metadata support | User database | All databases (system and user) | User database | User database |
| | Transactional consistency | Yes | Yes | Yes | Yes |
| | Transactions are current | Yes | Yes | No | Yes |
| | Perceived downtime | < 3 seconds | < 20 seconds plus DB recovery time | Seconds plus recovery time | Seconds plus role-change time |
| | Transparent to client | Yes | Yes | No | No |
| | Automatic role change | Yes | Yes | No | No |
| Configuration | Special hardware needed | No | Yes | No | No |
| | Distance limit | Unlimited | Limited | Dispersed | Dispersed |
| | Complexity | Some | More | Some | Some |
| | Standby accessible | Yes | No | Yes | Yes |
| | Impact on performance | Some | No impact | Some | Some (depending on kind of subscription) |
| | Impact on backup strategy | No impact | No impact | Some | No impact |
| | Version availability | SQL 2005 Enterprise: all features; Standard: limited features | SQL 2005 Enterprise and Standard Editions | All editions except Express | All editions except Express |
| | Additional server role | Witness | None | Monitor | Distributor |
| | Redundant storage locations | Yes | No | Yes | Yes |
| | Scope of availability | Database | Instance | Database | Database |
| Management | Load delay on secondary | No | No | Yes | No |
| | Load balancing | No | No | No | No |
| | Scalable | No | Yes | Yes | Yes |
| | Client redirect | Limited support | No | No | No |
| Monitoring | Third-party tools required | No | Yes | Yes | No |
| | Customized scripts | Yes | Yes | Yes | No |
| | Expertise required | Medium | High | Low | Medium |
| Troubleshooting | Complexity | Medium | High | Low | Medium |
| | Expertise required | Medium | High | Low | Medium |
| Maintenance and support | Complexity | Medium | High | Low | Medium |
| | Expertise required | Medium | High | Low | Medium |
| Support | Support from Microsoft | New feature, so MS support is limited | Well-defined support from MS | Well-defined support from MS | Sometimes difficult to troubleshoot even for MS |
| Cost | Solution cost | Low | High | Low | Low |
| Security | Complexity | Low | High | Low | Low |
| | Expertise required | Low | High | Low | Low |

Why don’t you tell me how you found this consolidation?

Cheers!

PAL (Performance Analysis of Logs Tool) December 4, 2008

Posted by msrviking in Performance tuning.

Sometime last week I was approached by one of my peers asking for help in analyzing the Windows performance counters of a SQL Server system and sharing my thoughts.

Although I was given a day to get back, at the back of my mind I wanted two things: to share well-analyzed and conclusive thoughts, and to do it quickly. How would I do that? For some reason I don’t remember how I got to this link – PAL – but it turned out to be really worth using. It took me just half an hour to conclude what was right and wrong in the system, based on the reports generated by the tool. And I got back with results quickly – the correct ones, of course :).

You can visit the link and see for yourself what you need to do. Thanks to the PSS engineers who spent time building this tool and sharing it with the community. Great job, people! Here is a quick peek into what it is all about:

“Ever have a performance problem, but don’t know what performance counters to collect or how to analyze them? The PAL (Performance Analysis of Logs) tool is a new and powerful tool that reads in a performance monitor counter log (any known format) and analyzes it using complex, but known thresholds (provided). The tool generates an HTML based report which graphically charts important performance counters and throws alerts when thresholds are exceeded. The thresholds are originally based on thresholds defined by the Microsoft product teams and members of Microsoft support, but continue to be expanded by this ongoing project. This tool is not a replacement of traditional performance analysis, but it automates the analysis of performance counter logs enough to save you time. This is a VBScript and requires Microsoft LogParser (free download).”

Cheers!

DBA! what should he be? December 4, 2008

Posted by msrviking in General.

There are a couple of blog posts by Satya and Karima on what a DBA should do, or does, in development, production, and other environments. Learn from the SQL Server MVPs and shape your future (I am inspired!).