11/09/2009

SQL Server High Availability Options Preview

As it is named, all the SQL Servers and databases should be highly available (HA) to cater the business requirement, and, the business application should be up and running all the time. To achieve this Microsoft has introduced SQL Server Cluster which is one of the highly recommended and used solutions for SQL Server high availability which is server specific that means that all the databases on the SQL Server Cluster are available in case of failover. Database Mirroring (DM) and Log Shipping (LS) along with the Replication (Repl.) are the alternate solution for the high availability but all this options are database specific that means that you can set it up for specific database(s).

Here are the brief about each (I'll post detailed article on each in near future):

Replication: As it is named, replicates the data/database to one or more than one location. Replication was first introduced in SQL Server 6 with minor support and was enhanced in SQL Server 7.0; with launch of SQL Server 2000 it has added some more features to it. And, with SQL Server 2005 Microsoft has introduced Peer-to-Peer replication which is capable of doing load balancing.

The Application:  Based on the business requirement i.e. for reporting purpose, data from more than one location needs to be merged and or if wish to have transaction level replication of data.
Component of the Replication:
Publisher:  Is a server which published the data
Distributor: Is a server which is responsible to distribute data to the subscriber(s)
Subscribers: are servers which has the subscription, the distributor sends the data which is being published by the publisher here.
Subscription Type: Subscription has two method, pull and push. With Pull subscription, subscriber connects to the publisher periodically and pulls the data where as with Push subscription publisher connects with the subscribers and push the changes over subscriber.

 There are 4 type of Replication available: 
1.       Snapshot:  Snapshot replication is the very basic and simplest type of replication available. As name implies, it took snapshot of the published data, connect to the subscriber and then applied the snapshot on subscriber.  This is most suitable when data is required for reporting purpose and are not updated on the subscriber end, and high latency is acceptable.

Application:  Where higher latency is allowed and data is used for the reporting purpose only.

2.       Transactional:  In this mode of replication, it is replicating data transaction level by continuous monitoring.  To achieve this Log Reader Agent is being created when Transactional Replication is configured, it monitors every database which is configured for Transactional Replication and then copies the transactions which are marked for replication from T-Log to distribution database. Distribution database then send (either pull/push subscription) it to the subscriber.
       Application:  Where higher latency is not allowed and data is used for the reporting
       purpose, and   sometimes subscriber is allowed to updates.
     
3.       Merge:  Merge replication uses Triggers to track the changes in subscriber to publisher.  And, as it named it merges the records, irrespective of who updates/inserts records.  This is the most complex topology of replication.  As it is a merge replication wherein updates/modifications are allowed from more than one subscriber at a time, it has in-built facility to resolved conflict.  There are some pre-requisites which you need to have before Merge Replication can be configured, here is my article on Merge Replication Tips here
Application:
a.       Multiple Subscribers might update the same data at various times and propagate those changes to the Publisher and to other Subscribers.
b.      It might have a chance of occurring conflict, and if they do, there should be the mechanism to resolve it.
c.       One time modification is done at subscriber end and the same needs to update at publisher and other subscriber.

4.       Peer-to-Peer:  This is the best topology of Replication introduced up till. This has been introduced in with launch of SQL Server 2005. In, Peer-to-Peer replication the participants have the complete schema and all the data modifications (DML operations) are made at all nodes.  Now, you would be thinking, hey what is the catch here, what is new then merge replication? Well, the simplest answer is Peer-to-peer Replication is designed for Load Balancing by spreading out write activity across all available nodes and then based on the sync latency it synchronizes database, thus read and write operations become faster.  And, additionally if any of the server is not available it will continue to perform as it is with another node and write operation for this node is routed to another node.
Application: Best suited for online shopping application

Database Mirroring: Database mirror is the best and cost effective solution for automatic failover. Database mirroring is introduced with launch of SQL Server 2005. A database mirroring replicate the change to secondary server at transaction level and provides two modes: High Performance (Asynchronous) and High Safety (Synchronous). The difference between two is, with High Safety mode it will not send and write any data to the secondary node until it is committed on the primary node whereas with High Performance mode it will ship the transaction and write the transaction on to secondary server no matter it is committed or roll back on the primary server.

Log Shipping:   This is the cost effective solution, as it is named, it ships T-Log backup to the secondary node(s) on regular interval and then restore job performs T-Log restoration at secondary server to keep it updated with the primary server. The disadvantage is, it don’t have automatic failover feature and hence manual intervention for role change would take 2~3 minutes time (I guess this is allowable downtime).


SQL Server Cluster:  This is the best HA feature in SQL Server, which provides automatic failover for Entire Server, in comparison to other options which provides supports to particular database(s). But, this solution is costlier than other 3 options because it requires identical hardware.  

0 comments:

Post a Comment

I appreciate your time, thanks for posting your comment. I will review and reply to your comment as soon as I can.

Thank you
Hemantgiri