SQL Server: High Availability and Disaster Recovery à la Carte!
Aug 2, 2012 / By Mohammed Mawla
In the business world, a good RDMS is one that can provide high availability (HA) options with minimal downtime and the cheapest licenses. Vendors usually if not always provide a set of HA features to satisfy different needs. Usually, some of these features are reserved for higher-end editions: Enterprise, Datacenter, cluster, or whatever name; of course, these editions are more expensive than their “inferior” counterparts.
The most important factors to consider when choosing an edition are:
1 - License fees:
This is almost the most important factor as high-End editions cost a lot of money. Upgrading them in the future can also cost a lot. For example: The cost of SQL server Enterprise Edition core license is almost four times the cost of Standard Edition.
A critical system can benefit from the fine-tuned optimizations, HA, and recovery options available only to high-end editions. For example: SQL Server ENT Edition can support more than two nodes for clustering, support online index operations, fast recovery, online restores, resource governor, partitioning, TDE, etc.
3- Vendor requirements
If you are using a 3rd party software, you may be limited to one or two editions. For example: SAP requires SQL Server Enterprise Edition for applications based on the SAP Netweaver Stack because they use ENT edition features like partitioning and data compression.
Those who have been working with SQL Server for a while know that new High Availability (HA) options were being added every major version, with the exception of SQL Server 2008, to address business needs. SQL Server 2008 introduced enhancements to the already existing features but no new features.
New features were added, trying to address business needs that weren’t addressed in previous editions or to do so in a better way.
Here’s a brief outline of the HA feature introductions across the versions. I’m starting with SQL server 2000, although I worked with 6.5 & 7.0:
SQL Server 2000
- Log shipping
It provides a warm standby copy of a database, something impossible with Clustering or an immense effort with replication depending on your schema. Log shipping allowed the creation of read-only copies of databases across multiple sites, which is beneficial for reporting.
However, LS was far from perfect. The was no automatic failover, data loss could happen because logs are shipped on a schedule, and more space was needed for the log backups. (Backup compression enhanced that later.)
SQL Server 2005
- Database mirroring (supported in SQL Server 2005 SP1)
Automatic failover was then possible with Witness, there was no data loss with Synchronous Mirroring, and it was simpler to configure. However, it still has its limitations: It works only per database and has only one secondary instance and no read-only standby. (Database snapshot can be used but only supported in ENT edition.)
- Peer to Peer transactional replication
It is an ENT edition feature where a set of instances contain identical copies of the data, and we want to distribute the load among them. Reads are spread around and the loss of one instance can be substituted by routing traffic to others.
P2P replication is still difficult to setup and has many restrictions. It does NOT have conflict detection/resolution. More info can be found here http://technet.microsoft.com/en-us/library/ms151196.aspx.
SQL Server 2008/R2
No new features were added, but there were some enhancements like DB mirroring log stream compression and corrupt pages recovery.
SQL Server 2012
- AlwaysON Availability groups, which almost combined all of the above features.
AlwaysON is the gem of HA, patching a lot of other technologies shortages : Failover a group of databases, multiple secondaries, more flexible Failover Policy, Virtual name for fast application failover, and more.
Downside? COST, licenses, and hardware.
Some other 3rd party applications provide HA features that satisfy other needs. For example: HP Polyserve (I believe it’s retired now) consolidates multiple SQL Server instances onto fewer servers to maximize data availability through virtualizing the data tier. Some other tools continuously replicate the storage tier across the network to a secondary server in case primary fails.
SQL Server provides a set of HA features with each version. Some editions themselves can support a different set of features, and sometimes one technology may have better capabilities in higher editions (Example: DB mirroring Asynch mode in ENT edition).
Here’s a list of the features included in Enterprise and Standard editions, as those are the most widely used. You can use this link to see the feature comparison across all editions.
|SQl server Version||Edition||Clustering (1)||Replication||Log shipping||DB mirroring||P2P replication||Geo-Clustering (2)||Availability groups|
(1) : Standard edition can allow two-node clustering only.
(2) : Stretched VLAN is necessary for Geo-clusters involving SQL Server 2008 R2 and below. SQL Server 2012 ENT Edition supports multi-subnet clusters natively, but STD edition still needs Stretching VLAN for a 2-node Geo cluster. More info here.
What’s really good about these options, and the idea behind this blog, is how you can mix almost ALL of them to achieve your HA/DR goals (pick à la carte). Since not everyone can afford SQL 2012 ENT Edition, Windows Server Enterprise Edition, and multiple instances in different locations to achieve a decent HA/DR topology, we need to mix the existing features to complement each other and offer a requirement not provided by other feature.
- Clustering does NOT provide a standby or a redundant copy of the databases; a failure in shared storage can be a severe problem. Yes, there are hardware redundancies at some SANs, but they don’t come for cheap so you need another way to “duplicate” your data.
- SQL server 2000 brought log shipping, allowing to create a standby database that can be even readable for reporting purposes. However, it has some disadvantages (see above). If no data loss can be tolerated by LS, then database mirroring should be used. Both LS or DB mirroring can work smoothly with clustering since the virtual name of the clustered instance will be the end point.
- However, DB mirroring or Log Shipping protects only against one single database. If the primary server fails, extensive work needs to be done to bring a second server to take over (logins, jobs, user objects, etc.) so clustering excels here.
- If you are using database mirroring as a primary HA/DR solution, you may want to have some data shipped for reporting purposes. Then, you want to combine replication or log shipping. I saw a client deploying a 3rd party application that needed to replicate the WHOLE database to another instance for HA, but we wanted to add a 2nd layer of protection for the published database by mirroring it.
- Also, since database mirroring has limitations on the number of databases that can be mirrored, you want to use another feature if you have too many databases. Personally, I’ve seen that with some of our clients. They have too many databases and want to mirror them all, causing the mirror instance to stop working after reaching a number of mirrored databases. Microsoft has a KB that outlines hardware considerations for database mirroring.
- Failover clustering can also be combined by Shared storage replication (which needs 3rd party support) across multi-site to protect against disaster on one site. This is Geo-Clustering.
- P2P replication nodes themselves can be clustered instances. Also, the instances involved in SQL 2012 Availability groups can be clustered SQL instances. Please do NOT confuse Windows Server Failover Clustering WSFC necessary for Availability groups with SQL Server failover cluster instance (FCI). The first protects a single or multi-database, and the second protects the whole instance.
Well, SQL Server has plenty to offer, and you can pick and mix what you want to make it work best for you. What’s lacking on one feature may be offered in another. Whether it’s budget, 3rd party tools requirements, upgrade policy, etc., there should be a fit somehow.
I intend to write a few blogs posts in which I’ll list some of the scenarios where we had to implement more than one feature to satisfy data redundancy and the considerations behind implementing them.
Leave a Reply