Azure SQL Databases Disaster Recovery 101
Why should I care?
As a PaaS service, Azure SQL provides automated backups for all databases. They let customers recover from system or human errors by restoring a database to any point in time within the retention period. But backups alone won’t guarantee that your data is always available in extreme cases. Imagine aliens invaded one of our data centers and destroyed everything, or more realistically, extreme weather knocked out power across a whole area. Look at what Hurricane Irma or Maria did. It happens.
Fortunately, Azure DB is prepared for this. We provide not one but two different solutions to recover your data in such disasters: Geo Restore and Geo Replication.
You will learn the following in the next 5 minutes:
- What do these features do?
- What is the difference between these two?
- How do I choose between these two?
If you already have answers to these questions or want to learn more about Azure DB business continuity and disaster recovery, please refer to our online documentation.
What do these features do?
Geo Restore allows you to recover the database to a different region from backup. The automated backups of all Azure databases are replicated to a secondary region in the background. Geo Restore always restores the database from the copy of the backup files stored in the secondary region.
Geo Replication creates a continuous copy of your database in one or more secondary regions (up to 4 secondary replicas). In the event of a disaster, you can simply fail over to one of the secondary regions and bring your database back online. You can also configure a failover group to recover the databases automatically.
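To make the Geo Replication steps a little more concrete, here is a minimal sketch that only builds the T-SQL statements involved; it does not connect to anything. It assumes the documented `ALTER DATABASE ... ADD SECONDARY ON SERVER` and `ALTER DATABASE ... FAILOVER` syntax, so please check the current T-SQL reference before relying on it. The helper names, the database name `orders`, and the server name `contoso-dr` are all hypothetical.

```python
def quote_name(name: str) -> str:
    """Bracket-quote a SQL identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def add_secondary_sql(database: str, partner_server: str) -> str:
    """Statement to create a readable geo-secondary.

    Assumption: it is run in the master database of the primary server.
    """
    return (
        f"ALTER DATABASE {quote_name(database)} "
        f"ADD SECONDARY ON SERVER {quote_name(partner_server)} "
        "WITH (ALLOW_CONNECTIONS = ALL);"
    )


def failover_sql(database: str, allow_data_loss: bool = False) -> str:
    """Statement to promote a secondary to primary.

    Assumption: it is run in the master database of the secondary server.
    The forced variant is for when the primary is unreachable and some
    data loss is acceptable.
    """
    action = "FORCE_FAILOVER_ALLOW_DATA_LOSS" if allow_data_loss else "FAILOVER"
    return f"ALTER DATABASE {quote_name(database)} {action};"


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(add_secondary_sql("orders", "contoso-dr"))
    print(failover_sql("orders"))
    print(failover_sql("orders", allow_data_loss=True))
```

In a real disaster the primary is usually unreachable, which is why the forced variant exists; a failover group removes the need to issue these statements by hand.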
What is the difference between these two?
There are three major differences: Data Loss, Recovery Time, and Cost.
- Data Loss – The backup files are replicated to a secondary region asynchronously. That means we may not get a chance to copy the latest backup before the disaster happens. In Azure DB, the RPO of geo restore (Recovery Point Objective; it is not an SLA) is 1 hour. By contrast, geo replication provides a 5-second RPO.
- Recovery Time – Geo Restore basically restores your database from backup files. The recovery time can be affected by multiple factors: how large your database is, which service tier the database is restoring to, where the backup files are, how many databases you are trying to recover, how many people are trying to recover their databases… The ERT (estimated recovery time) for geo restore is 12 hours. But in some cases, especially for very large databases, it could take longer than that. If you have geo replication configured, the failover usually takes less than 30 seconds.
- Cost – If geo replication is all good, why not use it? $$$$! When you configure geo replication, you basically create multiple copies of your databases. You are paying not for one but for two or more databases, depending on how many replicas you configure. You can use Geo Restore at no extra cost.
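The three differences above can be written down as a small lookup, which makes it easy to check which option fits a given workload. This is only a sketch: the numbers are the ones quoted in this post (objectives and estimates, not guarantees), and the class and function names are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrOption:
    name: str
    rpo_seconds: int       # window of recent data that may be lost
    recovery_seconds: int  # time to bring the database back online
    extra_cost: bool       # whether you pay for additional databases


# Figures quoted in this post; they are targets, not an SLA.
GEO_RESTORE = DrOption("Geo Restore", 60 * 60, 12 * 60 * 60, False)
GEO_REPLICATION = DrOption("Geo Replication", 5, 30, True)


def options_meeting(max_data_loss_seconds: int, max_downtime_seconds: int) -> list[str]:
    """Return the names of the options whose objectives fit the given limits."""
    return [
        option.name
        for option in (GEO_RESTORE, GEO_REPLICATION)
        if option.rpo_seconds <= max_data_loss_seconds
        and option.recovery_seconds <= max_downtime_seconds
    ]


if __name__ == "__main__":
    # A workload that tolerates an hour of lost data and a day offline.
    print(options_meeting(60 * 60, 24 * 60 * 60))
    # A workload that tolerates one minute of lost data and five minutes offline.
    print(options_meeting(60, 5 * 60))
```

If both options pass, cost is usually the tie-breaker, since Geo Restore comes with no extra charge.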
How do I choose between these two?
It’s like choosing a car insurance plan. There are no golden rules, but we’ll give you some silver bullets:
- Think about how much it’s going to cost you if you lose the last 60 minutes of data or have your database offline for 24 hours. Compare it with the extra cost of configuring geo replication. If you don’t know the cost, try it out: delete your database, wait 24 hours, and then restore it to one hour before the deletion time. That was a trick! If you actually did that, the database is probably not important enough to you and you may not need geo replication.
- You can apply different DR solutions to different databases. Databases for an online payment system? Yes, please configure geo replication and failover groups! Databases where you store recipes you found on the internet? Nah.
- You can change your mind and switch between these two at any time.
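The cost comparison in the first tip is simple arithmetic, so here is a rough sketch of it. Everything in it is an assumption you would replace with your own estimates: the hourly cost figures, the yearly chance of a regional disaster, and the price of the replica are invented for illustration, and the function names are hypothetical.

```python
def disaster_cost(data_loss_hours: float, downtime_hours: float,
                  cost_per_lost_data_hour: float,
                  cost_per_downtime_hour: float) -> float:
    """Cost of one disaster: lost data plus time offline."""
    return (data_loss_hours * cost_per_lost_data_hour
            + downtime_hours * cost_per_downtime_hour)


def geo_replication_pays_off(annual_disaster_probability: float,
                             cost_with_geo_restore: float,
                             cost_with_geo_replication: float,
                             annual_replica_cost: float) -> bool:
    """True when the expected yearly loss avoided exceeds the replica's price."""
    avoided = annual_disaster_probability * (
        cost_with_geo_restore - cost_with_geo_replication
    )
    return avoided > annual_replica_cost


if __name__ == "__main__":
    # Invented figures for a payment system: 1 hour of lost data and 24 hours
    # offline with Geo Restore, versus 5 seconds and 30 seconds with Geo Replication.
    restore = disaster_cost(1, 24, 50_000, 20_000)
    replication = disaster_cost(5 / 3600, 30 / 3600, 50_000, 20_000)
    print(geo_replication_pays_off(0.01, restore, replication, 3_000))

    # Invented figures for a recipe collection.
    restore = disaster_cost(1, 24, 10, 5)
    replication = disaster_cost(5 / 3600, 30 / 3600, 10, 5)
    print(geo_replication_pays_off(0.01, restore, replication, 3_000))
```

With these made-up inputs the payment system clears the bar and the recipe collection does not, which matches the intuition in the second tip.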
Anything else I should know?
There’s something else you may want to know before you close this page:
- Do DR drills and document all the steps. Always prepare and plan for the worst.
- Create a failover server in a secondary region and pre-configure all security objects, including logins, users, and certificates. It will save you some time during the recovery.
- If you are using encryption keys in Azure Key Vault to protect your data, back up your keys!
- Active geo-replication can also be used to provide better query performance for read-only queries to geographically dispersed users.
- If you want to learn the whole story of Azure DB business continuity and disaster recovery, read the online documentation.
Source: Azure Blog Feed