Why calculating RTO and RPO is essential to business continuity
The first step of any backup and disaster recovery plan is to establish how long your business can afford to be down and how much data it can afford to lose. These two factors, the recovery time objective (RTO) and the recovery point objective (RPO), set the stage for the rest of your disaster recovery planning.
What is an RTO? RPO?
The recovery time objective (RTO) defines the acceptable amount of time it takes to recover an application or system. The recovery point objective (RPO) defines the acceptable gap in time between the last backup of the data and the production copy. Put simply, the RTO is the tolerable amount of time the application or system can be down, while the RPO is the tolerable age of the recovered data.
Here is a graphical depiction of RTOs/RPOs.
(image credit: Veeam Software)
RTOs and RPOs should be defined per application, and for the application as a whole: if an app has a front-end web server, several application servers, and a back-end database, all of those components must be considered when specifying its RTO and RPO. Applications are typically grouped into tiers 1, 2, and 3, with tier 1 being mission-critical, tier 2 the next most important, and so forth. Definitions vary from company to company, since companies differ in services and size. An application's tier usually depends on the amount of revenue lost or at risk while the application is not functioning normally.
Determining, assigning, and documenting the correct RTOs and RPOs for the necessary applications improves your ability to meet the goals of your business continuity or disaster recovery plans.
Here’s an example: Let’s say your company performs backups only nightly, and this covers all applications and systems. If that backup completes at 4 am and your outage occurs at 2 pm, your recovery point is 10 hours. That figure is not your recovery point objective; ten hours is simply what your current schedule delivers. If you determine that, to meet customer or shareholder expectations, you can afford to lose only two hours’ worth of data, then you must tailor the backups for the application in question to meet that requirement. The backup job must run, and complete, often enough that the most recent copy is never more than two hours old.
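The arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function names and the calendar date are assumptions made for the example, while the 4 am backup, 2 pm outage, and two-hour objective come from the scenario above.

```python
from datetime import datetime, timedelta


def recovery_point(last_backup_completed: datetime, outage_time: datetime) -> timedelta:
    """Age of the most recent backup at the moment of the outage."""
    return outage_time - last_backup_completed


def meets_rpo(last_backup_completed: datetime, outage_time: datetime, rpo: timedelta) -> bool:
    """True if the data that would be lost fits within the objective."""
    return recovery_point(last_backup_completed, outage_time) <= rpo


# The date is arbitrary; only the times of day come from the example.
last_backup = datetime(2024, 1, 15, 4, 0)   # backup completes at 4 am
outage = datetime(2024, 1, 15, 14, 0)       # outage occurs at 2 pm
rpo = timedelta(hours=2)                    # tolerable data loss

actual = recovery_point(last_backup, outage)
within_objective = meets_rpo(last_backup, outage, rpo)

print(actual)            # 10:00:00
print(within_objective)  # False
```

A nightly schedule fails the check because the worst-case backup age is far larger than two hours; the backup interval plus the time the job takes to complete has to stay within the RPO.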
Perhaps you have a regulatory requirement that an application is never down for more than 2 hours. Your recovery time objective for that application is 2 hours. This includes any activity related to troubleshooting, remediation, restores, connectivity, etc. Other applications may not be as important and can be moved down in the order of recovery, to perhaps have an RTO of 8 or even 24 hours.
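As a rough sketch of how this might be tracked, the snippet below orders applications by RTO and checks whether the estimated recovery steps for one of them fit inside its objective. The application names and step durations are hypothetical; only the 2, 8, and 24 hour objectives come from the text.

```python
from datetime import timedelta

# Hypothetical applications mapped to their RTOs.
rtos = {
    "billing": timedelta(hours=2),
    "intranet": timedelta(hours=24),
    "reporting": timedelta(hours=8),
}

# Applications with the tightest RTO are recovered first.
recovery_order = sorted(rtos, key=rtos.get)


def fits_rto(step_durations: list[timedelta], rto: timedelta) -> bool:
    """True if all recovery activities together fit within the RTO."""
    return sum(step_durations, timedelta()) <= rto


# Assumed estimates for troubleshooting, restore, and connectivity checks.
billing_steps = [
    timedelta(minutes=30),
    timedelta(minutes=60),
    timedelta(minutes=20),
]
billing_fits = fits_rto(billing_steps, rtos["billing"])

print(recovery_order)  # ['billing', 'reporting', 'intranet']
print(billing_fits)    # True
```

The point of summing every step is that the RTO clock covers the whole outage, not just the restore itself.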
How do I calculate downtime costs?
Downtime costs can be tricky to calculate. At its root, the cost is simply the amount lost per unit of time multiplied by the length of the outage in that same unit. For example, if a company loses $10,000 per hour that a website is down and the site is unreachable for 6 hours, the downtime cost is $60,000. Other associated costs are less obvious and include lost employee productivity, loss of reputation, and possible fines, fees, and even hacker ransoms.
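The basic calculation can be written as a small function. This is a minimal sketch: the function name and the optional `extra_costs` parameter (a lump sum for the less obvious costs) are assumptions for illustration, and the figures are the ones from the example above.

```python
def downtime_cost(loss_per_hour: float, hours_down: float, extra_costs: float = 0) -> float:
    """Direct loss (rate multiplied by duration) plus any one-off costs."""
    if loss_per_hour < 0 or hours_down < 0 or extra_costs < 0:
        raise ValueError("costs and durations must be non-negative")
    return loss_per_hour * hours_down + extra_costs


# $10,000 per hour, site unreachable for 6 hours.
website_cost = downtime_cost(10_000, 6)
print(website_cost)  # 60000
```

Passing a value for `extra_costs`, such as an estimated fine, adds it on top of the direct loss.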
What can I do to improve RTOs and RPOs?
Before virtualization and cloud workloads emerged, recovery solutions were usually geared toward the recovery of physical servers. This typically meant backing up data to a tape drive or using software to clone one server to another on duplicate hardware. Large storage area networks (SANs) could also replicate blocks of data but typically came with exorbitant price tags.
Today, we have more options than ever to improve recovery performance and reliability. Cloud technologies allow for flexible workload distribution and do not impose hardware requirements on recoverability. Perhaps the most important part of any plan is the plan itself. Be sure to analyze and document your environment, then update that documentation frequently as systems change, are added, or are decommissioned.
It may also make sense to discuss options with experts such as Global Data Vault. Relying on a partner with proven results in meeting or exceeding service level agreements (SLAs) can make a difference when it matters. From simple file restores to entire virtual machines and even the complete recovery of a network in our cloud facilities, Global Data Vault is here to help you achieve the results you expect.