All about RPO, RTO, and MTD
A primer in Planning
RPO - Recovery Point Objective
RTO - Recovery Time Objective
MTD - Maximum Tolerable Downtime
Let me get this quote, often attributed to Benjamin Franklin, out of the way before we start:
If you fail to plan, you are planning to fail!
With that in mind, RPO, RTO, and MTD are part of planning to avoid failure. They can be applied to any system or process, and they should be applied to every important one.
All three are executive-level declarations. Senior leadership is accountable for the successful operation of any organization. Prudent business leaders declare these numbers for all major systems and processes as a part of their risk management strategy.
Business leadership does not have to understand technical requirements or system specs to adequately declare RPO, RTO, and MTD. They simply have to declare the what, and it’s the technical team’s job to solve the how. However, without these declarations, systems cannot be designed to meet them, and they tend to end up either overengineered (wasted $$$!) or underengineered (downtime/loss!).
RPO - Recovery Point Objective
Recovery Point means the most recent point in time you need to be able to restore to after a system failure of any kind. The RPO states how far back from the moment of failure that point may be, in other words, how much recent data you can afford to lose. For a database of important one-time-only transactions, the RPO would be zero. Would you want your bank to have an RPO of one hour? Not if your check was deposited in that missing hour!
For other systems, an RPO of several hours is completely appropriate. For example, when paper records are manually entered and stored, if several hours (or more) of work are lost to a system failure, the work can simply be redone far more cheaply than building and running a system that maintains an RPO of zero.
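To make the RPO idea concrete, here is a minimal sketch of the arithmetic for a system protected by periodic backups. The function names and the sample figures are hypothetical, and the sketch assumes each backup captures the system as it was when the backup started. Under that assumption, the worst case is a failure just before a backup finishes, so the loss is roughly one backup interval plus the time the backup takes to complete.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval, backup_duration=timedelta(0)):
    """Worst-case window of lost work for periodic backups.

    Assumption: a backup captures the state at its start time, and a
    failure can strike just before the next backup completes.
    """
    return backup_interval + backup_duration


def meets_rpo(backup_interval, rpo, backup_duration=timedelta(0)):
    """True if the backup schedule keeps worst-case loss within the RPO."""
    return worst_case_data_loss(backup_interval, backup_duration) <= rpo


# Illustrative figures only: a nightly backup checked against a 4-hour RPO.
nightly = timedelta(hours=24)
print(meets_rpo(nightly, rpo=timedelta(hours=4)))

# An hourly backup that takes 10 minutes, checked against a 2-hour RPO.
hourly = timedelta(hours=1)
print(meets_rpo(hourly, rpo=timedelta(hours=2), backup_duration=timedelta(minutes=10)))
```

Note that no periodic schedule can satisfy an RPO of zero in this model, which is why a zero RPO usually pushes the design toward continuous replication and the cost that comes with it.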
RTO - Recovery Time Objective
Recovery Time means how long it takes to get the Recovery Point back online. How long can the system be unavailable while you work to get it restored? Amazon and Google probably aim for an RTO close to zero, but this website honestly has an RTO of 96 hours.
For example, if a physical system has an RTO of four hours, then whether it’s a bad system update, ransomware, or a fire in the server room, the technical team needs to have designed things so that the system can be restored to operation within four hours. A four-hour-response parts replacement warranty alone isn’t enough: the parts may take up to four hours just to arrive, and the equipment still has to be set up, configured, and restored after that. This may mean offsite/alternate-site restore options are needed.
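The four-hour example is really an addition problem, and a short sketch shows why the warranty alone falls short. The step names and durations below are made-up illustrations, not measurements from any real recovery, and `total_recovery_time` and `meets_rto` are hypothetical helper names.

```python
from datetime import timedelta


def total_recovery_time(steps):
    """Sum the durations of recovery steps that must happen one after another."""
    return sum(steps.values(), timedelta(0))


def meets_rto(steps, rto):
    """True if the whole recovery sequence fits inside the RTO."""
    return total_recovery_time(steps) <= rto


# Illustrative durations only; substitute your own tested numbers.
steps = {
    "replacement parts arrive": timedelta(hours=4),
    "rack and configure hardware": timedelta(hours=2),
    "restore data from backup": timedelta(hours=3),
    "verify and return to service": timedelta(hours=1),
}
rto = timedelta(hours=4)

total = total_recovery_time(steps)
print(f"Estimated recovery time: {total}, meets RTO: {meets_rto(steps, rto)}")
```

With these assumed figures the parts delivery alone consumes the entire RTO, so the only way to meet it is to remove steps from the critical path, for example by keeping spare hardware on hand or failing over to an alternate site.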
MTD - Maximum Tolerable Downtime
While similar to Recovery Time Objective, this is the extreme: the “do not exceed” version of RTO. For example, RTO says, “Let’s design systems for an ideal recovery time of, say, 12 hours,” and MTD finishes that statement with, “but under no circumstances can it go beyond 72 hours, for there will surely be catastrophic, irreversible consequences.” While the other two are objectives, this is an absolute, and must be treated as one.
For example, a utility’s Advanced Metering Infrastructure (AMI) may be configured to send usage data either continuously or at a set time interval. If the communications network is unavailable, the meters may easily queue months of usage data while waiting for the network to be restored. When the network comes back online, the data is transmitted, no problem. But what happens if the outage lasts through the end of a billing cycle? While the utility may have an RTO of 12 hours for AMI network outages, its MTD might be 72 hours, or even 24 hours.
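The relationship between the objective and the absolute limit can be sketched as a simple check on how long an outage has lasted. This uses the 12-hour RTO and 72-hour MTD figures from the example in this section; the function name `classify_outage` and the status labels are hypothetical.

```python
from datetime import timedelta

RTO = timedelta(hours=12)  # design target for recovery
MTD = timedelta(hours=72)  # absolute limit that must never be exceeded


def classify_outage(elapsed, rto=RTO, mtd=MTD):
    """Place an outage duration relative to the RTO and the MTD."""
    if mtd < rto:
        # An MTD shorter than the RTO would make the objective meaningless.
        raise ValueError("MTD must be at least as long as the RTO")
    if elapsed <= rto:
        return "within RTO"
    if elapsed <= mtd:
        return "RTO missed, still within MTD"
    return "MTD exceeded"


for hours in (6, 30, 80):
    print(hours, classify_outage(timedelta(hours=hours)))
```

The middle band is the useful one for planning: it is the time left to escalate after the normal recovery plan has already fallen behind.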
How do you determine the right RPO, RTO, and MTD?
We have a process to help determine what’s appropriate for utilities and municipalities. If you already have AMI, or are considering it, our AMI network assessment process helps you stand with confidence.