Business continuity and disaster recovery planning and management is the process by which risks and threats to the ongoing availability of services, business functions, and the organization are actively reviewed and managed at set intervals as part of the overall risk-management process.
The goal is to keep the business operating in the event of a disruption.
Disaster recovery planning (DRP) is the process by which suitable plans and measures are put in place to ensure that, in the event of a disaster (flood, storm, tornado, and so on), the business can respond appropriately, with a view to recovering critical and essential operations, even if somewhat limited, to a partial or full level of service in as little time as possible.
The goal is to quickly establish, reestablish, or recover affected areas or elements of the business following a disaster.
Note that disaster recovery (DR) and business continuity are often confused or used interchangeably.
Wherever possible, be sure to use the correct terminology and highlight the differences between them.
Business Continuity and Disaster Recovery Planning Elements
From the perspective of the cloud customer, business continuity elements include the relevant security pillars of availability, integrity, and confidentiality.
The availability of the relevant resources and services is often the key requirement, along with uptime and the ability to access them on demand.
Failure to ensure this can result in significant impacts, including loss of earnings, loss of opportunities, and loss of confidence for the customer and provider.
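To make the uptime requirement concrete, an availability percentage can be converted into the downtime it permits. The following is a minimal sketch; the function name, the 30-day period, and the example targets are assumptions chosen for illustration, not values taken from any particular SLA.

```python
def allowed_downtime_minutes(availability_percent: float, period_days: int = 30) -> float:
    """Return the minutes of downtime permitted by an availability target.

    Both the target and the measurement period are illustrative assumptions.
    """
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - availability_percent / 100)


if __name__ == "__main__":
    # Hypothetical availability targets to compare.
    for target in (99.0, 99.9, 99.99):
        minutes = allowed_downtime_minutes(target)
        print(f"{target}% over 30 days allows about {minutes:.1f} minutes of downtime")
```

Under these assumptions, a 99.9% target over a 30-day period allows roughly 43 minutes of downtime, which gives the customer a tangible figure to weigh against the cost of lost service.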
Many security professionals struggle to keep their business continuity processes current once they have started to utilize cloud-based services.
Equally, many fail to keep their business continuity plans up to date so that they fully cover all services.
This may be due to several factors; however, the main contributing factor is that business continuity is operated mainly at set intervals and is not fully integrated into ongoing business operations.
That is, business continuity activities are performed only annually or biannually, which may not take into account notable changes in business operations (such as the adoption of cloud services) within relevant business units, sections, or systems.
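One way to catch this gap is to compare the date of the last plan review against the dates of notable changes. The sketch below is illustrative only; the `PlanRecord` type, its field names, and the example dates are hypothetical and do not come from any standard or tool.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class PlanRecord:
    """Hypothetical record of a continuity plan for one business unit."""
    unit: str
    last_reviewed: date
    # Dates of notable changes, for example a migration to a cloud service.
    change_dates: list[date] = field(default_factory=list)


def unreviewed_changes(plan: PlanRecord) -> list[date]:
    """Return the changes made after the plan was last reviewed, oldest first."""
    return sorted(d for d in plan.change_dates if d > plan.last_reviewed)


def needs_review(plan: PlanRecord) -> bool:
    """Treat a plan as stale if any change postdates its last review."""
    return bool(unreviewed_changes(plan))


if __name__ == "__main__":
    plan = PlanRecord(
        unit="payments",
        last_reviewed=date(2023, 1, 15),
        change_dates=[date(2022, 11, 1), date(2023, 6, 30)],
    )
    print(plan.unit, "needs review:", needs_review(plan))
```

Triggering a review on change, rather than waiting for the next scheduled interval, is one possible way to integrate continuity planning into ongoing operations.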
Note that not all assets or services are equal! What are the key or fundamental components required to ensure the business or service can continue to be delivered?
The answer to this question should shape and structure your business continuity and disaster recovery (BCDR) practices.
BCDR Critical Success Factors
- Understand your responsibilities versus the CSP’s responsibilities
- Customer responsibilities
- Understanding any interdependencies or third parties (supply chain risks)
- Order of restoration (priority)
- Appropriate frameworks and certifications held by the facility, services, and processes
- Right to audit and make regular assessments of continuity capabilities
- Communications of any issues or limited services
- Identification of need for backups to be held onsite or offsite or with another CSP
- Clearly state and ensure the SLA addresses which components of business continuity and disaster recovery are covered and to what degree they are covered
- Penalties and compensation for loss of service
- Loss of integrity or confidentiality
- Points of contact and escalation processes
- Failover to maintain compliance
- Changes being communicated in a timely manner
- Clearly defined responsibilities
- Where usage of third parties is required per the agreed-upon SLA
The cloud customer should agree with and be fully satisfied with all the details relating to BCDR (including recovery times, responsibilities, and more) before signing any documentation or agreements that signify acceptance of the terms for system operation.
The customer typically pays for the associated time and costs of requesting amendments or changes to the relevant SLA.
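The interdependencies and order of restoration listed above can be captured in a simple dependency map. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later) to restore services in waves, so that every service comes after the services it depends on. The service names and priority values are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical services mapped to the services they depend on.
DEPENDENCIES = {
    "identity": set(),
    "database": set(),
    "order-api": {"identity", "database"},
    "web-frontend": {"order-api"},
}

# Hypothetical business priority: a lower number means restore sooner.
PRIORITY = {"identity": 1, "database": 1, "order-api": 2, "web-frontend": 3}


def restoration_order(deps, priority):
    """Order services in waves so that dependencies always come first.

    Within each wave of services that are ready to restore, services are
    sorted by business priority and then by name.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    order = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready(), key=lambda s: (priority[s], s))
        for service in ready:
            order.append(service)
            sorter.done(service)
    return order


if __name__ == "__main__":
    for step, service in enumerate(restoration_order(DEPENDENCIES, PRIORITY), start=1):
        print(step, service)
```

A map like this also exposes third-party dependencies: any service that appears only as a dependency, and is run by another party, is a candidate for explicit coverage in the SLA.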
Important SLA Components for Business Continuity and Disaster Recovery Planning
- Undocumented single points of failure should not exist.
- Migration to alternate providers should be possible within agreed-upon timeframes.
- All components need to be supported by alternate CSPs in the event of a failover; if not, onsite services may be required as a fallback solution.
- Automated controls should be enabled to allow customers to verify data integrity.
- Where data backups are included, incremental backups should allow the user to select the desired settings, including coverage, frequency, and ease of use for recovery point restoration options.
- The SLA, and any changes that may affect the customer’s ability to use cloud computing components for DR, should be assessed at regular, set intervals.
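Two of the items above, verifying data integrity and checking backup frequency, lend themselves to simple automated checks. The following is a minimal sketch that assumes a SHA-256 digest is recorded when each backup is taken and that an acceptable maximum backup age has been agreed upon (commonly called a recovery point objective, or RPO). The function names and the four-hour value are illustrative assumptions, not part of any CSP's API.

```python
import hashlib
from datetime import datetime, timedelta


def sha256_digest(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def verify_integrity(data: bytes, expected_digest: str) -> bool:
    """Compare a freshly computed digest with the one recorded at backup time."""
    return sha256_digest(data) == expected_digest


def meets_rpo(last_backup: datetime, now: datetime, rpo: timedelta) -> bool:
    """Return True if the newest backup is no older than the agreed RPO."""
    return now - last_backup <= rpo


if __name__ == "__main__":
    original = b"customer ledger, version 42"
    recorded = sha256_digest(original)  # stored alongside the backup

    print("intact copy verifies:", verify_integrity(original, recorded))
    print("altered copy verifies:", verify_integrity(original + b"!", recorded))

    # Hypothetical RPO of four hours.
    rpo = timedelta(hours=4)
    last_backup = datetime(2024, 1, 1, 12, 0)
    print("within RPO:", meets_rpo(last_backup, datetime(2024, 1, 1, 15, 0), rpo))
```

Checks of this kind are most useful when the customer can run them independently of the provider, which is why the SLA should state whether such controls are available.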
Although it’s impossible to plan for every event or disaster that may occur, relevant plans and continuity measures should cover several logical groupings, which can then also be applied to unforeseen or unplanned incidents.
As cloud adoption and migration continue to expand, all affected or associated areas of business (technology and otherwise) should be reviewed under BCDR plans, thus ensuring that any changes for the customer or provider are captured and acted upon.
Imagine the challenges of trying to restore service after a loss of availability when processes, controls, or technologies have changed without the plans being updated to reflect such changes.
The following ISO/IEC documents may be useful to CSPs when considering which business continuity and disaster recovery items an SLA needs to address:
- ISO/IEC DIS 19086-1, “Information Technology—Cloud Computing—Service Level Agreement (SLA) Framework”
- ISO/IEC NP 19086-2, “Information Technology—Cloud Computing—Service Level Agreement (SLA) Framework and Technology”
- ISO/IEC CD 19086-3, “Information Technology—Cloud Computing—Service Level Agreement (SLA) Framework and Technology”
- ISO/IEC AWI 19941, “Information Technology—Cloud Computing—Interoperability and Portability”
- ISO/IEC CD 19944, “Information Technology—Cloud Computing—Data and Their Flow Across Devices and Cloud Services”
- ISO/IEC FDIS 20933, “Information Technology—Distributed Application Platforms and Services (DAPS)—Access Systems”