What is BCDR Related to Cloud Environment? - Cloud Security and Computing

There are several characteristics of the cloud environment that you need to consider for your BCDR plan. They represent opportunities as well as challenges.

First, though, it pays to look in more detail at the different scenarios in which you might want to consider BCDR. The following sections discuss these scenarios, BCDR planning factors, and relevant cloud infrastructure characteristics.

Before proceeding, two definitions are needed to establish what BCDR means for the Cloud professional.

The business continuity plan (BCP) allows a business to plan what it needs to do to ensure that its key products and services continue to be delivered in case of a disaster, whereas the disaster recovery plan (DRP) allows a business to plan what needs to be done immediately after a disaster to recover from the event.

On-Premises, Cloud as BCDR

The first scenario is focused on an existing on-premises infrastructure, which may or may not have a BCDR plan in place already.

In this scenario, a CSP is considered the provider of alternative facilities should a disaster strike the on-premises infrastructure.

This is essentially the “traditional” failover conversation that IT has been engaged in for the enterprise since before the advent of the cloud.

The only difference is that the cloud is now being introduced as the endpoint for failover services and BCDR activities.

Cloud Service Consumer, Primary Provider BCDR

In the second scenario, the infrastructure under consideration is already located at a CSP.

The risk being considered is the potential failure of part of the CSP’s infrastructure, for example, one of its regions or availability zones.

The business continuity strategy then focuses on the restoration of service or failover to another part of that same CSP infrastructure.

Cloud Service Consumer, Alternative Provider BCDR

The third scenario is somewhat like the second scenario, but instead of restoration of service to the same provider, the service has to be restored to a different provider.

This also addresses the risk of complete CSP failure. DR almost by definition requires replication. The key difference between these scenarios is where the replication happens.
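The three scenarios above can be summarized by where the workload normally runs and where its replica lives. The following is a minimal Python sketch of that comparison; the names `Site`, `BcdrScenario`, and `survives_total_loss_of_primary` are illustrative assumptions, not part of any standard or provider API.

```python
from dataclasses import dataclass
from enum import Enum


class Site(Enum):
    ON_PREMISES = "on-premises"
    CSP = "CSP"
    SECOND_CSP = "second CSP"


@dataclass(frozen=True)
class BcdrScenario:
    name: str
    primary: Site  # where the workload normally runs
    replica: Site  # where data is replicated and service is restored


SCENARIOS = [
    BcdrScenario("On-premises, cloud as BCDR", Site.ON_PREMISES, Site.CSP),
    # Same provider, but a different region or availability zone.
    BcdrScenario("Cloud consumer, primary provider BCDR", Site.CSP, Site.CSP),
    BcdrScenario("Cloud consumer, alternative provider BCDR", Site.CSP, Site.SECOND_CSP),
]


def survives_total_loss_of_primary(scenario: BcdrScenario) -> bool:
    """True when the replica is held outside the primary site or provider."""
    return scenario.replica is not scenario.primary


for s in SCENARIOS:
    print(f"{s.name}: {s.primary.value} -> {s.replica.value}, "
          f"survives total loss of primary: {survives_total_loss_of_primary(s)}")
```

Only the second scenario keeps the replica with the same provider, which is why it does not cover complete CSP failure.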

BCDR Planning Factors

Information relevant in BCDR planning includes the following:

The important assets: data and processing

The current locations of these assets

The networks between the assets and the sites of their processing

Actual and potential locations of the workforce and business partners in relation to the disaster event

Relevant Cloud Infrastructure Characteristics

Cloud infrastructure has several characteristics that can be distinct advantages in realizing BCDR, depending on the scenario:

Rapid elasticity and on-demand self-service lead to flexible infrastructure that can be quickly deployed to execute an actual DR without hitting unexpected ceilings.

Broad network connectivity, which reduces operational risk.

Cloud infrastructure providers have a resilient infrastructure, and an external BCDR provider is likely to be experienced and capable because its technical and people resources are shared across several tenants.

Pay-per-use can mean that the total BCDR strategy can be a lot cheaper than alternative solutions. During normal operation, the BCDR solution is likely to have a low cost.

Of course, as part of due diligence in your BCDR plan, you should validate all assumptions with the candidate service provider and ensure that they are documented in your SLAs.
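To make the pay-per-use point concrete, the sketch below compares a standby that is billed at a low rate during normal operation with a dedicated secondary site billed at a fixed rate all year. The function names and every rate in the example are hypothetical placeholders, not real provider pricing; substitute figures validated with your candidate provider.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cost_pay_per_use(standby_rate: float, active_rate: float,
                            active_hours: float) -> float:
    """Cost of a cloud standby: low rate normally, full rate only while
    failed over or running DR tests (active_hours per year)."""
    standby_hours = HOURS_PER_YEAR - active_hours
    return standby_rate * standby_hours + active_rate * active_hours


def annual_cost_dedicated_site(fixed_rate: float) -> float:
    """Cost of an always-on secondary site billed at a fixed hourly rate."""
    return fixed_rate * HOURS_PER_YEAR


# Hypothetical figures: 1.0/hour on standby, 20.0/hour when active,
# and 48 hours per year spent failed over or testing.
cloud = annual_cost_pay_per_use(standby_rate=1.0, active_rate=20.0, active_hours=48)
dedicated = annual_cost_dedicated_site(fixed_rate=20.0)
print(f"pay-per-use: {cloud:.0f}, dedicated: {dedicated:.0f}")
```

The comparison is sensitive to how many hours per year the standby is actually active, which is one of the assumptions worth validating and documenting in the SLA.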

Understanding the Business Requirements Related to BCDR

When considering the use of CSPs in establishing BCDR, there are general concerns and business requirements that hold for other cloud services as well, and there are business requirements that are specific to BCDR.

BCDR protects against the risk of data not being available and the risk that the business processes that it supports are not functional, leading to adverse consequences for the organization.

The analysis of this risk leads to the business requirements for BCDR.

Here are some of the questions that need to be answered before an optimal cloud BCDR strategy can be developed:

Is the data sufficiently valuable for additional BCDR strategies?

What is the required recovery point objective (RPO); that is, what data loss would be tolerable?

What is the required recovery time objective (RTO); that is, what unavailability of business functionality is tolerable?

What kinds of “disasters” are included in the analysis?

Does that include provider failure?

What is the necessary recovery service level (RSL) for the systems covered by the plan?

The answers to these questions form part of the overall threat model that the BCDR plan is meant to address.

In the extreme case, both the RPO and the RTO requirements are zero. In practice, some iteration from requirements to proposed solutions is likely to occur to find an optimal balance between loss prevention and its cost.
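The iteration from requirements to proposed solutions can be sketched as a simple check of each candidate strategy against the required RPO and RTO. This is a simplified illustration: the class and field names are assumptions, and it treats the worst-case data loss as equal to the replication interval, which ignores copy duration and replication lag.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryRequirements:
    rpo: timedelta  # maximum tolerable data loss, expressed as time
    rto: timedelta  # maximum tolerable unavailability


@dataclass(frozen=True)
class CandidateStrategy:
    name: str
    replication_interval: timedelta     # worst-case gap between replicated copies
    estimated_recovery_time: timedelta  # time to restore service after a disaster


def meets_requirements(strategy: CandidateStrategy,
                       req: RecoveryRequirements) -> bool:
    """A strategy qualifies only if it satisfies both the RPO and the RTO."""
    return (strategy.replication_interval <= req.rpo
            and strategy.estimated_recovery_time <= req.rto)


# Hypothetical requirements and candidates, for illustration only.
req = RecoveryRequirements(rpo=timedelta(hours=1), rto=timedelta(hours=4))
nightly = CandidateStrategy("nightly backup", timedelta(hours=24), timedelta(hours=8))
warm = CandidateStrategy("15-minute replication to warm standby",
                         timedelta(minutes=15), timedelta(hours=1))

for candidate in (nightly, warm):
    print(candidate.name, "->", meets_requirements(candidate, req))
```

A candidate that fails the check either needs a tighter replication schedule or faster recovery, or the requirements need to be relaxed; both usually change the cost.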

Some additional concerns can be created when BCDR across geographic boundaries is considered. Geographically separating resources for BCDR can result in a reduction of, say, flooding or earthquake risk.

Counterbalancing this is the fact that every CSP is subject to local laws and regulations based on geographic location.

The key for the Cloud professional is to understand how BCDR can differ in a cloud environment from the traditional approaches that exist in non-cloud environments.

For instance, in a virtualized environment, snapshots can offer a bare-metal restoration option that can be deployed extremely quickly. Likewise, improvements to backup technology, such as the ability to examine data sets in variable segment widths and changed block tracking, have enabled the handling of large and complex data and systems in compressed timeframes.

These can affect the RTO specified for a system. In addition, as data becomes both larger and more valuable because it can be better quantified, the RPO window will only continue to widen: more historical data is considered important enough to include in RPO policy, and the initial RPO point continues to move closer to the disaster event.