In today’s world of high-availability clouds, there is still some paranoia around cloud providers, or the backups they hold, suffering a complete loss of service. In particular, many businesses are concerned that if they put all their eggs in one basket, they may be forced out of business if, say, Amazon were to go out of business. While this is highly unlikely, it is technically possible. For organisations with a very low risk tolerance, there is a simple solution you can set up to maintain a cloud-agnostic(ish) environment. This post details the logical structure and leaves the implementation up to you and your preferred tools. We will focus on implementing a full backup strategy within AWS.

What are we preparing for?

First, we must define each problem that we are going to solve. At a high level, we want to make sure that our business is as protected against disaster as possible. This can be broken down into multiple scenarios that need to be mitigated:

  • The deletion of data or resources from our tenancy.
  • The permanent loss of an Availability Zone.
  • The permanent loss of a Region.
  • The permanent loss of our provider-level backups (e.g. tenancy backups being deleted).
  • Our cloud provider going out of business or ceasing operations in our country (e.g. trade sanction).

These scenarios are ordered from most likely to least likely. Each one is a little harder to mitigate than the last, although it is feasible to prevent loss in all cases.

How will we prevent loss?

Preventing loss is pretty straightforward. In each scenario, we just need to take backups of our resources. The problem to solve is simply how to get those backups into a place that is sufficiently separated from the risk in that scenario.

Deletion of data and loss of an AZ

Deletion of data or resources within AWS is very simple to guard against: implement AWS Backup to take backups of your resources using Snapshots. Snapshots are backed by S3 and, as such, are replicated across AZs and tolerant of AZ failures. AWS Backup provides a simple and cheap method for backing up your resources within a Region.
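As a rough illustration of the above, here is a minimal sketch of a daily backup plan document in the shape that the AWS Backup `CreateBackupPlan` API accepts. The plan name, rule name, vault name, schedule, and retention period are all assumptions chosen for the example, not recommendations.

```python
import json


def build_backup_plan(vault_name: str, retention_days: int = 35) -> dict:
    """Return a daily backup plan document for a single backup vault."""
    return {
        "BackupPlanName": "daily-snapshots",  # hypothetical plan name
        "Rules": [
            {
                "RuleName": "daily",  # hypothetical rule name
                "TargetBackupVaultName": vault_name,
                # AWS cron format: 03:00 UTC every day (assumed schedule).
                "ScheduleExpression": "cron(0 3 * * ? *)",
                "StartWindowMinutes": 60,
                "CompletionWindowMinutes": 360,
                # Recovery points are deleted after this many days.
                "Lifecycle": {"DeleteAfterDays": retention_days},
            }
        ],
    }


if __name__ == "__main__":
    plan = build_backup_plan("primary-vault")
    print(json.dumps(plan, indent=2))
    # With boto3 installed and credentials configured, this could be submitted with:
    #   boto3.client("backup").create_backup_plan(BackupPlan=plan)
```

The plan only describes when backups are taken and how long they are kept. Resources still need to be assigned to it separately through a backup selection.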

Loss of a Region

The loss of a Region should not impair the operability of your application. AWS offers many different ways to replicate data across Regions; it is simply a case of implementing cross-Region backup solutions on a service-by-service basis. We will cover two of the most important scenarios here.

EC2 and RDS

EC2 and RDS support copying Snapshots across Regions. To ensure that your data is available in another Region, you simply need to make sure the Snapshots taken by AWS Backup are copied to your backup Region regularly. These copied Snapshots incur additional storage costs. However, Snapshots within a Region are stored incrementally, so each additional backup Region should add no more than roughly your current Snapshot storage cost again.
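One way to arrange the regular copy is a copy action on the AWS Backup rule itself. The sketch below assumes a rule dictionary shaped like an AWS Backup plan rule; the Region, account ID, and vault names are placeholders.

```python
import copy


def backup_vault_arn(region: str, account_id: str, vault_name: str) -> str:
    """Build the ARN of a backup vault in the given Region and account."""
    return f"arn:aws:backup:{region}:{account_id}:backup-vault:{vault_name}"


def add_copy_action(rule: dict, destination_vault_arn: str, retention_days: int) -> dict:
    """Return a copy of the rule that also copies each recovery point elsewhere."""
    updated = copy.deepcopy(rule)  # leave the caller's rule untouched
    updated.setdefault("CopyActions", []).append(
        {
            "DestinationBackupVaultArn": destination_vault_arn,
            "Lifecycle": {"DeleteAfterDays": retention_days},
        }
    )
    return updated


if __name__ == "__main__":
    # Hypothetical rule and destination; substitute your own values.
    rule = {"RuleName": "daily", "TargetBackupVaultName": "primary-vault"}
    dr_vault = backup_vault_arn("ap-southeast-2", "123456789012", "dr-vault")
    print(add_copy_action(rule, dr_vault, retention_days=35))
```

The destination vault has to exist in the backup Region before the rule runs.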

S3

S3 has a super handy feature known as Cross-Region Replication (CRR). CRR asynchronously copies new objects from a source bucket to a destination bucket in another Region, keeping the two in sync. The great part about CRR is that you can take manual backups of any data whose service doesn’t natively support cross-Region backups. Once your manual backups are taken and stored in S3, CRR takes care of duplicating this data into your backup Region. CRR requires versioning on both buckets, which also ensures that when objects are “deleted” and that action is synced, the previous versions are still available (for higher security, implement MFA Delete). It’s also important to transition data to Glacier and take advantage of Write-Once-Read-Many (WORM) locking: Vault Lock for archives held in Glacier vaults, or S3 Object Lock for objects that remain in S3. With a lock policy in place, the protected data cannot be deleted for as long as the policy requires.
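To make the replication setup concrete, here is a minimal sketch of a replication configuration in the shape that the S3 `PutBucketReplication` API accepts. The role ARN, bucket name, and rule ID are hypothetical, and the IAM role itself is assumed to already exist with permission to replicate objects.

```python
def build_replication_config(role_arn: str, destination_bucket: str) -> dict:
    """Return a replication configuration that copies every object to one bucket."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "replicate-backups",  # hypothetical rule ID
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches every object
                # Do not propagate delete markers to the backup bucket.
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": f"arn:aws:s3:::{destination_bucket}"},
            }
        ],
    }


if __name__ == "__main__":
    config = build_replication_config(
        "arn:aws:iam::123456789012:role/s3-replication",  # placeholder role
        "example-backups-dr",  # placeholder destination bucket
    )
    print(config)
    # With boto3 installed and credentials configured, this could be applied with:
    #   boto3.client("s3").put_bucket_replication(
    #       Bucket="example-backups", ReplicationConfiguration=config)
```

Leaving delete marker replication disabled is a deliberate choice in this sketch: a delete in the source bucket then does not hide the object in the backup bucket.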

Loss of Provider-based Backups

The loss of provider-based backups is most likely to come down to malicious action. It is very unlikely that Amazon would delete an account without any notice. It is more likely that an unauthorised individual would access an account and delete backup resources to impair or destroy your business. There is a simple way to ensure that your resources are protected: set up a second account with global deny Service Control Policies (SCPs) that block the deletion of backup objects. This account will have a secured root user whose password and MFA device are stored offline in a location that is hard to access. Normal users can only add to the account and cannot destroy resources. All S3 buckets should be versioned, with MFA Delete enabled. Together, these restrictions make this account an effective DR/“offsite” backup solution by logically decoupling your backup resources from your operations account.
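A deny policy of the kind described above could look something like the following sketch. The list of actions is an assumption about what counts as "deleting a backup" and is unlikely to be complete for every environment; review it against the services you actually use.

```python
import json

# Assumed set of destructive actions to block in the backup account.
PROTECTED_ACTIONS = [
    "backup:DeleteRecoveryPoint",
    "backup:DeleteBackupVault",
    "ec2:DeleteSnapshot",
    "rds:DeleteDBSnapshot",
    "rds:DeleteDBClusterSnapshot",
    "s3:DeleteObjectVersion",
    "s3:DeleteBucket",
    "s3:PutBucketVersioning",
]


def build_deny_delete_scp() -> str:
    """Return a Service Control Policy document that denies backup deletion."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBackupDeletion",  # hypothetical statement ID
                "Effect": "Deny",
                "Action": PROTECTED_ACTIONS,
                "Resource": "*",
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(build_deny_delete_scp())
    # With boto3 and AWS Organizations access, the policy could be created with:
    #   boto3.client("organizations").create_policy(
    #       Name="deny-backup-deletion", Description="Protect backups",
    #       Type="SERVICE_CONTROL_POLICY", Content=build_deny_delete_scp())
```

The policy has no effect until it is attached to the backup account or its organisational unit.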

Once this account has been set up, you can simply configure the following:

  • Copy EBS snapshots to the second account.
  • Copy RDS snapshots to the second account.
  • Set up CRR for any critical S3 backups to sync into the second account.
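For the two snapshot copies in the list above, one common approach is to share each snapshot with the second account and then copy it from there. The sketch below only builds the request parameters for the sharing step; the snapshot identifiers and account ID are placeholders.

```python
def ebs_share_request(snapshot_id: str, backup_account_id: str) -> dict:
    """Parameters for EC2 ModifySnapshotAttribute that share an EBS snapshot."""
    return {
        "SnapshotId": snapshot_id,
        "Attribute": "createVolumePermission",
        "OperationType": "add",
        "UserIds": [backup_account_id],
    }


def rds_share_request(snapshot_identifier: str, backup_account_id: str) -> dict:
    """Parameters for RDS ModifyDBSnapshotAttribute that share a manual snapshot."""
    return {
        "DBSnapshotIdentifier": snapshot_identifier,
        "AttributeName": "restore",
        "ValuesToAdd": [backup_account_id],
    }


if __name__ == "__main__":
    backup_account = "210987654321"  # placeholder ID of the second account
    print(ebs_share_request("snap-0123456789abcdef0", backup_account))
    print(rds_share_request("nightly-db-snapshot", backup_account))
    # With boto3 installed and credentials configured, these could be sent with:
    #   boto3.client("ec2").modify_snapshot_attribute(**ebs_share_request(...))
    #   boto3.client("rds").modify_db_snapshot_attribute(**rds_share_request(...))
```

Sharing alone leaves the data owned by the operations account, so the second account should then take its own copy of each shared snapshot. Encrypted snapshots also require the encryption key to be shared with the second account.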

To ensure a very high level of redundancy, you can also implement the Region-syncing features from “Loss of a Region” in this second account.

Cloud Provider Unavailability

While it is very unlikely for a Cloud Provider such as AWS or Azure to go out of business, it is not impossible. Therefore, we should try to secure ourselves against this possibility by setting up backups that are provider-agnostic. An additional benefit of these agnostic backups is that they give us a pattern for migrating to a new Cloud Provider if we choose to do so in the future.

Unfortunately, using agnostic methods for backups does limit the solutions we can use. In my own experience, at the very least the following should be done:

  • All databases should be backed up using vendor-provided DBMS tools (e.g. mysqldump).
  • Configuration of servers should be fully documented.
  • IaC should be stored in code repositories.
  • Any other data considered important to your business (e.g. Git repositories) should also have backups taken.
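As an example of the first item in the list, here is a minimal sketch of a mysqldump wrapper. It assumes a MySQL database, that mysqldump is installed on the machine running the script, and that credentials come from a MySQL option file rather than the command line. The database name and paths are placeholders.

```python
import subprocess
from pathlib import Path


def mysqldump_command(database: str, output_file: Path, host: str = "localhost") -> list[str]:
    """Build a mysqldump command line that writes a consistent logical backup."""
    return [
        "mysqldump",
        f"--host={host}",
        "--single-transaction",  # consistent snapshot for transactional tables
        "--routines",            # include stored procedures and functions
        "--triggers",            # include triggers
        f"--result-file={output_file}",
        "--databases",
        database,
    ]


def run_dump(database: str, output_file: Path, host: str = "localhost") -> None:
    """Run mysqldump and raise CalledProcessError if it exits with an error."""
    subprocess.run(mysqldump_command(database, output_file, host), check=True)


if __name__ == "__main__":
    # Print the command only; call run_dump(...) to actually take the backup.
    print(" ".join(mysqldump_command("shop", Path("backups/shop.sql"))))
```

The resulting SQL file is plain text, so it can be restored into MySQL running on any provider or on your own hardware.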

Once you’ve taken backups and documented all the necessary components of your architecture, all you need to do is store this data somewhere that isn’t your current provider. This could be as simple as downloading all the data onto a very large HDD sitting next to your desk, or syncing the data into Azure Blob Storage (or a similar service from another Cloud Provider). Other solutions include tools that take backups of services and store them agnostically in a third-party location.
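For the simplest case above, the large HDD next to your desk, a short sketch of copying a backup file and verifying it by checksum might look like this. The function names and paths are hypothetical, and the destination is assumed to be a locally mounted directory.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source: Path, destination_dir: Path) -> Path:
    """Copy a backup file into destination_dir and confirm the copy is intact."""
    destination_dir.mkdir(parents=True, exist_ok=True)
    target = destination_dir / source.name
    shutil.copy2(source, target)  # copies contents and file metadata
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Checksum mismatch for {target}")
    return target


if __name__ == "__main__":
    # Placeholder paths; point these at a real dump and a mounted drive.
    backup = Path("backups/shop.sql")
    if backup.exists():
        print(copy_and_verify(backup, Path("/mnt/offsite-hdd/backups")))
```

Verifying the checksum after the copy catches silent corruption at the moment it is cheapest to fix, while the original still exists.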

Once you have completed these final steps, your environment should be fully protected against almost all disasters (except maybe a dinosaur-like extinction event).

Sleeping easy

Now that we’ve got everything backed up, we can sleep easy. We’ve taken a very simple approach to backups and ensured that our business is protected in almost any scenario.

Thanks for reading! If you’re interested in more articles on securing your business against threats, please see Establishing Trust: Why TLS should be important to you. If you’d like to offer some suggestions for future content, please contact me.