PaaS & SaaS: Provider support

All of our primary data stores (as opposed to secondary copies that can easily be recreated) will be hosted on PaaS services that provide configurable backup support, so that we do not have to develop our own solution. We will follow the best practices recommended by each provider.

GitHub

GitHub handles all of its own data backups internally. In the event of a disaster that destroys or makes inaccessible every copy that GitHub maintains, each developer has a local clone of every repo under active development, from which the repo can be restored.

AWS Cognito

Amazon Cognito is internally highly available across Availability Zones within each region. It has never supported backups, because externally visible backups would prevent it from providing the password security guarantees that it does. It does support failover across regions. Our app is currently deployed to a single AWS region only. When we expand the deployment to multiple AWS regions, we will look into adding cross-region Cognito failover.

AWS DynamoDB

We have enabled point-in-time recovery (PITR) for all of our production DynamoDB tables, which lets us restore a table to its state at any point in the preceding 35 days. In addition, we create on-demand backups before substantial changes to these tables, in case an issue takes more than 35 days to become apparent.
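As an illustration of the two mechanisms above, the following is a minimal sketch rather than our actual tooling. It assumes a boto3 DynamoDB client is passed in. The helper names, the table name, and the backup naming scheme are hypothetical.

```python
from datetime import datetime, timezone


def enable_pitr(dynamodb, table_name):
    """Turn on point-in-time recovery for one table."""
    return dynamodb.update_continuous_backups(
        TableName=table_name,
        PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
    )


def backup_name(table_name, reason, now=None):
    """Build a backup name such as 'orders-pre-migration-20240102T030405Z'.

    The naming scheme is an assumption made for this sketch.
    """
    now = now or datetime.now(timezone.utc)
    return f"{table_name}-{reason}-{now:%Y%m%dT%H%M%SZ}"


def backup_before_change(dynamodb, table_name, reason, now=None):
    """Create an on-demand backup, which is kept until explicitly deleted."""
    return dynamodb.create_backup(
        TableName=table_name,
        BackupName=backup_name(table_name, reason, now),
    )


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   dynamodb = boto3.client("dynamodb")
#   enable_pitr(dynamodb, "orders")
#   backup_before_change(dynamodb, "orders", "pre-migration")
```

Unlike PITR, an on-demand backup does not age out of the 35-day window, which is why it is taken before a risky change.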

AWS RDS (Postgres)

Our database server has daily automated snapshots with a 7-day retention period. We also expect to take additional manual snapshots, with longer retention, whenever a database on that server has a major schema change.
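The snapshot policy above could be expressed roughly as follows. This is a sketch under assumptions, not our deployment code: it expects a boto3 RDS client to be passed in, and the helper names, the instance identifier, and the snapshot naming scheme are hypothetical.

```python
from datetime import datetime, timezone


def set_automated_backup_retention(rds, instance_id, days=7):
    """Set how long daily automated snapshots are retained."""
    if not 0 <= days <= 35:
        raise ValueError("RDS automated backup retention must be 0-35 days")
    return rds.modify_db_instance(
        DBInstanceIdentifier=instance_id,
        BackupRetentionPeriod=days,
        ApplyImmediately=True,
    )


def snapshot_id(instance_id, label, now=None):
    """Build an identifier such as 'appdb-pre-schema-v2-20240102-030405'.

    The naming scheme is an assumption made for this sketch.
    """
    now = now or datetime.now(timezone.utc)
    return f"{instance_id}-{label}-{now:%Y%m%d-%H%M%S}"


def snapshot_before_schema_change(rds, instance_id, label, now=None):
    """Take a manual snapshot, which is kept until explicitly deleted."""
    return rds.create_db_snapshot(
        DBSnapshotIdentifier=snapshot_id(instance_id, label, now),
        DBInstanceIdentifier=instance_id,
    )


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   rds = boto3.client("rds")
#   set_automated_backup_retention(rds, "appdb", days=7)
#   snapshot_before_schema_change(rds, "appdb", "pre-schema-v2")
```

Manual snapshots are not subject to the automated retention period, so they persist until someone deletes them.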

AWS S3

The AWS S3 documentation claims that S3 is more highly available than unidentified “traditional ... multi-data center infrastructures”. We use S3 versioning on buckets that hold user data, but not on the CloudWatch, Amplify, or Serverless buckets. Any additional resilience would require a multi-region deployment.
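A versioning policy like the one above can be applied and audited with a few calls. The sketch below assumes a boto3 S3 client is passed in. The helper names and bucket names are hypothetical.

```python
def enable_versioning(s3, bucket):
    """Turn on object versioning for one bucket."""
    return s3.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={"Status": "Enabled"},
    )


def unversioned_buckets(s3, buckets):
    """Return the buckets from `buckets` that do not have versioning enabled.

    A bucket that has never been versioned returns no "Status" key at all,
    so a missing key is treated the same as "Suspended".
    """
    return [
        name
        for name in buckets
        if s3.get_bucket_versioning(Bucket=name).get("Status") != "Enabled"
    ]


# Example usage (requires boto3 and AWS credentials):
#   import boto3
#   s3 = boto3.client("s3")
#   user_data_buckets = ["example-user-uploads"]  # hypothetical bucket name
#   for name in unversioned_buckets(s3, user_data_buckets):
#       enable_versioning(s3, name)
```

Running the audit helper periodically over the user-data buckets would catch a bucket that was created without versioning.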

AWS OpenSearch

The data stored in OpenSearch is a subset of the data stored in DynamoDB, and our indexes can be regenerated with a single command, so recovery of OpenSearch ultimately depends on the high availability of DynamoDB. We also keep replicas within the cluster, so that the loss of a single machine does not require recreating the index from DynamoDB. Replicas are easy to configure, improve read performance, and keep things from breaking in the middle of the night.
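For reference, the replica count is an ordinary index setting. The sketch below shows one way it might be set. It assumes an opensearch-py style client (one exposing `indices.put_settings`) is passed in. The helper names, the index name, and the replica count of 1 are hypothetical, not a record of our configuration.

```python
def replica_settings(replicas=1):
    """Build the index settings body that requests `replicas` replica shards."""
    if replicas < 0:
        raise ValueError("replica count cannot be negative")
    return {"index": {"number_of_replicas": replicas}}


def set_replicas(client, index, replicas=1):
    """Apply the replica setting to an existing index.

    This is equivalent to PUT /<index>/_settings with the body built above.
    """
    return client.indices.put_settings(index=index, body=replica_settings(replicas))


# Example usage (requires the opensearch-py package and a reachable cluster):
#   from opensearchpy import OpenSearch
#   client = OpenSearch(hosts=[...])  # connection details omitted
#   set_replicas(client, "example-index", replicas=1)
```

With at least one replica, each shard has a copy on a second node, so losing one node leaves the index readable.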