Backups & Disaster Recovery

14.9.1 What to back up

| Asset | Method | Frequency | Retention |
|---|---|---|---|
| Postgres | pg_dump (logical) + WAL archiving | hourly WAL, daily full | 30 days hot, 12 months cold |
| Redis | RDB snapshot + AOF | RDB hourly, AOF continuous | 7 days |
| API config (env, secrets) | secret manager export | on change | versioned forever |
| Smart-contract addresses | git-tracked deployments/<network>.json | on every deploy | git history |
| WALLET_ENCRYPTION_KEY | offline cold storage (HSM, paper, vault) | once + on rotation | forever |
| Audit logs | object storage (S3, GCS) | streaming | ≥ 12 months (compliance) |

Critical: lose WALLET_ENCRYPTION_KEY and every embedded wallet’s server share is unrecoverable. Treat it like a root CA key.
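One way to check that an offline copy of the key is still intact, without ever printing the key itself, is to record a SHA-256 fingerprint when the copy is made and recompute it during drills. The sketch below is illustrative only: `key_fingerprint` is a hypothetical helper name, and it assumes GNU coreutils `sha256sum` is available and that the key is a high-entropy value.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the SHA-256 fingerprint of a secret.
# Assumes GNU coreutils `sha256sum` is installed.
key_fingerprint() {
  # printf is a shell builtin, so the secret is not exposed in a process argument list.
  printf '%s' "$1" | sha256sum | cut -d' ' -f1
}

# Example (not executed here): store the output next to the cold copy,
# then compare it against a freshly computed value during each drill.
#   key_fingerprint "$WALLET_ENCRYPTION_KEY"
```

A fingerprint only proves that two copies are identical; it is not a substitute for the key and cannot be used to recover it.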

14.9.2 Backup commands

Postgres (logical):

$ # --jobs is omitted: pg_dump runs parallel dumps only with --format=directory
$ pg_dump --format=custom \
> --file=ida-$(date +%Y%m%d-%H%M).dump \
> "postgres://ida:$DB_PASS@$DB_HOST:5432/ida"
$
$ # Compressed:
$ pg_dump --format=custom --compress=9 --file=ida.dump "..."
$
$ # Upload to S3
$ aws s3 cp ida.dump s3://ida-backups/postgres/$(date +%Y/%m/%d)/
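The dump and upload steps could be wrapped in one function so that a failed or unreadable dump is never uploaded. This is a minimal sketch, not the project's actual tooling: the helper names (`dump_name`, `s3_prefix`, `backup_postgres`) and the `DATABASE_URL` variable are assumptions, and the bucket layout simply mirrors the upload command above.

```bash
#!/usr/bin/env bash
# Hypothetical helpers; names and DATABASE_URL are assumptions for illustration.

# Build the dump file name for a timestamp given as YYYYmmdd-HHMM.
dump_name() {
  printf 'ida-%s.dump\n' "$1"
}

# Build the dated S3 prefix (year/month/day) from the same timestamp.
s3_prefix() {
  local ts="$1" y m d
  y=$(printf '%s' "$ts" | cut -c1-4)
  m=$(printf '%s' "$ts" | cut -c5-6)
  d=$(printf '%s' "$ts" | cut -c7-8)
  printf 's3://ida-backups/postgres/%s/%s/%s/\n' "$y" "$m" "$d"
}

# Dump, check that the archive is readable, then upload.
# Returns non-zero at the first step that fails.
backup_postgres() {
  local ts file
  ts=$(date +%Y%m%d-%H%M)
  file=$(dump_name "$ts")
  pg_dump --format=custom --file="$file" "$DATABASE_URL" || return 1
  # pg_restore --list prints the archive's table of contents without touching a database.
  pg_restore --list "$file" >/dev/null || return 1
  aws s3 cp "$file" "$(s3_prefix "$ts")"
}
```

Listing the archive before upload catches truncated files early, although only a full restore proves the data is usable.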

Redis (RDB snapshot):

$ # SAVE blocks Redis until the snapshot is written; on a busy instance prefer BGSAVE
$ docker exec ida-redis redis-cli SAVE
$ docker cp ida-redis:/data/dump.rdb ./redis-$(date +%Y%m%d-%H%M).rdb
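On a production instance, a non-blocking variant may be preferable: `BGSAVE` forks and writes the snapshot in the background, and `LASTSAVE` reports the Unix time of the last successful save. The sketch below polls `LASTSAVE` until it changes. `redis_cmd` and `snapshot_redis` are hypothetical names, and the container name is taken from the commands above.

```bash
#!/usr/bin/env bash
# Thin wrapper so the container name (ida-redis, as above) lives in one place.
redis_cmd() {
  docker exec ida-redis redis-cli "$@"
}

# Start a background save and wait until LASTSAVE changes.
# Argument: timeout in seconds (default 60).
# Returns 0 once a newer snapshot is reported, 1 on error or timeout.
snapshot_redis() {
  local timeout="${1:-60}" before after i=0
  before=$(redis_cmd LASTSAVE)
  redis_cmd BGSAVE >/dev/null || return 1
  while [ "$i" -lt "$timeout" ]; do
    after=$(redis_cmd LASTSAVE)
    if [ "$after" != "$before" ]; then
      return 0
    fi
    sleep 1
    i=$((i + 1))
  done
  return 1
}

# Example (not executed here):
#   snapshot_redis && docker cp ida-redis:/data/dump.rdb ./redis-$(date +%Y%m%d-%H%M).rdb
```

`LASTSAVE` has one-second resolution, so a save that completes within the same second as the previous one is not detected and the function times out; rerunning it is harmless.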

14.9.3 Restore drill

Run quarterly:

$ # 1. Provision a clean Postgres
$ docker run -d --name ida-restore-test -p 55432:5432 \
> -e POSTGRES_DB=ida -e POSTGRES_USER=ida -e POSTGRES_PASSWORD=test \
> postgres:16-alpine
$ # Wait until the server accepts connections
$ until pg_isready -h localhost -p 55432; do sleep 1; done
$
$ # 2. Restore (DUMP_FILE is the timestamped dump under test, named ida-YYYYMMDD-HHMM.dump)
$ pg_restore --jobs=4 \
> --dbname=postgres://ida:test@localhost:55432/ida \
> "$DUMP_FILE"
$
$ # 3. Smoke check
$ psql "postgres://ida:test@localhost:55432/ida" -c "SELECT count(*) FROM credentials;"
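For an unattended drill, the smoke check needs a pass/fail result rather than a number someone reads. A minimal sketch, assuming a hypothetical helper `check_row_count` and a minimum row count chosen by the operator:

```bash
#!/usr/bin/env bash
# Hypothetical helper: validate a row count returned by psql.
# Arguments: table name, observed count, minimum expected count (default 1).
# Returns 0 when count >= minimum, 1 when it is too low, 2 when it is not a number.
check_row_count() {
  local table="$1" count="$2" min="${3:-1}"
  case "$count" in
    '' | *[!0-9]*)
      echo "FAIL $table: unexpected result '$count'" >&2
      return 2
      ;;
  esac
  if [ "$count" -lt "$min" ]; then
    echo "FAIL $table: $count rows, expected at least $min" >&2
    return 1
  fi
  echo "OK $table: $count rows"
}

# Example (not executed here); -A and -t make psql print the bare value:
#   count=$(psql "postgres://ida:test@localhost:55432/ida" -At -c "SELECT count(*) FROM credentials;")
#   check_row_count credentials "$count" 1
```

A count check only shows that the table was restored and is not empty; comparing against a count recorded at backup time would be a stricter test.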

14.9.4 Recovery objectives

| Objective | Target |
|---|---|
| RTO (recovery time) | ≤ 1 hour for API; ≤ 4 hours for full stack |
| RPO (recovery point) | ≤ 1 hour (hourly WAL archives) |
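The RPO target can be monitored by comparing the age of the newest archived backup with the one-hour limit. The following is a sketch under stated assumptions: `within_rpo` is a hypothetical helper, and how the timestamp of the newest archive is obtained (object-store listing, file mtime) is left to the caller.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed when the newest backup is within the RPO window.
# Arguments: time of newest backup and current time (both Unix seconds),
#            maximum allowed age in seconds (default 3600, i.e. RPO of 1 hour).
# Returns 0 when within the window, 1 when too old, 2 when the timestamp is in the future.
within_rpo() {
  local last="$1" now="$2" max_age="${3:-3600}" age
  age=$((now - last))
  if [ "$age" -lt 0 ]; then
    echo "FAIL: backup timestamp is in the future" >&2
    return 2
  fi
  if [ "$age" -gt "$max_age" ]; then
    echo "FAIL: newest backup is ${age}s old, RPO is ${max_age}s" >&2
    return 1
  fi
  echo "OK: newest backup is ${age}s old"
}

# Example (not executed here; `stat -c %Y` is the GNU form):
#   within_rpo "$(stat -c %Y "$NEWEST_ARCHIVE")" "$(date +%s)"
```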

14.9.5 Smart-contract recovery

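On-chain state is immutable and replicated by the chain network, so there is no contract state to back up; only the deployed addresses in deployments/<network>.json need to be kept (see 14.9.1). To “recover”: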

  1. Resync the new API instance against the deployed contracts (re-read events from the DIDRegistry, RevocationRegistry, etc.); pkg/blockchain/sync.go should expose a backfill command.
  2. Verify that the off-chain DB matches the on-chain truth, using a periodic reconciliation job.
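The verification step can be reduced to a set comparison between identifiers seen on-chain and identifiers stored off-chain. The sketch below assumes two plain-text exports with one identifier per line. How those exports are produced is not specified here, and `reconcile` is a hypothetical helper, not an existing command in the codebase.

```bash
#!/usr/bin/env bash
# Hypothetical reconciliation helper.
# Arguments: file of identifiers read from on-chain events,
#            file of identifiers read from the off-chain DB (one per line each).
# Prints every mismatch and returns 1 if any exist, 0 if the sets are equal.
reconcile() {
  local onchain="$1" offchain="$2" a b report
  a=$(mktemp) || return 2
  b=$(mktemp) || return 2
  # comm requires sorted input; sort -u also removes duplicates.
  # LC_ALL=C keeps sort and comm on the same byte-wise ordering.
  LC_ALL=C sort -u "$onchain" > "$a"
  LC_ALL=C sort -u "$offchain" > "$b"
  report=$(
    LC_ALL=C comm -23 "$a" "$b" | sed 's/^/missing_in_db /'
    LC_ALL=C comm -13 "$a" "$b" | sed 's/^/missing_on_chain /'
  )
  rm -f "$a" "$b"
  if [ -n "$report" ]; then
    printf '%s\n' "$report"
    return 1
  fi
  return 0
}

# Example (not executed here): reconcile onchain-dids.txt db-dids.txt
```

Entries reported as `missing_in_db` point to events the backfill has not yet processed; entries reported as `missing_on_chain` point to off-chain rows with no on-chain counterpart and usually need manual review.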