feat(server-backup): restic dual-repo backup (NAS + B2) with dashboard UI

Adds a server-wide backup capability beyond the existing ops_dashboard
pg_dump flow:

- Daily systemd timer (03:30) runs pg_dumpall + Forgejo dump, then restic
  to a local NAS repo and an offsite Backblaze B2 repo with Object Lock.
  Phase-based script with single-instance flock, structured statusfile,
  systemd hardening, and live-datadir excludes (Postgres / Forgejo) so
  the dumps stay authoritative.
- Ops-agent gets nine new read-only/trigger commands (snapshots, stats,
  status, logs, plus two triggers) backed by sudoers-whitelisted wrapper
  scripts that source /etc/restic-backup.env so the agent never sees the
  restic password or B2 keys.
- Two new flows (server_backup_full, server_backup_restore_test) drive
  the dashboard's "Backup now" and "Restore test" buttons.
- /settings/backups gains a Server backup section with overall + per-phase
  status, NAS / B2 snapshot tables, restore-size / raw-data / dedup-ratio
  stats, and the last restore-test result. The existing pg_dump section
  is preserved unchanged.
- Runbook docs/runbooks/server-backup.md follows the tailscale-setup
  pattern (plan + addendum) and covers B2 Object Lock + scoped keys,
  Forgejo subplan with isolated restore-test stack, the off-server
  maintenance flow for B2 prune, and the integrity-check schedule.

Code-only change — installation on scrum4me-srv follows the runbook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Madhura68 2026-05-15 13:03:00 +02:00
parent 27cba872a8
commit ab87c0fada
23 changed files with 2625 additions and 170 deletions

View file

@ -0,0 +1,12 @@
[Unit]
Description=Daily server-wide backup (timer)
[Timer]
# Daily at 03:30 local. After ops-db-backup.timer (02:00) so the ops_dashboard
# pg_dump from /srv/ops/backups/ is fresh when restic picks it up.
OnCalendar=*-*-* 03:30:00
Persistent=true
RandomizedDelaySec=600
[Install]
WantedBy=timers.target