Commit graph

13 commits

Author SHA1 Message Date
Madhura68
ab87c0fada feat(server-backup): restic dual-repo backup (NAS + B2) with dashboard UI
Adds a server-wide backup capability beyond the existing ops_dashboard
pg_dump flow:

- Daily systemd timer (03:30) runs pg_dumpall + Forgejo dump, then restic
  to a local NAS repo and an offsite Backblaze B2 repo with Object Lock.
  Phase-based script with single-instance flock, structured statusfile,
  systemd hardening, and live-datadir excludes (Postgres / Forgejo) so
  the dumps stay authoritative.
- Ops-agent gets nine new read-only/trigger commands (snapshots, stats,
  status, logs, plus two triggers) backed by sudoers-whitelisted wrapper
  scripts that source /etc/restic-backup.env so the agent never sees the
  restic password or B2 keys.
- Two new flows (server_backup_full, server_backup_restore_test) drive
  the dashboard's "Backup now" and "Restore test" buttons.
- /settings/backups gains a Server backup section with overall + per-phase
  status, NAS / B2 snapshot tables, restore-size / raw-data / dedup-ratio
  stats, and the last restore-test result. The existing pg_dump section
  is preserved unchanged.
- Runbook docs/runbooks/server-backup.md follows the tailscale-setup
  pattern (plan + addendum) and covers B2 Object Lock + scoped keys,
  Forgejo subplan with isolated restore-test stack, the off-server
  maintenance flow for B2 prune, and the integrity-check schedule.

Code-only change — installation on scrum4me-srv follows the runbook.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-15 13:03:00 +02:00
Janpeter Visser
68c4d037cf feat(flows): redeploy_all flow + fix MCP-worker cache-bust
Legt de volledige stack-redeploy vast als één flow: scrum4me-web
(pull/migrate/build/restart) gevolgd door de MCP-worker.

Onderweg een echte bug gevonden en gefixt: update_mcp_worker.yml deed
`docker_compose_build worker-idea` zónder cache-bust. De worker-idea
Dockerfile clonet scrum4me-mcp van GitHub in een aparte laag; zolang
MCP_GIT_REF gelijk blijft ('main') hergebruikt Docker die laag, dus
nieuwe MCP-commits werden NIET opgepikt. Een schijnbaar geslaagde
rebuild draaide stilletjes op oude MCP-code.

Wijzigingen:
- commands.yml.example: nieuw command docker_compose_build_worker_fresh
  dat via `sh -c` MCP_CACHE_BUST=$(date +%s) meegeeft — invalideert de
  clone-laag zodat de laatste MCP-code wordt gepulld
- update_mcp_worker.yml: gebruikt nu de fresh-build; pullt ook
  scrum4me-mcp lokaal (on_failure: continue, sync-only)
- redeploy_all.yml: nieuwe gecombineerde flow (16 stappen, web → worker)
- app/flows/redeploy-all/: UI-pagina + panel, zelfde patroon als de
  bestaande flow-pagina's
- app/flows/page.tsx: Redeploy All bovenaan de flows-lijst

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 12:05:39 +02:00
Scrum4Me Agent
4dd0490afc feat(backup): add ops-db backup commands, flow, and systemd timer
Adds pg_dump_ops_db, list_ops_backups, and cleanup_ops_backups to the
agent command whitelist. Includes a backup_ops_db flow YAML (dump +
30-day retention), and a systemd service/timer for daily automated
backups at 02:00.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 20:07:14 +02:00
Scrum4Me Agent
6bee8e8741 feat(flows): add update_mcp_worker flow with git_status, force-recreate, and health check
- update_mcp_worker.yml: git_status -> git_fetch -> git_pull (all cwd=scrum4me-docker) -> docker_compose_build -> docker_compose_up_recreate -> wait_for_health_worker
- commands.yml.example: add docker_compose_up_recreate (--force-recreate) and wait_for_health_worker (polls /var/log/agent/current for 'pre-flight passed')

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 19:45:36 +02:00
Scrum4Me Agent
9f590f1732 feat(flows): add update_scrum4me_web flow and UI page
- Update ops-agent/flows.example/update_scrum4me_web.yml with full
  deployment steps: git_status, git_fetch, git_log_ahead, git_pull,
  npm_ci, prisma_migrate_deploy, npm_run_build, systemctl_restart,
  and smoke test against thuis.jp-visser.nl/api/products
- Add npm_ci, prisma_migrate_deploy, npm_run_build, and
  curl_smoke_scrum4me_thuis to commands.yml.example
- Add /flows/update-scrum4me-web UI page with Run and Dry Run buttons,
  streaming terminal output, and link to audit log on completion

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 19:42:39 +02:00
Scrum4Me Agent
bdc24b57ba feat(flows): add YAML flow format, flow-runner, and /agent/v1/flow endpoint
- ops-agent/src/lib/flow-runner.ts: loads YAML flows, validates all steps
  against the command whitelist, executes sequentially; supports dry_run
  (emits WOULD RUN lines) and on_failure: abort|continue per step
- ops-agent/src/routes/flow.ts: POST /agent/v1/flow { flow_key, dry_run }
  streams step_start/stdout/stderr/step_done/done SSE events
- ops-agent/src/index.ts: register flow route, add FLOWS_PATH env var
- ops-agent/flows.example/: three flow definitions — update_scrum4me_web,
  update_mcp_worker, update_caddy_config; deploy to /etc/ops-agent/flows/
- ops-agent/commands.yml.example: add curl_smoke_scrum4me_web and
  docker_compose_ps_worker smoke-test commands
- app/api/flows/run/route.ts: Next.js proxy — creates FlowRun/FlowStep
  DB records per step, forwards SSE stream to browser
- hooks/useFlowRun.ts: add startFlow(flowKey, dryRun) method; handle
  step_start events to display step headers in the terminal
- components/StreamingTerminal.tsx: add 'info' line type (sky-400) for
  step headers

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 19:22:34 +02:00
Scrum4Me Agent
3781fce1e2 feat(ui): add action buttons to Docker, Git, systemd, and Caddy modules
- Docker table: Restart and Stop buttons per container row (docker_compose_restart / docker_compose_stop)
- Git repos list: Fetch and Pull buttons per repo; Pull disabled when working tree is dirty
- systemd units list: Restart button per unit (systemctl_restart)
- Caddy: Edit link on /caddy page, new /caddy/edit page with textarea + 3-step Validate → Save+Reload flow
- All buttons open ConfirmDialog with exact agent-call preview, then stream output via StreamingTerminal
- Add docker_compose_stop to ops-agent/commands.yml.example

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 19:14:49 +02:00
Scrum4Me Agent
234b2d1a58 feat(ops-agent): extend whitelist with destructive commands + preconditions
Adds docker_compose_restart/build/up, git_pull (guarded by
git_status_clean precondition), systemctl_restart (via sudo),
caddy_validate, caddy_reload, and caddy_write_config (atomic
stdin→Caddyfile.new→Caddyfile write).

- CommandDef gains preconditions[] and stdin_from_body fields
- exec route checks git_status_clean before git_pull; returns 409 on
  dirty tree with a clear message
- stdin field in ExecBody is piped to child stdin for caddy_write_config

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 17:53:05 +02:00
Scrum4Me Agent
1c51a0868f feat(caddy): add caddy_list_certs whitelist entry and cert parser
- Extend commands.yml.example with caddy_list_certs (sh loop over /data/caddy/certificates/*/*.crt using openssl)
- Add lib/parse-caddy.ts: parseCertList() parses CERTFILE/CERTEND delimited openssl output
- Add shiki ^1.29.2 dependency for server-side Caddyfile syntax highlighting

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 17:48:41 +02:00
Scrum4Me Agent
c12e36e0a4 feat(systemd): unit overview + journal viewer pages
- Add journalctl_recent command and scrum4me-web to whitelist in commands.yml.example
- Add SYSTEMD_UNITS env var to .env.example
- lib/parse-systemd.ts: parse activeState, subState, uptime, description
- /app/systemd: server page reading SYSTEMD_UNITS, client list with 10s polling and status badges
- /app/systemd/[unit]: server detail page, client component showing systemctl status + last 100 journal lines (polling 10s)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 17:41:54 +02:00
Scrum4Me Agent
4821d29670 feat(ops-agent): cwd_pattern support + git command whitelist
Add validateCwd() to whitelist.ts for dynamic-cwd validation, update
exec.ts to resolve first arg as cwd when cwd_pattern is set, and extend
commands.yml.example with git_status, git_log_ahead, git_diff, git_fetch.
Add REPO_PATHS to .env.example.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 17:32:50 +02:00
Scrum4Me Agent
d605eb17a5 feat(ops-agent): whitelist-config parser + strict command executor
- CommandDef now uses cmd[] array instead of exec string — no shell splitting
- validateArgs() checks every request arg against allowed list; rejects unknown values
- spawn() called with shell:false (execFile semantics); cwd from config
- Audit log (JSON) per call to stdout → captured by systemd journal
- commands.yml.example updated to new schema with 4 read-only commands

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 17:18:45 +02:00
Scrum4Me Agent
4bccbf28f3 feat: ops-agent Fastify service met SSE, whitelist en systemd-unit
- ops-agent/: Node.js Fastify+TypeScript service
  - GET /agent/v1/health
  - POST /agent/v1/exec → SSE stream (stdout/stderr/exit events)
  - Whitelist geladen uit /etc/ops-agent/commands.yml bij opstart
  - Auth via Bearer shared secret (/etc/ops-agent/secret, mode 0640)
  - Vier standaard commando's: docker_ps, git_status, systemctl_status,
    caddy_show_config
- deploy/ops-agent/ops-agent.service: systemd-unit (User=ops-agent,
  Restart=on-failure, StandardOutput=journal)
- deploy/ops-agent/setup.sh: aanmaken system-user, build, deploy,
  systemctl enable --now ops-agent

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-13 17:15:44 +02:00