docs(PBI-58): add developer manual chapters under docs/manual/
Adds a 7-file English-language manual targeted at new human contributors: index, overview, statuses & transitions (with mermaid state diagrams), git workflow, MCP integration, docker, and troubleshooting. The manual is the *map* — it cross-references existing runbooks/ADRs/architecture docs rather than duplicating their content. Regenerates docs/INDEX.md and validates with check-doc-links.mjs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
parent e8562d4018
commit e75bac9375
8 changed files with 873 additions and 0 deletions
@ -105,6 +105,13 @@ Auto-generated on 2026-05-07 from front-matter and headings.
| [Docker smoke test — task 2](./docker-smoke/2-mei-task-2.md) | `docker-smoke/2-mei-task-2.md` | done | 2026-05-03 |
| [Scrum4Me — Functionele Specificatie](./functional.md) | `functional.md` | active | 2026-05-03 |
| [Scrum4Me — Glossary](./glossary.md) | `glossary.md` | active | 2026-05-03 |
| [Overview](./manual/01-overview.md) | `manual/01-overview.md` | active | 2026-05-07 |
| [Statuses & Transitions](./manual/02-statuses-and-transitions.md) | `manual/02-statuses-and-transitions.md` | active | 2026-05-07 |
| [Git Workflow](./manual/03-git-workflow.md) | `manual/03-git-workflow.md` | active | 2026-05-07 |
| [MCP Integration](./manual/04-mcp-integration.md) | `manual/04-mcp-integration.md` | active | 2026-05-07 |
| [Docker](./manual/05-docker.md) | `manual/05-docker.md` | active | 2026-05-07 |
| [Troubleshooting](./manual/06-troubleshooting.md) | `manual/06-troubleshooting.md` | active | 2026-05-07 |
| [Scrum4Me Developer Manual](./manual/index.md) | `manual/index.md` | active | 2026-05-07 |
| [Scrum4Me — Styling & Design System](./md3-color-scheme.md) | `md3-color-scheme.md` | active | 2026-05-03 |
| [Obsidian as Personal Authoring Layer](./obsidian-authoring.md) | `obsidian-authoring.md` | active | 2026-05-02 |
| [PbiDialog Profiel](./pbi-dialog.md) | `pbi-dialog.md` | active | 2026-05-03 |

99
docs/manual/01-overview.md
Normal file
@ -0,0 +1,99 @@
---
title: "Overview"
status: active
audience: [contributor]
language: en
last_updated: 2026-05-07
when_to_read: "First chapter — start here for the elevator pitch and project structure."
---

# 01 — Overview

## What is Scrum4Me?

Scrum4Me is a **desktop-first fullstack web app for solo developers and small Scrum teams** who manage multiple software projects in parallel. It models the Scrum hierarchy explicitly (Product → PBI → Story → Task), supports Sprints with split-screen drag-and-drop planning, and integrates Claude Code as an automated implementation worker — every result the agent produces is logged back into the originating story.

The app is deployable to **Vercel + Neon** (default) and can run **fully local** via the worker container. A built-in demo user has read-only access; Product Owners add Developers by username, and those Developers gain write access to that product's stories, tasks, and sprints.

## Entity hierarchy

```mermaid
flowchart TB
    Product["Product<br/>(per repo)"]
    Idea["Idea<br/>(pre-PBI staging)"]
    PBI["PBI<br/>(Product Backlog Item)"]
    Story["Story"]
    Task["Task"]
    Sprint["Sprint<br/>(cross-cutting)"]

    Product --> Idea
    Idea -.->|"AI-grilled & planned"| PBI
    Product --> PBI
    PBI --> Story
    Story --> Task
    Sprint -.->|"contains stories<br/>denormalised on tasks"| Story
    Sprint -.-> Task
```

- **Product** — one row per repo. `repo_url`, `definition_of_done`, members.
- **Idea** — pre-PBI staging entity introduced in M12. Goes through `IDEA_GRILL` (AI Q&A loop) and `IDEA_MAKE_PLAN` jobs to produce a structured plan that can be turned into a PBI tree.
- **PBI** — a Product Backlog Item. Has `priority` (1–4) and `sort_order` (float, see [`docs/patterns/sort-order.md`](../patterns/sort-order.md)).
- **Story** — a unit of value under a PBI; has acceptance criteria. Lives in the backlog (`OPEN`) until added to a sprint.
- **Task** — the smallest unit; has an `implementation_plan` consumed by the Claude worker. `sprint_id` is denormalised from the parent story for query efficiency.
- **Sprint** — cross-cutting time-box. Stories are added to a sprint; tasks inherit `sprint_id`. Sprint execution has two modes: `PER_TASK` and `SPRINT_BATCH` — see [`docs/architecture/sprint-execution-modes.md`](../architecture/sprint-execution-modes.md).

For status lifecycles of each entity, see [02 — Statuses & Transitions](./02-statuses-and-transitions.md).

## Stack

| Layer | Technology |
|---|---|
| Framework | Next.js 16 (App Router) + React 19 |
| Language | TypeScript (strict) |
| Styling | Tailwind CSS + shadcn/ui + Material Design 3 tokens via [`app/styles/theme.css`](../../app/styles/theme.css) |
| Client state | Zustand + dnd-kit |
| Database | Prisma v7 + PostgreSQL (Neon) |
| Auth | iron-session + bcryptjs |
| Utilities | Zod, Sonner, Sharp, Vercel Analytics |
| Hosting | Vercel (app), Neon (DB), Mac/NAS Docker (worker) |

For the rationale behind each choice and the technologies we explicitly **don't** use, see [`docs/architecture/overview.md`](../architecture/overview.md).

## Repository layout

```
Scrum4Me/
├── app/                  # Next.js App Router routes
│   ├── (app)/            # authenticated desktop UI
│   ├── (auth)/           # login, register, demo
│   ├── (mobile)/         # /m/* mobile shell (3 screens)
│   ├── api/              # REST route handlers (Claude integration)
│   └── styles/           # MD3 token CSS
├── components/           # shared UI components
├── lib/                  # server/client utilities
│   └── task-status.ts    # the ONLY place DB↔API enum mapping happens
├── prisma/               # schema + migrations
├── docs/                 # this manual + ADRs, runbooks, patterns, specs
└── scripts/              # codegen, seeders, link checkers
```

The `*-server.ts` filename suffix marks server-only modules (DB, Node APIs). They must never be imported into a client component — see the hardstop in [`CLAUDE.md`](../../CLAUDE.md#hardstop-regels).

For a deeper structural breakdown including stores, realtime channels, and the job queue, see [`docs/architecture/project-structure.md`](../architecture/project-structure.md).

## Glossary refresh

A few terms used throughout this manual often differ from "generic Scrum" usage:

- **PBI** — Product Backlog Item. Not "Feature" or "Epic".
- **Story** — A unit of work under a PBI. Not "Ticket" or "Issue".
- **Sprint Goal** — The narrative for a sprint. Not "Objective".
- **Worker** — A Claude Code agent claiming jobs from the Scrum4Me queue (M13).
- **Demo user** — A read-only built-in user; writes return `403`. See [`docs/adr/0006-demo-user-three-layer-policy.md`](../adr/0006-demo-user-three-layer-policy.md).
- **Idea** — Pre-PBI staging artefact (M12). Has its own state machine; see [02](./02-statuses-and-transitions.md#idea).

The complete glossary lives at [`docs/glossary.md`](../glossary.md).

## What's next

→ [02 — Statuses & Transitions](./02-statuses-and-transitions.md) covers how each entity moves through its lifecycle, with state-machine diagrams.

222
docs/manual/02-statuses-and-transitions.md
Normal file
@ -0,0 +1,222 @@
---
title: "Statuses & Transitions"
status: active
audience: [contributor]
language: en
last_updated: 2026-05-07
when_to_read: "Whenever an entity's status changes unexpectedly or you need to know what status comes next."
---

# 02 — Statuses & Transitions

Every persistent entity in Scrum4Me has an explicit status enum. This chapter documents them all, with state-machine diagrams showing allowed transitions, the trigger for each transition (user action vs system / job-driven), and the side effects.

> **Hardstop:** the database stores enums in `UPPER_SNAKE`; the REST API exposes them in `lowercase`. Conversion happens **only** through [`lib/task-status.ts`](../../lib/task-status.ts) — never call `.toLowerCase()` or `.toUpperCase()` directly. See the [DB vs API mapping](#db-vs-api-mapping) section at the end.

## Quick reference

| Entity | Source enum | Statuses |
|---|---|---|
| [PBI](#pbi) | `PbiStatus` | `READY`, `BLOCKED`, `DONE`, `FAILED` |
| [Story](#story) | `StoryStatus` | `OPEN`, `IN_SPRINT`, `DONE`, `FAILED` |
| [Task](#task) | `TaskStatus` | `TO_DO`, `IN_PROGRESS`, `REVIEW`, `DONE`, `FAILED` |
| [Sprint](#sprint) | `SprintStatus` | `ACTIVE`, `COMPLETED`, `FAILED` |
| [SprintRun](#sprintrun) | `SprintRunStatus` | `QUEUED`, `RUNNING`, `PAUSED`, `DONE`, `FAILED`, `CANCELLED` |
| [ClaudeJob](#claudejob) | `ClaudeJobStatus` | `QUEUED`, `CLAIMED`, `RUNNING`, `DONE`, `FAILED`, `CANCELLED`, `SKIPPED` |
| [Idea](#idea) | `IdeaStatus` | `DRAFT`, `GRILLING`, `GRILL_FAILED`, `GRILLED`, `PLANNING`, `PLAN_FAILED`, `PLAN_READY`, `PLANNED` |

## PBI

A **Product Backlog Item** holds one or more stories. Its status reflects whether the PBI as a whole is ready to be picked up, blocked on something external, finished, or written off.

```mermaid
stateDiagram-v2
    [*] --> READY: create_pbi
    READY --> BLOCKED: user marks blocked
    BLOCKED --> READY: user unblocks
    READY --> DONE: all stories DONE
    READY --> FAILED: user gives up
    BLOCKED --> FAILED: user gives up
    DONE --> [*]
    FAILED --> [*]
```

| Transition | Trigger | Side effect |
|---|---|---|
| `* → READY` | `create_pbi` MCP tool or PBI dialog | New PBI lands in `priority` group, `sort_order = last + 1` |
| `READY ↔ BLOCKED` | User toggles via PBI dialog | None besides log entry |
| `READY → DONE` | All child stories reach `DONE` | Auto-promotion (see [ST-1109 plan](../plans/ST-1109-pbi-status.md)) |
| `* → FAILED` | User gives up on the PBI | Stories may remain `OPEN`; PBI is filtered out of active boards |

## Story

A **Story** sits under a PBI. It moves out of the backlog when added to a Sprint, and reaches `DONE` when its tasks are complete and the implementation is verified.
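
The completion rule in the paragraph above can be sketched as a small predicate. This is a minimal illustration, not the project's code: the function name `storyEligibleForDone` and the decision to treat a story without tasks as not complete are assumptions made for the example.

```ts
// Hypothetical helper: the name and the empty-story rule are assumptions.
type TaskStatus = "TO_DO" | "IN_PROGRESS" | "REVIEW" | "DONE" | "FAILED";

function storyEligibleForDone(
  taskStatuses: readonly TaskStatus[],
  verifyPassed: boolean,
): boolean {
  // Array.prototype.every returns true for an empty array, so a story
  // without tasks is rejected explicitly instead of counting as complete.
  if (taskStatuses.length === 0) return false;
  return verifyPassed && taskStatuses.every((s) => s === "DONE");
}
```

The explicit empty-array guard is the one design choice worth noting: without it, a freshly created story with no tasks would look finished.
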

```mermaid
stateDiagram-v2
    [*] --> OPEN: create_story
    OPEN --> IN_SPRINT: added to sprint
    IN_SPRINT --> OPEN: removed from sprint
    IN_SPRINT --> DONE: all tasks DONE + verify passes
    IN_SPRINT --> FAILED: verify fails / abandoned
    DONE --> [*]
    FAILED --> [*]
```

| Transition | Trigger | Side effect |
|---|---|---|
| `* → OPEN` | `create_story` MCP tool or Story dialog | Lives in product backlog |
| `OPEN ↔ IN_SPRINT` | Drag onto Sprint board, or sprint-removal | Tasks denormalise `sprint_id` |
| `IN_SPRINT → DONE` | Story completion via MCP / UI; auto-PR flow may trigger | Auto-PR flow ([`runbooks/auto-pr-flow.md`](../runbooks/auto-pr-flow.md)) may run; PBI is re-evaluated for `READY → DONE` |
| `IN_SPRINT → FAILED` | Verification failure or manual abandon | Logged in story log |

## Task

A **Task** is the smallest unit. The Claude worker mainly reads `implementation_plan` and writes status transitions through MCP tools.

```mermaid
stateDiagram-v2
    [*] --> TO_DO: create_task
    TO_DO --> IN_PROGRESS: agent claims / user starts
    IN_PROGRESS --> REVIEW: implementation done, awaiting verify
    REVIEW --> DONE: verify passes
    REVIEW --> IN_PROGRESS: verify fails, retry
    IN_PROGRESS --> FAILED: unrecoverable error
    REVIEW --> FAILED: gives up after retries
    DONE --> [*]
    FAILED --> [*]
```

| Transition | Trigger | Side effect |
|---|---|---|
| `* → TO_DO` | `create_task` MCP tool / Task dialog | Inherits `sprint_id` from parent story |
| `TO_DO → IN_PROGRESS` | Worker claim or user starts | Story may auto-promote to `IN_SPRINT` |
| `IN_PROGRESS → REVIEW` | Implementation logged | Optional `verify_task_against_plan` runs |
| `REVIEW → DONE` | Verify passes / human accepts | When all sibling tasks are `DONE`, the parent story is eligible for `DONE` |
| `* → FAILED` | Unrecoverable error or human marks failed | Story may auto-promote to `FAILED` |

The MCP tool is `update_task_status({ task_id, status })` accepting lowercase API values: `todo | in_progress | review | done | failed`.
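
As an illustration of how the Task diagram could be turned into a guard, the sketch below encodes the arrows of the state diagram as a lookup table. It is an assumption-laden example: `TASK_TRANSITIONS` and `canTransitionTask` are hypothetical names, the table mirrors only the diagram (not the broader `* → FAILED` row), and the server's real validation may differ.

```ts
// Hypothetical guard mirroring the Task state diagram; not the server's actual code.
type TaskStatus = "TO_DO" | "IN_PROGRESS" | "REVIEW" | "DONE" | "FAILED";

const TASK_TRANSITIONS: Record<TaskStatus, readonly TaskStatus[]> = {
  TO_DO: ["IN_PROGRESS"],
  IN_PROGRESS: ["REVIEW", "FAILED"],
  REVIEW: ["DONE", "IN_PROGRESS", "FAILED"],
  DONE: [], // terminal in the diagram
  FAILED: [], // terminal in the diagram
};

function canTransitionTask(from: TaskStatus, to: TaskStatus): boolean {
  return TASK_TRANSITIONS[from].some((allowed) => allowed === to);
}
```

Keeping the transitions in a single table means the diagram and the guard can be compared line by line when either one changes.
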

## Sprint

A **Sprint** is the cross-cutting time-box. Its status tracks the overall sprint container, not the agent execution.

```mermaid
stateDiagram-v2
    [*] --> ACTIVE: create sprint
    ACTIVE --> COMPLETED: user closes sprint
    ACTIVE --> FAILED: user abandons sprint
    COMPLETED --> [*]
    FAILED --> [*]
```

For execution semantics (PER_TASK vs SPRINT_BATCH) see [`docs/architecture/sprint-execution-modes.md`](../architecture/sprint-execution-modes.md).

## SprintRun

A **SprintRun** is one execution attempt of a sprint by the agent worker. Multiple runs may exist over a sprint's lifetime (if a run is cancelled or paused and restarted).

```mermaid
stateDiagram-v2
    [*] --> QUEUED: trigger sprint run
    QUEUED --> RUNNING: worker claims
    RUNNING --> PAUSED: pause requested
    PAUSED --> RUNNING: resume
    RUNNING --> DONE: all tasks done
    RUNNING --> FAILED: unrecoverable
    QUEUED --> CANCELLED: user cancels
    RUNNING --> CANCELLED: user cancels
    PAUSED --> CANCELLED: user cancels
    DONE --> [*]
    FAILED --> [*]
    CANCELLED --> [*]
```

The cascade rules (which task transitions automatically promote the SprintRun) are described in [`docs/plans/sprint-pr-worktree-state-machines.md`](../plans/sprint-pr-worktree-state-machines.md). When calling `update_task_status` from inside a sprint run, pass the optional `sprint_run_id` so the server can validate ownership and propagate cascades.

## ClaudeJob

A **ClaudeJob** is one unit of work in the agent job queue (M13). Each job has a `kind`: `TASK_IMPLEMENTATION`, `IDEA_GRILL`, `IDEA_MAKE_PLAN`, `PLAN_CHAT`, or `SPRINT_IMPLEMENTATION`.

```mermaid
stateDiagram-v2
    [*] --> QUEUED: enqueue
    QUEUED --> CLAIMED: wait_for_job (FOR UPDATE SKIP LOCKED)
    CLAIMED --> RUNNING: worker starts
    RUNNING --> DONE: update_job_status('done')
    RUNNING --> FAILED: update_job_status('failed')
    QUEUED --> CANCELLED: user cancels
    CLAIMED --> QUEUED: stale (>30min)
    QUEUED --> SKIPPED: superseded
    DONE --> [*]
    FAILED --> [*]
    CANCELLED --> [*]
    SKIPPED --> [*]
```

| Transition | Trigger | Side effect |
|---|---|---|
| `QUEUED → CLAIMED` | `wait_for_job` atomically claims | Bearer token is bound to the job (`claimed_by_token_id`) |
| `CLAIMED → QUEUED` | Stale claim (>30 min) | Auto-requeue on next `wait_for_job` |
| `RUNNING → DONE` | `update_job_status('done')` | Optional token-cost telemetry stored on the row |
| `RUNNING → FAILED` | `update_job_status('failed')` | For `IDEA_GRILL`/`IDEA_MAKE_PLAN`, idea status auto-rolls to `GRILL_FAILED` / `PLAN_FAILED` |

For idempotency rules and recovery procedures see [`docs/runbooks/worker-idempotency.md`](../runbooks/worker-idempotency.md).
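
The stale-claim rule from the table (`CLAIMED → QUEUED` after more than 30 minutes) can be illustrated with a pure function. Only the 30-minute threshold comes from this chapter; the `JobRow` shape, the `claimedAt` field, and the `requeueIfStale` name are assumptions for the sake of the example, and the real requeue presumably happens inside the `wait_for_job` query.

```ts
// Hypothetical sketch: field and function names are assumptions;
// only the 30-minute threshold is taken from the table above.
type ClaudeJobStatus =
  | "QUEUED" | "CLAIMED" | "RUNNING" | "DONE" | "FAILED" | "CANCELLED" | "SKIPPED";

interface JobRow {
  status: ClaudeJobStatus;
  claimedAt: Date | null;
}

const STALE_CLAIM_MS = 30 * 60 * 1000;

function requeueIfStale(job: JobRow, now: Date): JobRow {
  // Only claimed jobs can go stale; every other status is returned untouched.
  if (job.status !== "CLAIMED" || job.claimedAt === null) return job;
  const ageMs = now.getTime() - job.claimedAt.getTime();
  // Strictly greater than 30 minutes, matching ">30 min".
  return ageMs > STALE_CLAIM_MS ? { status: "QUEUED", claimedAt: null } : job;
}
```
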

## Idea

The **Idea** entity (M12) is a pre-PBI staging area. It goes through two AI-driven phases: a **grill** (Q&A loop with the user to clarify the idea) and a **plan** (single-pass output of a structured PBI tree). Failures are explicit states rather than dead ends: `GRILL_FAILED` and `PLAN_FAILED` each allow a retry that re-enters the corresponding phase.

```mermaid
stateDiagram-v2
    [*] --> DRAFT: create idea
    DRAFT --> GRILLING: enqueue IDEA_GRILL
    GRILLING --> GRILLED: update_idea_grill_md
    GRILLING --> GRILL_FAILED: job failed
    GRILL_FAILED --> GRILLING: retry
    GRILLED --> PLANNING: enqueue IDEA_MAKE_PLAN
    PLANNING --> PLAN_READY: update_idea_plan_md (parse ok)
    PLANNING --> PLAN_FAILED: parsePlanMd rejected
    PLAN_FAILED --> PLANNING: retry
    PLAN_READY --> PLANNED: PBI tree created
    PLANNED --> [*]
```

| Transition | Trigger | Side effect |
|---|---|---|
| `DRAFT → GRILLING` | User clicks "Grill" | Enqueues `IDEA_GRILL` job; worker reads `prompt_text` + `idea.grill_md` |
| `GRILLING → GRILLED` | `update_idea_grill_md` | Logs `IdeaLog{GRILL_RESULT}` |
| `* → GRILL_FAILED` | `update_job_status('failed')` for `IDEA_GRILL` | Idea remains usable; user can retry |
| `GRILLED → PLANNING` | User clicks "Make plan" | Enqueues `IDEA_MAKE_PLAN`; worker outputs strict YAML-frontmatter |
| `PLANNING → PLAN_READY` | `update_idea_plan_md` parse ok | Logs `IdeaLog{PLAN_RESULT}` |
| `PLANNING → PLAN_FAILED` | `parsePlanMd` rejected | Logs `IdeaLog{JOB_EVENT, errors}` |
| `PLAN_READY → PLANNED` | PBI tree generated from plan | Idea is archived; PBI/Story/Task tree appears in the backlog |

For the full Idea workflow, prompts, and `prompt_text` contents, see [`docs/plans/M12-ideas.md`](../plans/M12-ideas.md).
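
The link between a failed job and the idea's failure state can be sketched as a small mapping. This is an illustrative assumption about how the roll-over might be expressed: the function name `ideaStatusOnJobFailure` is hypothetical, and returning `null` for the other job kinds simply means "this chapter does not describe an idea-status change for them".

```ts
// Hypothetical sketch of the failure roll-over described in the tables above.
type JobKind =
  | "TASK_IMPLEMENTATION"
  | "IDEA_GRILL"
  | "IDEA_MAKE_PLAN"
  | "PLAN_CHAT"
  | "SPRINT_IMPLEMENTATION";

type IdeaFailureStatus = "GRILL_FAILED" | "PLAN_FAILED";

function ideaStatusOnJobFailure(kind: JobKind): IdeaFailureStatus | null {
  switch (kind) {
    case "IDEA_GRILL":
      return "GRILL_FAILED";
    case "IDEA_MAKE_PLAN":
      return "PLAN_FAILED";
    default:
      // No idea-status change is documented here for the remaining kinds.
      return null;
  }
}
```
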

## DB vs API mapping

> **Hardstop:** never bypass [`lib/task-status.ts`](../../lib/task-status.ts).

The database stores enums in `UPPER_SNAKE` (`TO_DO`, `IN_PROGRESS`, `IN_SPRINT`, …) because Prisma + PostgreSQL prefer that convention. The REST API exposes them in `lowercase` (`todo`, `in_progress`, `in_sprint`, …) because that's the convention HTTP consumers expect.

The two are mapped **only** through the helpers in [`lib/task-status.ts`](../../lib/task-status.ts):

```ts
taskStatusToApi(status)        // DB → API
taskStatusFromApi(input)       // API → DB (returns null on bad input)
storyStatusToApi(status)
storyStatusFromApi(input)
pbiStatusToApi(status)
pbiStatusFromApi(input)
sprintStatusToApi(status)
sprintStatusFromApi(input)
sprintRunStatusToApi(status)
sprintRunStatusFromApi(input)
```

Bad input on the inbound side (`*FromApi`) returns `null` — the route handler converts that to a `422` Zod-style error. See [`docs/adr/0004-status-enum-mapping.md`](../adr/0004-status-enum-mapping.md) for the rationale.
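
To show why plain case conversion is not enough, here is a minimal sketch of what one helper pair could look like. The table-driven implementation is an assumption for illustration; the real helpers live in `lib/task-status.ts` and may be written differently. Only the function names and the DB/API values are taken from this chapter.

```ts
// Hypothetical sketch; the authoritative implementation is lib/task-status.ts.
type DbTaskStatus = "TO_DO" | "IN_PROGRESS" | "REVIEW" | "DONE" | "FAILED";
type ApiTaskStatus = "todo" | "in_progress" | "review" | "done" | "failed";

// Explicit table: TO_DO maps to "todo", which "TO_DO".toLowerCase() ("to_do") would get wrong.
const TASK_DB_TO_API: Record<DbTaskStatus, ApiTaskStatus> = {
  TO_DO: "todo",
  IN_PROGRESS: "in_progress",
  REVIEW: "review",
  DONE: "done",
  FAILED: "failed",
};

function taskStatusToApi(status: DbTaskStatus): ApiTaskStatus {
  return TASK_DB_TO_API[status];
}

function taskStatusFromApi(input: string): DbTaskStatus | null {
  const keys = Object.keys(TASK_DB_TO_API) as DbTaskStatus[];
  for (const key of keys) {
    if (TASK_DB_TO_API[key] === input) return key;
  }
  return null; // unknown input: the caller turns this into an error response
}
```

The `TO_DO` / `todo` pair is the concrete case where `.toLowerCase()` would silently produce a value the API does not recognise.
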

## What's next

→ [03 — Git Workflow](./03-git-workflow.md) covers branching, commits, and the cost-driven PR rules.

99
docs/manual/03-git-workflow.md
Normal file
@ -0,0 +1,99 @@
---
title: "Git Workflow"
status: active
audience: [contributor]
language: en
last_updated: 2026-05-07
when_to_read: "Before creating a branch, before committing, and especially before pushing or opening a PR."
---

# 03 — Git Workflow

The Scrum4Me git workflow is shaped by two pressures that don't usually appear together:

1. An **AI agent** that can produce many commits per hour without human review,
2. A **Vercel Hobby plan** that meters preview deployments and bills for them.

These two together drive a workflow that looks unusual compared to "feature-branch + PR-per-story". This chapter explains the *why*; the authoritative *how* lives in the runbooks linked at the bottom.

## The five guiding rules

### 1. One branch per milestone, not per story

A milestone (e.g. `M10-qr-login`) groups multiple stories that ship together. The agent runs through them on a single branch named `feat/M{N}-{slug}` (or `feat/ST-XXX-{slug}` for one-off stories without a milestone). All commits accumulate on that branch.

> **Why?** Every push to a feature branch triggers a Vercel preview build. Pushing per story would multiply the build cost without producing more reviewable units of work — the user reviews the milestone, not the story.

See [`docs/adr/0003-one-branch-per-milestone.md`](../adr/0003-one-branch-per-milestone.md) for the full rationale.
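
The naming pattern can be made concrete with a small sketch. The helper names (`toSlug`, `milestoneBranch`, `storyBranch`) and the slug rules are assumptions for illustration, as is the idea that milestone and story identifiers are plain numbers; the project may derive branch names differently or by hand.

```ts
// Hypothetical helpers: names and slug rules are assumptions, only the
// feat/M{N}-{slug} and feat/ST-XXX-{slug} patterns come from the text above.
function toSlug(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
}

function milestoneBranch(milestone: number, title: string): string {
  return `feat/M${milestone}-${toSlug(title)}`;
}

function storyBranch(storyNumber: number, title: string): string {
  return `feat/ST-${storyNumber}-${toSlug(title)}`;
}
```

For example, `milestoneBranch(10, "QR login")` yields `feat/M10-qr-login`, matching the milestone name used in the text.
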

### 2. Commit per layer, not per task

A single task can touch the database, the API, and the UI. Each of those layers gets its own commit. The pattern:

```
feat(ST-XXX): add field X to Prisma schema          # DB
feat(ST-XXX): add Y endpoint accepting X            # API
feat(ST-XXX): wire X into the editor component      # UI
chore(ST-XXX): configure sharp for X processing     # config
docs(ST-XXX): document the X feature                # docs
```

> **Why?** Reviewers and `git bisect` both benefit when one commit can be reverted without touching unrelated layers. A `feat: add profile system` mega-commit is an antipattern.

### 3. Push only after the user has tested

Commits accumulate **locally** until the milestone is functionally complete and the user has confirmed it works. Then — and only then — `git push` and `gh pr create`.

> **Why?** Same cost reason as rule 1. Mid-milestone "save points" should be local tags or `git stash`, not pushes. Some exceptions exist (planning-only PRs, emergency hotfixes); they're enumerated in [`branch-and-commit.md`](../runbooks/branch-and-commit.md#uitzonderingen-op-de-push-regel).

### 4. One PR per batch → one preview build

When the worker runs through a queue of jobs, the entire run produces **one** PR with one commit per task. No interim pushes, no force-pushes to clean up history, no PR-per-story splits.

The end-to-end verification — that one batch produces exactly one Vercel deployment — is in [`branch-and-commit.md`](../runbooks/branch-and-commit.md) (see the *End-to-end verificatie* section).

### 5. Auto-PR flow at the end

Once a story reaches `DONE`, the auto-PR flow takes over: it pushes the branch, opens a PR, waits for the scope to be complete, waits for checks, and merges. The contract for "scope complete" and the path-filter / label rules that decide whether a deploy actually runs are split between two runbooks:

- **End-to-end pipeline**: [`docs/runbooks/auto-pr-flow.md`](../runbooks/auto-pr-flow.md)
- **Selective deploy controls** (`skip-deploy` label, path-filter for `app/`/`components/`/`lib/`): [`docs/runbooks/deploy-control.md`](../runbooks/deploy-control.md)

## Commit message format

```
<type>(ST-XXX): short description
```

Where `<type>` is one of `feat`, `fix`, `chore`, `docs`. The story code in parentheses links the commit back to the Scrum4Me MCP entity.

For PBI-level work (no single story), use the PBI code: `docs(PBI-58): scaffold developer manual`.
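
A commit subject in this format can be checked mechanically. The sketch below is one possible validator, not a tool that ships with the repo: `parseCommitSubject` is a hypothetical name, and the regex assumes that story and PBI codes are `ST-` or `PBI-` followed by digits, which is inferred from examples such as `ST-1109` and `PBI-58`.

```ts
// Hypothetical validator; assumes codes look like ST-<digits> or PBI-<digits>.
const COMMIT_SUBJECT = /^(feat|fix|chore|docs)\((ST|PBI)-(\d+)\): (\S.*)$/;

interface ParsedSubject {
  type: string;
  code: string;
  description: string;
}

function parseCommitSubject(subject: string): ParsedSubject | null {
  const match = COMMIT_SUBJECT.exec(subject);
  if (match === null) return null; // wrong type, missing code, or empty description
  return {
    type: match[1],
    code: match[2] + "-" + match[3],
    description: match[4],
  };
}
```

A check like this could run in a local `commit-msg` hook, which fits the rule that problems should be caught before anything is pushed.
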

## Merge conflicts

| Scenario | Conflict? | Mitigation |
|---|---|---|
| Multiple tasks on the same batch branch | No — they stack linearly on one branch | None needed |
| Two parallel batches touching the same files | Yes, possible | Serialise batches via the MCP `get_claude_context` flow (one story at a time per agent), or rebase before push |
| Long-lived branch drifting from `main` | Yes, possible | `git fetch origin main && git rebase origin/main` before `gh pr create` |

`git push --force` to "wipe" earlier preview builds is forbidden — it costs the same build again on recreation, defeating the purpose of the cost-control rules.

## When **not** to follow the strict rules

When the Vercel account moves to Pro (or another billing tier without per-build cost), this workflow can revert to the more conventional "branch + PR per story". When that happens, update the rule in [`branch-and-commit.md`](../runbooks/branch-and-commit.md) and log the change in [`docs/decisions/agent-instructions-history.md`](../decisions/agent-instructions-history.md).

## Deep links

| Topic | Authoritative source |
|---|---|
| Branch & commit rules (full normative spec) | [`docs/runbooks/branch-and-commit.md`](../runbooks/branch-and-commit.md) |
| Auto-PR flow (story-DONE → merged-PR pipeline) | [`docs/runbooks/auto-pr-flow.md`](../runbooks/auto-pr-flow.md) |
| Deploy controls (labels, path-filter) | [`docs/runbooks/deploy-control.md`](../runbooks/deploy-control.md) |
| Vercel deployment specifics | [`docs/runbooks/deploy-vercel.md`](../runbooks/deploy-vercel.md) |
| Decision rationale (one-branch-per-milestone) | [`docs/adr/0003-one-branch-per-milestone.md`](../adr/0003-one-branch-per-milestone.md) |
| Worker idempotency & job-status protocol | [`docs/runbooks/worker-idempotency.md`](../runbooks/worker-idempotency.md) |

## What's next

→ [04 — MCP Integration](./04-mcp-integration.md) covers how the Claude agent drives this workflow from the queue side.

121
docs/manual/04-mcp-integration.md
Normal file
@ -0,0 +1,121 @@
---
title: "MCP Integration"
status: active
audience: [contributor]
language: en
last_updated: 2026-05-07
when_to_read: "Whenever Claude Code is interacting with Scrum4Me — opening a story, claiming a job, asking the user a question."
---

# 04 — MCP Integration

Scrum4Me exposes its REST API as native Claude Code tools through a dedicated **MCP server** living in [`madhura68/scrum4me-mcp`](https://github.com/madhura68/scrum4me-mcp). Schemas are shared via a git submodule (`vendor/scrum4me`) so there's exactly one definition of every type. From the agent's perspective, Scrum4Me looks like a set of native tools prefixed `mcp__scrum4me__*`.

This chapter is the **onboarding tour**. The full tool reference (all 18 tools, their parameters, and edge cases) is in [`docs/runbooks/mcp-integration.md`](../runbooks/mcp-integration.md).

## Three ways the agent works

| Mode | Triggered by | Loop |
|---|---|---|
| **Track A — MCP-driven** | User says *"implement the next story"* | `get_claude_context` → execute tasks → `update_task_status` → commit per layer → repeat until queue empty → push + PR |
| **Track B — Manual** | User describes a one-off change in chat | Read pattern + styling → edit → verify → wait for `commit it` → commit |
| **Worker — Queue-driven** | Background worker container running on a Mac/NAS | `wait_for_job` (blocks ≤600s) → switch on `kind` → execute → `update_job_status` → loop forever |

CLAUDE.md describes Track A and Track B; this manual focuses on the **Worker** mode because it's the most novel and the most likely to surprise a new contributor reading server logs.

## A typical Track A run

```mermaid
sequenceDiagram
    participant U as User
    participant C as Claude
    participant M as MCP server
    participant DB as Postgres

    U->>C: "implement the next story"
    C->>M: get_claude_context(product_id)
    M->>DB: SELECT product, sprint, next story, tasks
    M-->>C: { story, tasks[], pbi, sprint }
    loop per task in sort_order
        C->>M: update_task_status(task_id, 'in_progress')
        C->>C: read pattern + styling, edit files
        C->>M: log_implementation(story_id, content)
        C->>M: update_task_status(task_id, 'review')
        C->>M: log_test_result(story_id, 'PASSED')
        C->>M: update_task_status(task_id, 'done')
    end
    C->>U: "milestone ready for your test"
    U->>C: "looks good, push it"
    C->>C: git push + gh pr create
```

The contract every step relies on:

- All inputs are **lowercase API enums** (`'in_progress'`, never `'IN_PROGRESS'`); the MCP server applies [`lib/task-status.ts`](../../lib/task-status.ts) under the hood.
- Status writes are **forbidden for demo accounts** — they return `403`. See [02 — Statuses](./02-statuses-and-transitions.md#db-vs-api-mapping) and [`docs/adr/0006-demo-user-three-layer-policy.md`](../adr/0006-demo-user-three-layer-policy.md).
- Bearer tokens are bound to a product. `list_products` returns only what the token can see; `get_claude_context` is product-scoped.

## Idea jobs vs task implementation
|
||||||
|
|
||||||
|
The worker's `wait_for_job` call returns a payload with a `kind` discriminator. The agent must switch on it:
|
||||||
|
|
||||||
|
| `kind` | Behaviour |
|---|---|
| `TASK_IMPLEMENTATION` | Default. Execute the `implementation_plan`, follow the [git workflow](./03-git-workflow.md), end with `update_job_status('done')`. |
| `IDEA_GRILL` | Read embedded `prompt_text` + existing `idea.grill_md`. Iterate with `ask_user_question` / `get_question_answer`. End with `update_idea_grill_md(markdown)`. |
| `IDEA_MAKE_PLAN` | Read `prompt_text` + `idea.grill_md`. **Do not ask questions** — single-pass output in strict YAML-frontmatter. End with `update_idea_plan_md(markdown)`. Server-side parser may reject → `PLAN_FAILED`. |
| `PLAN_CHAT` | Conversational refinement loop on an existing plan (M12+). |
| `SPRINT_IMPLEMENTATION` | Sprint-level run that cascades through every task; `update_task_status` calls must include the `sprint_run_id`. |
|
||||||
|
|
||||||
|
For the full Idea state machine (DRAFT → GRILLING → … → PLANNED) see [02 — Statuses & Transitions § Idea](./02-statuses-and-transitions.md#idea).
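
The switch on `kind` described in the table can be sketched as a TypeScript discriminated union. The payload shapes below are assumptions: only `kind` and the field names quoted in the table come from this chapter, `WorkerJob` and `closingTool` are hypothetical names, and the closing tool for `PLAN_CHAT` and `SPRINT_IMPLEMENTATION` is a guess because the table does not name one.

```typescript
// Hypothetical payload shapes; the real wait_for_job payload may carry more fields.
type WorkerJob =
  | { kind: "TASK_IMPLEMENTATION"; implementation_plan: string }
  | { kind: "IDEA_GRILL"; prompt_text: string }
  | { kind: "IDEA_MAKE_PLAN"; prompt_text: string }
  | { kind: "PLAN_CHAT" }
  | { kind: "SPRINT_IMPLEMENTATION"; sprint_run_id: string };

// Returns the MCP tool that ends a run of each kind, following the table above.
function closingTool(job: WorkerJob): string {
  switch (job.kind) {
    case "TASK_IMPLEMENTATION":
      return "update_job_status";
    case "IDEA_GRILL":
      return "update_idea_grill_md";
    case "IDEA_MAKE_PLAN":
      return "update_idea_plan_md";
    case "PLAN_CHAT":
    case "SPRINT_IMPLEMENTATION":
      // Assumption: the table names no closing tool for these kinds.
      return "update_job_status";
    default: {
      // Compile-time exhaustiveness check: a new kind without a case fails to type-check.
      const unreachable: never = job;
      throw new Error(`Unhandled job kind: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

The `never` assignment in the default branch means that adding a sixth `kind` to the union without handling it becomes a compile error rather than a silent fall-through.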
|
||||||
|
|
||||||
|
## The Q&A channel
|
||||||
|
|
||||||
|
When Claude needs a human decision mid-run, it doesn't block silently — it posts a question through the MCP and either polls or returns control:
|
||||||
|
|
||||||
|
```mermaid
sequenceDiagram
    participant C as Claude
    participant M as MCP
    participant DB as Postgres
    participant U as User (NavBar bell)
    C->>M: ask_user_question({ story_id, question, wait_seconds: 600 })
    M->>DB: INSERT user_question#59; NOTIFY user_question_created
    DB-->>U: SSE event → bell pulses
    U->>M: POST /api/questions/:id/answer
    M->>DB: UPDATE user_question#59; NOTIFY user_question_answered
    DB-->>C: ask_user_question returns { answer }
    C->>C: continue execution
```
|
||||||
|
|
||||||
|
Key facts:
|
||||||
|
|
||||||
|
- `wait_seconds` is capped at 600. If the user doesn't answer in time, `ask_user_question` returns with status `pending`; Claude can resume later via `get_question_answer(question_id)`.
|
||||||
|
- Idea questions (`{ idea_id }` instead of `{ story_id }`) are **user-private** — they bypass `productAccessFilter`, so collaborators don't see them.
|
||||||
|
- A question can be cancelled by the asker via `cancel_question`.
|
||||||
|
|
||||||
|
The persistent design (table + `LISTEN/NOTIFY`) is documented in [`docs/architecture/claude-question-channel.md`](../architecture/claude-question-channel.md).
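
The timeout behaviour from the key facts above can be sketched as two small pure functions. The result shape is an assumption: this chapter only states that the call either yields an answer or returns with status `pending`, so the `"answered"` label, the `AskResult` and `NextStep` types, and both function names are hypothetical. Whether the server clamps or rejects an oversized `wait_seconds` is also not stated here; the sketch clamps on the client side.

```typescript
const MAX_WAIT_SECONDS = 600; // cap stated in this chapter

// Hypothetical result shape for ask_user_question.
type AskResult =
  | { status: "answered"; question_id: string; answer: string }
  | { status: "pending"; question_id: string };

type NextStep =
  | { action: "continue"; answer: string }
  | { action: "poll_later"; tool: "get_question_answer"; question_id: string };

// Keep the requested wait inside the 0..600 second window.
function clampWaitSeconds(requested: number): number {
  return Math.max(0, Math.min(requested, MAX_WAIT_SECONDS));
}

// Decide what the agent does once ask_user_question has returned.
function nextStep(result: AskResult): NextStep {
  if (result.status === "answered") {
    return { action: "continue", answer: result.answer };
  }
  // No answer within the wait window: resume later via get_question_answer.
  return { action: "poll_later", tool: "get_question_answer", question_id: result.question_id };
}
```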
|
||||||
|
|
||||||
|
## The worker's pre-flight quota check
|
||||||
|
|
||||||
|
The worker doesn't blindly call `wait_for_job`. Each iteration it first checks Anthropic API quota via `bin/worker-quota-probe.sh` so it doesn't burn a 10-minute block on a queue it can't actually process. The full algorithm — settings, `worker_heartbeat` SSE event, sleep-until-reset — is in [`docs/runbooks/mcp-integration.md`](../runbooks/mcp-integration.md#pre-flight-quota-check-m13). The Docker chapter ([05](./05-docker.md#quota-probe)) shows how to test it locally.
|
||||||
|
|
||||||
|
## Schema-drift watchdog
|
||||||
|
|
||||||
|
If Scrum4Me's Prisma schema changes but `scrum4me-mcp` isn't synced, the MCP server will fail at runtime — not at deploy. To prevent that, a remote agent runs every Monday at 08:00 Amsterdam time, syncs `vendor/scrum4me`, and runs `prisma:generate` + `tsc --noEmit` in `scrum4me-mcp`. Drift reports must be resolved **before** any Scrum4Me PR with schema changes can merge. See [`docs/runbooks/mcp-integration.md`](../runbooks/mcp-integration.md#schema-drift-bewaking).
|
||||||
|
|
||||||
|
## Deep links
|
||||||
|
|
||||||
|
| Topic | Authoritative source |
|---|---|
| Tool reference (all 18 tools) | [`docs/runbooks/mcp-integration.md`](../runbooks/mcp-integration.md) |
| Worker idempotency & job-status protocol | [`docs/runbooks/worker-idempotency.md`](../runbooks/worker-idempotency.md) |
| Q&A channel architecture (table + LISTEN/NOTIFY) | [`docs/architecture/claude-question-channel.md`](../architecture/claude-question-channel.md) |
| Idea-layer plan & prompts | [`docs/plans/M12-ideas.md`](../plans/M12-ideas.md) |
| Sprint execution modes (PER_TASK vs SPRINT_BATCH) | [`docs/architecture/sprint-execution-modes.md`](../architecture/sprint-execution-modes.md) |
| Realtime NOTIFY payload contract | [`docs/patterns/realtime-notify-payload.md`](../patterns/realtime-notify-payload.md) |
| Demo-user write protection | [`docs/adr/0006-demo-user-three-layer-policy.md`](../adr/0006-demo-user-three-layer-policy.md) |
|
||||||
|
|
||||||
|
## What's next
|
||||||
|
|
||||||
|
→ [05 — Docker](./05-docker.md) covers how the worker container is run, debugged, and operated.
|
||||||
149
docs/manual/05-docker.md
Normal file
|
|
@ -0,0 +1,149 @@
|
||||||
|
---
title: "Docker"
status: active
audience: [contributor]
language: en
last_updated: 2026-05-07
when_to_read: "Before running the worker locally, debugging a stuck job, or operating the Mac/NAS deployment."
---
|
||||||
|
|
||||||
|
# 05 — Docker
|
||||||
|
|
||||||
|
This chapter is the contributor's tour of the Docker side of Scrum4Me. Two important up-front facts:
|
||||||
|
|
||||||
|
1. **The Next.js app is not containerised.** The web UI, API routes, server actions, and database connection all run on **Vercel** (serverless functions + Edge runtime). There is no `Dockerfile` in this repo and no `docker-compose.yml`.
|
||||||
|
2. **Only the worker is containerised.** The "worker" is a Claude Code agent in a long-running container that polls the Scrum4Me job queue via MCP and executes `TASK_IMPLEMENTATION` / `IDEA_GRILL` / `IDEA_MAKE_PLAN` / `SPRINT_IMPLEMENTATION` jobs.
|
||||||
|
|
||||||
|
The container image and its supporting scripts live in a **separate repo**: [`madhura68/scrum4me-docker`](https://github.com/madhura68/scrum4me-docker). This manual documents the consumer side — what the worker is, how it relates to Scrum4Me, and how to diagnose issues. The container internals (Dockerfile, entrypoint, agent provisioning) are out of scope for this manual; see that repo's README.
|
||||||
|
|
||||||
|
> **Note:** A separate sandbox repo `scrum4me-sbx` ([`SC-4`](https://github.com/madhura68/scrum4me-sbx)) exists for Docker exploration. Treat it as a scratchpad, not as the production worker.
|
||||||
|
|
||||||
|
## Topology
|
||||||
|
|
||||||
|
```mermaid
flowchart LR
    subgraph Vercel
        App[Next.js app<br/>+ API routes]
    end
    subgraph Neon
        DB[(Postgres)]
    end
    subgraph Mac["Mac (default) / NAS (opt-in)"]
        Worker[Worker container<br/>Claude Code + MCP]
    end
    Worker -- MCP over HTTPS --> App
    App -- Prisma --> DB
    Worker -- git push --> GH[GitHub]
    GH -- webhooks --> App
```
|
||||||
|
|
||||||
|
- The worker **never connects to the database directly**. All state changes go through MCP tools, which call the Vercel-hosted REST API, which writes to Neon via Prisma.
|
||||||
|
- The worker **does** push commits directly to GitHub. GitHub then notifies Vercel and the auto-PR flow ([03 — Git Workflow](./03-git-workflow.md)) takes over.
|
||||||
|
|
||||||
|
## Mac vs NAS
|
||||||
|
|
||||||
|
| Flow | When to use | Status |
|---|---|---|
| **Mac-native (arm64)** | Default for development and small teams | Active |
| **NAS** | Self-hosted always-on worker on a Synology / Asustor / similar | Opt-in, validated by historical smoke tests in [`docs/docker-smoke/`](../docker-smoke/) |
|
||||||
|
|
||||||
|
The Mac flow is the default because it doesn't require dedicated hardware. The container runs natively on Apple Silicon (arm64) — no x86 emulation overhead.
|
||||||
|
|
||||||
|
## Environment variables the worker needs
|
||||||
|
|
||||||
|
The worker container needs **only** what's required to authenticate to MCP and push to GitHub:
|
||||||
|
|
||||||
|
| Var | Purpose |
|---|---|
| `SCRUM4ME_BEARER_TOKEN` | Bearer token bound to a product. Returned by the user's API-token settings page. |
| `SCRUM4ME_BASE_URL` | Usually `https://scrum4me.vercel.app` (or the user's domain). |
| `GITHUB_TOKEN` | Personal access token with `repo` scope, used by `git push` and `gh pr create`. |
| `ANTHROPIC_API_KEY` | The Claude API key used by the worker process. |
| `MIN_QUOTA_PCT` | Optional. Worker pauses if Anthropic quota drops below this percentage. |
|
||||||
|
|
||||||
|
> **Hardstop:** the worker does **not** need `DATABASE_URL`, `SESSION_SECRET`, or `CRON_SECRET`. Those belong to the Next.js app; the worker only talks to MCP. If you find yourself adding DB env vars to the worker, stop — you're solving the wrong problem.
|
||||||
|
|
||||||
|
The full list and provisioning instructions live in the [`scrum4me-docker` README](https://github.com/madhura68/scrum4me-docker). **TODO:** link to specific sections of that README once it's stable.
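
The env-var contract above (four required variables, one optional, three that must stay out of the worker) can be sketched as a pre-flight check. This is a hypothetical helper, not something the `scrum4me-docker` repo is known to ship: `checkWorkerEnv` and `EnvReport` are illustrative names, and the function takes a plain record so it can be exercised without touching the real process environment.

```typescript
// Variable names are taken from the table and the Hardstop note in this chapter.
const REQUIRED_VARS = ["SCRUM4ME_BEARER_TOKEN", "SCRUM4ME_BASE_URL", "GITHUB_TOKEN", "ANTHROPIC_API_KEY"];
const FORBIDDEN_VARS = ["DATABASE_URL", "SESSION_SECRET", "CRON_SECRET"];

interface EnvReport {
  missing: string[]; // required vars that are unset or empty
  forbidden: string[]; // app-only vars that leaked into the worker
  ok: boolean;
}

// Hypothetical pre-flight check: pass in a record of env values to inspect.
function checkWorkerEnv(env: Record<string, string | undefined>): EnvReport {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  const forbidden = FORBIDDEN_VARS.filter((name) => env[name] !== undefined);
  return { missing, forbidden, ok: missing.length === 0 && forbidden.length === 0 };
}
```

Treating a present `DATABASE_URL` as a failure rather than ignoring it mirrors the Hardstop: the worker should fail loudly when it is handed credentials it is never supposed to use.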
|
||||||
|
|
||||||
|
## What the worker loop does, on a single iteration
|
||||||
|
|
||||||
|
```mermaid
sequenceDiagram
    participant W as Worker
    participant Q as worker-quota-probe.sh
    participant M as MCP server
    W->>Q: probe Anthropic quota
    Q-->>W: { pct, reset_at_iso }
    alt pct < MIN_QUOTA_PCT
        W->>M: worker_heartbeat(pct, last_quota_check_at)
        W->>W: sleep until reset_at_iso (cap 1h)
    else quota ok
        W->>M: worker_heartbeat(pct, last_quota_check_at)
        W->>M: wait_for_job (block ≤600s, claim atomically)
        alt queue empty
            W->>W: continue (no work, loop again)
        else got job
            W->>W: execute by `kind`
            W->>M: update_job_status(done|failed)
        end
    end
    Note over W: continue forever
```
|
||||||
|
|
||||||
|
The loop is described authoritatively in [`docs/runbooks/mcp-integration.md`](../runbooks/mcp-integration.md#batch-loop-verplichte-agent-flow) and [`docs/runbooks/worker-idempotency.md`](../runbooks/worker-idempotency.md).
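
The quota branch of the diagram above can be sketched as a pure decision function. It is an illustration under stated assumptions, not the worker's actual code: `decideIteration`, `QuotaProbe`, and `IterationDecision` are hypothetical names, and falling back to the full one-hour cap when `reset_at_iso` cannot be parsed is a choice made for this sketch. The heartbeat is sent in both branches and is therefore left out of the decision.

```typescript
const MAX_SLEEP_MS = 60 * 60 * 1000; // "cap 1h" from the diagram

interface QuotaProbe {
  pct: number;
  reset_at_iso: string;
}

type IterationDecision =
  | { action: "sleep"; ms: number }
  | { action: "wait_for_job" };

// Decide what one loop iteration does after the quota probe has returned.
function decideIteration(probe: QuotaProbe, minQuotaPct: number, nowMs: number): IterationDecision {
  if (probe.pct < minQuotaPct) {
    const untilReset = Date.parse(probe.reset_at_iso) - nowMs;
    // Assumption: an unparseable reset time falls back to the full cap.
    const wanted = isNaN(untilReset) ? MAX_SLEEP_MS : untilReset;
    return { action: "sleep", ms: Math.min(Math.max(wanted, 0), MAX_SLEEP_MS) };
  }
  return { action: "wait_for_job" };
}
```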
|
||||||
|
|
||||||
|
### Quota probe
|
||||||
|
|
||||||
|
`bin/worker-quota-probe.sh` (in `scrum4me-docker`) makes a tiny call to the Anthropic API to read the current quota percentage and reset time. Cost: ~1 output token per probe, so ~12 tokens/hour at 5-minute intervals (12 probes per hour). The default `MIN_QUOTA_PCT` is **20%**, typically low enough on Pro/Max plans that the worker never pauses during normal day-job hours.
|
||||||
|
|
||||||
|
### Heartbeat
|
||||||
|
|
||||||
|
Every iteration the worker calls `worker_heartbeat({ last_quota_pct, last_quota_check_at })`. The MCP server emits an SSE event so the NavBar in the Next.js app shows the worker as live. A heartbeat older than 15 seconds is rendered as "offline" / "stand-by" in the UI.
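
The staleness rule above can be sketched as a single comparison. The function name `isWorkerLive` and the idea of passing the last heartbeat as an ISO timestamp are assumptions for illustration; only the 15-second threshold comes from this chapter.

```typescript
const HEARTBEAT_STALE_MS = 15 * 1000; // threshold stated in this chapter

// True while the last heartbeat is at most 15 seconds old.
// An unparseable timestamp yields NaN, which compares false and reads as offline.
function isWorkerLive(lastHeartbeatIso: string, nowMs: number): boolean {
  const ageMs = nowMs - Date.parse(lastHeartbeatIso);
  return ageMs <= HEARTBEAT_STALE_MS;
}
```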
|
||||||
|
|
||||||
|
### Stale-claim recovery
|
||||||
|
|
||||||
|
If a worker dies mid-job (process crash, container kill, network partition), its claimed job stays as `CLAIMED` in the database. After **30 minutes** the next `wait_for_job` call automatically requeues it (`CLAIMED → QUEUED`) before claiming a fresh one. No manual intervention is required for clean recovery.
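
The requeue-then-claim order described above can be sketched against an in-memory queue. This is a stand-in only: the real `wait_for_job` presumably does this inside the database so that the claim is atomic, and the `QueueJob` shape, its `claimedAtMs` field, and the `claimNext` name are hypothetical.

```typescript
const STALE_CLAIM_MS = 30 * 60 * 1000; // 30 minutes, as stated in this chapter

interface QueueJob {
  id: string;
  status: "QUEUED" | "CLAIMED";
  claimedAtMs: number | null;
}

// In-memory illustration of one wait_for_job claim attempt.
function claimNext(jobs: QueueJob[], nowMs: number): QueueJob | null {
  // Step 1: requeue claims that are at least 30 minutes old (CLAIMED -> QUEUED).
  for (const job of jobs) {
    if (job.status === "CLAIMED" && job.claimedAtMs !== null && nowMs - job.claimedAtMs >= STALE_CLAIM_MS) {
      job.status = "QUEUED";
      job.claimedAtMs = null;
    }
  }
  // Step 2: claim the first queued job, if there is one.
  for (const job of jobs) {
    if (job.status === "QUEUED") {
      job.status = "CLAIMED";
      job.claimedAtMs = nowMs;
      return job;
    }
  }
  return null;
}
```

Running the requeue pass before the claim pass is what lets a crashed worker's job be picked up by the very call that detects it.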
|
||||||
|
|
||||||
|
When you **do** need to manually requeue a job (e.g. you killed it intentionally and don't want to wait 30 min), the operator route is the admin board → "Requeue job" button. **TODO:** confirm the exact UI path; this is not yet documented in `docs/runbooks/`.
|
||||||
|
|
||||||
|
## Running the worker locally
|
||||||
|
|
||||||
|
The intended local workflow is **Mac-native Docker**, the project's default target (recorded in the `project_docker_default_target` memory). High-level steps (verify against the [scrum4me-docker README](https://github.com/madhura68/scrum4me-docker) for exact commands):
|
||||||
|
|
||||||
|
1. Clone `scrum4me-docker` next to `Scrum4Me/` (so `~/Development/Scrum4Me/scrum4me-docker/`).
|
||||||
|
2. Provision the env vars above (typically a `.env` file in that repo, **not committed**).
|
||||||
|
3. `docker build` the image and `docker run` it with the env file mounted.
|
||||||
|
4. Watch container logs for the heartbeat/quota cycle.
|
||||||
|
5. Trigger a job from the UI ("Voer alle uit", Dutch for "Run all", on the Solo Board) and verify the worker picks it up within ~5 seconds.
|
||||||
|
|
||||||
|
> **TODO:** once the `scrum4me-docker` README has stabilised, replace the numbered steps above with copy-paste-ready commands. Until then, defer to that repo for canonical instructions.
|
||||||
|
|
||||||
|
## Debugging a stuck worker
|
||||||
|
|
||||||
|
| Symptom | Likely cause | Fix |
|---|---|---|
| Worker shows offline in NavBar but container is running | `worker_heartbeat` not reaching MCP | Check `SCRUM4ME_BASE_URL` and `SCRUM4ME_BEARER_TOKEN`; tail container logs for HTTP errors |
| Worker logs say "stand-by" indefinitely | `pct < MIN_QUOTA_PCT` and reset_at not reached | Lower `MIN_QUOTA_PCT` for testing, or wait for the printed `reset_at_iso` |
| Job stuck `CLAIMED` for >30 min | Worker died mid-job | Wait — auto-requeue triggers on next `wait_for_job` |
| Worker claims job but never updates status | Crashed before `update_job_status`; container restarted in a loop | Check `docker logs`; the next `wait_for_job` will requeue stale claims |
| `update_job_status` returns `403` | Bearer token doesn't match `claimed_by_token_id` | The token was rotated mid-run; restart with fresh token |
|
||||||
|
|
||||||
|
For deeper troubleshooting see [06 — Troubleshooting](./06-troubleshooting.md).
|
||||||
|
|
||||||
|
## Smoke-test references
|
||||||
|
|
||||||
|
Historical Docker smoke tests live in [`docs/docker-smoke/`](../docker-smoke/). They validated the worktree-isolation + branch-per-story flow when the Docker worker was first introduced. They are **historical** — don't expect them to be runnable as-is — but they're a useful reference when you want to verify the same flow on a new container image.
|
||||||
|
|
||||||
|
## Deep links
|
||||||
|
|
||||||
|
| Topic | Source |
|---|---|
| Container image, Dockerfile, build | [`scrum4me-docker` repo](https://github.com/madhura68/scrum4me-docker) |
| Worker loop & quota check | [`docs/runbooks/mcp-integration.md`](../runbooks/mcp-integration.md#pre-flight-quota-check-m13) |
| Worker idempotency / job-status protocol | [`docs/runbooks/worker-idempotency.md`](../runbooks/worker-idempotency.md) |
| Historical smoke tests | [`docs/docker-smoke/`](../docker-smoke/) |
| Sandbox / exploration repo | [`scrum4me-sbx` repo](https://github.com/madhura68/scrum4me-sbx) |
|
||||||
|
|
||||||
|
## What's next
|
||||||
|
|
||||||
|
→ [06 — Troubleshooting](./06-troubleshooting.md) covers error codes and recovery procedures across the full stack.
|
||||||
112
docs/manual/06-troubleshooting.md
Normal file
|
|
@ -0,0 +1,112 @@
|
||||||
|
---
title: "Troubleshooting"
status: active
audience: [contributor]
language: en
last_updated: 2026-05-07
when_to_read: "When something breaks. Start with the symptom table; fall back to the error-code reference."
---
|
||||||
|
|
||||||
|
# 06 — Troubleshooting
|
||||||
|
|
||||||
|
This chapter is the **first place to look** when something is wrong. Each row links to the authoritative source so you can dig deeper without losing your trail.
|
||||||
|
|
||||||
|
## Error code reference
|
||||||
|
|
||||||
|
These three HTTP status codes are non-negotiable hardstops in the API surface — they always mean the same thing across every route handler.
|
||||||
|
|
||||||
|
| Code | Meaning | Where it comes from |
|---|---|---|
| **`400`** | JSON parse error | Body couldn't be parsed as JSON. Usually a malformed request from a client. |
| **`422`** | Zod validation error | Body parsed, but failed schema validation. Response includes the offending field path. |
| **`403`** | Demo-user write blocked | Authenticated user `is_demo = true` attempted a write. Three layers enforce this — see [`docs/adr/0006-demo-user-three-layer-policy.md`](../adr/0006-demo-user-three-layer-policy.md). |
|
||||||
|
|
||||||
|
> **Hardstop:** these codes are reserved. Do not use `400` for validation errors or `422` for unauthorised access. The contract is enforced at the route-handler level — see the [Route Handler pattern](../patterns/route-handler.md).
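
How the three reserved codes stay separate can be sketched as a single write handler. Everything here is illustrative: `handleWrite`, `Caller`, and `WriteOutcome` are hypothetical names, the hand-written `title` check stands in for a real Zod schema so the example has no dependencies, the order of the checks is an assumption, and `200` is only a placeholder for the success case.

```typescript
interface Caller {
  is_demo: boolean;
}

type WriteOutcome =
  | { status: 400 | 403 | 422; error: string }
  | { status: 200; title: string };

// Hypothetical write handler showing which failure maps to which reserved code.
function handleWrite(caller: Caller, rawBody: string): WriteOutcome {
  // 403: demo users may never write (check order is an assumption of this sketch).
  if (caller.is_demo) {
    return { status: 403, error: "demo user: writes are blocked" };
  }
  // 400: the body is not JSON at all.
  let body: unknown;
  try {
    body = JSON.parse(rawBody);
  } catch (_err) {
    return { status: 400, error: "body is not valid JSON" };
  }
  // 422: the body parsed but fails validation (stand-in for a Zod schema).
  if (typeof body !== "object" || body === null) {
    return { status: 422, error: "body: expected an object" };
  }
  const title = (body as { title?: unknown }).title;
  if (typeof title !== "string" || title.length === 0) {
    return { status: 422, error: "title: required non-empty string" };
  }
  return { status: 200, title };
}
```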
|
||||||
|
|
||||||
|
Other common codes:
|
||||||
|
|
||||||
|
| Code | Meaning |
|---|---|
| `401` | No session / invalid bearer token |
| `404` | Resource not found, or token does not have access |
| `409` | State conflict — e.g. trying to claim a job that's already `CLAIMED` |
| `429` | Rate-limited — typically the Anthropic quota cap, not Scrum4Me itself |
| `500` | Unhandled server error. Always check Vercel function logs. |
|
||||||
|
|
||||||
|
## Symptom → cause → fix
|
||||||
|
|
||||||
|
### MCP
|
||||||
|
|
||||||
|
| Symptom | Likely cause | Fix |
|---|---|---|
| `mcp__scrum4me__get_claude_context` returns `null` or empty story | Bearer token doesn't have access to that product | Run `mcp__scrum4me__list_products` to confirm scope; rotate the token if needed |
| `mcp__scrum4me__update_task_status` returns `403` | Demo user, or token mismatch in a sprint run | Check user identity; if inside a sprint run, the bearer token must match `claimed_by_token_id` of the parent job |
| `mcp__scrum4me__wait_for_job` returns nothing for the full 600s block | Queue is genuinely empty | This is normal — loop and call again. See [`runbooks/mcp-integration.md`](../runbooks/mcp-integration.md#batch-loop-verplichte-agent-flow) |
| Job stays `CLAIMED` for >30 minutes | Worker died mid-job | Auto-requeue triggers on next `wait_for_job`; no manual action needed |
| `update_idea_plan_md` causes idea to flip to `PLAN_FAILED` | `parsePlanMd` server-side rejected the YAML-frontmatter | Inspect `IdeaLog{JOB_EVENT, errors}` for the parse error; re-run `IDEA_MAKE_PLAN` after fixing the prompt |
|
||||||
|
|
||||||
|
### Statuses & data integrity
|
||||||
|
|
||||||
|
| Symptom | Likely cause | Fix |
|---|---|---|
| Status displayed differently in DB vs UI | Some code path bypassed `lib/task-status.ts` | Grep the codebase for direct enum string usage; force everything through the mappers. See [`adr/0004-status-enum-mapping.md`](../adr/0004-status-enum-mapping.md) |
| Story stuck `IN_SPRINT` when all tasks are `DONE` | Auto-promotion not triggered | Check the most recent `update_task_status` call — it may have failed silently. Re-issue with the correct task |
| PBI not auto-promoting to `DONE` | Not all child stories are `DONE` yet | List stories under the PBI; one is probably still `OPEN` or `IN_SPRINT` |
| `422` from `create_pbi` / `create_story` / `create_task` | Zod validation failed (length cap, missing required field) | Response body includes field path — fix and retry |
| `IdeaStatus` stays `GRILLING` long after the worker stopped | The job ended without calling `update_idea_grill_md` | Check the worker logs for an exception; manually requeue or mark `GRILL_FAILED` to allow retry |
|
||||||
|
|
||||||
|
### Git & deploy
|
||||||
|
|
||||||
|
| Symptom | Likely cause | Fix |
|---|---|---|
| Unexpected Vercel preview build appeared mid-batch | An interim push happened that shouldn't have | Inspect `git log --all --graph` for the offending push; review [`runbooks/branch-and-commit.md`](../runbooks/branch-and-commit.md) |
| PR has multiple Vercel deployments for the same commit range | Force-push, or push-then-revert | Don't force-push. If genuinely needed, document in the PR description |
| Auto-PR didn't open after story `DONE` | Story not actually `DONE`, or auto-PR pre-conditions unmet | Walk through [`runbooks/auto-pr-flow.md`](../runbooks/auto-pr-flow.md); typically a missing `update_task_status('done')` for the last task |
| Vercel skipped the deploy entirely | `skip-deploy` label or path-filter excluded the changed paths | See [`runbooks/deploy-control.md`](../runbooks/deploy-control.md) for the rules |
| Merge conflict between two parallel batches | Two branches touched the same files | Serialise: merge the first PR before pushing the second. Then `git fetch origin main && git rebase origin/main` |
|
||||||
|
|
||||||
|
### Realtime
|
||||||
|
|
||||||
|
| Symptom | Likely cause | Fix |
|---|---|---|
| Solo Board doesn't update when status changes | SSE connection dropped, or NOTIFY payload missing fields | Reload the page; if it persists, check `DIRECT_URL` (LISTEN/NOTIFY needs the pooler-bypass URL). See [`patterns/realtime-notify-payload.md`](../patterns/realtime-notify-payload.md) |
| NavBar bell doesn't pulse on new question | SSE/event channel mismatched, or payload missing required fields | Confirm the question was actually inserted (`mcp__scrum4me__list_open_questions`); inspect the Network tab for the SSE connection |
| Worker shows offline despite a running container | `worker_heartbeat` not reaching MCP | Verify `SCRUM4ME_BASE_URL` and bearer token; tail container logs |
|
||||||
|
|
||||||
|
### Auth & sessions
|
||||||
|
|
||||||
|
| Symptom | Likely cause | Fix |
|---|---|---|
| Login redirects in a loop | Session cookie not set; usually `SESSION_SECRET` mismatch between deployments | Check Vercel env vars for `SESSION_SECRET` (must be ≥32 chars); see [`patterns/iron-session.md`](../patterns/iron-session.md) |
| All write buttons disabled with "Niet beschikbaar in demo-modus" ("Not available in demo mode") tooltip | You're logged in as the demo user | Log out and log in with a real account |
| `403` on a route that should be allowed | Proxy or server-action layer rejected the request | Walk through the three layers in [`adr/0006-demo-user-three-layer-policy.md`](../adr/0006-demo-user-three-layer-policy.md); each can independently say "no" |
|
||||||
|
|
||||||
|
### Build & dev-server
|
||||||
|
|
||||||
|
| Symptom | Likely cause | Fix |
|---|---|---|
| `npm run build` fails with `Cannot find module '@/...'` | TypeScript path alias mismatch | Check `tsconfig.json` `paths`; rerun `npm run prebuild` if codegen is stale |
| Mermaid diagram renders as plain text in the in-app `/manual` viewer | `MermaidBlock` not picking up `language-mermaid` | [04 — MCP Integration](./04-mcp-integration.md) won't help here; open `app/(app)/manual/_components/mermaid-block.tsx` and confirm the dynamic import uses `ssr: false` |
| "Server-only" import error in browser | A `*-server.ts` module was imported into a client component | Refactor: split server logic out, or use a server action. Hardstop in [`CLAUDE.md`](../../CLAUDE.md#hardstop-regels) |
| `npm run dev` shows hydration mismatch | Server and client render diverge, usually because of time-based or random values | Wrap in `useEffect` for client-only state, or pass server time as a prop |
|
||||||
|
|
||||||
|
## When in doubt
|
||||||
|
|
||||||
|
1. **Read the runbook.** Each runbook in [`docs/runbooks/`](../runbooks/) starts with a `when_to_read` field — match the situation.
|
||||||
|
2. **Check the ADRs.** The ADR index in [`docs/INDEX.md`](../INDEX.md) lists the rationale for every cross-cutting decision. If your fix would contradict an ADR, talk to a maintainer first.
|
||||||
|
3. **Read the agent-flow pitfalls log.** [`docs/runbooks/agent-flow-pitfalls.md`](../runbooks/agent-flow-pitfalls.md) is a living list of issues found during agent runs and how they were resolved.
|
||||||
|
4. **Look at recent commits.** `git log --oneline --since='7 days ago'` often reveals the very change that broke whatever you're debugging.
|
||||||
|
|
||||||
|
## Escalation
|
||||||
|
|
||||||
|
If after the steps above the issue is still unresolved:
|
||||||
|
|
||||||
|
- **AI agent / MCP issues** → file in the [`scrum4me-mcp` repo](https://github.com/madhura68/scrum4me-mcp).
|
||||||
|
- **Worker container issues** → file in the [`scrum4me-docker` repo](https://github.com/madhura68/scrum4me-docker).
|
||||||
|
- **App / data / status issues** → file in the [`Scrum4Me` repo](https://github.com/madhura68/Scrum4Me).
|
||||||
|
|
||||||
|
## What's next
|
||||||
|
|
||||||
|
You've reached the end of the manual. Bookmark this troubleshooting chapter — it's the most-revisited page once you're past onboarding.
|
||||||
|
|
||||||
|
Back to [index](./index.md).
|
||||||
64
docs/manual/index.md
Normal file
|
|
@ -0,0 +1,64 @@
|
||||||
|
---
title: "Scrum4Me Developer Manual"
status: active
audience: [contributor]
language: en
last_updated: 2026-05-07
when_to_read: "Onboarding to Scrum4Me as a human contributor."
---
|
||||||
|
|
||||||
|
# Scrum4Me Developer Manual
|
||||||
|
|
||||||
|
Welcome. This manual is the **map** of Scrum4Me — a guided tour through the moving parts of the project. It is written for a new human contributor who needs to understand how the pieces fit together before diving into the authoritative reference docs (the runbooks, ADRs, and patterns under [`docs/`](../INDEX.md)).
|
||||||
|
|
||||||
|
> **The manual is the map. The runbooks are the territory.**
|
||||||
|
> When two sources disagree, trust the runbook or ADR linked from this manual.
|
||||||
|
|
||||||
|
## Audience
|
||||||
|
|
||||||
|
- **New human contributors** picking up the project for the first time.
|
||||||
|
- **Returning contributors** who want a quick refresher on how a specific subsystem (statuses, git, MCP, Docker) fits into the whole.
|
||||||
|
- **Not for**: AI agents — they should follow [`CLAUDE.md`](../../CLAUDE.md) and the agent-specific runbooks under [`docs/runbooks/`](../runbooks/).
|
||||||
|
|
||||||
|
## How to read this manual
|
||||||
|
|
||||||
|
| You want to… | Read |
|---|---|
| …get the elevator pitch and project structure | [01 — Overview](./01-overview.md) |
| …understand how a PBI/Story/Task moves through its lifecycle | [02 — Statuses & Transitions](./02-statuses-and-transitions.md) |
| …know when to branch, commit, push, and open a PR | [03 — Git Workflow](./03-git-workflow.md) |
| …see how Claude Code drives stories via the MCP server | [04 — MCP Integration](./04-mcp-integration.md) |
| …run the worker container locally or understand the deploy topology | [05 — Docker](./05-docker.md) |
| …diagnose an error code, stuck job, or weird state | [06 — Troubleshooting](./06-troubleshooting.md) |
|
||||||
|
|
||||||
|
A linear read takes about 30 minutes. As a lookup reference, jump straight to a chapter — each one stands alone.
|
||||||
|
|
||||||
|
## Conventions
|
||||||
|
|
||||||
|
- **Cross-references** use relative links (`../runbooks/...`) so they work both in GitHub and inside the in-app `/manual` viewer.
|
||||||
|
- **Callouts** use blockquotes prefixed with a label: `> **Note:**`, `> **Warning:**`, `> **Hardstop:**` (a non-negotiable rule from [`CLAUDE.md`](../../CLAUDE.md)).
|
||||||
|
- **Code blocks** show shell commands with no `$` prefix, so they're copy-pasteable.
|
||||||
|
- **State diagrams** use Mermaid `stateDiagram-v2`; they render in GitHub and in the in-app viewer.
|
||||||
|
- **Status labels** are written in `UPPER_SNAKE` when referring to the database value and `lowercase` when referring to the API representation — see [02 — Statuses & Transitions](./02-statuses-and-transitions.md#db-vs-api-mapping) for the contract.
|
||||||
|
|
||||||
|
## In-app rendering
|
||||||
|
|
||||||
|
Every chapter in this manual is also browsable inside the running Scrum4Me app at `/manual`. The in-app sidebar mirrors this index, and Mermaid diagrams render in place. The markdown files under `docs/manual/` are the **source of truth** — the in-app page reads them at build time via the `scripts/build-manual.mjs` generator.
|
||||||
|
|
||||||
|
## What this manual does **not** cover
|
||||||
|
|
||||||
|
- **REST API reference** → [`docs/api/rest-contract.md`](../api/rest-contract.md)
|
||||||
|
- **Component & dialog specs** → [`docs/specs/dialogs/`](../specs/dialogs/)
|
||||||
|
- **Architecture deep-dives** → start from the [`docs/architecture.md`](../architecture.md) breadcrumb
|
||||||
|
- **Decision rationale** → [`docs/adr/`](../adr/)
|
||||||
|
- **Implementation patterns** → [`docs/patterns/`](../patterns/)
|
||||||
|
- **AI-agent instructions** → [`CLAUDE.md`](../../CLAUDE.md) and [`docs/runbooks/mcp-integration.md`](../runbooks/mcp-integration.md)
|
||||||
|
|
||||||
|
## Table of contents
|
||||||
|
|
||||||
|
1. [Overview](./01-overview.md) — what Scrum4Me is, the entity hierarchy, the stack, repository layout
|
||||||
|
2. [Statuses & Transitions](./02-statuses-and-transitions.md) — state machines for every entity
|
||||||
|
3. [Git Workflow](./03-git-workflow.md) — branching, commits, PRs, deploy controls
|
||||||
|
4. [MCP Integration](./04-mcp-integration.md) — the agent loop, idea jobs, the Q&A channel
|
||||||
|
5. [Docker](./05-docker.md) — worker container, local dev, scrum4me-docker
|
||||||
|
6. [Troubleshooting](./06-troubleshooting.md) — error codes, stuck jobs, recovery procedures
|
||||||