Scrum4Me/docs/runbooks/review-plan-job.md
Janpeter Visser d84cdf664f
feat(PBI-67): IDEA_REVIEW_PLAN — iterative multi-model plan review (#199)
* feat(ideas): upload-plan button, short-circuits the Make-Plan AI flow

Adds an 'Upload plan' button to idea-row-actions (appears in both the
list and idea-detail). Click → file picker → choose .md → server-side parse +
save; the idea status jumps to PLAN_READY. From there, the existing
'Maak PBI' (Create PBI) button handles materialization.

Server (uploadPlanMdAction):
- Allowed from DRAFT, GRILLED, PLAN_FAILED, PLAN_READY
- DRAFT → skip-grill: status goes directly to PLAN_READY
- PLAN_READY overwrites the existing plan (consistent with
  updatePlanMdAction, no confirmation)
- Blocked in GRILLING/PLANNING (job is running), PLANNED (already materialized)
- Parse failure → 422 + details (do NOT save, so an unparseable plan
  never ends up in the DB)
- Empty / >100k chars → 422
- Writes IdeaLog NOTE with from_status + length
- Rate limit + demo guard + ownership check via loadOwnedIdea (same
  pattern as updatePlanMdAction)

UI (idea-row-actions.tsx):
- Hidden <input type=file accept=".md,.markdown,text/markdown,text/plain">
- FileReader → text → action
- Toast on success + router.refresh()
- Blocked tooltip in other statuses

Tests: 10 new in __tests__/actions/ideas-crud.test.ts covering:
happy paths (DRAFT/GRILLED/PLAN_READY-overwrite/PLAN_FAILED), blocks
(PLANNED/GRILLING), validation (empty/oversized/parse-fail), 404.
Full suite green: 849/849.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add reviews for Bootstrap-wizard plans v3.2 to v3.4

- Review v3.2: Addressed executor model, fire-and-forget issues, and PAT handling.
- Review v3.3: Improved transaction handling, stale recovery, and ID generation.
- Review v3.4: Finalized GitHub permissions, catalog versioning, and E2E verification queries.
- Updated recommendations for each version to enhance implementation readiness.

* docs(plans): M8 bootstrap-wizard upload-variant v1.4: backtick paths

Upload variant of the full technical plan (docs/plans/M8-bootstrap-wizard.md),
intended for the "Upload plan" feature. Generates 1 PBI + 4 Stories + 22 Tasks
via materializeIdeaPlanAction.

v1.4 changes compared with the earlier generation iteration:
- All file paths in implementation_plan in backticks (so the path extractor matches them)
- Explicit "Bestanden:" ("Files:") block per task before the steps
- All tasks set to verify_required: ALIGNED_OR_PARTIAL (was partly ALIGNED, too strict
  for ADR stubs and multi-file edits)

Forward-only fix: T-963 cancelled_by_self due to a DIVERGENT verifier verdict.
Re-uploading this file produces tasks that verify_task_against_plan can
classify as ALIGNED or PARTIAL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* PBI-67: Add review-plan support to Idea model and job config

- Add plan_review_log and reviewed_at fields to Idea model
- Add REVIEWING_PLAN, PLAN_REVIEW_FAILED, PLAN_REVIEWED to IdeaStatus enum
- Add IDEA_REVIEW_PLAN to ClaudeJobKind enum
- Add IDEA_REVIEW_PLAN config to job-config.ts with model=opus, thinking_budget=6000
- Create migration record for schema changes (applied via db push)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* PBI-67 Phase 2: Add update-idea-plan-reviewed MCP tool

- Create src/tools/update-idea-plan-reviewed.ts: saves review-log and transitions idea status to PLAN_REVIEWED
- Add PLAN_REVIEW_RESULT to IdeaLogType enum (both repos)
- Register tool in src/index.ts
- Update Prisma schemas (both repos): add plan_review_log and reviewed_at fields to Idea model
- Add REVIEWING_PLAN, PLAN_REVIEW_FAILED, PLAN_REVIEWED to IdeaStatus enum (MCP schema)
- Add IDEA_REVIEW_PLAN to ClaudeJobKind enum (MCP schema)
- Tool includes transaction safety and convergence metrics logging

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* feat(PBI-67): IDEA_REVIEW_PLAN Phases 3-6 — server actions, UI components, prompt & tests

- Phase 3: startReviewPlanJobAction, cancelIdeaJobAction, status transitions
  (REVIEWING_PLAN / PLAN_REVIEWED / PLAN_REVIEW_FAILED), status colors,
  job-card/jobs-column filters, idea-list status tabs
- Phase 4: review-plan-job.md prompt (multi-model orchestration with codex
  injection + active plan revision via update_idea_plan_md after each round),
  runbook, 13 unit tests
- Phase 5: ReviewLogViewer component (rounds, convergence, approval, issues),
  idea-detail integration, proper ReviewLog TypeScript types exported from component
- Phase 6.1: wait-for-job discriminator wired (IDEA_REVIEW_PLAN), plan-revision
  step made mandatory in prompt (was previously optional/missing)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-14 03:35:02 +02:00


# Review-Plan Job Orchestration
> Implementation guide for the IDEA_REVIEW_PLAN job kind and multi-model iterative plan review.
---
## Overview
The review-plan job is an autonomous agent that performs iterative multi-model review of implementation plans (YAML frontmatter + markdown documents). It coordinates three review stages (structure, logic/patterns, risk assessment), detects convergence, and either approves the plan or returns it for manual refinement.
**Job Kind:** `IDEA_REVIEW_PLAN`
**Triggerable From:** `PLAN_READY`, `PLAN_REVIEWED` (re-review)
**Transitions To:** `PLAN_REVIEWED` (approved) or `PLAN_REVIEW_FAILED` (rejected/abandoned)
---
## System Design
### Data Flow
```
User clicks "Review Plan" on PLAN_READY idea
  ↓
startReviewPlanJobAction() queues IDEA_REVIEW_PLAN job
  ↓
Worker claims job via wait_for_job (MCP)
  ↓
Review-plan prompt orchestrates:
  - Round 1: Structure check (YAML parsing, format correctness)
  - Round 2: Logic & patterns (dependencies, architecture fit)
  - Round 3: Risk assessment (edge cases, refactoring, type-safety)
  ↓
Convergence detection: if stable, ask approval
  ↓
On approval: update_idea_plan_reviewed(approval_status='approved')
  → Idea transitions to PLAN_REVIEWED
  → IdeaLog entry created with PLAN_REVIEW_RESULT
On rejection: return for manual edit (status → PLAN_REVIEW_FAILED)
```
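The status changes in this flow can be read as a small state machine. The following is a minimal sketch, assuming hypothetical names (`ReviewStatus`, `REVIEW_TRANSITIONS`, `canTransition`); the actual transition graph lives in `lib/idea-status.ts` and may differ. It covers only the statuses named in this runbook, and it leaves `PLAN_REVIEW_FAILED` without outgoing edges because this document does not describe any.

```typescript
// Hypothetical sketch: only the review-related statuses named in this runbook.
type ReviewStatus =
  | 'PLAN_READY'
  | 'REVIEWING_PLAN'
  | 'PLAN_REVIEWED'
  | 'PLAN_REVIEW_FAILED';

// Allowed next statuses per current status (assumed from the flow above).
const REVIEW_TRANSITIONS: Record<ReviewStatus, readonly ReviewStatus[]> = {
  PLAN_READY: ['REVIEWING_PLAN'],            // user starts a review
  REVIEWING_PLAN: [
    'PLAN_REVIEWED',                         // approved
    'PLAN_REVIEW_FAILED',                    // rejected
    'PLAN_READY',                            // cancelled or timed out
  ],
  PLAN_REVIEWED: ['REVIEWING_PLAN'],         // re-review
  PLAN_REVIEW_FAILED: [],                    // not specified in this runbook
};

// Returns true when the sketch's graph allows moving from `from` to `to`.
function canTransition(from: ReviewStatus, to: ReviewStatus): boolean {
  return REVIEW_TRANSITIONS[from].includes(to);
}
```

A server action could call `canTransition(idea.status, 'REVIEWING_PLAN')` before queuing the job and reject the request otherwise.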
### Review-Log JSON Schema
The orchestrator produces a detailed JSON log stored in `idea.plan_review_log`:
```typescript
type ISO8601 = string; // ISO 8601 timestamp string

interface ReviewLog {
  plan_file: string;              // Idea code (e.g., "I-042")
  created_at: ISO8601;            // Review start timestamp
  rounds: Array<{
    round: number;                // 0, 1, 2 (structure, logic, risk)
    model: string;                // claude-3-5-haiku | claude-3-5-sonnet | claude-opus-4-7
    role: string;                 // "Structure Review" | "Logic & Patterns" | "Risk Assessment"
    focus: string;                // Review focus summary
    plan_before: string;          // Original plan_md at round start
    plan_after: string;           // Revised plan after feedback
    issues: Array<{
      category: 'structure' | 'logic' | 'risk' | 'pattern';
      severity: 'error' | 'warning' | 'info';
      suggestion: string;         // Concrete fix recommendation
    }>;
    score: number;                // 0-100 review score
    plan_diff_lines: number;      // Changed lines in this round
    converged: boolean;           // Did this round trigger convergence?
    timestamp: ISO8601;           // Round completion time
  }>;
  convergence?: {
    stable_at_round: number;      // Round where convergence was detected
    final_diff_pct: number;       // Percentage of changed lines at convergence
    convergence_metric: string;   // "plan_stability" (constant for now)
  };
  approval: {
    status: 'pending' | 'approved' | 'rejected';
    timestamp?: ISO8601;          // When user made decision
  };
  summary: string;                // 1-2 sentence summary for IdeaLog
}
```
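To illustrate how a consumer of this log (for example a viewer component) might read it, here is a minimal sketch that tallies issues by severity across rounds. The names `RoundIssues` and `countBySeverity` are hypothetical and not taken from the codebase; the types are a reduced subset of the schema above so the example stays self-contained.

```typescript
type Severity = 'error' | 'warning' | 'info';

// Reduced, hypothetical view of one ReviewLog round: only what the tally needs.
interface RoundIssues {
  round: number;
  issues: Array<{ severity: Severity; suggestion: string }>;
}

// Counts issues per severity over all rounds.
function countBySeverity(rounds: RoundIssues[]): Record<Severity, number> {
  const counts: Record<Severity, number> = { error: 0, warning: 0, info: 0 };
  for (const r of rounds) {
    for (const issue of r.issues) {
      counts[issue.severity] += 1;
    }
  }
  return counts;
}

// Illustrative data only; not taken from a real review.
const sampleRounds: RoundIssues[] = [
  { round: 0, issues: [{ severity: 'error', suggestion: 'Fix YAML indentation' }] },
  { round: 1, issues: [
    { severity: 'warning', suggestion: 'Clarify task dependency' },
    { severity: 'info', suggestion: 'Consider splitting the task' },
  ] },
];

const sampleCounts = countBySeverity(sampleRounds);
```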
---
## Assumptions & Constraints
### Prompt Assumptions
1. **Plan Format:** Idea's `plan_md` field contains YAML frontmatter (parsed at PLAN_READY) + markdown body.
   - Frontmatter keys: `pbi`, `stories`, `tasks`, `priority`, `verify_required`.
   - If parse fails, orchestrator transitions idea to `PLAN_REVIEW_FAILED`.
2. **Context Availability:** The job payload includes:
   - `idea.plan_md`: The plan to review (required)
   - `idea.grill_md`: Context from grill phase (optional but recommended)
   - `product.definition_of_done`: Product-level acceptance criteria
   - `repo_url`: Local repository for pattern inspection
3. **User Availability:** At least one worker is active (server-side check via `countActiveWorkers`).
4. **No External APIs:** Orchestrator performs reviews entirely with information from job context. No external codex or multi-model APIs are called directly.
   - Future improvement: Codex-injection from `docs/patterns/**/*.md` and `docs/architecture/**/*.md`.
### Convergence Detection Assumptions
1. **Stability Metric:** Two consecutive rounds with < 5% line changes = convergence.
   - Threshold is hardcoded; future: make configurable per product.
   - Diff percentage = `(changed_lines / total_lines) * 100`.
2. **Max Iterations:** 3 initial rounds + 2 optional extra rounds (total max 5) before forced approval.
3. **No Infinite Loops:** If max iterations reached, approval gate enforces a decision.
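The convergence rules above can be sketched as follows. This is a minimal illustration, not the orchestrator's actual code: the function and constant names are hypothetical, the changed-line count is taken as an input (this runbook does not say how it is computed), and treating an empty plan as 0% changed is an assumption.

```typescript
const CONVERGENCE_THRESHOLD_PCT = 5; // "< 5% line changes"
const MAX_ROUNDS = 5;                // 3 initial + 2 optional extra rounds

// Diff percentage = (changed_lines / total_lines) * 100.
function diffPct(changedLines: number, totalLines: number): number {
  if (totalLines <= 0) return 0; // assumption: an empty plan counts as unchanged
  return (changedLines / totalLines) * 100;
}

// Returns the zero-based index of the round at which two consecutive rounds
// are both below the threshold, or null if that never happens.
function detectConvergence(roundDiffPcts: number[]): number | null {
  let prev: number | null = null;
  let index = 0;
  for (const pct of roundDiffPcts) {
    if (
      prev !== null &&
      prev < CONVERGENCE_THRESHOLD_PCT &&
      pct < CONVERGENCE_THRESHOLD_PCT
    ) {
      return index;
    }
    prev = pct;
    index += 1;
  }
  return null;
}

// True once the round budget is used up and the approval gate must decide.
function mustForceDecision(roundsCompleted: number): boolean {
  return roundsCompleted >= MAX_ROUNDS;
}
```

For example, diff percentages of 12, 4 and 3 over three rounds converge at index 2, since rounds 1 and 2 are both under 5%. A sequence of 12, 4, 8 does not converge and would continue into the optional extra rounds.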
### Validation Assumptions
1. **Plan is Mutable:** Orchestrator can revise `plan_md` between rounds without breaking downstream parsing.
   - If YAML structure is corrupted, `parsePlanMd` (server-side) will fail on approval.
   - Orchestrator should never corrupt YAML syntax.
2. **IdeaLog Persistence:** MCP tool `update_idea_plan_reviewed` atomically saves:
   - `idea.plan_review_log` (full JSON)
   - `idea.reviewed_at` (timestamp)
   - `idea.status` (transition)
   - `IdeaLog` entry (audit)
3. **User Decisions are Final:** Once approved, plan-review log is immutable (until next re-review).
---
## Implementation Details
### Prompt Location
- **Main Repo:** `lib/idea-prompts/review-plan-job.md`
- **MCP Server:** `scrum4me-mcp/src/prompts/idea/review-plan.md`
- **Synchronization:** Manual (for now); future: sync-schema.sh-like mechanism.
### Job Config Snapshot
Job created with config from `lib/job-config.ts`:
```typescript
IDEA_REVIEW_PLAN: {
  model: 'claude-opus-4-7',        // Opus for final orchestration
  thinking_budget: 6000,           // Extended for multi-round analysis
  permission_mode: 'acceptEdits',
  max_turns: 1,
  allowed_tools: [
    'Read', 'Write', 'Grep', 'Glob',
    'mcp__scrum4me__update_idea_plan_reviewed',
    'mcp__scrum4me__log_idea_decision',
    'mcp__scrum4me__update_job_status',
    'mcp__scrum4me__ask_user_question',
  ],
}
```
**Note:** Model is fixed to Opus for orchestration. Individual review rounds are simulated (not actual model switching) within Opus's analysis. Future: Direct multi-model support via Claude API.
### MCP Tool: update_idea_plan_reviewed
**Location:** `scrum4me-mcp/src/tools/update-idea-plan-reviewed.ts`
**Input:**
```typescript
{
  idea_id: string;
  review_log: object; // Full ReviewLog JSON
  approval_status?: 'pending' | 'approved' | 'rejected';
}
```
**Behavior:**
1. Validates user owns idea.
2. Transitions idea status:
   - `approval_status='approved'` → `PLAN_REVIEWED`
   - `approval_status='rejected'` → `PLAN_REVIEW_FAILED`
   - Default → `PLAN_REVIEWED`
3. Saves `plan_review_log` and `reviewed_at` atomically.
4. Creates `IdeaLog` entry with type `PLAN_REVIEW_RESULT`.
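The status mapping in step 2 can be sketched as a pure function. The name `resolveIdeaStatus` is hypothetical and the real tool may structure this differently. Following the list above, a `pending` or omitted `approval_status` falls through to the default.

```typescript
type ApprovalStatus = 'pending' | 'approved' | 'rejected';
type ReviewedStatus = 'PLAN_REVIEWED' | 'PLAN_REVIEW_FAILED';

// Maps the tool's approval_status input to the idea status it transitions to.
// Only 'rejected' leads to PLAN_REVIEW_FAILED; everything else is the default.
function resolveIdeaStatus(approval?: ApprovalStatus): ReviewedStatus {
  return approval === 'rejected' ? 'PLAN_REVIEW_FAILED' : 'PLAN_REVIEWED';
}
```

Keeping the mapping separate from the database write makes it straightforward to unit-test without a Prisma client.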
---
## Dependencies
### Database
- **Idea Model:** Must have fields `plan_review_log` (Json), `reviewed_at` (DateTime).
- **IdeaStatus Enum:** Must include `REVIEWING_PLAN`, `PLAN_REVIEW_FAILED`, `PLAN_REVIEWED`.
- **IdeaLogType Enum:** Must include `PLAN_REVIEW_RESULT`.
### Server Actions
- `startReviewPlanJobAction()`: Queues job, enforces status transitions.
- `cancelIdeaJobAction()`: Allows user to cancel mid-review (reverts to `PLAN_READY`).
### MCP Tools
- `update_idea_plan_reviewed()`: Saves review-log and transitions status.
- `log_idea_decision()`: Logs convergence/approval decisions.
- `update_job_status()`: Marks job as done/failed.
- `ask_user_question()`: Approval gate interaction.
### Files
- `lib/idea-prompts/review-plan-job.md`: Orchestrator prompt.
- `scrum4me-mcp/src/prompts/idea/review-plan.md`: MCP server copy.
- `scrum4me-mcp/src/lib/kind-prompts.ts`: Prompt loader.
- `scrum4me-mcp/src/tools/wait-for-job.ts`: Job context builder.
---
## Error Handling
### Parse Failures
If `plan_md` cannot be parsed as valid YAML frontmatter:
1. Orchestrator logs error in review_log.
2. Calls `update_job_status('failed', error: 'plan_parse_failed')`.
3. Idea remains in `REVIEWING_PLAN` (no transition).
4. User can manually edit `plan_md` and retry.
### User Cancellation
If user cancels job via UI:
1. Server sets job status to `CANCELLED`.
2. Worker receives no further answer from `ask_user_question`.
3. Orchestrator gracefully saves partial review_log.
4. Calls `update_job_status('skipped', ...)`.
5. Idea reverts to `PLAN_READY`.
### Question Timeout
If approval question expires (24h):
1. Orchestrator logs timeout in review_log.
2. Calls `update_job_status('failed', error: 'approval_timeout')`.
3. Idea reverts to `PLAN_READY`.
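The three cases in this section can be summarized as a lookup table. This is a sketch under stated assumptions: `FailureKind`, `FAILURE_OUTCOMES`, and `outcomeFor` are hypothetical names, and the `user_cancelled` key is invented for illustration (the section gives error strings only for the parse and timeout cases). The values mirror the steps listed above.

```typescript
type FailureKind = 'plan_parse_failed' | 'user_cancelled' | 'approval_timeout';

interface FailureOutcome {
  jobStatus: 'failed' | 'skipped';            // argument to update_job_status
  ideaStatus: 'REVIEWING_PLAN' | 'PLAN_READY'; // where the idea ends up
}

// Outcomes as described in the Error Handling section of this runbook.
const FAILURE_OUTCOMES: Record<FailureKind, FailureOutcome> = {
  plan_parse_failed: { jobStatus: 'failed', ideaStatus: 'REVIEWING_PLAN' },
  user_cancelled: { jobStatus: 'skipped', ideaStatus: 'PLAN_READY' },
  approval_timeout: { jobStatus: 'failed', ideaStatus: 'PLAN_READY' },
};

// Looks up how a given failure should be reported and where the idea lands.
function outcomeFor(kind: FailureKind): FailureOutcome {
  return FAILURE_OUTCOMES[kind];
}
```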
---
## Testing Strategy
### Unit Tests
- **Mock ReviewLog Generation:** Verify review-log JSON structure matches schema.
- **Convergence Calculation:** Diff percentage computation, stability threshold.
- **Status Transitions:** Valid state machine paths (PLAN_READY → REVIEWING_PLAN → PLAN_REVIEWED).
### Integration Tests
- **End-to-End:** Draft idea → Grill → Plan → Review → PLAN_REVIEWED.
- **Re-Review:** PLAN_REVIEWED → REVIEWING_PLAN → PLAN_REVIEWED (no data loss).
- **Cancellation:** Mid-review cancellation → revert to PLAN_READY.
- **Parse Errors:** Malformed plan_md → PLAN_REVIEW_FAILED.
### Manual Testing
1. Create test idea with PLAN_READY status.
2. Click "Review Plan".
3. Monitor job in Jobs dashboard.
4. Verify review-log in idea detail page.
5. Accept/reject approval.
6. Confirm status transition and IdeaLog entry.
---
## Future Enhancements
1. **Direct Multi-Model Calls:** Use Claude API to invoke Haiku, Sonnet, Opus separately with model switching.
2. **Codex Injection:** Auto-load and inject `docs/patterns/**/*.md` and `docs/architecture/**/*.md` as context.
3. **Configurable Thresholds:** Allow product-level convergence percentage and max-rounds settings.
4. **Review History:** Preserve all review-logs for audit trail and re-review diffs.
5. **Feedback Loop:** Log user edits between review rounds and suggest re-run based on delta.
6. **Scheduled Re-Review:** Auto-trigger review after N days (staleness check).
---
## References
- `docs/architecture/jobs.md`: Job system architecture.
- `docs/patterns/server-action.md`: Server action pattern (startReviewPlanJobAction).
- `docs/api/rest-contract.md`: API surface for plan-review.
- `lib/idea-status.ts`: Status transition graph and state machine.
- `lib/idea-plan-parser.ts`: Plan YAML parsing (validator for approved plans).