Janpeter Visser d84cdf664f

feat(PBI-67): IDEA_REVIEW_PLAN — iterative multi-model plan review (#199)

* feat(ideas): upload-plan button, short-circuits the Make-Plan AI flow

Adds an 'Upload plan' button to idea-row-actions (shown in both the
list and idea-detail). Click → file picker → choose .md → server-side parse +
save; the idea status jumps to PLAN_READY. From there, the existing
'Maak PBI' (Create PBI) button handles materialization.

Server (uploadPlanMdAction):
- Allowed from DRAFT, GRILLED, PLAN_FAILED, PLAN_READY
- DRAFT → skip-grill: status goes directly to PLAN_READY
- PLAN_READY overwrites the existing plan (consistent with
  updatePlanMdAction, no confirmation)
- Blocked in GRILLING/PLANNING (job running), PLANNED (already materialized)
- Parse failure → 422 + details (NOT saved, so an unparseable plan
  never ends up in the DB)
- Empty / >100k chars → 422
- Writes IdeaLog NOTE with from_status + length
- Rate-limit + demo-guard + ownership check via loadOwnedIdea (same
  pattern as updatePlanMdAction)

UI (idea-row-actions.tsx):
- Hidden <input type=file accept=".md,.markdown,text/markdown,text/plain">
- FileReader → text → action
- Toast on success + router.refresh()
- Blocked tooltip in other statuses

Tests: 10 new in __tests__/actions/ideas-crud.test.ts covering:
happy paths (DRAFT/GRILLED/PLAN_READY-overwrite/PLAN_FAILED), blocks
(PLANNED/GRILLING), validation (empty/oversized/parse-fail), 404.
Full suite green: 849/849.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add reviews for Bootstrap-wizard plans v3.2 to v3.4

- Review v3.2: Addressed executor model, fire-and-forget issues, and PAT handling.
- Review v3.3: Improved transaction handling, stale recovery, and ID generation.
- Review v3.4: Finalized GitHub permissions, catalog versioning, and E2E verification queries.
- Updated recommendations for each version to enhance implementation readiness.

* docs(plans): M8 bootstrap-wizard upload variant v1.4, backticked paths

Upload variant of the full technical plan (docs/plans/M8-bootstrap-wizard.md),
intended for the "Upload plan" feature. Generates 1 PBI + 4 Stories + 22 Tasks
via materializeIdeaPlanAction.

v1.4 changes relative to the earlier generation iteration:
- All file paths in implementation_plan are in backticks (so the path extractor matches them)
- Explicit "Bestanden:" (Files) block per task before the steps
- All tasks set to verify_required: ALIGNED_OR_PARTIAL (was partly ALIGNED, which is too strict
  for ADR stubs and multi-file edits)

Forward-only fix: T-963 was cancelled_by_self due to a DIVERGENT verifier verdict.
Re-uploading this file produces tasks that verify_task_against_plan
can classify as ALIGNED or PARTIAL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* PBI-67: Add review-plan support to Idea model and job config

- Add plan_review_log and reviewed_at fields to Idea model
- Add REVIEWING_PLAN, PLAN_REVIEW_FAILED, PLAN_REVIEWED to IdeaStatus enum
- Add IDEA_REVIEW_PLAN to ClaudeJobKind enum
- Add IDEA_REVIEW_PLAN config to job-config.ts with model=opus, thinking_budget=6000
- Create migration record for schema changes (applied via db push)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* PBI-67 Phase 2: Add update-idea-plan-reviewed MCP tool

- Create src/tools/update-idea-plan-reviewed.ts: saves review-log and transitions idea status to PLAN_REVIEWED
- Add PLAN_REVIEW_RESULT to IdeaLogType enum (both repos)
- Register tool in src/index.ts
- Update Prisma schemas (both repos): add plan_review_log and reviewed_at fields to Idea model
- Add REVIEWING_PLAN, PLAN_REVIEW_FAILED, PLAN_REVIEWED to IdeaStatus enum (MCP schema)
- Add IDEA_REVIEW_PLAN to ClaudeJobKind enum (MCP schema)
- Tool includes transaction safety and convergence metrics logging

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>

* feat(PBI-67): IDEA_REVIEW_PLAN Phases 3-6 — server actions, UI components, prompt & tests

- Phase 3: startReviewPlanJobAction, cancelIdeaJobAction, status transitions
  (REVIEWING_PLAN / PLAN_REVIEWED / PLAN_REVIEW_FAILED), status colors,
  job-card/jobs-column filters, idea-list status tabs
- Phase 4: review-plan-job.md prompt (multi-model orchestration with codex
  injection + active plan revision via update_idea_plan_md after each round),
  runbook, 13 unit tests
- Phase 5: ReviewLogViewer component (rounds, convergence, approval, issues),
  idea-detail integration, proper ReviewLog TypeScript types exported from component
- Phase 6.1: wait-for-job discriminator wired (IDEA_REVIEW_PLAN), plan-revision
  step made mandatory in prompt (was previously optional/missing)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-14 03:35:02 +02:00


Review-Plan Job Orchestration

Implementation guide for the IDEA_REVIEW_PLAN job kind and multi-model iterative plan review.

Overview

The review-plan job is an autonomous agent that performs iterative multi-model review of implementation plans (YAML frontmatter + markdown documents). It coordinates three review stages (structure, logic/patterns, risk assessment), detects convergence, and either approves the plan or returns it for manual refinement.

Job Kind: IDEA_REVIEW_PLAN
Triggerable From: PLAN_READY, PLAN_REVIEWED (re-review)
Transitions To: PLAN_REVIEWED (approved) or PLAN_REVIEW_FAILED (rejected/abandoned)
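The trigger rule above can be expressed as a small guard. This is a minimal sketch, not the project's actual state machine (which the references place in lib/idea-status.ts); the type `ReviewIdeaStatus` and the function `canStartReview` are hypothetical names, and only the four review-related statuses are modelled.

```typescript
// Hypothetical subset of IdeaStatus: only the statuses this guide mentions
// for the review-plan flow. The real enum contains more values.
type ReviewIdeaStatus =
  | "PLAN_READY"
  | "REVIEWING_PLAN"
  | "PLAN_REVIEWED"
  | "PLAN_REVIEW_FAILED";

// Statuses from which IDEA_REVIEW_PLAN may be started, per "Triggerable From".
const REVIEW_TRIGGERABLE_FROM: readonly ReviewIdeaStatus[] = [
  "PLAN_READY",
  "PLAN_REVIEWED", // re-review
];

// Returns true when a review job may be queued for an idea in this status.
function canStartReview(status: ReviewIdeaStatus): boolean {
  return REVIEW_TRIGGERABLE_FROM.indexOf(status) !== -1;
}
```

A server action could call such a guard before queueing the job and reject the request otherwise.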

System Design

Data Flow

User clicks "Review Plan" on PLAN_READY idea
  ↓
startReviewPlanJobAction() queues IDEA_REVIEW_PLAN job
  ↓
Worker claims job via wait_for_job (MCP)
  ↓
Review-plan prompt orchestrates:
  - Round 1: Structure check (YAML parsing, format correctness)
  - Round 2: Logic & patterns (dependencies, architecture fit)
  - Round 3: Risk assessment (edge cases, refactoring, type-safety)
  ↓
Convergence detection: if stable, ask for approval
  ↓
On approval: update_idea_plan_reviewed(approval_status='approved')
  → Idea transitions to PLAN_REVIEWED
  → IdeaLog entry created with PLAN_REVIEW_RESULT
  ↓
On rejection: return for manual edit (status → PLAN_REVIEW_FAILED)

Review-Log JSON Schema

The orchestrator produces a detailed JSON log stored in idea.plan_review_log:

type ISO8601 = string;         // ISO 8601 timestamp string

interface ReviewLog {
  plan_file: string;           // Idea code (e.g., "I-042")
  created_at: ISO8601;         // Review start timestamp
  
  rounds: Array<{
    round: number;             // 0-based index: 0, 1, 2 (structure, logic, risk)
    model: string;             // claude-3-5-haiku | claude-3-5-sonnet | claude-opus-4-7
    role: string;              // "Structure Review" | "Logic & Patterns" | "Risk Assessment"
    focus: string;             // Review focus summary
    plan_before: string;       // Original plan_md at round start
    plan_after: string;        // Revised plan after feedback
    issues: Array<{
      category: 'structure' | 'logic' | 'risk' | 'pattern';
      severity: 'error' | 'warning' | 'info';
      suggestion: string;      // Concrete fix recommendation
    }>;
    score: number;             // 0-100 review score
    plan_diff_lines: number;   // Changed lines in this round
    converged: boolean;        // Did this round trigger convergence?
    timestamp: ISO8601;        // Round completion time
  }>;
  
  convergence?: {
    stable_at_round: number;   // Round where convergence was detected
    final_diff_pct: number;    // Percentage of changed lines at convergence
    convergence_metric: string; // "plan_stability" (constant for now)
  };
  
  approval: {
    status: 'pending' | 'approved' | 'rejected';
    timestamp?: ISO8601;       // When user made decision
  };
  
  summary: string;            // 1–2 sentence summary for IdeaLog
}
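As an illustration of how a consumer might read this structure, the sketch below tallies issues by severity across all rounds. It is only an example under stated assumptions: `RoundIssues` is a trimmed-down, hypothetical view of a round, and `countIssuesBySeverity` is not a function from the codebase.

```typescript
type Severity = "error" | "warning" | "info";

interface ReviewIssue {
  category: "structure" | "logic" | "risk" | "pattern";
  severity: Severity;
  suggestion: string;
}

// Hypothetical trimmed view of one entry in ReviewLog.rounds.
interface RoundIssues {
  round: number;
  issues: ReviewIssue[];
}

// Counts issues per severity over every round of a review log.
function countIssuesBySeverity(rounds: RoundIssues[]): Record<Severity, number> {
  const counts: Record<Severity, number> = { error: 0, warning: 0, info: 0 };
  for (const round of rounds) {
    for (const issue of round.issues) {
      counts[issue.severity] += 1;
    }
  }
  return counts;
}
```

A tally like this could feed the short `summary` field or a badge in the review-log viewer.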

Assumptions & Constraints

Prompt Assumptions

Plan Format: Idea's plan_md field contains YAML frontmatter (parsed at PLAN_READY) + markdown body.
- Frontmatter keys: pbi, stories, tasks, priority, verify_required.
- If parse fails, orchestrator transitions idea to PLAN_REVIEW_FAILED.
Context Availability: The job payload includes:
- idea.plan_md: The plan to review (required)
- idea.grill_md: Context from grill phase (optional but recommended)
- product.definition_of_done: Product-level acceptance criteria
- repo_url: Local repository for pattern inspection
Worker Availability: At least one worker is active (server-side check via countActiveWorkers).
No External APIs: Orchestrator performs reviews entirely with information from job context. No external codex or multi-model APIs are called directly.
- Future improvement: Codex-injection from docs/patterns/**/*.md and docs/architecture/**/*.md.

Convergence Detection Assumptions

Stability Metric: Two consecutive rounds with < 5% line changes = convergence.
- Threshold is hardcoded; future: make configurable per product.
- Diff percentage = (changed_lines / total_lines) * 100.
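Max Iterations: 3 initial rounds + 2 optional extra rounds (5 in total). After the fifth round the approval gate is shown even if the plan has not converged.
No Infinite Loops: Once the maximum is reached, no further rounds run; the approval gate forces an approve/reject decision.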
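The rules above (diff percentage, two consecutive rounds under 5%, and the 3 + 2 round cap) can be sketched as follows. The function and type names are hypothetical, and two points are assumptions not stated in the guide: `totalLines` is taken to be the plan's line count for that round, and convergence is only acted on once the three initial rounds have run.

```typescript
const CONVERGENCE_THRESHOLD_PCT = 5; // "< 5% line changes"
const INITIAL_ROUNDS = 3;            // structure, logic, risk
const MAX_ROUNDS = 5;                // 3 initial + 2 optional extra rounds

interface RoundDiff {
  changedLines: number; // plan_diff_lines for the round
  totalLines: number;   // assumed: plan line count for that round
}

// Diff percentage = (changed_lines / total_lines) * 100.
function diffPct(diff: RoundDiff): number {
  if (diff.totalLines <= 0) return 0; // assumption: an empty plan counts as unchanged
  return (diff.changedLines / diff.totalLines) * 100;
}

// Index of the first round that completes two consecutive stable rounds, else null.
function stableAtRound(rounds: RoundDiff[]): number | null {
  let index = 0;
  let previousStable = false;
  for (const round of rounds) {
    const stable = diffPct(round) < CONVERGENCE_THRESHOLD_PCT;
    if (stable && previousStable) return index;
    previousStable = stable;
    index += 1;
  }
  return null;
}

// Decides whether to run another round or show the approval gate.
function nextStep(rounds: RoundDiff[]): "run_round" | "ask_approval" {
  if (rounds.length < INITIAL_ROUNDS) return "run_round";
  if (rounds.length >= MAX_ROUNDS) return "ask_approval"; // cap reached
  return stableAtRound(rounds) !== null ? "ask_approval" : "run_round";
}
```

Because the cap is checked before convergence, the loop always ends at or before the fifth round, which is the no-infinite-loops guarantee described above.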

Validation Assumptions

Plan is Mutable: Orchestrator can revise plan_md between rounds without breaking downstream parsing.
- If YAML structure is corrupted, parsePlanMd (server-side) will fail on approval.
- Orchestrator should never corrupt YAML syntax.
IdeaLog Persistence: MCP tool update_idea_plan_reviewed atomically saves:
- idea.plan_review_log (full JSON)
- idea.reviewed_at (timestamp)
- idea.status (transition)
- IdeaLog entry (audit)
User Decisions are Final: Once approved, plan-review log is immutable (until next re-review).

Implementation Details

Prompt Location

Main Repo: lib/idea-prompts/review-plan-job.md
MCP Server: scrum4me-mcp/src/prompts/idea/review-plan.md
Synchronization: Manual (for now); future: sync-schema.sh-like mechanism.

Job Config Snapshot

Job created with config from lib/job-config.ts:

IDEA_REVIEW_PLAN: {
  model: 'claude-opus-4-7',       // Opus for final orchestration
  thinking_budget: 6000,          // Extended for multi-round analysis
  permission_mode: 'acceptEdits',
  max_turns: 1,
  allowed_tools: [
    'Read', 'Write', 'Grep', 'Glob',
    'mcp__scrum4me__update_idea_plan_reviewed',
    'mcp__scrum4me__log_idea_decision',
    'mcp__scrum4me__update_job_status',
    'mcp__scrum4me__ask_user_question',
  ],
}

Note: Model is fixed to Opus for orchestration. Individual review rounds are simulated (not actual model switching) within Opus's analysis. Future: Direct multi-model support via Claude API.

MCP Tool: update_idea_plan_reviewed

Location: scrum4me-mcp/src/tools/update-idea-plan-reviewed.ts

Input:

{
  idea_id: string;
  review_log: object;  // Full ReviewLog JSON
  approval_status?: 'pending' | 'approved' | 'rejected';
}

Behavior:

Validates user owns idea.
Transitions idea status:
- approval_status='approved' → PLAN_REVIEWED
- approval_status='rejected' → PLAN_REVIEW_FAILED
- Default → PLAN_REVIEWED
Saves plan_review_log and reviewed_at atomically.
Creates IdeaLog entry with type PLAN_REVIEW_RESULT.
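The status mapping in the behavior list can be written as a pure function. This is a sketch with hypothetical names (`resolveIdeaStatus`, `ReviewOutcomeStatus`); it reads "Default" as covering both an omitted `approval_status` and `'pending'`, which is an interpretation of the list above rather than confirmed behavior.

```typescript
type ApprovalStatus = "pending" | "approved" | "rejected";
type ReviewOutcomeStatus = "PLAN_REVIEWED" | "PLAN_REVIEW_FAILED";

// Maps the tool's optional approval_status to the idea's next status.
function resolveIdeaStatus(approvalStatus?: ApprovalStatus): ReviewOutcomeStatus {
  switch (approvalStatus) {
    case "rejected":
      return "PLAN_REVIEW_FAILED";
    case "approved":
      return "PLAN_REVIEWED";
    default:
      // Assumption: omitted or 'pending' falls under "Default → PLAN_REVIEWED".
      return "PLAN_REVIEWED";
  }
}
```

Keeping the mapping separate from the database write makes it easy to unit-test without a transaction.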

Dependencies

Database

Idea Model: Must have fields plan_review_log (Json), reviewed_at (DateTime).
IdeaStatus Enum: Must include REVIEWING_PLAN, PLAN_REVIEW_FAILED, PLAN_REVIEWED.
IdeaLogType Enum: Must include PLAN_REVIEW_RESULT.

Server Actions

startReviewPlanJobAction() — Queues job, enforces status transitions.
cancelIdeaJobAction() — Allows user to cancel mid-review (reverts to PLAN_READY).

MCP Tools

update_idea_plan_reviewed() — Saves review-log and transitions status.
log_idea_decision() — Logs convergence/approval decisions.
update_job_status() — Marks job as done/failed.
ask_user_question() — Approval gate interaction.

Files

lib/idea-prompts/review-plan-job.md — Orchestrator prompt.
scrum4me-mcp/src/prompts/idea/review-plan.md — MCP server copy.
scrum4me-mcp/src/lib/kind-prompts.ts — Prompt loader.
scrum4me-mcp/src/tools/wait-for-job.ts — Job context builder.

Error Handling

Parse Failures

If plan_md cannot be parsed as valid YAML frontmatter:

Orchestrator logs error in review_log.
Calls update_job_status('failed', error: 'plan_parse_failed').
Idea transitions to PLAN_REVIEW_FAILED (as stated under Prompt Assumptions).
User can manually edit plan_md and retry.

User Cancellation

If user cancels job via UI:

Server sets job status → CANCELLED.
Worker receives no further answer from ask_user_question.
Orchestrator gracefully saves partial review_log.
Calls update_job_status('skipped', ...).
Idea reverts to PLAN_READY.

Question Timeout

If approval question expires (24h):

Orchestrator logs timeout in review_log.
Calls update_job_status('failed', error: 'approval_timeout').
Idea reverts to PLAN_READY.
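The cancellation and timeout paths above can be summarized in a lookup table plus an expiry check. This sketch covers only those two cases; the names `INTERRUPTION_OUTCOMES` and `isApprovalExpired` are hypothetical, and treating exactly 24 hours as expired is an assumption.

```typescript
const APPROVAL_TIMEOUT_MS = 24 * 60 * 60 * 1000; // approval question expires after 24h

type Interruption = "user_cancelled" | "approval_timeout";

interface InterruptionOutcome {
  jobStatus: "skipped" | "failed"; // value passed to update_job_status
  error?: string;                  // error code, when the job fails
  ideaStatus: "PLAN_READY";        // both paths revert the idea
}

// Outcomes as described under "User Cancellation" and "Question Timeout".
const INTERRUPTION_OUTCOMES: Record<Interruption, InterruptionOutcome> = {
  user_cancelled: { jobStatus: "skipped", ideaStatus: "PLAN_READY" },
  approval_timeout: {
    jobStatus: "failed",
    error: "approval_timeout",
    ideaStatus: "PLAN_READY",
  },
};

// True once the approval question has been open for 24 hours or more.
function isApprovalExpired(askedAt: Date, now: Date): boolean {
  return now.getTime() - askedAt.getTime() >= APPROVAL_TIMEOUT_MS;
}
```

In both cases the partial review log would still be saved before the job status is updated, as the steps above describe.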

Testing Strategy

Unit Tests

Mock ReviewLog Generation: Verify review-log JSON structure matches schema.
Convergence Calculation: Diff percentage computation, stability threshold.
Status Transitions: Valid state machine paths (PLAN_READY → REVIEWING_PLAN → PLAN_REVIEWED).

Integration Tests

End-to-End: Draft idea → Grill → Plan → Review → PLAN_REVIEWED.
Re-Review: PLAN_REVIEWED → REVIEWING_PLAN → PLAN_REVIEWED (no data loss).
Cancellation: Mid-review cancellation → revert to PLAN_READY.
Parse Errors: Malformed plan_md → PLAN_REVIEW_FAILED.

Manual Testing

Create test idea with PLAN_READY status.
Click "Review Plan".
Monitor job in Jobs dashboard.
Verify review-log in idea detail page.
Accept/reject approval.
Confirm status transition and IdeaLog entry.

Future Enhancements

Direct Multi-Model Calls: Use Claude API to invoke Haiku, Sonnet, Opus separately with model switching.
Codex Injection: Auto-load and inject docs/patterns/**/*.md and docs/architecture/**/*.md as context.
Configurable Thresholds: Allow product-level convergence percentage and max-rounds settings.
Review History: Preserve all review-logs for audit trail and re-review diffs.
Feedback Loop: Log user edits between review rounds and suggest re-run based on delta.
Scheduled Re-Review: Auto-trigger review after N days (staleness check).

References

docs/architecture/jobs.md — Job system architecture.
docs/patterns/server-action.md — Server action pattern (startReviewPlanJobAction).
docs/api/rest-contract.md — API surface for plan-review.
lib/idea-status.ts — Status transition graph and state machine.
lib/idea-plan-parser.ts — Plan YAML parsing (validator for approved plans).
