feat(gate): verify_required levels — ALIGNED/ALIGNED_OR_PARTIAL/ANY (#16)

Closes story 'Verify-gate uitbreiden' (extend the verify gate) in PBI
'Agent verify-flow hardening'.

The previous gate rejected only null and EMPTY results, so any PARTIAL
or DIVERGENT verify slipped through. The Insights batch (2 May 2026)
showed why that is too weak: agent jobs claimed DONE after delivering
only helpers, not the requested UI components, and were accepted with
verify=DIVERGENT/PARTIAL.

New decision matrix:

  null                       → reject (run verify_task_against_plan)
  EMPTY  + !verify_only      → reject
  EMPTY  + verify_only       → allowed
  ALIGNED                    → always allowed
  PARTIAL/DIVERGENT
    required=ALIGNED         → reject (strict task)
    required=ALIGNED_OR_PARTIAL (default) → allowed only if summary
                                            ≥20 chars (acknowledge drift)
    required=ANY             → allowed (refactor escape hatch)
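
The matrix above can be exercised as a small table-driven check. This is
only a sketch: `gateAllows` is a hypothetical, condensed stand-in for
`checkVerifyGate` that returns a boolean instead of an error object, and
the `VerifyResult` type and sample rows are assumptions made for
illustration.

```typescript
type VerifyResult = 'EMPTY' | 'ALIGNED' | 'PARTIAL' | 'DIVERGENT' | null
type VerifyRequired = 'ALIGNED' | 'ALIGNED_OR_PARTIAL' | 'ANY'

const SUMMARY_MIN_LENGTH = 20

// Hypothetical stand-in for checkVerifyGate: same branch order, boolean result only.
function gateAllows(
  result: VerifyResult,
  verifyOnly: boolean,
  required: VerifyRequired = 'ALIGNED_OR_PARTIAL',
  summary?: string,
): boolean {
  if (result === null) return false
  if (result === 'EMPTY') return verifyOnly
  if (result === 'ALIGNED') return true
  // PARTIAL or DIVERGENT from here on
  if (required === 'ANY') return true
  if (required === 'ALIGNED') return false
  // ALIGNED_OR_PARTIAL: the trimmed summary must reach the minimum length
  return (summary ?? '').trim().length >= SUMMARY_MIN_LENGTH
}

// Each row: [result, verifyOnly, required, summary, expected]
type Row = [VerifyResult, boolean, VerifyRequired, string | undefined, boolean]

const rows: Row[] = [
  [null, false, 'ANY', undefined, false],
  ['EMPTY', false, 'ANY', undefined, false],
  ['EMPTY', true, 'ALIGNED', undefined, true],
  ['ALIGNED', false, 'ALIGNED', undefined, true],
  ['PARTIAL', false, 'ALIGNED', 'a summary that is long enough', false],
  ['DIVERGENT', false, 'ALIGNED_OR_PARTIAL', 'too short', false],
  ['DIVERGENT', false, 'ALIGNED_OR_PARTIAL', 'helpers only; UI components deferred', true],
  ['PARTIAL', false, 'ANY', undefined, true],
]

// Collect every row where the gate disagrees with the expected outcome.
const mismatches = rows.filter(
  ([result, verifyOnly, required, summary, expected]) =>
    gateAllows(result, verifyOnly, required, summary) !== expected,
)

console.log(mismatches.length === 0 ? 'matrix holds' : mismatches)
```

Note that null and EMPTY are decided before `required` is consulted, so
required=ANY only relaxes the PARTIAL/DIVERGENT branch.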

`update_job_status('done')` now reads `task.verify_required` from the DB
(field added in Scrum4Me PR #53) and passes it + `summary` to the gate.
Tool description updated with the new rules.
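
The call site narrows the DB value with an `as VerifyRequired` cast. As
one possible alternative (not what this commit does), the value could be
checked at runtime before it reaches the gate. The sketch below assumes
the column arrives as a plain string or null; `JobRow`,
`isVerifyRequired`, `resolveVerifyRequired` and `gateArgs` are
hypothetical names introduced for illustration.

```typescript
type VerifyRequired = 'ALIGNED' | 'ALIGNED_OR_PARTIAL' | 'ANY'

const DEFAULT_VERIFY_REQUIRED: VerifyRequired = 'ALIGNED_OR_PARTIAL'

// Hypothetical shape of the selected job row; only the fields used here.
interface JobRow {
  verify_result: string | null
  task: { verify_only: boolean; verify_required: string | null } | null
}

// Type guard: true only for the three known levels.
function isVerifyRequired(value: unknown): value is VerifyRequired {
  return value === 'ALIGNED' || value === 'ALIGNED_OR_PARTIAL' || value === 'ANY'
}

// Fall back to the default for null, undefined, or unknown strings.
function resolveVerifyRequired(value: string | null | undefined): VerifyRequired {
  return isVerifyRequired(value) ? value : DEFAULT_VERIFY_REQUIRED
}

// Collect the four gate arguments from a job row plus the caller's summary.
function gateArgs(job: JobRow, summary?: string) {
  return {
    verifyResult: job.verify_result ?? null,
    verifyOnly: job.task?.verify_only ?? false,
    verifyRequired: resolveVerifyRequired(job.task?.verify_required),
    summary,
  }
}

console.log(gateArgs({ verify_result: 'PARTIAL', task: null }, 'deferred UI work'))
```

With a guard like this, an unexpected enum value falls back to the
default level; with a bare cast it would be treated like
ALIGNED_OR_PARTIAL only because that is the gate's final branch.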

Vendor submodule synced to pick up the schema enum.

Tests: 129/129 passing (120 existing + 9 new combinatorial gate tests).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Janpeter Visser 2026-05-02 17:55:06 +02:00 committed by GitHub
parent 0bcca15235
commit 1fe6ccf609
4 changed files with 131 additions and 33 deletions

@@ -122,9 +122,29 @@ export async function prepareDoneUpdate(
   }
 }
+export type VerifyRequired = 'ALIGNED' | 'ALIGNED_OR_PARTIAL' | 'ANY'
+const SUMMARY_MIN_LENGTH = 20
+/**
+ * Validate whether a CLAIMED/RUNNING job can transition to DONE based on its
+ * verify_result + the task's verify_required level.
+ *
+ * Decision matrix:
+ *   verifyResult=null               reject (run verify_task_against_plan first)
+ *   EMPTY + !verify_only            reject
+ *   EMPTY + verify_only             allowed
+ *   ALIGNED                         always allowed
+ *   PARTIAL/DIVERGENT
+ *     required=ALIGNED              reject (strict task)
+ *     required=ALIGNED_OR_PARTIAL   require summary (≥20 chars) explaining drift
+ *     required=ANY                  allowed (refactor/multi-file edit)
+ */
 export function checkVerifyGate(
   verifyResult: string | null,
   verifyOnly: boolean,
+  verifyRequired: VerifyRequired = 'ALIGNED_OR_PARTIAL',
+  summary: string | undefined = undefined,
 ): { allowed: true } | { allowed: false; error: string } {
   if (verifyResult === null) {
     return {
@@ -132,7 +152,8 @@ export function checkVerifyGate(
       error: 'Call verify_task_against_plan first before marking DONE.',
     }
   }
-  if (verifyResult === 'EMPTY' && !verifyOnly) {
+  if (verifyResult === 'EMPTY') {
+    if (verifyOnly) return { allowed: true }
     return {
       allowed: false,
       error:
@@ -140,6 +161,28 @@ export function checkVerifyGate(
         'Mark the task as verify_only or adjust the implementation.',
     }
   }
+  if (verifyResult === 'ALIGNED') return { allowed: true }
+  // PARTIAL or DIVERGENT
+  if (verifyRequired === 'ANY') return { allowed: true }
+  if (verifyRequired === 'ALIGNED') {
+    return {
+      allowed: false,
+      error:
+        `Plan requires ALIGNED but verify returned ${verifyResult}. ` +
+        `Adjust the implementation so that all plan paths are covered, ` +
+        `or set verify_required to ALIGNED_OR_PARTIAL/ANY.`,
+    }
+  }
+  // verifyRequired === 'ALIGNED_OR_PARTIAL': requires a summary
+  if (!summary || summary.trim().length < SUMMARY_MIN_LENGTH) {
+    return {
+      allowed: false,
+      error:
+        `Verify returned ${verifyResult}. Provide a summary (≥${SUMMARY_MIN_LENGTH} chars) that explains ` +
+        `why the implementation deviates from the plan, or set verify_required to ANY.`,
+    }
+  }
   return { allowed: true }
 }
@@ -218,7 +261,10 @@ export function registerUpdateJobStatusTool(server: McpServer) {
       'running (start), done (finished), failed (error). ' +
       'The Bearer token must match the token that claimed the job. ' +
       'Before marking done: call verify_task_against_plan first; done is rejected when ' +
-      'verify_result is null or EMPTY (unless task.verify_only is true). ' +
+      'verify_result is null, EMPTY (unless task.verify_only is true), or when the verify level ' +
+      'does not meet task.verify_required: ALIGNED-only is strict; ALIGNED_OR_PARTIAL accepts ' +
+      'PARTIAL/DIVERGENT but requires a non-empty summary (≥20 chars) explaining the drift; ANY ' +
+      'accepts everything. ' +
       'Automatically emits an SSE event so the Scrum4Me UI updates in real time. ' +
       'Response includes next_action: when wait_for_job_again, immediately call wait_for_job again. When queue_empty, the agent batch is done.',
     inputSchema,
@@ -238,7 +284,7 @@ export function registerUpdateJobStatusTool(server: McpServer) {
           product_id: true,
           task_id: true,
           verify_result: true,
-          task: { select: { verify_only: true } },
+          task: { select: { verify_only: true, verify_required: true } },
         },
       })
@@ -261,6 +307,8 @@ export function registerUpdateJobStatusTool(server: McpServer) {
       const gate = checkVerifyGate(
         job.verify_result ?? null,
         job.task?.verify_only ?? false,
+        (job.task?.verify_required ?? 'ALIGNED_OR_PARTIAL') as VerifyRequired,
+        summary,
       )
       if (!gate.allowed) return toolError(gate.error)