Task Execution Engine

The Missing Link, Closed

The consensus loop produces approved tasks. The subagent proves that Claude Code can modify kernel files. The Task Execution Engine bridges these: it picks up approved tasks, executes them via headless Claude Code, validates the output against the kernel's ontology, and seals the result with full provenance linking back to the consensus decision.

Separation of Concerns

The governance engine (CK.Consensus) decides what should change and why. The task execution engine decides how to implement that change. This separation allows different execution strategies (manual, automated, AI-assisted) without changing the governance protocol.

Task Structure

Every task is a concrete work item produced by a consensus decision:

yaml

task_id: "T-a1b2c3d4"
decision_id: "P-e5f6g7h8"
target_file: "concepts/Delvinator.Core/tool/processor.py"
instruction: "Add @on('quality.score') handler per the approved specification"
constraints:
  - "Output MUST validate against ontology.yaml QualityScore schema"
  - "MUST produce prov:wasGeneratedBy linking to this task"
executor: "headless-claude-code"
status: "pending"
version_pin: "abc123"               # git commit hash -- task applies to this version
created_at: "2026-04-05T10:00:00Z"
prov:wasGeneratedBy: "ckp://Action#CK.Consensus/approve-{ts}"

| Field | Required | Description |
|---|---|---|
| `task_id` | MUST | Unique identifier (T-{hex8}) |
| `decision_id` | MUST | Reference to the parent consensus decision |
| `target_file` | MUST | Filesystem path to the file being modified |
| `instruction` | MUST | Natural language instruction for the executor |
| `constraints` | MUST | List of validation rules the output must satisfy |
| `executor` | MUST | Execution method (`headless-claude-code`) |
| `status` | MUST | Current lifecycle state |
| `version_pin` | MUST | Git commit hash the task targets |
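
As a rough illustration of how an executor might check a task record before accepting it, the sketch below tests for the required fields listed above and for the `T-{hex8}` identifier shape. The helper names `missing_fields` and `has_canonical_id` are hypothetical, and treating the hex digits as lowercase only is an assumption rather than something the specification states.

```python
import re

# Required fields, taken from the field table above.
REQUIRED_FIELDS = (
    "task_id", "decision_id", "target_file", "instruction",
    "constraints", "executor", "status", "version_pin",
)

# T-{hex8}: "T-" followed by exactly eight hexadecimal digits.
# Assumption: lowercase hex only.
TASK_ID_PATTERN = re.compile(r"^T-[0-9a-f]{8}$")


def missing_fields(task: dict) -> list:
    """Return the required fields that are absent or empty in a task record."""
    return [name for name in REQUIRED_FIELDS if not task.get(name)]


def has_canonical_id(task: dict) -> bool:
    """Check the task_id against the T-{hex8} shape."""
    return bool(TASK_ID_PATTERN.match(str(task.get("task_id", ""))))
```

A record that returns a non-empty list from `missing_fields` would presumably be rejected before it ever reaches the executor.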

Task Lifecycle

Tasks progress through four states:

| State | Description | Transitions To |
|---|---|---|
| `pending` | Generated by consensus, awaiting execution | `executing` |
| `executing` | Headless Claude Code is working on it | `completed` or `failed` |
| `completed` | Output validated, changes committed | Terminal |
| `failed` | Validation failed or execution error -- requires human review | `pending` (after re-evaluation) |

pending -> executing -> completed
                    \-> failed -> (human review) -> pending

No Auto-Retry

Failed tasks MUST NOT be auto-retried without human review. A task that fails validation may have produced structurally incorrect output. Auto-retrying could compound the error. Human review ensures the failure is understood before re-execution.
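
The lifecycle and the no-auto-retry rule can be sketched as a small transition guard. This is a minimal illustration, not the engine's actual implementation: the `transition` function, the `TransitionError` class, and the `human_reviewed` flag are hypothetical names introduced here.

```python
# Allowed transitions, mirroring the lifecycle table.
ALLOWED_TRANSITIONS = {
    "pending": {"executing"},
    "executing": {"completed", "failed"},
    "completed": set(),      # terminal state
    "failed": {"pending"},   # only after human review
}


class TransitionError(ValueError):
    """Raised when a requested status change is not permitted."""


def transition(task: dict, new_status: str, *, human_reviewed: bool = False) -> dict:
    """Return a copy of the task with its status changed, or raise."""
    current = task["status"]
    if new_status not in ALLOWED_TRANSITIONS.get(current, set()):
        raise TransitionError(f"{current} -> {new_status} is not allowed")
    # No auto-retry: a failed task may only be re-queued after review.
    if current == "failed" and not human_reviewed:
        raise TransitionError("failed tasks need human review before re-queueing")
    return {**task, "status": new_status}
```

Requiring an explicit `human_reviewed=True` argument means a retry loop cannot re-queue a failed task by accident.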

Task Operations

Tasks support three operations beyond simple execution:

Split

One task becomes multiple subtasks for parallel execution. Each subtask references the parent task via prov:wasStartedBy.

yaml

# Parent task
task_id: "T-parent"
instruction: "Add quality scoring with validation and reporting"

# Subtask 1
task_id: "T-parent-scoring"
parent_task: "T-parent"
instruction: "Implement the quality scoring algorithm"
prov:wasStartedBy: "ckp://Task#T-parent"

# Subtask 2
task_id: "T-parent-validation"
parent_task: "T-parent"
instruction: "Add SHACL validation for QualityScore instances"
prov:wasStartedBy: "ckp://Task#T-parent"
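
A split that yields records shaped like the example above might look like the following sketch. The `split_task` function and its `parts` argument (a mapping of ID suffix to instruction) are hypothetical. The ID scheme `{parent}-{suffix}` is inferred from the example rather than specified.

```python
def split_task(parent: dict, parts: dict) -> list:
    """Create one subtask per (suffix, instruction) pair.

    Each subtask points back at its parent through both parent_task
    and prov:wasStartedBy, as in the YAML example.
    """
    parent_id = parent["task_id"]
    subtasks = []
    for suffix, instruction in parts.items():
        subtasks.append({
            "task_id": f"{parent_id}-{suffix}",
            "parent_task": parent_id,
            "instruction": instruction,
            "prov:wasStartedBy": f"ckp://Task#{parent_id}",
        })
    return subtasks
```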

Evolve

A task is refined through further consensus, producing a replacement task with a reference to the original. This happens when the initial task specification was too vague or when requirements change between creation and execution.

Merge

Multiple tasks converging on the same file are combined into a single task to prevent conflicts. If tasks T1 and T2 both target tool/processor.py, merging them into T3 ensures a single coherent edit.
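
One way to detect merge candidates is to group pending tasks by `target_file`. The sketch below is illustrative only: `merge_by_target` and the `merged_from` field are hypothetical. The merged record deliberately carries no `task_id`, because the text does not say how the ID of a merged task is assigned.

```python
from collections import defaultdict


def merge_by_target(tasks: list) -> list:
    """Combine tasks that share a target_file into one merged record."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task["target_file"]].append(task)

    result = []
    for target, group in groups.items():
        if len(group) == 1:
            result.append(group[0])  # nothing to merge
            continue
        result.append({
            "target_file": target,
            # Keep every original instruction so no intent is lost.
            "instruction": "\n".join(t["instruction"] for t in group),
            "merged_from": [t["task_id"] for t in group],  # hypothetical field
        })
    return result
```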

Version Pinning

Each task targets a specific kernel version (git commit hash). If the kernel evolves between task creation and execution, the task MUST be re-evaluated against the new version.

Task created at commit abc123
  Kernel evolves to commit def456
    Task is STALE -- version mismatch
      Human review required before execution

Stale tasks are flagged for human review, not auto-executed, because the change they encode may conflict with intervening modifications. The version_pin field makes this check deterministic.
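
The staleness check itself can be a short comparison. In this sketch, `current_commit` shells out to `git rev-parse HEAD` and `is_stale` compares the result with the pin. Both function names are hypothetical. Accepting an abbreviated pin as a prefix of the full hash is an assumption based on the short `abc123` example.

```python
import subprocess


def current_commit(repo_path: str = ".") -> str:
    """Return the full commit hash of HEAD for the repository at repo_path."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-parse", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def is_stale(task: dict, head: str) -> bool:
    """A task is stale when its version_pin does not match the given HEAD.

    Assumption: the pin may be an abbreviated hash, so it is compared as a
    prefix. An empty pin is treated as stale rather than as a match.
    """
    pin = task.get("version_pin", "")
    return not pin or not head.startswith(pin)
```

A caller would typically run `is_stale(task, current_commit(repo))` just before moving the task to `executing`, and route stale tasks to human review.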

Execution Modes

The engine supports two execution modes, matching the streaming architecture:

Batch (`claude -p`)

For simple, well-defined edits -- add a handler, fix a bug, update a schema field.

bash

claude -p "Add the quality.score handler to processor.py per this spec: ..." \
  --no-session-persistence --tools "Read,Edit,Write"

The task instruction becomes the prompt. Constraints are injected as system-level rules. Output is captured as a single blob.
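
To make the mapping from task record to command line concrete, the sketch below assembles the argument list for a batch run. The `-p`, `--no-session-persistence`, and `--tools` flags are taken from the command above. Using `--append-system-prompt` to carry the constraints is an assumption about how "system-level rules" are injected, since the text does not name the mechanism. The function names are hypothetical.

```python
import subprocess


def build_batch_command(task: dict, tools=("Read", "Edit", "Write")) -> list:
    """Build the argv for a headless batch run of one task."""
    rules = "\n".join(f"- {c}" for c in task["constraints"])
    return [
        "claude", "-p", task["instruction"],
        "--no-session-persistence",
        "--tools", ",".join(tools),
        # Assumption: constraints travel as an appended system prompt.
        "--append-system-prompt", f"Constraints:\n{rules}",
    ]


def run_batch(task: dict, cwd: str):
    """Run the command and return (exit code, captured output blob)."""
    completed = subprocess.run(
        build_batch_command(task), cwd=cwd, capture_output=True, text=True,
    )
    return completed.returncode, completed.stdout
```

Passing the arguments as a list rather than a shell string avoids quoting problems when the instruction contains quotes or newlines.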

Streaming (`claude_agent_sdk`)

For complex multi-step changes that benefit from human-in-the-loop review, tool use, and progressive feedback. Stream events are published to stream.CK.Operator so the web shell can show execution progress.
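
The forwarding step can be sketched independently of the SDK. The example below takes any async iterator of events and hands each one to a publisher on the `stream.CK.Operator` topic. The `forward_events` function and the `publish` callback signature are hypothetical. In a real run the events would come from `claude_agent_sdk`, which is not imported here.

```python
import asyncio
from typing import Any, AsyncIterator, Awaitable, Callable

TOPIC = "stream.CK.Operator"


async def forward_events(
    events: AsyncIterator[Any],
    publish: Callable[[str, Any], Awaitable[None]],
) -> int:
    """Publish every event to the operator topic; return how many were sent."""
    count = 0
    async for event in events:
        await publish(TOPIC, event)
        count += 1
    return count


if __name__ == "__main__":
    async def demo_events():
        # Stand-in for the SDK's message stream.
        for name in ("started", "tool_use", "finished"):
            yield {"type": name}

    async def print_publisher(topic: str, event: Any) -> None:
        print(topic, event)

    asyncio.run(forward_events(demo_events(), print_publisher))
```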

Choosing a Mode

Use batch for tasks with a single clear edit target. Use streaming for tasks that require multi-file changes, iterative refinement, or when visibility into the execution process matters.
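
Expressed as code, the guidance above reduces to a small heuristic. This is only a sketch. The `choose_mode` function and its boolean parameters are hypothetical, and how an engine would know that a task needs iteration or visibility is left open.

```python
def choose_mode(target_files, needs_iteration=False, needs_visibility=False) -> str:
    """Pick an execution mode following the guidance above.

    Batch suits a single clear edit target; anything multi-file,
    iterative, or requiring progress visibility goes to streaming.
    """
    if len(target_files) > 1 or needs_iteration or needs_visibility:
        return "streaming"
    return "batch"
```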

Validation Pipeline

After headless Claude Code executes a task, the output passes through a four-stage validation pipeline:

| Stage | Check | Failure Action |
|---|---|---|
| 1. Structural | Does the modified file parse correctly? (Python AST, YAML, JSON) | `status = failed` |
| 2. Ontology | Do produced instances validate against `ontology.yaml`? | `status = failed` |
| 3. SHACL | Do produced instances satisfy `rules.shacl` constraints? | `status = failed` |
| 4. Constraint | Are all task-specified constraints met? | `status = failed` |

The rationale for post-execution validation is defence in depth: even though Claude received the specification and constraints, the output must be verified against the authoritative ontology. LLMs can produce plausible but non-conformant output.
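
A fail-fast pipeline of this shape might be sketched as below. Only the structural stage is implemented, using the standard library for Python and JSON. YAML parsing, ontology validation, SHACL, and constraint checks are assumed to be supplied as additional callables. The names `structural_check` and `run_pipeline` are hypothetical.

```python
import ast
import json
from pathlib import Path


def structural_check(path: Path) -> bool:
    """Stage 1: does the file parse? Covers Python and JSON only.

    YAML is omitted because it needs a third-party parser; files with
    other suffixes pass through unchecked in this sketch.
    """
    text = path.read_text(encoding="utf-8")
    try:
        if path.suffix == ".py":
            ast.parse(text)
        elif path.suffix == ".json":
            json.loads(text)
    except (SyntaxError, ValueError):
        return False
    return True


def run_pipeline(path: Path, stages: list) -> tuple:
    """Run (name, check) stages in order and stop at the first failure.

    Returns (True, None) when every stage passes, otherwise
    (False, name_of_failing_stage).
    """
    for name, check in stages:
        if not check(path):
            return False, name
    return True, None
```

Returning the name of the failing stage gives the human reviewer a starting point when the task is marked `failed`.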

Post-Execution Flow

On successful validation:

1. Git Commit

Changes are committed with a message referencing task_id and decision_id:

[CK.Task] T-a1b2c3d4: Add quality.score handler

Decision: P-e5f6g7h8
Kernel: Delvinator.Core
Target: tool/processor.py
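
A helper that produces this message format could look like the sketch below. The `format_commit_message` function is hypothetical. Deriving the short `Target:` path by stripping a `concepts/{kernel}/` prefix from `target_file` is an assumption inferred from the example values.

```python
def format_commit_message(task: dict, summary: str, kernel: str) -> str:
    """Render the commit message for a validated task."""
    # Assumption: Target is the path relative to the kernel directory.
    target = task["target_file"].removeprefix(f"concepts/{kernel}/")
    return (
        f"[CK.Task] {task['task_id']}: {summary}\n"
        "\n"
        f"Decision: {task['decision_id']}\n"
        f"Kernel: {kernel}\n"
        f"Target: {target}"
    )
```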

2. Filer Sync

Git changes are propagated to the SeaweedFS filer, making them available for volume mounting.

3. Operator Reconciliation

CK.Operator detects changed files and updates cluster resources. The new processor code becomes live.

4. Provenance Sealing

The task is marked as completed with a prov:Activity record linking to the consensus decision:

yaml

prov:Activity:
  id: "ckp://Action#CK.Task/execute-T-a1b2c3d4"
  prov:wasGeneratedBy: "ckp://Action#CK.Consensus/approve-{ts}"
  prov:used:
    - "ckp://Kernel#Delvinator.Core:v1.0/ontology.yaml"
    - "ckp://Kernel#Delvinator.Core:v1.0/tool/processor.py"
  prov:startedAtTime: "2026-04-05T10:01:00Z"
  prov:endedAtTime: "2026-04-05T10:01:45Z"
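
The record above could be assembled with a helper along these lines. This is a sketch: `build_activity` and `utc_stamp` are hypothetical names, and the approval URI is passed in unchanged because its `{ts}` component is not defined in this section.

```python
from datetime import datetime, timezone


def utc_stamp(moment: datetime) -> str:
    """Format a timezone-aware datetime as an ISO 8601 UTC timestamp."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_activity(task_id: str, approval_uri: str, used: list,
                   started: datetime, ended: datetime) -> dict:
    """Build the prov:Activity record that seals a completed task."""
    return {
        "prov:Activity": {
            "id": f"ckp://Action#CK.Task/execute-{task_id}",
            "prov:wasGeneratedBy": approval_uri,
            "prov:used": list(used),
            "prov:startedAtTime": utc_stamp(started),
            "prov:endedAtTime": utc_stamp(ended),
        }
    }
```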

Architecture Overview

CK.Consensus approve
    |
    v
Tasks generated (stored in CK.Consensus DATA loop)
    |
    v
Task executor picks up pending tasks
    |
    v
Headless Claude Code executes (claude -p or claude_agent_sdk)
    |
    v
Output validated against target kernel's ontology
    |
    v
Changes committed (git) -> filer synced -> CK.Operator reconciles
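
Read as control flow, the diagram suggests an orchestration loop like the sketch below. Every stage is injected as a callable, so the example makes no claim about how the real engine runs Claude Code, validates, commits, syncs, or seals. The `execute_task` function and its parameter names are hypothetical.

```python
def execute_task(task: dict, *, run, validate, commit, sync, seal) -> dict:
    """Drive one pending task through the flow and return its final record.

    A failed run or failed validation ends in status "failed" without
    calling commit, sync, or seal.
    """
    if task["status"] != "pending":
        raise ValueError("only pending tasks can be executed")

    task = {**task, "status": "executing"}
    try:
        run(task)
    except Exception:
        # Execution errors count as failures and await human review.
        return {**task, "status": "failed"}

    if not validate(task):
        return {**task, "status": "failed"}

    commit(task)   # git commit
    sync(task)     # filer sync
    seal(task)     # provenance record
    return {**task, "status": "completed"}
```

Because commit, sync, and seal are only reachable after validation succeeds, a failed task cannot be committed in this arrangement.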

Conformance Requirements

| Criterion | Level |
|---|---|
| Tasks MUST be produced by CK.Consensus, not created ad-hoc | REQUIRED |
| Each task MUST reference its parent consensus decision | REQUIRED |
| Task execution MUST validate output against the target kernel's ontology | REQUIRED |
| Failed tasks MUST NOT be committed to git | REQUIRED |
| Failed tasks MUST NOT be auto-retried without human review | REQUIRED |
| Each completed task MUST produce a `prov:Activity` linking to the consensus decision | REQUIRED |
| Version-pinned tasks MUST be re-evaluated if the kernel version changed | REQUIRED |
| Headless Claude Code MUST use `--no-session-persistence` | REQUIRED |
