Task Execution Engine
The Missing Link, Closed
The consensus loop produces approved tasks. The subagent proves that Claude Code can modify kernel files. The Task Execution Engine bridges these: it picks up approved tasks, executes them via headless Claude Code, validates the output against the kernel's ontology, and seals the result with full provenance linking back to the consensus decision.
Separation of Concerns
The governance engine (CK.Consensus) decides what should change and why. The task execution engine decides how to implement that change. This separation allows different execution strategies (manual, automated, AI-assisted) without changing the governance protocol.
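
To make the separation concrete, here is a minimal sketch of how execution strategies could sit behind one interface. The `Executor` protocol, the two strategy classes, and the `run` helper are hypothetical names for illustration, not part of the specification; only the `headless-claude-code` executor name comes from the task format.

```python
from typing import Protocol


class Executor(Protocol):
    """Anything that can turn an approved instruction into a change."""

    def execute(self, instruction: str, target_file: str) -> str:
        """Return a short description of what was done."""
        ...


class ManualExecutor:
    # Hypothetical strategy: hand the task to a person.
    def execute(self, instruction: str, target_file: str) -> str:
        return f"queued for manual edit: {target_file}"


class HeadlessExecutor:
    # Hypothetical strategy: a real version would invoke headless Claude Code.
    def execute(self, instruction: str, target_file: str) -> str:
        return f"dispatched to headless-claude-code: {target_file}"


# Registry keyed by the task's `executor` field (assumed lookup scheme).
EXECUTORS: dict[str, Executor] = {
    "manual": ManualExecutor(),
    "headless-claude-code": HeadlessExecutor(),
}


def run(executor_name: str, instruction: str, target_file: str) -> str:
    # The governance side only supplies task fields; it never needs to know
    # which strategy is registered under the executor name.
    return EXECUTORS[executor_name].execute(instruction, target_file)
```

Adding a new strategy in this sketch means registering another class; nothing on the consensus side changes.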
Task Structure
Every task is a concrete work item produced by a consensus decision:
task_id: "T-a1b2c3d4"
decision_id: "P-e5f6g7h8"
target_file: "concepts/Delvinator.Core/tool/processor.py"
instruction: "Add @on('quality.score') handler per the approved specification"
constraints:
- "Output MUST validate against ontology.yaml QualityScore schema"
- "MUST produce prov:wasGeneratedBy linking to this task"
executor: "headless-claude-code"
status: "pending"
version_pin: "abc123" # git commit hash -- task applies to this version
created_at: "2026-04-05T10:00:00Z"
prov:wasGeneratedBy: "ckp://Action#CK.Consensus/approve-{ts}"
| Field | Required | Description |
|---|---|---|
| task_id | MUST | Unique identifier (T-{hex8}) |
| decision_id | MUST | Reference to the parent consensus decision |
| target_file | MUST | Filesystem path to the file being modified |
| instruction | MUST | Natural language instruction for the executor |
| constraints | MUST | List of validation rules the output must satisfy |
| executor | MUST | Execution method (headless-claude-code) |
| status | MUST | Current lifecycle state |
| version_pin | MUST | Git commit hash the task targets |
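
The required fields in the table can be checked mechanically before a task enters the queue. The following is a sketch only: `validate_task` is a hypothetical helper, and reading `T-{hex8}` as "T-" followed by eight lowercase hexadecimal digits is an assumption.

```python
import re

# Assumption: {hex8} means exactly eight lowercase hexadecimal digits.
TASK_ID_PATTERN = re.compile(r"^T-[0-9a-f]{8}$")

REQUIRED_FIELDS = (
    "task_id", "decision_id", "target_file", "instruction",
    "constraints", "executor", "status", "version_pin",
)


def validate_task(record: dict) -> list[str]:
    """Return a list of problems; an empty list means the record is well-formed."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in record
    ]
    task_id = record.get("task_id", "")
    if task_id and not TASK_ID_PATTERN.match(task_id):
        problems.append(f"task_id does not match T-{{hex8}}: {task_id}")
    if "constraints" in record and not isinstance(record["constraints"], list):
        problems.append("constraints must be a list")
    return problems
```

Returning a list of problems instead of raising on the first one lets a reviewer see everything wrong with a record in a single pass.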
Task Lifecycle
Tasks progress through four states:
| State | Description | Transitions To |
|---|---|---|
| pending | Generated by consensus, awaiting execution | executing |
| executing | Headless Claude Code is working on it | completed or failed |
| completed | Output validated, changes committed | Terminal |
| failed | Validation failed or execution error -- requires human review | pending (after re-evaluation) |
pending -> executing -> completed
                    \-> failed -> (human review) -> pending
No Auto-Retry
Failed tasks MUST NOT be auto-retried without human review. A task that fails validation may have produced structurally incorrect output. Auto-retrying could compound the error. Human review ensures the failure is understood before re-execution.
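
The lifecycle and the no-auto-retry rule can be enforced together with a small transition guard. This is a sketch under assumptions: the `transition` function, the `TransitionError` exception, and the `human_reviewed` flag are hypothetical names, and how a review is actually recorded is left open.

```python
# Allowed (current, new) pairs, taken from the lifecycle table.
ALLOWED = {
    ("pending", "executing"),
    ("executing", "completed"),
    ("executing", "failed"),
    ("failed", "pending"),
}


class TransitionError(Exception):
    """Raised when a requested state change violates the lifecycle."""


def transition(current: str, new: str, *, human_reviewed: bool = False) -> str:
    if (current, new) not in ALLOWED:
        raise TransitionError(f"{current} -> {new} is not allowed")
    # A failed task may only be re-queued after a person has looked at it.
    if current == "failed" and not human_reviewed:
        raise TransitionError("failed tasks need human review before re-queueing")
    return new
```

Because `completed` never appears as a source state in `ALLOWED`, it is terminal by construction.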
Task Operations
Tasks support three operations beyond simple execution:
Split
One task becomes multiple subtasks for parallel execution. Each subtask references the parent task via prov:wasStartedBy.
# Parent task
task_id: "T-parent"
instruction: "Add quality scoring with validation and reporting"
# Subtask 1
task_id: "T-parent-scoring"
parent_task: "T-parent"
instruction: "Implement the quality scoring algorithm"
prov:wasStartedBy: "ckp://Task#T-parent"
# Subtask 2
task_id: "T-parent-validation"
parent_task: "T-parent"
instruction: "Add SHACL validation for QualityScore instances"
prov:wasStartedBy: "ckp://Task#T-parent"
Evolve
A task is refined through further consensus, producing a replacement task with a reference to the original. This happens when the initial task specification was too vague or when requirements change between creation and execution.
Merge
Multiple tasks converging on the same file are combined into a single task to prevent conflicts. If tasks T1 and T2 both target tool/processor.py, merging them into T3 ensures a single coherent edit.
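
Detecting merge candidates is a grouping problem. The sketch below only identifies tasks that collide on a file; it deliberately does not create the merged task, since tasks are produced by consensus. `find_merge_candidates` is a hypothetical helper, and restricting the search to `pending` tasks is an assumption.

```python
from collections import defaultdict


def find_merge_candidates(tasks: list[dict]) -> dict[str, list[str]]:
    """Map each contested target_file to the ids of the pending tasks that touch it."""
    by_file: dict[str, list[str]] = defaultdict(list)
    for task in tasks:
        # Assumption: only tasks still awaiting execution are worth merging.
        if task["status"] == "pending":
            by_file[task["target_file"]].append(task["task_id"])
    # Keep only files targeted by more than one task.
    return {path: ids for path, ids in by_file.items() if len(ids) > 1}
```

With T1 and T2 both targeting tool/processor.py, the result contains a single entry for that path listing both ids, which can then be submitted for a merge decision.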
Version Pinning
Each task targets a specific kernel version (git commit hash). If the kernel evolves between task creation and execution, the task MUST be re-evaluated against the new version.
Task created at commit abc123
Kernel evolves to commit def456
Task is STALE -- version mismatch
Human review required before execution
Stale tasks are flagged for human review, not auto-executed, because the change they encode may conflict with intervening modifications. The version_pin field makes this check deterministic.
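
The staleness check reduces to comparing the pin with the repository's current commit. A minimal sketch, assuming pins may be abbreviated hashes (as in the `abc123` example) and that the kernel lives in a local git checkout; `is_stale` and `current_head` are hypothetical helper names.

```python
import subprocess


def is_stale(version_pin: str, head_commit: str) -> bool:
    """A task is stale when its pin does not identify the current commit."""
    if not version_pin:
        # An empty pin identifies nothing, so treat it as stale.
        return True
    # Pins may be abbreviated hashes, so compare by prefix.
    return not head_commit.startswith(version_pin)


def current_head(repo_dir: str) -> str:
    # `git rev-parse HEAD` prints the full hash of the checked-out commit.
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

An executor would call `is_stale(task["version_pin"], current_head(repo_dir))` before picking up a task and route stale tasks to human review instead of running them.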
Execution Modes
Two modes, matching the streaming architecture:
Batch (claude -p)
For simple, well-defined edits -- add a handler, fix a bug, update a schema field.
claude -p "Add the quality.score handler to processor.py per this spec: ..." \
  --no-session-persistence --tools "Read,Edit,Write"
The task instruction becomes the prompt. Constraints are injected as system-level rules. Output is captured as a single blob.
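
One way to assemble the batch invocation from a task is sketched below. `build_batch_command` is a hypothetical helper. The `-p`, `--no-session-persistence`, and `--tools` arguments mirror the command shown above; passing constraints through `--append-system-prompt` is an assumption about how "system-level rules" are injected, not something the specification states.

```python
def build_batch_command(instruction: str, constraints: list[str]) -> list[str]:
    """Build the argv list for a single batch-mode run."""
    cmd = [
        "claude", "-p", instruction,       # the task instruction is the prompt
        "--no-session-persistence",
        "--tools", "Read,Edit,Write",
    ]
    if constraints:
        rules = "\n".join(f"- {c}" for c in constraints)
        # Assumption: constraints travel as an appended system prompt.
        cmd += ["--append-system-prompt", rules]
    return cmd
```

Passing the list to `subprocess.run(cmd, capture_output=True, text=True)` would capture the run's standard output as one string. Building an argv list rather than a shell string avoids quoting problems when instructions contain quotes or newlines.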
Streaming (claude_agent_sdk)
For complex multi-step changes that benefit from human-in-the-loop review, tool use, and progressive feedback. Stream events are published to stream.CK.Operator so the web shell can show execution progress.
Choosing a Mode
Use batch for tasks with a single clear edit target. Use streaming for tasks that require multi-file changes, iterative refinement, or when visibility into the execution process matters.
Validation Pipeline
After headless Claude executes a task, the output passes through a four-stage validation pipeline:
| Stage | Check | Failure Action |
|---|---|---|
| 1. Structural | Does the modified file parse correctly? (Python AST, YAML, JSON) | status = failed |
| 2. Ontology | Do produced instances validate against ontology.yaml? | status = failed |
| 3. SHACL | Do produced instances satisfy rules.shacl constraints? | status = failed |
| 4. Constraint | Are all task-specified constraints met? | status = failed |
The rationale for post-execution validation is defence in depth: even though Claude received the specification and constraints, the output must be verified against the authoritative ontology. LLMs can produce plausible but non-conformant output.
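
The pipeline can be modelled as an ordered list of checks that stops at the first failure. The sketch below implements only a structural stage for Python and JSON using the standard library; YAML parsing and the ontology, SHACL, and constraint stages would need project-specific code and are represented by whatever callables the caller supplies. `structural_check` and `first_failing_stage` are hypothetical names.

```python
import ast
import json
from typing import Callable, Optional

# A check receives the file path and its new contents and reports pass/fail.
Check = Callable[[str, str], bool]


def structural_check(path: str, text: str) -> bool:
    """Stage 1: does the modified file still parse?"""
    try:
        if path.endswith(".py"):
            ast.parse(text)
        elif path.endswith(".json"):
            json.loads(text)
        # Other file types (including YAML) are not checked in this sketch.
        return True
    except (SyntaxError, ValueError):
        return False


def first_failing_stage(
    path: str, text: str, stages: list[tuple[str, Check]]
) -> Optional[str]:
    """Run stages in order; return the name of the first failure, or None."""
    for name, check in stages:
        if not check(path, text):
            return name
    return None
```

A non-`None` result maps to `status = failed` in the table above, and the stage name tells the human reviewer where to start looking.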
Post-Execution Flow
On successful validation:
1. Git Commit
Changes committed with a message referencing task_id and decision_id:
[CK.Task] T-a1b2c3d4: Add quality.score handler
Decision: P-e5f6g7h8
Kernel: Delvinator.Core
Target: tool/processor.py
2. Filer Sync
Git changes propagated to SeaweedFS filer, making them available for volume mounting.
3. Operator Reconciliation
CK.Operator detects changed files and updates cluster resources. The new processor code becomes live.
4. Provenance Sealing
Task marked as completed with a prov:Activity record linking to the consensus decision:
prov:Activity:
  id: "ckp://Action#CK.Task/execute-T-a1b2c3d4"
  prov:wasGeneratedBy: "ckp://Action#CK.Consensus/approve-{ts}"
  prov:used:
    - "ckp://Kernel#Delvinator.Core:v1.0/ontology.yaml"
    - "ckp://Kernel#Delvinator.Core:v1.0/tool/processor.py"
  prov:startedAtTime: "2026-04-05T10:01:00Z"
  prov:endedAtTime: "2026-04-05T10:01:45Z"
Architecture Overview
CK.Consensus approve
|
v
Tasks generated (stored in CK.Consensus DATA loop)
|
v
Task executor picks up pending tasks
|
v
Headless Claude Code executes (claude -p or claude_agent_sdk)
|
v
Output validated against target kernel's ontology
|
v
Changes committed (git) -> filer synced -> CK.Operator reconciles
Conformance Requirements
| Criterion | Level |
|---|---|
| Tasks MUST be produced by CK.Consensus, not created ad-hoc | REQUIRED |
| Each task MUST reference its parent consensus decision | REQUIRED |
| Task execution MUST validate output against the target kernel's ontology | REQUIRED |
| Failed tasks MUST NOT be committed to git | REQUIRED |
| Failed tasks MUST NOT be auto-retried without human review | REQUIRED |
| Each completed task MUST produce a prov:Activity linking to the consensus decision | REQUIRED |
| Version-pinned tasks MUST be re-evaluated if the kernel version changed | REQUIRED |
| Headless Claude Code MUST use --no-session-persistence | REQUIRED |