Task Execution Engine

The consensus loop produces approved tasks. The subagent proves that Claude Code can modify kernel files. The Task Execution Engine bridges these: it picks up approved tasks, executes them via headless Claude Code, validates the output against the kernel's ontology, and seals the result with full provenance linking back to the consensus decision.

Separation of Concerns

The governance engine (CK.Consensus) decides what should change and why. The task execution engine decides how to implement that change. This separation allows different execution strategies (manual, automated, AI-assisted) without changing the governance protocol.

Task Structure

Every task is a concrete work item produced by a consensus decision:

```yaml
task_id: "T-a1b2c3d4"
decision_id: "P-e5f6g7h8"
target_file: "concepts/Delvinator.Core/tool/processor.py"
instruction: "Add @on('quality.score') handler per the approved specification"
constraints:
  - "Output MUST validate against ontology.yaml QualityScore schema"
  - "MUST produce prov:wasGeneratedBy linking to this task"
executor: "headless-claude-code"
status: "pending"
version_pin: "abc123"               # git commit hash -- task applies to this version
created_at: "2026-04-05T10:00:00Z"
prov:wasGeneratedBy: "ckp://Action#CK.Consensus/approve-{ts}"
```

| Field | Required | Description |
| --- | --- | --- |
| task_id | MUST | Unique identifier (T-{hex8}) |
| decision_id | MUST | Reference to the parent consensus decision |
| target_file | MUST | Filesystem path to the file being modified |
| instruction | MUST | Natural language instruction for the executor |
| constraints | MUST | List of validation rules the output must satisfy |
| executor | MUST | Execution method (headless-claude-code) |
| status | MUST | Current lifecycle state |
| version_pin | MUST | Git commit hash the task targets |
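
The required fields can be checked mechanically before a task is queued. The following is a minimal sketch, not part of the engine: `check_task` and its error strings are hypothetical, and the pattern assumes that `T-{hex8}` means eight lowercase hexadecimal digits.

```python
import re

# Field names are taken from the task structure; the checker itself is hypothetical.
REQUIRED_FIELDS = (
    "task_id", "decision_id", "target_file", "instruction",
    "constraints", "executor", "status", "version_pin",
)
# Assumption: T-{hex8} means "T-" followed by eight lowercase hex digits.
TASK_ID_PATTERN = re.compile(r"T-[0-9a-f]{8}")


def check_task(task: dict) -> list[str]:
    """Return a list of problems; an empty list means the record looks well formed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in task]
    task_id = task.get("task_id", "")
    if "task_id" in task and not TASK_ID_PATTERN.fullmatch(str(task_id)):
        problems.append(f"task_id does not match T-{{hex8}}: {task_id!r}")
    if "constraints" in task and not isinstance(task["constraints"], list):
        problems.append("constraints must be a list")
    return problems
```

A record that passes this check is only structurally complete; whether its `decision_id` refers to a real consensus decision still has to be verified separately.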

Task Lifecycle

Tasks progress through four states:

| State | Description | Transitions To |
| --- | --- | --- |
| pending | Generated by consensus, awaiting execution | executing |
| executing | Headless Claude Code is working on it | completed or failed |
| completed | Output validated, changes committed | Terminal |
| failed | Validation failed or execution error -- requires human review | pending (after re-evaluation) |

pending -> executing -> completed
                    \-> failed -> (human review) -> pending
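
The lifecycle can be expressed as a small transition table. This is a sketch under stated assumptions: `ALLOWED_TRANSITIONS`, `transition`, and the `human_reviewed` flag are hypothetical names used for illustration, not part of the engine.

```python
# Transition table taken from the lifecycle states; names are illustrative only.
ALLOWED_TRANSITIONS = {
    "pending": {"executing"},
    "executing": {"completed", "failed"},
    "completed": set(),      # terminal state
    "failed": {"pending"},   # only after human review
}


def transition(current: str, new: str, human_reviewed: bool = False) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if new not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current} -> {new}")
    if current == "failed" and not human_reviewed:
        # A failed task may only return to pending once a human has reviewed it.
        raise PermissionError("failed tasks need human review before returning to pending")
    return new
```

Making the review flag an explicit argument means a scheduler cannot move a failed task back to pending by accident; it has to state that review took place.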

No Auto-Retry

Failed tasks MUST NOT be auto-retried without human review. A task that fails validation may have produced structurally incorrect output. Auto-retrying could compound the error. Human review ensures the failure is understood before re-execution.

Task Operations

Tasks support three operations beyond simple execution:

Split

One task becomes multiple subtasks for parallel execution. Each subtask references the parent task via prov:wasStartedBy.

```yaml
# Parent task
task_id: "T-parent"
instruction: "Add quality scoring with validation and reporting"
---
# Subtask 1
task_id: "T-parent-scoring"
parent_task: "T-parent"
instruction: "Implement the quality scoring algorithm"
prov:wasStartedBy: "ckp://Task#T-parent"
---
# Subtask 2
task_id: "T-parent-validation"
parent_task: "T-parent"
instruction: "Add SHACL validation for QualityScore instances"
prov:wasStartedBy: "ckp://Task#T-parent"
```
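
A split can be sketched as a function that derives subtask records from a parent. The helper `split_task` and the suffix-based naming are assumptions drawn from the example above; the source does not define how subtask identifiers are generated.

```python
def split_task(parent: dict, parts: dict[str, str]) -> list[dict]:
    """Build one subtask per entry in `parts` (suffix -> instruction).

    Hypothetical helper: the "<parent id>-<suffix>" naming mirrors the
    example records and is not a documented rule.
    """
    parent_id = parent["task_id"]
    return [
        {
            "task_id": f"{parent_id}-{suffix}",
            "parent_task": parent_id,
            "instruction": instruction,
            # Each subtask points back at its parent for provenance.
            "prov:wasStartedBy": f"ckp://Task#{parent_id}",
        }
        for suffix, instruction in parts.items()
    ]
```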

Evolve

A task is refined through further consensus, producing a replacement task with a reference to the original. This happens when the initial task specification was too vague or when requirements change between creation and execution.

Merge

Multiple tasks converging on the same file are combined into a single task to prevent conflicts. If tasks T1 and T2 both target tool/processor.py, merging them into T3 ensures a single coherent edit.
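
Finding merge candidates amounts to grouping pending tasks by target file. The sketch below assumes tasks are plain dictionaries with the fields from the task structure; `merge_candidates` is a hypothetical name, and how the merged instruction is written is left to consensus.

```python
from collections import defaultdict


def merge_candidates(tasks: list[dict]) -> dict[str, list[str]]:
    """Map each target file to the task ids that touch it, keeping only files with 2+ tasks."""
    by_file: dict[str, list[str]] = defaultdict(list)
    for task in tasks:
        by_file[task["target_file"]].append(task["task_id"])
    return {path: ids for path, ids in by_file.items() if len(ids) > 1}
```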

Version Pinning

Each task targets a specific kernel version (git commit hash). If the kernel evolves between task creation and execution, the task MUST be re-evaluated against the new version.

Task created at commit abc123
  Kernel evolves to commit def456
    Task is STALE -- version mismatch
      Human review required before execution

Stale tasks are flagged for human review, not auto-executed, because the change they encode may conflict with intervening modifications. The version_pin field makes this check deterministic.
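
A staleness check could look like the sketch below. It assumes the kernel lives in a git working tree and that `version_pin` may be an abbreviated hash (as in `abc123`), so it compares by prefix; both function names are hypothetical.

```python
import subprocess


def current_commit(repo_dir: str) -> str:
    """Return the full hash of HEAD in the given git working tree."""
    result = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def is_stale(task: dict, head: str) -> bool:
    """True when the task was pinned to a commit other than `head`."""
    pin = task.get("version_pin", "")
    if not pin:
        return True  # no pin recorded: treat as stale and send to review
    # Assumption: pins may be abbreviated, so compare as a prefix of HEAD.
    return not head.startswith(pin)
```

An executor would call `is_stale(task, current_commit(repo_dir))` immediately before execution and flag the task for review instead of running it when the result is true.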

Execution Modes

The engine supports two execution modes, matching the streaming architecture:

Batch (claude -p)

For simple, well-defined edits -- add a handler, fix a bug, update a schema field.

```bash
claude -p "Add the quality.score handler to processor.py per this spec: ..." \
  --no-session-persistence --tools "Read,Edit,Write"
```

The task instruction becomes the prompt. Constraints are injected as system-level rules. Output is captured as a single blob.
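
Assembling the batch invocation from a task record can be sketched as follows. The flags are the ones shown in the command above. The source does not say which mechanism injects constraints as system-level rules, so this sketch simply appends them to the prompt text; `build_batch_command` is a hypothetical helper.

```python
def build_batch_command(task: dict, tools: tuple[str, ...] = ("Read", "Edit", "Write")) -> list[str]:
    """Return an argv list for a one-shot headless run of the given task."""
    rules = "\n".join(f"- {constraint}" for constraint in task["constraints"])
    # Assumption: constraints are carried in the prompt; the real engine may
    # inject them through a different channel.
    prompt = (
        f"{task['instruction']}\n\n"
        f"Target file: {task['target_file']}\n\n"
        f"Constraints:\n{rules}"
    )
    return ["claude", "-p", prompt, "--no-session-persistence", "--tools", ",".join(tools)]
```

Passing the list to `subprocess.run(argv, capture_output=True, text=True)` would capture the output as a single string, and using a list rather than a shell string avoids quoting problems in the instruction text.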

Streaming (claude_agent_sdk)

For complex multi-step changes that benefit from human-in-the-loop review, tool use, and progressive feedback. Stream events are published to stream.CK.Operator so the web shell can show execution progress.

Choosing a Mode

Use batch for tasks with a single clear edit target. Use streaming for tasks that require multi-file changes, iterative refinement, or when visibility into the execution process matters.
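
The guidance above can be written down as a simple rule of thumb. This is only an illustration: `choose_mode` and its parameters are hypothetical, and the task record as described carries a single `target_file`, so the list of files would have to come from elsewhere.

```python
def choose_mode(
    target_files: list[str],
    needs_iteration: bool = False,
    needs_visibility: bool = False,
) -> str:
    """Pick "batch" for a single clear edit target, otherwise "streaming"."""
    if len(target_files) > 1 or needs_iteration or needs_visibility:
        return "streaming"
    return "batch"
```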

Validation Pipeline

After headless Claude executes a task, the output passes through a four-stage validation pipeline:

| Stage | Check | Failure Action |
| --- | --- | --- |
| 1. Structural | Does the modified file parse correctly? (Python AST, YAML, JSON) | status = failed |
| 2. Ontology | Do produced instances validate against ontology.yaml? | status = failed |
| 3. SHACL | Do produced instances satisfy rules.shacl constraints? | status = failed |
| 4. Constraint | Are all task-specified constraints met? | status = failed |

The rationale for post-execution validation is defence in depth: even though Claude received the specification and constraints, the output must be verified against the authoritative ontology. LLMs can produce plausible but non-conformant output.
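
The staged, stop-at-first-failure shape of the pipeline can be sketched as below. Only the structural stage is implemented, using the standard library for Python and JSON; YAML is omitted because it needs a third-party parser. The ontology, SHACL, and constraint stages would be supplied as further callables. All names here are hypothetical.

```python
import ast
import json
from typing import Callable, Optional

# A stage takes (path, file text) and returns an error message, or None on success.
Check = Callable[[str, str], Optional[str]]


def structural_check(path: str, text: str) -> Optional[str]:
    """Stage 1: the modified file must parse (Python and JSON only in this sketch)."""
    try:
        if path.endswith(".py"):
            ast.parse(text)
        elif path.endswith(".json"):
            json.loads(text)
    except (SyntaxError, ValueError) as exc:
        return f"parse error: {exc}"
    return None


def run_pipeline(path: str, text: str, stages: list[tuple[str, Check]]) -> Optional[str]:
    """Run stages in order; return the first failure, or None if all pass."""
    for name, check in stages:
        error = check(path, text)
        if error is not None:
            return f"{name}: {error}"  # caller sets status = failed
    return None
```

Running the cheap structural stage first means a file that does not even parse never reaches the more expensive ontology and SHACL checks.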

Post-Execution Flow

On successful validation:

1. Git Commit

Changes are committed with a message that references task_id and decision_id:

[CK.Task] T-a1b2c3d4: Add quality.score handler

Decision: P-e5f6g7h8
Kernel: Delvinator.Core
Target: tool/processor.py
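
Generating the message from task fields keeps the format consistent across commits. A minimal sketch, assuming the layout shown in the example; `commit_message` is a hypothetical helper and the one-line summary is supplied by the caller.

```python
def commit_message(task_id: str, summary: str, decision_id: str, kernel: str, target: str) -> str:
    """Format a commit message in the [CK.Task] layout shown in the example."""
    return (
        f"[CK.Task] {task_id}: {summary}\n"
        "\n"
        f"Decision: {decision_id}\n"
        f"Kernel: {kernel}\n"
        f"Target: {target}\n"
    )
```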

2. Filer Sync

Git changes are propagated to the SeaweedFS filer, making them available for volume mounting.

3. Operator Reconciliation

CK.Operator detects changed files and updates cluster resources. The new processor code becomes live.

4. Provenance Sealing

The task is marked as completed with a prov:Activity record linking to the consensus decision:

```yaml
prov:Activity:
  id: "ckp://Action#CK.Task/execute-T-a1b2c3d4"
  prov:wasGeneratedBy: "ckp://Action#CK.Consensus/approve-{ts}"
  prov:used:
    - "ckp://Kernel#Delvinator.Core:v1.0/ontology.yaml"
    - "ckp://Kernel#Delvinator.Core:v1.0/tool/processor.py"
  prov:startedAtTime: "2026-04-05T10:01:00Z"
  prov:endedAtTime: "2026-04-05T10:01:45Z"
```
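
Building the record in code might look like the sketch below. The keys and the `id` layout follow the example record; `prov_activity` is a hypothetical helper, and it assumes timezone-aware datetimes so the timestamps can be rendered in UTC.

```python
from datetime import datetime, timezone


def prov_activity(
    task_id: str,
    approval_uri: str,
    used: list[str],
    started: datetime,
    ended: datetime,
) -> dict:
    """Build a prov:Activity record for a completed task (expects aware datetimes)."""

    def stamp(moment: datetime) -> str:
        # Render as UTC in the same form as the example timestamps.
        return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return {
        "prov:Activity": {
            "id": f"ckp://Action#CK.Task/execute-{task_id}",
            "prov:wasGeneratedBy": approval_uri,
            "prov:used": list(used),
            "prov:startedAtTime": stamp(started),
            "prov:endedAtTime": stamp(ended),
        }
    }
```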

Architecture Overview

CK.Consensus approve
    |
    v
Tasks generated (stored in CK.Consensus DATA loop)
    |
    v
Task executor picks up pending tasks
    |
    v
Headless Claude Code executes (claude -p or claude_agent_sdk)
    |
    v
Output validated against target kernel's ontology
    |
    v
Changes committed (git) -> filer synced -> CK.Operator reconciles

Conformance Requirements

| Criterion | Level |
| --- | --- |
| Tasks MUST be produced by CK.Consensus, not created ad-hoc | REQUIRED |
| Each task MUST reference its parent consensus decision | REQUIRED |
| Task execution MUST validate output against the target kernel's ontology | REQUIRED |
| Failed tasks MUST NOT be committed to git | REQUIRED |
| Failed tasks MUST NOT be auto-retried without human review | REQUIRED |
| Each completed task MUST produce a prov:Activity linking to the consensus decision | REQUIRED |
| Version-pinned tasks MUST be re-evaluated if the kernel version changed | REQUIRED |
| Headless Claude Code MUST use --no-session-persistence | REQUIRED |

Released under the MIT License.