
DATA Loop -- Knowledge and Production

Chapter 8 of the CKP v3.6 Specification -- Normative

Purpose

The DATA loop is the memory organ of the Material Entity. It is the accumulation of everything the kernel has created, verified, and come to know. Instances live here. Proofs live here. The audit ledger lives here. LLM context lives here. The web surface lives here. Nothing is ever rewritten. The storage volume grows over time and is the kernel's most valuable asset.

The DATA loop exists because knowledge must outlive any individual process execution. A kernel that loses its accumulated data loses its purpose. By isolating knowledge in a dedicated, append-only volume, CKP ensures that identity upgrades and tool changes never risk data loss.

In Description Logic terms, the DATA loop is the ABox. Its contents are individuals -- specific instances of the types defined in the TBox (CK loop).

The Instance Tree

Every tool execution that produces an output creates one instance folder. CKP distinguishes two instance kinds: sealed instances (write-once from first write) and task instances (lifecycle state tracked in ledger.json via NATS; data.json sealed at completion).

```
storage/

# -- SEALED INSTANCE (all non-task CKs) --
|- instance-<short-tx>/
|   |- manifest.json              # who, what, when, bindings
|   |- data.json                  # write-once output sealed on first write
|   |- proof.json                 # validation result (check-type actions)
|   +- ledger.json                # before/after for mutate-type actions

# -- TASK INSTANCE (task kernel) --
|- i-task-{conv_guid}/
|   |- manifest.json              # status, target_ck, goal_id, priority, order
|   |- conversation_ref.json      # { conv_guid, path } pointer to agent session
|   |- data.json                  # write-once -- sealed at task.complete NATS event ONLY
|   |- ledger.json                # append-only state log -- all mutations via NATS
|   +- conversation/              # operate-type: append-only session records
|       |- c-{conv_id_1}.jsonl    #   first session
|       +- c-{conv_id_2}.jsonl    #   resumed session

# -- SHARED STORAGE --
|- proof/
|- ledger/
|   +- audit.jsonl
|- index/
|   |- by_timestamp.json
|   |- by_task_id.json
|   +- by_confidence.json
|- llm/
|   |- context.jsonl
|   |- memory.json
|   +- embeddings/
+- web/
```

Sealed Instances vs Task Instances

| Aspect | Sealed Instance | Task Instance |
| --- | --- | --- |
| Directory pattern | `instance-<short-tx>/` | `i-task-{conv_guid}/` |
| `data.json` sealed | On first write | At `task.complete` NATS event only |
| State tracking | None (atomic write) | `ledger.json` via NATS lifecycle events |
| Conversation records | Not present | `conversation/` subdirectory (append-only) |
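The seal-on-first-write rule for sealed instances can be illustrated with a short sketch. The helper name and manifest shape are assumptions for illustration; only the `instance-<short-tx>/` layout and the write-once rule come from the specification:

```python
import json
import os


def write_sealed_instance(storage_root, short_tx, manifest, data):
    """Create a sealed instance folder; data.json is write-once.

    Hypothetical helper -- names are not part of the CKP spec.
    """
    inst_dir = os.path.join(storage_root, f"instance-{short_tx}")
    os.makedirs(inst_dir, exist_ok=True)
    # open mode 'x' fails if data.json already exists,
    # enforcing seal-on-first-write at the filesystem level
    with open(os.path.join(inst_dir, "data.json"), "x") as f:
        json.dump(data, f, indent=2)
    with open(os.path.join(inst_dir, "manifest.json"), "w") as f:
        json.dump(manifest, f, indent=2)
    return inst_dir
```

A second write attempt against the same instance raises `FileExistsError`, which is one simple way to make the seal mechanical rather than a convention.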

PROV-O Provenance

Every instance record MUST include PROV-O provenance fields linking the instance to the action that created it, the operator who authorised it, and the kernel that produced it. This implements the fleet audit principle: every autonomous action traces to its playbook, its executing agent, and its input data.

DANGER

Implementations that omit PROV-O fields from instance records are non-conformant. Provenance is not optional -- it is a MUST-level requirement.

Mandatory Fields

| PROV-O Field | Purpose | Example |
| --- | --- | --- |
| `prov:wasGeneratedBy` | The action execution that created this instance | `ckp://Action#Task.task.create-1773518402000` |
| `prov:wasAssociatedWith` | The actor (human or system) who authorised the action | `ckp://Actor#operator` |
| `prov:wasAttributedTo` | The kernel that produced this instance | `ckp://Kernel#AgentKernel:v1.0` |
| `prov:generatedAtTime` | ISO 8601 timestamp of instance creation | `2026-03-14T20:00:02Z` |
| `prov:used` | Input artifacts consumed by the action | List of CKP URNs |

Example Instance with Provenance

```json
{
  "instance_id":              "i-task-1773518402",
  "prov:wasGeneratedBy":      "ckp://Action#Task.task.create-1773518402000",
  "prov:wasAssociatedWith":   "ckp://Actor#operator",
  "prov:wasAttributedTo":     "ckp://Kernel#AgentKernel:v1.0",
  "prov:generatedAtTime":     "2026-03-14T20:00:02Z",
  "prov:used": [
    "ckp://Kernel#ACME.Cymatics:v1.0/conceptkernel.yaml",
    "ckp://Kernel#ACME.Cymatics:v1.0/CLAUDE.md"
  ]
}
```
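A conformance check for the mandatory fields might look like the following sketch. The helper name is an assumption; the field list itself is normative:

```python
# The five MUST-level PROV-O fields from the table above
REQUIRED_PROV_FIELDS = (
    "prov:wasGeneratedBy",
    "prov:wasAssociatedWith",
    "prov:wasAttributedTo",
    "prov:generatedAtTime",
    "prov:used",
)


def missing_prov_fields(record):
    """Return the mandatory PROV-O fields absent from an instance record.

    A non-empty result means the record is non-conformant.
    """
    return [field for field in REQUIRED_PROV_FIELDS if field not in record]
```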

The three-factor audit chain -- GPG + OIDC + SVID -- is the full provenance implementation. Every instance traces to its playbook, its executing agent, and its input data.

DATA Loop NATS Topics

Conformant implementations MUST publish the following NATS topics for DATA loop events. All topics use the pattern ck.{guid}.data.*.

| Topic | When Published |
| --- | --- |
| `ck.{guid}.data.written` | New instance written to `storage/` |
| `ck.{guid}.data.indexed` | Index files updated |
| `ck.{guid}.data.proof-generated` | `proof/` entry created |
| `ck.{guid}.data.ledger-entry` | `audit.jsonl` appended |
| `ck.{guid}.data.accessed` | `storage/` read by another kernel (audit) |
| `ck.{guid}.data.exported` | Dataset derived from `storage/` for consumers |
| `ck.{guid}.data.amended` | Instance amendment committed and proof rebuilt |
| `ck.{guid}.data.shacl-rejected` | SHACL validation failed on write attempt |
| `ck.{guid}.data.nats-degraded` | Kernel entered degraded state due to NATS unavailability |
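An implementation might derive and guard these topic names as in this sketch; the function and constant names are assumptions, while the event suffixes and the `ck.{guid}.data.*` pattern are normative:

```python
# Event suffixes drawn from the DATA loop topic table
DATA_EVENTS = frozenset({
    "written", "indexed", "proof-generated", "ledger-entry",
    "accessed", "exported", "amended", "shacl-rejected", "nats-degraded",
})


def data_topic(guid, event):
    """Build a ck.{guid}.data.* topic name, rejecting unknown events."""
    if event not in DATA_EVENTS:
        raise ValueError(f"unknown DATA loop event: {event}")
    return f"ck.{guid}.data.{event}"
```

Rejecting unknown suffixes at publish time keeps typos from silently creating topics that no conformant subscriber watches.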

Instance Lifecycle -- Create, Seal, Ledger

Instance lifecycle follows a strict progression:

Create: The tool writes data.json to a new instance directory. The platform validates against rules.shacl, generates proof.json, appends to the ledger, and commits.

Seal: Once data.json is written, it is sealed. For sealed instances, this happens on first write. For task instances, this happens at the task.complete NATS event. No further modification is permitted.

Ledger: Every state transition is recorded in ledger.json with before/after values, timestamps, and actor identity. The ledger is append-only and MUST survive process restarts.
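The ledger step can be sketched as a single append. The JSON Lines layout and field names here are assumptions; the spec only mandates before/after values, a timestamp, and actor identity per entry:

```python
import json
from datetime import datetime, timezone


def append_ledger_entry(ledger_path, actor, before, after):
    """Append one state transition to the instance ledger.

    Opening in mode 'a' keeps the file append-only; each entry is one
    JSON line so entries survive restarts without rewriting the file.
    """
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "before": before,
        "after": after,
    }
    with open(ledger_path, "a") as f:
        f.write(json.dumps(entry) + "\n")
    return entry
```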

Task State Transitions

| From | Event | To | Side Effects |
| --- | --- | --- | --- |
| `pending` | `task.start` | `in_progress` | `ledger.json` appended |
| `in_progress` | `task.update` | `in_progress` | `ledger.json` appended with delta |
| `in_progress` | `task.complete` | `completed` | `data.json` sealed, `ledger.json` appended, `result.{KernelName}` published |
| `in_progress` | `task.fail` | `failed` | `ledger.json` appended, `event.{KernelName}` published |
| `completed` | (none) | (terminal) | No further transitions permitted |
| `failed` | `task.retry` | `pending` | New `ledger.json` entry, counter incremented |
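The transition table maps directly onto a lookup, as in this sketch (names are illustrative; the state and event strings are the spec's own):

```python
# (current state, event) -> next state; completed is terminal and absent
TASK_TRANSITIONS = {
    ("pending", "task.start"): "in_progress",
    ("in_progress", "task.update"): "in_progress",
    ("in_progress", "task.complete"): "completed",
    ("in_progress", "task.fail"): "failed",
    ("failed", "task.retry"): "pending",
}


def next_state(current, event):
    """Return the state after applying event, rejecting illegal moves."""
    try:
        return TASK_TRANSITIONS[(current, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {current} + {event}") from None
```

Because `completed` has no outgoing entries, any event against a completed task is rejected, which encodes the terminal row of the table.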

Instance Versioning and Mutation Policy

Git on the storage/ volume makes instances natively versioned. The kernel's ontology.yaml declares the mutability policy for all instances it produces:

```yaml
# ontology.yaml -- instance mutability declaration
instance_mutability: sealed               # default -- data.json never changes
instance_mutability: amendments_allowed   # additions permitted, proof rebuilt
instance_mutability: full_versioning      # data.json replaceable, full history kept
```
| Policy | `data.json` Behaviour | Proof Behaviour | Use Case |
| --- | --- | --- | --- |
| `sealed` | Write-once, never modified | Generated once at creation | Audit records, compliance artifacts |
| `amendments_allowed` | Original preserved; `data_amendment_{ts}.json` added | Rebuilt to cover all files | Living documents, accumulating datasets |
| `full_versioning` | Replaceable; full git history retained | Rebuilt on each replacement | Iterative outputs, draft-to-final workflows |
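Enforcement of the policy table can reduce to a single guard before any write to data.json. This is a sketch under the assumption that amendments always land in separate `data_amendment_{ts}.json` files, so data.json itself is replaceable only under `full_versioning`:

```python
def write_allowed(policy, data_json_exists):
    """Decide whether a write to data.json may proceed under a policy."""
    if policy == "sealed":
        return not data_json_exists      # write-once, never modified
    if policy == "amendments_allowed":
        return not data_json_exists      # original preserved; amend via new files
    if policy == "full_versioning":
        return True                      # git history retains prior versions
    raise ValueError(f"unknown mutability policy: {policy}")
```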

Write-Once Rule

data.json is NEVER modified after first write for sealed instances. For task instances, lifecycle mutations (pending -> in_progress -> completed) are invoked via NATS and recorded append-only in ledger.json; data.json itself is written exactly once, at the task.complete NATS event. No tooling SHALL open data.json for writing between task creation and completion.

NATS Availability and Durability

Task lifecycle NATS messages MUST use JetStream with at_least_once delivery guarantee. If NATS is unavailable, the following degradation protocol applies:

| Rule | Behaviour |
| --- | --- |
| 1 | Task state transitions queue locally in `storage/ledger/pending_events.jsonl` |
| 2 | On NATS reconnection, pending events replay in order |
| 3 | If the local queue exceeds 1,000 events, the kernel enters degraded state and publishes `ck.{guid}.data.nats-degraded` on reconnection |
| 4 | `data.json` MUST NOT be written without NATS confirmation of the `task.complete` event |
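The queueing side of these rules can be sketched as a small local queue over the pending-events file. Class and method names are assumptions; the file path, append-only behaviour, ordering, and 1,000-event threshold come from the rules above:

```python
import json


class PendingEventQueue:
    """Local append-only queue for lifecycle events during a NATS outage."""

    DEGRADED_THRESHOLD = 1_000

    def __init__(self, path):
        self.path = path  # e.g. storage/ledger/pending_events.jsonl

    def enqueue(self, event):
        # Rule 1: queue locally; mode 'a' keeps the file append-only
        # so queued events survive process restarts
        with open(self.path, "a") as f:
            f.write(json.dumps(event) + "\n")

    def pending(self):
        try:
            with open(self.path) as f:
                return [json.loads(line) for line in f]
        except FileNotFoundError:
            return []

    def is_degraded(self):
        # Rule 3: more than 1,000 queued events => degraded state
        return len(self.pending()) > self.DEGRADED_THRESHOLD

    def replay(self, publish):
        # Rule 2: on reconnection, replay pending events in order
        for event in self.pending():
            publish(event)
```

Rule 4 sits outside the queue itself: the completion path must still block the data.json write until the `task.complete` publish is confirmed, rather than treating a queued event as confirmation.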

WARNING

Conformant implementations MUST implement all four degradation rules. The local queue file (pending_events.jsonl) MUST be append-only and MUST survive process restarts. This ensures no state transitions are lost during NATS outages.

Part II Conformance Criteria (DATA Loop)

| ID | Requirement | Level |
| --- | --- | --- |
| L-9 | Every instance MUST include PROV-O provenance fields | Core |
| L-10 | `data.json` MUST NOT be modified after first write (sealed) or after `task.complete` (task) | Core |
| L-11 | Instance mutability policy MUST be declared in `ontology.yaml` and enforced by platform | Core |
| L-12 | NATS degradation protocol (four rules) MUST be implemented | Core |

Released under the MIT License.