DATA Loop -- Knowledge and Production
Chapter 8 of the CKP v3.7 Specification -- Normative
Purpose
The DATA loop is the memory organ of the Material Entity. It is the accumulation of everything the kernel has created, verified, and come to know. Instances live here. Proofs live here. The audit ledger lives here. LLM context lives here. The web surface lives here. Logs live here. Every metadata kind the kernel writes at runtime is a folder under data/. Nothing is ever silently rewritten: sealed files are written once, the ledger is only extended, and any replacement permitted by the mutability policy keeps its full history. The storage volume grows over time and is the kernel's most valuable asset.
The DATA loop exists because knowledge must outlive any individual process execution. A kernel that loses its accumulated data loses its purpose. By isolating knowledge in a dedicated, append-only volume, CKP ensures that identity upgrades and tool changes never risk data loss.
In Description Logic terms, the DATA loop is the ABox. Its contents are individuals -- specific instances of the types defined in the TBox (CK loop).
The Standard Subfolders
The DATA organ root is data/. Inside it are seven standard subfolders into which a kernel's runtime accumulates state; three are required and four are optional. Only the names and roles below are normative -- the internal layout of each subfolder is the kernel's own convention, not part of this spec.
data/ # DATA loop root — every metadata folder lives here
# Mounted at /ck/{kernel}/data/ in pod
# Sourced from /ck-data/<project>/{kernel}/{version}/data/
|- instances/ # parent for all instance records the kernel produces
|- proof/ # verification evidence (PROV-O, hash chain)
|- ledger/ # append-only audit trail of writes
|- index/ # derived search indices
|- llm/ # LLM interaction logs (if applicable)
|- web/ # runtime web data (uploads, generated pages)
+- logs/ # process / runtime logs (stdout, stderr, structured)

| Subfolder | Role | Required? |
|---|---|---|
| instances/ | Parent directory for instance records the kernel produces. Each instance is a folder whose name and shape is the kernel's own decision. | REQUIRED -- every kernel that produces typed output uses this folder. |
| proof/ | Verification evidence -- PROV-O records, hash chains, check outcomes. | REQUIRED -- every conformant kernel emits proofs here. |
| ledger/ | Append-only audit trail of writes. | REQUIRED -- runtime drift is traceable through this. |
| index/ | Derived search indices computed from instances/. Format is the kernel's choice. | OPTIONAL |
| llm/ | LLM interaction logs, context windows, embeddings. | OPTIONAL -- only kernels that use LLMs. |
| web/ | Runtime web data (uploads, generated pages, static cache). | OPTIONAL -- only kernels that serve web content. |
| logs/ | Process/runtime logs. | OPTIONAL -- in addition to stdout/stderr captured by Kubernetes. |
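As a rough illustration of the table above, the following sketch reports which required subfolders are missing from a data/ root and which optional ones are present. The function name `check_data_root` and the shape of the returned report are hypothetical and not part of this spec; only the folder names and their REQUIRED/OPTIONAL status are taken from the table.

```python
from pathlib import Path

# Folder names and REQUIRED/OPTIONAL status are taken from the table above.
REQUIRED_SUBFOLDERS = ("instances", "proof", "ledger")
OPTIONAL_SUBFOLDERS = ("index", "llm", "web", "logs")


def check_data_root(data_root: Path) -> dict:
    """Report required subfolders that are missing and optional ones in use.

    Hypothetical helper: the spec defines the folder names, not this check.
    """
    present = {p.name for p in data_root.iterdir() if p.is_dir()}
    return {
        "missing_required": sorted(set(REQUIRED_SUBFOLDERS) - present),
        "optional_present": sorted(set(OPTIONAL_SUBFOLDERS) & present),
    }
```

A kernel whose data/ root contains only instances/, proof/, and web/ would be reported as missing ledger/ and using the optional web/ folder.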
Kernels are free to organise their own subfolders
The contents inside instances/, proof/, ledger/, etc. are the kernel's prerogative. A task kernel like CK.Task may organise instances/i-task-<guid>/ with its own files (conversation_ref.json, conversation/c-<id>.jsonl, …); a sealed-instance kernel may use instances/instance-<tx>/data.json. Both are valid. The spec does not enforce one kernel's convention as a global standard.
All metadata under data/
Everything the kernel writes at runtime lives under the data/ folder. There is no metadata at <version>/ directly; that level is reserved for organ folders. Inside data/, the seven subfolders above are the namespaces in which kernels organise their accumulated state.
Instances vs Operational Data (v3.7)
Instances (data/instances/...) are the kernel's typed output -- they ARE the data type defined in ontology.yaml. The directory structure inside instances/ is the kernel's own convention.
Operational data (proof/, ledger/, index/, llm/, web/, logs/) is runtime state that supports the kernel's operation but is not the kernel's typed output.
Both live at /ck-data/<project>/{kernel}/{version}/data/, mounted at /ck/{kernel}/data/ in the pod.
PROV-O Provenance
Every instance record MUST include PROV-O provenance fields linking the instance to the action that created it, the operator who authorised it, and the kernel that produced it. This implements the fleet audit principle: every autonomous action traces to its playbook, its executing agent, and its input data.
DANGER
Implementations that omit PROV-O fields from instance records are non-conformant. Provenance is not optional -- it is a MUST-level requirement.
Mandatory Fields
| PROV-O Field | Purpose | Example |
|---|---|---|
| prov:wasGeneratedBy | The action execution that created this instance | ckp://Action#Task.task.create-1773518402000 |
| prov:wasAssociatedWith | The actor (human or system) who authorised the action | ckp://Actor#operator |
| prov:wasAttributedTo | The kernel that produced this instance | ckp://Kernel#AgentKernel:v1.0 |
| prov:generatedAtTime | ISO 8601 timestamp of instance creation | 2026-03-14T20:00:02Z |
| prov:used | Input artifacts consumed by the action | List of CKP URNs |
Example Instance with Provenance
{
"instance_id": "i-task-1773518402",
"prov:wasGeneratedBy": "ckp://Action#Task.task.create-1773518402000",
"prov:wasAssociatedWith": "ckp://Actor#operator",
"prov:wasAttributedTo": "ckp://Kernel#AgentKernel:v1.0",
"prov:generatedAtTime": "2026-03-14T20:00:02Z",
"prov:used": [
"ckp://Kernel#ACME.Cymatics:v1.0/conceptkernel.yaml",
"ckp://Kernel#ACME.Cymatics:v1.0/CLAUDE.md"
]
}

The three-factor audit chain -- GPG + OIDC + SVID -- is the full provenance implementation. Every instance traces to its playbook, its executing agent, and its input data.
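The mandatory-field requirement can be checked mechanically before an instance is accepted. The sketch below is one possible check, not the platform's validator: the function name `provenance_problems` and its message strings are hypothetical, and it covers only field presence, timestamp parsing, and the list shape of prov:used. It does not validate the ckp:// URNs themselves.

```python
from datetime import datetime

# Field names are taken from the Mandatory Fields table.
MANDATORY_PROV_FIELDS = (
    "prov:wasGeneratedBy",
    "prov:wasAssociatedWith",
    "prov:wasAttributedTo",
    "prov:generatedAtTime",
    "prov:used",
)


def provenance_problems(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = [f"missing {name}" for name in MANDATORY_PROV_FIELDS if name not in record]

    timestamp = record.get("prov:generatedAtTime")
    if isinstance(timestamp, str):
        try:
            # fromisoformat() in older Python versions rejects a trailing "Z",
            # so normalise it to an explicit UTC offset first.
            datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
        except ValueError:
            problems.append("prov:generatedAtTime is not an ISO 8601 timestamp")

    used = record.get("prov:used")
    if used is not None and not isinstance(used, list):
        problems.append("prov:used is not a list")
    return problems
```

Applied to the example instance above, the function returns an empty list; removing any of the five fields produces one "missing ..." entry for it.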
DATA Loop NATS Topics
Conformant implementations MUST publish DATA loop events on the following NATS topics. All topics use the pattern ck.{guid}.data.*.
| Topic | When Published |
|---|---|
| ck.{guid}.data.written | New instance written to data/ |
| ck.{guid}.data.indexed | Index files updated |
| ck.{guid}.data.proof-generated | proof/ entry created |
| ck.{guid}.data.ledger-entry | audit.jsonl appended |
| ck.{guid}.data.accessed | data/ read by another kernel (audit) |
| ck.{guid}.data.exported | Dataset derived from data/ for consumers |
| ck.{guid}.data.amended | Instance amendment committed and proof rebuilt |
| ck.{guid}.data.shacl-rejected | SHACL validation failed on write attempt |
| ck.{guid}.data.nats-degraded | Kernel entered degraded state due to NATS unavailability |
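To keep topic names consistent with the table, an implementation might build them through a single helper instead of formatting strings ad hoc. The sketch below assumes a hypothetical `data_topic` function; the event suffixes come from the table, and the guid check reflects the general NATS rule that "." separates subject tokens and "*" and ">" are wildcards. No NATS client is used here.

```python
# Event suffixes are taken from the DATA loop topic table.
DATA_EVENTS = frozenset({
    "written", "indexed", "proof-generated", "ledger-entry", "accessed",
    "exported", "amended", "shacl-rejected", "nats-degraded",
})

# Characters that would split or wildcard a NATS subject token.
_FORBIDDEN_IN_TOKEN = ".*> \t\r\n"


def data_topic(guid: str, event: str) -> str:
    """Build ck.{guid}.data.{event} for a known DATA loop event (hypothetical helper)."""
    if event not in DATA_EVENTS:
        raise ValueError(f"unknown DATA loop event: {event!r}")
    if not guid or any(ch in guid for ch in _FORBIDDEN_IN_TOKEN):
        raise ValueError(f"guid is not a valid single subject token: {guid!r}")
    return f"ck.{guid}.data.{event}"
```

For example, `data_topic("abc123", "written")` yields `ck.abc123.data.written`, while an unknown event name raises `ValueError` rather than producing a topic that no consumer listens on.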
Instance Lifecycle -- Create, Seal, Ledger
The DATA volume is mounted ReadWriteMany, so the kernel can create new instance folders, append ledger entries, and write proof records. The sealed and append-only semantics below are application-level contracts that the platform enforces at the write boundary, not volume-level restrictions. In other words, the volume permits writes; the platform ensures that each sealed file is written exactly once and that each ledger is extended, never rewritten.
Instance lifecycle follows a strict progression:
Create: The tool writes data.json to a new instance directory. The platform validates against rules.shacl, generates proof.json, appends to the ledger, and commits.
Seal: Once data.json is written, it is sealed. For sealed instances, this happens on first write. For task instances, this happens at the task.complete NATS event. No further modification is permitted.
Ledger: Every state transition is recorded in ledger.json with before/after values, timestamps, and actor identity. The ledger is append-only and MUST survive process restarts.
Task State Transitions
| From | Event | To | Side Effects |
|---|---|---|---|
| pending | task.start | in_progress | ledger.json appended |
| in_progress | task.update | in_progress | ledger.json appended with delta |
| in_progress | task.complete | completed | data.json sealed, ledger.json appended, result.{KernelName} published |
| in_progress | task.fail | failed | ledger.json appended, event.{KernelName} published |
| completed | (none) | (terminal) | No further transitions permitted |
| failed | task.retry | pending | New ledger.json entry, counter incremented |
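The transition table can be read as a small lookup-driven state machine. The sketch below is an illustration under assumptions: the function name `apply_event` and the ledger entry keys (`event`, `before`, `after`, `actor`, `timestamp`) are hypothetical, chosen only to mirror the "before/after values, timestamps, and actor identity" the ledger is said to record. Side effects such as sealing data.json or publishing to NATS are deliberately left out.

```python
from datetime import datetime, timezone

# (current state, event) -> next state, copied from the transition table.
TRANSITIONS = {
    ("pending", "task.start"): "in_progress",
    ("in_progress", "task.update"): "in_progress",
    ("in_progress", "task.complete"): "completed",
    ("in_progress", "task.fail"): "failed",
    ("failed", "task.retry"): "pending",
}


def apply_event(state: str, event: str, actor: str):
    """Return (new_state, ledger_entry) or raise ValueError for a disallowed transition."""
    try:
        new_state = TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"transition not permitted: {state!r} on {event!r}") from None
    entry = {
        "event": event,
        "before": state,
        "after": new_state,
        "actor": actor,  # hypothetical field name
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
    return new_state, entry
```

Because `completed` has no outgoing rows in the table, any event applied to it raises `ValueError`, which matches the terminal-state rule.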
Instance Versioning and Mutation Policy
Git on the data/ volume makes instances natively versioned. The kernel's ontology.yaml declares the mutability policy for all instances it produces:
# ontology.yaml -- instance mutability declaration (declare exactly one value)
instance_mutability: sealed # default -- data.json never changes
# instance_mutability: amendments_allowed # additions permitted, proof rebuilt
# instance_mutability: full_versioning # data.json replaceable, full history kept

| Policy | data.json Behaviour | Proof Behaviour | Use Case |
|---|---|---|---|
| sealed | Write-once, never modified | Generated once at creation | Audit records, compliance artifacts |
| amendments_allowed | Original preserved; data_amendment_{ts}.json added | Rebuilt to cover all files | Living documents, accumulating datasets |
| full_versioning | Replaceable; full git history retained | Rebuilt on each replacement | Iterative outputs, draft-to-final workflows |
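One way to picture how a platform could act on the declared policy is a function that decides which file a write may target. The sketch below is an assumption-laden illustration: `Mutability` and `plan_write` are hypothetical names, and the format of the `{ts}` placeholder in data_amendment_{ts}.json is not defined in this chapter, so the caller supplies it as an opaque string. Proof rebuilding and git commits are out of scope.

```python
from enum import Enum


class Mutability(Enum):
    """The three instance_mutability values from ontology.yaml."""
    SEALED = "sealed"
    AMENDMENTS_ALLOWED = "amendments_allowed"
    FULL_VERSIONING = "full_versioning"


def plan_write(policy: Mutability, data_json_exists: bool, ts: str) -> str:
    """Return the file name a write should target, or raise PermissionError.

    `ts` is an opaque timestamp string; its format is an assumption of the caller.
    """
    if not data_json_exists:
        return "data.json"  # first write is allowed under every policy
    if policy is Mutability.SEALED:
        raise PermissionError("sealed instance: data.json is write-once")
    if policy is Mutability.AMENDMENTS_ALLOWED:
        return f"data_amendment_{ts}.json"  # original data.json is preserved
    return "data.json"  # full_versioning: replaced, history kept by git
```

The policy value read from ontology.yaml maps directly onto the enum, for example `Mutability("amendments_allowed")`.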
Write-Once Rule
data.json is NEVER modified after first write for sealed instances. For task instances, lifecycle mutations (pending -> in_progress -> completed) are invoked via NATS and recorded append-only in ledger.json. The data.json file is written exactly once at the task.complete NATS event. Tooling SHALL NOT open data.json between creation and completion.
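At the file level, a write-once guarantee can be approximated with exclusive-creation mode, which fails if the target already exists. The following is a minimal sketch, assuming a hypothetical `seal_data_json` helper; it guards only against a second write through this code path and does not replace the platform's enforcement at the write boundary.

```python
import json
from pathlib import Path


def seal_data_json(instance_dir: Path, payload: dict) -> Path:
    """Write data.json exactly once; a second attempt raises FileExistsError."""
    target = instance_dir / "data.json"
    # Mode "x" is exclusive creation: open() fails if the file already exists,
    # so an existing data.json can never be truncated or overwritten here.
    with open(target, "x", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2, sort_keys=True)
    return target
```

A caller that receives `FileExistsError` should treat it as a violation of the write-once rule rather than retry with a different mode.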
NATS Availability and Durability
Task lifecycle NATS messages MUST use JetStream with at_least_once delivery guarantee. If NATS is unavailable, the following degradation protocol applies:
| Rule | Behaviour |
|---|---|
| 1 | Task state transitions queue locally in data/ledger/pending_events.jsonl |
| 2 | On NATS reconnection, pending events replay in order |
| 3 | If the local queue exceeds 1,000 events, the kernel enters degraded state and publishes ck.{guid}.data.nats-degraded on reconnection |
| 4 | data.json MUST NOT be written without NATS confirmation of the task.complete event |
WARNING
Conformant implementations MUST implement all four degradation rules. The local queue file (pending_events.jsonl) MUST be append-only and MUST survive process restarts. This ensures no state transitions are lost during NATS outages.
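Rules 1 to 3 can be sketched as a small append-only JSON Lines queue. This is an illustration under assumptions, not a reference implementation: the class `PendingEventQueue`, its method names, and the idea of tracking replay progress as an offset held by the caller are all hypothetical. The `publish` argument stands in for whatever NATS client call the kernel uses. Rule 4 (no data.json write without confirmation of task.complete) is not covered here.

```python
import json
import os
from pathlib import Path


class PendingEventQueue:
    """Append-only local queue for task events while NATS is unavailable."""

    def __init__(self, path: Path, max_pending: int = 1000):
        self.path = path                # e.g. data/ledger/pending_events.jsonl
        self.max_pending = max_pending  # rule 3 threshold

    def enqueue(self, event: dict) -> None:
        """Rule 1: append one event as a JSON line and flush it to disk."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        with open(self.path, "a", encoding="utf-8") as fh:
            fh.write(json.dumps(event, sort_keys=True) + "\n")
            fh.flush()
            os.fsync(fh.fileno())  # so the entry survives a process restart

    def read_all(self) -> list:
        if not self.path.exists():
            return []
        with open(self.path, encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]

    def is_degraded(self, replayed: int = 0) -> bool:
        """Rule 3: more than max_pending events are still waiting for replay."""
        return len(self.read_all()) - replayed > self.max_pending

    def replay(self, publish, replayed: int = 0) -> int:
        """Rule 2: publish not-yet-replayed events in order; return the new offset."""
        events = self.read_all()
        for event in events[replayed:]:
            publish(event)
        return len(events)
```

The file is never truncated, which keeps it append-only; the caller is responsible for persisting the returned offset so that a restart does not replay from the beginning.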
Part II Conformance Criteria (DATA Loop)
| ID | Requirement | Level |
|---|---|---|
| L-9 | Every instance MUST include PROV-O provenance fields | Core |
| L-10 | data.json MUST NOT be modified after first write (sealed) or after task.complete (task) | Core |
| L-11 | Instance mutability policy MUST be declared in ontology.yaml and enforced by platform | Core |
| L-12 | NATS degradation protocol (four rules) MUST be implemented | Core |