Streaming

The Problem: Batch Responses Are Not Enough

Before v3.5.9, every LLM-backed action produced a single blob response. The kernel invoked claude -p, waited for completion, and published the full result to result.{kernel}. For a 30-second Claude response, the user saw nothing for 30 seconds, then everything at once.

This is architecturally correct (the result IS the sealed instance) but operationally frustrating. Users expect progressive rendering -- they want to see the response being constructed, not just the final output.

Two Invocation Modes

v3.6 establishes two modes for invoking Claude from within a concept kernel:

Mode A: Batch (`claude -p`, subprocess)

python

result = subprocess.run(
    ["claude", "-p", prompt, "--tools", "", "--no-session-persistence"],
    cwd=ck_dir, capture_output=True, text=True, timeout=300
)

Output captured as a single blob. Sealed as instance. Published as one result.{kernel} message. Simple, no streaming. Appropriate for:

Actions that produce structured data (compliance checks, status queries)
Environments without NATS WSS (headless, CI pipelines)
Short responses where streaming overhead is not worth it

Mode B: Streaming (`claude_agent_sdk`, async generator)

python

from claude_agent_sdk import query, ClaudeAgentOptions

async for event in query(
    prompt=prompt,
    options=ClaudeAgentOptions(model="sonnet", tools=[])
):
    await nc.publish(
        f"stream.{kernel}",
        json.dumps(event_to_nats(event)).encode()
    )

Each token/event published to stream.{kernel} in real-time. The browser subscribes and renders progressive chat bubbles. Appropriate for:

Interactive conversation actions (message, analyze, summarize)
Any action backed by EXTENDS to CK.Claude
Long-running Claude responses where feedback matters

Stream Topic Separation

Each kernel has a stream topic: stream.{kernel}. This is separate from result.{kernel} because streaming events are transient fragments, while results are complete, sealed responses.

Topic	Content	Persistence	Consumer
`input.{kernel}`	Action requests	Transient	Kernel processor
`result.{kernel}`	Complete action results	Sealed as instance	Web shell, other kernels
`event.{kernel}`	Lifecycle events	Logged	Web shell, edge subscribers
`stream.{kernel}`	Token-by-token LLM output	Transient (not stored)	Web shell only

Why a Separate Topic

The stream topic is separate from result.{kernel} because:

Different subscribers. The result topic carries sealed instances -- consumed by other kernels, stored in the DATA loop. The stream topic carries ephemeral tokens -- consumed by browser UIs for progressive rendering.
Different lifetimes. Result messages are durable (they become instances). Stream messages are fire-and-forget (they exist only during rendering).
Different volumes. A 2,000-token Claude response generates ~2,000 stream events but ONE result event. Mixing them on the same topic would bury results in stream noise.
Optional subscription. Kernels that do not need progressive rendering never subscribe to stream.*. The overhead is zero for non-interactive use cases.

Event Type Mapping

CK.Lib.Py maps claude_agent_sdk events to NATS stream messages. The full mapping covers 7 event types plus suppressed control events:

claude_agent_sdk Event	NATS `type` Field	Description	Browser Rendering
`content_block_start`	`content_block_start`	New content block beginning	Creates new bubble
`content_block_delta`	`content_block_delta`	Incremental text token	Growing text bubble
`content_block_stop`	`content_block_stop`	Content block complete	Finalizes block
`tool_use`	`tool_use`	Tool invocation (Read, Edit, Bash, etc.)	Collapsible tool block
`tool_result`	`tool_result`	Tool execution result	Result inside tool block
`AssistantMessage`	`AssistantMessage`	Complete assistant turn	Final bubble (removes streaming indicator)
`ResultMessage`	`ResultMessage`	Final result with usage stats	Closes stream
`message_start`, `message_stop`, `ping`	(suppressed)	Control events	Not rendered

Stream Message Structure

Every stream message carries five required fields:

json

{
  "type": "content_block_delta",
  "trace_id": "tx-a8f3c1",
  "kernel_urn": "ckp://Kernel#Delvinator.Core:v1.0",
  "data": {
    "text": "Based on my analysis of the fleet topology, "
  },
  "seq": 42,
  "timestamp": "2026-04-05T16:37:25.123Z"
}

Field	Required	Description
`type`	MUST	Event type from the mapping table above
`trace_id`	MUST	Correlation ID linking all events from one action invocation
`kernel_urn`	MUST	Source kernel -- which kernel's stream this is
`data`	MUST	Event-specific payload
`seq`	MUST	Sequence number for ordering within a trace
`timestamp`	SHOULD	ISO 8601 timestamp of event emission

The trace_id is critical for the web shell: it groups stream events into a single response bubble. Without it, events from concurrent actions on the same kernel would interleave in the UI.

The seq field provides ordering guarantees within a trace. The web shell renders tokens in sequence order, handling out-of-order delivery.

Batch vs. Streaming Decision

CKP supports two LLM invocation modes, selectable per action via the EXTENDS edge config:

Mode	CLI	Use Case	NATS Behaviour
Batch	`claude -p "prompt" --tools "" --no-session-persistence`	Simple analysis, schema generation	Single `result.{kernel}` message
Streaming	`claude_agent_sdk`	Multi-turn reasoning, tool use, human-in-loop	Progressive `stream.{kernel}` events + final `result.{kernel}`

The EXTENDS edge config declares the preferred mode:

yaml

edges:
  outbound:
    - target_kernel: CK.Claude
      predicate: EXTENDS
      config:
        mode: streaming          # batch | streaming
        persona: analytical-reviewer

When to Use Each Mode

Use batch for actions that produce structured data (compliance checks, status queries) or where streaming overhead is not worth it. Use streaming for interactive conversation actions, multi-turn reasoning, and any action where progressive feedback matters.

Stream Callback Pattern

Handlers that stream responses receive a stream callback:

python

@on("analyze")
async def handle_analyze(self, data, stream=None):
    # stream is a callable: await stream(type, payload)
    await stream("content_block_delta", {"text": "Analyzing..."})
    # ... do work ...
    await stream("content_block_delta", {"text": " found 3 issues."})
    return emit("AnalysisCompleted", result=analysis)

The stream callback publishes to stream.{kernel} on NATS. The final return emit(...) publishes to result.{kernel}.

Handler Signature

The NatsKernelLoop handler signature includes the stream parameter:

python

# Handler with streaming support
async def handle_action(body, nc=None, trace_id=None, stream=None):
    if stream:
        await stream("content_block_delta", {"text": "Processing..."})
    ...

The stream parameter is a callback: async stream(event_type: str, data: dict). When present, the handler can emit streaming events. The callback handles NATS publishing, trace_id injection, and kernel_urn tagging.

Backwards compatibility: Handlers that do not accept stream= still work. The NatsKernelLoop inspects the handler's signature at registration time and only passes stream if the handler declares it.

Structured Logging in NatsKernelLoop

v3.5.9 also replaces the [rx]/[tx] print statements in NatsKernelLoop with structured JSON logging:

json

{"ts":"2026-04-05T16:37:25Z","level":"info","kernel":"Delvinator.Core","event":"rx","trace":"tx-a8f3c1","action":"analyze","user":"test26"}
{"ts":"2026-04-05T16:37:25Z","level":"info","kernel":"Delvinator.Core","event":"stream.start","trace":"tx-a8f3c1"}
{"ts":"2026-04-05T16:37:30Z","level":"info","kernel":"Delvinator.Core","event":"stream.end","trace":"tx-a8f3c1","tokens":1847}
{"ts":"2026-04-05T16:37:30Z","level":"info","kernel":"Delvinator.Core","event":"tx","trace":"tx-a8f3c1","topic":"result.Delvinator.Core"}

The _log() method ensures every line includes ts, level, kernel, and event -- matching the structured logging spec from v3.5.2.

Architectural Bridge: Local and Cluster

The streaming architecture bridges LOCAL (developer machine) and CLUSTER (deployed kernel):

LOCAL (developer)                    CLUSTER (deployed)
-----------------                    ------------------
claude CLI                           Kernel processor
  |-- claude_agent_sdk                 |-- claude_agent_sdk
       |-- stream events                    |-- stream events
            |-- stdout                           |-- NATS stream.{kernel}
                                                      |-- Browser (WSS)

Same SDK, same event types, same rendering. A kernel running locally streams to stdout. The same kernel deployed to the cluster streams to NATS. The browser consumes both via the stream.{kernel} subscription. The developer sees the same progressive output in Claude Code CLI and in the web shell.

Web Shell Integration

The web shell's chat view mode handles stream events:

On first content_block_delta for a new trace_id -- create a new bubble with streaming indicator
On subsequent content_block_delta -- append text to the bubble
On tool_use -- create a collapsible tool block within the bubble
On AssistantMessage -- finalize the bubble, remove streaming indicator
On ResultMessage -- close the stream, enable new action input

The body and envelope view modes ignore stream events -- they only render the final result.{kernel} message.

Architectural Consistency Check

Logical Analysis: Streaming Design

Question: Should stream events be stored in the DATA loop?

Answer: No. Stream events are ephemeral rendering artifacts. The sealed instance (published to result.{kernel}) IS the DATA loop artifact. Storing individual tokens would be like storing every keystroke of a document -- the document itself is the meaningful unit. If audit of the full token stream is needed, kernel logs (which are in the DATA loop at data/logs/) already capture the stream events.

Question: What happens if the browser disconnects mid-stream?

Answer: The kernel continues streaming to NATS. The events are published regardless of subscriber presence. When the browser reconnects, it will not see the missed events -- NATS basic pub/sub does not replay. The user sees the final result.{kernel} message (which arrives after the stream completes) but misses the progressive rendering. This is acceptable: the result is never lost, only the rendering experience.

Question: Can two actions on the same kernel stream simultaneously?

Answer: Yes. Each action invocation gets a unique trace_id. Stream events are tagged with the trace_id. The web shell groups events by trace_id into separate bubbles. The NATS topic is shared (stream.{kernel}) but the trace_id provides logical isolation.

Gap identified: The stream topic currently has no backpressure mechanism. A kernel with a very fast Claude response (or a runaway streaming loop) could flood NATS. For the current deployment scale (7 kernels, single users), this is not a concern. For multi-user sessions (v3.5.14), a rate limit or NATS JetStream-backed stream may be needed.

SDK Dependencies

SDK	Language	Minimum Version
`claude_agent_sdk`	Python	v0.1.50+
`@anthropic-ai/claude-agent-sdk`	JavaScript (npm)	v0.2.84+

Version-locked to Claude Code compatibility. The SDK provides the async generator interface that maps Claude's internal event stream to typed events.

Conformance Requirements

Kernels with LLM capability SHOULD use claude_agent_sdk for streaming
Stream events MUST be published to stream.{kernel} topic
Each stream event MUST carry trace_id for correlation
Each stream event MUST carry kernel_urn for source identification
The web shell MUST subscribe to stream.{kernel} and render progressive content
Batch mode (claude -p) remains valid for non-streaming actions
Handlers that do not accept stream= MUST continue to work unchanged

Streaming ​

The Problem: Batch Responses Are Not Enough ​

Two Invocation Modes ​

Mode A: Batch (claude -p, subprocess) ​

Mode B: Streaming (claude_agent_sdk, async generator) ​

Stream Topic Separation ​

Why a Separate Topic ​

Event Type Mapping ​

Stream Message Structure ​

Batch vs. Streaming Decision ​

Stream Callback Pattern ​

Handler Signature ​

Structured Logging in NatsKernelLoop ​

Architectural Bridge: Local and Cluster ​

Web Shell Integration ​

Architectural Consistency Check ​

SDK Dependencies ​

Conformance Requirements ​

Streaming

The Problem: Batch Responses Are Not Enough

Two Invocation Modes

Mode A: Batch (`claude -p`, subprocess)

Mode B: Streaming (`claude_agent_sdk`, async generator)

Stream Topic Separation

Why a Separate Topic

Event Type Mapping

Stream Message Structure

Batch vs. Streaming Decision

Stream Callback Pattern

Handler Signature

Structured Logging in NatsKernelLoop

Architectural Bridge: Local and Cluster

Web Shell Integration

Architectural Consistency Check

SDK Dependencies

Conformance Requirements