# Persistent Scheduler State

## Status

Initial implementation in progress. The branch currently implements internal
memory-v2 scheduler observation tables, no-op observation commits, same-space
durable dirty marking, cross-space read-index mirrors, a snapshot query API,
runner-side rehydration primitives, and subscription-time rehydration for
actions recreated during piece startup. Durable process graph generations and
stronger implementation/runtime fingerprints remain version-1 placeholders.

## Summary

The runner already observes most of the information needed to restart a piece
without re-running every node: each scheduler action records read paths, shallow
read paths, actual changed write paths, declared write paths, materializer write
envelopes, action type, and dependency edges derived from those paths. That
state is currently held in process memory. When the process restarts, the
runtime reconstructs the pattern graph and re-runs nodes to rediscover their
dynamic dependencies.

The memory layer, by contrast, persists transactions in SQLite. Memory v2 keeps
an append-only commit log, revision log, branch heads, and materialized
snapshots. Transactions already carry read dependencies and write operations,
but the persisted commit payload is not sufficient to reconstruct the scheduler:

- no-op transactions are dropped before they reach memory v2
- storage commits require at least one operation
- scheduler-only details such as shallow reads, current-known writes,
  materializer envelopes, action identity, and dirty/stale state are not
  persisted
- scheduler observations are keyed by JavaScript function object identity, which
  cannot survive process restart

This proposal persists transaction-linked scheduler observations for every
successful action dependency collection or action run, including no-op runs. It
also persists enough server-side scheduler indexes to dirty inactive pieces when
later transactions write paths those pieces read. On restart, the runner can
validate those observations against the current process graph, code identity,
branch head, and durable dirty state. Valid observations can rehydrate the
scheduler indexes directly. Invalid or missing observations fall back to the
current behavior: run dependency collection or run the action when demanded.

## Goals

- Rehydrate a piece without eagerly running every node when persisted scheduler
  observations are still valid.
- Rebuild trigger indexes, writer indexes, materializer indexes, and dependency
  edges from durable observations.
- Determine which actions are clean, dirty, stale, or unknown by comparing
  recorded read watermarks to committed writes after the observation.
- Persist enough server-side trigger/dependency state that writes can dirty
  inactive pieces which read the changed data.
- Persist no-op action observations, because a run can update dependencies or
  prove output stability without changing memory.
- Support demand-targeted rehydration: if only one output, effect, or event
  preflight is needed, run only the unknown or stale actions required for that
  demand.
- Keep transaction/CFC concepts such as `attemptedWrites` separate from
  scheduler dependency evidence.

## Non-goals

- Do not make scheduler observations part of ordinary user data snapshots or
  semantic revisions. This is not an architectural secrecy boundary: runner is
  the memory client's primary caller, so memory can expose scheduler-specific
  internal APIs where that keeps the implementation direct.
- Do not replace the memory v2 commit/revision/head/snapshot model.
- Do not replay event queues across process restart in this proposal.
- Do not make the memory server execute JavaScript actions. The server records
  dirty/stale scheduler state; runners still execute actions.
- Do not trust persisted scheduler state when the action's implementation,
  schema, process graph, runtime version, or branch context no longer matches.
- Do not promise zero execution after restart. The fallback path must remain
  conservative and correct.

## Current System Overview

### Scheduler

The scheduler currently keeps the following important state in memory:

- `dependencies`: latest scheduler `ReactivityLog` per action, with recursive
  reads, shallow reads, and actual changed writes.
- `triggerIndex`: entity/path indexes mapping storage changes to readers.
- `writeIndex`: current-known writes and optional historical might-write sets,
  plus entity to writer indexes.
- `materializers`: broad or dynamic writable-input computations indexed by
  materializer write envelopes.
- `dependents` and `reverseDependencies`: action-to-action dependency edges
  derived from reads and writes.
- `pending`, `dirty`, `stale`, and upstream-stale counts.
- action type and lifecycle state: effect vs computation, parent/child action
  relation, demand roots, debounce/throttle state, retry state, and action run
  timing.

Only some of that state is worth persisting. Diagnostics, timers, in-flight
transactions, retries, pending promises, and loop counters are process-local.
Dependency observations, action identity, action type, declared writes,
materializer envelopes, and read/write watermarks are restart-relevant.

Useful classification:

| State | Persist? | Reason |
| --- | --- | --- |
| action identity, type, graph generation | Yes | Needed to match observations to recreated actions. |
| latest reads and shallow reads | Yes | Needed to rebuild triggers and validate cleanliness. |
| current-known writes and materializer envelopes | Yes | Needed to rebuild writer/materializer indexes before running. |
| dependency edges | Optional | Derivable from reads/writes, but can be cached for faster startup. |
| dirty/stale sets | Yes | The server must dirty inactive pieces; runners can validate or rebuild from observations plus later writes. |
| pending/event queues | No | Process-local unless event queues become durable. |
| debounce/throttle configuration | Yes | It is part of scheduling behavior, but active timers are not. |
| timers, promises, retries, trace buffers | No | They are runtime execution state. |

### Runner And Storage Transactions

Action runs create an extended storage transaction, invoke the action, commit
the transaction, convert the transaction to a scheduler `ReactivityLog`, then
resubscribe the action with that log. The storage transaction can expose a
`TransactionReactivityLog` containing:

- `reads`
- `shallowReads`
- `writes`
- optional `attemptedWrites`

The scheduler-facing log deliberately drops `attemptedWrites`; attempted writes
are CFC/security evidence, not dependency evidence.
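
The separation between transaction evidence and scheduler evidence can be sketched as a small projection. This is a minimal illustration, not the runner's actual conversion: the `Address` shape, the `SchedulerReactivityLog` alias, and the `toSchedulerLog` helper are hypothetical names introduced here.

```ts
// Hypothetical address shape; the runner's real address type may differ.
interface Address {
  space: string;
  id: string;
  path: string[];
}

interface TransactionReactivityLog {
  reads: Address[];
  shallowReads: Address[];
  writes: Address[];
  attemptedWrites?: Address[];
}

// Scheduler-facing view: the same fields minus `attemptedWrites`.
type SchedulerReactivityLog = Omit<TransactionReactivityLog, "attemptedWrites">;

function toSchedulerLog(log: TransactionReactivityLog): SchedulerReactivityLog {
  // Copy fields explicitly so CFC-only evidence cannot leak into
  // dependency tracking by accident.
  return {
    reads: log.reads,
    shallowReads: log.shallowReads,
    writes: log.writes,
  };
}
```

Copying fields explicitly, rather than spreading the whole log, keeps any future transaction-only field out of the scheduler view by default.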

Observation attachment is best-effort over the action's active transaction. It
must not change the semantics of a transaction that the action intentionally
aborted or that has already completed. In those cases the scheduler skips
observation attachment and lets the existing action retry/error path handle the
transaction result.

No-op transactions are currently short-circuited. If a transaction has no write
space, or its native commit has no effective operations, the runner transaction
returns success without opening the replica or calling memory v2. This is good
for normal storage performance, but it means the persistent transaction log
does not learn that the action ran, what it read, or that it proved no data
change was needed.

Memory v2 commit construction also strips the transaction's non-recursive read
flag before sending confirmed/pending reads to the server. That is correct for
the current conflict model, but scheduler rehydration needs to retain the
recursive-vs-shallow distinction in its own observation record.

### Memory V2

Memory v2 stores one SQLite database per space. The durable state is:

- `commit`: canonical sequence, branch, session/local sequence, original client
  commit, and server resolution
- `revision`: append-only entity mutations
- `head`: latest entity revision per branch
- `snapshot`: periodic full entity documents
- `branch`: branch metadata

The current memory v2 engine rejects commits with zero operations. Therefore a
pure scheduler observation cannot simply reuse the existing semantic commit
shape unchanged.

### Process Graph Snapshot

The pattern-construction graph snapshot spec already proposes persisting the
concrete process graph: nodes, module descriptors, input/output links, schemas,
and generation metadata. That is the right base for durable action identity,
but it explicitly leaves scheduler bookkeeping as an open question.

Persistent scheduler observations should be layered on top of graph snapshots:
the graph snapshot says which actions exist; the scheduler observation says
what each action last read, what it can write, and at what memory sequence that
observation was valid.

## Problem Statement

The current restart path can rebuild a piece's structural graph, but it cannot
know the current dynamic dependency graph without running actions. This has
four costs:

1. Cold start cost: large pieces must run many computations just to rediscover
   dependencies.
2. Precision loss: after restart, the runtime cannot tell whether a dirty state
   affects a demanded output until dependency collection or action execution
   runs again.
3. No-op invisibility: an action that ran and changed only its dependency set,
   or wrote the same value, produces no persisted memory transaction.
4. Inactive-piece dirtiness: piece A can commit a write to data read by piece B
   while piece B is not running. Today this is harmless, because B's startup
   eagerly re-runs its nodes and rediscovers the change. If startup can skip
   work, the memory layer must remember that B's prior observations are now
   dirty or stale.

The hard part is correctness. Dynamic dependencies are state-dependent. A
persisted scheduler edge is valid only for the action implementation, inputs,
schema interpretation, branch, and memory state against which it was observed.
If any of those inputs changed, using the edge as authoritative can skip work
that should run.

## Proposed Model

Persist scheduler observations as internal, branch-local records associated
with memory transactions and commit sequence numbers. These records are not
user data and should not be returned by ordinary memory queries.

The model should be transaction-first, not an unrelated cache. In practice, most
runner transactions are scheduler-relevant, and action runs have a 1:1
correlation with the transaction they commit. The few transactions that are not
owned by a scheduler action still act like external effects: they commit actual
writes, and those writes initiate scheduler dirty propagation for any actions
that read the changed paths.

There are four record classes:

- `scheduler_action_snapshot`: latest durable observation for one action in one
  piece generation.
- `scheduler_observation`: ordered observation events, including no-op
  observations, tied to the memory sequence that was current when the
  observation became visible.
- `scheduler_read_index` and `scheduler_write_index`: server-side path indexes
  used to find inactive readers/writers across pieces.
- `scheduler_action_state`: durable clean/dirty/stale/unknown state per action.

The names are illustrative; the implementation may choose different table names.

### Action Identity

Observation records must be keyed by a stable action identity, not by function
object identity. A durable action key should include:

- space and branch
- piece/result cell id
- process generation
- stable node id from the process graph snapshot
- module/program implementation identity or hash
- action kind: computation, effect, or event handler
- output link or stream link, when applicable
- optional parent action id for dynamic child graphs

The current scheduler action id (`src`, function name, or generated anonymous
id) is useful for diagnostics, but it is not sufficient as the durable key.

#### Version 1 Action Identity

Until durable process graph snapshots are available, the implemented version-1
identity is intentionally conservative. Runner startup annotates actions with
the result cell's normalized space/scope/id, branch, and graph generation `0`.
This is stronger than a pattern/module-name fallback because colocated pieces of
the same pattern get distinct identities, but it is not a full graph generation.
JavaScript action ids and raw builtin action ids must also include stable
node-local binding information, such as a hash of the result cell plus their
bound input and output cells, because a single piece can contain many
actions from the same source location or many `raw:map` / `raw:when` instances
with the same implementation name. Future durable graph generations, stronger
implementation fingerprints, or schema/process migrations should invalidate or
migrate these rows instead of treating them as fully versioned graph
observations.
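
As a minimal sketch of the node-local binding hash described above, the following derives a persisted action id from a diagnostic name plus the bound cells. The `CellRef` shape, the `versionOneActionId` helper, and the choice of FNV-1a are assumptions made for illustration; the runner may use a different hash and key layout.

```ts
// Hypothetical cell reference; the runner's normalized identity may differ.
interface CellRef {
  space: string;
  scope: string;
  id: string;
}

// FNV-1a (32-bit) is used only because it is small and deterministic.
function fnv1a32(input: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

function cellKey(cell: CellRef): string {
  return JSON.stringify([cell.space, cell.scope, cell.id]);
}

// Appends a binding hash so two actions that share a diagnostic name
// (same source location, or two `raw:map` instances) get distinct ids.
function versionOneActionId(
  diagnosticName: string,
  resultCell: CellRef,
  inputs: CellRef[],
  outputs: CellRef[],
): string {
  const binding = [
    cellKey(resultCell),
    inputs.map(cellKey).join(","),
    outputs.map(cellKey).join(","),
  ].join("|");
  return `${diagnosticName}#${fnv1a32(binding)}`;
}
```

The id is stable across restarts as long as the bound cells keep the same normalized identity, which is the property the persisted snapshot row depends on.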

### Scheduler Observation Shape

Each successful dependency collection or action run should produce an
observation similar to:

```ts
// Shown at module scope.
interface SchedulerActionObservationV1 {
  version: 1;
  ownerSpace: string;
  branch: string;
  pieceId: string;
  processGeneration: number;
  actionId: string;
  actionKind: "computation" | "effect" | "event-handler";
  implementationFingerprint: string;
  runtimeFingerprint: string;

  observedAtSeq: number;
  observedAtLocalSeq?: number;
  transactionKind: "dependency-collection" | "action-run" | "event-preflight";

  reads: SchedulerRead[];
  shallowReads: SchedulerRead[];
  actualChangedWrites: SchedulerAddress[];
  currentKnownWrites: SchedulerAddress[];
  declaredWrites: SchedulerAddress[];
  materializerWriteEnvelopes: SchedulerAddress[];
  ignoredSchedulingWrites?: SchedulerAddress[];

  actionOptions?: {
    debounceMs?: number;
    noDebounce?: boolean;
    throttleMs?: number;
  };

  status: "success" | "failed";
  errorFingerprint?: string;
}
```

`SchedulerRead` should retain the read path, scope, and the confirmed or
pending watermark used by the transaction. It should also preserve whether the
read was recursive or shallow. Cross-space reads require a per-space seq
watermark, because each space has an independent SQLite database.

`actualChangedWrites` is the transaction's changed write set. `currentKnownWrites`
is the scheduler's post-observation active scheduling write set after applying
the same rule used by resubscription: prefer the run's latest precise writes;
when the run has no effective writes, keep the previous current-known writes; and
fall back to declared writes only when the action has no prior current-known
view. It must not be a stale pre-run snapshot of the action's old writer index.
Persisting both is important: if a later run writes the same value,
`actualChangedWrites` can be empty while the action still owns a current output
path.
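
The selection rule for `currentKnownWrites` can be written as a three-way fallback. This is a sketch under stated assumptions: `SchedulerAddress` is given a hypothetical shape, `nextCurrentKnownWrites` is an illustrative helper name, and "no prior current-known view" is modeled as `undefined`.

```ts
// Hypothetical address shape for illustration.
interface SchedulerAddress {
  space: string;
  id: string;
  path: string[];
}

function nextCurrentKnownWrites(
  latestPreciseWrites: SchedulerAddress[],
  previousCurrentKnown: SchedulerAddress[] | undefined,
  declaredWrites: SchedulerAddress[],
): SchedulerAddress[] {
  // 1. Prefer the run's latest precise writes.
  if (latestPreciseWrites.length > 0) return latestPreciseWrites;
  // 2. A run with no effective writes keeps the previous view.
  if (previousCurrentKnown !== undefined) return previousCurrentKnown;
  // 3. Declared writes apply only when no prior view exists.
  return declaredWrites;
}
```

A same-value run therefore reports an empty `actualChangedWrites` while this function still returns the output path recorded by an earlier run.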

`attemptedWrites` should remain in the transaction/CFC record only. It should
not be copied into scheduler dependency fields.

### No-op Observations

No-op observations are required. They should be persisted when:

- dependency collection succeeds and records reads but no writes
- an action run succeeds and its effective native commit is empty
- an action run changes dependency paths but writes the same output value
- a materializer runs and proves that no materialized target changed

Aborted or inactive transactions are not no-op observations. If an action
intentionally aborts its transaction, or observation attachment discovers that
the target transaction is already complete, persistence should skip the
observation rather than converting the abort into a scheduler failure.

This does not mean every no-op must become a semantic revision. Scheduler
observations can be stored in an internal table with their own row id and an
`observedAtSeq` watermark. It is fine for memory to expose scheduler-specific
protocol calls to the runner; the important boundary is that ordinary data
queries and snapshots should not interpret observation rows as user data. If the
observation accompanies a real memory commit, it should be written atomically
with that commit. If it accompanies a no-op, it should still be durable and
ordered relative to the branch head it observed. Runners may batch multiple
no-op observations into one memory transaction, but the server must keep or drop
each action observation independently.

No-op observations participate in memory v2 session/local sequence replay. A
fresh no-op observation is kept, updates scheduler indexes, and clears the
action's dirty state when its reads are current. A stale no-op observation is
dropped as obsolete scheduler metadata, not rejected as a semantic conflict.
This handles the cross-device/current-data case: if another commit made the
runner's observation basis stale, the correct durable state is to retain the
existing dirty marker or unknown fallback rather than blocking the client.

Observation-only commits must use the action owner space, not whichever space
happens to appear first in the observation's reads or writes. Read-only and
same-value action runs can have no semantic write space; without an explicit
owner-space field, such a run could persist its authoritative observation into a
cross-space read database, where later owner-space rehydration would not find it.

## Storage Design Options

### Option A: Zero-operation semantic commits

Allow memory v2 `ClientCommit.operations` to be empty and store observation data
inside `commit.original`.

Pros:

- Reuses existing canonical sequencing.
- Gives observations a natural position in the commit log.
- Reuses existing session/local sequence replay rules.

Cons:

- Changes the memory v2 semantic contract that a transaction has at least one
  operation.
- Mixes scheduler-only records into the semantic commit stream, so every query
  and sync path would need to preserve the difference between user revisions and
  observation-only rows.
- Requires care to avoid triggering normal storage notifications for pure
  observations.

### Option B: Private scheduler tables

Add internal SQLite tables such as `scheduler_observation` and
`scheduler_action_snapshot`.

Pros:

- Keeps semantic memory commits unchanged.
- Lets observations use scheduler-specific indexes and retention policy.
- Keeps ordinary memory APIs focused on user data while still allowing explicit
  runner-facing scheduler APIs.

Cons:

- Needs a way to allocate an ordered observation watermark for no-op runs.
- Needs explicit transactional coupling when an action run also writes data.
- Must define how observations replicate, if scheduler rehydration should work
  across devices.

### Recommendation: Transaction-centric Hybrid State

Use internal scheduler tables, but drive them from the transaction pipeline. For
real action commits, write scheduler observation/index/state rows in the same
SQLite transaction as the memory commit. For no-op action runs, insert an
observation transaction row with:

- the current branch head sequence
- a monotonic observation id
- the transaction/session/local sequence, if available
- a read watermark for every observed space

The observation id orders no-op observations against each other. The branch head
sequence anchors them to the memory state they observed. No-op observations do
not create semantic revisions or normal storage notifications, but they do
update scheduler read/write indexes and action state.

The runner batches adjacent no-op observations into a single
`schedulerObservationBatch` commit. Each batch entry carries its own local
sequence, read watermarks, and observation payload. The batch has an envelope
local sequence for the transport request, but keep/drop/replay decisions are
made per entry. A semantic write flushes any queued no-op batch first so the
server observes the same action order as the runner.
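
The batching and flush-before-write ordering can be sketched with a small queue. Everything here is illustrative: `ObservationQueue`, its method names, and the way local sequences are allocated from one counter are assumptions, and the `sent` array stands in for the real transport.

```ts
interface NoOpEntry {
  localSeq: number; // per-entry local sequence
  actionId: string;
}

type Outgoing =
  | { kind: "schedulerObservationBatch"; envelopeLocalSeq: number; entries: NoOpEntry[] }
  | { kind: "semantic"; localSeq: number };

class ObservationQueue {
  private pending: NoOpEntry[] = [];
  private nextLocalSeq = 1;
  readonly sent: Outgoing[] = []; // stand-in for the transport

  queueNoOp(actionId: string): void {
    this.pending.push({ localSeq: this.nextLocalSeq++, actionId });
  }

  // A semantic write flushes queued no-ops first, so the server sees
  // actions in the same order the runner executed them.
  commitSemantic(): void {
    this.flush();
    this.sent.push({ kind: "semantic", localSeq: this.nextLocalSeq++ });
  }

  flush(): void {
    if (this.pending.length === 0) return;
    this.sent.push({
      kind: "schedulerObservationBatch",
      envelopeLocalSeq: this.nextLocalSeq++,
      entries: this.pending,
    });
    this.pending = [];
  }
}
```

Because each entry keeps its own local sequence, the server can keep one entry and drop its neighbor without splitting the batch request.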

This preserves the memory semantic log while making scheduler state durable and
server-visible. If cross-device scheduler rehydration becomes a requirement,
the same observation payload can later be replicated through an explicit
internal sync channel without changing the user data model.

### Table Sketch

The exact schema should follow memory v2's encoding helpers and migration style,
but the storage shape should look roughly like this:

```sql
CREATE TABLE scheduler_observation (
  observation_id   INTEGER PRIMARY KEY AUTOINCREMENT,
  branch           TEXT    NOT NULL DEFAULT '',
  commit_seq       INTEGER,
  observed_at_seq  INTEGER NOT NULL,
  session_id       TEXT,
  local_seq        INTEGER,
  piece_id         TEXT    NOT NULL,
  action_id        TEXT    NOT NULL,
  process_generation INTEGER NOT NULL,
  payload          JSON    NOT NULL,
  created_at       TEXT    NOT NULL DEFAULT (datetime('now'))
);

CREATE INDEX idx_scheduler_observation_action
  ON scheduler_observation (
    branch,
    piece_id,
    process_generation,
    action_id,
    observation_id
  );

CREATE TABLE scheduler_action_snapshot (
  branch           TEXT    NOT NULL DEFAULT '',
  piece_id         TEXT    NOT NULL,
  process_generation INTEGER NOT NULL,
  action_id        TEXT    NOT NULL,
  observation_id   INTEGER NOT NULL,
  payload          JSON    NOT NULL,

  PRIMARY KEY (branch, piece_id, process_generation, action_id),
  FOREIGN KEY (observation_id)
    REFERENCES scheduler_observation(observation_id)
);

CREATE TABLE scheduler_read_index (
  branch             TEXT    NOT NULL DEFAULT '',
  owner_space        TEXT,
  read_space         TEXT    NOT NULL,
  read_id            TEXT    NOT NULL,
  read_scope         TEXT    NOT NULL,
  read_path          JSON    NOT NULL,
  read_kind          TEXT    NOT NULL, -- 'recursive' | 'shallow'
  piece_id           TEXT    NOT NULL,
  process_generation INTEGER NOT NULL,
  action_id          TEXT    NOT NULL,
  observation_id     INTEGER NOT NULL
);

CREATE INDEX idx_scheduler_read_index_lookup
  ON scheduler_read_index (branch, read_space, read_id, read_scope);

CREATE TABLE scheduler_write_index (
  branch             TEXT    NOT NULL DEFAULT '',
  write_space        TEXT    NOT NULL,
  write_id           TEXT    NOT NULL,
  write_scope        TEXT    NOT NULL,
  write_path         JSON    NOT NULL,
  write_kind         TEXT    NOT NULL, -- 'current-known' | 'declared' | 'materializer'
  piece_id           TEXT    NOT NULL,
  process_generation INTEGER NOT NULL,
  action_id          TEXT    NOT NULL,
  observation_id     INTEGER NOT NULL
);

CREATE TABLE scheduler_action_state (
  branch             TEXT    NOT NULL DEFAULT '',
  piece_id           TEXT    NOT NULL,
  process_generation INTEGER NOT NULL,
  action_id          TEXT    NOT NULL,
  latest_observation_id INTEGER,
  direct_dirty_seq   INTEGER,
  stale_seq          INTEGER,
  unknown_reason     TEXT,

  PRIMARY KEY (branch, piece_id, process_generation, action_id)
);
```

`scheduler_observation` is the ordered history. `scheduler_action_snapshot` is
the latest usable observation per action. If the same action run produces real
memory operations, both the memory commit row/revision rows and scheduler
observation rows should be inserted under the same SQLite transaction and
linked through `commit_seq`. No-op observations leave `commit_seq` null and use
`observed_at_seq` plus `observation_id` for ordering.

The `scheduler_action_state` sketch compresses dirty/stale causes into summary
seq fields. A production implementation may need a separate dirty-cause or
stale-edge table so clearing one upstream dirty source does not accidentally
clear another.

For cross-space reads, either store one observation row in the action's primary
space with per-space read watermarks in `payload`, or store mirror rows in every
read space. The primary-space form is simpler, but it requires read validation
APIs that can open the other space databases during rehydration.

The current implementation uses mirror rows in read spaces. The owner space
keeps the authoritative observation; after the owner-space commit succeeds, the
server upserts a full scheduler observation snapshot into every read space and
into any previous read spaces that need stale index cleanup. These mirror rows
have `commit_seq = NULL` because the semantic commit belongs to another SQLite
database.

The owner space remains the source of truth for rehydration state. A write in a
read space can use mirrored rows to discover inactive readers, but it must push
the resulting direct-dirty marker back to each reader's owner space. The owner
space then propagates stale state through its own persisted scheduler graph. This
owner-space propagation may mark downstream actions stale, but it must not chase
cross-space mirrors recursively from stale state. Cross-space mirror lookup is
driven only by actual committed writes; possible writes from dirty/stale actions
wait until those actions run and commit real changed writes.

Version 1 accepts that owner-space commits and cross-space mirror writes are not
atomic across SQLite databases. If the owner commit succeeds and a later mirror
write fails, semantic data remains committed and the transaction may report a
failure after the fact. The known consequence is temporarily degraded
cross-space dirty propagation for inactive readers until a future run rewrites
or repairs the mirror rows. A later production hardening pass should add
explicit repair or `unknown` marking for mirror failures, but this spec does not
require distributed atomicity.

In-memory read-trigger indexes must also be cleaned up when actions unsubscribe
or a scheduler instance is disposed. Mirror rows are durable, but the runner's
live trigger maps should not retain empty per-entity buckets after a piece or
space unloads; a space unload can drop every trigger-index entity for that
space without touching durable mirror rows.

For inactive-piece dirtying, the write owner must be able to find readers even
when the reader's piece is not running. With one SQLite database per space, that
means either:

- mirror `scheduler_read_index` rows into every read space database, with a
  pointer back to the owning piece/action observation, or
- add a server-local scheduler metadata database that indexes reads and writes
  across spaces.

The mirror approach stays closest to the current memory architecture. The
server-local catalog is cleaner for cross-space queries, but it adds a new
storage root and replication story.

## Server-side Dirty Propagation

Persisted observations must support the same dirtying operation that the
in-memory scheduler performs today.

When any transaction commits actual changed writes, whether or not it belongs
to a scheduler action:

1. Normalize changed write paths using the same recursive/shallow overlap rules
   used by the scheduler and memory conflict validation.
2. Query `scheduler_read_index` for overlapping readers across pieces.
3. Mark those reader actions direct-dirty in `scheduler_action_state`.
4. Use persisted dependency edges, or derive edges from `scheduler_write_index`
   plus `scheduler_read_index`, to propagate stale state to downstream actions.
5. Do not execute actions on the server. Loaded runners may receive
   notifications; inactive pieces simply retain dirty/stale state until they
   start or are demanded.
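
Steps 1 through 4 can be sketched over in-memory stand-ins for the index tables. The overlap rule below is an assumption made for illustration (a recursive read overlaps any write at, above, or below its path; a shallow read overlaps only writes at or above its path); the real rule is whatever the scheduler and memory conflict validation already share. `ReadRow`, `WriteRow`, and `propagate` are hypothetical names.

```ts
type Path = string[];

interface ReadRow {
  actionId: string;
  entity: string;
  path: Path;
  kind: "recursive" | "shallow";
}
interface WriteRow { actionId: string; entity: string; path: Path }
interface ChangedWrite { entity: string; path: Path }

function isPrefix(a: Path, b: Path): boolean {
  return a.length <= b.length && a.every((segment, i) => segment === b[i]);
}

// Assumed overlap rule; see the lead-in above.
function overlaps(read: ReadRow, write: ChangedWrite): boolean {
  if (read.entity !== write.entity) return false;
  if (isPrefix(write.path, read.path)) return true;
  return read.kind === "recursive" && isPrefix(read.path, write.path);
}

function propagate(
  changed: ChangedWrite[],
  readIndex: ReadRow[],
  writeIndex: WriteRow[],
): { dirty: Set<string>; stale: Set<string> } {
  // Steps 2-3: readers overlapping an actual changed write are direct-dirty.
  const dirty = new Set<string>();
  for (const read of readIndex) {
    if (changed.some((write) => overlaps(read, write))) dirty.add(read.actionId);
  }
  // Step 4: derive edges from the write index plus the read index and
  // mark everything downstream of a dirty action as stale.
  const stale = new Set<string>();
  const frontier = [...dirty];
  while (frontier.length > 0) {
    const upstream = frontier.pop()!;
    for (const write of writeIndex) {
      if (write.actionId !== upstream) continue;
      for (const read of readIndex) {
        const id = read.actionId;
        if (id === upstream || dirty.has(id) || stale.has(id)) continue;
        if (overlaps(read, write)) {
          stale.add(id);
          frontier.push(id);
        }
      }
    }
  }
  return { dirty, stale };
}
```

Nothing in the sketch executes an action; it only computes which persisted action states the server would update.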

When an action run commits:

1. Persist the memory commit/revisions if there are effective operations.
2. Apply dirty propagation for actual changed writes against the existing read
   index.
3. Persist the scheduler observation.
4. Replace that action's read index rows, write index rows, materializer rows,
   and latest action snapshot.
5. Clear the action's direct dirty bit if its new read watermarks are current.
6. Recompute stale propagation if the action's current-known writes changed.

Dirty propagation intentionally runs before the current action observation is
upserted. That lets an action's own successful observation clear any self-dirty
state caused by its changed writes while preserving dirty marks for other
inactive readers.

When an action run is a no-op:

1. Persist the scheduler observation without semantic revisions if its read
   watermarks are current.
2. Drop the observation without failing the transaction if a read watermark is
   stale or a pending read dependency is no longer valid.
3. Replace read/write/materializer index rows for kept observations.
4. Clear the action's direct dirty bit for kept observations.
5. Leave existing dirty/stale/unknown state in place for dropped observations.
6. Do not dirty downstream readers, because no actual changed writes occurred.
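
The keep/drop decision for no-op observations can be sketched per entry. This is an illustration only: `applyNoOpBatch` is a hypothetical helper, watermarks are compared at entity granularity rather than by path, and "current" is assumed to mean that no write to a read entity has committed after the recorded watermark.

```ts
interface ReadWatermark { space: string; id: string; seq: number }
interface NoOpObservation { actionId: string; reads: ReadWatermark[] }
interface DurableActionState { directDirty: boolean }

// Assumed lookup: seq of the latest committed write to an entity, or 0.
type LatestWriteSeq = (space: string, id: string) => number;

function applyNoOpBatch(
  entries: NoOpObservation[],
  latestWriteSeq: LatestWriteSeq,
  state: Map<string, DurableActionState>,
): { kept: string[]; dropped: string[] } {
  const kept: string[] = [];
  const dropped: string[] = [];
  for (const entry of entries) {
    const current = entry.reads.every(
      (read) => latestWriteSeq(read.space, read.id) <= read.seq,
    );
    if (current) {
      // Kept: the observation proves the action is fresh.
      state.set(entry.actionId, { directDirty: false });
      kept.push(entry.actionId);
    } else {
      // Dropped as obsolete metadata; existing state stays untouched
      // and the batch as a whole does not fail.
      dropped.push(entry.actionId);
    }
  }
  return { kept, dropped };
}
```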

This server-side state changes the role of rehydration. Restart no longer has
to discover from scratch whether inactive writes made the piece dirty; it loads
the persisted dirty/stale/unknown action state and validates it against the
latest observations.

## Rehydration Algorithm

### Full Piece Rehydration

1. Load the process graph snapshot from the piece/result cell.
2. Validate the graph snapshot against the current runtime, program/module
   fingerprints, schemas, and process generation.
3. Recreate action objects and scheduler subscriptions from the graph snapshot,
   but do not run actions yet.
4. Load the latest valid scheduler action observation and durable action state
   for each action key.
5. For each valid observation:
   - restore `dependencies`
   - restore trigger paths from `reads` and `shallowReads`
   - restore current-known writes and writer indexes
   - restore materializer envelopes
   - rebuild dependency edges from restored reads and writes
6. For missing or invalid observations, mark the action `unknown`.
7. Load persisted direct-dirty/stale state. If the server-side dirty index is
   absent, outdated, or being backfilled, recompute by comparing observation
   read watermarks to committed writes after each observation:
   - no overlapping later write: action can be clean
   - overlapping later write: action is dirty
   - missing data, incompatible code, or ambiguous branch history: action is
     unknown
8. Propagate dirty state through restored dependency edges to verify or rebuild
   stale state.
9. Queue live effects, demanded computations, or idle materializers according
   to the normal pull-mode rules.
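
The fallback classification in step 7 can be sketched as a pure function. The names `classifyAction`, `StoredObservation`, and `CommittedWrite` are hypothetical, only the implementation fingerprint is checked here, and reads are compared at entity granularity; a real implementation would apply the full validity checks and path overlap rules.

```ts
type RehydratedState = "clean" | "dirty" | "unknown";

interface StoredObservation {
  implementationFingerprint: string;
  reads: { space: string; id: string; seq: number }[];
}
interface CommittedWrite { space: string; id: string; seq: number }

function classifyAction(
  observation: StoredObservation | undefined,
  currentFingerprint: string,
  committedWrites: CommittedWrite[],
): RehydratedState {
  // Missing data or incompatible code: the dependency view is untrusted.
  if (observation === undefined) return "unknown";
  if (observation.implementationFingerprint !== currentFingerprint) return "unknown";
  // A write to a read entity after that read's watermark makes it dirty.
  const overlappingLaterWrite = observation.reads.some((read) =>
    committedWrites.some(
      (write) =>
        write.space === read.space && write.id === read.id && write.seq > read.seq,
    )
  );
  return overlappingLaterWrite ? "dirty" : "clean";
}
```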

The current runner implementation uses these steps opportunistically during
subscription. `Scheduler.rehydrateActionFromObservation()` rebuilds in-memory
scheduler indexes from a validated observation, and
`Scheduler.rehydrateActionFromStorage()` loads one action's persisted snapshot
from the storage provider before applying that primitive. `Scheduler.subscribe`
can defer the normal first-run scheduling while the storage-backed lookup runs;
a clean snapshot restores indexes and skips first execution, while a missing or
invalid snapshot falls back to the normal initial dirty/pending path.

Rehydration must rebuild dependency edges from the restored active scheduling
write view, not only from the transaction's actual changed writes. A no-op action
observation can have `actualChangedWrites = []` while `currentKnownWrites`
contains the action's output. Restoring that writer must use the same dependency
update path as a live resubscribe so already-restored readers are backfilled as
dependents.
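
A simplified sketch shows why the edge rebuild has to read `currentKnownWrites`. `RestoredAction` and `buildDependents` are hypothetical names, and addresses are reduced to entity ids for brevity.

```ts
interface RestoredAction {
  actionId: string;
  reads: string[]; // entity ids, simplified from full addresses
  actualChangedWrites: string[];
  currentKnownWrites: string[];
}

// Maps each writer to the actions that read something it currently owns.
function buildDependents(actions: RestoredAction[]): Map<string, Set<string>> {
  const dependents = new Map<string, Set<string>>();
  for (const writer of actions) {
    const readers = new Set<string>();
    for (const reader of actions) {
      if (reader.actionId === writer.actionId) continue;
      // Match against currentKnownWrites, not actualChangedWrites.
      if (reader.reads.some((read) => writer.currentKnownWrites.includes(read))) {
        readers.add(reader.actionId);
      }
    }
    dependents.set(writer.actionId, readers);
  }
  return dependents;
}
```

If the match used `actualChangedWrites` instead, a writer whose last run was a no-op would lose every downstream edge on restart.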

Storage-backed rehydration is asynchronous scheduler work. Scheduler `idle()`
and runtime disposal must wait for this work before storage sessions or memory
transports close. Browser/integration harnesses that own a page runtime should
dispose that runtime before closing the page, while preserving existing
suite-owned page lifetimes for tests that intentionally run ordered steps
against one page.

Runner startup passes the rehydration subscription option for pattern-result,
JavaScript, and raw actions, using the result cell's stable space/scope/id
identity as the basis of the action key. The
version-1 graph generation is currently `0`. JavaScript actions add a stable
hash of their result-cell anchor and bound input/output cells to the diagnostic
action name before that name becomes the persisted scheduler action id. Raw
builtin actions similarly add a stable hash of their bound input/output cells to
the diagnostic raw action name. This prevents repeated source locations and
multiple raw instances in one piece from sharing one snapshot row. Any future
durable graph generation or stronger implementation fingerprinting
should invalidate or migrate these observations rather than treating them as
fully versioned graph snapshots.

`unknown` is stricter than dirty. Dirty means the scheduler has a valid previous
dependency view and knows what can make the action fresh again. Unknown means
the dependency view itself is missing or untrusted; the action must run
dependency collection or execute before dependents can rely on it.

`Scheduler.rehydrateActionFromObservation()` is the low-level primitive for
already-validated observations. Storage-backed rehydration is a trust boundary:
the memory protocol returns observation payloads as `unknown`, so the runner
must type-check the payload and verify action identity, implementation
fingerprint, runtime fingerprint, and scheduler mode before skipping execution.
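
A validation step of this kind might look like the sketch below. The field names in `ObservationHeader` are assumptions made for illustration; the actual `SchedulerActionObservationV1` shape is not reproduced here, and `parseObservationHeader` and `canSkipExecution` are hypothetical helpers.

```typescript
// Hypothetical field names: the real SchedulerActionObservationV1 shape may
// differ. Only the checks named in the text are modeled.
interface ObservationHeader {
  actionId: string;
  implementationFingerprint: string;
  runtimeFingerprint: string;
  schedulerMode: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Type-check an untrusted payload. Returns undefined instead of throwing so
// the caller can fall back to the normal initial run.
function parseObservationHeader(
  payload: unknown,
): ObservationHeader | undefined {
  if (!isRecord(payload)) return undefined;
  const {
    actionId,
    implementationFingerprint,
    runtimeFingerprint,
    schedulerMode,
  } = payload;
  if (
    typeof actionId !== "string" ||
    typeof implementationFingerprint !== "string" ||
    typeof runtimeFingerprint !== "string" ||
    typeof schedulerMode !== "string"
  ) {
    return undefined;
  }
  return {
    actionId,
    implementationFingerprint,
    runtimeFingerprint,
    schedulerMode,
  };
}

// Execution may be skipped only when every identity field matches the
// recreated action. Any parse failure or mismatch means "do not skip".
function canSkipExecution(
  payload: unknown,
  expected: ObservationHeader,
): boolean {
  const header = parseObservationHeader(payload);
  if (header === undefined) return false;
  return header.actionId === expected.actionId &&
    header.implementationFingerprint === expected.implementationFingerprint &&
    header.runtimeFingerprint === expected.runtimeFingerprint &&
    header.schedulerMode === expected.schedulerMode;
}
```

The sketch fails closed: a malformed payload and a fingerprint mismatch both produce `false`, which matches the rule that missing or invalid observations are never treated as clean.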

### Demand-targeted Rehydration

When a specific output, effect, event preflight, or `cell.pull()` is demanded,
rehydration can start from that demand root:

1. Resolve the demand root to an action or read log.
2. Use restored writer indexes and materializer envelopes to find direct
   upstream actions.
3. Walk restored dependency edges backward.
4. Run only upstream actions that are dirty or unknown.
5. Stop once the demand root can observe fresh values.

This is the same logical traversal the pull scheduler performs today, but it can
start from persisted observations instead of first rebuilding every action's
dependencies by execution.
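
The traversal can be sketched as a post-order walk over restored edges. This is an illustrative model only: `RestoredAction`, `Freshness`, and `planDemandRun` are hypothetical names, and the sketch assumes that durable dirty propagation has already marked every stale action, so it does not re-dirty readers after an upstream action runs.

```typescript
type Freshness = "clean" | "dirty" | "unknown";

interface RestoredAction {
  id: string;
  state: Freshness;
  upstream: string[]; // ids of actions whose writes this action reads
}

// Returns the ids that must run for the demand root to observe fresh values,
// ordered so that upstream actions come before their dependents.
function planDemandRun(
  rootId: string,
  actions: Map<string, RestoredAction>,
): string[] {
  const order: string[] = [];
  const visited = new Set<string>();

  const visit = (id: string): void => {
    if (visited.has(id)) return; // also guards against cycles
    visited.add(id);
    const action = actions.get(id);
    if (action === undefined) return;
    // Walk restored dependency edges backward before deciding on this node.
    for (const upstreamId of action.upstream) visit(upstreamId);
    // Only dirty or unknown actions need to run; clean ones are skipped.
    if (action.state !== "clean") order.push(id);
  };

  visit(rootId);
  return order;
}
```

Actions that are not reachable from the demand root never appear in the plan, even when they are dirty or unknown.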

### Materializers

Materializer observations should persist both:

- the materializer's input dependencies
- the materializer write envelopes

Materializers are the category introduced for actions with a large or dynamic
write surface. They cannot stay purely demand-pulled, because their broad
envelope is
too imprecise to use as normal dependency evidence. They also should not dirty
all possible downstream readers when an input changes, because that recreates
the broad fanout that pull mode is trying to avoid.

Materializer identity is explicit scheduler metadata. For generated
`computed()` callbacks, the transformer emits `materializerWriteInputPaths`
only when capability analysis observes actual writes through captured cell
inputs; the runner resolves those input paths to `materializerWriteEnvelopes`
for the concrete action instance. Reading a Writable cell is not enough on its
own, because a generated action may read Writable cells without side-writing
through them. Output-producing computations can also be materializers, and
materializer membership must not suppress normal dirty fanout through their
declared or current-known outputs. The current runtime fallback is limited to
opaque-result generated computations that do not carry write-path metadata,
where the computation has no normal output surface and its observable work is
side-writing through captured Writable inputs.

On restart:

- If a materializer's input reads are stale, mark the materializer dirty.
- If the same action has declared or current-known writes, rebuild those normal
  writer/dependent edges and propagate stale demand through them like any other
  computation.
- If no primary pull demand exists, run dirty materializers from the idle pull
  loop.
- If a demand root or event preflight reads a path inside a dirty materializer
  envelope, promote that materializer before the reader/handler.
- After the materializer runs, use its actual changed writes to dirty only
  readers of changed paths.

This preserves the current pull-mode materializer behavior while making the
materializer decision durable.
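
The idle and promotion rules above can be sketched as two selection functions. This is a simplified illustration: `RestoredMaterializer`, `isInsideEnvelope`, `materializersToPromote`, and `idleMaterializerWork` are hypothetical names, paths are modeled as `/`-joined strings, and "inside an envelope" is taken to mean the envelope path itself or anything below it.

```typescript
// Simplified model (assumption): paths are "/"-joined strings.
interface RestoredMaterializer {
  id: string;
  dirty: boolean;
  materializerWriteEnvelopes: string[];
}

// True when `path` is the envelope itself or lies below it.
function isInsideEnvelope(path: string, envelope: string): boolean {
  return path === envelope || path.startsWith(envelope + "/");
}

// Dirty materializers that must run before a reader or handler whose reads
// fall inside one of their envelopes.
function materializersToPromote(
  demandReads: string[],
  materializers: RestoredMaterializer[],
): string[] {
  return materializers
    .filter((m) =>
      m.dirty &&
      m.materializerWriteEnvelopes.some((envelope) =>
        demandReads.some((read) => isInsideEnvelope(read, envelope))
      )
    )
    .map((m) => m.id);
}

// With no primary pull demand, every dirty materializer is idle work.
function idleMaterializerWork(
  materializers: RestoredMaterializer[],
): string[] {
  return materializers.filter((m) => m.dirty).map((m) => m.id);
}
```

A clean materializer is never selected by either function, and a read that only shares a string prefix with an envelope (for example `search/indexes` against `search/index`) does not trigger promotion.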

Persisted scheduler state should therefore distinguish three write surfaces for
these actions:

- `materializerWriteEnvelopes`: broad/dynamic target envelopes used to discover
  demand overlap and to know that this action must run when its inputs are
  dirty.
- `currentKnownWrites`: the last precise normal scheduling writes produced by
  the action after it ran. These remain in the ordinary writer index even when
  the action also has materializer envelopes.
- `actualChangedWrites`: the precise changed paths from the latest run, used to
  dirty downstream readers.

Server-side dirty propagation for materializers follows the same split:

1. A committed write that overlaps a materializer's input reads marks the
   materializer direct-dirty in `scheduler_action_state`.
2. The server still propagates stale state through any ordinary dependency edges
   from that action's declared or current-known writes.
3. The server does not fan out through the materializer envelope to all possible
   downstream readers.
4. Loaded runners schedule dirty materializers as eager idle work, honoring
   debounce/throttle settings.
5. If a demand root or event preflight reads inside a dirty materializer
   envelope before idle work runs, the runner promotes the materializer and runs
   it first.
6. Only after the materializer commits actual changed writes does durable dirty
   propagation mark precise downstream readers dirty.
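
Steps 1 through 3 can be sketched as an in-memory model of the server-side marking. The names `PersistedAction`, `DirtyMarks`, `pathsOverlap`, and `markDirtyForWrite` are hypothetical, paths are `/`-joined strings with prefix overlap, and the real implementation works against `scheduler_action_state` rows rather than arrays.

```typescript
interface PersistedAction {
  id: string;
  reads: string[];
  currentKnownWrites: string[];
  materializerWriteEnvelopes: string[]; // never used for fanout below
}

interface DirtyMarks {
  directDirty: Set<string>;
  stale: Set<string>;
}

// Simplified overlap (assumption): one path equals or contains the other.
function pathsOverlap(a: string, b: string): boolean {
  return a === b || a.startsWith(b + "/") || b.startsWith(a + "/");
}

function markDirtyForWrite(
  writtenPath: string,
  actions: PersistedAction[],
): DirtyMarks {
  const directDirty = new Set<string>();
  const stale = new Set<string>();

  // Step 1: readers of the written path become direct-dirty.
  for (const action of actions) {
    if (action.reads.some((read) => pathsOverlap(read, writtenPath))) {
      directDirty.add(action.id);
    }
  }

  // Step 2: propagate stale state through ordinary edges derived from
  // currentKnownWrites. Step 3: envelopes are intentionally ignored here.
  const queue = Array.from(directDirty);
  while (queue.length > 0) {
    const sourceId = queue.shift();
    const source = actions.find((a) => a.id === sourceId);
    if (source === undefined) continue;
    for (const reader of actions) {
      if (directDirty.has(reader.id) || stale.has(reader.id)) continue;
      const readsOutput = reader.reads.some((read) =>
        source.currentKnownWrites.some((write) => pathsOverlap(read, write))
      );
      if (readsOutput) {
        stale.add(reader.id);
        queue.push(reader.id);
      }
    }
  }
  return { directDirty, stale };
}
```

In this model, a reader whose only connection to the materializer is a path inside its envelope stays unmarked until the materializer runs and commits actual changed writes.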

## Correctness Invariants

- A persisted observation may be used only when its action identity and
  implementation fingerprint match the recreated action.
- Storage-backed observations may skip execution only when their runtime
  fingerprint, including scheduler mode, also matches the active scheduler.
- A clean action observation is valid only if no later committed write overlaps
  any of its recursive or shallow reads under the appropriate overlap rule.
- Missing or invalid observations must never be treated as clean.
- Scheduler writes must be over-approximated when uncertain. Extra work is
  acceptable; missed work is not.
- Observation persistence must be atomic with the memory commit whose writes it
  describes.
- Cross-space read-index mirrors are version-1 best-effort rows written after
  the owner-space commit; they are not distributed-transaction participants.
- No-op observations must record the branch head sequence and read watermarks
  they observed.
- Every committed actual write must be reflected into the durable scheduler
  dirty index before the transaction is considered fully integrated for
  rehydration.
- Non-action transactions must still drive dirty propagation from their actual
  changed writes.
- Cross-space reads must be validated against each space's own branch head and
  later writes.
- Cross-piece reads must be discoverable by the space that receives a write,
  even when the reading piece is not running.
- Branch identity must be part of every observation key. Rebase, merge, fork,
  or branch deletion should invalidate observations unless a branch-aware proof
  maps their read watermarks forward.
- `attemptedWrites` are not scheduler dependency evidence. They can be
  persisted for CFC/security, but rehydration must not use them to create
  writer or trigger edges.
- Event handlers can be re-registered from graph snapshots, but queued events
  are outside this proposal unless event queues become durable.

## Required Query Support

Server-side dirty propagation and rehydration need efficient overlap queries.
Memory v2 already validates confirmed reads using path-aware history. The
scheduler-facing API should expose similar internal primitives:

```ts
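// Shown for illustration only: argument names are placeholders, not declared
// bindings.

// Finds committed writes that overlap one read path within a sequence window.
findOverlappingWritesAfter({
  space,
  branch,
  id,
  scope,
  path,
  nonRecursive,
  afterSeq,
  beforeSeq,
});

// Finds persisted scheduler readers whose reads overlap one written path.
findSchedulerReadersForWrite({
  space,
  branch,
  id,
  scope,
  path,
});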
```

The implementation may over-approximate. For example, structural array edits can
invalidate a whole collection subtree. The API must not miss real overlaps.
Scheduler paths persisted in indexes and observation payloads should use the
memory boundary codec rather than ad hoc JSON, so future path component shapes
stay on the normal persistence path. The memory-side shallow-overlap helper may
be conservative, but it should have parity tests against the runner dependency
overlap logic so scheduler and memory dirtying do not drift.
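
A parity check of this kind could take the following shape. The sketch is hedged in two ways: the shallow rule in `preciseOverlap` (a shallow read sees writes at or above its path and at direct children only) is an assumption for illustration, not the runner's actual rule, and all names here (`Read`, `preciseOverlap`, `conservativeOverlap`, `findMissedOverlaps`) are hypothetical.

```typescript
type Path = readonly string[];

interface Read {
  path: Path;
  nonRecursive: boolean; // true for shallow reads
}

function isPrefix(prefix: Path, full: Path): boolean {
  return prefix.length <= full.length &&
    prefix.every((segment, i) => segment === full[i]);
}

// Stand-in for the runner's overlap logic. The shallow rule is an assumption.
function preciseOverlap(read: Read, write: Path): boolean {
  if (isPrefix(write, read.path)) return true; // write at or above the read
  if (!isPrefix(read.path, write)) return false; // disjoint paths
  // The write is strictly below the read path.
  return !read.nonRecursive || write.length === read.path.length + 1;
}

// Stand-in for a conservative memory-side helper: treats every read as
// recursive, so it can only report extra overlaps.
function conservativeOverlap(read: Read, write: Path): boolean {
  return isPrefix(write, read.path) || isPrefix(read.path, write);
}

// Parity check: lists every case where the conservative helper would miss an
// overlap that the precise logic reports. The result should be empty.
function findMissedOverlaps(
  reads: Read[],
  writes: Path[],
): Array<[Read, Path]> {
  const missed: Array<[Read, Path]> = [];
  for (const read of reads) {
    for (const write of writes) {
      if (preciseOverlap(read, write) && !conservativeOverlap(read, write)) {
        missed.push([read, write]);
      }
    }
  }
  return missed;
}
```

The check is one-directional on purpose: the conservative helper may report overlaps the precise logic does not, but the reverse case is a correctness bug.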

The implemented snapshot lookup surface is:

- memory protocol request `scheduler.snapshot.list`
- engine API `Engine.listSchedulerActionSnapshots()`
- runner storage-provider method `listSchedulerActionSnapshots()`

The query filters by branch, piece id, process generation, and optionally action
id. Bulk listing is cursor-paginated in deterministic owner-space, piece,
generation, action order. The protocol result intentionally carries
`observation` as `unknown`; the runner owns validation and casting to
`SchedulerActionObservationV1`.

Subscription-time storage rehydration is bounded. If the snapshot request does
not resolve before the rehydration timeout, the scheduler drops that pending
snapshot attempt and schedules the action for the normal initial run path. A
late snapshot result must not overwrite newer in-memory dirty or clean state.
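
The timeout and late-result rules can be sketched as a small settle-once guard. This is an illustrative model, not the scheduler's code: `beginRehydrationAttempt`, `RehydrationHooks`, and the outcome names are hypothetical, and a real implementation would call `onTimeout` from a timer and `onSnapshot` from the storage response.

```typescript
type AttemptOutcome = "restored" | "initial-run" | "ignored";

interface RehydrationHooks {
  restoreIndexes(): void; // apply a validated clean snapshot
  scheduleInitialRun(): void; // normal initial dirty/pending path
}

interface RehydrationAttempt {
  onSnapshot(snapshotIsValidAndClean: boolean): AttemptOutcome;
  onTimeout(): AttemptOutcome;
}

// Whichever of snapshot or timeout arrives first settles the attempt.
// Anything that arrives later is ignored so it cannot overwrite newer state.
function beginRehydrationAttempt(hooks: RehydrationHooks): RehydrationAttempt {
  let settled = false;
  return {
    onSnapshot(snapshotIsValidAndClean: boolean): AttemptOutcome {
      if (settled) return "ignored";
      settled = true;
      if (snapshotIsValidAndClean) {
        hooks.restoreIndexes();
        return "restored";
      }
      hooks.scheduleInitialRun();
      return "initial-run";
    },
    onTimeout(): AttemptOutcome {
      if (settled) return "ignored";
      settled = true;
      hooks.scheduleInitialRun();
      return "initial-run";
    },
  };
}
```

If the timeout fires first, the action takes the normal initial run path and a snapshot that arrives afterward has no effect on scheduler state.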

## Phased Plan

Current branch status:

| Area | Version 1 status |
| --- | --- |
| Feature flag | Implemented as `EXPERIMENTAL_PERSISTENT_SCHEDULER_STATE`, default off. |
| Observation construction and no-op persistence | Implemented, including batched no-op observation commits. |
| Memory scheduler tables and same-space dirty marking | Implemented. |
| Cross-space read-index mirrors | Implemented with accepted non-atomic mirror writes. |
| Snapshot query surface | Implemented. |
| Runner rehydration primitive and storage-backed lookup | Implemented. |
| Subscription-time clean startup skip | Implemented for recreated actions with valid snapshots. |
| Durable process graph generations | Future work; version 1 uses result cell identity plus graph generation `0`. |
| Demand-targeted dirty recovery beyond subscription startup | Future work. |
| Replication, retention, and mirror repair | Future work. |

The version-1 implementation is gated by the project's common
experimental-option plumbing. With
`EXPERIMENTAL_PERSISTENT_SCHEDULER_STATE=false` or unset, the runner does not
attach scheduler observations to transactions, memory clients do not request
scheduler snapshots, and the memory server does not write scheduler observation
rows, dirty rows, or cross-space mirrors. Snapshot-list requests intentionally
return an empty result while the flag is off, even if a previous flagged run
left scheduler rows in the SQLite database. Unlike `modernCellRep`, this flag
is not a required memory protocol compatibility flag: mismatched peers may still
connect, and the server-side flag determines whether scheduler observation rows
are accepted and served.

### Phase 1: Observe Without Rehydrating

- Define durable action ids from process graph snapshots.
- Emit scheduler observation objects after dependency collection and action
  runs.
- Add internal storage APIs to persist observation rows, including no-op rows.
- Add diagnostics to compare in-memory scheduler state against persisted
  observations.
- Do not change restart behavior yet.

### Phase 2: Persist Server-side Indexes

- Store scheduler read index, write index, materializer index, and latest action
  snapshot rows transactionally with action observations.
- Mirror or catalog cross-space read index rows so a write can find inactive
  readers.
- Keep restart behavior conservative while comparing persisted indexes against
  rebuilt in-memory scheduler state.

### Phase 3: Durable Dirty Propagation

- On every committed actual write, mark overlapping persisted readers
  direct-dirty.
- Propagate stale state through persisted dependency edges or derived
  read/write overlap.
- Include non-action transactions as dirty propagation sources.
- Add diagnostics for dirty-state drift between server tables and live
  scheduler state.

### Phase 4: Restore Indexes Conservatively

- On restart, load graph snapshots, matching scheduler observations, and
  durable dirty/stale state.
- Rebuild trigger indexes, current-known writer indexes, materializer indexes,
  and dependency edges in the runner.
- Mark actions with valid observations as clean or dirty by comparing read
  watermarks against durable action state, with later-write overlap scans as a
  repair path.
- Mark actions without valid observations as unknown.
- Still allow normal execution to repair unknown state.

### Phase 5: Skip Clean Startup Work

- Avoid initial execution for actions whose observations are valid and clean.
- Ensure live effects can subscribe without firing stale callbacks.
- Preserve current behavior for actions with missing observations, invalid
  fingerprints, or dirty reads.
- Implemented in version 1: subscriptions can defer the initial run for a
  storage-backed snapshot lookup and fall back to the normal first run on a
  miss.

### Phase 6: Demand-targeted Dirty Recovery

- Add a rehydration entrypoint for a demanded output, effect, event preflight,
  or explicit `pull()`.
- Walk persisted dependency edges backward from the demand root.
- Execute only unknown or stale upstream actions needed by that demand.

### Phase 7: Replication And Retention

- Decide whether scheduler observations replicate across devices or remain
  local cache state.
- Add garbage collection keyed by piece generation, branch lifecycle, and
  superseded action snapshots.
- Add metrics for cold start skipped actions, unknown-action fallback, and
  observation invalidation reasons.

## Test Strategy

- Unit-test observation construction from `TransactionReactivityLog`, including
  recursive reads, shallow reads, changed writes, declared writes,
  current-known writes, materializer envelopes, and ignored scheduling writes.
- Verify `attemptedWrites` persists only in transaction/CFC records and is not
  present in scheduler observations.
- Add memory v2 tests for internal no-op observation rows: no semantic
  revisions, no normal storage notifications, but durable observation data.
- Add restart tests where a piece rehydrates without rerunning clean
  computations.
- Add restart tests where an unrelated write after the observation does not
  dirty the action.
- Add restart tests where an overlapping write after the observation dirties
  exactly the affected chain.
- Add two-piece tests where piece A writes data read by inactive piece B, then B
  starts dirty without eagerly re-running every node.
- Add non-action transaction tests where a direct edit or handler-originated
  write dirties persisted scheduler readers.
- Add cross-space tests for a write in one space dirtying a piece whose action
  observation is owned by another space.
- Add dynamic dependency tests where a condition changes branches and invalidates
  the prior dependency set correctly.
- Add no-op action tests where dependencies change but output remains equal.
- Add materializer tests for idle execution, demand promotion, and changed-write
  precision after restart.
- Add invalidation tests for implementation fingerprint, schema fingerprint,
  process generation, branch mismatch, and missing observation rows.
- Add cross-space read tests where only one observed space changes.
- Add benchmarks for large clean cold start, targeted dirty rehydration, and
  broad materializer fanout after restart.

Validation evidence for the current branch:

- targeted runner and memory scheduler-state tests
- `HEADLESS=1 deno task test`
- `HEADLESS=1 deno task integration`
- `deno task check`

The current persistent-state benchmark measures in-memory scheduler-index
rehydration only. It does not include process graph loading, storage query
latency, cross-space mirror repair, or full pattern startup.

## Open Questions

- Are scheduler observations local cache state, server-replicated runtime state,
  or something in between? Inactive-piece dirtying requires server-side state
  at least for the server that accepts the write.
- What is the precise durable action id for dynamically-created child actions?
- Should dependency edges be stored directly, or always derived from persisted
  read/write indexes?
- Should cross-space read index rows be mirrored into every read space database,
  or should memory maintain a separate scheduler catalog database?
- Should effects run once after restart even when their dependency observations
  are clean, or should restored subscriptions be considered enough?
- How should observation tables be encrypted or filtered, given that persisted
  read paths may reveal structure even when values are protected by CFC labels?
- What is the retention policy for observations from old process generations
  and inactive branches?
- Can the process graph snapshot and scheduler observation be committed in a
  single setup transaction, or do they need independent lifecycles?
- How should pending optimistic commits be represented if a process restarts
  before confirmation?