# 06 — CFC CFC carries **confidentiality and integrity** through SQLite **per column** (implemented) and **per row** (implemented — Phase 3.a). Per-column static `ifc` is honored — read-label propagation and write-time ceiling checks, derived from the labels declared on the table schema. Per-row, **data-derived** labels are computed from each row's own values by a declarative rule on the table schema: re-derived and attached per row on read, gated on write. A db that declares no `ifc` and no row rule is unaffected and pays nothing. ## Background: the existing label model CFC labels are open-ended sets of **atoms** (immutable JSON values), not a fixed classification lattice ([`packages/runner/src/cfc.ts`](../../../packages/runner/src/cfc.ts); `packages/api/cfc-atoms.ts`). A schema attaches labels through its `ifc` field ([`packages/api/index.ts`](../../../packages/api/index.ts)): - `confidentiality` — who may read (set of atoms; principals are common atoms, e.g. `did:mailto:…`, `did:key:…`). - `integrity` / `addIntegrity` / `requiredIntegrity` — provenance/authenticity guarantees. - `maxConfidentiality` — a ceiling the output may not exceed. - `ownerPrincipal`, `writeAuthorizedBy` — access/authorization control; `{ __ctCurrentPrincipal: true }` resolves to the acting principal at prepare-time. Labels join by **union with structural dedup** (`deepEqual`), and confidentiality is checked by "every atom in the label fits under the destination's `maxConfidentiality` ceiling." This flat representation is the all-singleton degenerate case of the CFC spec's CNF algebra (every atom an independent conjunctive clause; the subset ceiling check is CFC spec §8.10.3's clause subsumption restricted to singletons) — the clause-aware migration path is CFC spec §18.5. Async effects already declare a **write policy** before committing side effects via the sink-request mechanism ([`packages/runner/src/cfc/sink-request.ts`](../../../packages/runner/src/cfc/sink-request.ts)), which is the seam SQLite writes will use. ## Per-column labels (implemented) A column's label is declared on the database table schema's `ifc`, reusing the mechanism schemas already support. `table(...)` (registered via `sqliteDatabase({ tables })`, Section [01](./01-api.md)) passes `ifc` through per field: ```tsx // Shown inside a pattern body. const notes = table({ id: "integer primary key", body: { type: "string", ifc: { confidentiality: ["secret-body"] } }, }); ``` This honors **source B**: labels DERIVED from the schema's declared per-column `ifc`. Tunnelling arbitrary captured labels write→read that are NOT described on the schema (**source A** — a value that flowed in confidential and was stored in a plain column) is deferred; the reserved label column for that future work is `cf_label`. Every path below is gated on the db declaring at least one labeled column (`columnDeclaresIfc`), so unlabeled dbs pay nothing. ### Read — sound column provenance, per-field labels `@db/sqlite` keys result rows by the SELECT's **output names**, which is unsound: `SELECT body AS x` hides a confidential column, `SELECT subject AS from_email` spoofs another, and a name present in several joined tables is ambiguous. So labels are keyed off each result column's **TRUE origin** `(table, column)`, read from SQLite column metadata via FFI (`sqlite3_column_origin_name` / `sqlite3_column_table_name`) on the prepared statement ([`column-origin.ts`](../../../packages/memory/v2/sqlite/column-origin.ts)): | query output | origin | meaning | |---|---|---| | `body AS x` | `(notes, body)` | alias resolved | | `subject AS from_email` | `(emails, subject)` | spoof defeated | | `upper(body)` | `(null, null)` | no single source | | JOIN / UNION / CTE / view / subquery | true origin | disambiguated | This binds the SAME libsqlite3 `@db/sqlite` already loaded (it is compiled with `SQLITE_ENABLE_COLUMN_METADATA`; it just doesn't expose those symbols), getting its path the way `@db/sqlite` does — plug's `download({ cache: "use" })` (a cache hit returns the already-downloaded file; no scan, no network) — with `DENO_SQLITE_PATH` as an override. If the symbols can't be bound, a labeled query fails loudly rather than mislabeling. Provenance is captured **server-side** (where the prepared statement lives); the **runner** maps each origin → the column's `ifc` and writes the result rows under a **per-field label schema** (`labelResultSchema`), so a consumer reading `q.result[i].` inherits that column's label: - An origin column's `ifc` is copied to its result field. - A `null`-origin column (expression / literal / aggregate) does NOT refuse the query — it inherits the conservative combined label of the db's labeled columns via the runtime's `mergeLabel` (the integrity-combine semantics for *derived* data are an open question — see CT-1668). - Two columns projecting to the same output name **refuse** the query (the per-row label would be ambiguous). The label is attached when the **query result is written back into the result cell** (the read path), in a transaction separate from — and after — the read. This is distinct from a SQL mutation, which joins the *caller's* transaction and is only ceiling-checked; the result-cell write is its own CFC-relevant write and must be prepared like any `ifc`-bearing write before it commits. The per-field label lands on each split-out row entity, and a downstream consumer inherits it by accumulating labels across the dereferences its read traverses (not from the label of a single navigated cell). > Implementation: the result write is the post-commit effect of `db.query`, > committed via `runtime.editWithRetry` (which runs `prepareTxForCommit`); the > SQL mutation is `db.exec` recording a `sqlite` op on `this.tx`; downstream > inheritance is `cfcLabelViewForDereferenceTraces`, not `cfcLabelViewForCell`. ### Write — ceiling check `db.exec` checks each bound value's confidentiality (read off the value via `cfcLabelViewForCell`) against the target column's `maxConfidentiality`, before recording the write. The target column is resolved by a bounded, **fail-closed** parser (`parseWriteParamColumns` / `parseWriteTable`): - A resolution miss — unknown table, column not in the declared schema, a schema-qualified target, an interleaved literal in `VALUES`, `UPDATE OR `, an identifier case mismatch — rejects a labeled value rather than treating "no ceiling found" as "no ceiling". Column match is case-insensitive. - Named/object params fail closed for a labeled value (a bind name isn't reliably the column, and SET vs WHERE can't be told apart without parsing) — use positional `?` with an explicit column list. - Unlabeled values, and columns without a ceiling, are unaffected. This is a clean fit because the row schema is *already* a JSON Schema and `ifc` is *already* understood by `ContextualFlowControl`. ## Per-row labels, derived from row data (implemented — Phase 3.a) The motivating case: an email row's confidentiality should be the sender, every recipient, and the mailbox owner — constructed from row data, e.g. `"did:mailto:" + from_addr`. This is **data-dependent**, so it cannot be a static schema label; it is computed from the row's own values on write and **re-derived from stored data** on read. The rule is a **declarative projection** over (stored columns, fixed db properties), never pattern code: the atomic SQLite write lands inside `applyCommitTransaction` server-side (Section [04](./04-server-execution-and-transactions.md)), where no sandbox code runs, so the rule must serialize into the table schema and be evaluable by a small fixed evaluator on either side. That buys the audit property the design is built around: **a row's label is a pure function of its stored columns** — recomputable at any time, by either side, identically. ### Authoring surface — `table(columns, rule)` `table()` takes the rule as an optional second argument ([`schema.ts`](../../../packages/memory/v2/sqlite/schema.ts)); the helpers live on the builder's `cfSqlite` namespace (one import for the whole vocabulary). The rule receives a shallow field handle `f` (one accessor per declared column) and returns `{ confidentiality?, integrity? }` — two **independent** expressions: ```tsx // Shown inside a pattern body. const { table, all, principal, match, whenMatches, authoredBy, dbOwner } = cfSqlite; // Pull address tokens out of a messy "Name , addr" recipient line. const ADDR = /[^\s<>,;"]+@[^\s<>,;"]+/g; const emails = table( { id: "integer primary key", from_addr: "text", to_addrs: "text", auth: "text", body: "text", }, (f) => ({ // Row confidentiality = sender ∧ every recipient ∧ the mailbox owner. confidentiality: all( principal("mailto", match(f.from_addr, ADDR, { min: 1 })), principal("mailto", match(f.to_addrs, ADDR)), dbOwner(), ), // Provenance gated on the row's own anti-spoof evidence (forgeable by the // row's writer — see the integrity rule below). integrity: whenMatches( f.auth, /dmarc=pass/, authoredBy(principal("mailto", match(f.from_addr, ADDR, { min: 1 }))), ), }), ); ``` **"The sender and all recipients" really means *any-of*** — one OR-clause `[[sender ∨ recipients ∨ owner]]` (an authored disjunction, CFC spec §3.1.8), written `any(...)`. Under the runner's current flat **conjunctive** lowering the same atom set means *all-of*: a row fits only a ceiling listing every participant, so per-user views return nothing. The surface keeps the two combinators explicit — `all(...)` (conjunctive clauses, works today) and `any(...)` (one OR-clause, **errors at `table()` time** until the clause-aware label profile lands, CFC spec §18.5.3) — so the wrong semantics is never shipped silently. The closed helper set ([`row-label.ts`](../../../packages/memory/v2/sqlite/row-label.ts); each helper returns a plain-JSON AST node): | helper | meaning | |---|---| | `match(f.col, /re/, {group?, min?})` | run the regex (forced global) over the column's text ⟹ the ordered list of matches (or capture `group`). The universal field extractor — split + clean in one; subsumes JSON arrays and dirty `Name ` lists. **Strict-if-present:** a non-empty value yielding zero matches fails closed at evaluation (never under-label); `min` makes the field a required anchor | | `principal(protocol, match(…))` | `did::` for each extracted `v` (distributes over the match list). Normalization is protocol-implied: `mailto`/`web` lowercase + trim; `did:key` untouched (base58 is case-sensitive) | | `dbOwner()` | the db's owner — the principal that created the SqliteDb cell, resolved from the db ref (a fixed db property, so the rule stays pure; for a per-user-scoped db this is that user) | | `all(...t)` | separate **conjunctive** clauses, one per atom — today's only confidentiality combinator | | `any(...t)` | **one OR-clause**: any alternative satisfies it (CFC spec §3.1.8). Serializes, but **rejected at `table()` time** until the clause-aware profile lands — never silently lowered to the conjunctive form | | `intersect(...t)` | set ∩ over integrity atom sets (the trust-floor meet) — integrity only | | `whenMatches(f.col, /re/, term)` | include `term` only when the regex TESTS true against the column (gate trust, or a data-dependent conjunct, on extracted metadata). One **fused** helper — a bare `when(matches(…))` pair would collide with the builder's control-flow `when`, whose transformer lowering matches by NAME and mangles any local so named. NB: a gated confidentiality alternative is absent from some rows, so it never has the common-alternative property (CFC spec §8.17.4) | | `authoredBy(p)` / `endorsedBy(p)` | integrity (provenance) claims over a `principal(…)` term — minted as self-describing claim atoms, see below | | `constant(atom)` | a literal atom (escape hatch; named `constant` — `const` is reserved) | `f.` is the only way to name data and may appear **only** as the field argument of `match` / `whenMatches`. **No acting-principal term in rules.** `currentUser()` is deliberately not a helper: a rule must compute the **same** label at the write gate, the server commit, and read re-derivation — re-derived at read time an acting-principal term resolves to the *reader*, so in an `any(...)` clause it would place every reader into the OR-set (self-granted access), and even under `all(...)` the persisted label would flap with whoever queried last. AST validation rejects it. The acting user belongs in the **declared output ceiling** (below), resolved at prepare time (CFC spec §8.17.3); a writer-derived principal must come from a stored column; "the mailbox owner" is `dbOwner()`. **Integrity from row content is forgeable.** The `whenMatches(f.auth, /dmarc=pass/, …)` gate reads text any row writer controls: a writer who fabricates `auth: "…dmarc=pass…"` would mint authored-by-the-sender provenance for a message the sender never wrote. So `authoredBy`/`endorsedBy` mint self-describing `claimed-authored-by` / `claimed-endorsed-by` atoms — never the trusted `AuthoredBy` family directly; policies may upgrade the claim via the provider-trust pattern (CFC spec §5.6.3). Deriving *confidentiality* from row content is, by contrast, bounded by the writer's own release authority (choosing the audience of your own row). Relatedly, a global `match` can extract multiple address-shaped tokens from a crafted display name (`"Mallory " `): acceptable for confidentiality alternatives, but **more than one match in an integrity-bearing position fails closed** — a provenance subject must be unique. ### The serialized spec and the shared evaluator `table()` runs the rule once at definition time — each helper returns its AST node, so the returned object literally **is** the AST — validates it, and attaches it to the table schema as `rowLabel` (`{version: 1, confidentiality?, integrity?}`), riding the existing `db.tables` wire paths unchanged. Validation throws at authoring on an unknown column, an unknown op, an `any()` node (until OR-clauses land), an integrity-only op in confidentiality position (and vice versa), or a regex that fails the safety lint (length cap + nested-quantifier/ReDoS reject — author regexes run at the trusted evaluation points, so a pathological pattern would be a DoS vector). The same validation **re-runs on wire-supplied specs** before any evaluation: "couldn't validate" is never "no label". One pure evaluator — [`evaluateRowLabel`](../../../packages/memory/v2/sqlite/row-label.ts), in `packages/memory` beside `table()` — is shared by the write gate, read re-derivation, and (future) server-side commit evaluation, so the sides can never drift: the audit property holds by construction. It is fail-closed end to end: an absent rule-input field, a non-string value where a regex needs text, a `dbOwner()` with no owner in context, a strict-if-present zero match, a `min` violation, a multi-match integrity subject, an unknown op — each returns `{error}`, never a partial label; callers turn `{error}` into a refused query / rejected write. ### Read — re-derive per row, attach, ceiling (`db.query`) Because the label is re-derived from stored data, the read side is self-sufficient: a row is correctly labeled even if it predates the rule. Implemented in the `sqliteQuery` flush ([`sqlite-builtins.ts`](../../../packages/runner/src/builtins/sqlite-builtins.ts)) with the pure half in [`row-label-read.ts`](../../../packages/runner/src/builtins/sqlite/row-label-read.ts): 1. **Locate the rule's inputs by TRUE origin, never output name** (Phase 2's provenance machinery): for each column the rule reads, find the result column whose origin is `(ruleTable, column)`. `SELECT subject AS from_addr` does not satisfy `from_addr` — no spoof. A rule input missing from the projection, or two result columns sharing its origin, **refuses the query**; so does a rule-bearing read without column provenance, or a query touching more than one rule-bearing table (cross-rule joins are deferred). 2. **Aggregates refuse.** Any null-origin column (`COUNT(*)`, expression) on a rule-bearing query refuses: the output derives from every contributing row, and that cannot be re-labeled per row. Rule-less tables keep Phase 2's conservative static merge. (Two future lifts: once OR-clauses land, a principal listed unconditionally in `any(...)` is an alternative in every row's clause — the common-alternative property, CFC spec §8.17.4 — and reads aggregates by the ordinary algebra; an author-declared `derived:` fallback label remains a possible follow-up.) 3. **Evaluate per row, attach per row.** Each result row already splits into its own entity doc; the flush writes each labeled row doc **directly** (its own id, root path) under a root-`ifc` schema. Keyed by the row doc's id, the per-row root label coexists with Phase 2's per-column field labels on the same doc and dominates its fields by prefix-match (a field of a row is at least as confidential as the row — inheriting down can only raise). Downstream consumers inherit it through dereference traces (`cfcLabelViewForDereferenceTraces`), exactly like per-column labels. **Declared output ceiling.** A query may declare the maximum confidentiality its result may carry — a consumer contract checked per row against the computed label (per-row joined with the projection's per-column atoms), **not** reader clearance (no read-time clearance model exists): ```tsx // Shown for illustration only. const skim = db.query(sql, { maxConfidentiality: [{ __ctCurrentPrincipal: true }, { __ctDbOwner: true }], onExceed: "skip", }); ``` The ceiling is declared once: either the query's `maxConfidentiality` option or `ifc.maxConfidentiality` on the typed Row schema (declaring both errors). Placeholder atoms `{__ctCurrentPrincipal: true}` (the acting user) and `{__ctDbOwner: true}` (the db's owner) resolve before the check. A flat ceiling list keeps the **conjunctive** reading permanently — every entry required of the observer (clause subsumption restricted to singletons, CFC spec §8.10.3); the reader-enumeration reading ("observed by any of these") is an explicit `any([...])` ceiling and lands with OR-clauses. Honesty note: under today's conjunctive lowering a multi-participant row fits only a ceiling listing every participant, so narrow ceilings skip almost everything — per-user views become genuinely useful once OR-clauses land. `onExceed` governs a ceiling miss: **`"fail"`** (default) refuses the whole query; **`"skip"`** drops the offending rows and returns the rest. Skipping releases one row-presence bit per withheld row, so it is a **declared** existence release (CFC spec §8.17.2, invariant 14): opt-in in the query contract, required to be policy-permitted and auditable — and it never applies to aggregates (a withheld row already contributed server-side; a count cannot be un-counted), where the mode is rejected outright. ### Write — the runner gate (`db.exec`) The write side is a gate; it does not determine the row's effective label (read re-derivation does). Implemented in `Cell.exec` ([`cell.ts`](../../../packages/runner/src/cell.ts)) with the pure half in [`row-label-write.ts`](../../../packages/runner/src/builtins/sqlite/row-label-write.ts); zero cost until a table declares a rule: - An **attributable INSERT/REPLACE** (positional `?` params with an explicit column list — Phase 2's parser) evaluates the rule over the bound values ⟹ the prospective row label, recorded as the write's CFC policy input via the sink-request seam before commit. **No-laundering:** every labeled bound value's confidentiality must be captured by that computed label — and an empty computed label captures *nothing* (it is not an unrestricted ceiling) — else storing the value would launder its label away; fail closed. - An **UPDATE**'s SET clause must consist entirely of simple `col = ?` assignments — attributed from the SQL text, not the bind params; a literal, expression, or subquery assignment is unattributable and **fails closed** (so `SET col = 'x'` cannot bypass the rule-input check). An attributable UPDATE still may not write a rule-input column (the post-image row label cannot be computed runner-side — the other inputs are unknown; server-side 3.c lifts this); one that writes only non-input columns with unlabeled values passes, while a labeled value outside an evaluable INSERT fails closed. DELETE stores nothing and passes. - Everything the runner cannot attribute on a rule-bearing table — `INSERT…SELECT`, upsert, columnless INSERT, named params, an unparseable target — **fails closed**. ### Fail-closed rules (consolidated) 1. **Authoring:** unknown column, unknown op, unsafe regex, `any()` ⟶ `table()` throws. The same validation re-runs on wire-supplied specs before evaluation. 2. **Read, unresolvable input:** a rule input missing from the projection by origin, or ambiguous (two columns, same origin) ⟶ refuse the query. 3. **Read, bad data:** evaluator `{error}` (non-string regex input, strict-if-present zero match, `min` miss, multi-match integrity subject) ⟶ refuse the query. 4. **Read, unattributable output:** any null-origin column on a rule-bearing query ⟶ refuse; `skip` never applies to aggregates. 5. **Read, ceiling exceeded:** `onExceed` decides — fail the query (default) or skip the row (declared opt-in, row-returning queries only). 6. **Write, unattributable:** fail closed (Phase 2's set, plus rule-input UPDATEs). 7. **Write, laundering:** a labeled input not captured by the computed row label ⟶ reject the write. Never treat "couldn't resolve / couldn't evaluate" as "no label". ### Combining the label sources Within a rule, clause shape is author-controlled and explicit (`any` / `all` / `intersect`), and confidentiality and integrity stay **separate** expressions — coupling them through one lattice op would nuke integrity the moment a confidentiality-only term (`dbOwner()`) joins. Folding the per-row label with Phase 2's per-column label is a join (flat-atom union today; clause concatenation once OR-clauses land — a merge must never union the *alternatives* of two different clauses, CFC spec §3.1.8). The integrity cross-merge remains a swappable seam pending CT-1668; the CFC spec endpoint is class-aware **intersection** whenever labels of different values fold into a derived one (CFC spec §3.1.6.2, §8.6.2). Demos: [`cfc-row-label-mailbox`](../../../packages/patterns/cfc-row-label-mailbox/main.tsx) (rule + ceiling + skip + aggregate refusal end to end) and [`cfc-row-label-records`](../../../packages/patterns/cfc-row-label-records/main.tsx) (per-row ∧ per-column composition on one row). e2e: [`sqlite-cfc-row-label.test.ts`](../../../packages/runner/integration/sqlite-cfc-row-label.test.ts). ## Why this stays declarative Keeping row labels as a declarative projection (rather than a callback into pattern code) lets the **server** evaluate them at commit time — necessary because the atomic write happens inside `applyCommitTransaction` server-side (Section [04](./04-server-execution-and-transactions.md)), where pattern code does not run. It also makes labels auditable: a row's confidentiality is a pure function of its stored columns, recomputable at any time. ## Phasing 1. **v1:** no enforcement; `_cf_link` round-tripping only. Schemas *may* carry `ifc` but it is ignored. 2. **Phase 2 — per-column: _implemented_** (source B). Honors static `ifc` on columns for read-label propagation (sound column-origin provenance, per-field labels) and write-time ceiling checks, reusing `ContextualFlowControl`. Deferred within Phase 2: **source A** (tunnelling write→read labels not on the schema, reserved column `cf_label`). 3. **Phase 3.a — per-row: _implemented_** (#3974). The rule surface, AST + validation, the shared evaluator, read-side per-row labeling with declared ceilings and `onExceed`, and the runner-side write gate — everything above. Deferred: - **OR-clauses:** `any(...)` rules and `any([...])` reader-enumeration ceilings, gated on the clause-aware label profile (CFC spec §18.5); also unlocks common-alternative aggregate reads (CFC spec §8.17.4). - **3.b — read-time clearance:** filtering by *who is asking* rather than a declared ceiling; rides the same `onExceed` surface, needs OR-clauses to be useful. Own design when it lands. - **3.c — server-side commit evaluation:** evaluate the shared evaluator against the **true** committed row inside `applyCommitTransaction` (read-back by rowid), roll back on violation — covers the non-attributable writes the runner gate fails closed on today. The no-laundering half stays runner-side (the server has no input-value labels).