# Test Plan: `cf exec` + FUSE Callable Files

## Reconciliation

The approved testing strategy still holds against the current implementation plan. The only material clarifications from the plan are:

- mount-state and shebang-shim behavior are first-class acceptance criteria, so helper coverage for absolute identity persistence, longest-prefix mount lookup, stale cleanup, and shim generation is required
- the real end-to-end flow must cover both `pieces/...` and `entities/...` callable paths
- the integration fixture must include a tool input field literally named `help` to prove the raw-argv precedence rule

These adjustments do not change scope or cost from the approved strategy.

## Harness requirements

### `Real FUSE interaction harness`

- **What it does**: Starts from a local toolshed, creates a temp identity and space, deploys a deterministic fixture pattern, mounts a real FUSE filesystem, performs real filesystem reads and writes, runs `cf exec`, and inspects resulting piece state and tool stdout.
- **What it exposes**: Mounted file paths, CLI stdout/stderr, exit codes, observed file contents, and state verification through existing `cf piece get`/`cf piece inspect` style commands.
- **Estimated complexity**: Medium-high.
- **Tests depending on it**: 1, 2, 3, 4.
- **Implementation notes**: Build this as `packages/cli/integration/fuse-exec.sh` plus `packages/cli/integration/pattern/fuse-exec.tsx`. It requires a reachable local `API_URL`, `CF_CLI_INTEGRATION_USE_LOCAL=1`, and a FUSE provider on the test machine.

### `CLI exec focused harness`

- **What it does**: Exercises `cf exec` resolution, argv handling, schema-derived flag parsing, help rendering, and execution dispatch without requiring a live mount.
- **What it exposes**: Temp mount-state directories, synthetic mounted paths and `meta.json` files, stubbed manager/piece/runtime collaborators, captured stdout/stderr, and structured error assertions.
- **Estimated complexity**: Medium.
- **Tests depending on it**: 5, 6, 7.
- **Implementation notes**: Build this in `packages/cli/test/exec.test.ts`. Keep it focused on the CLI contract; use real temp files for mount state and mounted metadata, and use stubs only for the runtime/piece layer where the real FUSE surface is already covered by tests 1 to 4.

### `FUSE callable representation harness`

- **What it does**: Validates the in-memory callable tree, the shared mounted-callable path parser, readable callable bytes, and JSON-sigil synthesis without mounting FUSE.
- **What it exposes**: Pure parser results, in-memory `FsTree` nodes, node kinds, rendered script bytes, and `.json` sibling contents.
- **Estimated complexity**: Low.
- **Tests depending on it**: 8, 9.
- **Implementation notes**: Extend `packages/fuse/tree-builder.test.ts` and add `packages/fuse/callable-path.test.ts`.

### `Mount-state helper harness`

- **What it does**: Validates persisted mount metadata, absolute-path normalization, stale-entry cleanup, and shim generation independently of `cf exec`.
- **What it exposes**: Temp state directories, synthetic mount entries, shim file contents, and liveness decisions.
- **Estimated complexity**: Low.
- **Tests depending on it**: 5, 10.
- **Implementation notes**: Extend `packages/cli/test/fuse.test.ts` rather than building a new helper-only test file.

### `Runner semantic guardrail harness`

- **What it does**: Preserves the existing semantic baselines that `cf exec` relies on for schema lookup and tool-shape formatting.
- **What it exposes**: `asSchemaFromLinks()` behavior, PatternToolResult schema formatting, and real pattern-tool execution examples.
- **Estimated complexity**: Low.
- **Tests depending on it**: 11, 12.
- **Implementation notes**: Reuse `packages/runner/test/cell-callbacks.test.ts`, `packages/runner/test/schema-format.test.ts`, and `packages/runner/test/generate-object-tools.test.ts`.
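
The longest-prefix lookup and stale-entry filtering that the `Mount-state helper harness` validates can be illustrated with a minimal sketch. The `MountEntry` field names follow the persisted fields listed in test 10; the helper names `isUnder` and `findOwningMount`, the injected `isAlive` callback, and the assumption that mountpoints are already normalized absolute paths without a trailing slash are all hypothetical and not taken from the implementation plan.

```typescript
// Hypothetical entry shape; field names mirror the persisted fields named in test 10.
interface MountEntry {
  mountpoint: string; // assumed: normalized absolute path, no trailing slash
  apiUrl: string;
  identity: string; // assumed: absolute path to the identity file
  pid: number;
  startedAt: string;
}

// True when `target` is the mountpoint itself or a path beneath it.
// Comparing against `mountpoint + "/"` keeps `/mnt/space` from matching `/mnt/spacex`.
function isUnder(mountpoint: string, target: string): boolean {
  const base = mountpoint.endsWith("/") ? mountpoint : mountpoint + "/";
  return target === mountpoint || target.startsWith(base);
}

// Skips entries whose owning process is no longer alive (stale entries),
// then returns the live entry with the longest matching mountpoint.
function findOwningMount(
  entries: MountEntry[],
  target: string,
  isAlive: (pid: number) => boolean, // injected so tests can fake liveness
): MountEntry | undefined {
  let best: MountEntry | undefined;
  for (const entry of entries) {
    if (!isAlive(entry.pid)) continue;
    if (!isUnder(entry.mountpoint, target)) continue;
    if (best === undefined || entry.mountpoint.length > best.mountpoint.length) {
      best = entry;
    }
  }
  return best;
}
```

Injecting `isAlive` rather than probing the process table directly is one way to let a helper test mark an entry stale without spawning or killing real processes. Deleting stale state files from disk is out of scope for this sketch.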
## Named sources of truth used below

- **User description**: The original feature request in the trycycle transcript.
- **Implementation plan**: [docs/plans/2026-03-17-ct-exec-fuse-callables.md](./2026-03-17-ct-exec-fuse-callables.md), especially `User-visible behavior`, `Important contracts`, and Tasks 1 to 5.
- **FUSE spec**: [2-path-scheme.md](../specs/fuse-filesystem/2-path-scheme.md), [3-json-mapping.md](../specs/fuse-filesystem/3-json-mapping.md), and [4-read-write.md](../specs/fuse-filesystem/4-read-write.md).
- **Handler CLI workflow doc**: [handlers-cli-testing.md](../common/workflows/handlers-cli-testing.md).
- **Runner guardrails**: Existing tests in [cell-callbacks.test.ts](../../packages/runner/test/cell-callbacks.test.ts), [schema-format.test.ts](../../packages/runner/test/schema-format.test.ts), and [generate-object-tools.test.ts](../../packages/runner/test/generate-object-tools.test.ts).

## Test plan

1. **Name**: Mounted handler files under `pieces/...` are readable, executable through `cf exec`, and still accept legacy write-through.
   **Type**: scenario
   **Disposition**: new
   **Harness**: `Real FUSE interaction harness`
   **Preconditions**: A local toolshed is running; the deterministic fixture pattern is deployed into a temp space; the space is mounted through `cf fuse mount`; the fixture exposes one handler with required scalar input.
   **Actions**: List the mounted piece directory; assert the expected `*.handler` file exists under `input/` or `result/`; read the file and capture the first line; run `cf exec --help`; run `cf exec invoke ...`; run `cf exec ...` again without an explicit verb; inspect piece state after each call; write JSON directly with `echo ... > ` and inspect piece state again.
   **Expected outcome**: Per the User description, the Implementation plan `User-visible behavior` items 1, 3, 4, 5, 6, and 10, and the FUSE spec read/write semantics, the handler remains present as `*.handler`, `head -n1` starts with `#!` and contains ` exec`, top-level `--help` succeeds without invoking the handler, explicit and implicit `invoke` both exit `0` and mutate the backing piece state through the mounted handler within a generous 5 second timeout, and the legacy write path still succeeds and mutates the same state.
   **Interactions**: CLI command parsing, mount-state lookup, FUSE daemon reads and writes, mounted-file to cell resolution, piece controller writes, runtime idle/sync waiting, and kernel cache invalidation.

2. **Name**: Mounted tool files under `pieces/...` surface as `.tool`, hide internal tool wiring, and run with schema-derived flags plus bound `extraParams`.
   **Type**: scenario
   **Disposition**: new
   **Harness**: `Real FUSE interaction harness`
   **Preconditions**: The mounted fixture exposes one pattern tool with one bound `extraParam`, plus deterministic JSON output that depends on both user input and the bound parameter.
   **Actions**: List the mounted piece directory and the relevant `.json` sibling; assert a `*.tool` entry exists; assert the old `pattern/extraParams` internals are not exposed as normal mounted children; read the tool file and capture the first line; run `cf exec --help`; run `cf exec run --flag ...`; run `cf exec ...` again without an explicit verb; capture stdout for both runs.
   **Expected outcome**: Per the User description, the Implementation plan `User-visible behavior` items 2, 3, 4, 5, 6, 8, and 9, and the FUSE spec path and JSON mapping, the mounted surface shows `*.tool` instead of expanded tool internals, reading the file returns a shebang-backed script whose first line contains ` exec`, top-level help renders schema-driven usage, explicit and implicit `run` both exit `0` within 5 seconds, and stdout is the expected JSON result that reflects both the provided flags and the bound `extraParams`.
   **Interactions**: FUSE callable discovery, callable JSON-sigil rendering, `cf exec` schema translation, runtime pattern execution, stdout serialization, and mounted tree layout.

3. **Name**: `cf exec run --help` is parsed as the schema field when the tool input schema contains `help`, while `cf exec --help` still prints top-level help.
   **Type**: scenario
   **Disposition**: new
   **Harness**: `Real FUSE interaction harness`
   **Preconditions**: The fixture tool input schema includes a top-level field literally named `help`, and tool output visibly reflects the field value so the execution path is observable.
   **Actions**: Run `cf exec --help`; then run `cf exec run --help `; capture stdout, stderr, and exit code for both invocations.
   **Expected outcome**: Per the User description and the Implementation plan `User-visible behavior` items 6 and 7 plus Task 5, the top-level invocation prints command help and does not execute the tool, while the post-verb invocation treats `--help` as the schema field, executes successfully, and returns output incorporating the provided field value instead of CLI help text.
   **Interactions**: Raw argv preservation in the CLI, schema-derived option parsing, help rendering, and end-to-end tool execution.

4. **Name**: Callable files reached through `entities//...` resolve the same backing cells as the corresponding `pieces/...` paths.
   **Type**: differential
   **Disposition**: new
   **Harness**: `Real FUSE interaction harness` plus `Reference comparison harness`
   **Preconditions**: The mounted fixture piece is reachable through both `pieces//...` and `entities//...`.
   **Actions**: Invoke the same handler once via its `pieces/...` path and once via its `entities/...` path with identical input; run the same tool once via each path with identical input; compare resulting piece state and tool stdout across both paths.
   **Expected outcome**: Per the Implementation plan `Important contracts` items 2, 6, and 7 and Task 5, plus the FUSE spec path scheme, the `entities/...` path is accepted, both paths address the same underlying callable cell, handler side effects are identical, and tool stdout is identical for the same inputs.
   **Interactions**: Shared mounted-callable path parser, entity resolution under FUSE, piece metadata lookup, CLI resolution, and runtime execution.

5. **Name**: `cf exec` resolves mounted callable files from persisted mount state and sibling `meta.json`, not from display-name guesses.
   **Type**: integration
   **Disposition**: new
   **Harness**: `CLI exec focused harness` plus `Mount-state helper harness`
   **Preconditions**: Temp mount-state entries exist for multiple mounts, including one nested mount path; the chosen mounted piece directory has a de-duped display name like `notes-2`; a sibling `meta.json` contains the canonical piece ID.
   **Actions**: Resolve an absolute mounted callable path under the nested mount; resolve a callable path under `pieces/notes-2/...`; inspect which mount entry and piece ID the resolver selects.
   **Expected outcome**: Per the Implementation plan `Important contracts` items 2, 3, 4, and 5 and the FUSE spec piece naming rules, the resolver chooses the longest matching mountpoint, requires persisted mount metadata, treats identity and mountpoint as absolute paths, ignores the de-duped display name as a stable identifier, and uses sibling `meta.json` to recover the canonical piece ID.
   **Interactions**: Filesystem temp state, mount-state lookup, mounted-piece metadata parsing, and CLI resolution logic.

6. **Name**: `cf exec` rejects invalid paths and invalid arguments with readable CLI errors instead of stack traces.
   **Type**: boundary
   **Disposition**: new
   **Harness**: `CLI exec focused harness`
   **Preconditions**: Command parsing and resolution helpers are available with stubbed collaborators.
   **Actions**: Run `cf exec` against a non-mounted absolute path, a mounted non-callable file, a stale mount entry, a `.tool` path with an unknown flag, a missing required field, an invalid enum value, and a mixed `--json` plus generated-flags invocation.
   **Expected outcome**: Per the User description, the Implementation plan `User-visible behavior` item 10 and `Keep schema flags simple and predictable`, each case exits non-zero with a clear CLI error describing the problem, and none of the cases surface a raw stack trace.
   **Interactions**: CLI parser, mount resolution, schema-derived flag validation, and error rendering.

7. **Name**: Schema-derived parsing covers the supported flag surface for `cf exec`.
   **Type**: integration
   **Disposition**: new
   **Harness**: `CLI exec focused harness`
   **Preconditions**: A command-spec helper exists for handler and tool schemas with representative object and non-object inputs.
   **Actions**: Parse representative argv for handler and tool callables covering default verb selection, top-level `--help`, boolean flags, `--no-flag`, `--flag=true|false`, arrays and objects passed as JSON strings, non-object schemas via `--value`, and `--json` by itself.
   **Expected outcome**: Per the User description and the Implementation plan `User-visible behavior` items 5 to 8 plus `Keep schema flags simple and predictable`, handlers default to `invoke`, tools default to `run`, top-level `--help` always returns help, booleans accept the supported forms, arrays and objects parse from JSON strings, non-object schemas use `--value`, and `--json` is accepted only when not mixed with generated flags.
   **Interactions**: CLI parsing, schema normalization, and help rendering.

8. **Name**: FUSE tree synthesis renders `.handler` and `.tool` callables as readable synthetic files and replaces callable values with explicit sigils in `.json` siblings.
   **Type**: regression
   **Disposition**: extend
   **Harness**: `FUSE callable representation harness`
   **Preconditions**: In-memory tree-building helpers can classify top-level callable children from representative handler and pattern-tool values.
   **Actions**: Build a tree for representative `input` and `result` objects containing scalar fields, handler callables, and pattern-tool callables; inspect the tree nodes and the `.json` sibling payloads; read the synthetic callable bytes from the tree representation.
   **Expected outcome**: Per the User description, the Implementation plan Task 2, and the FUSE spec path and JSON mapping, ordinary fields remain readable as before, top-level callables become `*.handler` or `*.tool`, callable internals are skipped from normal expansion, `.json` siblings render explicit `{"/handler":"name"}` or `{"/tool":"name"}` sigils, and synthetic callable reads return bytes whose first line is a shebang containing ` exec`.
   **Interactions**: Callable classification, tree node creation, JSON rendering, and synthetic read content generation.

9. **Name**: Shared mounted-callable path parsing accepts only supported top-level callable paths and rejects unsupported nested/internal paths.
   **Type**: unit
   **Disposition**: new
   **Harness**: `FUSE callable representation harness`
   **Preconditions**: The shared mounted-callable path parser exists as a pure helper.
   **Actions**: Parse representative paths for `pieces/.../*.handler`, `pieces/.../*.tool`, `entities/.../*.handler`, and `entities/.../*.tool`; then parse non-callable paths and nested internal tool paths such as `pattern/...` or `extraParams/...`.
   **Expected outcome**: Per the Implementation plan `Architecture decisions` item 2 and `Important contracts` items 6 and 7, only the supported top-level callable shapes are accepted, each accepted path returns the correct `rootKind`, `rootName`, `cellProp`, `cellKey`, and `callableKind`, and unsupported nested/internal paths are rejected.
   **Interactions**: Pure path parsing only.

10. **Name**: Mount-state helpers persist absolute identity, generate a stable exec shim, prefer the longest mount prefix, and clean stale entries.
    **Type**: regression
    **Disposition**: extend
    **Harness**: `Mount-state helper harness`
    **Preconditions**: Temp state directories and synthetic process entries are available; shim output can be read back from disk.
    **Actions**: Persist mount-state entries with relative and absolute identities; generate the exec shim; read all state entries; resolve the owning mount for nested paths; mark one entry stale and resolve again.
    **Expected outcome**: Per the Implementation plan Task 1 and `Important contracts` items 4 and 5, persisted entries include `mountpoint`, `apiUrl`, `identity`, `pid`, and `startedAt`, the stored identity is absolute, the shim content points at `packages/cli/mod.ts`, mount resolution chooses the longest matching mountpoint, and stale entries are ignored and cleaned up before selection.
    **Interactions**: State-file I/O, path normalization, process-liveness checks, and shim generation.

11. **Name**: `asSchemaFromLinks()` still resolves callable schemas when the child cell itself does not carry one.
    **Type**: regression
    **Disposition**: extend
    **Harness**: `Runner semantic guardrail harness`
    **Preconditions**: Runner tests can construct cells whose schema must be recovered through linked pattern metadata rather than from the child cell directly.
    **Actions**: Extend the existing `asSchemaFromLinks()` characterization with a callable-shaped child cell representative of the `cf exec` lookup path and assert the resolved schema is the linked schema.
    **Expected outcome**: Per the User description, the Implementation plan `Important contracts` item 1, and the existing runner schema-resolution contract, `asSchemaFromLinks()` resolves the linked schema instead of returning `undefined`, ensuring callable discovery and help generation use the backing schema rather than stale local metadata.
    **Interactions**: Runner cells, source-link traversal, and schema resolution only.

12. **Name**: Pattern-tool schema formatting and execution remain suitable as `cf exec` help and runtime baselines.
    **Type**: regression
    **Disposition**: extend
    **Harness**: `Runner semantic guardrail harness`
    **Preconditions**: Existing PatternToolResult examples and mixed handler plus `patternTool(...)` execution examples remain available in the runner tests.
    **Actions**: Extend the schema-format examples to cover the concrete pattern-tool shapes used by the `cf exec` fixture and run the existing bound-`extraParams` pattern-tool examples.
    **Expected outcome**: Per the User description, the Implementation plan `User-visible behavior` items 8 and 9, and the existing runner guardrails, help formatting shows the user-facing `extraParams` shape rather than leaking internal `pattern` structure, output-schema heuristics stay display-only, and bound-`extraParams` pattern tools still run to completion with the expected structured result.
    **Interactions**: Runner schema formatter, pattern-tool metadata, and runtime execution.

## Coverage summary

### Covered action space

- Real mounted behavior for reading `*.handler` and `*.tool` files, executing them with `cf exec`, and preserving legacy handler writes.
- Both callable path families the plan explicitly supports: `pieces/...` and `entities/...`.
- Help and parsing behavior that the user explicitly called out: top-level help, default verbs, post-verb `--help` precedence, generated flags, and `--json`.
- Mount-state and shebang-shim behavior that `cf exec` depends on outside the daemon.
- FUSE layout and representation rules: readable callable files, `.tool` synthesis, and callable sigils inside `.json` siblings.
- Schema resolution and PatternToolResult formatting guardrails from the runner layer that inform callable discovery and help text.

### Explicit exclusions

- Direct `./file.handler` or `./file.tool` execution as a shell command. The User description and the Implementation plan both exclude this as a release criterion for this change.
- Any refactor of `packages/runner/src/builtins/llm-dialog.ts`. The plan explicitly excludes that work; runner tests are used only as semantic guardrails.
- A deep nested flag DSL beyond top-level fields plus the raw `--json` escape hatch. The User description and the plan both cap scope here.
- CI-only proof of real FUSE behavior on hosts without a FUSE provider. The agreed strategy keeps the real mounted proof as a reproducible local integration artifact instead of replacing it with mocks.

### Risks carried by the exclusions

- Without direct `./file.tool` execution coverage, the feature can still regress on executable-bit or shell-dispatch behavior without failing this plan. That is acceptable for this change because the feature contract is `cf exec`, not direct shell execution.
- Without broader `llm-dialog` refactors, `cf exec` and LLM tool execution may continue to share behavior only by convention rather than through a common helper. The differential and runner guardrail tests reduce this risk but do not remove it entirely.
- If FUSE is unavailable in a given environment, helper and focused tests can still go green while the actual mounted flow remains unproven there. The real integration script is therefore an acceptance gate for feature sign-off on supported local environments.
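
For reference, the raw-argv precedence rule that tests 3 and 7 cover (top-level `--help` versus a schema field literally named `help`) can be sketched as below. This is an illustrative sketch under stated assumptions, not the CLI's actual parser: the helper name `classifyArgv`, the `ParsedInvocation` shape, and the choice to treat only a `--help` in first position as top-level help are hypothetical. The sketch also recognizes only the callable's own default verb (`invoke` for handlers, `run` for tools) as an explicit verb.

```typescript
type CallableKind = "handler" | "tool";

interface ParsedInvocation {
  mode: "help" | "execute";
  verb: "invoke" | "run";
  rest: string[]; // argv left over for schema-derived flag parsing
}

// Hypothetical helper: `argv` is everything after the mounted callable path.
function classifyArgv(kind: CallableKind, argv: string[]): ParsedInvocation {
  // Handlers default to `invoke`, tools default to `run`.
  const defaultVerb = kind === "handler" ? "invoke" : "run";
  const [first, ...tail] = argv;

  // Top-level help: `--help` appears before any verb, so nothing is executed.
  if (first === "--help") {
    return { mode: "help", verb: defaultVerb, rest: [] };
  }

  // Explicit verb: everything after it belongs to the schema, including a
  // `--help` token that names a schema field.
  if (first === defaultVerb) {
    return { mode: "execute", verb: defaultVerb, rest: tail };
  }

  // Implicit verb: the whole argv is passed on to flag parsing.
  return { mode: "execute", verb: defaultVerb, rest: argv };
}
```

The decision is made on the raw argv before any flag library sees it, so a generic `--help` handler cannot intercept the post-verb token.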