Lab domain -- L1¶
Control-plane components¶
PostHaste Lab is composed of small explicit components rather than one custom test framework:
| Component | Canonical IDs | Responsibility |
|---|---|---|
| Suite registry | suite.* | Maps verification intent to existing test runners, fixtures, platforms, tags, and artifacts |
| Profiles | profile.* | Declare isolated config/data/secret roots, env vars, network policy, and cleanup behavior |
| Fixtures | fixture.* | Seed deterministic accounts, mailboxes, messages, provider state, and failure scenarios |
| Runners | runner.* | Invoke native tools such as cargo, Bun, Playwright, Tauri, and future posthastectl commands |
| Drivers | runner:web.*, runner:desktop.* | Drive web or desktop clients through Playwright, Tauri IPC mocks, or a Tauri bridge |
| Artifact bundle | artifact.*, log.*, state.* | Persist machine-readable run evidence for agents and humans |
The lab CLI should be a thin orchestrator over these components. It should not reimplement cargo, Bun, Playwright, process supervision, or test-selection algorithms when mature tools already provide them.
Suite registry¶
A suite registry entry describes why and how a behavior is verified:
```toml
[suite.api.settings.dev]
level = "integration"
targets = ["daemon"]
runners = ["runner.cargo-nextest.dev"]
tags = ["api", "settings", "fast"]
paths = ["crates/posthaste-server/src/api/settings.rs"]
command = "cargo nextest run -P quick -E 'test(settings_patch)'"
artifacts = ["artifact.junit.settings.dev", "log.backend.jsonl.dev"]

[suite.ui.settings.web.test]
level = "e2e"
targets = ["web"]
profile = "profile.lab.empty.test"
fixture = "fixture.mail.basic.test"
runners = ["runner.playwright.web.test"]
tags = ["ui", "settings", "browser"]
command = "posthaste-lab run suite.ui.settings.web.test"
artifacts = ["artifact.trace.settings.test", "artifact.screenshot.settings.test"]

[suite.desktop.settings.linux.test]
level = "e2e"
targets = ["desktop", "linux"]
profile = "profile.lab.empty.test"
fixture = "fixture.mail.basic.test"
runners = ["runner.tauri-playwright.linux.test"]
tags = ["ui", "settings", "tauri", "linux"]
command = "posthaste-lab run suite.desktop.settings.linux.test"
```
The registry supports selection by explicit suite ID, tag, target, platform, risk profile, and changed files. Changed-file selection matches detected paths against each suite's paths; tools/lab/suites.toml is registry-wide, so changing it selects all otherwise-filtered suites. The CLI reads POSTHASTE_LAB_CHANGED_PATHS when it is set; otherwise it falls back to a best-effort jj diff --name-only -r main..@ run from the repo root, then to Git diff sources for committed, staged, unstaged, and untracked changes. Changed-file selection must escalate across behavioral boundaries when public API schemas, event payloads, config schema, shared cache keys, or suite fixtures change.
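The changed-file rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the CLI's implementation: the `Suite` class, `path_matches`, and `select_changed` are hypothetical names, prefix matching on suite paths is assumed, and "otherwise-filtered" is read as "still subject to other filters such as `--target`". Escalation across behavioral boundaries is not modeled.

```python
from dataclasses import dataclass

REGISTRY_FILE = "tools/lab/suites.toml"  # registry-wide file named in the text


@dataclass(frozen=True)
class Suite:
    """Hypothetical in-memory view of one registry entry."""
    suite_id: str
    targets: tuple = ()
    paths: tuple = ()


def path_matches(changed: str, suite_path: str) -> bool:
    # Assumption: a suite path matches itself or anything beneath it.
    suite_path = suite_path.rstrip("/")
    return changed == suite_path or changed.startswith(suite_path + "/")


def select_changed(suites, changed_paths, target=None):
    """Return (suite_id, rationale) pairs for suites touched by changed_paths."""
    # Other filters (only --target in this sketch) are applied first.
    candidates = [s for s in suites if target is None or target in s.targets]
    if REGISTRY_FILE in changed_paths:
        # A registry change selects every suite that survived the filters.
        return [(s.suite_id, f"registry changed: {REGISTRY_FILE}") for s in candidates]
    selected = []
    for suite in candidates:
        hits = [c for c in changed_paths
                if any(path_matches(c, p) for p in suite.paths)]
        if hits:
            selected.append((suite.suite_id, f"changed path: {hits[0]}"))
    return selected
```

Returning the rationale alongside each suite ID keeps the selection explainable, which is what the manifest later records.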
Command surface¶
Canonical developer commands are module-oriented and explicit:
```sh
just web dev
just desktop dev
just dev web
just dev desktop
just dev services
just dev smoke
just dev log path
just dev log tail
just dev log query --event http.request.completed
```
Lab orchestration should eventually expose a dedicated CLI:
```sh
posthaste-lab suite list
posthaste-lab suite list --changed --target web
posthaste-lab suite list --changed --target web --json
posthaste-lab verify suite.api.health.dev
posthaste-lab verify --tag lab-smoke
posthaste-lab verify --tag settings --target web
posthaste-lab verify --changed
posthaste-lab launch web --profile profile.lab.empty.dev
posthaste-lab launch desktop --profile profile.lab.empty.dev --runner runner.tauri-playwright.linux.test
```
posthastectl is a dev/lab API client, not yet a product CLI:
```sh
posthastectl health wait
posthastectl settings get
posthastectl settings patch --json @settings.json
posthastectl accounts list
posthastectl events wait --resource appSettings.updated
posthastectl fixture load fixture.mail.basic.test
```
The future headless daemon and terminal TUI may promote a stable subset of posthastectl, but lab-only fixture mutation and rich diagnostics remain separate.
lab-smoke is the cheap non-graphical gate for dogfood/main. It includes the Lab registry self-check, API health, web readiness/surface route tests, and a policy suite that rejects active telemetry ingest/runtime artifacts on the main dogfood line. Graphical Tauri smoke remains an explicit Linux suite outside lab-smoke.
Profiles and fixtures¶
Every lab run uses disposable roots under a run directory:
```text
target/lab/runs/<run-id>/
  manifest.json
  summary.json
  state.config/
  state.data/
  state.secrets/
  log.backend.jsonl
  log.frontend.jsonl
  stdout.log
  stderr.log
  opened-urls.jsonl
  artifact.screenshot.*.png
  artifact.trace.*.zip
  artifact.video.*.mp4
```
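A minimal Python sketch of creating those disposable roots follows. The function name `create_run_dir` and the caller-supplied run ID are assumptions; the source fixes only the directory names. Logs, the manifest, and artifacts are left for the runner to write later.

```python
from pathlib import Path

# State roots named in the layout above.
STATE_ROOTS = ("state.config", "state.data", "state.secrets")


def create_run_dir(runs_root: Path, run_id: str) -> Path:
    """Create a fresh run directory with empty, owner-only state roots."""
    run_dir = runs_root / run_id
    # exist_ok=False: never reuse state from an earlier run with the same ID.
    run_dir.mkdir(parents=True, exist_ok=False)
    for name in STATE_ROOTS:
        # Requested mode is owner-only; the process umask can only tighten it.
        (run_dir / name).mkdir(mode=0o700)
    return run_dir
```

Refusing to reuse an existing directory makes a run ID collision fail loudly instead of silently sharing config, data, or secrets between runs.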
Profile IDs describe execution environment and policy:
| ID | Purpose |
|---|---|
| profile.lab.empty.test | Empty local profile, no accounts, no real secrets |
| profile.lab.seeded.test | Seeded local mail data, deterministic timestamps and IDs |
| profile.lab.offline.test | Network disabled except loopback |
| profile.lab.stalwart.dev | Local Stalwart provider fixture |
| profile.lab.upgrade.dev.from:v0.1.0-dogfood.17 | Upgrade/regression profile from an older app state |
Fixtures are explicit products, not hidden setup. They declare seeded accounts, provider behavior, side-effect adapters, and cleanup policy. Real-provider parity remains a separate higher-cost suite; deterministic fixtures must not become the only proof of sync correctness.
Readiness and error contracts¶
UI and backend waits use semantic readiness, not sleeps.
Frontend surfaces expose stable markers:
```text
state.app.loading.test
state.app.ready.test
state.app.error.test
state.settings.loading.test
state.settings.ready.test
state.settings.error.test
state.message-detail.ready.test
state.compose.ready.test
state.surface.<kind>.ready.test
state.surface.invalid.ready.test
```
The DOM representation may use data-testid or data-posthaste-state, but the suite registry and lab reports refer to canonical state IDs. Loading states that can block a user must have a reachable error state with diagnostic context. Infinite spinners are test failures.
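A semantic wait can be sketched as a bounded poll that ends on either the ready or the error state. The helper below is hypothetical: `wait_for_state`, `ReadinessError`, and the `read_state` callback are illustration names, and a real suite would more likely use its driver's own wait primitives. The point is the shape: every wait has a deadline, and an error state ends it immediately.

```python
import time
from typing import Callable, Optional


class ReadinessError(Exception):
    """Raised when a surface reports its error state or never becomes ready."""


def wait_for_state(read_state: Callable[[], Optional[str]],
                   ready: str,
                   error: str,
                   timeout_s: float = 10.0,
                   poll_s: float = 0.05,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll read_state until it returns `ready`; fail fast on `error` or timeout."""
    deadline = clock() + timeout_s
    while True:
        state = read_state()
        if state == ready:
            return state
        if state == error:
            raise ReadinessError(f"surface reported {error}")
        if clock() >= deadline:
            # A spinner that never resolves becomes a test failure, not a hang.
            raise ReadinessError(
                f"timed out after {timeout_s}s waiting for {ready}; last state: {state}")
        sleep(poll_s)
```

For example, `wait_for_state(probe, "state.settings.ready.test", "state.settings.error.test")` returns once the settings surface is ready, where `probe` is whatever reads the current canonical state ID from the DOM.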
The daemon exposes a minimal product health endpoint; richer lab-only diagnostics remain planned lab contracts:
| Endpoint | Mode | Purpose |
|---|---|---|
| GET /v1/health | product and lab | Process/API readiness without sensitive state |
| GET /v1/lab/health | lab only, planned | Config root, fixture, account convergence, event stream, and side-effect recorder state |
| GET /v1/lab/opened-urls | lab only, planned | External URL requests captured by the lab opener adapter |
When implemented, lab endpoints must refuse non-loopback use and must not expose credentials, message bodies, tokens, or raw provider payloads.
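The loopback rule can be illustrated with a small peer-address check. This is a sketch in Python for consistency with the other examples, not the daemon's code; `is_loopback_peer` is a hypothetical name, and where such a guard would sit in the request path is not specified by the source.

```python
import ipaddress


def is_loopback_peer(peer_host: str) -> bool:
    """Return True only when the peer address is a loopback IP address."""
    try:
        addr = ipaddress.ip_address(peer_host)
    except ValueError:
        # Hostnames and malformed input are rejected rather than resolved.
        return False
    # Treat IPv4-mapped IPv6 peers (::ffff:127.0.0.1) like their IPv4 form.
    if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return addr.is_loopback
```

Checking the connection's peer address rather than a request header keeps the decision out of the client's control.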
App drivers¶
The driver ladder is:
- runner.playwright.web.test: Playwright against built web assets.
- runner.tauri-mock.web.test: frontend tests with Tauri mockIPC for IPC behavior.
- runner.tauri-playwright.linux.test: real Linux Tauri app with a feature-gated bridge.
- runner.package.linux.test: packaged Linux artifact smoke.
- Manual macOS release artifact smoke, until a macOS runner is deliberately introduced.
Tauri Playwright spike contract¶
The tauri-playwright bridge is acceptable only behind an e2e feature:
- feature.e2e-testing enables the optional tauri-plugin-playwright dependency and the PostHaste Linux e2e bridge.
- POSTHASTE_E2E_SOCKET supplies a private per-run Unix socket path; the test fixture uses the same path as mcpSocket.
- The default /tmp/tauri-playwright.sock is never used.
- The playwright:default capability is included only when the e2e feature selects the e2e capability file.
- withGlobalTauri is enabled only in the e2e config override because the app-side bridge uses Tauri events and invoke; normal desktop config keeps the tighter production setting.
- Linux CI runs the Tauri bridge under a real or virtual display (xvfb-run or equivalent) with WebKitGTK dependencies installed.
- Normal release builds and DevTools dogfood builds do not include the permission, global Tauri injection, private socket bridge, or bridge marker.
- Initial Linux suites target the first ready Lab surface. Separate settings/message/attachment control is added only after multi-window label handling is proven reliable.
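One way a runner could allocate the private per-run socket path is sketched below. The helper name `make_e2e_socket_env`, the `ph-e2e-` prefix, and the `bridge.sock` file name are assumptions; only `POSTHASTE_E2E_SOCKET` and the default path that must be avoided come from the contract above.

```python
import tempfile
from pathlib import Path

DEFAULT_SOCKET = "/tmp/tauri-playwright.sock"  # must never be used
SOCKET_ENV = "POSTHASTE_E2E_SOCKET"


def make_e2e_socket_env(base_env: dict) -> dict:
    """Return a copy of base_env with a fresh private socket path set."""
    # mkdtemp creates a uniquely named directory accessible only to its owner.
    sock_dir = Path(tempfile.mkdtemp(prefix="ph-e2e-"))
    sock_path = sock_dir / "bridge.sock"  # hypothetical file name
    if str(sock_path) == DEFAULT_SOCKET:
        raise RuntimeError("refusing to use the default bridge socket path")
    env = dict(base_env)
    env[SOCKET_ENV] = str(sock_path)
    return env
```

The same value would then be passed to both the app process and the test fixture so the two sides agree on the path. A short temporary directory is used here rather than a deep path under the run directory because Unix socket paths have a small length limit (about 108 bytes on Linux).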
Go/no-go for the spike:
| Outcome | Decision |
|---|---|
| Can launch Linux Tauri with isolated profile and wait for state.app.ready.test or a forced first-run state.settings.ready.test | Continue |
| Can open settings and wait for state.settings.ready.test with screenshot/trace on failure | Continue |
| Can record external URL opener requests without opening a browser | Continue |
| Requires broad production config weakening or global unauthenticated sockets | Stop |
| Multi-window support is unreliable | Keep bridge for main-window smoke only and use web tests for surface routing |
Artifact manifest¶
Every lab run writes manifest.json and summary.json.
manifest.json records:
- command and canonical command ID (cmd.*)
- suite IDs selected, selection rationale, and per-suite execution records
- commit ID, platform, machine ID, tool versions
- profile and fixture IDs
- environment variables after redaction
- process tree and ports/sockets
- artifact paths, including explicit nested suite artifact paths emitted by runners
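Redacting environment variables before they reach the manifest could look like the sketch below. The name pattern and the `<redacted>` placeholder are assumptions for illustration; the source does not define which variables count as sensitive.

```python
import re

# Assumption: variables whose names contain these fragments are sensitive.
SECRET_NAME = re.compile(
    r"TOKEN|SECRET|PASSWORD|PASSWD|CREDENTIAL|API_?KEY|PRIVATE_?KEY",
    re.IGNORECASE,
)
REDACTED = "<redacted>"


def redact_env(env: dict) -> dict:
    """Return a name-sorted copy of env with sensitive values replaced."""
    return {
        name: (REDACTED if SECRET_NAME.search(name) else value)
        for name, value in sorted(env.items())
    }
```

Keeping the variable names while dropping the values still shows a reader which settings were in play without leaking their contents.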
summary.json records:
- status: passed, failed, skipped, or blocked
- suite IDs selected, selection rationale, and changed paths when applicable
- per-suite status, duration, timeout flag, exit code, stdout/stderr artifact paths, and discovered nested artifact paths
- first failure suite and step
- reproduction command
- important log excerpts
- links to screenshots, traces, videos, and backend/frontend logs
Runners may add nested artifacts to the parent report by printing lines with the exact prefix POSTHASTE_LAB_ARTIFACT_PATH= followed by an existing file or directory path. The path must remain under the active POSTHASTE_LAB_RUN_DIR; paths with secret-like segments are ignored.
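The artifact-line rules can be sketched as a filter over runner output. This is illustrative only: `collect_artifacts` and the `SECRET_HINTS` list are hypothetical, and resolving relative paths against the run directory is an assumption the source does not state.

```python
from pathlib import Path

PREFIX = "POSTHASTE_LAB_ARTIFACT_PATH="
# Assumption: what counts as a "secret-like" path segment.
SECRET_HINTS = ("secret", "token", "credential", "password")


def collect_artifacts(stdout_text: str, run_dir: Path) -> list:
    """Return resolved artifact paths that satisfy the rules above."""
    root = run_dir.resolve()
    found = []
    for line in stdout_text.splitlines():
        if not line.startswith(PREFIX):  # exact prefix at line start only
            continue
        raw = line[len(PREFIX):].strip()
        if not raw:
            continue
        candidate = Path(raw)
        if not candidate.is_absolute():
            candidate = root / candidate  # assumed resolution rule
        resolved = candidate.resolve()  # collapses ".." and symlinks
        if not resolved.exists():
            continue
        if not resolved.is_relative_to(root):  # must stay under the run dir
            continue
        parts = resolved.relative_to(root).parts
        if any(hint in part.lower() for part in parts for hint in SECRET_HINTS):
            continue
        found.append(resolved)
    return found
```

Resolving before the containment check means a path such as `<run-dir>/../elsewhere` is rejected even though it textually begins with the run directory. With these hints, anything under `state.secrets/` is ignored as well.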
Suite runner exit code 77 means skipped; exit code 78 means blocked. Other nonzero exits mean failed.
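The exit-code convention maps directly onto the summary statuses. The sketch below assumes exit code 0 means passed, which the text implies but does not state; `status_from_exit_code` and `run_suite` are hypothetical names, and timeouts are not handled here.

```python
import subprocess

EXIT_SKIPPED = 77
EXIT_BLOCKED = 78


def status_from_exit_code(code: int) -> str:
    """Translate a suite runner's exit code into a summary status."""
    if code == 0:
        return "passed"  # assumption: zero means success
    if code == EXIT_SKIPPED:
        return "skipped"
    if code == EXIT_BLOCKED:
        return "blocked"
    return "failed"  # any other nonzero exit, including death by signal


def run_suite(argv: list) -> str:
    """Run the suite command as a child process and classify its exit."""
    completed = subprocess.run(argv)
    return status_from_exit_code(completed.returncode)
```

Delegating to the native tool and only interpreting its exit code is in keeping with the thin-orchestrator goal stated earlier.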
Agents should inspect the artifact bundle before rerunning a failing suite.
Release relationship¶
Release promotion should test the artifact that will be published whenever practical. For now:
- Linux packaged smoke can be automated.
- macOS artifacts are ad-hoc signed and manually smoke-tested from GitHub release assets.
- Release checks must prove lab-only bridge/debug controls are absent from release artifacts.
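One possible shape for the last check is a byte scan over an unpacked release artifact. The marker list below is an assumption assembled from identifiers mentioned in this document; the actual bridge marker is not specified here, and `find_lab_markers` is a hypothetical helper.

```python
from pathlib import Path

# Assumed markers; a real check would use the project's own bridge marker.
LAB_ONLY_MARKERS = (
    b"POSTHASTE_E2E_SOCKET",
    b"tauri-playwright.sock",
    b"playwright:default",
)


def find_lab_markers(artifact_root: Path, markers=LAB_ONLY_MARKERS) -> list:
    """Return (path, marker) pairs for every lab-only marker found."""
    hits = []
    for path in sorted(artifact_root.rglob("*")):
        if not path.is_file():
            continue
        data = path.read_bytes()  # fine for a sketch; stream large files
        for marker in markers:
            if marker in data:
                hits.append((path, marker))
    return hits
```

A release gate would fail when the returned list is non-empty. Compressed archives need to be unpacked first, since the scan only sees raw bytes.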
Assertions¶
| ID | Sev. | Assertion |
|---|---|---|
| registry-thin-orchestrator | MUST | The suite registry delegates to existing runners and records selection rationale instead of implementing a bespoke test runner |
| disposable-run-roots | MUST | Lab runs use isolated config, data, secret, log, and artifact roots by default |
| semantic-readiness | MUST | UI and backend waits use semantic ready/error states rather than sleeps or network-idle guesses |
| no-infinite-spinner | MUST | User-visible loading states involved in lab suites expose a ready or error state that tests can assert |
| bridge-feature-gated | MUST | Real Tauri automation bridges are compile-time feature gated and absent from normal release artifacts |
| private-e2e-socket | MUST | Tauri automation uses a private per-run socket or equivalent non-predictable local channel |
| posthastectl-dev-only | SHOULD | posthastectl remains dev/lab-only until a product CLI/TUI contract is explicitly designed |
| artifact-manifest | SHOULD | Every lab run writes a manifest and summary with reproduction commands and diagnostics |