Hermes 284313f908
Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

Scratchpad: Ubuntu-Based Alga Appliance Installer

Context

We are pivoting the appliance host OS from Talos to Ubuntu for operational familiarity while preserving the appliance concepts developed in the Talos work:

  • early status UI on port 8080
  • status UI separate from the main app
  • readiness tiers
  • stable/nightly release channels
  • immutable release manifests
  • app-only status-driven updates
  • login readiness that does not depend on background services

Decisions

D001 — Use Ubuntu Server 24.04 LTS for v1

Use Ubuntu Server 24.04 LTS only for the first version. Avoid supporting 22.04 and 24.04 simultaneously until the appliance path is stable.

D002 — Use custom autoinstall ISO

The ISO handles the OS layer: Ubuntu install, partitioning, packages, users/hardening, and first-boot service enablement.

The appliance bootstrap handles the app layer: k3s, Flux, channel resolution, values, release selection, and Alga workloads.

D003 — Use interactive first-boot setup

Do not require a fully zero-touch install. First boot should offer both:

  • primary: web setup at http://<node-ip>:8080/setup
  • fallback: console TUI

D004 — Use online install from GitHub channel files

v1 should fetch channel/release metadata directly from the GitHub repo instead of using a release bucket or fully bundled/offline release artifacts.

D005 — Keep host service as durable status/update plane

The host-level systemd service permanently owns :8080. It serves setup before k3s exists and status/diagnostics/updates after k3s exists.

D006 — Use opinionated k3s single-node install

Use k3s rather than kubeadm/RKE2. Disable unneeded bundled components where appropriate, keep local-path storage, and pin version.

D007 — App-only updates in v1

Status UI should update Alga app/channel state only. Ubuntu package updates and k3s updates are manual/support-run in v1.

D008 — DNS default

Do not default to public Google DNS. MSP environments often rely on AD-integrated, split-horizon, or internal DNS. Setup should default to DHCP/system-provided resolvers when available, make DNS selection prominent, and require deliberate admin choice before overriding with public resolvers such as 8.8.8.8,8.8.4.4.
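
The resolver policy in D008 could be expressed roughly as follows. This is a minimal sketch, not the shipped logic: the function name, the dnsMode values, and the confirmedOverride flag are all assumptions made for illustration.

```javascript
// Hypothetical helper: names and field shapes are illustrative, not taken from the repo.
// Policy sketched: prefer DHCP/system resolvers; accept custom resolvers only after
// an explicit admin confirmation, and never fall back to public DNS silently.
function selectResolvers({
  dnsMode = 'system',
  customServers = [],
  systemResolvers = [],
  confirmedOverride = false,
}) {
  if (dnsMode === 'system') {
    if (systemResolvers.length === 0) {
      // Surface the problem instead of substituting 8.8.8.8 behind the admin's back.
      return { ok: false, reason: 'no-system-resolvers' };
    }
    return { ok: true, source: 'system', servers: systemResolvers };
  }
  if (dnsMode === 'custom') {
    if (customServers.length === 0) return { ok: false, reason: 'custom-dns-empty' };
    if (!confirmedOverride) return { ok: false, reason: 'override-not-confirmed' };
    return { ok: true, source: 'custom', servers: customServers };
  }
  return { ok: false, reason: 'unknown-dns-mode' };
}

console.log(selectResolvers({ systemResolvers: ['10.0.0.2'] }).source); // system
```

The point of returning a reason rather than a fallback list is that the setup UI can then ask the admin to choose, which keeps internal or split-horizon DNS from being replaced by accident.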

D009 — Fail fast on GitHub/GHCR dependency

v1 intentionally depends on GitHub/GHCR for online install. Setup must preflight DNS, GitHub raw/repo access, selected channel metadata, GHCR reachability, and proxy/egress behavior before installing k3s or otherwise mutating the host.
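
A fail-fast preflight of this kind might look like the sketch below. The check names and result fields are assumptions, and the probes are injected as synchronous functions only to keep the example self-contained; real network probes would be asynchronous.

```javascript
// Illustrative only: check names and result fields are assumptions, not the repo's API.
// Runs checks in order and stops at the first failure, before any host mutation.
function runPreflight(checks) {
  const passed = [];
  for (const check of checks) {
    const result = check.run();
    if (!result.ok) {
      return {
        ok: false,
        phase: 'preflight',
        blockedBy: check.name,
        cause: result.cause,
        passed,
        hostMutated: false, // k3s has not been installed yet
        retrySafe: true, // so repeating setup after fixing the cause is safe
      };
    }
    passed.push(check.name);
  }
  return { ok: true, phase: 'preflight', passed };
}

const outcome = runPreflight([
  { name: 'dns', run: () => ({ ok: true }) },
  { name: 'github-channel-metadata', run: () => ({ ok: false, cause: 'HTTP 403 from proxy' }) },
  { name: 'ghcr', run: () => ({ ok: true }) }, // never reached in this run
]);
console.log(outcome.blockedBy); // github-channel-metadata
```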

D010 — Name OS/k3s update liability and sketch v2

App-only updates are acceptable for v1, but manual/support-run Ubuntu and k3s updates create CVE-response burden. The PRD now names this explicitly and includes a v2 managed maintenance direction.

D011 — Support bundle is first-class

Support bundle generation should be designed as a v1 feature, not a vague future helper. It must include host logs, setup/update logs, k3s state, Kubernetes resources/events, Flux/Helm status, app bootstrap logs, network diagnostics, disk usage, release metadata, and redaction.

D012 — Console fallback must work for headless/racked deployments

Console fallback should work from common hypervisor console and serial-console-style access paths, but web setup remains primary so racked/headless users are not forced into physical keyboard/monitor workflows.

D013 — Retire Talos as the supported appliance path

Ubuntu replaces Talos for the supported v1 appliance product. The plan should not leave Talos and Ubuntu as parallel customer-facing install options. Retire or clearly mark Talos bootstrap/operator/docs/status dependencies as legacy/internal, while porting reusable release/channel/status concepts to the Ubuntu host-service implementation.

Open Questions

  • What exact k3s version should be pinned for the first Ubuntu appliance validation?
  • Which k3s bundled components must remain enabled for our current Helm/app exposure model?
  • Should the first implementation use the existing Next.js status UI, a smaller host-native Node service, or another host service shape?
  • What exact host paths should be considered stable public/support contract?
  • Do we need proxy support in v1, or can it be deferred?
  • What is the desired appliance admin Linux user name and authentication posture?
  • What firewall defaults should be applied on Ubuntu?
  • Should port 3000 remain the initial app port for v1, or should Ubuntu appliance introduce HTTP/HTTPS fronting from day one?

Useful Existing Files

From the Talos/status work, likely useful source material:

historical removed bootstrap script
historical removed upgrade script
ee/appliance/status-ui/
ee/appliance/flux/base/
historical local stable channel metadata (removed)
historical local nightly channel metadata (removed)
historical local release metadata (removed)
ee/appliance/operator/lib/status.mjs
ee/appliance/operator/lib/tui.mjs

Implementation Notes To Carry Forward

  • Flux needs a full URL with an explicit scheme (for example https://github.com/Nine-Minds/alga-psa.git); do not pass GitHub scp-style URLs like git@github.com:Nine-Minds/alga-psa.git directly to Flux.
  • First install failures should fail early and clearly for missing/wrong Flux CLI, DNS, GitHub access, and GHCR access.
  • GitHub/GHCR preflight should happen before k3s install, not after Kubernetes is half-configured.
  • Existing Talos bootstrap had a --dns-servers flag that patched both Talos resolver config and CoreDNS. Ubuntu setup needs equivalent host/k3s/CoreDNS treatment, but must not silently replace internal DNS with public DNS.
  • Talos-specific customer-facing docs, prerequisites (talosctl), machine config generation, and Talos API status checks should not remain in the supported Ubuntu appliance path.
  • Kubernetes does not expose byte-level image pull percentages; status should show phase/image/elapsed time instead.
  • Background service failures should produce ready-with-background-issues, not block login readiness.
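
The first note, about scp-style GitHub URLs, can be illustrated with a small normalizer. This is a sketch under assumptions: the function name is hypothetical, it handles only the github.com scp form mentioned above, and it leaves every other URL untouched.

```javascript
// Hypothetical helper name; converts git@github.com:org/repo.git into an HTTPS URL
// that carries an explicit scheme. Any other input is returned unchanged.
function toHttpsGitUrl(url) {
  const trimmed = String(url).trim();
  const match = /^git@github\.com:(.+)$/.exec(trimmed);
  return match ? `https://github.com/${match[1]}` : trimmed;
}

console.log(toHttpsGitUrl('git@github.com:Nine-Minds/alga-psa.git'));
// https://github.com/Nine-Minds/alga-psa.git
```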

Validation Targets

Minimum live validation environment should include:

  • UTM local VM for rapid iteration if feasible
  • VMware ESXi-like VM path or equivalent cloud custom ISO path before customer docs are finalized
  • Fresh install with DHCP/system DNS defaults
  • Fresh install with internal/split-horizon DNS preservation
  • Fresh install with DNS failure simulation
  • Fresh install with GitHub/GHCR/proxy failure simulation before k3s installation
  • Reboot persistence test
  • App-channel update test

Implementation Log

  • (2026-05-01) F001 completed: Defined a concrete Ubuntu appliance ISO build layout under ee/appliance/ubuntu-iso/ with the initial folder contract (config/nocloud, overlay, scripts, work, output) and a single entrypoint script scripts/build-ubuntu-appliance-iso.sh.
  • Rationale: The repo previously had Talos image-factory build paths only; no Ubuntu ISO structure existed, so downstream autoinstall and first-boot work had no stable location.
  • Added ee/appliance/ubuntu-iso/README.md to declare responsibilities/inputs/outputs for the Ubuntu ISO path and to separate scaffold-vs-full-remaster expectations.
  • Added dry-run validation command for repeatability: ee/appliance/ubuntu-iso/scripts/build-ubuntu-appliance-iso.sh --base-iso <ubuntu.iso> --release-version <version> --dry-run.
  • Gotcha: full ISO remastering tools (xorriso, 7z, boot catalog regeneration) are intentionally deferred to subsequent features; the current script validates layout and interface so later steps can land without path churn.
  • (2026-05-01) F002 completed: Added Ubuntu 24.04 autoinstall NoCloud seed at ee/appliance/ubuntu-iso/config/nocloud/user-data + meta-data.
  • Scope covered: unattended install defaults (locale/timezone), direct full-disk partitioning policy, base appliance admin account, required utility packages, and first-boot service enablement hooks for alga-appliance.service and alga-appliance-console.service.
  • Security/operations note: appliance state directories are created in install-time late commands with root ownership and restricted permissions (0750 for /etc/alga-appliance, /var/lib/alga-appliance, /var/log/alga-appliance).
  • Build entrypoint hardening: build-ubuntu-appliance-iso.sh now fails fast if config/nocloud/user-data or meta-data are missing.
  • (2026-05-01) F003 completed: Implemented host artifact packaging flow with ee/appliance/ubuntu-iso/scripts/stage-host-artifacts.sh.
  • Packaging behavior: stages ee/appliance/appliance, ee/appliance/operator/, ee/appliance/scripts/, and built ee/appliance/status-ui/dist into ee/appliance/ubuntu-iso/overlay/opt/alga-appliance/ for ISO-installed host delivery.
  • build-ubuntu-appliance-iso.sh now enforces presence/executability of the staging script and runs it during non-dry-run builds.
  • Gotcha: status-ui/dist is expected to exist at packaging time; staging warns if missing so CI/release automation must build status-ui first.
  • (2026-05-01) F004 completed: Added host web service systemd unit at ee/appliance/ubuntu-iso/overlay/etc/systemd/system/alga-appliance.service bound to port 8080.
  • Added host runtime skeleton at ee/appliance/host-service/server.mjs (HTTP server with setup/status mode detection via /var/lib/alga-appliance/install-state.json and /healthz endpoint).
  • Updated artifact staging to package ee/appliance/host-service/ into /opt/alga-appliance/host-service on the target host.
  • Validation run: curl http://127.0.0.1:8080/healthz returned {"ok":true,"mode":"setup"} while running server.mjs locally.
  • (2026-05-01) F005 completed: Added console fallback startup path at ee/appliance/ubuntu-iso/overlay/etc/systemd/system/alga-appliance-console.service.
  • Added serial/console-friendly output entrypoint ee/appliance/host-service/console.mjs (prints node IP, setup URL, setup token path value, and log command) while keeping web setup as the primary flow.
  • Service shape is Type=oneshot with StandardOutput=tty to support VM/serial console display without replacing the web path on :8080.
  • (2026-05-01) F006 completed: Added first-boot/setup token initializer ee/appliance/host-service/init-token.mjs.
  • Token behavior: generates token when missing, persists to /var/lib/alga-appliance/setup-token, enforces restricted permissions (0600 token file, 0750 parent directory) for sensitive status/setup access material.
  • Wired token initialization into both systemd services via ExecStartPre, ensuring token existence before web setup/status or console fallback output starts.
  • (2026-05-01) F007 completed: Console fallback output now prints detected node IP, setup URL (http://<ip>:8080/setup), setup token, and log locations (journalctl -u alga-appliance.service -u alga-appliance-console.service -f).
  • (2026-05-01) F008 completed: Implemented token-protected web setup route /setup in ee/appliance/host-service/server.mjs.
  • Behavior: /setup requires ?token=<setup-token> matching /var/lib/alga-appliance/setup-token; mismatches return 401 Unauthorized.
  • Local validation: unauthenticated request returned HTTP 401; authenticated request returned HTTP 200.
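
The token gate described for /setup could be sketched as below. This is not the code in server.mjs: the function names are hypothetical, and the hand-rolled comparison stands in for Node's crypto.timingSafeEqual so the example needs no imports.

```javascript
// Compare two strings without exiting early on the first differing character.
// In a real Node service, crypto.timingSafeEqual would be the usual choice.
function tokensMatch(a, b) {
  if (typeof a !== 'string' || typeof b !== 'string' || a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i += 1) diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  return diff === 0;
}

// Hypothetical request gate: returns the HTTP status the route would answer with.
function authorizeSetupRequest(requestUrl, expectedToken) {
  // The base URL is a placeholder so relative request paths can be parsed.
  const { pathname, searchParams } = new URL(requestUrl, 'http://appliance.local');
  if (pathname !== '/setup') return { status: 404 };
  const supplied = searchParams.get('token'); // null when the parameter is absent
  if (!expectedToken || !tokensMatch(supplied, expectedToken)) {
    return { status: 401, body: 'Unauthorized' };
  }
  return { status: 200 };
}

console.log(authorizeSetupRequest('/setup', 's3cret').status); // 401
console.log(authorizeSetupRequest('/setup?token=s3cret', 's3cret').status); // 200
```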
  • (2026-05-01) F009/F010/F011 completed: Expanded token-protected /setup UI with required fields: release channel, app URL/hostname, DNS mode, custom DNS servers, and support/testing repo URL + branch overrides.
  • Defaults/UX semantics implemented:
    • channel defaults to stable
    • nightly explicitly labeled as testing/support-directed
    • DNS defaults to Use DHCP/system resolvers
    • custom DNS is opt-in and called out as deliberate (with example format guidance)
  • (2026-05-01) F013 completed: Setup POST now validates and persists inputs to /etc/alga-appliance/setup-inputs.json (or ALGA_APPLIANCE_SETUP_INPUTS_FILE override), with restricted file permissions (0600) and restricted directory permissions (0750).
  • (2026-05-01) F012 completed: Added console setup prompt flow ee/appliance/host-service/console-setup.mjs collecting the same required values as web setup.
  • Shared validation/persistence: introduced ee/appliance/host-service/setup-engine.mjs; both web POST /setup and console prompt use the same validateSetupInputs() + persistSetupInputs() logic.
  • Console-first guidance now includes explicit fallback command to launch interactive setup from VM/serial console.
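
A shared validator that both the web POST and the console prompt can call might look like this. It is a sketch only: the real validateSetupInputs() is not shown in these notes, so the function name below, the field names, and the IPv4-only DNS check are assumptions.

```javascript
// IPv4 dotted-quad check; IPv6 resolvers are out of scope for this sketch.
const IPV4 = /^(25[0-5]|2[0-4]\d|1?\d?\d)(\.(25[0-5]|2[0-4]\d|1?\d?\d)){3}$/;

// Hypothetical stand-in for the shared validation step; field names are assumed.
function validateSetupInputsSketch(input) {
  const errors = [];
  const channel = input.channel ?? 'stable'; // stable is the documented default
  if (!['stable', 'nightly'].includes(channel)) errors.push('channel must be stable or nightly');

  const appUrl = String(input.appUrl ?? '').trim();
  if (appUrl === '') errors.push('app URL/hostname is required');

  const dnsMode = input.dnsMode ?? 'system'; // DHCP/system resolvers by default
  let dnsServers = [];
  if (dnsMode === 'custom') {
    dnsServers = String(input.dnsServers ?? '')
      .split(',')
      .map((server) => server.trim())
      .filter(Boolean);
    if (dnsServers.length === 0) errors.push('custom DNS mode requires at least one server');
    for (const server of dnsServers) {
      if (!IPV4.test(server)) errors.push(`invalid DNS server: ${server}`);
    }
  } else if (dnsMode !== 'system') {
    errors.push('dnsMode must be system or custom');
  }

  return errors.length > 0
    ? { ok: false, errors }
    : { ok: true, value: { channel, appUrl, dnsMode, dnsServers } };
}
```

Keeping validation in one function means the web form and the console prompt cannot drift apart on what counts as a valid DNS list or channel name.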
  • (2026-05-01) F014 completed: Implemented setup preflight engine execution in ee/appliance/host-service/setup-engine.mjs and wired it through both server.mjs (POST /setup) and console-setup.mjs.
  • Preflight coverage now runs before any k3s mutation step: DNS resolution checks, GitHub channel metadata reachability/parse check, GHCR connectivity check, and proxy/egress environment context capture.
  • Added phase-classified persisted install-state updates at /var/lib/alga-appliance/install-state.json with retry-safe blocker guidance for DNS/network/GitHub release-source failures.
  • Web setup now returns an explicit preflight-blocked response (HTTP 412) with phase, cause, suggested next step, and retry safety; console setup prints matching blocker guidance.
  • Added focused automated test file ee/appliance/host-service/tests/setup-engine.preflight.test.mjs validating custom DNS input enforcement and early DNS preflight blocking when no system resolvers exist.
  • Validation run: node --check ee/appliance/host-service/{setup-engine.mjs,server.mjs,console-setup.mjs} and node --test ee/appliance/host-service/tests/setup-engine.preflight.test.mjs.
  • (2026-05-01) F015 completed: Extended setup execution from preflight-only to workflow mode by adding runSetupWorkflow() and installK3sSingleNode() in ee/appliance/host-service/setup-engine.mjs.
  • k3s install behavior: pinned default version v1.31.4+k3s1 (overridable via ALGA_APPLIANCE_K3S_VERSION), single-node server install command through https://get.k3s.io, and post-install kubeconfig verification at /etc/rancher/k3s/k3s.yaml.
  • State modeling: writes explicit k3s-install-running, k3s-install-complete, and k3s-install-blocked phases with installer output/error guidance persisted into install-state.
  • Wiring: both web setup (server.mjs) and console setup (console-setup.mjs) now invoke the full setup workflow path rather than only preflight.
  • Added test ee/appliance/host-service/tests/setup-engine.workflow.test.mjs to validate successful k3s install path using a mocked local installer command and kubeconfig path.
  • Validation run: node --check ee/appliance/host-service/{setup-engine.mjs,server.mjs,console-setup.mjs} and node --test ee/appliance/host-service/tests/setup-engine.preflight.test.mjs ee/appliance/host-service/tests/setup-engine.workflow.test.mjs.
  • (2026-05-01) F016 completed: Updated default k3s install exec flags to disable unneeded bundled components by default (--disable traefik --disable servicelb) in installK3sSingleNode().
  • Added regression coverage in ee/appliance/host-service/tests/setup-engine.workflow.test.mjs to assert installer receives those disable flags through INSTALL_K3S_EXEC when no explicit override is supplied.
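
The version pin and disable flags from F015/F016 reach the k3s installer through its INSTALL_K3S_VERSION and INSTALL_K3S_EXEC environment variables. The sketch below shows one way to assemble them; the function name and argument shape are assumptions, not the actual installK3sSingleNode() internals.

```javascript
const DEFAULT_K3S_VERSION = 'v1.31.4+k3s1';
const DEFAULT_K3S_EXEC = '--disable traefik --disable servicelb';

// Hypothetical helper: builds the environment handed to the get.k3s.io install script.
// An explicit exec override replaces the default flags entirely.
function buildK3sInstallEnv(env = {}, execOverride) {
  return {
    INSTALL_K3S_VERSION: env.ALGA_APPLIANCE_K3S_VERSION || DEFAULT_K3S_VERSION,
    INSTALL_K3S_EXEC: execOverride || DEFAULT_K3S_EXEC,
  };
}

console.log(buildK3sInstallEnv().INSTALL_K3S_EXEC);
// --disable traefik --disable servicelb
```

Because the helper is a pure function of its inputs, a regression test can assert on the returned object without running the installer.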
  • (2026-05-01) F017 completed: Added post-k3s storage configuration phase ensureLocalPathStorage() in setup-engine.mjs and chained it into runSetupWorkflow() after successful k3s install.
  • Storage behavior: executes host-side storage installer command (default /opt/alga-appliance/scripts/install-storage.sh --kubeconfig /etc/rancher/k3s/k3s.yaml) and persists storage phase state (storage-config-running|complete|blocked) with troubleshooting guidance on failures.
  • Added automated test coverage for successful storage phase execution in setup-engine.workflow.test.mjs.
  • (2026-05-01) F018 completed: Added Flux installation phase installFlux() to setup workflow after successful k3s + storage phases.
  • Flux behavior: runs flux install --namespace flux-system --kubeconfig /etc/rancher/k3s/k3s.yaml by default, persists phase states (flux-install-running|complete|blocked), and records actionable failure details.
  • Added test coverage in setup-engine.workflow.test.mjs for successful Flux phase execution via a mocked command.
  • (2026-05-01) F019 completed: Added resolveChannelMetadata() phase that resolves selected channel metadata directly from GitHub channel files and extracts releaseVersion + historicalBranchOverride for workflow use.
  • (2026-05-01) F020 completed: Added applyFluxSource() phase that applies Flux GitRepository + Kustomization resources against the selected branch/path (./ee/appliance/flux/base) using host kubeconfig.
  • (2026-05-01) F021 completed: Flux source configuration now normalizes SSH-style GitHub URLs (git@github.com:org/repo.git) into public HTTPS format before applying source manifests.
  • Added test coverage for channel metadata resolution and GitHub URL normalization in ee/appliance/host-service/tests/setup-engine.workflow.test.mjs.
  • (2026-05-01) F022 completed: Added applyReleaseSelectionConfiguration() to persist selected channel/release and runtime values into /etc/alga-appliance/release-selection.json (0600/0750 permissions) for host-side release selection state.
  • Setup workflow order now: preflight -> k3s -> storage -> Flux install -> channel metadata resolve -> Flux source apply -> release selection persistence.
  • Added automated test coverage verifying persisted release-selection/runtime payload and state transition.
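
The ordered workflow with running/complete/blocked states can be sketched as a small phase runner. This is illustrative rather than the real runSetupWorkflow(): the runner name and callback shape are assumptions, and phases are synchronous here only to keep the example short.

```javascript
// Hypothetical phase runner: records <id>-running, then <id>-complete or <id>-blocked,
// and stops at the first blocked phase so later phases never run on a broken base.
function runPhases(phases, recordState) {
  for (const phase of phases) {
    recordState(`${phase.id}-running`);
    try {
      phase.run();
      recordState(`${phase.id}-complete`);
    } catch (error) {
      recordState(`${phase.id}-blocked`, { cause: error.message });
      return { ok: false, blockedPhase: phase.id };
    }
  }
  return { ok: true };
}

const states = [];
const result = runPhases(
  [
    { id: 'k3s-install', run: () => {} },
    { id: 'storage-config', run: () => { throw new Error('storage installer exited 1'); } },
    { id: 'flux-install', run: () => {} }, // skipped because storage blocked
  ],
  (state) => states.push(state),
);
console.log(states.join(' -> '));
// k3s-install-running -> k3s-install-complete -> storage-config-running -> storage-config-blocked
```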
  • (2026-05-01) F023 completed: Updated host-service mode detection so port 8080 transitions into status mode as soon as setup state exists (install started), rather than waiting for a terminal complete phase marker.
  • Rationale: status/progress UX is now available immediately after setup begins, matching the requirement that host :8080 remain the durable setup->status plane.
  • (2026-05-01) F024 completed: Implemented host-side status collection in ee/appliance/host-service/status-engine.mjs, using local kubeconfig (/etc/rancher/k3s/k3s.yaml) and kubectl queries (nodes/pods) instead of Talos APIs.
  • Added token-protected /api/status endpoint in server.mjs that returns install state + Kubernetes snapshot from the Ubuntu host service.
  • Added unit test ee/appliance/host-service/tests/status-engine.test.mjs validating kubeconfig-driven status collection via mocked kubectl responses.
  • (2026-05-01) F025 completed: Added readiness tier computation in status-engine.mjs with explicit booleans for platformReady, coreReady, bootstrapReady, loginReady, backgroundReady, and fullyHealthy.
  • (2026-05-01) F026 completed: Enforced readiness semantics so background workload issues (email-service, temporal, workflow-worker, temporal-worker) degrade backgroundReady without blocking loginReady.
  • Added regression test in status-engine.test.mjs proving loginReady remains true when temporal-worker is unhealthy while core/platform/bootstrap remain ready.
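
The readiness semantics from F025/F026 can be shown with a small tier calculation. The tier names and the background workload list come from the notes above; the input fields (nodeReady, coreHealthy, and so on) and the summary strings other than ready-with-background-issues are assumptions for illustration.

```javascript
const BACKGROUND_WORKLOADS = ['email-service', 'temporal', 'workflow-worker', 'temporal-worker'];

// Sketch of tiered readiness: each login-path tier depends on the one before it,
// while background workloads only affect backgroundReady and fullyHealthy.
function computeReadiness({ nodeReady, coreHealthy, bootstrapComplete, appHealthy, workloadHealth = {} }) {
  const platformReady = Boolean(nodeReady);
  const coreReady = platformReady && Boolean(coreHealthy);
  const bootstrapReady = coreReady && Boolean(bootstrapComplete);
  const loginReady = bootstrapReady && Boolean(appHealthy);
  const backgroundReady = BACKGROUND_WORKLOADS.every((name) => workloadHealth[name] === true);
  const fullyHealthy = loginReady && backgroundReady;

  let summary = 'not-ready';
  if (loginReady) summary = backgroundReady ? 'ready' : 'ready-with-background-issues';

  return { platformReady, coreReady, bootstrapReady, loginReady, backgroundReady, fullyHealthy, summary };
}

const degraded = computeReadiness({
  nodeReady: true,
  coreHealthy: true,
  bootstrapComplete: true,
  appHealthy: true,
  workloadHealth: { 'email-service': true, temporal: true, 'workflow-worker': true, 'temporal-worker': false },
});
console.log(degraded.loginReady, degraded.summary); // true ready-with-background-issues
```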
  • (2026-05-02) F027 completed: Added normalized failure classification output in ee/appliance/host-service/status-engine.mjs (network, dns, historical-git-release-source, k3s, flux, storage, app-bootstrap, app-readiness, background-services).
  • Classification now derives from persisted setup/install state plus live pod/warning signals and is exposed in collectStatusSnapshot().failures for consistent host status/debug consumers.
  • (2026-05-02) F028 completed: Expanded host status surfaces to include operator-useful failure context (current phase, last action, suspected cause, suggested next step, retry safety, and log commands) in both /api/status and status-mode HTML response from server.mjs.
  • (2026-05-02) F036 completed: Improved preflight blocker visibility in /setup POST failure rendering: explicit preflight/no-k3s-started message for DNS/network/GitHub release-source failures plus targeted proxy/firewall/DNS guidance text before install begins.
  • Added regression test collectStatusSnapshot classifies persisted k3s failures in ee/appliance/host-service/tests/status-engine.test.mjs.
  • Validation run (2026-05-02):
    • node --check ee/appliance/host-service/setup-engine.mjs ee/appliance/host-service/status-engine.mjs ee/appliance/host-service/server.mjs
    • node --test ee/appliance/host-service/tests/setup-engine.preflight.test.mjs ee/appliance/host-service/tests/setup-engine.workflow.test.mjs ee/appliance/host-service/tests/status-engine.test.mjs
  • (2026-05-02) F029 completed: Implemented host-service-first support bundle generation in ee/appliance/host-service/support-bundle.mjs.
  • Bundle collection includes: install/release/setup state snapshots, appliance journals, k3s service status, DNS/network diagnostics, disk usage, Kubernetes resource snapshots (namespaces/pods/deployments/statefulsets/jobs/PVC/events), Flux GitRepository/Kustomization status, HelmRelease status, and bootstrap job log capture.
  • Redaction behavior added for common secret fields (token, password, client-key-data, bearer authorization headers) across captured output and copied metadata files.
  • Added token-protected web/API flow in server.mjs:
    • POST /api/support-bundle?token=... generates archive
    • GET /support-bundle?token=... exposes one-button generation page and CLI fallback hint
  • Added automated test ee/appliance/host-service/tests/support-bundle.test.mjs validating archive creation wiring and redaction behavior.
  • Validation run (2026-05-02):
    • node --check ee/appliance/host-service/support-bundle.mjs ee/appliance/host-service/server.mjs ee/appliance/host-service/status-engine.mjs
    • node --test ee/appliance/host-service/tests/setup-engine.preflight.test.mjs ee/appliance/host-service/tests/setup-engine.workflow.test.mjs ee/appliance/host-service/tests/status-engine.test.mjs ee/appliance/host-service/tests/support-bundle.test.mjs
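
The redaction step described for the support bundle (F029) could be approximated with a pair of regular expressions. This is a sketch, not the rules in support-bundle.mjs: the patterns and the [REDACTED] marker are assumptions, and it only covers simple key/value and header forms.

```javascript
// Each rule is [pattern, replacement]. The first covers key: value, key=value, and
// "key": "value" pairs for sensitive field names; the second covers bearer headers.
const REDACTION_RULES = [
  [/("?(?:token|password|client-key-data)"?\s*[:=]\s*)(?:"[^"]*"|[^\s",]+)/gi, '$1[REDACTED]'],
  [/(authorization\s*:\s*bearer\s+)\S+/gi, '$1[REDACTED]'],
];

// Apply every rule in order to the captured text.
function redact(text) {
  return REDACTION_RULES.reduce(
    (output, [pattern, replacement]) => output.replace(pattern, replacement),
    String(text),
  );
}

console.log(redact('password: hunter2'));
// password: [REDACTED]
console.log(redact('Authorization: Bearer abc.def.ghi'));
// Authorization: Bearer [REDACTED]
```

Quoted values lose their quotes in this sketch, so redacted JSON is no longer valid JSON; a production version would likely need format-aware handling.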
  • (2026-05-02) F030/F031 completed: Added app-only channel update flow in host service with new engine ee/appliance/host-service/update-engine.mjs.
  • Update behavior:
    • token-protected web/API surfaces: GET /updates?token=... and POST /api/updates?token=...
    • accepts stable/nightly channel selection
    • resolves selected channel metadata from GitHub/release channels
    • applies Flux source and persists selected release/runtime configuration
    • requests Flux source + HelmRelease reconcile
    • writes bounded update history at /var/lib/alga-appliance/update-history.json
  • Safety/scope: update state explicitly records scope: application-only; path does not perform Ubuntu package or k3s upgrades.
  • Added automated test ee/appliance/host-service/tests/update-engine.test.mjs validating channel update flow, release-selection persistence, update history persistence, and reconcile invocation wiring.
  • Validation run (2026-05-02):
    • node --check ee/appliance/host-service/update-engine.mjs ee/appliance/host-service/server.mjs
    • node --test ee/appliance/host-service/tests/setup-engine.preflight.test.mjs ee/appliance/host-service/tests/setup-engine.workflow.test.mjs ee/appliance/host-service/tests/status-engine.test.mjs ee/appliance/host-service/tests/support-bundle.test.mjs ee/appliance/host-service/tests/update-engine.test.mjs
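
The bounded update history mentioned for F030/F031 amounts to appending an entry and trimming the oldest ones. A minimal sketch, assuming a limit of 20 entries and a hypothetical function name (the actual bound is not stated in these notes):

```javascript
// Hypothetical helper: returns a new history array with the entry appended and
// only the most recent maxEntries kept. The default of 20 is an assumption.
function appendUpdateHistory(history, entry, maxEntries = 20) {
  const limit = Math.max(1, maxEntries);
  const next = [...history, { ...entry, scope: 'application-only' }];
  return next.slice(-limit);
}

const trimmed = appendUpdateHistory(
  [{ channel: 'stable', release: 'a' }, { channel: 'stable', release: 'b' }],
  { channel: 'nightly', release: 'c' },
  2,
);
console.log(trimmed.map((item) => item.release).join(',')); // b,c
```

The resulting array would then be written to /var/lib/alga-appliance/update-history.json by whatever persistence step the update engine uses.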
  • (2026-05-02) F035 completed: Added maintenance/version metadata persistence in ee/appliance/host-service/metadata-engine.mjs.
  • Metadata file: /var/lib/alga-appliance/maintenance-metadata.json (0600/0750 permissions), storing Ubuntu host details, k3s version, selected app channel/release, last app update timestamp, explicit v1 manual-update policy, and v2 maintenance direction notes.
  • Wired metadata persistence into both setup completion (runSetupWorkflow) and app-channel updates (runAppChannelUpdate) so future managed-maintenance logic has durable baseline state.
  • Added automated test ee/appliance/host-service/tests/metadata-engine.test.mjs.
  • (2026-05-02) F032/F033/F034/F038 completed: Rewrote appliance docs to Ubuntu-first supported flow and app-update semantics.
  • Updated docs:
    • ee/docs/appliance/README.md (Ubuntu as supported path, v1 update boundaries, v2 direction)
    • ee/docs/appliance/quick-start.md (VMware/cloud Ubuntu ISO install, setup URL/token, DNS defaults/risk, status plane)
    • ee/docs/appliance/operators-manual.md (status/failure model, support bundle, app-channel updates)
    • ee/appliance/README.md (internal banner: Ubuntu supported, Talos legacy/internal)
  • (2026-05-02) F037 completed: Retired Talos operator from default supported CLI surface by gating ee/appliance/appliance behind explicit legacy opt-in (ALGA_APPLIANCE_ALLOW_LEGACY_TALOS=1).
  • Default invocation now points operators to Ubuntu setup/status flow on http://<node-ip>:8080/setup and http://<node-ip>:8080; Talos path remains available only for support/engineering override.
  • Updated legacy CLI usage text in ee/appliance/operator/lib/cli.mjs to clearly label the Talos command set as legacy and reference Ubuntu host service as the supported v1 path.
  • Validation run (2026-05-02):
    • node --check ee/appliance/host-service/metadata-engine.mjs ee/appliance/host-service/setup-engine.mjs ee/appliance/host-service/update-engine.mjs ee/appliance/host-service/server.mjs
    • node --test ee/appliance/host-service/tests/setup-engine.preflight.test.mjs ee/appliance/host-service/tests/setup-engine.workflow.test.mjs ee/appliance/host-service/tests/status-engine.test.mjs ee/appliance/host-service/tests/support-bundle.test.mjs ee/appliance/host-service/tests/update-engine.test.mjs ee/appliance/host-service/tests/metadata-engine.test.mjs
  • (2026-05-02) T001 completed: Added automated Ubuntu ISO build smoke test ee/appliance/ubuntu-iso/tests/t001-build-smoke.test.mjs.
  • Coverage: runs build script dry-run validation plus scaffold build, verifies host artifact staging into ISO overlay (appliance, host-service, operator, scripts) and validates generated .iso + .sha256 outputs.
  • Validation run:
    • node --test ee/appliance/ubuntu-iso/tests/t001-build-smoke.test.mjs
  • (2026-05-02) T002 completed (manual artifact): Added VM smoke runbook ee/docs/plans/2026-05-01-ubuntu-appliance-installer/manual-tests/T002-vm-smoke.md for unattended install + reboot validation on VMware/cloud VM.
  • Constraint: this repository-local CLI environment cannot boot hypervisor VMs directly; test implementation captures exact execution/evidence requirements for operator-run validation.
  • (2026-05-02) T003 completed: Added first-boot smoke automation ee/appliance/host-service/tests/t003-first-boot-smoke.test.mjs.
  • Coverage: validates console output includes setup URL/token/fallback guidance and verifies host web service health endpoint (/healthz) responds on configured port.
  • Validation run:
    • node --test ee/appliance/host-service/tests/t003-first-boot-smoke.test.mjs
  • (2026-05-02) Test coverage expansion completed for T006/T007/T008/T009/T010/T012/T013/T014/T015/T017/T019/T020 using existing and new host-service/unit smoke tests plus updated Ubuntu-first docs.
  • Added classification mapping regression in ee/appliance/host-service/tests/status-engine.test.mjs for Flux/storage/app-bootstrap/app-readiness phase mapping correctness.
  • Combined validation run:
    • node --test ee/appliance/host-service/tests/setup-engine.preflight.test.mjs ee/appliance/host-service/tests/setup-engine.workflow.test.mjs ee/appliance/host-service/tests/status-engine.test.mjs ee/appliance/host-service/tests/support-bundle.test.mjs ee/appliance/host-service/tests/update-engine.test.mjs ee/appliance/host-service/tests/metadata-engine.test.mjs ee/appliance/host-service/tests/t003-first-boot-smoke.test.mjs ee/appliance/ubuntu-iso/tests/t001-build-smoke.test.mjs
  • Tests not yet marked complete are intentionally left open: they require a live VM/hypervisor or a full reboot lifecycle, which cannot be run from this repo-local CLI context, or they exercise end-to-end setup/update paths deeper than the isolated unit/integration checks cover.
  • (2026-05-02) Added manual execution runbooks for remaining live-environment validations:
    • manual-tests/T004-web-setup-happy-path.md
    • manual-tests/T005-console-fallback-happy-path.md
    • manual-tests/T011-end-to-end-install.md
    • manual-tests/T016-reboot-persistence.md
    • manual-tests/T018-dns-behavior.md
  • These tests require VM/hypervisor execution, network topology control, and reboot lifecycle validation not directly runnable in this repo-local CLI session; runbooks include explicit commands, expected outcomes, and evidence capture requirements.