Scratchpad — Appliance Operator TUI
- Plan slug: `appliance-operator-tui`
- Created: `2026-03-25`
What This Is
Keep a lightweight, continuously-updated log of discoveries and decisions made while implementing this plan.
Prefer short bullets. Append new entries as you learn things, and also update earlier notes when a decision changes or an open question is resolved.
Decisions
- (2026-03-25) First version is terminal-first, not a browser-based operator console. Reason: it matches the current appliance operator workflow, works over SSH, and avoids building a second management surface before the appliance lifecycle is stable.
- (2026-03-25) The new tool should live under `ee/appliance`, not inside the existing Nushell developer CLI. Reason: the current CLI is developer-oriented and appliance operations need a clearer product boundary.
- (2026-03-25) The tool should be structured for both repo-based use and future standalone packaging. Reason: v1 can ship from the repo, but packaging constraints should not be baked into the lifecycle logic.
- (2026-03-25) The TUI should wrap the current appliance shell scripts and release manifests instead of replacing bootstrap, upgrade, or reset logic.
- (2026-03-25) Implemented operator as Node ESM modules under `ee/appliance/operator` with a thin shell wrapper `ee/appliance/appliance`. Reason: no existing appliance package/workspace existed and ESM keeps packaging and standalone embedding simple.
- (2026-03-25) Added runtime path abstraction with repo auto-discovery and `ALGA_APPLIANCE_ASSET_ROOT` override. Reason: required for future standalone packaging while preserving repo-hosted workflow.
- (2026-03-25) Implemented one normalized status model (`collectStatus`) used by both TUI and non-interactive commands. Reason: avoids divergent status logic and supports consistent blocker guidance across command surfaces.
- (2026-03-25) Kept lifecycle actions script-driven (historical removed bootstrap script, historical removed upgrade script, `reset-appliance-data.sh`, `collect-support-bundle.sh`) with phase-aware progress wrappers. Reason: minimizes operational drift and honors existing script contracts.
- (2026-03-25) The current `readline` shell is not the accepted final UX. Reason: it is operationally useful, but it does not meet the product bar for a real operator TUI.
- (2026-03-25) Ink is the intended runtime for the interactive layer. Reason: it supports the full-screen, persistent-layout, keyboard-driven interface we actually want while allowing the existing Node operator core to remain intact.
- (2026-03-25) Replaced the interactive `readline/promises` loop with a stateful Ink app while keeping lifecycle/status modules untouched. Reason: satisfies the UX acceptance bar (F026-F031) without destabilizing non-interactive commands.
- (2026-03-25) Added Vim-style `j/k/h/l` bindings alongside arrows in the Ink shell. Reason: improves SSH/operator ergonomics and made headless TUI tests deterministic.
- (2026-03-25) Appliance pod inspection belongs inside the same Ink operator rather than a separate CLI/tool. Reason: operators should stay in one surface for lifecycle, status, and debugging.
- (2026-03-25) Workload scope should default to appliance-relevant namespaces only (`msp`, `alga-system`, `flux-system`). Reason: operators asked for appliance-focused visibility, not a generic cluster browser.
- (2026-03-25) Pod logs should use a full-screen viewer with bounded scrollback and Escape-to-return behavior. Reason: this matches operator expectations better than a cramped split view and avoids unbounded memory growth.
- (2026-03-25) Implemented `F032` by adding `Workloads` as a first-class Ink action and dedicated main-pane route. Reason: keeps pod inspection in the same operator surface as lifecycle and status actions.
- (2026-03-25) Implemented `F033` with a default namespace allowlist (`msp`, `alga-system`, `flux-system`) inside the workload adapter. Reason: appliance operators need focused inventory, not cluster-wide noise.
- (2026-03-25) Implemented `F034` with a refreshable pod table showing pod, namespace, status, ready, restarts, and age. Reason: aligns the workload pane with PRD operator-at-a-glance requirements.
- (2026-03-25) Implemented `F035` using timed workload polling with selection preservation by stable pod key. Reason: refreshing state must not disrupt active operator focus.
- (2026-03-25) Implemented `F036` by adding a full-screen log view opened from the workload list and closed with `Esc` back to workloads. Reason: this mirrors required drill-down behavior without layout loss.
- (2026-03-25) Implemented `F037` by using chunked tail expansion plus fixed-cap line windows. Reason: `kubectl logs` is append-oriented, so chunked reload with a cap is the practical bounded-memory strategy.
- (2026-03-25) Implemented `F038` with follow-mode tied to bottom position and automatic pause when scrolling upward. Reason: operators need live tail only when intentionally at stream bottom.
- (2026-03-25) Implemented `F039` with keyboard controls for workloads/logs (`j/k`, arrows, `Enter`, `Esc`, page scroll). Reason: parity with SSH-friendly keyboard workflows.
- (2026-03-25) Implemented `F040` via new `lib/workloads.mjs` adapter that encapsulates `kubectl get pods` and `kubectl logs` calls behind normalized APIs. Reason: keeps raw command details out of TUI view logic.
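The bounded log window and follow/pause behavior noted for `F037` and `F038` could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the operator's actual code: `createLogWindow`, `appendLines`, `scroll`, `isFollowing`, and `nextTailSize` are hypothetical names, and the cap and chunk values are arbitrary.

```javascript
// Hypothetical sketch of a fixed-cap log window with follow mode.
// None of these names come from the operator source.

// scrollOffset counts how many lines the viewport sits above the bottom.
function createLogWindow(maxLines) {
  return { lines: [], maxLines, scrollOffset: 0 };
}

// Follow mode is simply "the viewport is at the bottom of the stream".
function isFollowing(win) {
  return win.scrollOffset === 0;
}

// Append newly fetched lines and drop the oldest ones beyond the cap.
// While paused, shift the offset so the viewport stays on the same content.
function appendLines(win, newLines) {
  const merged = [...win.lines, ...newLines];
  const lines = merged.slice(-win.maxLines); // assumes maxLines > 0
  const scrollOffset = isFollowing(win)
    ? 0
    : Math.min(win.scrollOffset + newLines.length, lines.length - 1);
  return { ...win, lines, scrollOffset };
}

// Positive delta scrolls up (pauses follow); returning to offset 0 resumes it.
function scroll(win, delta) {
  const max = Math.max(0, win.lines.length - 1);
  const scrollOffset = Math.min(Math.max(win.scrollOffset + delta, 0), max);
  return { ...win, scrollOffset };
}

// Chunked tail expansion: grow the requested `--tail` size in steps, up to a cap.
function nextTailSize(currentTail, chunk, cap) {
  return Math.min(currentTail + chunk, cap);
}

let win = createLogWindow(3);
win = appendLines(win, ['a', 'b', 'c', 'd']); // oldest line 'a' is dropped
win = scroll(win, 1);                         // scrolling up pauses follow
win = appendLines(win, ['e']);                // new data does not yank the view
console.log(win.lines, isFollowing(win));
```

Deriving follow mode from the scroll position, rather than storing a separate flag, keeps the two from drifting apart; a real viewer may still want an explicit toggle.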
Discoveries / Constraints
- (2026-03-25) The repo already has operator-facing appliance scripts: historical removed bootstrap script, historical removed upgrade script, `reset-appliance-data.sh`, and `collect-support-bundle.sh`.
- (2026-03-25) The current bootstrap and upgrade flows are already release-manifest driven under historical local release metadata (removed).
- (2026-03-25) The operator problem is not missing capability; it is poor usability and path/command discoverability.
- (2026-03-25) The existing shell scripts already own sensitive logic like Talos config generation, Flux install, release value rendering, and destructive reset semantics. Reimplementing them in v1 would create drift risk.
- (2026-03-25) The existing developer CLI is Nushell-based and heavily focused on dev/build/test workflows, which makes it a poor default home for a customer-facing appliance operator surface.
- (2026-03-25) `ee/appliance` had no existing app package or command framework, so the operator needed to bootstrap its own CLI/TUI modules and tests from scratch.
- (2026-03-25) Bootstrap stderr/stdout can contain multiple layers in one run; classifier precedence must favor explicit Kubernetes timeout strings when Talos logs are also present.
- (2026-03-25) The current operator core and non-interactive commands are still the right foundation; the main change is swapping the interactive shell, not rewriting lifecycle or status logic.
- (2026-03-25) `ink@5.x` was incompatible with this repo's React 19 runtime (`ReactCurrentOwner` crash during module load). `ink@6.8.0` resolves the compatibility issue.
- (2026-03-25) The new UI keeps a persistent layout with dedicated header, action navigator, status dashboard panel, main content pane, progress panel, and contextual help strip.
- (2026-03-25) `kubectl logs` is not a true random-access log API, so "scrollback pagination" must be approximated by chunked reloads and bounded windows rather than arbitrary seek.
- (2026-03-25) Auto-refreshing workload state must preserve selection and avoid clobbering active log-view state during operator inspection.
- (2026-03-25) Ink page-up/page-down availability depends on terminal input; tests are more deterministic using `j/k` and `Enter` paths.
- (2026-03-25) `kubectl logs --since-time` can be used for live append polling once the latest seen timestamp is tracked in viewer state.
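The `--since-time` polling idea from the last note might look something like the sketch below. It assumes logs are fetched with `--timestamps` so each line starts with an RFC3339 timestamp. The helper names (`buildLogArgs`, `splitTimestamp`, `applyPoll`) and the state shape are hypothetical; only the `kubectl logs` flags are real. Dropping lines whose timestamp equals the last seen one is a deliberately crude guard in case the boundary line is returned again, and it would also drop distinct lines sharing that exact timestamp.

```javascript
// Hypothetical sketch: append-only polling driven by the latest seen timestamp.

// Build the kubectl argument list; sinceTime is omitted on the first fetch.
function buildLogArgs({ namespace, pod, sinceTime }) {
  const args = ['logs', pod, '-n', namespace, '--timestamps'];
  if (sinceTime) args.push(`--since-time=${sinceTime}`);
  return args;
}

// With --timestamps, each line is "<timestamp> <message>".
function splitTimestamp(line) {
  const idx = line.indexOf(' ');
  if (idx === -1) return { timestamp: line, text: '' };
  return { timestamp: line.slice(0, idx), text: line.slice(idx + 1) };
}

// Merge one poll result into viewer state and advance the cursor.
// Assumes kubectl returns lines in chronological order.
function applyPoll(state, rawOutput) {
  const parsed = rawOutput
    .split('\n')
    .filter((line) => line.length > 0)
    .map(splitTimestamp);
  const fresh = parsed.filter((entry) => entry.timestamp !== state.lastSeen);
  const lastSeen =
    parsed.length > 0 ? parsed[parsed.length - 1].timestamp : state.lastSeen;
  return { lastSeen, lines: [...state.lines, ...fresh.map((e) => e.text)] };
}

let state = { lastSeen: null, lines: [] };
state = applyPoll(
  state,
  '2026-03-25T12:00:00.000000001Z boot\n2026-03-25T12:00:01.000000002Z ready\n'
);
// The next poll would pass the tracked timestamp back to kubectl.
console.log(buildLogArgs({ namespace: 'msp', pod: 'example-pod', sinceTime: state.lastSeen }));
```

In the viewer this would be combined with the fixed line cap, so appended lines still cannot grow memory without bound.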
Commands / Runbooks
- (2026-03-25) Plan scaffold command:
  - `python3 /Users/roberisaacs/.codex/skills/alga-plan/scripts/scaffold_plan.py "Appliance Operator TUI"`
- (2026-03-25) Existing appliance script entrypoints:
  - historical removed bootstrap script
  - historical removed upgrade script
  - `ee/appliance/scripts/reset-appliance-data.sh`
  - `ee/appliance/scripts/collect-support-bundle.sh`
- (2026-03-25) New operator entrypoints:
  - `ee/appliance/appliance --help`
  - `ee/appliance/appliance tui`
  - `ee/appliance/appliance status`
- (2026-03-25) New test runbook:
  - `node --test ee/appliance/operator/tests/*.test.mjs`
- (2026-03-25) Future workload/log implementation will likely need dedicated adapter tests separate from the existing lifecycle/status tests.
- (2026-03-25) Workload/log implementation test run:
  - `node --test ee/appliance/operator/tests/*.test.mjs`
- (2026-03-25) Ink dependency updates:
  - `npm install ink@^6.8.0`
  - `npm install --save-dev ink-testing-library@^4.0.0`
Links / References
- `ee/appliance/README.md`
- historical removed bootstrap script
- historical removed upgrade script
- `ee/appliance/appliance`
- `ee/appliance/operator/appliance.mjs`
- `ee/appliance/operator/lib/cli.mjs`
- `ee/appliance/operator/lib/tui.mjs`
- `ee/appliance/operator/lib/status.mjs`
- `ee/appliance/operator/lib/lifecycle.mjs`
- `ee/appliance/operator/tests/lifecycle-cli.test.mjs`
- `ee/appliance/operator/tests/status.test.mjs`
- `ee/appliance/operator/tests/runtime-paths.test.mjs`
- `ee/appliance/operator/tests/tui-ink.test.mjs`
- `ee/appliance/operator/lib/workloads.mjs`
- `ee/appliance/operator/tests/workloads.test.mjs`
- `ee/docs/premise/README.md`
- `ee/docs/premise/talos-gitops-bootstrap.md`
- `docs/plans/2026-03-10-talos-appliance-gitops-alga-deployment-design.md`
- `docs/plans/2026-03-10-talos-image-factory-scaffolding-design.md`
Open Questions
- (Resolved 2026-03-25) TUI runtime/library: use Ink for the real interactive shell. The existing `readline/promises` shell is interim scaffolding, not the accepted end state.
- (Resolved 2026-03-25) v1 status scope: summary-first (Talos/Kubernetes/Flux/Helm/workloads + release/config paths) without embedded log/event drill-down.
- (Resolved 2026-03-25) Ship TUI and mirrored non-interactive command surface together in v1.
- (Resolved 2026-03-25) Expanded v1 operator scope now includes appliance-relevant workload inventory and full-screen pod log viewing inside the Ink UI.
- (Resolved 2026-03-25) `T010` completed: workload console now validates appliance-only pod inventory, required columns, and selection preservation across refresh.
- (Resolved 2026-03-25) `T011` completed: selecting a pod opens full-screen logs and `Esc` restores workload layout/selection.
- (Resolved 2026-03-25) `T012` completed: log viewer validates chunked older-load behavior, follow/pause transitions, and bounded in-memory line caps.
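The selection preservation that `T010` validates (implemented as `F035`, by stable pod key) could be approximated as follows. This is a sketch only: `podKey` and `reconcileSelection` are hypothetical names, and using `namespace/name` as the stable key is an assumption, since the notes do not say what the key is.

```javascript
// Hypothetical sketch: keep the operator's selected pod across a refresh.

// Assumed stable key; the real adapter may use something else (for example a UID).
function podKey(pod) {
  return `${pod.namespace}/${pod.name}`;
}

// Return the row index to select after a refresh, or -1 for an empty list.
function reconcileSelection(prevPods, prevIndex, nextPods) {
  if (nextPods.length === 0) return -1;
  const previous = prevPods[prevIndex];
  if (previous) {
    const idx = nextPods.findIndex((p) => podKey(p) === podKey(previous));
    if (idx !== -1) return idx; // same pod, possibly at a new row
  }
  // The selected pod disappeared: fall back to the nearest valid row.
  return Math.min(Math.max(prevIndex, 0), nextPods.length - 1);
}

const before = [
  { namespace: 'msp', name: 'a' },
  { namespace: 'msp', name: 'b' },
  { namespace: 'flux-system', name: 'c' },
];
const after = [
  { namespace: 'alga-system', name: 'new' },
  { namespace: 'msp', name: 'a' },
  { namespace: 'msp', name: 'b' },
];
console.log(reconcileSelection(before, 1, after)); // selection follows msp/b to its new row
```

Matching on a key rather than a row index is what keeps the cursor on the same pod when rows are added, removed, or reordered between polls.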