# Appliance Architecture & Codebase Map

> Use this to get oriented in the on-prem **appliance** and find where each part
> lives: the components and how they fit together, what happens during an
> install, and the file to open for any given piece. Every section links the real
> files.
>
> This describes the current **Ubuntu + k3s + Kubernetes-hosted control plane +
> registry-metadata** implementation. The user-facing operator docs in this
> directory ([`quick-start.md`](./quick-start.md),
> [`operators-manual.md`](./operators-manual.md),
> [`technical-reference.md`](./technical-reference.md),
> [`kubernetes-hosted-setup.md`](./kubernetes-hosted-setup.md)) and the generic
> Talos platform references under [`ee/docs/premise/`](../premise/) carry more
> detail and some historical (Talos) narrative. The design rationale for the
> registry-metadata release model is in
> [`ee/appliance/docs/registry-metadata-design.md`](../../appliance/docs/registry-metadata-design.md).

All paths are relative to the repo root. Everything for the appliance lives
under [`ee/appliance/`](../../appliance/); its own
[`README.md`](../../appliance/README.md) is the package-level companion to this
doc.

---

## 1. The big picture

The appliance is a **single-node, self-contained on-prem deployment** of Alga
PSA. It ships as an Ubuntu ISO that boots into k3s and runs its *entire* control
plane as Kubernetes workloads. The only host-level responsibility is a thin
systemd bootstrap that starts k3s and hands off to an in-cluster setup UI.

```
Ubuntu ISO (autoinstall)
  └─ systemd: alga-appliance-bootstrap.service
       ├─ starts k3s (the substrate)
       ├─ imports the baked control-plane image
       └─ applies the control-plane Deployment, then prints the setup banner
                      │
                      ▼
Kubernetes-hosted control plane (ns: alga-appliance-control-plane)
  Deployment appliance-control-plane → host-service/server.mjs on :8080
       ├─ serves the setup wizard + status UI
       ├─ runs the setup workflow (setup-engine.mjs)
       └─ writes Flux resources that install the app
                      │
                      ▼
Flux (GitOps) reconciles from the OCI registry (ghcr.io/nine-minds)
  OCIRepository (config bundle) + HelmRepository (charts)
       └─ Kustomizations: alga-platform → alga-core → alga-background
                      │
                      ▼
PSA application stack (Helm releases)
  ns msp         : alga-core (sebastian), db, redis
  ns alga-system : pgbouncer, temporal, temporal-worker, workflow-worker, email-service
```

**Four namespaces** carry the whole system:

| Namespace | Holds |
|---|---|
| `alga-appliance-control-plane` | the setup/status control-plane pod |
| `flux-system` | Flux controllers + the child Kustomizations + the OCI/Helm sources |
| `alga-system` | platform HelmReleases (pgbouncer, temporal, workers, email) |
| `msp` | the PSA app (alga-core / sebastian), database, redis, initial tenant |

---

## 2. Components & where they live

### Host bootstrap (systemd)
The host does as little as possible. See
[`ee/appliance/systemd/`](../../appliance/systemd/):

- **`alga-appliance-bootstrap.service`** — oneshot. `ExecStartPre` runs
  `init-token.mjs` + `init-admin-credential.mjs`; `ExecStart` runs
  `scripts/bootstrap-control-plane.sh`; `ExecStartPost` runs
  `console.mjs --emit-runtime-banner` (prints the setup URL + token on tty1).
- **`alga-host-agent.service`** — long-running. Runs `host-service/host-agent.mjs`,
  a diagnostics Unix socket the in-pod control plane calls for host journals.
- **`alga-appliance.sysusers`** — reserves the `alga` group (GID 10001) used for
  the host-agent socket.

> Note: the old script-driven Talos bootstrap/upgrade path has been removed from
> `ee/appliance`; supported installs use the Ubuntu host setup/status service.

### The Kubernetes-hosted control plane
The setup/status brain. See [`ee/appliance/control-plane/`](../../appliance/control-plane/)
and [`ee/appliance/host-service/`](../../appliance/host-service/):

- **`control-plane/Dockerfile`** — multi-stage: builds the Next.js status UI from
  `status-ui/`, then a runtime image with `bash/curl/kubectl/flux` that bundles the
  `host-service/*.mjs`, `manifests/`, and `flux/` trees. Runs as UID 10001.
  *(It intentionally does **not** bake release metadata — setup/update resolve
  selected channels from the OCI artifact registry at runtime.)*
- **`control-plane/manifests/`** — `namespace.yaml`, `rbac.yaml`,
  `workload.yaml` (the `appliance-control-plane` Deployment: 1 replica,
  `hostNetwork`, hostPort 8080, mounts the host state dir + setup-token secret),
  and `kustomization.yaml`.
- **`scripts/control-plane-entrypoint.sh`** — the pod's entrypoint: builds an
  in-cluster kubeconfig from the service-account token, then runs `server.mjs`.

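
The entrypoint's kubeconfig step can be pictured with a small Node sketch. The real entrypoint is a shell script and its exact output is not shown here, so treat this as an illustration only: the function name and the `in-cluster` / `control-plane` labels are hypothetical, while the service-account directory is the standard Kubernetes in-pod location.

```javascript
// Sketch only: mirrors the idea of control-plane-entrypoint.sh, not its code.
// Standard in-pod service-account mount; everything else here is hypothetical.
const SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount";

function buildInClusterKubeconfig({ host, port, token, caPath = `${SA_DIR}/ca.crt` }) {
  if (!host || !port) throw new Error("API server host/port is missing");
  if (!token) throw new Error("service-account token is empty");
  // IPv6 literals need brackets inside a URL.
  const serverHost = host.includes(":") ? `[${host}]` : host;
  return {
    apiVersion: "v1",
    kind: "Config",
    clusters: [
      {
        name: "in-cluster",
        cluster: {
          server: `https://${serverHost}:${port}`,
          "certificate-authority": caPath,
        },
      },
    ],
    users: [{ name: "control-plane", user: { token } }],
    contexts: [
      { name: "in-cluster", context: { cluster: "in-cluster", user: "control-plane" } },
    ],
    "current-context": "in-cluster",
  };
}
```

In a pod, `host` and `port` would typically come from `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT`, and the token from `${SA_DIR}/token`; serializing the result with `JSON.stringify` gives a file `kubectl` can read, since a kubeconfig may be written as JSON.
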
### host-service — the runtime (`ee/appliance/host-service/`)
Plain Node ESM (`.mjs`), no build step. The pieces:

| File | Role |
|---|---|
| `server.mjs` | HTTP server on :8080. Routes: `/healthz`, `/setup` (UI), `/api/setup/config`, `/api/setup` (POST), `/api/status`. Serves the static status UI, validates input, and dispatches the setup workflow. |
| `setup-engine.mjs` | The setup workflow (see §3). Resolves the release from the registry, renders runtime values, writes the initial-tenant Secret + release-selection ConfigMap, and applies the Flux source. |
| `update-engine.mjs` | Channel **upgrades**: re-resolve the channel, update release selection, `flux reconcile`. |
| `status-engine.mjs` | Cluster health: HelmReleases, pods, readiness tiers (platform / core / bootstrap / login). Backs `/api/status`. |
| `console.mjs` | Console (tty1) setup banner + fallback flow; writes `/etc/issue` and `/etc/motd`. |
| `host-agent.mjs` | The diagnostics socket daemon (host side of `alga-host-agent.service`). |
| `kubectl-queue.mjs` | Serializes kubectl calls (one at a time, with timeout/output caps). |
| `metadata-engine.mjs` | Persists maintenance metadata (OS info, bootstrap/upgrade history). |
| `init-token.mjs` / `init-admin-credential.mjs` | Generate the setup token and the temporary console admin password on first boot. |
| `resolve-control-plane-image.mjs` | Prints the channel-pinned control-plane image ref from the release manifest (used by `bootstrap-control-plane.sh` to roll the control plane from the registry). |
| `bootstrap-boundary.mjs` | Phase constants for the host bootstrap. |
| `support-bundle.mjs` | Builds a redacted diagnostics tarball. |
| `tests/` | Node test runner specs for the engines (workflow, update, status, control-plane packaging, first-boot smoke, etc.). |

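
To make the `server.mjs` route list concrete, here is a minimal dispatch sketch. It is not the real server code: the handler labels are invented, and the `GET` methods are an assumption (the table only states that `/api/setup` is a POST).

```javascript
// Hypothetical route table mirroring the paths listed for server.mjs.
// Handler names are illustrative labels; GET methods are assumed.
const ROUTES = [
  { method: "GET", path: "/healthz", handler: "health" },
  { method: "GET", path: "/setup", handler: "setupUi" },
  { method: "GET", path: "/api/setup/config", handler: "setupConfig" },
  { method: "POST", path: "/api/setup", handler: "runSetup" },
  { method: "GET", path: "/api/status", handler: "status" },
];

function matchRoute(method, rawUrl) {
  // Drop the query string (for example ?token=...) before matching.
  const path = rawUrl.split("?")[0];
  const byPath = ROUTES.filter((r) => r.path === path);
  if (byPath.length === 0) return { status: 404 };
  const route = byPath.find((r) => r.method === method.toUpperCase());
  // Known path but wrong method: 405 rather than 404.
  return route ? { status: 200, handler: route.handler } : { status: 405 };
}
```

With this table, `matchRoute("GET", "/setup?token=abc")` resolves to the `setupUi` label, while a `GET` on `/api/setup` is rejected with 405.
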
### The setup UI (`ee/appliance/status-ui/`)
A small Next.js app. `app/setup/page.tsx` is the wizard (channel, app hostname,
DNS, tenant + admin); `app/page.tsx` is the status view. It's built into static
`dist/` and served by the control-plane pod (the Dockerfile copies it to
`ALGA_APPLIANCE_STATUS_UI_DIR`).

### Flux / Helm layer (`ee/appliance/flux/`)
GitOps definitions, published to the registry as the **config bundle**:

- `flux/base/kustomization.yaml` — top of the bundle: namespaces + the charts
  HelmRepository + the child Kustomizations.
- `flux/base/charts/helm-repository.yaml` — `HelmRepository` `alga-charts`
  (`type: oci`, `oci://ghcr.io/nine-minds/charts`).
- `flux/base/flux/kustomizations.yaml` — the dependency chain
  **`alga-platform` → `alga-core` → `alga-background`**, all sourced from the
  `OCIRepository alga-appliance` that the setup workflow creates.
- `flux/base/releases/*.yaml` — one HelmRelease each: `alga-core` (chart
  `sebastian`, ns `msp`), `pgbouncer`, `temporal`, `temporal-worker`,
  `workflow-worker`, `email-service` (ns `alga-system`). Each pulls values from a
  generated ConfigMap.
- `flux/base/platform/appliance-status.yaml` — in-cluster RBAC + status helpers
  the control plane queries (uses the Flux **v1** APIs).
- `flux/profiles/single-node/values/*.single-node.yaml` — the per-service value
  overrides baked into the release manifest's `profileValues`.

> **Flux apiVersions:** the appliance's Flux serves source CRDs
> (OCIRepository/HelmRepository/GitRepository) + Kustomization at **`v1`** and
> HelmRelease at **`v2`**. Don't reintroduce `v1beta2`.

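
The apiVersion rule lends itself to a quick lint. The sketch below is not part of the repo; `findStaleApiVersions` is a hypothetical helper that takes already-parsed manifests (plain objects) and flags Flux kinds that are not at the versions named in the note. The API group names are the standard Flux toolkit groups.

```javascript
// Expected apiVersion per Flux kind: sources + Kustomization at v1, HelmRelease at v2.
const EXPECTED_API_VERSIONS = {
  OCIRepository: "source.toolkit.fluxcd.io/v1",
  HelmRepository: "source.toolkit.fluxcd.io/v1",
  GitRepository: "source.toolkit.fluxcd.io/v1",
  Kustomization: "kustomize.toolkit.fluxcd.io/v1",
  HelmRelease: "helm.toolkit.fluxcd.io/v2",
};

// Hypothetical helper: returns one message per resource with a stale apiVersion.
function findStaleApiVersions(resources) {
  const problems = [];
  for (const { kind, apiVersion, metadata } of resources) {
    const [group] = String(apiVersion).split("/");
    // A plain kustomization.yaml (kustomize.config.k8s.io) is not a Flux object.
    if (!group.endsWith(".toolkit.fluxcd.io")) continue;
    const expected = EXPECTED_API_VERSIONS[kind];
    if (expected && apiVersion !== expected) {
      const name = metadata?.name ?? "?";
      problems.push(`${kind}/${name}: ${apiVersion} (expected ${expected})`);
    }
  }
  return problems;
}
```

An empty result means every Flux object in the set is at the expected version; anything else names the offending resource.
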
### Release metadata and publishing
- Release metadata is an OCI artifact at `ghcr.io/nine-minds/alga-appliance-release`.
- The release manifest JSON is the artifact config blob: app image tags,
  chart versions, Flux config-bundle digest, control-plane image ref, values
  profile, and profile values.
- Channels such as `stable` and `nightly` are OCI tags on that release artifact.
- Publishing is owned by the Argo workflow in
  `~/nm-kube-config/alga-psa/workflows/composite/alga-psa-build-migrate-deploy.yaml`.
  For stable releases use `promote-release=true`,
  `publish-appliance-release=true`, and `appliance-release-channel=stable`.

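
As a rough picture of what the manifest carries, the sketch below checks a parsed manifest for the six pieces listed above. The JSON key names (`appImages`, `chartVersions`, `configBundleDigest`, `controlPlaneImage`, `valuesProfile`, `profileValues`) are assumptions made for illustration; only `profileValues` appears elsewhere in this doc, and the real schema is whatever `validateReleaseManifest` enforces.

```javascript
// Key names are hypothetical; the doc lists the manifest's contents, not its schema.
const RELEASE_REPO = "ghcr.io/nine-minds/alga-appliance-release";
const DIGEST_RE = /^sha256:[a-f0-9]{64}$/;

// A channel is an OCI tag on the release artifact.
function releaseRef(channel) {
  return `${RELEASE_REPO}:${channel}`;
}

function checkManifestShape(manifest) {
  const errors = [];
  const isObject = (v) => v !== null && typeof v === "object" && !Array.isArray(v);
  const isNonEmptyObject = (v) => isObject(v) && Object.keys(v).length > 0;
  const isNonEmptyString = (v) => typeof v === "string" && v.length > 0;

  if (!isNonEmptyObject(manifest.appImages)) errors.push("appImages");
  if (!isNonEmptyObject(manifest.chartVersions)) errors.push("chartVersions");
  // The config bundle must be pinned by digest, never by a mutable tag.
  if (!DIGEST_RE.test(manifest.configBundleDigest ?? "")) errors.push("configBundleDigest");
  if (!isNonEmptyString(manifest.controlPlaneImage)) errors.push("controlPlaneImage");
  if (!isNonEmptyString(manifest.valuesProfile)) errors.push("valuesProfile");
  if (!isObject(manifest.profileValues)) errors.push("profileValues");
  return errors;
}
```

`releaseRef("stable")` yields the tag a channel lookup would start from, and `checkManifestShape` returns the names of any fields that are missing or malformed.
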
### Host/control-plane scripts (`ee/appliance/scripts/`)
- `bootstrap-control-plane.sh` — the host bootstrap: start k3s, import the baked
  image, resolve + pull the channel control-plane image, apply the Deployment,
  report handoff.
- `stage-control-plane-bundle.sh` — builds + stages the baked control-plane image
  archive into the ISO overlay.
- `install-storage.sh` / `manifests/local-path-storage.yaml` — local-path
  provisioner.
- `bin/alga-control-plane-reapply` — non-destructive recovery: re-import +
  re-apply the control plane.

### ISO builder (`ee/appliance/ubuntu-iso/`)
- `config/nocloud/user-data` — the autoinstall seed. Identity `alga-admin`,
  `interactive-sections: [network, storage]` (the installer pauses on those two),
  late-commands copy the overlay into `/target` and enable the systemd units, then
  reboot.
- `overlay/` — the host filesystem injected into the install: everything lands at
  `/opt/alga-appliance/` (scripts, host-service, flux, manifests, status UI),
  plus the systemd units and `/etc/alga-appliance/`.
- `scripts/build-ubuntu-appliance-iso.sh` — entrypoint (`--base-iso`,
  `--release-version`). Calls `stage-host-artifacts.sh` (which builds the
  control-plane image + stages the overlay), then remasters with `xorriso`.
- `output/` — the built `alga-appliance-ubuntu-<version>.iso` + `.sha256`.

### Operator CLI (`ee/appliance/operator/`)
`ee/appliance/appliance` → `operator/lib/cli.mjs`: a TUI/CLI for status,
support bundles, repair helpers, and destructive reset. It no longer exposes the
removed bootstrap/upgrade commands; installs and app-channel updates are handled
by the host/control-plane setup and update engines.

---

## 3. How an install happens (end to end)

1. **Boot.** The ISO autoinstalls Ubuntu (operator confirms the network + storage
   screens), copies the overlay, enables the systemd units, and reboots.
   → `ubuntu-iso/config/nocloud/user-data`
2. **Host bootstrap.** `alga-appliance-bootstrap.service` generates the setup
   token + temp admin password, runs `bootstrap-control-plane.sh` (k3s up, import
   baked image, **resolve the channel control-plane image from the registry and
   roll to it**, apply the Deployment), then prints the setup URL + token on the
   console. → `systemd/`, `scripts/bootstrap-control-plane.sh`,
   `host-service/resolve-control-plane-image.mjs`
3. **Operator opens the setup wizard** at `http://<ip>:8080/setup?token=…` and
   submits channel (`stable`), app hostname, DNS, tenant + admin.
   → `status-ui/app/setup/page.tsx`, `host-service/server.mjs`
4. **Setup workflow** (`setup-engine.mjs`, orchestrated by `runSetupWorkflow`):
   - `validateSetupInputs` → `runSetupPreflight` / `runNetworkChecks`
   - `resolveChannelMetadata` → `resolveReleaseManifest` resolves the channel tag
     on `ghcr.io/nine-minds/alga-appliance-release` to an immutable manifest
     (image tags + chart versions + config-bundle digest + control-plane ref +
     profile values), validated by `validateReleaseManifest`.
   - `applyReleaseSelectionConfiguration` + `applyRuntimeValuesAndReleaseSelection`
     write the per-service values ConfigMaps, the `appliance-release-selection`
     ConfigMap, and the **initial-tenant Secret** (tenant + admin, PBKDF2-hashed).
   - `applyFluxSource` creates the `OCIRepository` (config bundle, **pinned by
     digest**) + the parent Kustomization.
5. **Flux reconciles.** `alga-platform → alga-core → alga-background` apply the
   HelmReleases; charts come from the OCI `HelmRepository`. The app comes up in
   `msp` + `alga-system`. → `flux/base/`
6. **Tenant bootstrap.** The app's bootstrap job runs `create-tenant.ts` against
   the initial-tenant Secret, creating the tenant + admin and seeding
   roles/permissions/statuses. Login then works. → `helm/templates/appliance-bootstrap-configmap.yaml`,
   `server/scripts/create-tenant.ts`
7. **Status** is reported throughout by `status-engine.mjs` via `/api/status`
   (readiness tiers: platform / core / bootstrap / login).

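
The readiness tiers in step 7 can be read as an ordered gate. The sketch below assumes the tiers are evaluated in the order listed and that an earlier tier blocks the later ones; that ordering, the function name, and the result shape are illustrative guesses, not the behavior of `status-engine.mjs`.

```javascript
// Tier names come from the doc; the strict ordering is an assumption.
const READINESS_TIERS = ["platform", "core", "bootstrap", "login"];

// checks: { platform: boolean, core: boolean, bootstrap: boolean, login: boolean }
function summarizeReadiness(checks) {
  const passed = [];
  for (const tier of READINESS_TIERS) {
    if (!checks[tier]) {
      // Report the first tier that is not ready yet; later tiers are not inspected.
      return { ready: false, waitingOn: tier, passed };
    }
    passed.push(tier);
  }
  return { ready: true, waitingOn: null, passed };
}
```
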
---

## 4. The registry-metadata model (why no git, no ISO re-burn)

A **channel** (`stable`) is an OCI tag on a release manifest; everything it
references is pinned by digest/version, so a resolved release is immutable. The
appliance pulls all of it anonymously from `ghcr.io/nine-minds` (public read).

**Published artifacts** (all under `ghcr.io/nine-minds`):

| Artifact | What |
|---|---|
| `charts/<name>:<ver>` | the 6 Helm charts (OCI) |
| `alga-appliance-config:<ver>` | the Flux config bundle (the rendered `flux/base` tree) |
| `alga-appliance-control-plane:<ver>` | the control-plane image |
| `alga-appliance-release:<ver>` + `:<channel>` | the release manifest (its JSON is the artifact's config blob) |

**Consequences for what needs rebuilding:**

- **App image change** (e.g. a fix in `server/`) → build `alga-psa-ee`, then run
  the `~/nm-kube-config` Argo workflow with `publish-appliance-release=true` and
  the desired channel (for stable: `appliance-release-channel=stable`). **No ISO,
  no control-plane rebuild.**
- **Control-plane / setup-UI / engine change** → rebuild + publish the
  control-plane image; the next boot rolls to it from the registry. **No ISO.**
- **ISO re-burn only** for: k3s / base OS / Flux controllers / systemd units / the
  autoinstall seed / the baked baseline control-plane image.

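
The three rebuild rules can be restated as a small lookup. This is only a restatement for orientation: the change-kind strings and the `rebuildPlan` helper are invented labels, not identifiers from the repo or the Argo workflow.

```javascript
// Change kinds are illustrative labels for the three bullets above.
const CONTROL_PLANE_CHANGES = new Set(["control-plane", "setup-ui", "engine"]);
const ISO_CHANGES = new Set([
  "k3s",
  "base-os",
  "flux-controllers",
  "systemd-units",
  "autoinstall-seed",
  "baked-control-plane-image",
]);

// Returns which artifacts a set of changes forces you to rebuild.
function rebuildPlan(changes) {
  const plan = { appRelease: false, controlPlaneImage: false, iso: false };
  for (const change of changes) {
    if (change === "app-image") plan.appRelease = true;
    else if (CONTROL_PLANE_CHANGES.has(change)) plan.controlPlaneImage = true;
    else if (ISO_CHANGES.has(change)) plan.iso = true;
    else throw new Error(`unknown change kind: ${change}`);
  }
  return plan;
}
```
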
See [`ee/appliance/docs/registry-metadata-design.md`](../../appliance/docs/registry-metadata-design.md)
for the full contract (publish side, consume side, decisions).

---

## 5. Quick file index

| I need to… | Look at |
|---|---|
| Understand the package | `ee/appliance/README.md` |
| See the boot path | `ee/appliance/systemd/`, `scripts/bootstrap-control-plane.sh` |
| Change the setup wizard | `ee/appliance/status-ui/app/setup/page.tsx` |
| Change the setup API / flow | `ee/appliance/host-service/server.mjs`, `setup-engine.mjs` |
| Change what the app stack is / Helm values | `ee/appliance/flux/base/`, `flux/profiles/single-node/values/` |
| Pin a new app image | `~/nm-kube-config/alga-psa/workflows/composite/alga-psa-build-migrate-deploy.yaml` (`appliance-release-source-ref` / `appliance-release-version`) |
| Publish a release to ghcr | Argo workflow with `publish-appliance-release=true` |
| Rebuild the control-plane image | `ee/appliance/control-plane/Dockerfile` |
| Build the ISO | `ee/appliance/ubuntu-iso/scripts/build-ubuntu-appliance-iso.sh` |
| Diagnose a running box | `ee/appliance/host-service/status-engine.mjs`, `operator/`, `bin/alga-control-plane-reapply` |
| Understand the release model | `ee/appliance/docs/registry-metadata-design.md` |