

# AlgaPSA MCP Server — Design
**Date:** 2026-06-06
**Branch:** `feature/alga-mcp-server`
**Status:** Approved design (brainstorming output) → feeds PRD/feature/test plan in this directory.
---
## 1. Intent
Expose AlgaPSA to AI agents over the Model Context Protocol (MCP). AlgaPSA acts as an MCP **server** only (not a client). MCP is the human/agent-initiated **pull** surface; the event-driven routing engine remains the system-initiated **push** surface — out of scope here.
The strategic stance: the protocol and basic local access are free/open (adoption funnel, reconstructable from the open API anyway); the monetizable value is **governance + managed hosting** around agent access.
## 2. The central reframe (why this design diverges from a naive adapter)
The source product description (§4) implied one richly-described MCP tool per entity×operation: ~40+ tools, each description a "product surface." **That is the documented anti-pattern.** Real MCP servers reach 50 to 400 tools = 55K to 400K+ tokens of definitions loaded before the agent reads a request; context chokes and tool-selection accuracy *drops*.
Current state of the art (Anthropic ["Code execution with MCP"](https://www.anthropic.com/engineering/code-execution-with-mcp), Nov 2025; ["Advanced tool use / Tool Search Tool"](https://www.anthropic.com/engineering/advanced-tool-use), Jan 2026; MCP-Zero; meta-tool pattern) converges on **progressive disclosure**: expose a tiny constant surface; let the agent *search* for the capability it needs and pull only that schema on demand. Reported savings: 85% to 98.7% fewer tokens, with accuracy going *up*.
**Key discovery:** AlgaPSA already built this. The EE chat assistant
(`ee/server/src/services/chatCompletionsService.ts` + `ee/server/src/chat/registry/`)
is mechanically a progressive-disclosure engine wired to an internal LLM loop instead of an MCP transport:
- A **registry generated from the OpenAPI spec** (`apiRegistry.generated.ts`) — every endpoint carries `displayName`, `description`, `parameters`, request/response schemas, plus governance metadata: **`rbacResource` and `approvalRequired` per endpoint**, and curated `examples`/`playbooks` (YAML-overridable).
- A **ranked search** over it (`chat/registry/search.ts`) — intent detection + token scoring, returns top-N. **Pure TypeScript, zero EE dependencies.**
- The **exact meta-tool surface** the SOTA recommends (`buildToolDefinitions`): `search_api_registry`, `search_business_data`, `call_api_endpoint` (+ a loop-only `finish_response`).
- The **read-auto / mutation-gated** split already designed into `call_api_endpoint`'s description.
- Identity threading via `TemporaryApiKeyService.issueForAiSession()`.
So the MCP server is **~80% existing engine + a thin transport**, not new business logic.
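As a rough illustration of the ranked search described above, the sketch below scores registry entries by token overlap and returns the top N. It is not the contents of `chat/registry/search.ts`: the `RegistryEntry` shape, the score weights, and the sample endpoint ids and paths are all assumptions made for illustration, and intent detection is left out.

```typescript
// Hypothetical, simplified registry entry; the generated registry carries more fields.
interface RegistryEntry {
  id: string;
  displayName: string;
  description: string;
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string;
}

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Assumed weighting: a hit in the display name counts 3, a hit in the description counts 1.
function scoreEntry(entry: RegistryEntry, queryTokens: string[]): number {
  const nameTokens = new Set(tokenize(entry.displayName));
  const descTokens = new Set(tokenize(entry.description));
  let score = 0;
  for (const token of queryTokens) {
    if (nameTokens.has(token)) score += 3;
    else if (descTokens.has(token)) score += 1;
  }
  return score;
}

// Return at most `limit` entries with a non-zero score, best first.
function searchRegistry(entries: RegistryEntry[], query: string, limit = 5): RegistryEntry[] {
  const queryTokens = tokenize(query);
  return entries
    .map((entry) => ({ entry, score: scoreEntry(entry, queryTokens) }))
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((scored) => scored.entry);
}

// Sample data: ids and paths are invented for this example.
const sampleRegistry: RegistryEntry[] = [
  { id: "tickets.list", displayName: "List tickets", description: "Return tickets visible to the caller", method: "GET", path: "/api/v1/tickets" },
  { id: "tickets.create", displayName: "Create ticket", description: "Open a new ticket for a client", method: "POST", path: "/api/v1/tickets" },
  { id: "invoices.list", displayName: "List invoices", description: "Return invoices for billing review", method: "GET", path: "/api/v1/invoices" },
];

const hits = searchRegistry(sampleRegistry, "create a ticket", 2);
console.log(hits.map((h) => h.id)); // [ 'tickets.create' ]
```

Only the matching descriptor reaches the agent's context; the other entries never leave the server, which is the point of progressive disclosure.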
## 3. Tool surface — 3 constant meta-tools
The MCP surface is **3 tools, constant, independent of API size.** No per-endpoint tools.
| MCP tool | Purpose | Execution |
|---|---|---|
| `search_api_registry(query, limit)` | Ranked search over the endpoint catalog; returns top-N descriptors (id, name, params, schema, examples) | read-only, immediate |
| `search_business_data(query, types)` | Cross-entity record search → `GET /api/v1/search`, ACL-scoped | read-only, immediate |
| `call_api_endpoint(entryId, path?, query?, body?)` | Execute the chosen endpoint | read auto; mutation gated (EE remote only) |
The agent loop is run by the **client** (Claude Desktop/Cursor), not by AlgaPSA: `search_api_registry` → read one schema → `call_api_endpoint`. `finish_response` is dropped (server-loop artifact; in MCP the host model ends its own turn).
- `call_api_endpoint`'s description is **edition-templated**: in the CE local connector there is *no* approval (the agent acts under the user's own token + RBAC; the user's MCP client is itself the human-in-the-loop). The approval clause only becomes real on the EE remote path.
- **MCP Resources are out of scope.** Progressive disclosure subsumes them — any read is reachable via search + `call_api_endpoint`, so a parallel resource surface is redundant maintenance. Revisit only if a specific client needs @-mention/attach UX.
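The constant three-tool surface and the edition-templated description could look roughly like the sketch below. It uses the `name` / `description` / `inputSchema` layout of MCP tool definitions; the function name `sketchToolDefinitions`, the description wording, and the JSON Schema details are assumptions, not the existing `buildToolDefinitions` output.

```typescript
type Edition = "ce" | "ee";

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required: string[];
  };
}

// Hypothetical builder: always returns the same three tools, whatever the API size.
function sketchToolDefinitions(edition: Edition): ToolDefinition[] {
  // Edition-templated clause: only the EE remote path mentions approval.
  const mutationClause =
    edition === "ee"
      ? "Reads run immediately; mutations may be held for human approval."
      : "Calls run under your own API token and RBAC permissions.";

  return [
    {
      name: "search_api_registry",
      description: "Search the endpoint catalog and return the top matching descriptors.",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" }, limit: { type: "integer", minimum: 1 } },
        required: ["query"],
      },
    },
    {
      name: "search_business_data",
      description: "Search records across entity types, scoped to what the caller may see.",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" }, types: { type: "array", items: { type: "string" } } },
        required: ["query"],
      },
    },
    {
      name: "call_api_endpoint",
      description: `Execute a registry entry found via search_api_registry. ${mutationClause}`,
      inputSchema: {
        type: "object",
        properties: {
          entryId: { type: "string" },
          path: { type: "object" },
          query: { type: "object" },
          body: { type: "object" },
        },
        required: ["entryId"],
      },
    },
  ];
}

const ceTools = sketchToolDefinitions("ce");
const eeTools = sketchToolDefinitions("ee");
console.log(ceTools.map((t) => t.name));
```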
## 4. Architecture — one engine, two transports, shared with chat
```
            packages/agent-tooling/      ← NEW shared CE package (the "engine")
   ┌─────────────────────────────────────────────────┐
   │ registry/   generated from alga-openapi.<ed>.json (CE + EE)
   │ search.ts   ranked search (moved as-is, already pure)
   │ invoke/     build request from a registry entry → {method,path,...}
   │ tool-defs/  the 3 meta-tool schemas + descriptions
   └─────────────────────────────────────────────────┘
          ▲                      ▲                        ▲
  EE chat assistant      CE local connector       EE remote server
  (re-pointed onto       stdio transport          Streamable HTTP + OAuth,
   the package)          runs ON workstation      governance; embedded in
  runs loop server-      calls instance           server app (appliance/SaaS)
  side, mints temp       /api/v1 w/ user token
  keys from session
  └──────────── all dispatch → existing /api/v1 (no new business logic) ─────────────┘
```
**Boundaries:**
- `packages/agent-tooling` holds only *mechanism* (registry, search, request-building, the 3 tool-def schemas). No LLM code, no transport, no governance → CE-safe and reusable. `search.ts` + schema move essentially as-is.
- **Dispatch splits by caller.** The chat assistant's temp-key-from-session path **stays in EE**. The local connector already holds the user's API token → calls `/api/v1` directly, no temp-key machinery. The package exposes request-*building*; each consumer owns request-*sending* + auth.
- The existing EE chat assistant is **re-pointed** onto the package — the one place shipped code is touched; carries regression risk; must retest existing chat behavior.
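The request-*building* half of that split might look like the sketch below: a pure function that turns a registry entry plus the agent's arguments into `{method, path, body}` and leaves sending and auth to the caller. The `EndpointEntry` shape, the `{param}` path-template syntax, and the sample entry are assumptions for illustration.

```typescript
type HttpMethod = "GET" | "POST" | "PUT" | "PATCH" | "DELETE";

// Hypothetical minimal slice of a registry entry.
interface EndpointEntry {
  id: string;
  method: HttpMethod;
  path: string; // assumed OpenAPI-style template, e.g. "/api/v1/tickets/{ticketId}"
}

interface CallArgs {
  path?: Record<string, string>;
  query?: Record<string, string | number | boolean>;
  body?: unknown;
}

interface BuiltRequest {
  method: HttpMethod;
  path: string;
  body?: unknown;
}

// Builds a request description only; each consumer sends it with its own auth.
function buildRequest(entry: EndpointEntry, args: CallArgs): BuiltRequest {
  // Fail fast on a missing path parameter so the agent gets an actionable message.
  const filledPath = entry.path.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    const value = args.path?.[key];
    if (value === undefined) {
      throw new Error(`Missing path parameter "${key}" for entry ${entry.id}`);
    }
    return encodeURIComponent(value);
  });

  const queryString = Object.entries(args.query ?? {})
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join("&");

  const built: BuiltRequest = {
    method: entry.method,
    path: queryString ? `${filledPath}?${queryString}` : filledPath,
  };
  // Only attach a body for methods that normally carry one.
  if (args.body !== undefined && entry.method !== "GET" && entry.method !== "DELETE") {
    built.body = args.body;
  }
  return built;
}

const builtExample = buildRequest(
  { id: "tickets.get", method: "GET", path: "/api/v1/tickets/{ticketId}" },
  { path: { ticketId: "T 42" }, query: { include: "comments" } },
);
console.log(builtExample.path); // /api/v1/tickets/T%2042?include=comments
```

Keeping this function free of transport and credentials is what lets the chat assistant, the local connector, and the remote server share it.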
## 5. CE / EE seam (diverges deliberately from source spec §3.2/§6)
The source spec had a CE self-hosted remote *base* with only governance gated. **This design tightens that: anything networked is EE.**
| Surface | Edition |
|---|---|
| Local stdio connector (full 3-tool surface, user-scoped via API token) | **CE / free** |
| Shared engine package (`agent-tooling`) | **CE** |
| Remote Streamable HTTP MCP server — the *entire* networked endpoint: OAuth 2.1, multi-client serving, **and** governance | **EE / paid** |
| Managed/hosted remote endpoint (SaaS) | **EE / paid** |
Rationale: "run it yourself on your workstation = free; a networked server many agents connect to = paid" is a crisp, defensible line, and the remote transport is inseparable from the governance/hosting value. The free local connector still provides the full tool surface under the user's identity, honoring "basic access is never gated."
## 6. Phasing
### Phase 1 — Local connector (CE) — *ships first*
- Extract `packages/agent-tooling` from the EE chat code (registry + `search.ts` + request-building + tool-defs).
- Re-point the EE chat assistant onto it (+ regression test).
- Generalize `generate-chat-registry.mjs` to emit **both** CE and EE registries (`alga-openapi.ce.json` / `.ee.json`).
- New server endpoint: **`GET /api/v1/meta/mcp-registry`** serving the generated registry for that instance's edition (precedent: `meta/openapi`, `meta/endpoints` already exist).
- `@alga/mcp-connector`: an `npx`-run Node package on `@modelcontextprotocol/sdk` `StdioServerTransport`, exposing the 3 tools.
- Config via env: `ALGA_INSTANCE_URL` + `ALGA_API_TOKEN` (an existing `api_keys` key; no new auth).
- Startup: fetch registry from the instance (source of truth for version + edition). Decision: **fetch from instance**, not bundle (avoids drift across a heterogeneous self-hosted fleet).
- Dispatch: `search_api_registry` → in-memory search; `search_business_data` → `/api/v1/search`; `call_api_endpoint` → build request + send with the user's token.
- Identity = the user's token → inherits RBAC/ABAC. No agent identity, no approval, no governance (intentional, §3.1).
- **Acceptance:** a user configures URL + token and operates AlgaPSA from Claude Desktop under their own permissions.
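A minimal sketch of the connector's startup configuration, assuming it validates the two environment variables up front and derives the registry and search URLs from the instance URL. The function names and error messages are hypothetical; how the token is attached to requests is deliberately not shown because the source does not specify it.

```typescript
interface ConnectorConfig {
  instanceUrl: string;
  apiToken: string;
}

// Takes the environment as a plain record so it can be tested without process.env.
function loadConnectorConfig(env: Record<string, string | undefined>): ConnectorConfig {
  const instanceUrl = (env["ALGA_INSTANCE_URL"] ?? "").trim();
  const apiToken = (env["ALGA_API_TOKEN"] ?? "").trim();
  // Fail fast with actionable errors before any transport is started.
  if (!/^https?:\/\/\S+$/.test(instanceUrl)) {
    throw new Error("ALGA_INSTANCE_URL must be set to an http(s) URL");
  }
  if (apiToken.length === 0) {
    throw new Error("ALGA_API_TOKEN must be set to an existing API key");
  }
  // Strip trailing slashes so URL joins below stay predictable.
  return { instanceUrl: instanceUrl.replace(/\/+$/, ""), apiToken };
}

// The two fixed endpoints named in this phase; call_api_endpoint paths come from the registry.
function connectorUrls(config: ConnectorConfig): { registry: string; search: string } {
  return {
    registry: `${config.instanceUrl}/api/v1/meta/mcp-registry`,
    search: `${config.instanceUrl}/api/v1/search`,
  };
}

// Safe log line: the token itself is never included.
function describeForLog(config: ConnectorConfig): string {
  return `instance=${config.instanceUrl} token=[redacted]`;
}

const exampleConfig = loadConnectorConfig({
  ALGA_INSTANCE_URL: "https://psa.example.com/",
  ALGA_API_TOKEN: "example-token",
});
const exampleUrls = connectorUrls(exampleConfig);
console.log(describeForLog(exampleConfig));
console.log(exampleUrls.registry);
```

In a real connector the argument would be `process.env`; passing a record keeps the validation logic independent of the runtime.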
### Phase 2 — Remote server, MVP governance (EE)
- Streamable HTTP single endpoint (`/api/mcp`) via SDK `StreamableHTTPServerTransport`, embedded in the server app. No legacy HTTP+SSE.
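- **OAuth 2.1** per MCP authorization spec: MCP endpoint is an OAuth resource server; advertises `.well-known/oauth-protected-resource`; auth-code + PKCE; Dynamic Client Registration. AlgaPSA acts as / fronts the authorization server. (Superseded by §10: token issuance is delegated to the tenant IdP, Alga is a resource server only, and DCR is dropped/deferred.)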
- **Agent identity** as a first-class subject: extend `AuthorizationSubject` (already open-shaped, already carries `apiKeyId`) with `agentId` + subject type `'agent'`, admin-provisioned per tenant. Because it's a kernel subject, its permissions are enforced by the existing authz kernel; basic per-agent permissions reuse existing RBAC roles.
- **Audit** of every agent action via existing `auditLog()` / `audit_logs` (identity, tool, inputs, policy decision, result, timestamp), exportable.
- Dispatch runs *inside* AlgaPSA → through the kernel under the agent subject. Reads auto-execute; mutations execute only if agent permissions allow, and everything is audited. (Hold-for-human approval is Phase 3.)
- **Acceptance:** an admin stands up the remote server on an appliance and connects a client over OAuth; agent actions are attributable and audited.
### Phase 3 — Governance depth (EE)
- **Agent-specific ABAC policy** — which agent may invoke which tools, on which resources, under which conditions; add the agent subject type to the kernel's bundle/narrowing policy evaluation.
- **Approval gates (human-in-the-loop)** — registry already carries `approvalRequired`; chat already has a propose→`/api/chat/v1/execute` flow to mirror. New: holding queue, approve/reject UI, timeout policy.
- ⚠️ **Open sub-decision (deferred, needs more thought):** how a *held* mutation resolves over request/response MCP. Candidate shapes: gated call returns a `pending_approval` handle, resolved via Streamable HTTP streaming the eventual result within the timeout, or via a `check_approval(handle)` tool. Not pinned in this section; §10.1 later settles on the pending handle plus a `check_approval` poll.
- **Quotas & rate limits** — per-agent and per-tenant; extend existing `enforceApiRateLimit` (already used for API keys) to agent subjects; structured to later feed metered usage.
- **SSO-bound agent identity** — agent identity provisioned/bound via the tenant's IdP.
- **Acceptance (§9 EE):** an admin defines a policy restricting an agent to read-only on billing data, requires approval for bulk ticket closes, and gets an exportable audit trail of all agent actions.
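The acceptance scenario above can be pictured as a small first-match rule list. This is only a toy model under stated assumptions: the `PolicyRule` shape, the default-deny fallback, and the entry ids are invented here, and the authz kernel's actual bundle/narrowing evaluation is not modeled.

```typescript
type PolicyDecision = "allow" | "deny" | "require_approval";

interface AgentAction {
  agentId: string;
  resource: string;
  operation: "read" | "write";
  entryId: string;
}

// Hypothetical rule: any omitted field acts as a wildcard.
interface PolicyRule {
  resource?: string;
  operation?: "read" | "write";
  entryId?: string;
  decision: PolicyDecision;
}

// First matching rule wins; anything unmatched is denied (assumed default).
function evaluatePolicy(rules: PolicyRule[], action: AgentAction): PolicyDecision {
  for (const rule of rules) {
    const matches =
      (rule.resource === undefined || rule.resource === action.resource) &&
      (rule.operation === undefined || rule.operation === action.operation) &&
      (rule.entryId === undefined || rule.entryId === action.entryId);
    if (matches) return rule.decision;
  }
  return "deny";
}

// Read-only on billing; bulk ticket closes need approval; other ticket work is allowed.
const agentRules: PolicyRule[] = [
  { resource: "billing", operation: "read", decision: "allow" },
  { resource: "billing", decision: "deny" },
  { resource: "ticket", entryId: "tickets.bulk_close", decision: "require_approval" },
  { resource: "ticket", decision: "allow" },
];

const billingRead = evaluatePolicy(agentRules, { agentId: "agent-1", resource: "billing", operation: "read", entryId: "invoices.list" });
const billingWrite = evaluatePolicy(agentRules, { agentId: "agent-1", resource: "billing", operation: "write", entryId: "invoices.update" });
const bulkClose = evaluatePolicy(agentRules, { agentId: "agent-1", resource: "ticket", operation: "write", entryId: "tickets.bulk_close" });
console.log(billingRead, billingWrite, bulkClose); // allow deny require_approval
```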
## 7. Cross-cutting
- **No business logic in MCP code** — every path terminates at `/api/v1` (Phase 1) or kernel→API dispatch (Phase 2+). MCP layer only discovers, builds, dispatches, audits.
- **Edition gating** via existing `isEnterpriseEdition()` / `getFeatureImplementation()`. Remote + governance in `ee/`; `agent-tooling` + connector are CE.
- **Fail-fast** per repo standards: validate inputs early, throw actionable errors. But tool *execution* errors surface to the agent as structured tool errors (not thrown) so the model can recover.
- **Security:** token never logged; registry endpoint requires auth; OAuth scopes map to agent permissions; audit append-only.
- **Testing (80/20):** invest in the few tests that de-risk the most — search ranking, request-building from a registry entry, dual-edition registry generation, and the chat-assistant regression after re-pointing. One MCP-protocol conformance check per transport. EE: OAuth flow + agent-subject authz + audit-completeness. Do **not** exhaustively unit-test thin pass-throughs.
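The fail-fast versus recoverable-error distinction could be sketched as a wrapper that converts a thrown execution error into a structured tool result. The result shape mirrors the MCP tool-result convention of a `content` array plus an `isError` flag; the wrapper name and message format are assumptions, and it is synchronous only to keep the example short.

```typescript
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Runs a tool body and reports failures as data instead of letting them propagate.
function runToolSafely(execute: () => unknown): ToolResult {
  try {
    const value = execute();
    // `?? null` keeps JSON.stringify from returning undefined for void results.
    return { content: [{ type: "text", text: JSON.stringify(value ?? null) }] };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return {
      content: [{ type: "text", text: `Tool execution failed: ${message}` }],
      isError: true,
    };
  }
}

const okResult = runToolSafely(() => ({ ticketId: "T-1", state: "open" }));
const failedResult = runToolSafely(() => {
  throw new Error("403 Forbidden: missing permission to update tickets");
});
console.log(okResult.content[0]?.text);
console.log(failedResult.isError, failedResult.content[0]?.text);
```

Because the failure comes back as an ordinary result, the host model can read the message and retry with a different endpoint or ask the user, rather than seeing the call abort.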
## 8. Decisions log (divergences + commitments)
1. **Progressive disclosure, not per-endpoint tools** — 3 constant meta-tools. (Reframe of source §4.)
2. **Reuse the existing chat engine** by extracting it to a shared CE package `agent-tooling`. (Not greenfield.)
3. **Anything networked is EE** — the remote server in its entirety, not just governance. (Tightens source §3.2/§6.)
4. **MCP Resources dropped from scope** — subsumed by progressive disclosure.
5. **Registry fetched from the instance**, not bundled into the connector.
6. **Local connector uses the existing `api_keys` mechanism**, no new token type.
7. **Phase order:** CE local first, then EE remote (MVP governance), then governance depth.
8. **Deferred:** the approval-gate request/response mechanism (Phase 3 open sub-decision; since resolved in §10.1).
## 9. Open questions for implementation
- ~~AlgaPSA-as-authorization-server vs. delegating to tenant IdP~~ → **RESOLVED in §10: delegate to tenant IdP.**
- ~~Approval-gate resolution mechanism~~ → **RESOLVED in §10: pending-handle + `check_approval` poll (Phase 3).**
- Whether `search_business_data` ACL semantics via `/api/v1/search` exactly match the chat assistant's internal ACL path, or need reconciliation.
## 10. Phase 2 design addendum — remote server, identity & auth (decided 2026-06-06)
Grounded in the current MCP authorization spec (2025-11 revision): an MCP server is an **OAuth 2.1 resource server only** — it validates bearer tokens from a separate authorization server, MUST serve Protected Resource Metadata (RFC 9728), and clients bind tokens to the resource via Resource Indicators (RFC 8707). Alga has **no** authorization server today (NextAuth relying-party only).
### 10.1 Decisions
1. **OAuth = delegate to the tenant IdP.** The MCP server is purely a **resource server**. Token issuance is the tenant's existing IdP (Entra / Google / Keycloak — the same providers EE SSO already integrates). Alga validates tokens (issuer + audience + resource indicator + signature via the IdP's JWKS) and maps the token's client/subject claim to an Alga agent. **No Alga-as-AS.**
- **Accepted constraint:** a remote MCP server therefore *requires the tenant to have an IdP*. A bare appliance with no IdP cannot run the remote server (it can still run the free local connector). Document this prominently; offer Keycloak as the appliance IdP option.
- This pulls **"SSO-bound agent identity" (was Phase 3 / F042) into the Phase 2 core** — the IdP binding *is* how agents authenticate.
2. **Agent identity = first-class `agents` table.** A real principal, not an api-key alias.
3. **Approval over MCP = pending-handle + `check_approval` tool (poll)** — Phase 3; decouples the Alga-admin approver from the agent; robust on all clients.
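Decision 3 can be illustrated with an in-memory holding queue: a gated call returns a handle, an approver decides out of band, and the agent polls. The class, handle format, and timeout handling below are assumptions for illustration; a real implementation would persist held calls and use unguessable handles.

```typescript
type ApprovalState = "pending" | "approved" | "rejected" | "expired";

interface HeldCall {
  entryId: string;
  createdAtMs: number;
  state: ApprovalState;
}

// Hypothetical in-memory queue; time is passed in explicitly to keep it deterministic.
class ApprovalQueue {
  private readonly held = new Map<string, HeldCall>();
  private readonly timeoutMs: number;
  private counter = 0;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // Called when a gated mutation arrives: park it and hand back a handle.
  hold(entryId: string, nowMs: number): string {
    this.counter += 1;
    const handle = `approval-${this.counter}`;
    this.held.set(handle, { entryId, createdAtMs: nowMs, state: "pending" });
    return handle;
  }

  // Called from the approve/reject UI; ignored once the call is no longer pending.
  decide(handle: string, decision: "approved" | "rejected", nowMs: number): void {
    if (this.check(handle, nowMs) === "pending") {
      this.lookup(handle).state = decision;
    }
  }

  // What a check_approval(handle) tool would return to the polling agent.
  check(handle: string, nowMs: number): ApprovalState {
    const call = this.lookup(handle);
    if (call.state === "pending" && nowMs - call.createdAtMs >= this.timeoutMs) {
      call.state = "expired";
    }
    return call.state;
  }

  private lookup(handle: string): HeldCall {
    const call = this.held.get(handle);
    if (call === undefined) throw new Error(`Unknown approval handle: ${handle}`);
    return call;
  }
}

const approvals = new ApprovalQueue(60000);
const firstHandle = approvals.hold("tickets.bulk_close", 0);
const beforeDecision = approvals.check(firstHandle, 1000);
approvals.decide(firstHandle, "approved", 2000);
const afterDecision = approvals.check(firstHandle, 3000);
const secondHandle = approvals.hold("tickets.bulk_close", 0);
const afterTimeout = approvals.check(secondHandle, 60000);
console.log(beforeDecision, afterDecision, afterTimeout); // pending approved expired
```

Polling keeps the agent's connection and the approver's session independent, which is why it works the same way on any client.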
### 10.2 Auth flow (resource-server)
```
MCP client → GET /api/mcp (no token)
           ← 401 + WWW-Authenticate: resource_metadata="…/.well-known/oauth-protected-resource"
MCP client → reads PRM → authorization_servers = [tenant IdP]
           → obtains token from the IdP (client-credentials / service principal for a machine agent),
             with resource indicator = the Alga MCP resource URL
MCP client → GET /api/mcp (Authorization: Bearer <IdP JWT>)
Alga MCP   → validate: issuer ∈ tenant's configured IdPs, aud/resource = this server,
             signature via cached JWKS, not expired
           → extract client_id / sub claim → look up agents.idp_subject → agent principal
           → build AuthorizationSubject{ agentId, subjectType:'agent', tenant, … }
           → dispatch through the existing authz kernel → /api/v1; audit every call
```
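The claim checks in the validate step might be sketched as follows. This covers only issuer, audience, expiry, and subject extraction on an already-decoded payload; it assumes the signature was verified against the IdP's JWKS beforehand, which is not shown. The type names, the example URLs, and the `client_id` → `azp` → `sub` precedence are assumptions.

```typescript
// Decoded JWT payload fields relevant to the flow above.
interface TokenClaims {
  iss: string;
  aud: string | string[];
  exp: number; // seconds since epoch
  sub?: string;
  azp?: string;
  client_id?: string;
}

// Hypothetical per-tenant resource-server settings.
interface ResourceServerConfig {
  trustedIssuers: string[];
  resourceUrl: string;
}

type ClaimCheck =
  | { ok: true; idpIssuer: string; idpSubject: string }
  | { ok: false; reason: string };

// ASSUMPTION: signature verification has already succeeded before this is called.
function checkClaims(claims: TokenClaims, config: ResourceServerConfig, nowSeconds: number): ClaimCheck {
  if (!config.trustedIssuers.includes(claims.iss)) {
    return { ok: false, reason: "untrusted issuer" };
  }
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!audiences.includes(config.resourceUrl)) {
    return { ok: false, reason: "audience does not match this resource" };
  }
  if (claims.exp <= nowSeconds) {
    return { ok: false, reason: "token expired" };
  }
  // Assumed precedence for the claim that identifies the agent at the IdP.
  const idpSubject = claims.client_id ?? claims.azp ?? claims.sub;
  if (idpSubject === undefined) {
    return { ok: false, reason: "no client or subject claim" };
  }
  return { ok: true, idpIssuer: claims.iss, idpSubject };
}

const tenantConfig: ResourceServerConfig = {
  trustedIssuers: ["https://idp.example.com"],
  resourceUrl: "https://psa.example.com/api/mcp",
};

const accepted = checkClaims(
  { iss: "https://idp.example.com", aud: "https://psa.example.com/api/mcp", exp: 2000, client_id: "triage-agent" },
  tenantConfig,
  1000,
);
const wrongAudience = checkClaims(
  { iss: "https://idp.example.com", aud: "https://other.example.com", exp: 2000, sub: "triage-agent" },
  tenantConfig,
  1000,
);
console.log(accepted, wrongAudience);
```

The `(idpIssuer, idpSubject)` pair returned on success is what the next step would use to look up the agent row.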
### 10.3 Agent identity model
- **`agents` table** (per tenant): `agent_id` (uuid PK), `tenant`, `name`, `description`, `active`, `created_by`, `created_at`, plus the **IdP binding**: `idp_issuer`, `idp_subject` (the `sub`/`azp`/`client_id` claim that identifies this agent in the tenant IdP). Unique on `(tenant, idp_issuer, idp_subject)`.
- **Credentials:** primary path is the IdP token (above). Optionally, an Alga-issued agent credential reuses `api_keys` with a new nullable `agent_id` (and `user_id` made nullable for agent keys) — useful for non-OAuth/dev access and to let the *local connector* act as a registered agent later. Not required for the IdP path.
- **`AuthorizationSubject`** gains `agentId?: string` and `subjectType?: 'user' | 'agent'` (default `'user'`). `buildAuthorizationPrincipalSubject` grows an **agent branch**: given a resolved agent, assemble a subject with the agent's assigned roles/permissions (Phase 2 reuses RBAC roles; Phase 3 adds agent-specific ABAC bundles).
- **Authz kernel unchanged** — it already evaluates whatever subject it's handed; we only teach the *subject builder* about agents.
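A sketch of the agent lookup and the agent branch of the subject builder, under stated assumptions: the `AgentRecord` fields follow the table described above, but the `AuthorizationSubject` shape beyond `agentId` and `subjectType`, the `roles` field, and both function names are hypothetical stand-ins rather than the existing `buildAuthorizationPrincipalSubject`.

```typescript
// Mirrors the columns listed for the agents table (subset).
interface AgentRecord {
  agent_id: string;
  tenant: string;
  name: string;
  active: boolean;
  idp_issuer: string;
  idp_subject: string;
}

// Hypothetical subject shape; only agentId and subjectType come from the design text.
interface AuthorizationSubject {
  tenant: string;
  roles: string[];
  userId?: string;
  apiKeyId?: string;
  agentId?: string;
  subjectType?: "user" | "agent";
}

// Resolve the agent by the unique (tenant, idp_issuer, idp_subject) triple.
function findAgent(
  agents: AgentRecord[],
  tenant: string,
  idpIssuer: string,
  idpSubject: string,
): AgentRecord | undefined {
  return agents.find(
    (a) => a.tenant === tenant && a.idp_issuer === idpIssuer && a.idp_subject === idpSubject,
  );
}

// Agent branch: an inactive agent never yields a subject.
function buildAgentSubject(agent: AgentRecord, assignedRoles: string[]): AuthorizationSubject {
  if (!agent.active) {
    throw new Error(`Agent ${agent.agent_id} is inactive`);
  }
  return {
    tenant: agent.tenant,
    roles: [...assignedRoles],
    agentId: agent.agent_id,
    subjectType: "agent",
  };
}

const agentRows: AgentRecord[] = [
  { agent_id: "a-1", tenant: "t-1", name: "Triage agent", active: true, idp_issuer: "https://idp.example.com", idp_subject: "triage-agent" },
  { agent_id: "a-2", tenant: "t-1", name: "Retired agent", active: false, idp_issuer: "https://idp.example.com", idp_subject: "old-agent" },
];

const resolvedAgent = findAgent(agentRows, "t-1", "https://idp.example.com", "triage-agent");
const agentSubject = resolvedAgent ? buildAgentSubject(resolvedAgent, ["technician"]) : undefined;
console.log(agentSubject);
```

The kernel would then evaluate this subject like any other, which is the sense in which only the subject builder learns about agents.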
### 10.4 Reuse map (minimal new surface)
- **Transport:** SDK `StreamableHTTPServerTransport`, single `/api/mcp` route, EE-gated via `isEnterpriseEdition()`. Tool handlers reuse the connector's `search/call` logic but dispatch **in-process through the kernel** (not HTTP-to-self) under the agent subject.
- **JWKS/JWT validation:** reuse `@auth/core/jwt` + `getSecretProviderInstance`; cache JWKS per IdP. IdP config reuses EE `providerConfig` / `ssoProviders` (per-tenant).
- **Audit:** reuse `auditLog()` / `audit_logs` — one row per tool invocation (agent_id, tool, inputs, decision, result, ts).
- **Rate limiting:** reuse `enforceApiRateLimit` with `rateLimitSubjectId = agent_id`.
- **Registry:** the EE registry already served by `GET /api/v1/meta/mcp-registry` (Phase 1).
### 10.5 Revised feature interpretation (Phase 2)
- **F024 (PRM):** serve `/.well-known/oauth-protected-resource` advertising the tenant's IdP as the `authorization_servers`. (Core.)
- **F025 (token flow):** **validate IdP tokens** (issuer/aud/resource/JWKS), not run an Alga auth-code+PKCE flow. (Re-scoped.)
- **F026 (DCR):** **dropped/deferred** — DCR is downgraded to optional in the spec, and with IdP delegation client registration happens at the IdP, not Alga.
- **F042 (SSO-bound identity):** **pulled into Phase 2 core** (it's the auth mechanism), not a Phase-3 add-on.
- **F027-F033** (agent subject, provisioning, mapping, per-agent RBAC, kernel dispatch, audit, export) unchanged in intent.
### 10.6 What still needs live infra / can't be unit-tested here
- A real tenant IdP for the full token round-trip. Unit-testable now: JWKS signature validation (mock JWKS), claim→agent mapping, the `agents` migration + subject builder, EE-gating. End-to-end needs a live IdP (or a mock OAuth server in integration tests).