AlgaPSA MCP Server — Design
Date: 2026-06-06
Branch: feature/alga-mcp-server
Status: Approved design (brainstorming output) → feeds PRD/feature/test plan in this directory.
1. Intent
Expose AlgaPSA to AI agents over the Model Context Protocol (MCP). AlgaPSA acts as an MCP server only (not a client). MCP is the human/agent-initiated pull surface; the event-driven routing engine remains the system-initiated push surface — out of scope here.
The strategic stance: the protocol and basic local access are free/open (adoption funnel, reconstructable from the open API anyway); the monetizable value is governance + managed hosting around agent access.
2. The central reframe (why this design diverges from a naive adapter)
The source product description (§4) implied one richly-described MCP tool per entity×operation — ~40+ tools, each description a "product surface." That is the documented anti-pattern. Real MCP servers reach 50–400 tools = 55K–400K+ tokens of definitions loaded before the agent reads a request; context chokes and tool-selection accuracy drops.
Current state of the art (Anthropic "Code execution with MCP", Nov 2025; "Advanced tool use / Tool Search Tool", Jan 2026; MCP-Zero; meta-tool pattern) converges on progressive disclosure: expose a tiny constant surface; let the agent search for the capability it needs and pull only that schema on demand. Reported savings: 85–98.7% fewer tokens, with accuracy going up.
Key discovery: AlgaPSA already built this. The EE chat assistant (`ee/server/src/services/chatCompletionsService.ts` + `ee/server/src/chat/registry/`) is mechanically a progressive-disclosure engine wired to an internal LLM loop instead of an MCP transport:
- A registry generated from the OpenAPI spec (`apiRegistry.generated.ts`): every endpoint carries `displayName`, `description`, `parameters`, and request/response schemas, plus governance metadata (`rbacResource` and `approvalRequired` per endpoint) and curated `examples`/`playbooks` (YAML-overridable).
- A ranked search over it (`chat/registry/search.ts`): intent detection + token scoring, returns top-N. Pure TypeScript, zero EE dependencies.
- The exact meta-tool surface the SOTA recommends (`buildToolDefinitions`): `search_api_registry`, `search_business_data`, `call_api_endpoint` (+ a loop-only `finish_response`).
- The read-auto / mutation-gated split already designed into `call_api_endpoint`'s description.
- Identity threading via `TemporaryApiKeyService.issueForAiSession()`.

So the MCP server is ~80% existing engine + a thin transport, not new business logic.
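The ranked search over the registry can be pictured with a minimal sketch. The entry shape, the scoring weights, the sample paths, and the name `searchRegistry` are illustrative assumptions rather than the contents of `chat/registry/search.ts`, and intent detection is left out.

```typescript
// Hypothetical registry entry; field names follow the metadata listed above.
interface RegistryEntry {
  id: string;
  displayName: string;
  description: string;
  method: "GET" | "POST" | "PUT" | "PATCH" | "DELETE";
  path: string;
  rbacResource?: string;
  approvalRequired?: boolean;
}

// Lowercase the text and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Assumed weights: a hit in the name counts more than a hit in the description.
function scoreEntry(entry: RegistryEntry, queryTokens: string[]): number {
  const nameTokens = new Set(tokenize(entry.displayName));
  const descTokens = new Set(tokenize(entry.description));
  let score = 0;
  for (const token of queryTokens) {
    if (nameTokens.has(token)) score += 3;
    if (descTokens.has(token)) score += 1;
  }
  return score;
}

// Return the top-N entries with a positive score, best first.
function searchRegistry(entries: RegistryEntry[], query: string, limit = 5): RegistryEntry[] {
  const queryTokens = tokenize(query);
  return entries
    .map((entry) => ({ entry, score: scoreEntry(entry, queryTokens) }))
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((scored) => scored.entry);
}

// Sample data with made-up endpoint paths.
const sampleEntries: RegistryEntry[] = [
  { id: "tickets.list", displayName: "List tickets", description: "Return tickets for the tenant", method: "GET", path: "/api/v1/tickets" },
  { id: "tickets.close", displayName: "Close ticket", description: "Close a ticket by id", method: "POST", path: "/api/v1/tickets/{id}/close", approvalRequired: true },
  { id: "invoices.list", displayName: "List invoices", description: "Return invoices", method: "GET", path: "/api/v1/invoices" },
];
const hits = searchRegistry(sampleEntries, "close ticket", 2);
```

Only the descriptors returned by such a search would enter the agent's context, which is where the token savings come from.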
3. Tool surface — 3 constant meta-tools
The MCP surface is 3 tools, constant, independent of API size. No per-endpoint tools.
| MCP tool | Purpose | Execution |
|---|---|---|
| `search_api_registry(query, limit)` | Ranked search over the endpoint catalog; returns top-N descriptors (id, name, params, schema, examples) | read-only, immediate |
| `search_business_data(query, types)` | Cross-entity record search → `GET /api/v1/search`, ACL-scoped | read-only, immediate |
| `call_api_endpoint(entryId, path?, query?, body?)` | Execute the chosen endpoint | read auto; mutation gated (EE remote only) |
The agent loop is run by the client (Claude Desktop/Cursor), not by AlgaPSA: `search_api_registry` → read one schema → `call_api_endpoint`. `finish_response` is dropped (server-loop artifact; in MCP the host model ends its own turn).
- `call_api_endpoint`'s description is edition-templated: in the CE local connector there is no approval (the agent acts under the user's own token + RBAC; the user's MCP client is itself the human-in-the-loop). The approval clause only becomes real on the EE remote path.
- MCP Resources are out of scope. Progressive disclosure subsumes them: any read is reachable via search + `call_api_endpoint`, so a parallel resource surface is redundant maintenance. Revisit only if a specific client needs @-mention/attach UX.
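As a rough illustration of the constant three-tool surface and the edition-templated description, the sketch below builds MCP tool definitions (`name`, `description`, `inputSchema`). The description wording, the schema details, and the function names are assumptions; the real text lives in the existing `buildToolDefinitions`.

```typescript
type Edition = "ce" | "ee";

interface McpToolDefinition {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, { type: string; description?: string }>;
    required: string[];
  };
}

// Assumed wording: only the EE remote path mentions approval.
function callEndpointDescription(edition: Edition): string {
  const base = "Execute a registry endpoint found via search_api_registry. Reads run immediately.";
  return edition === "ee"
    ? `${base} Mutations may be held for approval.`
    : `${base} Mutations run under the caller's own token and RBAC.`;
}

// The surface stays at three tools no matter how many endpoints the API has.
function buildMcpToolDefinitions(edition: Edition): McpToolDefinition[] {
  return [
    {
      name: "search_api_registry",
      description: "Search the endpoint catalog and return the top matching descriptors.",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" }, limit: { type: "integer" } },
        required: ["query"],
      },
    },
    {
      name: "search_business_data",
      description: "Search records across entity types, scoped to the caller's access.",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" }, types: { type: "array" } },
        required: ["query"],
      },
    },
    {
      name: "call_api_endpoint",
      description: callEndpointDescription(edition),
      inputSchema: {
        type: "object",
        properties: {
          entryId: { type: "string" },
          path: { type: "object" },
          query: { type: "object" },
          body: { type: "object" },
        },
        required: ["entryId"],
      },
    },
  ];
}

const ceTools = buildMcpToolDefinitions("ce");
const eeTools = buildMcpToolDefinitions("ee");
```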
4. Architecture — one engine, two transports, shared with chat
packages/agent-tooling/   ← NEW shared CE package (the "engine")
┌─────────────────────────────────────────────────┐
│ registry/    generated from alga-openapi.<ed>.json (CE + EE)
│ search.ts    ranked search (moved as-is, already pure)
│ invoke/      build request from a registry entry → {method,path,...}
│ tool-defs/   the 3 meta-tool schemas + descriptions
└─────────────────────────────────────────────────┘
        ▲                      ▲                        ▲
EE chat assistant      CE local connector       EE remote server
(re-pointed onto       stdio transport          Streamable HTTP + OAuth,
 the package)          runs ON workstation      governance; embedded in
runs loop server-      calls instance           server app (appliance/SaaS)
side, mints temp       /api/v1 w/ user token
keys from session
└──────────── all dispatch → existing /api/v1 (no new business logic) ─────────────┘
Boundaries:
- `packages/agent-tooling` holds only mechanism (registry, search, request-building, the 3 tool-def schemas). No LLM code, no transport, no governance → CE-safe and reusable. `search.ts` + schema move essentially as-is.
- Dispatch splits by caller. The chat assistant's temp-key-from-session path stays in EE. The local connector already holds the user's API token → calls `/api/v1` directly, no temp-key machinery. The package exposes request-building; each consumer owns request-sending + auth.
- The existing EE chat assistant is re-pointed onto the package. This is the one place shipped code is touched; it carries regression risk, so existing chat behavior must be retested.
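The split between request-building (in the package) and request-sending (in each consumer) might look like the following sketch. The type names, the `{name}` path-template syntax, and the sample endpoint are assumptions made for illustration; the function only produces a description of the request and never sends it.

```typescript
// Hypothetical minimal registry entry for request building.
interface EndpointEntry {
  id: string;
  method: string;
  path: string; // assumed template syntax, e.g. /api/v1/tickets/{id}/close
}

interface CallArgs {
  path?: Record<string, string>;
  query?: Record<string, string | number | boolean>;
  body?: unknown;
}

interface BuiltRequest {
  method: string;
  path: string;
  body?: unknown;
}

// Pure function: no network, no auth. The consumer decides how to send the result.
function buildRequest(entry: EndpointEntry, args: CallArgs): BuiltRequest {
  const pathParams = args.path ?? {};
  const resolved = entry.path.replace(/\{(\w+)\}/g, (_match, name: string) => {
    const value = pathParams[name];
    if (value === undefined) {
      throw new Error(`Missing path parameter "${name}" for ${entry.id}`);
    }
    return encodeURIComponent(value);
  });
  const pairs = Object.entries(args.query ?? {}).map(
    ([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`,
  );
  const fullPath = pairs.length > 0 ? `${resolved}?${pairs.join("&")}` : resolved;
  const sendsBody = entry.method.toUpperCase() !== "GET" && args.body !== undefined;
  return sendsBody
    ? { method: entry.method, path: fullPath, body: args.body }
    : { method: entry.method, path: fullPath };
}

// Made-up endpoint used only to exercise the builder.
const closeEntry: EndpointEntry = { id: "tickets.close", method: "POST", path: "/api/v1/tickets/{id}/close" };
const built = buildRequest(closeEntry, { path: { id: "42" }, query: { notify: true }, body: { reason: "resolved" } });
```

The chat assistant would attach a temporary key to `built`, while the local connector would attach the user's API token; neither concern leaks into the shared package.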
5. CE / EE seam (diverges deliberately from source spec §3.2/§6)
The source spec had a CE self-hosted remote base with only governance gated. This design tightens that: anything networked is EE.
| Surface | Edition |
|---|---|
| Local stdio connector (full 3-tool surface, user-scoped via API token) | CE / free |
| Shared engine package (`agent-tooling`) | CE |
| Remote Streamable HTTP MCP server — the entire networked endpoint: OAuth 2.1, multi-client serving, and governance | EE / paid |
| Managed/hosted remote endpoint (SaaS) | EE / paid |
Rationale: "run it yourself on your workstation = free; a networked server many agents connect to = paid" is a crisp, defensible line, and the remote transport is inseparable from the governance/hosting value. The free local connector still provides the full tool surface under the user's identity, honoring "basic access is never gated."
6. Phasing
Phase 1 — Local connector (CE) — ships first
- Extract `packages/agent-tooling` from the EE chat code (registry + `search.ts` + request-building + tool-defs).
- Re-point the EE chat assistant onto it (+ regression test).
- Generalize `generate-chat-registry.mjs` to emit both CE and EE registries (`alga-openapi.ce.json` / `.ee.json`).
- New server endpoint: `GET /api/v1/meta/mcp-registry` serving the generated registry for that instance's edition (precedent: `meta/openapi`, `meta/endpoints` already exist).
- `@alga/mcp-connector`: `npx`-run Node package on `@modelcontextprotocol/sdk` `StdioServerTransport`, exposing the 3 tools.
  - Config via env: `ALGA_INSTANCE_URL` + `ALGA_API_TOKEN` (an existing `api_keys` key; no new auth).
  - Startup: fetch registry from the instance (source of truth for version + edition). Decision: fetch from instance, not bundle (avoids drift across a heterogeneous self-hosted fleet).
  - Dispatch: `search_api_registry` → in-memory search; `search_business_data` → `/api/v1/search`; `call_api_endpoint` → build request + send with the user's token.
- Identity = the user's token → inherits RBAC/ABAC. No agent identity, no approval, no governance (intentional, §3.1).
- Acceptance: a user configures URL + token and operates AlgaPSA from Claude Desktop under their own permissions.
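The connector's configuration and dispatch rules in Phase 1 can be sketched as pure functions, with the MCP SDK wiring and the actual HTTP calls left out. The function names are hypothetical, and the authentication header name in particular is an assumption: this design does not state how an `api_keys` token is presented to `/api/v1`.

```typescript
interface ConnectorConfig {
  instanceUrl: string;
  apiToken: string;
}

// Read the two documented variables and fail fast with an actionable message.
function readConnectorConfig(env: Record<string, string | undefined>): ConnectorConfig {
  const instanceUrl = env["ALGA_INSTANCE_URL"];
  const apiToken = env["ALGA_API_TOKEN"];
  if (!instanceUrl) throw new Error("ALGA_INSTANCE_URL is not set");
  if (!apiToken) throw new Error("ALGA_API_TOKEN is not set");
  return { instanceUrl: instanceUrl.replace(/\/+$/, ""), apiToken };
}

// Where the connector fetches the registry at startup.
function registryUrl(config: ConnectorConfig): string {
  return `${config.instanceUrl}/api/v1/meta/mcp-registry`;
}

type DispatchPlan =
  | { kind: "local-search" }
  | { kind: "http"; url: string; headers: Record<string, string> };

// ASSUMPTION: hypothetical header name, not taken from the design.
const ASSUMED_AUTH_HEADER = "x-api-key";

// Map a tool name to how the connector would serve it.
function planDispatch(config: ConnectorConfig, tool: string, builtPath?: string): DispatchPlan {
  const headers = { [ASSUMED_AUTH_HEADER]: config.apiToken };
  switch (tool) {
    case "search_api_registry":
      return { kind: "local-search" }; // answered from the in-memory registry
    case "search_business_data":
      return { kind: "http", url: `${config.instanceUrl}/api/v1/search`, headers };
    case "call_api_endpoint":
      if (!builtPath) throw new Error("call_api_endpoint needs a built request path");
      return { kind: "http", url: `${config.instanceUrl}${builtPath}`, headers };
    default:
      throw new Error(`Unknown tool: ${tool}`);
  }
}

const connectorConfig = readConnectorConfig({
  ALGA_INSTANCE_URL: "https://psa.example.com/",
  ALGA_API_TOKEN: "example-token",
});
const searchPlan = planDispatch(connectorConfig, "search_business_data");
```

In a real connector the environment would come from `process.env`, and each `http` plan would be sent with the user's token so that the instance applies that user's RBAC/ABAC.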
Phase 2 — Remote server, MVP governance (EE)
- Streamable HTTP single endpoint (`/api/mcp`) via SDK `StreamableHTTPServerTransport`, embedded in the server app. No legacy HTTP+SSE.
- OAuth 2.1 per MCP authorization spec: MCP endpoint is an OAuth resource server; advertises `.well-known/oauth-protected-resource`; auth-code + PKCE; Dynamic Client Registration. AlgaPSA acts as / fronts the authorization server.
- Agent identity as a first-class subject: extend `AuthorizationSubject` (already open-shaped, already carries `apiKeyId`) with `agentId` + subject type `'agent'`, admin-provisioned per tenant. Because it's a kernel subject, its permissions are enforced by the existing authz kernel; basic per-agent permissions reuse existing RBAC roles.
- Audit of every agent action via existing `auditLog()` / `audit_logs` (identity, tool, inputs, policy decision, result, timestamp), exportable.
- Dispatch runs inside AlgaPSA → through the kernel under the agent subject. Reads auto-execute; mutations execute only if agent permissions allow, and everything is audited. (Hold-for-human approval is Phase 3.)
- Acceptance: an admin stands up the remote server on an appliance and connects a client over OAuth; agent actions are attributable and audited.
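The "everything is audited" requirement can be illustrated by a wrapper that writes exactly one record per tool invocation, whether the call succeeds, fails, or is denied. This is a sketch under assumptions: the record shape mirrors the fields listed above, the `allowed` flag stands in for the kernel's decision, and an in-memory array stands in for the existing `auditLog()`, whose signature is not described here.

```typescript
interface AgentAuditRecord {
  agentId: string;
  tool: string;
  inputs: unknown;
  decision: "allow" | "deny";
  result: "success" | "error" | "not_run";
  timestamp: string; // ISO 8601
}

interface AuditedCallContext {
  agentId: string;
  tool: string;
  inputs: unknown;
  allowed: boolean; // stand-in for the authz kernel's decision
  now: Date;
}

// Run a tool call and always append one audit record to the sink.
function auditedCall<T>(ctx: AuditedCallContext, run: () => T, sink: AgentAuditRecord[]): T | undefined {
  const base = {
    agentId: ctx.agentId,
    tool: ctx.tool,
    inputs: ctx.inputs,
    timestamp: ctx.now.toISOString(),
  };
  if (!ctx.allowed) {
    sink.push({ ...base, decision: "deny", result: "not_run" });
    return undefined;
  }
  try {
    const value = run();
    sink.push({ ...base, decision: "allow", result: "success" });
    return value;
  } catch (error) {
    sink.push({ ...base, decision: "allow", result: "error" });
    throw error; // the caller decides how to surface the failure
  }
}

const auditSink: AgentAuditRecord[] = [];
const callValue = auditedCall(
  { agentId: "agent-1", tool: "call_api_endpoint", inputs: { entryId: "tickets.list" }, allowed: true, now: new Date(0) },
  () => 7,
  auditSink,
);
```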
Phase 3 — Governance depth (EE)
- Agent-specific ABAC policy: which agent may invoke which tools, on which resources, under which conditions; add the agent subject type to the kernel's bundle/narrowing policy evaluation.
- Approval gates (human-in-the-loop): registry already carries `approvalRequired`; chat already has a propose→`/api/chat/v1/execute` flow to mirror. New: holding queue, approve/reject UI, timeout policy.
  - ⚠️ Open sub-decision (deferred, needs more thought): how a held mutation resolves over request/response MCP. Candidate shapes: gated call returns a `pending_approval` handle, resolved via Streamable HTTP streaming the eventual result within the timeout, or via a `check_approval(handle)` tool. Not pinned in this design.
- Quotas & rate limits: per-agent and per-tenant; extend existing `enforceApiRateLimit` (already used for API keys) to agent subjects; structured to later feed metered usage.
- SSO-bound agent identity: agent identity provisioned/bound via the tenant's IdP.
- Acceptance (§9 EE): an admin defines a policy restricting an agent to read-only on billing data, requires approval for bulk ticket closes, and gets an exportable audit trail of all agent actions.
7. Cross-cutting
- No business logic in MCP code: every path terminates at `/api/v1` (Phase 1) or kernel→API dispatch (Phase 2+). MCP layer only discovers, builds, dispatches, audits.
- Edition gating via existing `isEnterpriseEdition()` / `getFeatureImplementation()`. Remote + governance in `ee/`; `agent-tooling` + connector are CE.
- Fail-fast per repo standards: validate inputs early, throw actionable errors. But tool execution errors surface to the agent as structured tool errors (not thrown) so the model can recover.
- Security: token never logged; registry endpoint requires auth; OAuth scopes map to agent permissions; audit append-only.
- Testing (80/20): invest in the few tests that de-risk the most: search ranking, request-building from a registry entry, dual-edition registry generation, and the chat-assistant regression after re-pointing. One MCP-protocol conformance check per transport. EE: OAuth flow + agent-subject authz + audit-completeness. Do not exhaustively unit-test thin pass-throughs.
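The distinction between thrown errors and structured tool errors can be sketched with a small wrapper. The result shape follows the MCP tool-result convention of a `content` array plus an `isError` flag; the wrapper name `runTool` and the synchronous handler are simplifications assumed for illustration.

```typescript
// Shape modelled on an MCP tool result with text content.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Convert a thrown execution error into a result the model can read and recover from.
function runTool(handler: () => Record<string, unknown>): ToolResult {
  try {
    const value = handler();
    return { content: [{ type: "text", text: JSON.stringify(value) }] };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return { content: [{ type: "text", text: message }], isError: true };
  }
}

const okResult = runTool(() => ({ count: 2 }));
const failedResult = runTool(() => {
  throw new Error('Missing path parameter "id"');
});
```

Inside the handler the code still fails fast with an actionable message; only the boundary to the agent turns that message into data instead of a protocol-level failure.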
8. Decisions log (divergences + commitments)
- Progressive disclosure, not per-endpoint tools — 3 constant meta-tools. (Reframe of source §4.)
- Reuse the existing chat engine by extracting it to a shared CE package `agent-tooling`. (Not greenfield.)
- Anything networked is EE: the remote server in its entirety, not just governance. (Tightens source §3.2/§6.)
- MCP Resources dropped from scope — subsumed by progressive disclosure.
- Registry fetched from the instance, not bundled into the connector.
- Local connector uses the existing `api_keys` mechanism, no new token type.
- Phase order: CE local first, then EE remote (MVP governance), then governance depth.
- Deferred: the approval-gate request/response mechanism (Phase 3 open sub-decision).
9. Open questions for implementation
- AlgaPSA-as-authorization-server vs. delegating to tenant IdP → RESOLVED in §10: delegate to tenant IdP.
- Approval-gate resolution mechanism → RESOLVED in §10: pending-handle + `check_approval` poll (Phase 3).
- Whether `search_business_data` ACL semantics via `/api/v1/search` exactly match the chat assistant's internal ACL path, or need reconciliation.
10. Phase 2 design addendum — remote server, identity & auth (decided 2026-06-06)
Grounded in the current MCP authorization spec (2025-11 revision): an MCP server is an OAuth 2.1 resource server only — it validates bearer tokens from a separate authorization server, MUST serve Protected Resource Metadata (RFC 9728), and clients bind tokens to the resource via Resource Indicators (RFC 8707). Alga has no authorization server today (NextAuth relying-party only).
10.1 Decisions
- OAuth = delegate to the tenant IdP. The MCP server is purely a resource server. Token issuance is handled by the tenant's existing IdP (Entra / Google / Keycloak, the same providers EE SSO already integrates). Alga validates tokens (issuer + audience + resource indicator + signature via the IdP's JWKS) and maps the token's client/subject claim to an Alga agent. No Alga-as-AS.
  - Accepted constraint: a remote MCP server therefore requires the tenant to have an IdP. A bare appliance with no IdP cannot run the remote server (it can still run the free local connector). Document this prominently; offer Keycloak as the appliance IdP option.
  - This pulls "SSO-bound agent identity" (was Phase 3 / F042) into the Phase 2 core, because the IdP binding is how agents authenticate.
- Agent identity = first-class `agents` table. A real principal, not an api-key alias.
- Approval over MCP = pending-handle + `check_approval` tool (poll). This lands in Phase 3; it decouples the Alga-admin approver from the agent and is robust on all clients.
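The pending-handle + `check_approval` decision can be sketched as a small in-memory queue. The class name, handle format, state names, and timeout handling are all assumptions; the holding queue, approve/reject UI, and timeout policy remain to be designed in Phase 3.

```typescript
type ApprovalState = "pending" | "approved" | "rejected" | "expired";

interface HeldCall {
  handle: string;
  entryId: string;
  state: ApprovalState;
  expiresAt: number; // epoch milliseconds
}

class ApprovalQueue {
  private readonly held = new Map<string, HeldCall>();
  private readonly timeoutMs: number;
  private counter = 0;

  constructor(timeoutMs: number) {
    this.timeoutMs = timeoutMs;
  }

  // A gated mutation arrives: park it and hand a handle back to the agent.
  hold(entryId: string, now: number): HeldCall {
    this.counter += 1;
    const call: HeldCall = {
      handle: `appr-${this.counter}`,
      entryId,
      state: "pending",
      expiresAt: now + this.timeoutMs,
    };
    this.held.set(call.handle, call);
    return call;
  }

  // Called from the approver's side; has no effect once the call is resolved or expired.
  decide(handle: string, approved: boolean, now: number): void {
    if (this.check(handle, now) !== "pending") return;
    const call = this.held.get(handle);
    if (call) call.state = approved ? "approved" : "rejected";
  }

  // Backs the check_approval(handle) tool: the agent polls until the state leaves "pending".
  check(handle: string, now: number): ApprovalState {
    const call = this.held.get(handle);
    if (!call) throw new Error(`Unknown approval handle: ${handle}`);
    if (call.state === "pending" && now >= call.expiresAt) call.state = "expired";
    return call.state;
  }
}

const approvalQueue = new ApprovalQueue(60_000);
const firstHeld = approvalQueue.hold("tickets.close", 0);
const stateBeforeDecision = approvalQueue.check(firstHeld.handle, 1_000);
approvalQueue.decide(firstHeld.handle, true, 2_000);
const stateAfterDecision = approvalQueue.check(firstHeld.handle, 3_000);
```

Because the agent only ever polls by handle, the approver can act from a different session or client, which is the decoupling the decision above relies on.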
10.2 Auth flow (resource-server)
MCP client → GET /api/mcp (no token)
← 401 + WWW-Authenticate: resource_metadata="…/.well-known/oauth-protected-resource"
MCP client → reads PRM → authorization_servers = [tenant IdP]
→ obtains token from the IdP (client-credentials / service principal for a machine agent),
with resource indicator = the Alga MCP resource URL
MCP client → GET /api/mcp (Authorization: Bearer <IdP JWT>)
Alga MCP → validate: issuer ∈ tenant's configured IdPs, aud/resource = this server,
signature via cached JWKS, not expired
→ extract client_id / sub claim → look up agents.idp_subject → agent principal
→ build AuthorizationSubject{ agentId, subjectType:'agent', tenant, … }
→ dispatch through the existing authz kernel → /api/v1; audit every call
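The claim checks in the validation step of this flow can be sketched as a pure function over already-decoded claims. It deliberately does not verify the signature; that JWKS-based step is mandatory and is not shown. The type names, the reason strings, and the precedence among `client_id`, `azp`, and `sub` are assumptions.

```typescript
interface IdpTokenClaims {
  iss: string;
  aud: string | string[];
  exp: number; // seconds since epoch
  sub?: string;
  azp?: string;
  client_id?: string;
}

interface ResourceServerConfig {
  allowedIssuers: string[]; // the tenant's configured IdPs
  resource: string; // this MCP server's resource URL
}

type ClaimCheck =
  | { ok: true; issuer: string; idpSubject: string }
  | { ok: false; reason: string };

// Checks claims only. Signature verification against the IdP's JWKS must
// already have succeeded before this function is trusted.
function checkClaims(claims: IdpTokenClaims, config: ResourceServerConfig, nowSeconds: number): ClaimCheck {
  if (!config.allowedIssuers.includes(claims.iss)) {
    return { ok: false, reason: "issuer not configured for tenant" };
  }
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  if (!audiences.includes(config.resource)) {
    return { ok: false, reason: "audience does not match this resource" };
  }
  if (claims.exp <= nowSeconds) {
    return { ok: false, reason: "token expired" };
  }
  // ASSUMPTION: the design does not fix which claim wins when several are present.
  const idpSubject = claims.client_id ?? claims.azp ?? claims.sub;
  if (!idpSubject) {
    return { ok: false, reason: "no client or subject claim" };
  }
  return { ok: true, issuer: claims.iss, idpSubject };
}

const resourceConfig: ResourceServerConfig = {
  allowedIssuers: ["https://idp.example.com"],
  resource: "https://psa.example.com/api/mcp",
};
const goodCheck = checkClaims(
  { iss: "https://idp.example.com", aud: "https://psa.example.com/api/mcp", exp: 2_000, client_id: "agent-client-1" },
  resourceConfig,
  1_000,
);
```

The `issuer` and `idpSubject` of a successful check are the values that would be matched against `agents.idp_issuer` and `agents.idp_subject`.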
10.3 Agent identity model
- `agents` table (per tenant): `agent_id` (uuid PK), `tenant`, `name`, `description`, `active`, `created_by`, `created_at`, plus the IdP binding: `idp_issuer`, `idp_subject` (the `sub`/`azp`/`client_id` claim that identifies this agent in the tenant IdP). Unique on `(tenant, idp_issuer, idp_subject)`.
- Credentials: primary path is the IdP token (above). Optionally, an Alga-issued agent credential reuses `api_keys` with a new nullable `agent_id` (and `user_id` made nullable for agent keys). This is useful for non-OAuth/dev access and to let the local connector act as a registered agent later. Not required for the IdP path.
- `AuthorizationSubject` gains `agentId?: string` and `subjectType?: 'user' | 'agent'` (default `'user'`). `buildAuthorizationPrincipalSubject` grows an agent branch: given a resolved agent, assemble a subject with the agent's assigned roles/permissions (Phase 2 reuses RBAC roles; Phase 3 adds agent-specific ABAC bundles).
- Authz kernel unchanged: it already evaluates whatever subject it's handed; we only teach the subject builder about agents.
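The agent lookup and the new subject-builder branch could look roughly like this. Only `agentId` and `subjectType` come from the design; the other subject fields (`tenant`, `roles`), the helper names, and the rule that inactive agents are rejected are assumptions, and the real `AuthorizationSubject` is certainly richer.

```typescript
// Subset of the agents table columns listed above.
interface AgentRow {
  agent_id: string;
  tenant: string;
  name: string;
  active: boolean;
  idp_issuer: string;
  idp_subject: string;
}

// Hypothetical slice of AuthorizationSubject.
interface AgentAwareSubject {
  tenant: string;
  roles: string[];
  apiKeyId?: string;
  agentId?: string;
  subjectType?: "user" | "agent";
}

// Mirrors the unique key (tenant, idp_issuer, idp_subject).
function findAgent(agents: AgentRow[], tenant: string, issuer: string, idpSubject: string): AgentRow | undefined {
  return agents.find(
    (agent) => agent.tenant === tenant && agent.idp_issuer === issuer && agent.idp_subject === idpSubject,
  );
}

// Agent branch of the subject builder: Phase 2 reuses plain RBAC roles.
function buildAgentSubject(agent: AgentRow, roles: string[]): AgentAwareSubject {
  if (!agent.active) throw new Error(`Agent ${agent.agent_id} is inactive`);
  return { tenant: agent.tenant, roles: [...roles], agentId: agent.agent_id, subjectType: "agent" };
}

// Existing subjects carry no subjectType, so they default to 'user'.
function effectiveSubjectType(subject: AgentAwareSubject): "user" | "agent" {
  return subject.subjectType ?? "user";
}

const agentRows: AgentRow[] = [
  { agent_id: "a-1", tenant: "t-1", name: "Triage bot", active: true, idp_issuer: "https://idp.example.com", idp_subject: "agent-client-1" },
];
const resolvedAgent = findAgent(agentRows, "t-1", "https://idp.example.com", "agent-client-1");
const agentSubject = resolvedAgent ? buildAgentSubject(resolvedAgent, ["technician"]) : undefined;
```

Since the kernel evaluates whatever subject it is handed, nothing downstream of `buildAgentSubject` would need to know that the principal is an agent.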
10.4 Reuse map (minimal new surface)
- Transport: SDK `StreamableHTTPServerTransport`, single `/api/mcp` route, EE-gated via `isEnterpriseEdition()`. Tool handlers reuse the connector's `search`/`call` logic but dispatch in-process through the kernel (not HTTP-to-self) under the agent subject.
- JWKS/JWT validation: reuse `@auth/core/jwt` + `getSecretProviderInstance`; cache JWKS per IdP. IdP config reuses EE `providerConfig`/`ssoProviders` (per-tenant).
- Audit: reuse `auditLog()` / `audit_logs`, one row per tool invocation (agent_id, tool, inputs, decision, result, ts).
- Rate limiting: reuse `enforceApiRateLimit` with `rateLimitSubjectId = agent_id`.
- Registry: the EE registry already served by `GET /api/v1/meta/mcp-registry` (Phase 1).
10.5 Revised feature interpretation (Phase 2)
- F024 (PRM): serve `/.well-known/oauth-protected-resource` advertising the tenant's IdP as the `authorization_servers`. (Core.)
- F025 (token flow): validate IdP tokens (issuer/aud/resource/JWKS), not run an Alga auth-code+PKCE flow. (Re-scoped.)
- F026 (DCR): dropped/deferred — DCR is downgraded to optional in the spec, and with IdP delegation client registration happens at the IdP, not Alga.
- F042 (SSO-bound identity): pulled into Phase 2 core (it's the auth mechanism), not a Phase-3 add-on.
- F027-F033 (agent subject, provisioning, mapping, per-agent RBAC, kernel dispatch, audit, export) unchanged in intent.
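For F024, the Protected Resource Metadata document and the matching 401 challenge can be sketched as below. The field names (`resource`, `authorization_servers`, `bearer_methods_supported`) and the `resource_metadata` challenge parameter come from RFC 9728; the function names and example URLs are assumptions, and the metadata URL is passed in rather than derived.

```typescript
// Subset of RFC 9728 protected resource metadata.
interface ProtectedResourceMetadata {
  resource: string;
  authorization_servers: string[];
  bearer_methods_supported: string[];
}

// The tenant's IdP issuers are advertised as the authorization servers.
function buildProtectedResourceMetadata(resource: string, tenantIdpIssuers: string[]): ProtectedResourceMetadata {
  if (tenantIdpIssuers.length === 0) {
    throw new Error("Remote MCP server requires at least one tenant IdP");
  }
  return {
    resource,
    authorization_servers: [...tenantIdpIssuers],
    bearer_methods_supported: ["header"],
  };
}

// Value for the WWW-Authenticate header on an unauthenticated request.
function buildUnauthorizedChallenge(metadataUrl: string): string {
  return `Bearer resource_metadata="${metadataUrl}"`;
}

const resourceMetadata = buildProtectedResourceMetadata("https://psa.example.com/api/mcp", ["https://idp.example.com"]);
const challengeHeader = buildUnauthorizedChallenge("https://psa.example.com/.well-known/oauth-protected-resource");
```

Throwing when no IdP is configured reflects the accepted constraint that a bare appliance without an IdP cannot run the remote server.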
10.6 What still needs live infra / can't be unit-tested here
- A real tenant IdP for the full token round-trip. Unit-testable now: JWKS signature validation (mock JWKS), claim→agent mapping, the `agents` migration + subject builder, EE-gating. End-to-end needs a live IdP (or a mock OAuth server in integration tests).