SCRATCHPAD — AlgaPSA MCP Server
Working memory for the effort. Source of truth for scope = design.md in this folder.
Context
Implement AlgaPSA as an MCP server in two transports: a free CE local stdio connector and an EE remote Streamable HTTP server with governance. Central design move: progressive disclosure — 3 constant meta-tools, not per-endpoint tools — reusing the existing EE chat agentic engine.
Key discoveries (existing code to reuse)
- The engine already exists in the EE chat assistant:
  - ee/server/src/services/chatCompletionsService.ts: agent loop; buildToolDefinitions() (~line 958) defines the meta-tools search_api_registry, search_business_data, call_api_endpoint, finish_response; executeFunctionCall() (~line 3317) dispatches via a temp API key; searchBusinessData() (~line 1206) calls server-internal full-text search.
  - ee/server/src/chat/registry/apiRegistry.schema.ts: ChatApiRegistryEntry (carries rbacResource, approvalRequired, parameters, request/response schemas, examples, playbooks). Pure types.
  - ee/server/src/chat/registry/search.ts: searchRegistryEntries() ranked search. Pure TS, imports only the schema type → trivially extractable.
  - ee/server/src/chat/registry/apiRegistry.generated.ts: generated registry (~1.2MB).
  - ee/scripts/generate-chat-registry.mjs: generator from OpenAPI; supports YAML overrides in ee/docs/api-registry/.
- OpenAPI specs exist for both editions: sdk/docs/openapi/alga-openapi.ce.json and …ee.json (+ yaml). Generator: sdk/scripts/generate-openapi.ts.
- HTTP surfaces the connector needs already exist:
  - Global search: server/src/app/api/v1/search/route.ts (+ per-entity */search).
  - Meta endpoints: server/src/app/api/v1/meta/{openapi,endpoints,schemas,sdk} (precedent for adding meta/mcp-registry).
- API-key auth: server/src/lib/api/middleware/apiAuthMiddleware.ts (x-api-key / Bearer, api_keys table). Subject already carries apiKeyId.
- Authz kernel: server/src/lib/authorization/kernel/{contracts.ts,engine.ts}. AuthorizationSubject is open-shaped ([key: string]: unknown) → can add agentId + subject type 'agent'.
- Audit: server/src/lib/logging/auditLog.ts → audit_logs table. auditLog(knex, {userId, operation, tableName, recordId, changedData, details}).
- Edition gating: server/src/lib/features.ts provides isEnterpriseEdition(), getFeatureImplementation(). EDITION env (community|ee|enterprise).
- Monorepo: npm workspaces (root package.json), Nx. New CE pkg → packages/agent-tooling; connector → packages/alga-mcp-connector (or @alga/mcp-connector); remote endpoint lives in the server app under ee/.
Decisions (see design.md §8)
- Progressive disclosure: 3 constant meta-tools, no per-endpoint tools.
- Extract engine to shared CE package agent-tooling; chat + both MCP transports consume it.
- Anything networked is EE (tightens source spec §3.2/§6: remote base is no longer CE).
- MCP Resources dropped from scope (subsumed by progressive disclosure).
- Registry fetched from the instance (meta/mcp-registry), not bundled (avoids fleet drift).
- Local connector reuses existing api_keys mechanism; no new token type.
- Phase order: CE local → EE remote (MVP gov) → governance depth.
- Temp-key-from-session dispatch stays EE (chat); connector calls /api/v1 directly with user token.
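The progressive-disclosure decision can be pictured as a fixed tool list whose size does not depend on how many endpoints the registry holds. A minimal sketch follows. The three tool names come from these notes; the MetaTool type, the descriptions, and every inputSchema field are illustrative assumptions, not the output of the real buildMetaToolDefinitions.

```typescript
// Hypothetical shape for illustration; the real definitions live in agent-tooling.
type MetaTool = {
  name: string;
  description: string;
  inputSchema: {
    type: "object";
    properties: Record<string, unknown>;
    required: string[];
  };
};

// The same three tools are advertised regardless of registry size.
// All schema fields below are assumptions made for this sketch.
function sketchMetaTools(): MetaTool[] {
  return [
    {
      name: "search_api_registry",
      description: "Find API endpoints relevant to a natural-language query.",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" } },
        required: ["query"],
      },
    },
    {
      name: "search_business_data",
      description: "Full-text search over business records.",
      inputSchema: {
        type: "object",
        properties: { query: { type: "string" }, limit: { type: "number" } },
        required: ["query"],
      },
    },
    {
      name: "call_api_endpoint",
      description: "Invoke an endpoint previously found in the registry.",
      inputSchema: {
        type: "object",
        properties: { entryId: { type: "string" }, params: { type: "object" } },
        required: ["entryId"],
      },
    },
  ];
}
```

The point of the sketch is the constant length: adding endpoints changes what search_api_registry can return, not what the client sees in its tool list.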
Open questions / deferred
- Deferred: approval-gate resolution over request/response MCP (Phase 3 design spike). Candidates: pending_approval handle resolved via Streamable HTTP streaming within timeout, or a check_approval(handle) tool.
- Do /api/v1/search ACL semantics match the chat assistant's internal ACL path, or need reconciliation?
- OAuth: AlgaPSA-as-authorization-server vs. delegate to tenant IdP (P2 vs SSO-bound identity in P3).
Testing posture
80/20 by explicit user directive — lean test list, high-value risks only. This intentionally overrides the software-planner default of "tests > features." Do not exhaustively test thin pass-throughs.
Commands / runbooks
- Generate registries (to be generalized for CE+EE): node ee/scripts/generate-chat-registry.mjs
- Build editions: npm run build:ce / npm run build:ee
- OpenAPI regen: sdk/scripts/generate-openapi.ts
Gotchas
- searchBusinessData() in chat uses server-internal DB search (createTenantKnex, ACL principal), which is not reachable from a workstation connector. The connector must use the HTTP /api/v1/search endpoint instead.
- Re-pointing the chat assistant onto agent-tooling is the only shipped-code change in Phase 1 → regression-test the existing chat flow.
- Registry is ~1.2MB; serve gzipped from meta/mcp-registry.
Implementation log / surprises
2026-06-06 — Group A (F001-F003): agent-tooling package extracted
- Created packages/agent-tooling (CE), mirroring packages/formatting conventions (src-export map, tsup preset, project.json, vitest). Typechecks + builds + 6 search tests pass.
- Decision: copy schema+search into the package and leave ee/server/src/chat/registry/* untouched for now (brief, intentional duplication). The EE chat re-point + de-dup is Group D, deferred so the connector (Group F) lands first with zero changes to the build-critical server/next.config.mjs or shipped chat code.
- Sequencing change vs features.json order: executing A → B → F (standalone connector, no Next.js) BEFORE C → E → D (server integration + next.config edits + chat re-point). Risk pushed later; value (working CE connector) lands first.
- SURPRISE: search never returns empty for a non-empty query. scoreEntry adds an unconditional recency bonus Math.max(0, 2 - index*0.05), so the first 40 registry entries (index 0 to 39) always score > 0 even with zero token/intent match. Implication for MCP: search_api_registry on an irrelevant query returns low-relevance entries (by registry order), not an empty set. The agent must judge relevance from the returned scores/descriptions. Consider surfacing the score in the MCP tool result so the model can tell "weak match" from "strong match". Not changing the algorithm now (parity with shipped chat behavior).
- next.config.mjs reality: per-package webpack aliases exist in TWO blocks (dev-source ~L230-274 and prebuilt ~L515-544) plus a transpilePackages list (~L413). Group E/D must add @alga-psa/agent-tooling to all three. Build-critical file; edit carefully.
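The scoring surprise above is easy to reproduce in isolation. The sketch below uses only the bonus formula quoted in these notes; sketchScore and its matchScore argument are hypothetical stand-ins for the real token/intent scoring inside scoreEntry, which is not reproduced here.

```typescript
// Bonus exactly as quoted in the notes: Math.max(0, 2 - index * 0.05).
function positionBonus(index: number): number {
  return Math.max(0, 2 - index * 0.05);
}

// Hypothetical scorer: matchScore stands in for the real token/intent score.
function sketchScore(matchScore: number, index: number): number {
  return matchScore + positionBonus(index);
}

// Count how many of the first n entries still score above zero
// when nothing in the query matched (matchScore = 0).
function entriesWithNonZeroFloor(n: number): number {
  let count = 0;
  for (let i = 0; i < n; i++) {
    if (sketchScore(0, i) > 0) count++;
  }
  return count;
}
```

Under these assumptions, entriesWithNonZeroFloor counts the entries that would surface for a completely irrelevant query, which is why exposing the score to the model helps it separate weak matches from strong ones.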
2026-06-06 — Group F (F012-F020): @alga-psa/mcp-connector built
- New packages/alga-mcp-connector (publishable, NOT private; the one shippable package). npx-runnable bin via tsup banner shebang; bundles agent-tooling (noExternal: [/@alga-psa\//]) so the published artifact only needs the public MCP SDK at runtime. Verified: searchRegistryEntries is inlined in the 25KB dist bin.
- Decision: low-level Server API, not McpServer + Zod. buildMetaToolDefinitions already emits raw JSON-Schema inputSchema, which maps 1:1 to the low-level ListToolsRequestSchema handler. Avoids re-expressing schemas in Zod.
- Decision: connector always uses edition: 'ce' tool templating (no approval clause) regardless of the instance edition that served the registry, because the local connector is inherently user-scoped. EE templating is for the Phase-2 remote server.
- Conformance proven in-memory (T011): InMemoryTransport.createLinkedPair() + SDK Client ↔ our server. listTools → exactly the 3 tools; callTool search works; HTTP-failure → isError. Fail-fast verified by running the built bin with no env (clear stderr msg, exit 1). stdout kept clean (all logs → stderr).
- SURPRISE: MCP SDK 1.29 pulls ~50 transitive deps (express, ajv, hono, eventsource, …). It bundles the Streamable-HTTP server transport, so even a stdio-only connector drags the HTTP stack in at install. Harmless (tree-shaken from our bundle; runtime only needs the SDK), but worth knowing. package-lock.json now pins @modelcontextprotocol/sdk@1.29.0, committed with this group.
- OPEN: tenant header. Connector relies on API-key→tenant resolution and adds x-tenant-id only if ALGA_TENANT_ID is set. Must verify against a live instance whether apiAuthMiddleware requires the tenant header for validateApiKeyAnyTenant. Tracked for Group G live E2E.
- Contract pinned for Group E: connector expects GET /api/v1/meta/mcp-registry → JSON { entries: [...] } (also tolerates a bare array), auth via x-api-key. search_business_data → GET /api/v1/search?query=&types=csv&limit=&cursor=&sort= (confirmed against ApiSearchController).
- T012 (live E2E) deferred to Group G; will drive the built bin over real stdio against a local mock HTTP instance.
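The pinned contract and the open tenant-header question can be sketched as two small request-building helpers. The header names, the ALGA_TENANT_ID behavior, and the /api/v1/search query parameters come from the notes above; ConnectorEnv, SearchArgs, buildHeaders, and buildSearchUrl are hypothetical names for this sketch and are not the connector's actual code.

```typescript
// Hypothetical config shape; tenantId would be populated from ALGA_TENANT_ID.
type ConnectorEnv = { baseUrl: string; apiKey: string; tenantId?: string };

// x-api-key is always sent; x-tenant-id only when a tenant id is configured.
function buildHeaders(env: ConnectorEnv): Record<string, string> {
  const headers: Record<string, string> = { "x-api-key": env.apiKey };
  if (env.tenantId) headers["x-tenant-id"] = env.tenantId;
  return headers;
}

// Hypothetical argument shape mirroring the query string in the contract.
type SearchArgs = {
  query: string;
  types?: string[];
  limit?: number;
  cursor?: string;
  sort?: string;
};

// Builds GET /api/v1/search?query=&types=csv&limit=&cursor=&sort=
// and omits any parameter the caller did not supply.
function buildSearchUrl(env: ConnectorEnv, args: SearchArgs): string {
  const url = new URL("/api/v1/search", env.baseUrl);
  url.searchParams.set("query", args.query);
  if (args.types && args.types.length > 0) {
    url.searchParams.set("types", args.types.join(","));
  }
  if (args.limit !== undefined) url.searchParams.set("limit", String(args.limit));
  if (args.cursor) url.searchParams.set("cursor", args.cursor);
  if (args.sort) url.searchParams.set("sort", args.sort);
  return url.toString();
}
```

Keeping the tenant header conditional means the same helper works whether or not the live instance turns out to require it.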
2026-06-06 — Group C (F006, T005): dual-edition registry generation
- Generalized ee/scripts/generate-chat-registry.mjs to emit both editions in one run (CE → server/src/lib/mcp/registry.generated.ts, EE → existing location). CE file imports the type from @alga-psa/agent-tooling/registry/schema. Added root npm script mcp:registry:generate. T005 is enforced in the generator as a hard invariant: it throws if any CE endpoint is absent from EE.
- Added @alga-psa/agent-tooling to tsconfig.base.json paths. The repo resolves @alga-psa/* types via tsconfig paths (not the package exports map), and the package emits no .d.ts (preset dts:false). This is why the IDE flagged "Cannot find module"; the connector's own tsc passed because it uses moduleResolution: Bundler. The base-paths entry fixes IDE + global typecheck and is required for the server (Group D/E) to import the package.
- SURPRISE: the committed EE chat registry is STALE. Regenerating from the current EE spec went 609 → 901 entries (+292 real endpoints, e.g. inboundwebhooks). The committed registry was generated 2026-04-29; the EE spec was updated 2026-06-04. So the in-app chat is currently ~292 endpoints behind its own API spec.
- Decision: did NOT refresh the EE registry in this commit. It's a 14k-line, chat-behavior-changing diff unrelated to the MCP extraction, and warrants its own review. Reverted the EE regeneration; committed only the new CE registry. Consequence: committed CE (879, fresh) is briefly larger than committed EE (609, stale); they serve independent consumers, so no runtime issue, but the CE⊆EE invariant only holds on a fresh dual generation.
- Follow-up to surface to the user: run npm run mcp:registry:generate and commit the refreshed EE registry separately (also refreshes the connector's view of an EE instance via the meta endpoint). Both the chat and EE-instance MCP currently see the stale set.
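The T005 invariant (every CE endpoint must also exist in EE) amounts to a set-difference check at generation time. A minimal sketch, assuming each registry entry carries a unique id string; the RegistryEntryId type and assertCeSubsetOfEe are hypothetical names, and the real generator may key entries differently.

```typescript
// Assumption: entries are identified by a unique `id` field.
type RegistryEntryId = { id: string };

// Throws if any CE entry id is missing from the EE registry.
function assertCeSubsetOfEe(ce: RegistryEntryId[], ee: RegistryEntryId[]): void {
  const eeIds = new Set(ee.map((entry) => entry.id));
  const missing = ce
    .filter((entry) => !eeIds.has(entry.id))
    .map((entry) => entry.id);
  if (missing.length > 0) {
    throw new Error("CE endpoints absent from EE: " + missing.join(", "));
  }
}
```

A check like this only passes when both editions are generated from fresh specs in the same run, which matches the note that the committed CE (fresh) and EE (stale) artifacts do not currently satisfy it.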
2026-06-06 — Group E (F009-F011): GET /api/v1/meta/mcp-registry
- Added getMcpRegistry() to ApiMetadataController + route server/src/app/api/v1/meta/mcp-registry/route.ts. Auth via the shared authenticate() + assertProductApiAccess (F010). Returns { edition, count, entries }.
- Edition-aware with ZERO next.config changes (F011). CE registry is await import('@/lib/mcp/registry.generated'); on EE, (await import('@product/chat/entry')).eeMcpRegistry (added that export to packages/product-chat/ee/entry.tsx, the established CE→EE seam used by the chat routes). Falls back to CE if the EE artifact is missing.
- Why no next.config alias was needed: changed the generator to emit import type { ChatApiRegistryEntry } in the registry files → the schema import is erased at runtime, so the CE registry never pulls @alga-psa/agent-tooling into the server's runtime graph. Regenerated the CE registry with this. (The agent-tooling webpack alias is only needed for Group D, when the chat runtime imports the package.)
- LSP shows @ee/* "cannot find module" for product-chat/ee/entry.tsx. That's the file's normal state (the @ee/ alias resolves only in the EE build), and affects the pre-existing service imports identically. Not a regression.
- T007 (live endpoint auth + edition) NOT auto-tested: a Next route handler needs the full server/DB/auth stack (poor 80/20). Auth is the shared, already-tested middleware; the edition branch is trivial; the registry-fetch contract is covered by the Group G connector E2E against a mock instance. Validate the real endpoint via a running dev server (manual).
2026-06-06 — Group D (F007, F008): re-point EE chat onto agent-tooling
- Replaced ee/server/src/chat/registry/{apiRegistry.schema,search}.ts with thin re-export shims → @alga-psa/agent-tooling/registry/{schema,search}. De-dups the ~360 lines that were copied into the package in Group A. Every existing import path (indexer, generated registry, chatCompletionsService) keeps resolving via the shim. Behavior is identical by construction (verbatim re-export of the same code).
- F008 preserved: the temp-key-from-session dispatch (executeFunctionCall + TemporaryApiKeyService) stays in EE chatCompletionsService; the package only does request-building. Chat's own OpenAI/Vertex-shaped buildToolDefinitions (with finish_response + business-search enum) stays in EE too, intentionally NOT replaced by the package's MCP-shaped buildMetaToolDefinitions (different transport/format).
- next.config.mjs: added @alga-psa/agent-tooling (+ / subpath variant) in all three places, mirroring the source-transpiled scheduling/formatting packages exactly: turbopack resolveAlias, transpilePackages, and the webpack config.resolve.alias "Source-transpiled" block. This runtime alias is needed (unlike Group E) because the chat imports searchRegistryEntries as a runtime value. Verified the config still parses/loads (node import()); agent-tooling + connector tests still pass.
- ⚠️ FINAL GATE I could NOT run in-session: a full EE/CE Next build (npm run build / npm run dev) to confirm the webpack/turbopack alias resolves at build time. The edits mirror a known-working package precisely and the config parses, but a real build is the definitive check. Surface this to the user.
- T006 (chat regression) left implemented=false: behavior is preserved by construction, but the live chat flow (LLM + server) wasn't exercised. Verify on a running dev server.
2026-06-06 — Group G (F021, T009, T012): Phase 1 E2E
- Added e2e.test.ts: a mock AlgaPSA HTTP instance + the real InstanceClient + the real MCP protocol (InMemory transport). Drives the full path: registry fetch → search_api_registry → call_api_endpoint → GET /api/v1/tickets/{id}, plus a 401 auth-failure case. 17 connector tests pass.
- Covers T009 (real /api/v1 dispatch + parsed result) and T012 (lists + reads a ticket). F021 acceptance is faithfully simulated (real HTTP + protocol); real Claude-Desktop verification is manual.
PHASE 1 STATUS — COMPLETE (pending live gates)
Done & committed (8 commits): agent-tooling package (registry/search/request-build/tool-defs), @alga-psa/mcp-connector stdio bin, dual-edition registry generation + CE artifact, GET /api/v1/meta/mcp-registry, EE chat re-pointed onto the shared package. 21/21 Phase-1 features; 10/12 Phase-1 tests (29 automated tests across the two packages all green).
Live gates I could NOT run in-session (surface to user):
- EE/CE Next build (npm run build / npm run dev): validates the Group D next.config alias resolves at build time. Edits mirror scheduling/formatting exactly; config parses; but a real build is the definitive check. (→ T006 chat regression + T007 endpoint auth ride on this.)
- EE registry is stale (609 vs 901): run npm run mcp:registry:generate, review, commit separately.
- Connector tenant header: verify whether /api/v1 needs x-tenant-id or resolves tenant from the API key (set ALGA_TENANT_ID if required).
PHASE 2 COMPLETE (2026-06-07) — remote governance, live-verified
13/14 Phase-2 features done (F026 Dynamic Client Registration intentionally dropped — spec downgraded it; IdP delegation registers clients at the IdP). Built + verified live against the EE dev server (:3001) with a mock IdP (RS256 keypair + a local JWKS server).
- Agent identity (agents, agent_idp_providers, agent_roles, api_keys.agent_id; migrations applied). Agents are backed by a no-login internal user so the existing kernel + hasPermission enforce the agent's RBAC roles (the kernel's defaultRbacEvaluator re-fetches by user_id, so a backing user is the low-risk way to reuse all authz, vs. a riskier core RBAC change). AuthorizationSubject gains agentId/subjectType.
- IdP-delegated auth (idpToken.ts, jose): validate a Bearer JWT against a tenant-trusted IdP's JWKS (iss/aud/resource), map subject → agent. Resource server only; no Alga AS. PRM at /.well-known/oauth-protected-resource.
- Dispatch: agent path mints a short-lived agent-scoped key and calls /api/v1 → kernel enforces the agent's roles. Every tool call written to mcp_agent_audit; exportable via /api/v1/mcp/audit.
- Provisioning API: /api/v1/mcp/{agents,idp-providers,audit} (EE, API-key admin auth).
- E2E proof (live): Admin agent → reads a real ticket; no-role agent → 403 (RBAC deny); untrusted issuer → 401; agent action audited. T013-T017 covered by this live mock-IdP E2E.
SURPRISES / fixes during the live build:
- RLS is no longer used (per Robert). My agent tables' tenant_isolation RLS policies referenced current_setting('app.current_tenant'), which the app no longer sets → guc.c find_option 500s. Removed RLS from all four tables (live + migration files); tenant isolation is in code (.where({tenant})).
- Global API middleware (server/src/middleware.ts) enforces x-api-key on /api/* and rejected the agent's Bearer JWT ("API key missing"). Added /api/mcp to apiKeySkipPaths (it authenticates in-route). /api/v1/mcp/* provisioning stays gated.
- Multi-dir migrations: knex_migrations references CE+EE+ext dirs, so single-dir migrate:latest reports "corrupt". Applied the two new migrations surgically (run up() + record). The repo's server/scripts/run-ee-migrations.js is the proper merged runner.
- Backing-user pattern means api_keys.user_id stays set (the backing user) even though I made it nullable; agent_id links the agent. Threading agentId/subjectType fully through the /api/v1 subject builder is a future refinement (today attribution/audit live at the MCP layer; RBAC via the backing user).
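The middleware fix above hinges on /api/mcp being skipped while /api/v1/mcp/* stays gated. A minimal sketch of that distinction, assuming the skip list is matched by exact path or path-segment prefix; how the real server/src/middleware.ts matches apiKeySkipPaths is not shown in these notes, and requiresApiKey is a hypothetical helper.

```typescript
// Assumption: skip entries are matched as whole path segments.
const apiKeySkipPaths: string[] = ["/api/mcp"];

// Returns true when the global middleware should demand x-api-key.
function requiresApiKey(pathname: string): boolean {
  if (!pathname.startsWith("/api/")) return false;
  const skipped = apiKeySkipPaths.some(
    (skip) => pathname === skip || pathname.startsWith(skip + "/"),
  );
  return !skipped;
}
```

With segment-aware matching, /api/mcp authenticates in-route while the provisioning routes under /api/v1/mcp/ continue to require an API key.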
Pre-existing Phases 2–3 notes (superseded for Phase 2 above)
Phases 2–3 (EE remote + governance) NOT started — F022-F043. F022/F023 (Streamable HTTP transport + 3 tools, EE-gated) are implementable now (analogous to the connector). F024+ (OAuth 2.1, agent identity, ABAC, approval gates, quotas, SSO) need product decisions first: OAuth AS-vs-IdP strategy, and the deferred approval-over-request/response mechanism.
LIVE BRING-UP (2026-06-07) — both MCPs running against the dev server (:3001, EE)
Dev server: feature/alga-mcp-server/server, Next 16.2.6, PORT=3001 npm run dev (nx server:next:dev), NEXT_PUBLIC_EDITION=enterprise (from server/.env). DB: docker algamcp-postgres-1. Test API key minted via DB insert (SHA-256 of a random token; internal user dorothy@kansas.oz; saved at /tmp/alga_mcp_token.txt, description 'mcp-test-key').
EE-BUILD GATE CLEARED. Restarted the dev server to pick up the Group-D next.config agent-tooling alias. /api/mcp tools/list returned the 3 tools — i.e. buildMetaToolDefinitions (an agent-tooling runtime value) resolved at runtime. So the turbopack/webpack alias + transpilePackages edits are correct. Server booted clean; chat path (search.ts shim runtime value) implicitly exercises the same alias.
LOCAL MCP — works. Built bin driven over real stdio (SDK StdioClientTransport) against :3001: search_api_registry('list tickets') → get-_api_v1_tickets; call_api_endpoint → HTTP 200, real ticket "Ruby Slippers Server Power Fluctuation"; search_business_data → valid response. Re-verified after the server restart.
SERVER MCP — works. New EE-gated POST /api/mcp (Streamable HTTP, JSON-RPC). Synthetic curl drive: unauth → 401; initialize → protocol result; tools/list → 3 tools; tools/call search_api_registry → ranked; tools/call call_api_endpoint(get-_api_v1_tickets) → HTTP 200 real ticket.
BUG FOUND + FIXED via live test: the connector's fetchRegistry only read top-level entries, but the real endpoint returns Alga's { data: { entries } } envelope → connector couldn't parse the registry. Fixed instanceClient.fetchRegistry to unwrap data; updated the E2E mock to use the envelope. (Pure-unit tests had missed it because the mock returned a bare {entries}.)
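The envelope fix can be illustrated with a tolerant parser that accepts all three payload shapes mentioned in these notes: a bare array, a top-level { entries }, and the { data: { entries } } envelope. This is a sketch and not the actual instanceClient.fetchRegistry code; Entry and extractEntries are hypothetical names.

```typescript
// Entries are treated as opaque objects in this sketch.
type Entry = Record<string, unknown>;

// Accepts a bare array, { entries: [...] }, or { data: ... } wrapping either.
function extractEntries(body: unknown): Entry[] {
  if (Array.isArray(body)) return body as Entry[];
  if (body !== null && typeof body === "object") {
    const obj = body as { entries?: unknown; data?: unknown };
    if (Array.isArray(obj.entries)) return obj.entries as Entry[];
    // Unwrap the envelope and retry on its contents.
    if (obj.data !== undefined) return extractEntries(obj.data);
  }
  throw new Error("Unrecognized registry payload");
}
```

Failing loudly on an unknown shape, rather than returning an empty list, would have exposed the original mismatch earlier, since an empty registry looks like a valid but useless result.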
NOTES / not-yet-done:
- app_search_index has 0 rows in this dev DB → search_business_data correctly returns empty. Tool is fine; the index just isn't populated.
- Server MCP auth is an MVP stand-in: Alga API key (x-api-key/Bearer validated via validateApiKeyAnyTenant), NOT the designed IdP-delegated OAuth (F024/F025). The 401 also advertises a WWW-Authenticate: ...resource_metadata header, but the PRM endpoint isn't built yet.
- Server MCP dispatch is self-HTTP to /api/v1 under the caller's key (reuses agent-tooling buildRequest), NOT the designed in-process kernel dispatch under an agent subject (F031). Good enough to prove the transport + tool surface; swap to kernel dispatch when agent identity (F027) lands.
- So F022 (transport) + F023 (3 tools over remote) = done (MVP); F024-F033 (OAuth/IdP, agent identity, audit) remain.