PSA/ee/docs/plans/2025-10-27-chat-function-calling-plan.md

Chat Function Calling Migration Plan

Goal

Modernize the client chat experience to use OpenAI-style function calling contracts while temporarily simplifying the transport to single-response (non-streaming) requests.

Scope

  • DefaultLayout-integrated chat UI (Chat.tsx, Message.tsx, RightSidebarContent.tsx).
  • /api/chat/stream/* Next.js routes and the EE ChatStreamService entry point.
  • Supporting types, helpers, and tests touched by the request/response contract.
  • API key lifecycle components (ApiKeyService, REST API middleware) for vending scoped, temporary keys.

Constraints & Assumptions

  • Switch to non-streaming HTTP responses for the initial iteration; we can reintroduce streaming after the function call flows are stable.
  • Maintain current authentication/session handling, message persistence, and feedback UX.
  • EE service remains the authority for model invocations; we adapt its output shape but do not reimplement core business logic.

Phased Implementation Plan

Phase 0 Discovery & Alignment

  1. Current Flow Audit: Document the end-to-end request/response path across Chat.tsx, /api/chat/stream/*, and ChatStreamService, capturing payload shapes, persistence hooks, and UX triggers.
  2. Stakeholder Alignment: Share findings with product/EE stakeholders to confirm expectations for non-streaming behavior, telemetry, and rollout sequencing.

Phase 1 Contract Definition

  1. Schema Drafting: Define TypeScript types and JSON schema for the OpenAI-style response, including choices[*].message, function_call, and error envelopes.
  2. Helper Utilities: Introduce shared helpers (serialization, validation) in a neutral package so client and server consume the same contract.
  3. Migration Flagging: Decide on feature flag or environment toggle names to gate the new contract during rollout.
  4. Approval Policy Design: Specify the approval rules for function invocation (model-driven vs. user-confirmed), including required metadata and persistence hooks.
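
As a rough illustration of the Schema Drafting step, the contract could be typed along the following lines. Only choices, message, and function_call come from the plan; the ChatError envelope fields, the finish_reason values, and the helper names isChatError and firstFunctionCall are assumptions made for this sketch.

```typescript
// Hypothetical contract types. Names other than choices/message/function_call are assumptions.
interface FunctionCall {
  name: string;
  // OpenAI-style contracts carry arguments as a JSON-encoded string.
  arguments: string;
}

interface ChatMessage {
  role: "assistant" | "user" | "system" | "function";
  content: string | null;
  function_call?: FunctionCall;
}

interface ChatChoice {
  index: number;
  message: ChatMessage;
  finish_reason: "stop" | "function_call" | "length";
}

interface ChatSuccess {
  choices: ChatChoice[];
}

// Assumed error envelope shape; the plan only says an error envelope exists.
interface ChatError {
  error: { code: string; message: string };
}

type ChatResponse = ChatSuccess | ChatError;

// Type guard so client and server can branch on the envelope safely.
function isChatError(res: ChatResponse): res is ChatError {
  return "error" in res;
}

// Returns the first function_call found across choices, if any.
function firstFunctionCall(res: ChatResponse): FunctionCall | undefined {
  if (isChatError(res)) return undefined;
  for (const choice of res.choices) {
    if (choice.message.function_call) return choice.message.function_call;
  }
  return undefined;
}
```

Keeping the guard and the lookup next to the types is one way to let client and server share a single reading of the contract.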

Phase 2 Temporary API Key Vending

  1. RBAC & Auth Review: Map how ApiKeyService, ApiKeyServiceForApi, and withApiKeyAuth derive tenant + user context to ensure temporary keys inherit permissions.
  2. Ephemeral Key Design: Extend the api_keys schema with purpose, metadata (jsonb), usage_limit, and usage_count columns; define purpose = 'ai_session' for chat-issued keys, default usage_limit = 1, usage_count = 0, and set expires_at = now() + interval '30 minutes'.
  3. Minting Workflow: Add a TemporaryApiKeyService.issueForAiSession helper that wraps ApiKeyService.createApiKey, writes the discriminator/metadata (chat_id, function_call_id, approval_id, issued by user, approval timestamp), and returns { apiKey, expiresAt } to the chat orchestrator.
  4. Revocation & Cleanup: Update validation helpers to increment usage_count atomically and deactivate keys when usage_count >= usage_limit or if the associated approval is revoked; schedule a cleanup-expired-ai-keys pg-boss job (every 10 minutes) to deactivate lingering expired keys and emit audit logs.
  5. Access Mediation: Enhance ApiKeyService.validate* and withApiKeyAuth to surface purpose/metadata in the request context, enforce tenant binding via runWithTenant, and short-circuit if a key is outside its intended scope (e.g., mismatched chat_id or function).
  6. OpenAPI Extraction Pipeline: Build ee/scripts/generate-chat-registry.ts to consume the enterprise spec (sdk/docs/openapi/alga-openapi.ee.json or /api/v1/meta/openapi), filter callable routes, merge overrides, and emit ee/server/src/chat/registry/apiRegistry.generated.ts.

Phase 3 Server Adaptation

  1. Endpoint Refactor: Create or repurpose an /api/chat handler that produces the non-streaming response while delegating business logic to EE services.
  2. EE Service Adapter: Implement an adapter that buffers the existing streaming output, assembles the final message, and maps it into the contract; skip legacy model support (function calling becomes the standard path).
  3. Legacy Path Coexistence: Guard the legacy streaming handler behind a runtime flag to ensure staged rollout and allow quick fallback.
  4. Temporary Key Integration: When a function call is approved, invoke TemporaryApiKeyService.issueForAiSession and inject key details into the response payload; log issuance and propagate audit context.
  5. Deferred Execution: Ensure server-side function execution is gated by explicit user approval, queuing the call until approval is granted (or rejected) before issuing credentials.
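
The EE Service Adapter step could be sketched as below: drain the existing stream, then fold the chunks into one final message. The StreamChunk shape is an assumption (the real ChatStreamService output is not described in this plan), and assembleMessage and bufferStream are hypothetical names.

```typescript
// Hypothetical chunk shape; the real ChatStreamService output may differ.
interface StreamChunk {
  contentDelta?: string;
  functionName?: string;
  functionArgsDelta?: string;
}

interface AssembledMessage {
  role: "assistant";
  content: string | null;
  function_call?: { name: string; arguments: string };
}

// Pure reducer: folds already-buffered chunks into one final message.
function assembleMessage(chunks: StreamChunk[]): AssembledMessage {
  let content = "";
  let fnName: string | undefined;
  let fnArgs = "";
  for (const chunk of chunks) {
    if (chunk.contentDelta) content += chunk.contentDelta;
    if (chunk.functionName) fnName = chunk.functionName;
    if (chunk.functionArgsDelta) fnArgs += chunk.functionArgsDelta;
  }
  const message: AssembledMessage = {
    role: "assistant",
    content: content.length > 0 ? content : null,
  };
  if (fnName !== undefined) {
    message.function_call = { name: fnName, arguments: fnArgs };
  }
  return message;
}

// Drains the stream completely before returning, which is what makes the
// transport single-response from the caller's point of view.
async function bufferStream(stream: AsyncIterable<StreamChunk>): Promise<AssembledMessage> {
  const chunks: StreamChunk[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return assembleMessage(chunks);
}
```

Separating the pure reducer from the async drain keeps the mapping logic unit-testable without a live stream.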

Phase 4 Client Integration

  1. API Client Update: Point chat mutations to the new non-streaming endpoint and update request payloads as needed.
  2. Function Call Handling: Parse function_call responses, trigger approval flows when required, and render call progress/results inline with the chat history; surface temporary key expiry countdown when relevant.
  3. Approval UX: Add user-facing prompts/notifications for pending approvals, including acceptance/denial actions and surfaced audit metadata; record approval outcomes alongside key issuance metadata.
  4. UX Adjustments: Ensure cancel/stop controls degrade gracefully, update loading states to reflect non-streaming responses, and handle key revocation (e.g., expiry) with user-facing messaging.
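
On the client, the Function Call Handling step might reduce each assistant message to a small UI state, as sketched here. The ChatUiState union, toUiState, and secondsUntilExpiry are hypothetical names, and treating every function call as approval-pending is a simplifying assumption for the sketch.

```typescript
// Hypothetical UI state derived from one assistant message.
type ChatUiState =
  | { kind: "text"; text: string }
  | { kind: "pending_approval"; functionName: string; args: Record<string, unknown> }
  | { kind: "error"; message: string };

interface AssistantMessage {
  content: string | null;
  function_call?: { name: string; arguments: string };
}

// Maps a message to UI state; function calls are parsed defensively because
// the arguments arrive as a JSON-encoded string produced by the model.
function toUiState(message: AssistantMessage): ChatUiState {
  const call = message.function_call;
  if (!call) {
    return { kind: "text", text: message.content ?? "" };
  }
  try {
    const parsed: unknown = JSON.parse(call.arguments);
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
      return { kind: "error", message: "function arguments must be a JSON object" };
    }
    return {
      kind: "pending_approval",
      functionName: call.name,
      args: parsed as Record<string, unknown>,
    };
  } catch {
    return { kind: "error", message: "function arguments were not valid JSON" };
  }
}

// Whole seconds remaining before a temporary key expires, floored at zero,
// suitable for driving an expiry countdown.
function secondsUntilExpiry(expiresAtIso: string, now: Date): number {
  const remainingMs = Date.parse(expiresAtIso) - now.getTime();
  return Math.max(0, Math.floor(remainingMs / 1000));
}
```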

Phase 5 Quality & Validation

  1. Automated Tests: Expand unit/integration coverage across client and server for both plain and function-call responses; include TTL/usage_limit enforcement tests for temporary keys.
  2. Approval Flow Validation: Add automated and manual tests that cover approval-required invocations, rejected calls, audit logging, and forced expiry cleanup.
  3. Manual QA: Outline manual verification scripts for internal testers, including regression scenarios for message persistence, approvals, feedback flows, and key revocation.
  4. Telemetry Review: Verify logging/metrics capture the new contract fields, approval outcomes, and key issuance/consumption events; adjust dashboards or alerts as needed.

Phase 6 Rollout & Cleanup

  1. Staged Deployment: Enable the new flow in staging, then for internal users, before general release; monitor error rates and keep rollback levers ready.
  2. Code Cleanup: Remove or archive unused streaming-specific code paths once adoption is complete.
  3. Future Streaming Work: Document follow-up tasks to reintroduce streaming with the function-call contract and track them in roadmap tooling.

Risks & Mitigations

  • Regression in chat UX: Mitigate with feature flag or progressive rollout and thorough manual QA.
  • Function call mismatch: Define strict TypeScript types and logging around contract translation.
  • Performance hit from non-streaming: Monitor response times; ensure backend can produce complete messages promptly.
  • Approval bypass or deadlocks: Enforce approval middleware server-side and include alerting for stuck or rejected calls.
  • Key leakage or privilege escalation: Scope temporary API keys tightly (short TTL, single-conversation linkage) and log issuance/usage for auditing.

Design Details

Temporary API Key Data Model

  • New Columns (api_keys table): purpose (varchar, default 'general'), metadata (jsonb, nullable), usage_limit (integer, nullable), usage_count (integer, default 0).
  • TTL Handling: Reuse existing expires_at; issue AI session keys with expires_at = now() + interval '30 minutes', minting a new key if an active one is absent or expired.
  • Metadata Shape (stored as JSON): { chat_id, function_call_id, approval_id, issued_by_user_id, issued_at, approved_by_user_id, approved_at }.
  • Indexes: Add composite index on (purpose, expires_at) to speed cleanup scans and (purpose, metadata->>'chat_id') if needed for auditing.

Key Issuance Flow

  1. Add optional parameters to ApiKeyService.createApiKey (purpose, metadata, usageLimit, expiresAt) and mirror them in ApiKeyServiceForApi.
  2. Chat backend requests TemporaryApiKeyService.issueForAiSession({ userId, chatId, functionCallId, approvalId }).
  3. Service verifies approval state, sets usage_limit = 1, expires_at = now() + 30 minutes, logs issuance (structured log + audit event), and returns plaintext key + expiry + key uuid. If a valid key already exists for the conversation, deactivate it and mint a fresh one.
  4. Chat response embeds { api_key, expires_at, key_id } into the OpenAI function-call payload so the AI has necessary credentials.
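
The issuance steps above can be sketched with an in-memory store. This is an illustration only: the IssueDeps parameter (approval lookup, key generator, store, clock) is a hypothetical injection point added for testability, and the real ApiKeyService.createApiKey signature may differ.

```typescript
// Hypothetical record shape mirroring the columns and metadata named in this plan.
interface AiSessionKeyRecord {
  keyId: string;
  purpose: "ai_session";
  usageLimit: number;
  usageCount: number;
  expiresAt: Date;
  active: boolean;
  metadata: {
    chat_id: string;
    function_call_id: string;
    approval_id: string;
    issued_by_user_id: string;
    issued_at: string;
  };
}

interface IssueRequest {
  userId: string;
  chatId: string;
  functionCallId: string;
  approvalId: string;
}

// Assumed dependencies, injected so the sketch runs without a database.
interface IssueDeps {
  isApproved(approvalId: string): boolean;
  generate(): { keyId: string; plaintext: string };
  store: AiSessionKeyRecord[];
  now(): Date;
}

const AI_SESSION_TTL_MS = 30 * 60 * 1000; // 30-minute TTL from the plan

function issueForAiSession(
  req: IssueRequest,
  deps: IssueDeps
): { api_key: string; expires_at: string; key_id: string } {
  // No approval, no credentials.
  if (!deps.isApproved(req.approvalId)) {
    throw new Error("approval not granted; refusing to issue credentials");
  }
  const issuedAt = deps.now();
  // Deactivate any still-active key for the same conversation before minting.
  for (const existing of deps.store) {
    if (existing.active && existing.metadata.chat_id === req.chatId) {
      existing.active = false;
    }
  }
  const { keyId, plaintext } = deps.generate();
  const expiresAt = new Date(issuedAt.getTime() + AI_SESSION_TTL_MS);
  deps.store.push({
    keyId,
    purpose: "ai_session",
    usageLimit: 1,
    usageCount: 0,
    expiresAt,
    active: true,
    metadata: {
      chat_id: req.chatId,
      function_call_id: req.functionCallId,
      approval_id: req.approvalId,
      issued_by_user_id: req.userId,
      issued_at: issuedAt.toISOString(),
    },
  });
  return { api_key: plaintext, expires_at: expiresAt.toISOString(), key_id: keyId };
}
```

Only the returned object carries the plaintext key; the stored record in this sketch keeps the identifier and metadata, not the secret.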

Key Consumption & Enforcement

  1. Downstream API handlers continue to use withApiKeyAuth; extend validation to fetch purpose, metadata, and usage_limit.
  2. On each request, increment usage_count in a transaction; if the incremented count exceeds usage_limit, or the key is expired or inactive, return 401 and deactivate the key.
  3. Enforce scope checks using metadata (e.g., ensure chat_id matches request header/context, ensure only approved function endpoints are callable); rely on existing RBAC permissions to gate endpoint access—no additional allowlist layer required.
  4. Surface key details on the req.context object so authorization layers can make chat-aware decisions (e.g., map to the issuing user for RBAC checks).
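
A minimal sketch of the enforcement order described above, using an in-memory key object in place of the api_keys row. The KeyState shape, the consumeKey name, and the reason strings are assumptions; in a real implementation the increment and check would need to happen in a single database transaction or statement to stay atomic.

```typescript
// Hypothetical in-memory view of one api_keys row.
interface KeyState {
  usageLimit: number | null; // null means unlimited, matching the nullable column
  usageCount: number;
  expiresAt: Date;
  active: boolean;
  chatId: string;
}

type ConsumeResult = { ok: true } | { ok: false; reason: string };

function consumeKey(key: KeyState, requestChatId: string, now: Date): ConsumeResult {
  if (!key.active) {
    return { ok: false, reason: "inactive" };
  }
  if (key.expiresAt.getTime() <= now.getTime()) {
    key.active = false; // expired keys are deactivated on sight
    return { ok: false, reason: "expired" };
  }
  // Scope check: the key is only valid for the chat it was issued to.
  if (key.chatId !== requestChatId) {
    return { ok: false, reason: "scope_mismatch" };
  }
  key.usageCount += 1;
  if (key.usageLimit !== null) {
    if (key.usageCount > key.usageLimit) {
      key.active = false;
      return { ok: false, reason: "usage_limit_exceeded" };
    }
    if (key.usageCount >= key.usageLimit) {
      // This request is the last one allowed; deactivate for any that follow.
      key.active = false;
    }
  }
  return { ok: true };
}
```

With usage_limit = 1, the first call succeeds and leaves the key inactive, so a replayed request fails on the inactive check.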

Revocation & Cleanup

  1. Immediate cleanup: when the AI reports function completion or approval is revoked, call TemporaryApiKeyService.revoke(keyId, reason) to deactivate the key and annotate its metadata.
  2. Scheduled cleanup: add pg-boss job cleanup-expired-ai-keys that runs every 10 minutes, selecting purpose = 'ai_session' keys with expires_at < now() and active = true, deactivating them and logging summary metrics.
  3. User sign-out/tenant disable: hook into existing sign-out flows to revoke outstanding AI session keys for that user.
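
The scheduled cleanup selection could look roughly like this, shown over an in-memory array rather than a SQL query. The StoredKey shape and the cleanupExpiredAiKeys function name are assumptions, and registering the handler with pg-boss is deliberately left out of the sketch; only the standard cron expression for "every 10 minutes" is shown.

```typescript
// Standard cron expression for "every 10 minutes".
const CLEANUP_CRON = "*/10 * * * *";

// Hypothetical subset of the api_keys columns the job needs.
interface StoredKey {
  keyId: string;
  purpose: string;
  expiresAt: Date;
  active: boolean;
}

// Mirrors the selection in the plan: purpose = 'ai_session', expires_at < now(),
// active = true. Returns the ids it deactivated so the caller can log summary metrics.
function cleanupExpiredAiKeys(keys: StoredKey[], now: Date): { deactivated: string[] } {
  const deactivated: string[] = [];
  for (const key of keys) {
    const isExpired = key.expiresAt.getTime() < now.getTime();
    if (key.purpose === "ai_session" && key.active && isExpired) {
      key.active = false;
      deactivated.push(key.keyId);
    }
  }
  return { deactivated };
}
```

Returning the deactivated ids rather than a bare count leaves room for per-key audit events as well as aggregate metrics.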

Telemetry & Auditing

  • Emit structured logs on issuance, consumption, revocation, and failed validations (include tenant, chat_id, key_id, reason).
  • Send audit events to existing security/audit trail (if available) so administrators can review AI-initiated actions.
  • Track metrics via existing OpenTelemetry/PostHog instrumentation (number of keys issued, consumption success/failure, cleanup counts).

Function Call Definition Architecture

  • Registry Source (ee/server/src/chat/registry):
    • apiRegistry.schema.ts: Zod schema + TypeScript types describing each callable endpoint (id, name, description, tags, RBAC hints, required params, examples).
    • apiRegistry.overrides.ts: developer-maintained map of task metadata (playbooks, grouping, curated examples) produced from YAML/JSON files under ee/docs/api-registry/.
    • apiRegistry.generated.ts: build artifact from ee/scripts/generate-chat-registry.ts combining the enterprise OpenAPI spec with overrides.
    • apiRegistry.indexer.ts: optional helper to embed/serialize registry entries for semantic search (exports vector metadata for Postgres/pgvector or local cosine search).
  • Tool Implementations (ee/server/src/chat/tools):
    • searchApiRegistryTool.ts: implements search_api_registry, calling the indexer + returning top matches (id, summary, confidence, example usage).
    • describeApiFunctionTool.ts: resolves an entry by id, returning full schema/examples plus approval guidance.
    • invokeApiFunctionTool.ts: orchestrates approval check, temporary key issuance, HTTP invocation, and structured result payloads.
    • index.ts: central export consumed by ChatFunctionRouter with tool metadata for the OpenAI function-calling interface.
  • Chat Integration (packages/product-chat/ee):
    • services/functionPlanner.ts: helper invoked by the chat controller to decide when to call search_api_registry vs. direct execution.
    • components/ApprovalsPanel.tsx: renders the selected function (from describe_api_function) along with parameters for human approval.
  • Documentation & Playbooks (ee/docs/api-registry):
    • YAML/Markdown files describing canonical tasks (e.g., tickets.create.yaml) referenced by registry entries.
    • Editors update these files; the build script merges their metadata into apiRegistry.generated.ts on demand (or during prebuild).
  • Testing:
    • Unit tests in ee/server/src/chat/tools/__tests__ validating registry search, description payloads, and invocation guardrails.
    • Integration harness in packages/product-chat/ee/test simulating chat flows with mocked registry + approvals.
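
To make the search_api_registry contract concrete, here is a minimal sketch that returns top matches with a confidence score. It swaps the embedding-based indexer described above for plain token overlap, purely to keep the example self-contained; the RegistryDoc and SearchHit shapes and the default limit of 3 are assumptions.

```typescript
// Hypothetical trimmed-down registry entry; the real schema carries more fields.
interface RegistryDoc {
  id: string;
  summary: string;
  tags: string[];
}

interface SearchHit {
  id: string;
  summary: string;
  confidence: number; // fraction of query tokens found in the entry, 0..1
}

function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((token) => token.length > 0);
}

// Token-overlap search standing in for semantic search.
function searchApiRegistry(query: string, registry: RegistryDoc[], limit = 3): SearchHit[] {
  const queryTokens = tokenize(query);
  if (queryTokens.length === 0) return [];
  const hits: SearchHit[] = [];
  for (const doc of registry) {
    const docTokens = tokenize(`${doc.id} ${doc.summary} ${doc.tags.join(" ")}`);
    const matched = queryTokens.filter((token) => docTokens.indexOf(token) !== -1).length;
    if (matched > 0) {
      hits.push({ id: doc.id, summary: doc.summary, confidence: matched / queryTokens.length });
    }
  }
  // Highest confidence first, truncated to the requested number of matches.
  return hits.sort((a, b) => b.confidence - a.confidence).slice(0, limit);
}
```

A follow-up describe_api_function call would then resolve the chosen id to its full schema before anything is invoked.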

Registry Generation Pipeline

  1. Source Spec: Use sdk/scripts/generate-openapi.ts (already part of the build) to refresh sdk/docs/openapi/alga-openapi.ee.json. Runtime fallback: GET /api/v1/meta/openapi.
  2. Extraction Script: ee/scripts/generate-chat-registry.ts loads the spec, filters for operations flagged with x-chat-callable: true (added via OpenAPI registry decorators), normalizes method/path into stable ids, and captures request/response schemas.
  3. Metadata Merge: For each id, merge override data from ee/docs/api-registry/*.yaml (playbooks, curated examples, RBAC hints, approval notes). Emit warnings for stale overrides or missing spec entries.
  4. Output Artifacts: Write ee/server/src/chat/registry/apiRegistry.generated.ts (exporting an array) and optional search index JSON under ee/server/src/chat/registry/cache/.
  5. Build Integration: Add a generate-chat-registry package script, run as pnpm --filter product-chat-ee generate-chat-registry, and invoke it during dev and build to keep artifacts in sync. Ensure CI runs the generator and fails on drift.
  6. Spec Annotations: Update OpenAPI registry decorators (via server/src/lib/api/openapi/registry.ts) to mark eligible endpoints with x-chat-callable, x-chat-display-name, and x-chat-rbac-resource so the extractor has structured inputs.
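
The extraction and merge steps might be sketched as follows. The x-chat-* extension names come from this plan; the id format produced by toStableId, the RegistryEntry and Override shapes, and the warning text are assumptions made for illustration.

```typescript
interface OpenApiOperation {
  summary?: string;
  "x-chat-callable"?: boolean;
  "x-chat-display-name"?: string;
  "x-chat-rbac-resource"?: string;
}

// Minimal slice of an OpenAPI document: path -> method -> operation.
interface OpenApiSpec {
  paths: Record<string, Record<string, OpenApiOperation>>;
}

interface RegistryEntry {
  id: string;
  method: string;
  path: string;
  displayName: string;
  rbacResource: string | null;
}

// Assumed id scheme: lower-cased method plus dot-joined path segments,
// e.g. GET /api/v1/tickets/{id} -> "get.api.v1.tickets.id".
function toStableId(method: string, path: string): string {
  const slug = path
    .replace(/[{}]/g, "")
    .split("/")
    .filter((part) => part.length > 0)
    .join(".");
  return `${method.toLowerCase()}.${slug}`;
}

// Keeps only operations flagged x-chat-callable: true, sorted for stable output.
function extractCallable(spec: OpenApiSpec): RegistryEntry[] {
  const entries: RegistryEntry[] = [];
  for (const path of Object.keys(spec.paths)) {
    const operations = spec.paths[path];
    if (!operations) continue;
    for (const method of Object.keys(operations)) {
      const op = operations[method];
      if (!op || op["x-chat-callable"] !== true) continue;
      const id = toStableId(method, path);
      entries.push({
        id,
        method: method.toUpperCase(),
        path,
        displayName: op["x-chat-display-name"] ?? op.summary ?? id,
        rbacResource: op["x-chat-rbac-resource"] ?? null,
      });
    }
  }
  return entries.sort((a, b) => a.id.localeCompare(b.id));
}

interface Override {
  playbook: string;
}

// Merges override metadata by id and warns about overrides with no matching entry.
function mergeOverrides(entries: RegistryEntry[], overrides: Record<string, Override>) {
  const known: Record<string, boolean> = {};
  const merged = entries.map((entry) => {
    known[entry.id] = true;
    const override = overrides[entry.id];
    return { ...entry, playbook: override ? override.playbook : null };
  });
  const warnings = Object.keys(overrides)
    .filter((id) => !known[id])
    .map((id) => `stale override: ${id}`);
  return { merged, warnings };
}
```

Sorting the output by id keeps the generated file deterministic, which is what allows CI to fail on drift by comparing artifacts.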

Decisions & Notes

  • Function calling becomes the default path—no legacy/non-function-call model support required.
  • Function execution is deferred until explicit user approval; denied requests short-circuit without issuing credentials.
  • Telemetry relies on existing OpenTelemetry/PostHog pipelines; no new collectors are needed beyond event tagging.
  • Temporary AI keys use a rolling 30-minute TTL and are reissued on demand if absent/expired to follow the user's session.
  • Endpoint access remains governed by the user's RBAC permissions; no additional allowlist layer is required.

Rollout Plan

  • Land server + client changes behind an environment flag.
  • Verify in staging with representative conversations.
  • Enable in production for internal users before broad rollout.