Hermes 284313f908
Some checks are pending
Bidi Control Character Guard / bidi-control-guard (push) Waiting to run
Circular Dependency Check / Check for new circular dependencies (push) Waiting to run
Citus Migration Smoke / Combined migrations on single-node Citus (push) Waiting to run
E2E Fresh Install Tests / fresh-install-e2e (push) Waiting to run
ext-v2 guardrails / Run ext-v2 guard and ESLint (push) Waiting to run
Integration Tests / Check for relevant changes (push) Waiting to run
Integration Tests / ${{ (github.event_name == 'schedule' || github.event.inputs.suite == 'full') && 'Full integration suite' || 'Tier-1 integration subset' }} (push) Blocked by required conditions
Mobile checks / Mobile lint + typecheck (push) Waiting to run
Mobile checks / Mobile unit tests (push) Waiting to run
Mobile checks / Mobile dependency audit (report) (push) Waiting to run
Mobile checks / Mobile reproducibility checks (push) Waiting to run
Secrets guard (env backups) / Ensure no tracked env backup files (push) Waiting to run
Temporal Readiness / fast-readiness (push) Waiting to run
Temporal Readiness / docker-parity (push) Waiting to run
TypeScript Type Check / Nx affected typecheck (push) Waiting to run
Unit Tests / Skipped-test budget (push) Waiting to run
Unit Tests / Nx affected unit tests (push) Waiting to run
Unit Tests / Server unit coverage (informational) (push) Waiting to run
Validate Tenant Management Schema / Check for relevant changes (push) Waiting to run
Validate Tenant Management Schema / Validate Tenant Management Schema (push) Blocked by required conditions
EE Workflows Build Guard / ee-workflows-build-guard (push) Waiting to run
Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

10 KiB
Raw Permalink Blame History

PRD: Teams Admin Diagnostics and Proactive Test Message

Plan slug: 2026-05-29-teams-diagnostics-test-message Owning area: EE / Microsoft Teams addon (ee/packages/microsoft-teams) + packages/integrations settings UI Related plans:

  • .ai/teams_improvements/microsoft-teams-addon-competitive-parity-plan.md (Phase 1: "Setup Wizard, Diagnostics, and Test Message")
  • ee/docs/plans/2026-05-24-teams-observability-loop/ (completed, PR #2562, F001F050) — built the teams_notification_deliveries, teams_audit_events, and teams_conversation_references tables this plan consumes.

Problem Statement

The Teams addon is a functional v0 (bot, message extension, quick actions, meetings, package generation, entitlement gating, and — as of the observability loop — persisted delivery/audit records). But an admin still has no way to confirm a tenant is correctly wired up except by reading server logs. There is no diagnostics surface and no test-message path. This is the gating blocker for charging: a non-developer admin cannot self-verify that profile, Graph token, bot credentials, user linkage, and delivery all work end-to-end.

Separately, the observability loop added teams_conversation_references (written on every inbound bot activity) and the repo already has sendBotActivity() — the full Bot Framework proactive-send primitive. Neither is consumed by anything yet. Building the test message on the proactive path lights up both, and validates the exact mechanism Phase 2 channel delivery will depend on.

User Value

  • MSP admins can run one-click diagnostics and a test message to confirm Teams is working, without engineering.
  • Diagnostics distinguish failure classes (missing addon, inactive integration, unready profile, missing package/base URL, missing bot credentials, missing user linkage, missing conversation reference, recent Graph delivery failure) so the fix is obvious.
  • Engineers get a proven proactive-send path (the building block for Phase 2 channel notifications) exercised in production.
  • Sales/demo risk drops — the addon becomes self-verifiable, which is the precondition for packaging it as paid.

Goals

  1. A runTeamsDiagnostics() server action returns a structured, step-based report (pass/warn/fail/skip per check + aggregated recommendations), modeled on runMicrosoft365Diagnostics().
  2. A sendTeamsTestMessage() server action delivers a synthetic message to the calling admin via the proactive bot path (sendBotActivity against the admin's stored teams_conversation_references row) and records a teams_notification_deliveries row.
  3. A diagnostics + test-message panel is added to TeamsIntegrationSettings.tsx (incremental — not a full wizard rebuild).
  4. A read helper for teams_conversation_references exists (the table's first consumer).
  5. All new DB reads are tenant-scoped; both actions are wrapped in withAuth and gated on the same permission as saveTeamsIntegrationSettings.
  6. No new migration — reuse teams_notification_deliveries (test row) and teams_conversation_references (lookup).

Non-Goals (Explicit)

  • No teams_channel_mappings, channel routing, or channel delivery. (Phase 2.)
  • No expanded notification categories (ticket_created, etc.). (Phase 2.)
  • No teams_user_preferences / quiet hours / mention-only. (Phase 2.)
  • No full 4-step setup wizard restructure — only add a diagnostics/test panel to the existing settings page.
  • No 402/403 entitlement-response standardization. (Open Decision, separate PR.)
  • No configurable channel tab, no trial flow, no metering.
  • No LLM/fuzzy bot intent, no SSO token exchange.
  • No new metrics export / Prometheus / log shipping.
  • No change to existing notification delivery or bot reply behavior — diagnostics/test are additive read + a new send path.

Target Users

  • MSP admins in Settings → Integrations → Teams.
  • Engineers debugging a tenant's Teams setup in staging/production.

Primary Flows

Flow A: Run diagnostics

  1. Admin clicks "Run diagnostics" in Teams settings.
  2. runTeamsDiagnostics() executes ordered checks, each producing {status: pass|warn|fail|skip, detail, data?, error?}:
    • addon entitlement active (getTeamsAvailability)
    • integration row exists + install_status = 'active'
    • capabilities include personal_bot + activity_notifications
    • selected Microsoft profile exists, not archived, has client secret ref
    • package metadata present + base URL resolvable
    • bot connector credentials configured (isBotConnectorConfigured)
    • calling admin's Microsoft user linkage present
    • conversation reference present for that admin
    • recent delivery health: most recent success + most recent failure (tenant-scoped read of teams_notification_deliveries)
  3. Report aggregates overallStatus (fail if any fail; warn if any warn; else pass) and a deduped recommendations list.
  4. UI renders the step list with status badges + recommendations.

Flow B: Send test message (proactive)

  1. Admin clicks "Send test message".
  2. sendTeamsTestMessage() resolves availability → admin Microsoft link → latest personal conversation reference.
  3. If addon inactive / integration inactive / bot not configured / no linkage / no conversation reference → returns a skipped result with an actionable reason (e.g. "message the bot once first") and records a skipped delivery row.
  4. Otherwise builds a test activity and calls sendBotActivity({serviceUrl, conversationId, activity}).
  5. Records a teams_notification_deliveries row (status sent/failed, destination_type = 'bot_test', category = 'test', actor metadata, idempotency key with attempt nonce).
  6. UI shows success or the mapped skip/failure reason.

Data Model / Integration Notes

  • No schema migration. Reuse:
    • teams_notification_deliveriescategory is nullable, no CHECK; status CHECK ∈ {skipped,sent,delivered,failed}; destination_type is free text NOT NULL. Test rows use category='test', destination_type='bot_test', status ∈ {skipped,sent,failed}.
    • teams_conversation_references — PK (tenant, microsoft_user_id, conversation_id); columns include service_url, conversation_type, last_activity_at. Reader selects newest personal row per (tenant, microsoft_user_id).
  • Transport already exists: teamsBotConnector.ts::sendBotActivity() (token via client-credentials → https://api.botframework.com/.default; trusted-serviceUrl suffix check; POST to /v3/conversations/{id}/activities). Credentials from env TEAMS_BOT_APP_ID / TEAMS_BOT_APP_TENANT_ID / TEAMS_BOT_APP_PASSWORD (already in helm/templates/{deployment,secret}.yaml).
  • PSA-user → Microsoft-user mapping (VERIFIED): resolveTeamsRecipientLink(tenant, userId) returns {providerAccountId}, and providerAccountId is the AAD oid. Conversation references are keyed by microsoft_user_id = activity.from.aadObjectId (teamsConversationReferences.ts:48-50). nextAuthOptions.ts:881-882 deliberately stores claims.oid as the Microsoft link's provider_account_id because it equals aadObjectId (explicit comment at 878-883). The bot's existing resolveTeamsLinkedUser already relies on this. So the test message can feed providerAccountId straight into the conversation-reference reader — no normalization helper needed. Known edge (pre-existing, not introduced here): the admin bulk backfill flow (ssoActions.ts:308) stores provider_account_id = lowerEmail; Microsoft links created that way won't match a bot conversation reference (the bot already can't resolve them). Diagnostics should surface this (F021/F022 warnings), not fix it.
  • Diagnostics report shape: mirror Microsoft365DiagnosticsReport / ...Step (shared/services/email/providers/MicrosoftGraphAdapter.ts:745+) — step id/title/status/durationMs/data/error, recommendations: string[], overallStatus.
  • Action wiring: mirror teamsObservabilityActions.ts (export const x = withAuth(impl)). Rebuild the package (tsup → dist) so the server picks up new exports (prior loop F045).

UX / UI Notes

  • New Card in TeamsIntegrationSettings.tsx below the existing config/package cards: title "Diagnostics & Test Message".
  • "Run diagnostics" button → step list, each row: status badge (pass=green / warn=amber / fail=red / skip=grey) + title + detail; recommendations rendered as a bullet list below.
  • "Send test message" button → success or friendly skip/error message; the missing_conversation_reference skip maps to "Open the Alga PSA bot in Teams and send it any message first, then retry."
  • Both buttons disabled when addon missing or integration not active (reuse existing canPersist-style gating).
  • All strings i18n'd with defaultValue fallbacks, mirroring existing integrations.teams.settings.* keys.

Risks / Open Questions

  • Mapping risk — RESOLVED (verified in code): provider_account_id (Microsoft) = AAD oid = microsoft_user_id by design (nextAuthOptions.ts:881-882 + teamsConversationReferences.ts:48-50). No normalization needed. Residual edge is the email-backfill linking path, which diagnostics surfaces rather than fixes.
  • Bot credentials in dev: isBotConnectorConfigured() is false without env creds, so test message is a no-op locally — diagnostics must report this clearly rather than appearing broken.
  • Permission gate: confirm the exact permission saveTeamsIntegrationSettingsImpl enforces and reuse it.
  • Idempotency for repeated tests: include an attempt nonce in the test delivery idempotency key so each click records a distinct row (the table has a UNIQUE on (tenant, idempotency_key)).

Acceptance Criteria / Definition of Done

  • runTeamsDiagnostics() returns all listed checks with correct pass/warn/fail/skip classification and an accurate overallStatus + recommendations; tenant-scoped; permission-gated; covered by unit tests.
  • sendTeamsTestMessage():
    • on a healthy tenant, sends via sendBotActivity and records a sent delivery row;
    • on each unhealthy precondition, returns the correct skip reason and records a skipped row;
    • on transport failure, records a failed row;
    • all writes tenant-scoped; covered by unit/integration tests.
  • The reader returns the newest personal conversation reference per (tenant, microsoft_user_id), tenant-scoped, null-safe.
  • Settings panel renders diagnostics steps + recommendations and the test-message result, with correct disabled states.
  • No new migration; no change to existing delivery/bot-reply behavior.
  • @alga-psa/microsoft-teams rebuilt so the server resolves the new exports.