PSA/ee/docs/plans/2025-11-03-multi-provider-sso-integration-plan.md
Hermes 284313f908
Some checks are pending
Bidi Control Character Guard / bidi-control-guard (push) Waiting to run
Circular Dependency Check / Check for new circular dependencies (push) Waiting to run
Citus Migration Smoke / Combined migrations on single-node Citus (push) Waiting to run
E2E Fresh Install Tests / fresh-install-e2e (push) Waiting to run
ext-v2 guardrails / Run ext-v2 guard and ESLint (push) Waiting to run
Integration Tests / Check for relevant changes (push) Waiting to run
Integration Tests / ${{ (github.event_name == 'schedule' || github.event.inputs.suite == 'full') && 'Full integration suite' || 'Tier-1 integration subset' }} (push) Blocked by required conditions
Mobile checks / Mobile lint + typecheck (push) Waiting to run
Mobile checks / Mobile unit tests (push) Waiting to run
Mobile checks / Mobile dependency audit (report) (push) Waiting to run
Mobile checks / Mobile reproducibility checks (push) Waiting to run
Secrets guard (env backups) / Ensure no tracked env backup files (push) Waiting to run
Temporal Readiness / fast-readiness (push) Waiting to run
Temporal Readiness / docker-parity (push) Waiting to run
TypeScript Type Check / Nx affected typecheck (push) Waiting to run
Unit Tests / Skipped-test budget (push) Waiting to run
Unit Tests / Nx affected unit tests (push) Waiting to run
Unit Tests / Server unit coverage (informational) (push) Waiting to run
Validate Tenant Management Schema / Check for relevant changes (push) Waiting to run
Validate Tenant Management Schema / Validate Tenant Management Schema (push) Blocked by required conditions
EE Workflows Build Guard / ee-workflows-build-guard (push) Waiting to run
Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

9.9 KiB
Raw Blame History

Multi-Provider SSO Integration Plan

Overview

  • Deliver Google Workspace and Microsoft 365 SSO that fits the existing multi-tenant NextAuth.js stack without disrupting current credential flows.
  • Normalize identity data across providers so tenant routing, OTT vanity handoff, and user_type logic continue to function uniformly.
  • Provide a guided migration path that lets password-based users link SSO identities safely and allows ops to monitor and enforce adoption by tenant.

Phased Technical Plan

Phase 0 Foundations and Provider Enablement

  • Provision Azure AD enterprise app, confirm Google OAuth project, and capture client IDs, secrets, tenant IDs, and redirect URIs.
  • Update secret loader so Microsoft OAuth values (client ID, secret, authority/tenant) are available alongside Google via getNextAuthSecret.
  • Add Microsoft provider configuration (e.g., AzureADProvider) and refresh Google provider options to share the common claim extractor.
  • Refresh .env.example, devbox/dev compose templates, and developer docs to include both providers for local testing.
  • Document OAuth secrets and redirect URIs in EE configuration guides while leaving CE .env.example entries absent or commented.
  • Wrap provider registration in server/src/app/api/auth/[...nextauth]/options.ts with the existing isEnterprise guard so CE builds resolve to stubs.

Phase 1 Provider Integration and Claim Normalization

  • Implement a shared profile mapper that converts Google and Microsoft payloads into the ExtendedUser schema.
  • Extend signIn and jwt callbacks to apply tenant resolution from query tenant_hint, vanity-domain headers, or email-domain heuristics when provider data is incomplete.
  • Invoke existing user validation (active status, tenant membership, user_type) within the OAuth callback before token issuance.
  • Confirm session and redirect callbacks read normalized claims so OTT and redirect flows behave consistently across providers.
  • Place provider adapters, claim mappers, and account-link helpers in ee/server/src/lib/auth/ssoProviders.ts with matching stubs in server/src/empty/lib/auth/ssoProviders.ts.

Phase 2 Account Linking and Migration Path

  • Create or extend a user_auth_accounts table keyed by user ID and provider (google|microsoft) with provider subject IDs and metadata.
  • Deliver an authenticated “Connect SSO” flow that revalidates password and TOTP before capturing OAuth provider details.
  • Update credential login surfaces to detect linked providers, display SSO prompts, and steer users toward OAuth flows (local 2FA suppression to follow policy controls).
  • Build a batch backfill script for federated email domains and log unresolved accounts for manual review.
  • Store schema migrations for the new linking table under ee/server/migrations and supply CE no-op stubs.
  • Serve SSO buttons and the “Connect SSO” settings page from @ee components/pages with CE stubs in server/src/empty.

Phase 3 Built-in SSO Bulk Assignment UI

  • Extract bulk assignment preview/execute helpers into ee/server/src/lib/actions/ssoActions.ts (with CE stubs in server/src/empty) so the existing backfill script shares logic and only admins with settings.update can run it.
  • Build SsoBulkAssignment.tsx container plus SsoBulkAssignmentForm.tsx to render provider toggles, multi-domain input, and an internal/client selector with preview and execute buttons wired to the new actions.
  • Surface preview results inline (counts for linked, skipped, already linked) and trigger toast/audit events on assignment; ensure preview falls back gracefully when no domains or providers are selected.
  • Add a "Single Sign-On" tab to SecuritySettingsPage.tsx navigation, mount the bulk assignment UI, and keep Roles/Permissions/Policies unaffected.
  • Extend acceptance coverage to include UI-driven dry-run/assign flows and confirm CE/enterprise builds resolve the new components and actions correctly.

Decision (2025-11-06): Provider opt-in per user

  • Summary: New staff are explicitly classified as internal (password/TOTP only) or assigned a specific SSO provider at creation time. Auto-detection on login is disabled unless an admin enabled at least one provider for that user. Phase 3 still delivers the bulk assignment tooling so admins can flip providers for many accounts at once.
  • Reasoning: Automatically trusting any matching email introduced takeover risk (e.g., an attacker controlling an external domain or recycled mailbox could satisfy a loose email match). Requiring admins to opt users into a provider maintains least privilege and keeps forensic/audit signals intact.
  • Intended behavior:
    • User creation UI/API exposes a selector: Internal (password only) vs Google vs Microsoft (extendable). Default remains Internal.
    • Login callbacks accept OAuth responses only when the user record lists that provider; otherwise the attempt is rejected and logged.
    • Bulk assignment preview/execute flows remain valuable to migrate cohorts (entire domains, departments) into a provider without editing each profile.
    • Existing auto-link toggle is scoped to future automation that pre-assigns providers; it no longer blindly creates links during login without an admin decision.

OUT OF BAND Rollout, Monitoring, and Policy Controls

  • Introduce feature flags or configuration to enable SSO providers per tenant/portal for controlled rollout.
  • Instrument telemetry to capture provider usage, OTT handoffs, migration completions, and repeated password fallbacks.
  • Add policy controls allowing tenants to require SSO and to determine whether local TOTP remains after OAuth logins.
  • Publish operational runbooks covering break-glass password resets, tenant onboarding checklists, and SSO troubleshooting.
  • Update scripts/build-enterprise.sh, validate CE Docker builds resolve @ee imports to stubs, and gate new OAuth integration tests behind process.env.EDITION === 'enterprise'.

Acceptance Tests

  • EE Connect SSO flow: verifies password + TOTP gating, sets nonce cookie, and redirects through selected provider; linking renders in Account Management afterwards.
  • Migration banner journey: credential user with new SSO link sees call-to-action on next password login, completes OAuth sign-in, and subsequent logins default to provider without local 2FA prompts.
  • MSP credential form: with linked provider returns SSO prompt, Sign in with <provider> launches OAuth and lands on dashboard.
  • Client portal credential form: with tenant slug + linked provider displays SSO banner and respects vanity domain callback handling.
  • JWT/session callbacks: ensure tenant slug and contact/client data persist across logins for both Google and Microsoft profiles.
  • Backfill script dry-run/live modes: process sample users without creating duplicates; verify CE builds skip EE scripts/components and migration prompts respect pre-linked accounts.

Background and Investigational Notes

Existing Authentication Architecture

  • Tech stack: NextAuth.js with JWT strategy (configurable SESSION_MAX_AGE), custom session cookies, on-demand user validation per request.
  • Providers in place: Google OAuth using GoogleProvider; Keycloak integration via KeycloakProvider; custom credentials provider with password + 2FA.
  • User portals: Internal MSP staff sign in at /auth/msp/signin (user_type: internal); client users sign in at /auth/client-portal/signin (user_type: client, with clientId and contactId requirements).
  • Core files: NextAuth handler (server/src/app/api/auth/[...nextauth]/route.ts), options (.../options.ts), credential logic (server/src/lib/actions/auth.tsx), registration/reset, session cookies, and portal-specific forms.
  • JWT/session callbacks: signIn tracks last login and client redirects; jwt populates claims (id, email, tenant, user_type); session turns tokens into session objects; redirect routes users by user_type.
  • Client portal handoff: Vanity domain redirect uses OTT tokens via computeVanityRedirect and PortalSessionHandoff.tsx.

Credentials Flow Snapshot

  • Email/password checked against database hashed password.
  • 2FA enforced when two_factor_enabled using TOTP codes (passed as twoFactorCode).
  • JWT issued with tenant and user metadata; session callback mirrors data; redirect logic handles portal selection.

OAuth Flow Snapshot

  • User triggers provider button; profile callback runs.
  • Google: Currently requires existing DB user by email and verifies active status; assigns default user_type (internal).
  • Keycloak: Accepts profile data with tenant/user_type claims.
  • Once profile is accepted, standard JWT/session callbacks run.

Investigation Takeaways Relevant to Plan

  • Need consistent claim normalization so OTT and redirect logic remain unchanged across providers.
  • Tenant determination for OAuth logins is currently limited; must combine query hints, vanity headers, or email-domain mapping.
  • 2FA bypass expectations differ by provider; policy controls will decide whether to trust external 2FA or enforce local TOTP post-login.
  • Account linking is required to prevent duplicate user records and to let existing credential users migrate smoothly.
  • Future enhancements may include auto-provisioning (SCIM/Azure AD) and Google auto-provisioning; plan leaves hooks for these but focuses on core SSO enablement.

SSO Migration Tooling

  • Use pnpm tsx ee/scripts/backfill-sso-links.ts --provider=<google|microsoft> --domain=example.com[,domain2.com] [--dry-run] to seed user_auth_accounts rows for federated domains.
  • The script skips inactive users, records metadata indicating the backfill source, and treats email addresses as provisional provider IDs until real OAuth logins occur.
  • Run in --dry-run mode first to review counts, then execute without the flag to persist links; reruns are idempotent and update only matching tenant/user combinations.