Calendar Integrations Completion Plan
Purpose & Overview
Stabilize and complete the Google and Microsoft calendar integrations so MSPs can rely on the feature for production scheduling. The current implementation covers large portions of the happy path, but critical defects in OAuth persistence, tenant scoping, webhook processing, and UI compliance prevent real-world use. This plan hardens storage/security, fixes the sync pipeline, wires the event-driven flow, and adds the missing operational guardrails and tests.
Current State Findings (Completed)
- OAuth persistence breaks: CalendarProviderService.updateProvider merges camelCase objects into snake_case tables (server/src/services/calendar/CalendarProviderService.ts:202), so OAuth callbacks fail to write client credentials or tokens.
- Secrets exposed to clients: mapDbRowToProvider returns access/refresh tokens and client secrets to the caller (CalendarProviderService.ts:364), and getCalendarProviders surfaces that data to the browser (server/src/lib/actions/calendarActions.ts:90 via CalendarIntegrationsSettings.tsx).
- Tenant authenticity unchecked: CalendarProviderService.deleteProvider and updateProvider operate on raw IDs without verifying the caller’s tenant (CalendarProviderService.ts:234,312).
- State/nonce unverified: initiateCalendarOAuth issues base64 state blobs but never records or validates them on callback (server/src/lib/actions/calendarActions.ts:12), so cross-tenant or replay attacks are possible.
- Manual sync placeholder: syncCalendarProvider still returns a TODO stub (calendarActions.ts:378).
- Webhook processing lacks tenant context: CalendarWebhookProcessor pulls a random tenant from createTenantKnex() and never wraps downstream calls in runWithTenant (CalendarWebhookProcessor.ts:163), so schedule entry lookups fail.
- New entry sync is invalid: mapExternalEventToScheduleEntry never sets work_item_type, causing ScheduleEntry.create to violate the NOT NULL constraint when webhooks create entries (eventMapping.ts:159).
- Event bus plumbing incomplete: Schedule entry CRUD never publishes the SCHEDULE_ENTRY_* events the subscriber expects, so outbound sync is never triggered.
- Data hygiene gaps: fetchUserIdsByEmail lowercases keys but callers do not (eventMapping.ts:134), and CalendarProviderService.getProviders only honors isActive while the subscriber passes active (CalendarProviderService.ts:60 + calendarSyncSubscriber.ts:52).
- Provider state never advances: Sync jobs do not update calendar_providers.last_sync_at or emit CALENDAR_SYNC_* / CALENDAR_CONFLICT_DETECTED events.
- UI non-compliant: Buttons and interactive elements in CalendarIntegrationsSettings.tsx and related forms lack required id attributes.
- Operational safeguards missing: No background job renews Microsoft webhook subscriptions; Google Pub/Sub provisioning remains manual/document-only; conflict notifications stop at console logs.
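The first finding above comes down to a key-casing mismatch between the update payload and the table columns. As a minimal sketch of the kind of normalization that would avoid it, assuming vendor config payloads are flat objects with camelCase keys: the helper names `camelToSnake` and `toSnakeCaseKeys` and the sample field names are hypothetical and not taken from the codebase.

```typescript
// Hypothetical helper: convert one camelCase key to snake_case.
function camelToSnake(key: string): string {
  return key.replace(/[A-Z]/g, (ch) => `_${ch.toLowerCase()}`);
}

// Hypothetical helper: rewrite the keys of a flat payload so they match
// snake_case column names before the object is persisted.
function toSnakeCaseKeys(payload: Record<string, unknown>): Record<string, unknown> {
  const row: Record<string, unknown> = {};
  for (const key of Object.keys(payload)) {
    const value = payload[key];
    // Skip undefined so a partial update does not overwrite columns with NULL.
    if (value === undefined) continue;
    row[camelToSnake(key)] = value;
  }
  return row;
}

// Example with made-up field names.
const exampleRow = toSnakeCaseKeys({
  clientId: "abc",
  refreshToken: "r1",
  tokenExpiresAt: undefined,
});
// exampleRow is { client_id: "abc", refresh_token: "r1" }
```

Dropping undefined values is a design choice in this sketch; a caller that needs to clear a column would pass null explicitly.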
Goals & Non-Goals
Goals
- Ensure OAuth flows reliably persist credentials, respect tenant boundaries, and keep secrets server-side.
- Make inbound/outbound sync resilient: tenant-aware, conflict-aware, and capable of creating or updating entries end-to-end.
- Deliver production-grade webhook + event bus behavior (including renewal jobs and conflict notifications).
- Align UI with internal standards (ID attributes, error handling) while keeping operators informed of sync status.
- Establish automated test coverage (unit/integration/e2e) and documentation so the feature can ship with confidence.
Non-Goals
- Adding new calendar providers beyond Google and Microsoft.
- Building advanced scheduling UX (bulk sync dashboards, multi-calendar UIs) beyond the existing settings surfaces.
- Implementing tenancy-wide scheduling analytics; focus remains on sync correctness and operability.
Phase 1 – Provider Auth & Storage Hardening
- Fix vendor config updates: Normalize vendor payloads to snake_case before persistence and remove camelCase keys in CalendarProviderService.updateProvider (server/src/services/calendar/CalendarProviderService.ts).
- Stop leaking secrets:
  - Store access/refresh tokens encrypted via the secret provider or another at-rest mechanism.
  - Update mapDbRowToProvider / getCalendarProviders to omit sensitive fields; expose token status via derived booleans instead.
- Enforce tenant ownership: Require tenant filters on getProvider, updateProvider, and deleteProvider; add guard clauses in calendarActions to abort cross-tenant access.
- Persist & validate OAuth state: Record nonce+tenant keys (Redis or DB) when initiateCalendarOAuth is called; reject callbacks whose state is missing, expired, or mismatched.
- Confirm redirect URI hygiene: Centralize redirect URI derivation and ensure it is stored once the provider is connected.
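The "persist & validate OAuth state" item could look roughly like the sketch below. It uses an in-memory map as a stand-in for Redis or a DB table, and the class name `OAuthStateStore`, its method names, and the 10-minute default TTL are all assumptions made for illustration.

```typescript
// Hypothetical shape of a pending OAuth authorization.
interface PendingOAuthState {
  tenant: string;
  expiresAt: number; // epoch milliseconds
}

// In-memory stand-in for Redis or a DB table; all names here are illustrative.
class OAuthStateStore {
  private pending = new Map<string, PendingOAuthState>();

  constructor(
    private generateNonce: () => string, // should be backed by a CSPRNG in real use
    private now: () => number = () => Date.now(),
    private ttlMs: number = 10 * 60 * 1000, // assumed TTL, not from the plan
  ) {}

  // Called when the OAuth flow starts; the returned nonce travels in `state`.
  issue(tenant: string): string {
    const nonce = this.generateNonce();
    this.pending.set(nonce, { tenant, expiresAt: this.now() + this.ttlMs });
    return nonce;
  }

  // Called on callback; true only for a known, unexpired, same-tenant nonce.
  consume(nonce: string, tenant: string): boolean {
    const entry = this.pending.get(nonce);
    if (!entry) return false; // missing or already used
    this.pending.delete(nonce); // single use, which blocks replays
    if (entry.expiresAt <= this.now()) return false; // expired
    return entry.tenant === tenant; // mismatched tenant is rejected
  }
}
```

In a Node.js service the nonce generator could be `randomUUID` from `node:crypto`. Deleting the entry before checking expiry and tenant means a failed callback cannot be retried with the same state value.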
Deliverables
- Updated provider service with secure storage semantics and tenant guards.
- OAuth callback flow that succeeds end-to-end with secrets left server-side.
- Regression & smoke tests for both providers exercising OAuth + provider creation.
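To illustrate "secrets left server-side", here is a sketch of an allowlist mapper that exposes token status as derived booleans. The row and view shapes are assumptions; the real calendar_providers columns and the return type of mapDbRowToProvider may differ.

```typescript
// Hypothetical row shape; the real calendar_providers columns may differ.
interface ProviderRow {
  id: string;
  tenant: string;
  provider_type: "google" | "microsoft";
  access_token: string | null;
  refresh_token: string | null;
  client_secret: string | null;
  last_sync_at: string | null;
}

// What the browser is allowed to see: no token or secret values.
interface ProviderClientView {
  id: string;
  providerType: "google" | "microsoft";
  lastSyncAt: string | null;
  hasAccessToken: boolean;
  hasRefreshToken: boolean;
  hasClientSecret: boolean;
}

function toClientView(row: ProviderRow): ProviderClientView {
  // Build the view field by field (an allowlist) so a newly added
  // sensitive column is never forwarded by accident.
  return {
    id: row.id,
    providerType: row.provider_type,
    lastSyncAt: row.last_sync_at,
    hasAccessToken: Boolean(row.access_token),
    hasRefreshToken: Boolean(row.refresh_token),
    hasClientSecret: Boolean(row.client_secret),
  };
}
```

An allowlist is preferred here over deleting known-sensitive keys from a copy of the row, because a denylist silently leaks any column added later.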
Phase 2 – Core Sync Pipeline Retrofit
- Tenant scoping: Wrap webhook handlers, manual sync, and subscriber invocations with runWithTenant(provider.tenant, ...) before touching schedule data (CalendarWebhookProcessor.ts, calendarSyncSubscriber.ts).
- Complete schedule entry mapping:
  - Default work_item_type to ad_hoc (or tenant-configured default) when absent.
  - Normalize attendee emails before lookup; tighten null/undefined guards in eventMapping.ts.
- Provider status updates: Push last_sync_at and connection status via CalendarProviderService.updateProviderStatus after successful syncs; emit CALENDAR_SYNC_STARTED/COMPLETED/FAILED events.
- Implement manual sync: Replace the TODO in syncCalendarProvider with batch logic that enumerates recent schedule changes and pushes/pulls events per provider.
- Repair filtering: Add an active alias (or fix caller) when requesting providers from background jobs (CalendarProviderService.ts, calendarSyncSubscriber.ts).
- Conflict event emission: Raise CALENDAR_CONFLICT_DETECTED through the event bus from CalendarSyncService.detectConflict, capturing metadata for downstream notifications.
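The schedule entry mapping items (default work_item_type, normalized attendee emails, null guards) can be sketched as follows. The `ExternalEvent` and `ScheduleEntryDraft` shapes and the function names are simplified assumptions, not the actual types in eventMapping.ts.

```typescript
// Hypothetical, simplified shapes; the real types in eventMapping.ts may differ.
interface ExternalEvent {
  attendees?: Array<{ email?: string | null }> | null;
  workItemType?: string | null;
}

interface ScheduleEntryDraft {
  work_item_type: string;
  attendee_emails: string[];
}

// Trim and lowercase so lookups match a map keyed by lowercased emails.
function normalizeEmail(email: string): string {
  return email.trim().toLowerCase();
}

function mapToDraft(ev: ExternalEvent, defaultWorkItemType = "ad_hoc"): ScheduleEntryDraft {
  const emails: string[] = [];
  // `?? []` guards against a missing or null attendee list.
  for (const attendee of ev.attendees ?? []) {
    if (!attendee.email) continue; // skip attendees without an address
    const normalized = normalizeEmail(attendee.email);
    if (normalized && emails.indexOf(normalized) === -1) emails.push(normalized);
  }
  return {
    // Never leave work_item_type unset, so the NOT NULL constraint holds.
    work_item_type: ev.workItemType || defaultWorkItemType,
    attendee_emails: emails,
  };
}
```

A tenant-configured default would be passed as the second argument; the sketch falls back to ad_hoc when none is given.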
Deliverables
- Successful manual sync (both directions) verified via integration tests.
- Provider rows reflect current sync status/time after manual or webhook-driven cycles.
- Event bus receives conflict + sync lifecycle events suitable for notifications/metrics.
Phase 3 – Event Bus & Webhook Reliability
- Publish schedule entry events: Emit SCHEDULE_ENTRY_CREATED/UPDATED/DELETED from schedule entry CRUD paths (server/src/lib/models/scheduleEntry.ts or service layer) so outbound sync triggers automatically.
- Harden subscriber:
  - Normalize filter usage (isActive vs active).
  - Ensure log messages include tenant and provider context.
  - Short-circuit on inactive/error providers.
- Webhook resilience:
  - Add acknowledgement and retry logic for Google Pub/Sub + Microsoft Graph failure cases.
  - Persist and reuse sync tokens (Google syncToken, Microsoft delta links) instead of re-querying 24h windows.
- Background jobs:
  - Implement scheduled renewal for Microsoft webhook subscriptions (~50 hour cadence).
  - Provide tooling/docs (or code paths) to initialize Google Pub/Sub topics/subscriptions per tenant.
- Conflict notifications: Wire the emitted conflict events into the notification system (in-app toast, email, or queue hooking) per product decision.
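For the subscriber hardening items, a small sketch of accepting both filter spellings and short-circuiting on unhealthy providers. The type names, the `shouldSync` helper, and the "disconnected" status value are assumptions; only isActive, active, and the connected/error states are mentioned in this plan.

```typescript
// Hypothetical filter shape that accepts both spellings seen in the plan.
interface ProviderFilter {
  tenant: string;
  isActive?: boolean;
  active?: boolean; // alias passed by the subscriber
}

interface NormalizedProviderFilter {
  tenant: string;
  isActive?: boolean;
}

function normalizeProviderFilter(filter: ProviderFilter): NormalizedProviderFilter {
  // Prefer the canonical key; fall back to the alias when only it is set.
  const isActive = filter.isActive ?? filter.active;
  return isActive === undefined
    ? { tenant: filter.tenant }
    : { tenant: filter.tenant, isActive };
}

// Hypothetical provider summary used to decide whether to sync at all.
interface ProviderSummary {
  id: string;
  isActive: boolean;
  connectionStatus: "connected" | "error" | "disconnected";
}

// Short-circuit: only active, connected providers are synced.
function shouldSync(provider: ProviderSummary): boolean {
  return provider.isActive && provider.connectionStatus === "connected";
}
```

Normalizing at the service boundary keeps existing callers working while the subscriber is migrated to the canonical key.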
Deliverables
- Event bus dashboards show schedule entry events flowing; subscriber actions succeed under load.
- Webhook renewer job documented and running (with observability).
- Conflict events surface to users/operators with actionable messaging.
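The selection step of the webhook renewer job might be sketched as below: on each run, pick the stored subscriptions that expire before a cutoff and renew them soonest first. The `WebhookSubscription` shape, the function name, and the lead-time parameter are assumptions, and the actual renewal call to Microsoft Graph is out of scope here.

```typescript
// Hypothetical record of a stored Microsoft Graph webhook subscription.
interface WebhookSubscription {
  providerId: string;
  tenant: string;
  expiresAt: Date;
}

// Return subscriptions that expire within `leadTimeMs` of `now`, soonest first.
function subscriptionsDueForRenewal(
  subscriptions: WebhookSubscription[],
  now: Date,
  leadTimeMs: number,
): WebhookSubscription[] {
  const cutoff = now.getTime() + leadTimeMs;
  return subscriptions
    .filter((sub) => sub.expiresAt.getTime() <= cutoff) // filter returns a new array
    .sort((a, b) => a.expiresAt.getTime() - b.expiresAt.getTime());
}
```

The lead time should be longer than the interval between job runs, otherwise a subscription could expire between two consecutive runs without ever being selected.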
Phase 4 – UI & Operator Experience
- Bring components into compliance: Add unique id attributes to every interactive element in CalendarIntegrationsSettings.tsx, GoogleCalendarProviderForm.tsx, MicrosoftCalendarProviderForm.tsx, and CalendarSyncStatusDisplay.tsx.
- Surface sync health: Expand the settings UI to display last sync time, error messages, and manual sync progress (spinners/toasts) using the new backend status fields.
- Guard destructive actions: Replace window.confirm with the standardized dialog component for provider deletion.
- Hide secrets: Ensure provider details render only non-sensitive metadata; add explicit badges for “OAuth complete” or “Action required.”
- Documentation: Captured operational onboarding steps in docs/integrations/calendar-sync-operations.md (covers OAuth app prerequisites, webhook endpoints, and cron jobs).
Deliverables
- Calendar settings page passes internal UX/ID lint checks.
- Operators can see sync status, trigger manual sync, and resolve conflicts without developer tooling.
- Up-to-date runbook for onboarding new tenants to calendar sync.
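The core of an ID compliance check can be sketched as a pure function over a list of element descriptors. How the internal UX/ID lint actually collects elements is not described in this plan, so the `InteractiveElement` shape and `auditElementIds` are hypothetical.

```typescript
// Hypothetical descriptor for an interactive element found in rendered markup.
interface InteractiveElement {
  tag: string;
  id?: string;
}

interface IdAuditResult {
  missing: InteractiveElement[]; // elements without an id
  duplicates: string[]; // ids used more than once
}

function auditElementIds(elements: InteractiveElement[]): IdAuditResult {
  const seen = new Map<string, number>();
  const missing: InteractiveElement[] = [];
  for (const el of elements) {
    if (!el.id) {
      missing.push(el);
      continue;
    }
    seen.set(el.id, (seen.get(el.id) ?? 0) + 1);
  }
  const duplicates: string[] = [];
  seen.forEach((count, id) => {
    if (count > 1) duplicates.push(id);
  });
  return { missing, duplicates };
}
```

A test could feed this function the buttons and inputs queried from a rendered settings page and assert that both result lists are empty.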
Phase 5 – Testing & Rollout
- Unit coverage: Add tests for vendor config normalizers, event mapping helpers, and webhook processors (including failure branches).
- Integration tests: Simulate full OAuth + sync flows with mocked Google/Microsoft APIs to confirm mapping creation, updates, deletions, and conflict handling.
- End-to-end smoke: Extend Playwright (or Cypress) suites to authorize a provider and validate UI-driven manual sync.
- Monitoring & alerts: Instrument success/error counters for sync events, webhook renewals, and conflict occurrences; hook alerts into the existing observability stack.
- Release plan: Stage rollout (internal tenants → beta tenants → GA), with feature flags toggled once telemetry shows stability.
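As an illustration of success/error counters with provider and tenant dimensions, here is a minimal in-memory sketch. A real deployment would use the client of the existing observability stack instead; `LabeledCounter` and the label names are assumptions.

```typescript
// Hypothetical label set for sync metrics.
interface SyncMetricLabels {
  tenant: string;
  provider: string;
  outcome: "success" | "error";
}

// Minimal in-memory counter keyed by its label values.
class LabeledCounter {
  private counts = new Map<string, number>();

  private keyFor(labels: SyncMetricLabels): string {
    return [labels.tenant, labels.provider, labels.outcome].join("|");
  }

  increment(labels: SyncMetricLabels): void {
    const key = this.keyFor(labels);
    this.counts.set(key, (this.counts.get(key) ?? 0) + 1);
  }

  value(labels: SyncMetricLabels): number {
    return this.counts.get(this.keyFor(labels)) ?? 0;
  }
}
```

Keeping outcome as a label rather than using two separate counters makes an error-rate alert a simple ratio over one metric.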
Release Readiness Acceptance Tests
- OAuth Connection Flow
  - Create Google and Microsoft providers via the UI, complete OAuth, and verify provider rows persist encrypted credentials without leaking secrets to the client.
  - Restart the server and confirm providers remain in connected state and refresh tokens are valid.
- Manual Sync Both Directions
  - Create a schedule entry in Alga, trigger the calendar sync via the server action, and verify the outbound call fires for the active provider. (Covered by server/src/test/integration/calendar/scheduleAutoSync.integration.test.ts.)
  - Modify an external event and confirm manual sync updates the corresponding schedule entry. (Covered by the inbound scenario in server/src/test/integration/calendar/manualSync.integration.test.ts.)
- Webhook Processing
  - Receive Google Pub/Sub and Microsoft Graph notifications for create/update/delete and observe tenant-scoped processing, including automatic deletion of local entries when the external event is removed. (Covered by server/src/test/integration/calendar/webhookProcessing.integration.test.ts.)
  - Force webhook failure scenarios (invalid client state, expired subscription) and confirm retries plus surfaced operator alerts.
- Conflict Handling
  - Simultaneously change an event in Alga and the external calendar, ensure conflict detection fires, CALENDAR_CONFLICT_DETECTED is emitted, the mapping is marked conflict, and the user sees a notification with resolution options.
- Provider Lifecycle & Security
  - Delete a provider and validate vendor configs, event mappings, and webhooks are removed while external events remain untouched.
  - Attempt cross-tenant access to provider IDs or OAuth callbacks and confirm permission denials with audit logs.
- UI Compliance & UX
  - Run automated checks asserting every interactive element has a unique id and that sync status, last sync timestamps, and error messages render correctly.
  - Validate the manual sync button displays progress feedback and disables while work is in-flight.
- Background Jobs
  - Advance time to validate Microsoft webhook renewal runs at the expected cadence and logs renewal outcomes.
  - Confirm cron failures emit alerts and do not silently disable webhooks.
- Telemetry & Observability
  - Trigger successful and failed syncs/webhook renewals and confirm metrics, logs, and alerts reach the observability stack with provider/tenant dimensions.
- Regression / Multi-Tenant Isolation
  - Run automated integration tests across two tenants to ensure no external events or schedule entries bleed between tenants during syncs.
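The conflict handling scenario above depends on a rule for deciding when both sides have changed. This plan does not say how CalendarSyncService.detectConflict decides, so the sketch below assumes a simple timestamp comparison against the last successful sync; the type and function names are hypothetical.

```typescript
// Hypothetical mapping state; timestamps are epoch milliseconds.
interface EventMappingState {
  lastSyncedAt: number;
  localUpdatedAt: number;
  externalUpdatedAt: number;
}

type SyncDecision = "in_sync" | "push_local" | "pull_external" | "conflict";

function decideSync(mapping: EventMappingState): SyncDecision {
  const localChanged = mapping.localUpdatedAt > mapping.lastSyncedAt;
  const externalChanged = mapping.externalUpdatedAt > mapping.lastSyncedAt;
  // Both sides changed since the last sync: neither can be overwritten safely.
  if (localChanged && externalChanged) return "conflict";
  if (localChanged) return "push_local";
  if (externalChanged) return "pull_external";
  return "in_sync";
}
```

An acceptance test for the conflict case would set both update timestamps later than lastSyncedAt and expect "conflict", which is the point at which CALENDAR_CONFLICT_DETECTED would be emitted.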
Exit Criteria
- All automated suites green (unit/integration/e2e) for calendar sync domains.
- Observability dashboards in place with alert thresholds agreed upon by ops.
- Product sign-off after beta tenants complete validation without data loss.
Dependencies & Coordination
- Secret provider team for token encryption strategy and storage limits.
- SRE/Infra for webhook domain exposure, Pub/Sub topic provisioning, and cron job scheduling.
- Notifications team for conflict alert surfaces.
- QA for multi-tenant testing (ensure no cross-tenant data bleed).
Success requires tight coordination between platform (security/infra), backend (sync/webhook), and frontend teams to land all phases before declaring GA.