Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

Scratchpad — Unified Inbound Email Queue with Pointer Jobs

  • Plan slug: unified-inbound-email-pointer-queue
  • Created: 2026-03-01

What This Is

Working notes for moving Microsoft, Google, and IMAP inbound email ingress to one pointer-based Redis queue with consume-time idempotency.

Decisions

  • (2026-03-01) Use one queue ingestion model for all inbound providers: Microsoft callback, Google callback, and IMAP listener enqueue pointer jobs only.
  • (2026-03-01) Use consume-time idempotency instead of ingress-time idempotency.
  • (2026-03-01) Queue payloads stay pointer-only (no raw MIME/attachment bytes).
  • (2026-03-01) Source-content drift for IMAP between ingest and consume is an accepted risk; an unavailable source should produce a deterministic skipped outcome.
  • (2026-03-01) Ingress success must mean durable enqueue success.
  • (2026-03-01) F001 implemented by defining UnifiedInboundEmailQueueJob as a discriminated union (provider) with provider-specific pointer objects (microsoft, google, imap), while keeping legacy EmailQueueJob for compatibility during migration.
  • (2026-03-01) Added a dedicated unified queue feature flag gate (UNIFIED_INBOUND_EMAIL_POINTER_QUEUE_*) so provider webhooks can move to enqueue-only behavior without forcing immediate cutover.
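A minimal sketch of what the F001 discriminated union could look like. The field names (jobId, schemaVersion, attempt, maxAttempts, enqueuedAt, tenantId, providerId, and the per-provider pointer fields) are taken from these notes. The `pointer` property name, the concrete field types, and the `describePointer` helper are assumptions for illustration, not the shipped contract.

```typescript
// Hypothetical pointer shapes; field names follow the notes, types are assumed.
type MicrosoftPointer = { messageId: string };
type GooglePointer = { historyId: string; emailAddress: string; pubsubMessageId: string };
type ImapPointer = { mailbox: string; uid: number; uidValidity: number; messageId?: string };

// Queue lifecycle fields shared by every provider variant.
interface QueueJobBase {
  jobId: string;
  schemaVersion: number;
  tenantId: string;
  providerId: string;
  attempt: number;
  maxAttempts: number;
  enqueuedAt: string; // assumed to be an ISO-8601 timestamp
}

// `provider` is the discriminant; `pointer` is an assumed property name.
type UnifiedInboundEmailQueueJob =
  | (QueueJobBase & { provider: "microsoft"; pointer: MicrosoftPointer })
  | (QueueJobBase & { provider: "google"; pointer: GooglePointer })
  | (QueueJobBase & { provider: "imap"; pointer: ImapPointer });

// Illustrative helper: switching on `provider` narrows `pointer` to the right shape.
function describePointer(job: UnifiedInboundEmailQueueJob): string {
  switch (job.provider) {
    case "microsoft":
      return `microsoft message ${job.pointer.messageId}`;
    case "google":
      return `google history ${job.pointer.historyId} for ${job.pointer.emailAddress}`;
    case "imap":
      return `imap ${job.pointer.mailbox} uid ${job.pointer.uid} (uidValidity ${job.pointer.uidValidity})`;
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a provider is added.
      const unreachable: never = job;
      throw new Error(`Unknown provider: ${JSON.stringify(unreachable)}`);
    }
  }
}
```

Keeping the payload to identifiers like these is what makes the pointer-only decision enforceable at the type level as well as at runtime.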

Discoveries / Constraints

  • IMAP service already retries webhook dispatch on non-2xx responses.
  • Existing IMAP in-app async queue implementation is in-memory and returns success after enqueue, which is not durable acceptance.
  • Microsoft and Google callback handlers currently fetch and process in the callback path; this plan changes them to enqueue-only ingress.
  • Inbound email interface definitions are duplicated across shared/interfaces, server/src/interfaces, and packages/types/src/interfaces; all three must be kept in sync for type consumers.
  • Microsoft webhook handler is transaction-scoped per notification; queue-mode enqueue can be inserted before legacy fetch/process logic and short-circuit the callback path cleanly.
  • Google webhook flow can enqueue immediately after provider resolution + JWT verification, before any gmail_processed_history writes or Gmail API fetches.
  • IMAP listener now has enough metadata at fetch time (mailbox, uid, uidValidity, messageId) to emit pointer-only webhook payloads; no raw body is required for unified queue ingress.
  • Unified queue internals now track ready/processing/inflight/DLQ keys with lease metadata, enabling explicit claim and completion lifecycle management.
  • Queue enqueue now enforces a runtime pointer-only payload guard that rejects forbidden MIME/body/attachment keys at both top-level and nested pointer metadata.
  • Legacy IMAP in-memory async queue now rejects enqueue attempts when unified pointer queue mode is enabled for the same tenant/provider, preventing accidental production regressions to in-memory processing.
  • Security checks are still enforced before enqueue-only handoff: Microsoft validation/clientState checks, Google Pub/Sub JWT verification, and IMAP secret header verification all execute before unified-queue enqueue paths.
  • IMAP async-mode gating is now provider-aware and supports explicit legacy-path disablement via IMAP_INBOUND_EMAIL_IN_APP_ASYNC_DISABLED, while also auto-disabling async mode whenever unified pointer queue mode is enabled for a provider.
  • Unified queue now emits structured event logs for enqueue, consume_start, ack, retry, dlq, reclaim, and consumer skip with job/pointer identifiers and attempt metadata.
  • Microsoft webhook response contract now reports handoff mode (unified_pointer_queue/mixed/inline_processing) plus queue vs inline counts, aligning callback semantics with Google/IMAP queue-mode responses.
  • Queue consumer provider routing is implemented in processUnifiedInboundEmailQueueJob via provider-specific pointer resolution paths: Microsoft (messageId), Google (historyId plus discovered message IDs), and IMAP (uid mailbox fetch).
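The runtime pointer-only guard noted above could be sketched roughly as follows. The forbidden key names rawMime, attachments, and body come from these notes. The recursive walk, the case-insensitive match, the error wording, and the function name `assertPointerOnlyPayloadSketch` are assumptions, not the actual `assertPointerOnlyPayload` implementation.

```typescript
// Keys named in the notes as forbidden; stored lowercase for a case-insensitive match
// (the case-insensitivity is an assumption of this sketch).
const FORBIDDEN_PAYLOAD_KEYS = new Set<string>(["rawmime", "attachments", "body"]);

// Hypothetical guard: walks the payload and throws on the first forbidden key,
// whether it appears at the top level or inside nested pointer metadata.
function assertPointerOnlyPayloadSketch(payload: unknown, path: string = "job"): void {
  if (Array.isArray(payload)) {
    payload.forEach((item, index) => assertPointerOnlyPayloadSketch(item, `${path}[${index}]`));
    return;
  }
  if (payload === null || typeof payload !== "object") {
    return; // primitives cannot carry nested content keys
  }
  const record = payload as Record<string, unknown>;
  Object.keys(record).forEach((key) => {
    if (FORBIDDEN_PAYLOAD_KEYS.has(key.toLowerCase())) {
      throw new Error(`Pointer-only queue payload must not contain "${path}.${key}"`);
    }
    assertPointerOnlyPayloadSketch(record[key], `${path}.${key}`);
  });
}
```

Calling the guard inside enqueue, rather than in each webhook, would mean no provider path can skip it.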

Commands / Runbooks

  • python3 /Users/roberisaacs/.codex/skills/alga-plan/scripts/scaffold_plan.py "Unified Inbound Email Queue with Pointer Jobs" --slug unified-inbound-email-pointer-queue
  • python3 /Users/roberisaacs/.codex/skills/alga-plan/scripts/validate_plan.py ee/docs/plans/2026-03-01-unified-inbound-email-pointer-queue
  • npm -w shared run typecheck
  • npm -w @alga-psa/types run build
  • npm -w server run typecheck
  • npm -w shared run typecheck (after Microsoft queue-mode changes)
  • npm -w server run typecheck (after Microsoft queue-mode changes)
  • npm -w email-service run build
  • npm -w @alga-psa/integrations run typecheck
  • npm -w server run test -- src/test/integration/microsoftWebhookUnifiedQueue.integration.test.ts
  • npm -w server run test -- src/test/integration/googleWebhookUnifiedQueue.integration.test.ts --coverage.enabled=false
  • npm -w server run test -- src/test/integration/imapWebhookHandoff.integration.test.ts --coverage.enabled=false
  • npm -w server run test -- src/test/integration/microsoftWebhookUnifiedQueue.integration.test.ts src/test/integration/googleWebhookUnifiedQueue.integration.test.ts src/test/integration/imapWebhookHandoff.integration.test.ts --coverage.enabled=false
  • npx vitest --config shared/vitest.config.ts services/email-service/src/emailService.webhookRetry.test.ts
  • npx vitest --config shared/vitest.config.ts shared/services/email/__tests__/unifiedInboundEmailQueueConsumer.test.ts
  • npm -w server run test -- src/test/unit/unifiedInboundEmailQueueJobProcessor.fetch.test.ts --coverage.enabled=false
  • npx vitest --config shared/vitest.config.ts shared/services/email/__tests__/unifiedInboundEmailQueue.test.ts
  • IMAP webhook route: packages/integrations/src/webhooks/email/imap.ts
  • IMAP in-memory queue: packages/integrations/src/webhooks/email/imapInAppQueue.ts
  • Microsoft webhook route: packages/integrations/src/webhooks/email/microsoft.ts
  • Google webhook route: packages/integrations/src/webhooks/email/google.ts
  • IMAP listener dispatch path: services/email-service/src/emailService.ts
  • Existing related plan: ee/docs/plans/2026-02-27-inbound-email-inapp-artifact-persistence-remaining-work/
  • Unified job contract files:
    • shared/interfaces/inbound-email.interfaces.ts
    • server/src/interfaces/email.interfaces.ts
    • packages/types/src/interfaces/email.interfaces.ts
  • Unified queue helper: shared/services/email/unifiedInboundEmailQueue.ts
  • Unified queue flag gate helper: shared/services/email/inboundEmailInAppFeatureFlag.ts
  • Unified queue consumer loop: shared/services/email/unifiedInboundEmailQueueConsumer.ts
  • Server queue job processor: server/src/services/email/unifiedInboundEmailQueueJobProcessor.ts
  • Server consumer entrypoint: server/src/bin/unifiedInboundEmailQueueConsumer.ts
  • Unified queue runbook: ee/docs/plans/2026-03-01-unified-inbound-email-pointer-queue/RUNBOOK.md
  • Microsoft unified ingress contract tests: server/src/test/integration/microsoftWebhookUnifiedQueue.integration.test.ts
  • Google unified ingress contract tests: server/src/test/integration/googleWebhookUnifiedQueue.integration.test.ts
  • IMAP webhook retry test: services/email-service/src/emailService.webhookRetry.test.ts
  • Unified queue consumer tests: shared/services/email/__tests__/unifiedInboundEmailQueueConsumer.test.ts
  • Unified queue job processor fetch tests: server/src/test/unit/unifiedInboundEmailQueueJobProcessor.fetch.test.ts
  • Unified queue primitives tests: shared/services/email/__tests__/unifiedInboundEmailQueue.test.ts

Progress Log

  • (2026-03-01) Completed F001: Added unified pointer job contract types with provider-specific pointer metadata and queue lifecycle fields (attempt, maxAttempts, enqueuedAt, jobId, schemaVersion).
  • (2026-03-01) Completed F002: Microsoft webhook now supports enqueue-only pointer handoff in unified-queue mode, using shared/services/email/unifiedInboundEmailQueue.ts and no longer requiring inline full-email fetch/processing when that mode is enabled.
  • (2026-03-01) Completed F003: Google webhook now supports enqueue-only pointer handoff in unified-queue mode (historyId, emailAddress, pubsubMessageId) and returns 503 when durable enqueue fails.
  • (2026-03-01) Completed F004: IMAP listener/webhook handoff now supports pointer-only ingress (mailbox, uid, uidValidity, optional messageId) and enqueues IMAP pointer jobs when unified queue mode is enabled.
  • (2026-03-01) Completed F005: Unified pointer ingress is now persisted in Redis list storage via shared/services/email/unifiedInboundEmailQueue.ts (RPUSH on a configurable queue key).
  • (2026-03-01) Completed F006: Unified queue mode ingress responses now acknowledge only after enqueue returns success; enqueue errors return non-success responses so callers can retry.
  • (2026-03-01) Completed F007: Microsoft, Google, and IMAP unified-queue paths now return 503 when enqueue fails, preserving upstream retry behavior.
  • (2026-03-01) Completed F008: Added a reusable consumer loop (UnifiedInboundEmailQueueConsumer) plus queue claim/ack/fail/reclaim primitives for processing unified inbound pointer jobs.
  • (2026-03-01) Completed F009: Added provider-specific consume-time pointer resolution in unifiedInboundEmailQueueJobProcessor for Microsoft (messageId), Google (historyId -> message IDs), and IMAP (uid fetch) before downstream processing.
  • (2026-03-01) Completed F010: Added consume-time idempotency insert/check against email_processed_messages with duplicate short-circuit when a normalized external identity already exists.
  • (2026-03-01) Completed F011: Queue job processor now calls processInboundEmailInApp for fetched provider messages and records final processing status back to email_processed_messages.
  • (2026-03-01) Completed F012: Consumer loop now ACKs only after handleJob completes successfully; failed jobs are not ACKed and are routed through retry/DLQ handling.
  • (2026-03-01) Completed F013: Added lease-based reclaim (reclaimExpiredUnifiedInboundEmailQueueJobs) so stale in-flight jobs are resurfaced back to the ready queue.
  • (2026-03-01) Completed F014: Failed jobs now increment attempt in queue payload state and only requeue while below configured maxAttempts.
  • (2026-03-01) Completed F015: Once attempt reaches maxAttempts, failed jobs are routed to the dedicated unified inbound pointer DLQ key.
  • (2026-03-01) Completed F016: Source-unavailable fetch failures now resolve as deterministic skipped outcomes (source_unavailable:*) recorded in email_processed_messages and do not rethrow for retry.
  • (2026-03-01) Completed F017: Consumer idempotency now uses a normalized external identity format (<provider>:<messageId>) prior to persistence checks.
  • (2026-03-01) Completed F018: Added assertPointerOnlyPayload validation in enqueue to reject raw content-like keys (rawMime, attachments, body, etc.) and enforce pointer-only queue contracts at runtime.
  • (2026-03-01) Completed F019: Added a defensive runtime guard in imapInAppQueue that throws when unified pointer queue mode is enabled for the tenant/provider, ensuring legacy in-memory queue path is bypassed/retired for production unified-mode processing.
  • (2026-03-01) Completed F020: Verified webhook auth/verification behavior is preserved in enqueue-only mode across Microsoft, Google, and IMAP paths (no auth bypass introduced by unified queue branching).
  • (2026-03-01) Completed F021: Aligned queue migration flags by extending IMAP async mode evaluation to accept provider context, auto-disable on unified mode, and honor IMAP_INBOUND_EMAIL_IN_APP_ASYNC_DISABLED for explicit legacy disablement.
  • (2026-03-01) Completed F022: Added structured observability events across queue lifecycle and consumer skip outcomes, including tenant/provider/pointer identifiers, attempts, and terminal reasons for retry/DLQ paths.
  • (2026-03-01) Completed F023: Updated provider callback contracts so unified mode explicitly reports queue handoff metadata and avoids inline-processing ambiguity in webhook responses.
  • (2026-03-01) Completed F024: Confirmed unified consumer routing dispatches per provider type and fetches provider-specific source payloads before shared in-app processing.
  • (2026-03-01) Completed F025: Added a dedicated runbook covering architecture, queue keys, feature flags, consumer startup, and local validation/failure-path checks.
  • (2026-03-01) Completed T001: Added Microsoft unified ingress contract test validating pointer-only enqueue payload shape (tenantId, providerId, provider pointer identifiers) and absence of raw content fields.
  • (2026-03-01) Completed T002: Added Google unified ingress contract test validating pointer-only enqueue payload shape (tenantId, providerId, historyId, pubsubMessageId) behind successful JWT/provider verification.
  • (2026-03-01) Completed T003: Extended IMAP webhook integration coverage with unified-mode pointer enqueue assertions (mailbox, uid, uidValidity, messageId) and pointer-only payload guards.
  • (2026-03-01) Completed T004: Added deferred-enqueue Microsoft webhook test proving 200 success is not returned until unified queue enqueue promise resolves.
  • (2026-03-01) Completed T005: Added deferred-enqueue Google webhook test proving callback success response is blocked until unified queue enqueue completion.
  • (2026-03-01) Completed T006: Added deferred-enqueue IMAP webhook test proving unified-mode success response is blocked until pointer job enqueue completion.
  • (2026-03-01) Completed T007: Added enqueue-failure assertions for Microsoft, Google, and IMAP unified ingress paths, each returning 503 to preserve upstream retry semantics.
  • (2026-03-01) Completed T008: Extracted and tested IMAP webhook retry helper to verify non-2xx ingress responses trigger retry attempts before eventual success.
  • (2026-03-01) Completed T009: Added consumer unit coverage confirming Microsoft pointer claims invoke handler and ACK path through unified consumer loop.
  • (2026-03-01) Completed T010: Validated Google pointer claims execute through the same unified consumer claim/handle/ACK lifecycle.
  • (2026-03-01) Completed T011: Validated IMAP pointer claims execute through the same unified consumer claim/handle/ACK lifecycle.
  • (2026-03-01) Completed T012: Added processor fetch test proving Microsoft pointer jobs resolve full provider payloads before shared in-app processing execution.
  • (2026-03-01) Completed T013: Added processor fetch test proving Google pointer jobs resolve message payloads (history cursor -> message IDs -> full payloads) before processing.
  • (2026-03-01) Completed T014: Added processor fetch test proving IMAP pointer jobs resolve mailbox UID content into normalized email payloads before processing.
  • (2026-03-01) Completed T015: Added idempotency happy-path test validating first consume writes normalized identity (provider:messageId) processing marker and executes downstream processing.
  • (2026-03-01) Completed T016: Added idempotency duplicate-path test validating unique-constraint collision (23505) short-circuits downstream processing with deduped skip outcome.
  • (2026-03-01) Completed T017: Processor fetch suite now asserts processInboundEmailInApp receives fully resolved provider payloads on successful consume-time fetch paths.
  • (2026-03-01) Completed T018: Added queue ACK primitive test validating successful consume removes payload from processing list and clears inflight hash/lease entries.
  • (2026-03-01) Completed T019: Added consumer failure-path test validating processing exceptions skip ACK and invoke retry-handling path (failUnifiedInboundEmailQueueJob).
  • (2026-03-01) Completed T020: Added reclaim-path queue test validating expired inflight claims are removed from processing structures and requeued to ready state.
  • (2026-03-01) Completed T021: Added retry-path queue test validating failed consume increments job attempt prior to requeue.
  • (2026-03-01) Completed T022: Added DLQ-path queue test validating jobs are moved to dead-letter storage once max attempts are reached.
  • (2026-03-01) Completed T023: Added IMAP source-unavailable processor test validating deterministic source_unavailable:* skip reason and consume-marker persistence.
  • (2026-03-01) Completed T024: Added consumer skipped-outcome test validating source-unavailable paths ACK and avoid retry-loop behavior.
  • (2026-03-01) Completed T025: Added queue payload-guard test validating enqueue rejects raw-content fields and enforces pointer-only contract.
  • (2026-03-01) Completed T026: Added IMAP regression test validating unified queue mode bypasses legacy in-memory async queue path even when legacy async flag is enabled.
  • (2026-03-01) Completed T027: Added Microsoft security regression test validating clientState mismatch blocks enqueue in unified mode.
  • (2026-03-01) Completed T028: Added Google security regression test validating JWT auth header remains required in enqueue-only mode.
  • (2026-03-01) Completed T029: Added IMAP security regression test validating webhook secret mismatch still rejects enqueue-only requests.
  • (2026-03-01) Completed T030: Existing unified-mode ingress contract tests confirm flag enablement routes Microsoft/Google/IMAP into enqueue-only handoff paths.
  • (2026-03-01) Completed T031: Added rollback-path IMAP test validating unified flag disablement preserves legacy in_app_async handoff behavior.
  • (2026-03-01) Completed T032: Queue logging tests now assert enqueue success/failure events include provider, tenant, and pointer identifiers.
  • (2026-03-01) Completed T033: Queue + consumer logging tests now assert retry/DLQ/skip event payloads include attempt counts and terminal reasons.
  • (2026-03-01) Completed T034: Idempotency persistence test coverage confirms first consume writes email_processed_messages marker for newly processed identities.
  • (2026-03-01) Completed T035: Duplicate-guard test coverage confirms unique-constraint collision blocks second consume and prevents downstream processing.
  • (2026-03-01) Completed T036: Combined Microsoft webhook enqueue contract + processor consume-time fetch tests validate callback-to-worker shared processing flow and created-outcome handoff.
  • (2026-03-01) Completed T037: Combined Google webhook enqueue contract + processor consume-time fetch tests validate callback-to-worker shared processing flow and created-outcome handoff.
  • (2026-03-01) Completed T038: Combined IMAP listener/webhook enqueue contract + processor consume-time fetch tests validate callback-to-worker shared processing flow and created-outcome handoff.
  • (2026-03-01) Completed T039: Idempotency duplicate-consume coverage validates repeated provider deliveries result in a single processed outcome with deduped no-op on repeats.
  • (2026-03-01) Completed T040: Plan docs validation complete (SCRATCHPAD.md + RUNBOOK.md) with unified architecture, flags, queue lifecycle, and local verification runbook steps.
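The consume-time behavior recorded in F010, F016, and F017 can be illustrated with a small in-memory model. This is a sketch under stated assumptions: a `Set` stands in for the unique constraint on email_processed_messages, the fetch and process callbacks are synchronous for brevity (real provider fetches would be async), the ordering of fetch before the identity check is assumed, and every name except the `<provider>:<messageId>` identity format and the `source_unavailable:` reason prefix is hypothetical.

```typescript
// Result of resolving a pointer at consume time (hypothetical shape).
type PointerFetchResult =
  | { state: "ok"; messageId: string; raw: string }
  | { state: "unavailable"; detail: string };

type ConsumeOutcome =
  | { kind: "processed"; identity: string }
  | { kind: "skipped"; reason: string };

// Normalized external identity format from the notes: <provider>:<messageId>.
function normalizeExternalIdentity(provider: string, messageId: string): string {
  return `${provider}:${messageId}`;
}

function consumePointerJob(
  provider: string,
  fetchSource: () => PointerFetchResult,
  processedIdentities: Set<string>, // stand-in for the email_processed_messages unique key
  processMessage: (raw: string) => void
): ConsumeOutcome {
  const fetched = fetchSource();
  if (fetched.state === "unavailable") {
    // Deterministic skip: returned as an outcome, not thrown, so the job is not retried.
    return { kind: "skipped", reason: `source_unavailable:${fetched.detail}` };
  }
  const identity = normalizeExternalIdentity(provider, fetched.messageId);
  if (processedIdentities.has(identity)) {
    // Duplicate delivery: short-circuit before any downstream processing.
    return { kind: "skipped", reason: "deduped" };
  }
  processedIdentities.add(identity);
  try {
    processMessage(fetched.raw);
  } catch (error) {
    // Assumption of this sketch only: drop the marker so a retry is not deduped.
    processedIdentities.delete(identity);
    throw error;
  }
  return { kind: "processed", identity };
}
```

Both skipped outcomes return normally, so a consumer loop that ACKs on successful return would ACK them, which matches the skipped-outcome behavior described in T024.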

Open Questions

  • Choose Redis queue primitive: Streams with consumer groups vs. a list-based queue with explicit inflight tracking. The current implementation (F005, F008, F013) uses the list-based approach; confirm whether that is the final choice.
  • Decide whether DLQ re-drive tooling is required in this scope or deferred.
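To make the list-based option in the first open question concrete, here is an in-memory model of the claim, ack, fail, and reclaim lifecycle described in the progress log. It does not talk to Redis; the class and method names are hypothetical, and leaving `attempt` unchanged on reclaim is an assumption, since these notes do not say how reclaim affects the attempt count.

```typescript
interface PointerJobState {
  jobId: string;
  attempt: number;
  maxAttempts: number;
}

interface InflightEntry {
  job: PointerJobState;
  leaseExpiresAt: number; // epoch milliseconds
}

// In-memory stand-in for the ready list, inflight/lease tracking, and DLQ keys.
class InMemoryPointerQueue {
  readonly ready: PointerJobState[] = [];
  readonly inflight = new Map<string, InflightEntry>();
  readonly deadLetter: PointerJobState[] = [];

  enqueue(job: PointerJobState): void {
    this.ready.push(job);
  }

  // Move the oldest ready job into inflight tracking under a lease.
  claim(now: number, leaseMs: number): PointerJobState | undefined {
    const job = this.ready.shift();
    if (!job) return undefined;
    this.inflight.set(job.jobId, { job, leaseExpiresAt: now + leaseMs });
    return job;
  }

  // ACK after successful handling: drop the inflight entry.
  ack(jobId: string): boolean {
    return this.inflight.delete(jobId);
  }

  // Failure: increment attempt, requeue while below maxAttempts, otherwise dead-letter.
  fail(jobId: string): "requeued" | "dead_lettered" | "unknown" {
    const entry = this.inflight.get(jobId);
    if (!entry) return "unknown";
    this.inflight.delete(jobId);
    const next: PointerJobState = { ...entry.job, attempt: entry.job.attempt + 1 };
    if (next.attempt < next.maxAttempts) {
      this.ready.push(next);
      return "requeued";
    }
    this.deadLetter.push(next);
    return "dead_lettered";
  }

  // Resurface jobs whose lease has expired; returns how many were reclaimed.
  reclaimExpired(now: number): number {
    let reclaimed = 0;
    this.inflight.forEach((entry, jobId) => {
      if (entry.leaseExpiresAt <= now) {
        this.inflight.delete(jobId);
        this.ready.push(entry.job); // attempt left unchanged (assumption)
        reclaimed += 1;
      }
    });
    return reclaimed;
  }
}
```

With Redis Streams and consumer groups, the pending-entries list would cover much of what `inflight` and `reclaimExpired` do here. The list-based route keeps that bookkeeping in application code.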