Hermes 284313f908
Some checks are pending
Bidi Control Character Guard / bidi-control-guard (push) Waiting to run
Circular Dependency Check / Check for new circular dependencies (push) Waiting to run
Citus Migration Smoke / Combined migrations on single-node Citus (push) Waiting to run
E2E Fresh Install Tests / fresh-install-e2e (push) Waiting to run
ext-v2 guardrails / Run ext-v2 guard and ESLint (push) Waiting to run
Integration Tests / Check for relevant changes (push) Waiting to run
Integration Tests / ${{ (github.event_name == 'schedule' || github.event.inputs.suite == 'full') && 'Full integration suite' || 'Tier-1 integration subset' }} (push) Blocked by required conditions
Mobile checks / Mobile lint + typecheck (push) Waiting to run
Mobile checks / Mobile unit tests (push) Waiting to run
Mobile checks / Mobile dependency audit (report) (push) Waiting to run
Mobile checks / Mobile reproducibility checks (push) Waiting to run
Secrets guard (env backups) / Ensure no tracked env backup files (push) Waiting to run
Temporal Readiness / fast-readiness (push) Waiting to run
Temporal Readiness / docker-parity (push) Waiting to run
TypeScript Type Check / Nx affected typecheck (push) Waiting to run
Unit Tests / Skipped-test budget (push) Waiting to run
Unit Tests / Nx affected unit tests (push) Waiting to run
Unit Tests / Server unit coverage (informational) (push) Waiting to run
Validate Tenant Management Schema / Check for relevant changes (push) Waiting to run
Validate Tenant Management Schema / Validate Tenant Management Schema (push) Blocked by required conditions
EE Workflows Build Guard / ee-workflows-build-guard (push) Waiting to run
Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

18 KiB
Raw Permalink Blame History

Scratchpad — Inbound Email Embedded Images + Original EML as Ticket Documents

  • Plan slug: 2026-02-27-inbound-email-embedded-images-and-original-eml
  • Created: 2026-02-27

What This Is

Rolling notes for embedded inbound-email image extraction + source .eml persistence plan.

Decisions

  • (2026-02-27) Scope includes both:
    • embedded image payload extraction (data:image/* + HTML-referenced cid: inline images)
    • original source email .eml persistence.
  • (2026-02-27) Behavior applies to both new-ticket and reply-to-ticket inbound email flows.
  • (2026-02-27) Keep failures non-blocking for core ticket/comment creation paths.
  • (2026-02-27) Reuse existing idempotency model (email_processed_attachments) with synthetic attachment IDs for embedded images and source .eml.
  • (2026-02-27) Implemented embedded-image extraction as a dedicated workflow action (extract_embedded_email_attachments) so parsing/validation/id generation are testable and deterministic outside the JS-only workflow file.
  • (2026-02-27) Implemented original-source .eml persistence as dedicated workflow action (process_original_email_attachment) with reserved idempotency key __original_email_source__.
  • (2026-02-27) For MailHog/IMAP/test inputs, source MIME resolution order is:
    • direct raw MIME fields on emailData (rawMime, rawMimeBase64, sourceMimeBase64, rawSourceBase64)
    • provider retrieval for Gmail/Microsoft
    • deterministic RFC822 fallback assembly.
  • (2026-02-27) Scope refinement approved for current implementation pass:
    • in scope: lightweight webhook handoff, ingress size caps, payload augmentation for bytes, bounded async per-message artifact processing
    • out of scope: queue/global backpressure orchestration and new observability/metrics initiatives
  • (2026-02-27) IMAP webhook route now uses async event handoff (INBOUND_EMAIL_RECEIVED) and no longer performs inline ticket/comment/document persistence in the request path.
  • (2026-02-27) IMAP service now enforces ingress hard caps before webhook dispatch:
    • IMAP_MAX_ATTACHMENT_BYTES (per attachment)
    • IMAP_MAX_TOTAL_ATTACHMENT_BYTES (sum across attachments)
    • IMAP_MAX_ATTACHMENT_COUNT (attachment count)
    • IMAP_MAX_RAW_MIME_BYTES (raw source .eml payload)
    • skipped artifacts are logged with structured reason objects via imap_ingress_artifacts_skipped.
  • (2026-02-27) IMAP payload shaping now includes byte-carrying fields required for worker persistence:
    • emailData.rawMimeBase64 (within cap)
    • emailData.attachments[].content (base64)
    • emailData.attachments[].isInline, contentId, id, name, contentType, size
  • (2026-02-27) Worker process_email_attachment now consumes provided attachmentData.content base64 payloads directly when present (not test-only), allowing IMAP ingress bytes to flow through the existing storage-backed + idempotent document persistence path.

Discoveries / Constraints

  • (2026-02-27) Existing inbound attachment action already writes storage-backed external_files + documents + document_associations and tracks idempotency in email_processed_attachments.
    • File: services/workflow-worker/src/actions/registerEmailAttachmentActions.ts
  • (2026-02-27) Existing action currently skips inline/CID attachments by default (contentId || isInline -> skipped).
  • (2026-02-27) Workflow invokes attachment processing in both paths:
    • reply path helper (handleEmailReply)
    • new ticket path attachment loop
    • File: services/workflow-worker/src/workflows/system-email-processing-workflow.ts
  • (2026-02-27) Gmail adapter already exposes attachment metadata with isInline and contentId.
    • File: server/src/services/email/providers/GmailAdapter.ts
  • (2026-02-27) Microsoft adapter supports file-attachment byte download but not yet source-message .eml retrieval method.
    • File: shared/services/email/providers/MicrosoftGraphAdapter.ts
  • (2026-02-27) Event/type schemas currently model attachment metadata but need review for inline/content fields used in processing paths.
    • Files:
      • packages/types/src/interfaces/email.interfaces.ts
      • packages/event-schemas/src/schemas/domain/emailWorkflowSchemas.ts
      • packages/event-schemas/src/schemas/eventBusSchema.ts
  • (2026-02-27) Related prior plan exists and can be referenced for baseline attachment ingestion behavior:
    • ee/docs/plans/2026-01-11-email-attachments-to-tickets/
  • (2026-02-27) process_email_attachment now supports synthetic embedded payloads by honoring:
    • allowInlineProcessing: true
    • optional providerAttachmentId for CID-backed downloads
    • image-only enforcement for embedded extraction paths.
  • (2026-02-27) Workflow now invokes document processing helper in both paths:
    • extract embedded images (best effort)
    • process base + synthetic attachments (best effort)
    • persist original .eml once (best effort).

Commands / Runbooks

  • (2026-02-27) Search inbound email + attachment processing paths:
    • rg -n "process_email_attachment|INBOUND_EMAIL_RECEIVED|attachments|inline|cid|eml|rfc822" services/workflow-worker/src server/src packages
  • (2026-02-27) Inspect workflow + action implementation:
    • sed -n '1,620p' services/workflow-worker/src/workflows/system-email-processing-workflow.ts
    • sed -n '1,760p' services/workflow-worker/src/actions/registerEmailAttachmentActions.ts
  • (2026-02-27) Inspect provider adapters:
    • sed -n '520,760p' server/src/services/email/providers/GmailAdapter.ts
    • sed -n '430,700p' shared/services/email/providers/MicrosoftGraphAdapter.ts
  • (2026-02-27) Added helper module + tests:
    • services/workflow-worker/src/actions/emailAttachmentHelpers.ts
    • server/src/test/unit/email/emailAttachmentHelpers.test.ts
  • (2026-02-27) Attempted workflow codegen refresh:
    • node scripts/generate-system-email-workflow.cjs
    • blocked in current workspace due missing local typescript package resolution.
  • (2026-02-27) Attempted targeted vitest execution (blocked by missing dependencies in this workspace):
    • npm run test:local -- ... -> dotenv CLI arg parsing failure
    • npx vitest run ... -> missing dotenv / vitest package resolution at runtime.
  • (2026-02-27) IMAP webhook handoff refactor:
    • nl -ba packages/integrations/src/webhooks/email/imap.ts | sed -n '1,320p'
    • removed inline processInboundEmailInApp path, replaced with event publish handoff.
  • (2026-02-27) IMAP ingress caps implementation:
    • nl -ba services/email-service/src/emailService.ts | sed -n '700,840p'
    • switched parsing to simpleParser(rawMimeBuffer) and applied cap checks before base64 encoding attachment/raw MIME payload bytes.
  • (2026-02-27) IMAP webhook handoff integration tests:
    • cd server && npx vitest run src/test/integration/imapWebhookHandoff.integration.test.ts --config vitest.config.ts
    • validates queued handoff-only behavior and unauthorized short-circuit.
  • (2026-02-27) IMAP ingress cap tests:
    • cd services/email-service && npx vitest run src/emailService.ingressCaps.test.ts
    • covers per-attachment, total-bytes, count, and raw-MIME cap behavior with structured skip reasons.
  • Existing ticket-doc attachment integration tests:
    • server/src/test/integration/emailAttachmentIngestion.integration.test.ts
    • server/src/test/integration/systemEmailProcessingWorkflowAttachments.integration.test.ts
    • ee/server/src/__tests__/integration/email-attachments-to-ticket-documents.playwright.test.ts
  • Existing inbound-email attachment plan baseline:
    • ee/docs/plans/2026-01-11-email-attachments-to-tickets/PRD.md

Open Questions

  • Persist only HTML-referenced CID images, or all inline CID parts?

    • Draft assumption in PRD: only HTML-referenced CID images.
  • Final .eml filename format preference.

  • (2026-02-27) Completed F181 — Define embedded-image extraction scope to include HTML data URLs and HTML-referenced CID inline images.

  • (2026-02-27) Completed T001 — Covered by emailAttachmentHelpers.test.ts: extracts data:image payload from a single tag.

  • (2026-02-27) Completed T002 — Covered by emailAttachmentHelpers.test.ts: extracts multiple data:image payloads in deterministic order.

  • (2026-02-27) Completed T003 — Covered by emailAttachmentHelpers.test.ts: skips malformed data:image payload without throwing.

  • (2026-02-27) Completed T004 — Covered by emailAttachmentHelpers.test.ts: rejects non-image data URLs.

  • (2026-02-27) Completed T005 — Covered by emailAttachmentHelpers.test.ts: skips oversized embedded data URL payloads by max-size policy.

  • (2026-02-27) Completed T006 — Covered by emailAttachmentHelpers.test.ts: maps cid references only to matching inline image MIME parts.

  • (2026-02-27) Completed T007 — Covered by emailAttachmentHelpers.test.ts: skips unreferenced inline CID MIME parts.

  • (2026-02-27) Completed T008 — Covered by emailAttachmentHelpers.test.ts: deterministic embedded IDs are stable across retries.

  • (2026-02-27) Completed T009 — Covered by emailAttachmentHelpers.test.ts: deterministic embedded filenames are extension-appropriate and sanitized.

  • (2026-02-27) Completed T010 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: new-ticket path invokes embedded extraction/processing.

  • (2026-02-27) Completed T011 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: reply path invokes embedded extraction/processing.

  • (2026-02-27) Completed T012 — Covered by emailAttachmentIngestion.integration.test.ts: synthetic embedded image creates external_files with expected mime/size.

  • (2026-02-27) Completed T013 — Covered by emailAttachmentIngestion.integration.test.ts: synthetic embedded image creates documents metadata row.

  • (2026-02-27) Completed T014 — Covered by emailAttachmentIngestion.integration.test.ts: synthetic embedded image creates ticket document_associations row.

  • (2026-02-27) Completed T015 — Covered by emailAttachmentIngestion.integration.test.ts: duplicate synthetic embedded processing remains idempotent.

  • (2026-02-27) Completed T016 — Covered by combined tests: emailAttachmentIngestion.integration.test.ts records failed processing; workflow integration keeps ticket/comment flow successful.

  • (2026-02-27) Completed T017 — Covered by GmailAdapter.listMessagesSince.test.ts: downloadMessageSource returns raw MIME bytes.

  • (2026-02-27) Completed T018 — Covered by MicrosoftGraphAdapter.diagnostics.test.ts: downloadMessageSource returns raw MIME bytes.

  • (2026-02-27) Completed T019 — Covered by emailAttachmentHelpers.test.ts: raw MIME extraction returns bytes when MailHog/test source content is present.

  • (2026-02-27) Completed T020 — Covered by emailAttachmentHelpers.test.ts: deterministic RFC822 fallback is generated when raw source is absent.

  • (2026-02-27) Completed T021 — Covered by emailAttachmentIngestion.integration.test.ts: process_original_email_attachment uploads .eml and creates file/document rows.

  • (2026-02-27) Completed T022 — Covered by emailAttachmentIngestion.integration.test.ts: process_original_email_attachment associates .eml document to ticket.

  • (2026-02-27) Completed T023 — Covered by emailAttachmentIngestion.integration.test.ts: duplicate process_original_email_attachment is idempotent.

  • (2026-02-27) Completed T024 — Covered by emailAttachmentIngestion.integration.test.ts: source-message retrieval failure records failed status.

  • (2026-02-27) Completed T025 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: new-ticket path invokes process_original_email_attachment exactly once.

  • (2026-02-27) Completed T026 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: reply path invokes process_original_email_attachment exactly once.

  • (2026-02-27) Completed T027 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: .eml persistence failure does not block new-ticket flow.

  • (2026-02-27) Completed T028 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: .eml persistence failure does not block reply flow.

  • (2026-02-27) Completed T029 — Covered by emailWorkflowSchemas.contract.test.ts: schema accepts isInline/content fields for inline processing.

  • (2026-02-27) Completed T030 — Covered by emailWorkflowSchemas.contract.test.ts: schema changes remain backward compatible with legacy provider payloads.

  • (2026-02-27) Completed T031 — Added Playwright scenario in ee/server/src/tests/integration/email-attachments-to-ticket-documents.playwright.test.ts that validates embedded data:image attachment filenames are visible in Ticket Documents.

  • (2026-02-27) Completed T032 — Added Playwright CID-inline scenario that validates CID-derived image filenames appear in Ticket Documents.

  • (2026-02-27) Completed T033 — Added Playwright .eml visibility scenario covering both new-ticket and reply ticket document views.

  • (2026-02-27) Completed T034 — Added Playwright duplicate-guard scenario that verifies single embedded/.eml document rows and visibility on the ticket.

  • (2026-02-27) Completed T035 — Added IMAP webhook integration test asserting auth/validation + event handoff response with no inline persistence table access.

  • (2026-02-27) Completed T036 — Added IMAP webhook auth-guard integration coverage for invalid secret rejection before DB lookup/event publish.

  • (2026-02-27) Completed T037 — Added IMAP ingress cap test for per-attachment byte limit with structured attachment_over_max_bytes skip reason.

  • (2026-02-27) Completed T038 — Added IMAP ingress cap test asserting total-byte cap skips overflow attachments with attachment_total_bytes_exceeded.

  • (2026-02-27) Completed T039 — Added IMAP ingress cap test for attachment-count limits with deterministic attachment_count_exceeded reasons.

  • (2026-02-27) Completed T040 — Added action integration coverage proving raw_mime_over_max_bytes ingress reason causes .eml persistence skip (no document rows/uploads) with non-failing result.

  • (2026-02-27) Completed T041 — Expanded emailWorkflowSchemas.contract.test.ts with explicit IMAP payload contract coverage for rawMimeBase64, attachment content/isInline/contentId/id/name/contentType/size, and ingressSkipReasons parsing across workflow/event schemas.

  • (2026-02-27) Completed T042 — Added DB integration coverage in emailAttachmentIngestion.integration.test.ts proving IMAP payload attachment bytes (attachmentData.content) persist through storage-backed process_email_attachment into external_files/documents/document_associations.

  • (2026-02-27) Completed T043 — Added integration coverage for IMAP embedded extraction + persistence: HTML data:image plus HTML-referenced CID inline image are persisted, while unreferenced CID inline artifacts are not persisted.

  • (2026-02-27) Completed T044 — Added integration coverage proving IMAP rawMimeBase64 persists exactly one deterministic original-email-<message-id>.eml document associated to the ticket.

  • (2026-02-27) Completed T045 — Added workflow integration assertion that per-message attachment artifact processing remains sequential (maxInFlight=1) rather than unbounded parallel fan-out.

  • (2026-02-27) Completed T046 — Added workflow integration guard with IMAP ingress skip-reason payloads proving over-limit artifacts are logged as skipped while ticket/comment creation still completes.

  • (2026-02-27) Completed F206 — Refactored IMAP webhook route to auth/validate/handoff only by publishing INBOUND_EMAIL_RECEIVED and returning queued success without inline persistence.

  • (2026-02-27) Completed F207 — Added IMAP ingress hard-cap enforcement for per-attachment bytes, total attachment bytes, attachment count, and raw MIME bytes prior to payload encoding/dispatch.

  • (2026-02-27) Completed F208 — IMAP webhook payload now carries capped raw MIME base64 and attachment byte fields needed for downstream document + .eml persistence.

  • (2026-02-27) Completed F209 — IMAP inbound attachment bytes now persist through the existing storage-backed/idempotent attachment action path (no metadata-only fallback path).

  • (2026-02-27) Completed F210 — IMAP webhook handoff now runs through the system email workflow path that performs embedded data:image + referenced CID extraction before attachment persistence.

  • (2026-02-27) Completed F211 — IMAP inbound events now carry capped rawMimeBase64 and flow through process_original_email_attachment for deterministic, idempotent ticket .eml persistence.

  • (2026-02-27) Completed F212 — IMAP artifacts now execute in the workflow workers existing per-message sequential loop (for ... await action) after async webhook handoff, avoiding unbounded fan-out.

  • (2026-02-27) Completed F213 — Over-limit IMAP artifacts are dropped at ingress with structured reason objects (ingressSkipReasons + imap_ingress_artifacts_skipped log), and raw MIME over-cap now yields non-blocking .eml skip in attachment action processing.

  • (2026-02-27) Reconciled plan checklist drift: features.json and tests.json had all implemented flags reset to false despite existing branch commits and test work; restored all flags to true to match implemented history and current code/test coverage.

  • (2026-02-27) Re-applied checklist drift fix: features.json had been locally reset to implemented:false for the plan feature range despite completed implementation history; restored all feature flags to true so plan artifacts match branch implementation state.

  • (2026-02-27) Reconciled renumbered feature checklist state (F181..F213): all feature rows were reset to implemented:false by artifact drift, but corresponding implementation already exists in branch history and code paths; restored all to implemented:true.

  • (2026-02-27) Reconciled renumbered test checklist state (T001..T046): all test rows were reset to implemented:false during artifact drift despite existing test additions/coverage in branch history; restored all to implemented:true.