Scratchpad — Inbound Email Embedded Images + Original EML as Ticket Documents
- Plan slug: `2026-02-27-inbound-email-embedded-images-and-original-eml`
- Created: `2026-02-27`
What This Is
Rolling notes for embedded inbound-email image extraction + source .eml persistence plan.
Decisions
- (2026-02-27) Scope includes both:
  - embedded image payload extraction (`data:image/*` + HTML-referenced `cid:` inline images)
  - original source email `.eml` persistence.
- (2026-02-27) Behavior applies to both new-ticket and reply-to-ticket inbound email flows.
- (2026-02-27) Keep failures non-blocking for core ticket/comment creation paths.
- (2026-02-27) Reuse existing idempotency model (`email_processed_attachments`) with synthetic attachment IDs for embedded images and source `.eml`.
- (2026-02-27) Implemented embedded-image extraction as a dedicated workflow action (`extract_embedded_email_attachments`) so parsing/validation/id generation are testable and deterministic outside the JS-only workflow file.
- (2026-02-27) Implemented original-source `.eml` persistence as dedicated workflow action (`process_original_email_attachment`) with reserved idempotency key `__original_email_source__`.
- (2026-02-27) For MailHog/IMAP/test inputs, source MIME resolution order is:
  - direct raw MIME fields on `emailData` (`rawMime`, `rawMimeBase64`, `sourceMimeBase64`, `rawSourceBase64`)
  - provider retrieval for Gmail/Microsoft
  - deterministic RFC822 fallback assembly.
- (2026-02-27) Scope refinement approved for current implementation pass:
  - in scope: lightweight webhook handoff, ingress size caps, payload augmentation for bytes, bounded async per-message artifact processing
  - out of scope: queue/global backpressure orchestration and new observability/metrics initiatives
- (2026-02-27) IMAP webhook route now uses async event handoff (`INBOUND_EMAIL_RECEIVED`) and no longer performs inline ticket/comment/document persistence in the request path.
- (2026-02-27) IMAP service now enforces ingress hard caps before webhook dispatch:
  - `IMAP_MAX_ATTACHMENT_BYTES` (per attachment)
  - `IMAP_MAX_TOTAL_ATTACHMENT_BYTES` (sum across attachments)
  - `IMAP_MAX_ATTACHMENT_COUNT` (attachment count)
  - `IMAP_MAX_RAW_MIME_BYTES` (raw source `.eml` payload)
  - skipped artifacts are logged with structured reason objects via `imap_ingress_artifacts_skipped`.
- (2026-02-27) IMAP payload shaping now includes byte-carrying fields required for worker persistence:
  - `emailData.rawMimeBase64` (within cap)
  - `emailData.attachments[].content` (base64)
  - `emailData.attachments[].isInline`, `contentId`, `id`, `name`, `contentType`, `size`
- (2026-02-27) Worker `process_email_attachment` now consumes provided `attachmentData.content` base64 payloads directly when present (not test-only), allowing IMAP ingress bytes to flow through the existing storage-backed + idempotent document persistence path.
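
The source MIME resolution order recorded in the decisions above can be pictured with a small sketch. This is an illustration under stated assumptions, not the worker's actual code: `EmailDataLike`, `MimeSource`, `resolveSourceMime`, the `bodyText` field, and the provider strings `"google"`/`"microsoft"` are hypothetical; only the four raw-MIME field names come from these notes.

```typescript
// Hypothetical input shape; only the four raw-MIME field names come from the notes.
interface EmailDataLike {
  provider?: string; // assumed values such as "google", "microsoft", "imap"
  rawMime?: string;
  rawMimeBase64?: string;
  sourceMimeBase64?: string;
  rawSourceBase64?: string;
  from?: string;
  to?: string[];
  subject?: string;
  bodyText?: string;
}

type MimeSource =
  | { origin: "direct"; encoding: "utf8" | "base64"; value: string }
  | { origin: "provider" }
  | { origin: "fallback"; encoding: "utf8"; value: string };

function resolveSourceMime(email: EmailDataLike): MimeSource {
  // 1. Direct raw MIME fields on emailData win when present.
  if (email.rawMime) {
    return { origin: "direct", encoding: "utf8", value: email.rawMime };
  }
  const base64 =
    email.rawMimeBase64 || email.sourceMimeBase64 || email.rawSourceBase64;
  if (base64) {
    return { origin: "direct", encoding: "base64", value: base64 };
  }

  // 2. Otherwise defer to provider retrieval for Gmail/Microsoft.
  if (email.provider === "google" || email.provider === "microsoft") {
    return { origin: "provider" };
  }

  // 3. Last resort: assemble an RFC822-style message from known fields only.
  //    No generated Date or Message-ID, so retries yield identical text.
  const lines = [
    "From: " + (email.from || ""),
    "To: " + (email.to || []).join(", "),
    "Subject: " + (email.subject || ""),
    "",
    email.bodyText || "",
  ];
  return { origin: "fallback", encoding: "utf8", value: lines.join("\r\n") };
}
```

Returning a tagged result rather than bytes keeps the ordering logic pure; the caller would decode base64 or call the provider adapter depending on `origin`.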
Discoveries / Constraints
- (2026-02-27) Existing inbound attachment action already writes storage-backed `external_files` + `documents` + `document_associations` and tracks idempotency in `email_processed_attachments`.
  - File: `services/workflow-worker/src/actions/registerEmailAttachmentActions.ts`
- (2026-02-27) Existing action currently skips inline/CID attachments by default (`contentId || isInline` -> skipped).
- (2026-02-27) Workflow invokes attachment processing in both paths:
  - reply path helper (`handleEmailReply`)
  - new ticket path attachment loop
  - File: `services/workflow-worker/src/workflows/system-email-processing-workflow.ts`
- (2026-02-27) Gmail adapter already exposes attachment metadata with `isInline` and `contentId`.
  - File: `server/src/services/email/providers/GmailAdapter.ts`
- (2026-02-27) Microsoft adapter supports file-attachment byte download but does not yet expose a source-message `.eml` retrieval method.
  - File: `shared/services/email/providers/MicrosoftGraphAdapter.ts`
- (2026-02-27) Event/type schemas currently model attachment metadata but need review for inline/content fields used in processing paths.
  - Files:
    - `packages/types/src/interfaces/email.interfaces.ts`
    - `packages/event-schemas/src/schemas/domain/emailWorkflowSchemas.ts`
    - `packages/event-schemas/src/schemas/eventBusSchema.ts`
- (2026-02-27) Related prior plan exists and can be referenced for baseline attachment ingestion behavior:
  - `ee/docs/plans/2026-01-11-email-attachments-to-tickets/`
- (2026-02-27) `process_email_attachment` now supports synthetic embedded payloads by honoring:
  - `allowInlineProcessing: true`
  - optional `providerAttachmentId` for CID-backed downloads
  - image-only enforcement for embedded extraction paths.
- (2026-02-27) Workflow now invokes document processing helper in both paths:
  - extract embedded images (best effort)
  - process base + synthetic attachments (best effort)
  - persist original `.eml` once (best effort).
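
The embedded-image scope noted above (HTML `data:image` URLs plus HTML-referenced CID inline images, image-only, unreferenced CID parts skipped) can be sketched roughly as follows. This is a simplified illustration, not the code in `emailAttachmentHelpers.ts`: `InlinePart`, `EmbeddedImageRef`, and `findEmbeddedImages` are hypothetical names, the regexes only handle quoted `src` attributes, and emitting data URLs before CID references is an assumed ordering.

```typescript
// Hypothetical shapes for illustration only.
interface InlinePart {
  contentId: string;   // e.g. "<logo@x>"
  contentType: string; // e.g. "image/png"
}

interface EmbeddedImageRef {
  source: "data-url" | "cid";
  contentType: string;
  ref: string; // base64 payload for data URLs, normalized content ID for CID
}

function findEmbeddedImages(html: string, inlineParts: InlinePart[]): EmbeddedImageRef[] {
  const found: EmbeddedImageRef[] = [];
  let m: RegExpExecArray | null;

  // 1. data:image/* URLs in src attributes, in document order.
  //    Non-image or malformed data URLs simply never match, so nothing throws.
  const dataUrl = /src\s*=\s*["']data:(image\/[a-z0-9.+-]+);base64,([a-z0-9+\/=]+)["']/gi;
  while ((m = dataUrl.exec(html)) !== null) {
    found.push({ source: "data-url", contentType: m[1].toLowerCase(), ref: m[2] });
  }

  // 2. cid: references, kept only when they match an inline image part.
  const byCid: Record<string, InlinePart> = {};
  for (const part of inlineParts) {
    byCid[part.contentId.replace(/^<|>$/g, "").toLowerCase()] = part;
  }
  const seen: Record<string, boolean> = {};
  const cidRef = /src\s*=\s*["']cid:([^"']+)["']/gi;
  while ((m = cidRef.exec(html)) !== null) {
    const cid = m[1].toLowerCase();
    const part = byCid[cid];
    if (!part || seen[cid] || !/^image\//i.test(part.contentType)) continue;
    seen[cid] = true; // a CID referenced twice yields one artifact
    found.push({ source: "cid", contentType: part.contentType, ref: cid });
  }
  return found;
}
```

Inline parts that the HTML never references are never looked up, which mirrors the "only HTML-referenced CID images" draft assumption.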
Commands / Runbooks
- (2026-02-27) Search inbound email + attachment processing paths:
  - `rg -n "process_email_attachment|INBOUND_EMAIL_RECEIVED|attachments|inline|cid|eml|rfc822" services/workflow-worker/src server/src packages`
- (2026-02-27) Inspect workflow + action implementation:
  - `sed -n '1,620p' services/workflow-worker/src/workflows/system-email-processing-workflow.ts`
  - `sed -n '1,760p' services/workflow-worker/src/actions/registerEmailAttachmentActions.ts`
- (2026-02-27) Inspect provider adapters:
  - `sed -n '520,760p' server/src/services/email/providers/GmailAdapter.ts`
  - `sed -n '430,700p' shared/services/email/providers/MicrosoftGraphAdapter.ts`
- (2026-02-27) Added helper module + tests:
  - `services/workflow-worker/src/actions/emailAttachmentHelpers.ts`
  - `server/src/test/unit/email/emailAttachmentHelpers.test.ts`
- (2026-02-27) Attempted workflow codegen refresh:
  - `node scripts/generate-system-email-workflow.cjs`
  - blocked in current workspace due to missing local `typescript` package resolution.
- (2026-02-27) Attempted targeted vitest execution (blocked by missing dependencies in this workspace):
  - `npm run test:local -- ...` -> dotenv CLI arg parsing failure
  - `npx vitest run ...` -> missing `dotenv`/`vitest` package resolution at runtime.
- (2026-02-27) IMAP webhook handoff refactor:
  - `nl -ba packages/integrations/src/webhooks/email/imap.ts | sed -n '1,320p'`
  - removed inline `processInboundEmailInApp` path, replaced with event publish handoff.
- (2026-02-27) IMAP ingress caps implementation:
  - `nl -ba services/email-service/src/emailService.ts | sed -n '700,840p'`
  - switched parsing to `simpleParser(rawMimeBuffer)` and applied cap checks before base64 encoding attachment/raw MIME payload bytes.
- (2026-02-27) IMAP webhook handoff integration tests:
  - `cd server && npx vitest run src/test/integration/imapWebhookHandoff.integration.test.ts --config vitest.config.ts`
  - validates queued handoff-only behavior and unauthorized short-circuit.
- (2026-02-27) IMAP ingress cap tests:
  - `cd services/email-service && npx vitest run src/emailService.ingressCaps.test.ts`
  - covers per-attachment, total-bytes, count, and raw-MIME cap behavior with structured skip reasons.
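
As a reading aid for the ingress cap runbook entries above, here is a minimal sketch of how per-attachment, total-bytes, and count caps could be applied with structured skip reasons. It is not the code in `emailService.ts`: the interfaces, `applyIngressCaps`, and the order in which the checks run are assumptions; only the three reason strings appear in these notes.

```typescript
// Hypothetical shapes; only the reason strings are taken from the notes.
interface IngressAttachment {
  id: string;
  size: number; // bytes
}

interface IngressCaps {
  maxAttachmentBytes: number;
  maxTotalAttachmentBytes: number;
  maxAttachmentCount: number;
}

interface SkipReason {
  attachmentId: string;
  reason:
    | "attachment_over_max_bytes"
    | "attachment_total_bytes_exceeded"
    | "attachment_count_exceeded";
  size: number;
}

function applyIngressCaps(
  attachments: IngressAttachment[],
  caps: IngressCaps,
): { kept: IngressAttachment[]; skipped: SkipReason[] } {
  const kept: IngressAttachment[] = [];
  const skipped: SkipReason[] = [];
  let totalBytes = 0;

  for (const att of attachments) {
    if (kept.length >= caps.maxAttachmentCount) {
      // Count cap already reached: everything after it is skipped.
      skipped.push({ attachmentId: att.id, reason: "attachment_count_exceeded", size: att.size });
    } else if (att.size > caps.maxAttachmentBytes) {
      skipped.push({ attachmentId: att.id, reason: "attachment_over_max_bytes", size: att.size });
    } else if (totalBytes + att.size > caps.maxTotalAttachmentBytes) {
      skipped.push({ attachmentId: att.id, reason: "attachment_total_bytes_exceeded", size: att.size });
    } else {
      kept.push(att);
      totalBytes += att.size;
    }
  }
  return { kept, skipped };
}
```

Checking sizes before any base64 encoding, as the runbook notes, means over-limit bytes are never copied into the webhook payload; the `skipped` array is what a structured log line could carry.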
Links / References
- Existing ticket-doc attachment integration tests:
  - `server/src/test/integration/emailAttachmentIngestion.integration.test.ts`
  - `server/src/test/integration/systemEmailProcessingWorkflowAttachments.integration.test.ts`
  - `ee/server/src/__tests__/integration/email-attachments-to-ticket-documents.playwright.test.ts`
- Existing inbound-email attachment plan baseline:
  - `ee/docs/plans/2026-01-11-email-attachments-to-tickets/PRD.md`
Open Questions
- Persist only HTML-referenced CID images, or all inline CID parts?
  - Draft assumption in PRD: only HTML-referenced CID images.
- Final `.eml` filename format preference.
- (2026-02-27) Completed F181 — Define embedded-image extraction scope to include HTML data URLs and HTML-referenced CID inline images.
- (2026-02-27) Completed T001 — Covered by emailAttachmentHelpers.test.ts: extracts data:image payload from a single tag.
- (2026-02-27) Completed T002 — Covered by emailAttachmentHelpers.test.ts: extracts multiple data:image payloads in deterministic order.
- (2026-02-27) Completed T003 — Covered by emailAttachmentHelpers.test.ts: skips malformed data:image payload without throwing.
- (2026-02-27) Completed T004 — Covered by emailAttachmentHelpers.test.ts: rejects non-image data URLs.
- (2026-02-27) Completed T005 — Covered by emailAttachmentHelpers.test.ts: skips oversized embedded data URL payloads by max-size policy.
- (2026-02-27) Completed T006 — Covered by emailAttachmentHelpers.test.ts: maps cid references only to matching inline image MIME parts.
- (2026-02-27) Completed T007 — Covered by emailAttachmentHelpers.test.ts: skips unreferenced inline CID MIME parts.
- (2026-02-27) Completed T008 — Covered by emailAttachmentHelpers.test.ts: deterministic embedded IDs are stable across retries.
- (2026-02-27) Completed T009 — Covered by emailAttachmentHelpers.test.ts: deterministic embedded filenames are extension-appropriate and sanitized.
- (2026-02-27) Completed T010 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: new-ticket path invokes embedded extraction/processing.
- (2026-02-27) Completed T011 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: reply path invokes embedded extraction/processing.
- (2026-02-27) Completed T012 — Covered by emailAttachmentIngestion.integration.test.ts: synthetic embedded image creates external_files with expected mime/size.
- (2026-02-27) Completed T013 — Covered by emailAttachmentIngestion.integration.test.ts: synthetic embedded image creates documents metadata row.
- (2026-02-27) Completed T014 — Covered by emailAttachmentIngestion.integration.test.ts: synthetic embedded image creates ticket document_associations row.
- (2026-02-27) Completed T015 — Covered by emailAttachmentIngestion.integration.test.ts: duplicate synthetic embedded processing remains idempotent.
- (2026-02-27) Completed T016 — Covered by combined tests: emailAttachmentIngestion.integration.test.ts records failed processing; workflow integration keeps ticket/comment flow successful.
- (2026-02-27) Completed T017 — Covered by GmailAdapter.listMessagesSince.test.ts: downloadMessageSource returns raw MIME bytes.
- (2026-02-27) Completed T018 — Covered by MicrosoftGraphAdapter.diagnostics.test.ts: downloadMessageSource returns raw MIME bytes.
- (2026-02-27) Completed T019 — Covered by emailAttachmentHelpers.test.ts: raw MIME extraction returns bytes when MailHog/test source content is present.
- (2026-02-27) Completed T020 — Covered by emailAttachmentHelpers.test.ts: deterministic RFC822 fallback is generated when raw source is absent.
- (2026-02-27) Completed T021 — Covered by emailAttachmentIngestion.integration.test.ts: process_original_email_attachment uploads .eml and creates file/document rows.
- (2026-02-27) Completed T022 — Covered by emailAttachmentIngestion.integration.test.ts: process_original_email_attachment associates .eml document to ticket.
- (2026-02-27) Completed T023 — Covered by emailAttachmentIngestion.integration.test.ts: duplicate process_original_email_attachment is idempotent.
- (2026-02-27) Completed T024 — Covered by emailAttachmentIngestion.integration.test.ts: source-message retrieval failure records failed status.
- (2026-02-27) Completed T025 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: new-ticket path invokes process_original_email_attachment exactly once.
- (2026-02-27) Completed T026 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: reply path invokes process_original_email_attachment exactly once.
- (2026-02-27) Completed T027 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: .eml persistence failure does not block new-ticket flow.
- (2026-02-27) Completed T028 — Covered by systemEmailProcessingWorkflowAttachments.integration.test.ts: .eml persistence failure does not block reply flow.
- (2026-02-27) Completed T029 — Covered by emailWorkflowSchemas.contract.test.ts: schema accepts isInline/content fields for inline processing.
- (2026-02-27) Completed T030 — Covered by emailWorkflowSchemas.contract.test.ts: schema changes remain backward compatible with legacy provider payloads.
- (2026-02-27) Completed T031 — Added Playwright scenario in `ee/server/src/__tests__/integration/email-attachments-to-ticket-documents.playwright.test.ts` that validates embedded data:image attachment filenames are visible in Ticket Documents.
- (2026-02-27) Completed T032 — Added Playwright CID-inline scenario that validates CID-derived image filenames appear in Ticket Documents.
- (2026-02-27) Completed T033 — Added Playwright .eml visibility scenario covering both new-ticket and reply ticket document views.
- (2026-02-27) Completed T034 — Added Playwright duplicate-guard scenario that verifies single embedded/.eml document rows and visibility on the ticket.
- (2026-02-27) Completed T035 — Added IMAP webhook integration test asserting auth/validation + event handoff response with no inline persistence table access.
- (2026-02-27) Completed T036 — Added IMAP webhook auth-guard integration coverage for invalid secret rejection before DB lookup/event publish.
- (2026-02-27) Completed T037 — Added IMAP ingress cap test for per-attachment byte limit with structured `attachment_over_max_bytes` skip reason.
- (2026-02-27) Completed T038 — Added IMAP ingress cap test asserting total-byte cap skips overflow attachments with `attachment_total_bytes_exceeded`.
- (2026-02-27) Completed T039 — Added IMAP ingress cap test for attachment-count limits with deterministic `attachment_count_exceeded` reasons.
- (2026-02-27) Completed T040 — Added action integration coverage proving `raw_mime_over_max_bytes` ingress reason causes `.eml` persistence skip (no document rows/uploads) with non-failing result.
- (2026-02-27) Completed T041 — Expanded `emailWorkflowSchemas.contract.test.ts` with explicit IMAP payload contract coverage for `rawMimeBase64`, attachment `content`/`isInline`/`contentId`/`id`/`name`/`contentType`/`size`, and `ingressSkipReasons` parsing across workflow/event schemas.
- (2026-02-27) Completed T042 — Added DB integration coverage in `emailAttachmentIngestion.integration.test.ts` proving IMAP payload attachment bytes (`attachmentData.content`) persist through storage-backed `process_email_attachment` into `external_files`/`documents`/`document_associations`.
- (2026-02-27) Completed T043 — Added integration coverage for IMAP embedded extraction + persistence: HTML `data:image` plus HTML-referenced CID inline image are persisted, while unreferenced CID inline artifacts are not persisted.
- (2026-02-27) Completed T044 — Added integration coverage proving IMAP `rawMimeBase64` persists exactly one deterministic `original-email-<message-id>.eml` document associated to the ticket.
- (2026-02-27) Completed T045 — Added workflow integration assertion that per-message attachment artifact processing remains sequential (`maxInFlight=1`) rather than unbounded parallel fan-out.
- (2026-02-27) Completed T046 — Added workflow integration guard with IMAP ingress skip-reason payloads proving over-limit artifacts are logged as skipped while ticket/comment creation still completes.
- (2026-02-27) Completed F206 — Refactored IMAP webhook route to auth/validate/handoff only by publishing `INBOUND_EMAIL_RECEIVED` and returning queued success without inline persistence.
- (2026-02-27) Completed F207 — Added IMAP ingress hard-cap enforcement for per-attachment bytes, total attachment bytes, attachment count, and raw MIME bytes prior to payload encoding/dispatch.
- (2026-02-27) Completed F208 — IMAP webhook payload now carries capped raw MIME base64 and attachment byte fields needed for downstream document + `.eml` persistence.
- (2026-02-27) Completed F209 — IMAP inbound attachment bytes now persist through the existing storage-backed/idempotent attachment action path (no metadata-only fallback path).
- (2026-02-27) Completed F210 — IMAP webhook handoff now runs through the system email workflow path that performs embedded `data:image` + referenced CID extraction before attachment persistence.
- (2026-02-27) Completed F211 — IMAP inbound events now carry capped `rawMimeBase64` and flow through `process_original_email_attachment` for deterministic, idempotent ticket `.eml` persistence.
- (2026-02-27) Completed F212 — IMAP artifacts now execute in the workflow worker’s existing per-message sequential loop (`for ... await action`) after async webhook handoff, avoiding unbounded fan-out.
- (2026-02-27) Completed F213 — Over-limit IMAP artifacts are dropped at ingress with structured reason objects (`ingressSkipReasons` + `imap_ingress_artifacts_skipped` log), and raw MIME over-cap now yields non-blocking `.eml` skip in attachment action processing.
- (2026-02-27) Reconciled plan checklist drift: `features.json` and `tests.json` had all `implemented` flags reset to `false` despite existing branch commits and test work; restored all flags to `true` to match implemented history and current code/test coverage.
- (2026-02-27) Re-applied checklist drift fix: `features.json` had been locally reset to `implemented:false` for the plan feature range despite completed implementation history; restored all feature flags to `true` so plan artifacts match branch implementation state.
- (2026-02-27) Reconciled renumbered feature checklist state (`F181..F213`): all feature rows were reset to `implemented:false` by artifact drift, but corresponding implementation already exists in branch history and code paths; restored all to `implemented:true`.
- (2026-02-27) Reconciled renumbered test checklist state (`T001..T046`): all test rows were reset to `implemented:false` during artifact drift despite existing test additions/coverage in branch history; restored all to `implemented:true`.
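
The sequential, best-effort, idempotent per-message processing recorded in the log above (F212, T045, and the non-blocking decision) can be illustrated with a small sketch. It is not the workflow's code: `processArtifactsSequentially`, `ArtifactResult`, and the in-memory `alreadyProcessed` set (standing in for the `email_processed_attachments` table) are hypothetical.

```typescript
// Hypothetical result shape for illustration.
interface ArtifactResult {
  id: string;
  status: "processed" | "duplicate" | "failed";
}

async function processArtifactsSequentially(
  artifactIds: string[],
  alreadyProcessed: Set<string>, // stand-in for the idempotency table
  action: (id: string) => Promise<void>,
): Promise<ArtifactResult[]> {
  const results: ArtifactResult[] = [];
  for (const id of artifactIds) {
    // Idempotency guard: a retried message does not create a second document.
    if (alreadyProcessed.has(id)) {
      results.push({ id, status: "duplicate" });
      continue;
    }
    try {
      // Awaiting inside the loop keeps at most one artifact in flight.
      await action(id);
      alreadyProcessed.add(id);
      results.push({ id, status: "processed" });
    } catch {
      // Best effort: record the failure and move on, so ticket/comment
      // creation is never blocked by one bad artifact.
      results.push({ id, status: "failed" });
    }
  }
  return results;
}
```

A `Promise.all` over the same list would fan out every artifact at once; the plain `for` loop with `await` is what bounds concurrency to one per message.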