# Reply Parsing and Outbound Markers
This reference outlines how inbound replies are parsed and how outbound notifications are prepared so that only the new message content is stored as a ticket comment.
## Outbound Markers
Every notification sent through `sendEventEmail` now includes reply-friendly markers:
- A visible banner rendered at the top of the message that reads `--- Please reply above this line ---`.
- A hidden `
` element with `data-alga-reply-token`, `data-alga-ticket-id`, and related attributes so the parser can recover the originating entity.
- Plain-text footers (`[ALGA-REPLY-TOKEN …]`, `ALGA-TICKET-ID:…`, etc.) appended to the text version to cover clients that strip HTML elements.
When a corresponding event provides a `replyContext`, a conversation token is generated, persisted to the `email_reply_tokens` table, and embedded into the outbound email. Tokens capture:
- `ticketId` or `projectId`
- Optional `commentId` (for comment notifications)
- Optional `threadId` for threading hints
The `email_reply_tokens` table keeps the mapping so the inbound workflow can resolve replies even if subjects or threading headers change.
## Parser Heuristics
`server/src/lib/email/replyParser.ts` applies a layered set of heuristics:
1. **Explicit boundary** – trims at `--- Please reply above this line ---` or localized variants.
2. **Token stripping** – removes `[ALGA-REPLY-TOKEN …]` and ID footers from the sanitized body.
3. **Provider headers** – recognises Gmail (`On … wrote:`), Outlook quoting, Microsoft Graph forwarded headers, and drops quoted history.
4. **Quote filtering** – skips `>` prefixed lines while still allowing inline responses after quoted blocks.
5. **Signature trimming** – strips common signatures (`Thanks,`, `Best regards,`, `Sent from …`) within the trailing 12 lines.
6. **Fallback** – if everything is removed the original body is retained and flagged as low confidence.
The parser returns:
- Sanitized text/html
- Applied heuristics and warnings
- Extracted token metadata (conversation token, ticket/comment/project identifiers)
- Confidence level (`high`, `medium`, `low`)
Fixtures under `server/src/lib/email/__fixtures__/` cover Gmail top-posting, Outlook inline replies, forwarded chains, and signature-heavy responses. Vitest inline snapshots describe the sanitized output for each scenario.
## Inbound Workflow Integration
The workflow stores the parser result (`metadata.parser`) alongside the comment. When a reply arrives:
- If a conversation token is present, `find_ticket_by_reply_token` resolves the target ticket/comment before consulting threading headers.
- Low-confidence parses log a warning and include truncated raw bodies in the comment metadata for operator review.
- Attachments are processed exactly as before; trimming only affects the comment body.
## Provider Notes
| Provider | Behaviours handled | Key heuristics |
|----------|-------------------|----------------|
| Gmail | Top-posted replies with `On … wrote:` headers, inline quoting, Mail API quoting markers | Boundary split, provider header detection, quote filtering |
| Outlook / Exchange | Prefixed `>` quoting, `_` separators, mobile signatures (`Sent from Outlook`) | Quote filtering, signature trimming |
| Microsoft Graph | Forwarded chains (`Forwarded message`), multilingual headers (`Envoyé :`, `De :`) | Localised header detection, forwarded header breakpoints |
These behaviours inform the regex lists in the parser so new providers can be added without rewriting workflow logic. Adjustments for additional locales can be made via `ReplyParserConfig` (see `getDefaultReplyParserConfig`).
## Configuration Surface
`getDefaultReplyParserConfig()` exposes the defaults. Consumers can supply overrides (alternate delimiter text, custom signature markers, etc.) when invoking the parser through `parse_email_reply`.
## Data Model Summary
```
email_reply_tokens
tenant UUID (PK part)
token TEXT (PK part)
ticket_id UUID NULL
project_id UUID NULL
comment_id UUID NULL
entity_type TEXT DEFAULT 'ticket'
metadata JSONB
template TEXT
recipient_email TEXT
created_at TIMESTAMP WITH TIME ZONE
expires_at TIMESTAMP WITH TIME ZONE NULL
```
Rows expire manually (retention policy TBD). The workflow uses the mapping purely to locate the target record on inbound replies.