Hermes 284313f908
Some checks are pending
Bidi Control Character Guard / bidi-control-guard (push) Waiting to run
Circular Dependency Check / Check for new circular dependencies (push) Waiting to run
Citus Migration Smoke / Combined migrations on single-node Citus (push) Waiting to run
E2E Fresh Install Tests / fresh-install-e2e (push) Waiting to run
ext-v2 guardrails / Run ext-v2 guard and ESLint (push) Waiting to run
Integration Tests / Check for relevant changes (push) Waiting to run
Integration Tests / ${{ (github.event_name == 'schedule' || github.event.inputs.suite == 'full') && 'Full integration suite' || 'Tier-1 integration subset' }} (push) Blocked by required conditions
Mobile checks / Mobile lint + typecheck (push) Waiting to run
Mobile checks / Mobile unit tests (push) Waiting to run
Mobile checks / Mobile dependency audit (report) (push) Waiting to run
Mobile checks / Mobile reproducibility checks (push) Waiting to run
Secrets guard (env backups) / Ensure no tracked env backup files (push) Waiting to run
Temporal Readiness / fast-readiness (push) Waiting to run
Temporal Readiness / docker-parity (push) Waiting to run
TypeScript Type Check / Nx affected typecheck (push) Waiting to run
Unit Tests / Skipped-test budget (push) Waiting to run
Unit Tests / Nx affected unit tests (push) Waiting to run
Unit Tests / Server unit coverage (informational) (push) Waiting to run
Validate Tenant Management Schema / Check for relevant changes (push) Waiting to run
Validate Tenant Management Schema / Validate Tenant Management Schema (push) Blocked by required conditions
EE Workflows Build Guard / ee-workflows-build-guard (push) Waiting to run
Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

39 KiB
Raw Permalink Blame History

App-Wide Search — PRD

Plan ID: 2026-05-13-app-wide-search Owner: Natallia Bukhtsik Status: Draft (pending scope confirmation) Date: 2026-05-13


1. Summary

Add a global full-text search bar to the MSP navigation sidebar that finds records across all major entities in the system — clients, contacts, team members, tickets, ticket comments, projects, project tasks and their comments, assets, invoices, contracts/quotes, documents, KB articles, the service catalog, service requests, workflow tasks, schedule entries, interactions, and time-entry notes. The search is powered by a single tenant-scoped index table (app_search_index) using PostgreSQL FTS (tsvector/GIN) with a pg_trgm fuzzy fallback. The index is kept in sync via event-bus-driven incremental updates, with a one-time backfill and a periodic reconciliation job. Per-entity authorization is enforced both via denormalized ACL columns on the index and via record-level checks before snippets are returned.

2. Problem Statement

Today there is no way to "just find a thing" in Alga PSA. To locate a record, users must know which area of the app it lives in, navigate to that area, and use entity-specific filters. Cross-entity discovery (e.g., "any record that mentions ACME") is impossible. This costs MSP technicians and dispatchers measurable time on every ticket triage, project lookup, and client phone-call workflow.

3. Goals

  • G1 — Single search input in the MSP sidebar that returns relevant records across all major entity types in under ~300 ms p50 for a tenant of typical size.
  • G2 — Wide entity coverage. A v1 user can find clients, contacts, team members, tickets, ticket comments, projects, project tasks, project task comments, assets, invoices (incl. line items + annotations), contracts (incl. quote drafts), documents, KB articles, service catalog items, service requests, workflow tasks, schedule entries, interactions, time-entry notes, boards, categories, and tags.
  • G3 — Tenant isolation. No query ever returns a row from another tenant. Enforced at the schema (tenant in PK + distribution column) and query (mandatory tenant = ? predicate) layers.
  • G4 — Permission correctness. Search never surfaces titles, snippets, or even the existence of records the user cannot read in the app proper. Internal ticket comments stay hidden from non-internal users. Private documents stay hidden.
  • G5 — Fresh-enough index. Newly created/edited records appear in search within seconds (event-bus indexing latency). Worst-case staleness (during reconciliation lag) is bounded to under 24 hours.
  • G6 — Fuzzy matching. Users can type partial tokens (acm), shortened IDs (tic-10), and minor misspellings (exhcange) and still get useful results.
  • G7 — Pluggable. Adding a new entity type to the search index is a single registry entry plus an indexer module; no schema migrations required after v1.

4. Non-Goals

  • Client portal search. Out of scope for v1. Will reuse the index and ACL infrastructure in v2.
  • Semantic / vector / embedding search. Out of scope. The index table is designed so this can be added later as an additional column without restructuring.
  • Saved searches, search analytics dashboards, or admin tooling for tuning relevance.
  • Cross-tenant federation for support admins.
  • Search-driven bulk operations (e.g., "find all tickets matching X and reassign").
  • Highlighting inside the destination page beyond a deep-link anchor where one trivially exists (e.g., ticket comment hash).
  • Replacing entity-specific filters. The existing /api/v1/*/search endpoints stay; this is additive.

5. Users & Primary Flows

Personas

  • MSP Technician — receives a phone call, needs to find a client and their open tickets in seconds.
  • MSP Dispatcher — needs to find the right ticket/project/asset across hundreds of records to assign or schedule work.
  • MSP Manager — needs to discover invoices, contracts, or KB articles by partial title or content.

Primary flows

  1. Type-and-jump. User clicks the sidebar search (or presses Cmd/Ctrl+K), types a few characters, sees grouped results in a dropdown, presses Enter or clicks → navigates directly to the canonical URL of the chosen record.
  2. Find a comment in a ticket. User searches for a phrase they remember a tech mentioning. A ticket-comment result links to /msp/tickets/{ticket_id}#comment-{comment_id}. The destination page scrolls and highlights that comment briefly.
  3. Find a client by partial name. User types acm — sees ACME Corp, ACME Holdings, and any tickets/contracts containing "acme" in their title.
  4. Find by ticket number. User types TIC-1023 — gets the ticket as the top result regardless of FTS tokenization.
  5. Empty result. User searches for something that doesn't exist; sees a clear "no results" state with their query echoed back.

6. UX / UI Notes

The primary search surface is a dedicated results page at /msp/search?q=.... The sidebar input acts as a launcher with a minimal typeahead for the very top hits — it's not the surface users read through. Every search produces a real, shareable URL; every result row is a real anchor so Cmd/Ctrl+click opens in a new tab natively.

6.1 Sidebar launcher + typeahead

  • Location. Persistent input at the top of the MSP sidebar (server/src/components/layout/Sidebar.tsx). Also openable via Cmd/Ctrl+K.
  • Typeahead behavior. Debounced 200 ms; shows up to 5 title-only suggestions (no snippets, no group headers) ranked across all types. The last entry is always → See all N results which navigates to /msp/search?q=....
  • Keyboard. ↑/↓ to navigate, Enter on a row opens that record, Enter on the input (or on "See all results") goes to the results page, Esc closes.
  • Rows are real <a> elements with href pointing at the canonical URL — Cmd/Ctrl+click and middle-click open in new tabs without extra code.
  • Empty state. Quiet — typeahead just doesn't render until ≥2 chars.

6.2 Results page (/msp/search)

  • URL is the state. ?q=, ?type= (filter), ?cursor= (pagination), ?sort= (relevance | recent). Bookmarkable, shareable, back-button-safe.
  • Layout. Filter chips across the top (All, then one per entity type with a count badge); grouped results below by entity type when type=All, flat list when a single type is selected.
  • Each result row shows entity icon, title, subtitle (e.g., client name for a ticket), ts_headline snippet for body matches, and a relative updated_at tag.
  • Pagination. Cursor-based "prev / next"; default 25 rows/page; "Load more" is acceptable as an alternate.
  • Per-row affordances. Click to navigate, Cmd/Ctrl+click to open in new tab (native anchor behavior), right-click for browser context menu (also free with anchors).
  • Empty state. Echoes the query, suggests broader filters or removing the type filter, links to entity-type filtered pages.
  • Loading state. Skeleton rows; debounced server hits at 200 ms while the user types in the results-page input (mirrors the sidebar input).

6.3 Shared

  • i18n. All UI strings go through useTranslation('msp/core') under a new search.* namespace.
  • Accessibility. Stable kebab-case ids for the UI reflection system; ARIA combobox semantics on the sidebar input; ARIA region semantics on the results page; high-contrast snippet highlight; keyboard-only flows tested end to end.
  • Snippet HTML. ts_headline HTML is sanitized at the server-action boundary (allow only <mark>); the React component renders via a trusted-string wrapper, never dangerouslySetInnerHTML of unsanitized output.

7. Entity Scope

All of the following are indexed at v1 launch. Adding/removing entities later is one registry entry + one indexer module.

# Entity Source table(s) Title field Subtitle / body fields Deep-link URL pattern ACL strategy
1 Client clients client_name email, phone_no, notes /msp/clients/{client_id} Tenant-wide (MSP)
2 Contact contacts full_name email, phone_number, role /msp/contacts/{contact_name_id} Tenant-wide (MSP)
3 Team member users (internal only) first_name last_name username, email, title /msp/team/{user_id} Tenant-wide (MSP); excludes client-type users
4 Ticket tickets title ticket_number, client name (denormalized) /msp/tickets/{ticket_id} Ticket permission + board scope
5 Ticket comment comments (parent: ticket) parent ticket title note (markdown → text) /msp/tickets/{ticket_id}#comment-{comment_id} Ticket permission + is_internal flag
6 Project projects project_name description /msp/projects/{project_id} Project permission + client scope
7 Project phase project_phases phase_name description, parent project name /msp/projects/{project_id}/phases/{phase_id} Inherits project ACL
8 Project task project_tasks task_name description /msp/projects/{project_id}/tasks/{task_id} Inherits project ACL
9 Project task comment project_task_comments parent task name markdown_content (preferred) or BlockNote→text /msp/projects/{project_id}/tasks/{task_id}#comment-{task_comment_id} Inherits project ACL
10 Asset assets name asset_tag, serial_number, location, JSONB attributes → flattened /msp/assets/{asset_id} Tenant-wide; respects client-scope filters
11 Invoice invoices invoice_number client name (denormalized), total, status /msp/invoices/{invoice_id} Invoice permission
12 Invoice line item invoice_items parent invoice number description /msp/invoices/{invoice_id}#item-{item_id} Inherits invoice ACL
13 Invoice annotation invoice_annotations parent invoice number content /msp/invoices/{invoice_id}#annotation-{annotation_id} Inherits invoice ACL
14 Contract contracts contract_name contract_description; subtitle="Quote" when status='draft' else "Contract" (statuses: active, draft, terminated, expired) /msp/billing/contracts/{contract_id} Contract permission
15 Client contract client_contracts (joins contracts + clients) derived: {client_name} {contract_name} dates + status /msp/clients/{client_id}/contracts/{client_contract_id} Inherits client + contract ACL via client_scope_id
16 Document documents (content column holds BlockNote JSON) document_name content (BlockNote→text, truncated to 64 KB) /msp/documents/{document_id} Document permission; tenant-wide visibility at v1 (no internal share mechanism exists); optional client_scope_id from documents.client_id
17 KB article kb_articles (FK to documents) document name document content (BlockNote→text) /msp/knowledge-base/{article_id} KB read permission
18 Service catalog item service_catalog service_name description, JSONB attributes → flattened /msp/billing/services/{service_id} Service catalog permission
19 Service request submission service_request_submissions request_name submitted_payload JSONB → flattened strings /msp/service-requests/{submission_id} Service request permission
20 Service request definition service_request_definitions name description /msp/service-requests/definitions/{definition_id} Admin permission
21 Workflow task workflow_tasks (note: PK is task_id alone, not (tenant, task_id); tenant is a regular column) title description /msp/workflow-tasks/{task_id} Workflow task permission + assignee scope via assigned_users jsonb
22 Interaction interactions title interaction type name + counterparty (client/contact/ticket); notes (BlockNote→text) as body /msp/interactions/{interaction_id} Interaction permission
23 Schedule entry schedule_entries title notes /msp/schedule/{entry_id} Schedule permission + assignee scope
24 Time entry time_entries derived (work item + date) notes links to parent work item Time-entry permission + owner scope
25 Board boards channel_name /msp/tickets?board={board_id} Tenant-wide
26 Category categories category_name /msp/tickets?category={category_id} Tenant-wide
27 Tag tags tag_text filter link Tenant-wide

Note on quotes. The codebase has no separate quotes table; quotes are draft contracts. The indexer flags status = 'draft' contracts with subtitle = "Quote" so users searching "quote" get sensible results.

8. Architecture Overview

┌─────────────────────────────────────────────────────────────────────┐
│ MSP UI (Next.js)                                                    │
│  Sidebar.tsx ──► SearchPalette (cmdk) ──► searchAppAction()         │
└─────────────────────────────────────────────────────────────────────┘
                          │ withAuth server action
                          ▼
┌─────────────────────────────────────────────────────────────────────┐
│ Search service (server/src/lib/search/)                             │
│  • buildQuery(query, allowedTypes, user)                            │
│  • runQuery(knex, …)  → FTS + pg_trgm fallback                      │
│  • applyAcl(user, rows) → record-level final filter (defence-in-    │
│    depth on top of denormalized ACL columns)                        │
│  • formatResults() → snippets via ts_headline                       │
└─────────────────────────────────────────────────────────────────────┘
                          │ knex
                          ▼
┌─────────────────────────────────────────────────────────────────────┐
│ Postgres / Citus                                                    │
│  app_search_index  (distributed by tenant)                          │
│  • GIN(search_vector)                                               │
│  • GIN(title gin_trgm_ops), GIN(subtitle gin_trgm_ops)              │
└─────────────────────────────────────────────────────────────────────┘
                          ▲
                          │ upsert / delete
┌─────────────────────────────────────────────────────────────────────┐
│ Indexer framework (server/src/lib/search/indexers/)                 │
│  Registry: { client, contact, ticket, ticket_comment, … }           │
│  Each entry implements:                                             │
│    • async loadOne(tenant, id) → SearchDoc | null                   │
│    • async loadBatch(tenant, cursor, limit) → SearchDoc[]           │
│    • sourceEvents: EventType[]                                      │
└─────────────────────────────────────────────────────────────────────┘
                          ▲
        ┌─────────────────┼──────────────────┐
        │                 │                  │
  ┌──────────┐    ┌──────────────┐   ┌──────────────────┐
  │ Event    │    │ Backfill     │   │ Reconciliation   │
  │ bus      │    │ CLI / job    │   │ pg-boss cron     │
  │ sub      │    │              │   │  (daily diff)    │
  └──────────┘    └──────────────┘   └──────────────────┘
        ▲
        │ event-bus events
  ┌──────────┐
  │ Existing │  TICKET_CREATED, TICKET_UPDATED, TICKET_COMMENT_ADDED,
  │ publish  │  + new: CLIENT_*, CONTACT_*, PROJECT_*, ASSET_*,
  │ sites    │    INVOICE_*, CONTRACT_*, DOCUMENT_*, USER_*, etc.
  └──────────┘

9. Data Model

9.1 app_search_index table

CREATE TABLE app_search_index (
  tenant                uuid        NOT NULL,
  object_type           text        NOT NULL,   -- 'client' | 'ticket' | ...
  object_id             text        NOT NULL,   -- text to accommodate composite ids
  parent_type           text,
  parent_id             text,

  title                 text        NOT NULL,
  subtitle              text,
  body                  text,                   -- capped at 64 KB

  url                   text        NOT NULL,
  metadata              jsonb       NOT NULL DEFAULT '{}'::jsonb,

  -- Denormalized ACL hints
  visible_to_user_ids   uuid[]      NOT NULL DEFAULT '{}',   -- empty = no per-user restriction
  visible_to_roles      text[]      NOT NULL DEFAULT '{}',   -- empty = no role restriction
  is_internal_only      boolean     NOT NULL DEFAULT false,  -- e.g., internal ticket comments
  is_private            boolean     NOT NULL DEFAULT false,  -- e.g., private documents
  client_scope_id       uuid,                                -- if set, only users with access to this client can see
  required_permission   text,                                -- e.g., 'ticket:read'

  -- Search columns
  search_vector         tsvector    NOT NULL,
  search_lang           text        NOT NULL DEFAULT 'english',

  -- Bookkeeping
  source_updated_at     timestamptz NOT NULL,   -- the source row's updated_at, used by reconciliation
  indexed_at            timestamptz NOT NULL DEFAULT now(),

  PRIMARY KEY (tenant, object_type, object_id)
);

-- Citus
SELECT create_distributed_table('app_search_index', 'tenant');

-- Indexes
CREATE INDEX app_search_index_vector_gin
  ON app_search_index USING gin (search_vector);

CREATE INDEX app_search_index_title_trgm
  ON app_search_index USING gin (title gin_trgm_ops);

CREATE INDEX app_search_index_subtitle_trgm
  ON app_search_index USING gin (subtitle gin_trgm_ops);

CREATE INDEX app_search_index_recent
  ON app_search_index (tenant, source_updated_at DESC);

CREATE INDEX app_search_index_type
  ON app_search_index (tenant, object_type);

9.2 SearchDoc interface

The indexer-side representation, converted to a row at insert time. Defined in server/src/lib/search/types.ts.

export interface SearchDoc {
  tenant: string;
  objectType: SearchObjectType;
  objectId: string;
  parentType?: SearchObjectType;
  parentId?: string;
  title: string;
  subtitle?: string;
  body?: string;
  url: string;
  metadata?: Record<string, unknown>;
  acl: {
    visibleToUserIds?: string[];
    visibleToRoles?: string[];
    isInternalOnly?: boolean;
    isPrivate?: boolean;
    clientScopeId?: string;
    requiredPermission?: string;
  };
  sourceUpdatedAt: Date;
}

9.3 Body normalization

A new utility, server/src/lib/search/normalize.ts:

  • flattenBlockNote(json: unknown): string — walks BlockNote JSON, concatenating text nodes, dropping image data URIs.
  • flattenMarkdown(md: string): string — strips markdown formatting tokens.
  • flattenJsonbPayload(obj: unknown): string — recursively pulls string leaves out of a JSONB blob (used for service-request submissions and asset attributes).
  • truncateForIndex(text: string, maxBytes = 65_536): string — UTF-8-safe byte cap.

The existing public.process_large_lexemes() Postgres function is reused as the final cleanser of the body string before to_tsvector is computed in the indexer.

10. Indexer Framework

Located at server/src/lib/search/indexers/. One file per entity type:

server/src/lib/search/
  index.ts                         (registry)
  types.ts                         (SearchDoc, SearchObjectType union)
  normalize.ts
  upsert.ts                        (knex upsert into app_search_index)
  query.ts                         (FTS + pg_trgm)
  acl.ts                           (record-level filter helpers)
  ts_headline.ts                   (snippet builder)
  indexers/
    client.ts
    contact.ts
    user.ts
    ticket.ts
    ticket_comment.ts
    project.ts
    project_phase.ts
    project_task.ts
    project_task_comment.ts
    asset.ts
    invoice.ts
    invoice_item.ts
    invoice_annotation.ts
    contract.ts
    client_contract.ts
    document.ts
    kb_article.ts
    service_catalog.ts
    service_request_submission.ts
    service_request_definition.ts
    workflow_task.ts
    interaction.ts
    schedule_entry.ts
    time_entry.ts
    board.ts
    category.ts
    tag.ts

Each indexer exports:

export const clientIndexer: EntityIndexer = {
  objectType: 'client',
  sourceEvents: ['CLIENT_CREATED', 'CLIENT_UPDATED', 'CLIENT_DELETED'],
  loadOne: async (knex, tenant, id) => { /* … */ return doc | null },
  loadBatch: async (knex, tenant, cursor, limit) => { /* … */ return docs[] },
};

The registry exports getIndexer(objectType) and allIndexers().

11. Indexing Pipeline

11.1 Event-driven (real-time)

A new subscriber at server/src/lib/eventBus/subscribers/searchIndexSubscriber.ts listens to all configured sourceEvents. On each event it:

  1. Looks up the indexer via the registry.
  2. For *_DELETED events, deletes the row from app_search_index.
  3. For *_CREATED / *_UPDATED events, calls indexer.loadOne() and upserts.

Cascade events. Some entities depend on others:

  • A ticket update re-indexes the ticket. It also touches all ticket comments because their parent.title (denormalized) may have changed → handled by the ticket indexer dispatching TICKET_COMMENT_REINDEX for each comment.
  • An invoice update re-indexes line items + annotations.
  • A project update re-indexes phases + tasks (capped, async).

11.2 Gap: new events

The recon shows event-bus emission exists for ticket/comment events but NOT for clients/contacts/projects/assets/invoices/contracts/documents. The plan adds the missing publishes (one feature per entity family) at the existing action call sites. Use the schema in server/src/lib/eventBus/events.ts.

11.3 Backfill (one-time)

A new CLI script server/src/scripts/search-backfill.ts (also runnable via npm script npm run search:backfill) that:

  1. Iterates all tenants (or one specific tenant via flag).
  2. For each indexer, pages through loadBatch() in chunks of 500 and upserts.
  3. Logs progress to stdout and a row count summary per entity type per tenant.
  4. Is idempotent — re-running upserts overwrite.

11.4 Reconciliation (periodic)

A pg-boss job search:reconcile scheduled daily:

  1. For each tenant and each entity type, SELECT rows from the source table where updated_at > (max(source_updated_at) for that type).
  2. Re-index those rows.
  3. Also: SELECT source IDs missing from the index → re-index. SELECT index IDs missing from source → delete.

This catches drift from dropped events, manual SQL changes, or bugs.

11.5 ACL re-index triggers

Permission/assignment changes need to refresh denormalized ACL columns:

  • Ticket reassignment → re-index the ticket and its comments.
  • Document permission change → re-index the document.
  • User role change → background job re-indexes affected rows for that user's visible_to_user_ids membership. (Acceptable to run async; max staleness ~10 min via a pg-boss job triggered by user-update events.)

12. Search Server Action

New action at server/src/lib/actions/searchActions.ts:

export const searchAppAction = withAuth(async (
  user,
  { tenant },
  input: SearchAppInput
): Promise<SearchAppResult> => {
  const { query, types, limit = 30, cursor } = input;
  // 1. Trim/validate query
  // 2. Determine allowedTypes = types ∩ entitiesUserCanRead(user)
  // 3. Build SQL: WHERE tenant = ? AND object_type = ANY(?) AND acl_filter(user)
  //    ORDER BY rank DESC, source_updated_at DESC
  //    LIMIT (limit + ACL_OVERFETCH_BUFFER) OFFSET cursor
  // 4. Run record-level ACL pass for defence-in-depth
  // 5. Format snippets with ts_headline
  // 6. Return grouped results
});

Input

interface SearchAppInput {
  query: string;
  types?: SearchObjectType[];   // omit = all the user can see
  limit?: number;               // default 30, max 100
  cursor?: string;              // opaque, encodes (rank, object_id)
}

Output

interface SearchResultRow {
  type: SearchObjectType;
  id: string;
  parentId?: string;
  title: string;
  subtitle?: string;
  snippet?: string;             // ts_headline output, sanitized to allow only <mark>
  url: string;
  score: number;
  updatedAt: string;            // ISO
}

interface SearchAppResult {
  results: SearchResultRow[];
  groups: Record<SearchObjectType, number>;   // counts by type, before pagination
  totalCount: number;
  nextCursor?: string;
}

Variants

  • searchAppAction(input) — full results, used by the results page.
  • searchAppTypeaheadAction(input) — top-5 title-only suggestions for the sidebar, optimised for sub-100 ms p50. Skips snippet generation and grouping. Reuses the same query builder under the hood, with limit=5 and snippet=false.

13. Ranking & Snippets

13.1 SQL shape (simplified)

WITH q AS (
  SELECT websearch_to_tsquery('english', $query) AS tsq,
         $query AS raw
)
SELECT
  s.object_type, s.object_id, s.parent_id, s.title, s.subtitle, s.url,
  s.source_updated_at,
  ts_rank_cd(s.search_vector, q.tsq) AS fts_rank,
  GREATEST(
    similarity(s.title,    q.raw),
    similarity(s.subtitle, q.raw)
  ) AS trgm_rank,
  ts_headline('english', s.body, q.tsq,
              'MaxFragments=2,StartSel=<mark>,StopSel=</mark>') AS snippet
FROM app_search_index s, q
WHERE s.tenant = $tenant
  AND s.object_type = ANY($allowedTypes)
  AND acl_predicate(s, $user)
  AND (
    s.search_vector @@ q.tsq
    OR s.title    % q.raw          -- pg_trgm fuzzy
    OR s.subtitle % q.raw
  )
ORDER BY
  -- Weighted composite: prefer FTS hits, then trigram, then recency
  (fts_rank * 1.0 + trgm_rank * 0.4)
    * exp(-EXTRACT(epoch FROM (now() - source_updated_at)) / (90 * 86400))
  DESC,
  s.source_updated_at DESC,
  s.object_id
LIMIT $limit OFFSET $offset;

13.2 Field weighting

Per indexer, body strings are composed before to_tsvector is computed:

search_vector =
  setweight(to_tsvector('english', title    ), 'A')   -- highest
  || setweight(to_tsvector('english', subtitle), 'B')
  || setweight(to_tsvector('english', body    ), 'C');

13.3 Identifier matches

Short identifier matches (ticket numbers, invoice numbers, asset tags) are handled by a separate exact-match branch: if query matches ^[A-Z]+-?\d+$ it is also probed against denormalized metadata->>'identifier' and pinned at the top.

13.4 Time decay

Multiplier exp(-age_days / 90) floor 0.05. Tunable per object type later via metadata; not configurable in v1.

14. Permissions & ACL

14.1 Two-layer filter

  1. SQL-level filter via denormalized columns (is_internal_only, is_private, visible_to_user_ids, visible_to_roles, client_scope_id, required_permission). This is the load-bearing layer for pagination correctness.
  2. Record-level final pass in app code. A verifyResultVisibility(user, rows) call runs each row through the same permission helpers used by the entity's primary actions (e.g., assertTicketReadable). Mismatches are dropped and logged (telemetry for index drift, not user-facing).

14.2 ACL strategy per entity (v1)

Entity SQL filter Record-level final check
Client required_permission='client:read' checked via hasPermission none extra
Contact required_permission='contact:read' none extra
Team member required_permission='user:read' none extra
Ticket required_permission='ticket:read' AND board scope via visible_to_roles assertTicketReadable(user, ticket_id)
Ticket comment inherits ticket ACL; if is_internal_only=true, requires internal user parent ticket check + comment-internal check
Project required_permission='project:read' AND client_scope_id assertProjectReadable
Project phase/task/task-comment inherits project ACL parent project check
Asset required_permission='asset:read' AND optional client_scope_id none extra
Invoice (+ items, annotations) required_permission='invoice:read' AND client_scope_id none extra
Contract (+ client_contract) required_permission='contract:read' AND optional client_scope_id none extra
Document required_permission='document:read'; optional client_scope_id derived from documents.client_id when set (no internal share-list mechanism exists in CE — documents are otherwise tenant-wide) none extra
KB article required_permission='kb:read' none extra
Service catalog required_permission='service_catalog:read' none extra
Service request submission required_permission='service_request:read' AND optional client_scope_id none extra
Service request definition required_permission='admin' none extra
Workflow task required_permission='workflow_task:read'; visible_to_user_ids for assignee scope none extra
Interaction required_permission='interaction:read' none extra
Schedule entry required_permission='schedule:read' AND visible_to_user_ids for owner scope none extra
Time entry required_permission='time:read'; visible_to_user_ids for owner scope none extra
Board / Category / Tag required_permission='ticket:read' none extra

14.3 Defence-in-depth fail-safe

If the SQL filter and the record-level check disagree, the row is dropped AND a telemetry counter (server log + Sentry) increments search.acl_drift. Drift > 0 is treated as a bug.

15. Multi-Tenancy & Citus Considerations

  • app_search_index is distributed by tenant. All inserts, updates, deletes go via tenant-included WHERE clauses.
  • The indexer framework requires the caller to pass tenant explicitly; nothing relies on app.current_tenant GUC.
  • No joins across tables in the search query path; the index is fully denormalized. This avoids cross-shard join concerns.
  • The reconciliation job operates one tenant at a time.

16. Rollout & Migration

  • Single release. Ships in CE main branch; EE inherits automatically. No PostHog flag.
  • Backfill order on deploy.
    1. Apply migration adding app_search_index table + indexes + pg_trgm extension if not present.
    2. Deploy code with subscriber disabled by env var SEARCH_INDEX_LIVE=false.
    3. Run npm run search:backfill against production for each tenant.
    4. Flip SEARCH_INDEX_LIVE=true and roll the workers; subscriber begins indexing.
    5. Enable the sidebar UI via merge to main.
  • Reconciliation job is enabled day-1; first run catches anything missed during the backfill ↔ live-flip window.
  • No DB downtime required. Index build can run CONCURRENTLY after table creation.

17. Acceptance Criteria / Definition of Done

A user can do all of the following from a clean prod-like environment with a seeded multi-tenant dataset:

  1. Open the sidebar input, type acme, see ACME Corp as the top typeahead suggestion within 500 ms.
  2. Press Enter from the sidebar input → land on /msp/search?q=acme with full grouped results.
  3. Cmd/Ctrl+click any result row on the results page → opens in a new tab; original page state is preserved.
  4. Type TIC-1023 (or partial tic-10) and find the matching ticket as the top result.
  5. Type a phrase known to appear only in a single ticket comment; click the result and land on the ticket page with the comment highlighted.
  6. Type a phrase known to appear only in an internal ticket comment as an internal user — the comment is in the results.
  7. Repeat (6) as a non-internal user — the comment is NOT in the results.
  8. As user A, search for a project user A has no access to — the project does NOT appear.
  9. As user A (client-scoped, no access to client X), search for a document whose documents.client_id = X — the document does NOT appear in results.
  10. Type a misspelled client name (exhcange for Exchange) and still get the right top result via pg_trgm fallback.
  11. Create a new ticket; within 5 seconds it appears in search.
  12. Delete a record; within 5 seconds it disappears from search.
  13. Run npm run search:backfill on a fresh DB and verify a sampled record from each of the 27 entity types is searchable.
  14. Run the reconciliation job manually after deleting an index row directly via SQL → the row reappears.
  15. Search across multiple tenants in a load test; no cross-tenant leak in 1M queries.
  16. All search UI strings render correctly in pseudo locale (xx) — confirming i18n coverage.
  17. Sidebar typeahead AND results page are fully keyboard-navigable (Tab, arrows, Enter, Esc).
  18. The results-page URL with ?q=, ?type=, ?cursor=, ?sort= is bookmarkable: opening it cold renders the same results without the user re-typing.

18. Risks & Open Questions

Risks

  • R1 — Permission leakage. Mitigated by two-layer ACL, exhaustive permission tests per entity (one positive + one negative test minimum), and drift telemetry.
  • R2 — Index size. With ~27 entity types and documents/comment bodies, the index could be large for big tenants. Mitigations: 64 KB body cap, process_large_lexemes stripping, GIN compression. Budget: target index ≤ 20% of total DB size on a typical tenant.
  • R3 — Event-bus reliability. If events drop, search goes stale. Mitigated by daily reconciliation job and the source_updated_at watermark.
  • R4 — Backfill duration on big tenants. Single-tenant backfill could take hours. Mitigation: paged loads + parallel per-type workers; backfill runs offline before subscriber goes live.
  • R5 — Ranking quality. Time-decay constants are guesses. Plan: ship the formula, add a metadata.boost_score per object type later when we have telemetry. Out of scope for v1.
  • R6 — BlockNote schema changes. If document format changes, the flattener breaks. Mitigation: snapshot tests on real BlockNote payloads + version detection.

Open questions

All resolved 2026-05-13:

  • Q1 — Snippet HTML sanitization → Server-side rebuild. The query path emits ts_headline with controlled <mark> sentinels; the search service splits on sentinels, HTML-escapes each text segment, re-wraps match segments in <mark>, returns a string the client can render via a trusted-string component. No DOMPurify dependency on the client.
  • Q2 — Query length cap → 200 chars. Enforced in Zod input schema.
  • Q3 — time_entries.notes → Index only when non-empty (notes IS NOT NULL AND notes <> ''). No minimum-length threshold.
  • Q4 — Board/Category/Tag → Result rows, not filter chips. Surfaced as normal results.
  • Q5 — interactions → Uses current schema. Per migration 20250530000000_improve_interactions_schema.cjs, description was renamed to title and a new notes text column was added. notes stores BlockNote JSON. Indexer: title=title, body=flattenBlockNote(notes) truncated to 64 KB, subtitle=interaction type name + counterparty.

19. CE / EE Extension Mechanism

The entire search subsystem ships in CE. EE never forks any of these files; it extends via a single hook.

19.1 Registry merge

server/src/lib/search/index.ts imports two indexer arrays and merges them:

import { ceIndexers } from './indexers';                       // CE indexers (27 entries at v1)
import { eeIndexers } from 'ee/server/src/lib/search/indexers'; // STUBBED in CE (returns [])

const registry = new Map<string, EntityIndexer>(
  [...ceIndexers, ...eeIndexers].map(i => [i.objectType, i])
);

export function getIndexer(objectType: string): EntityIndexer | undefined {
  return registry.get(objectType);
}

export function allIndexers(): EntityIndexer[] {
  return [...registry.values()];
}

export function registeredObjectTypes(): string[] {
  return [...registry.keys()];
}

In CE builds, the import path ee/server/src/lib/search/indexers resolves to a stub file (existing CE/EE stub pattern in the repo) that exports export const eeIndexers: EntityIndexer[] = []. In EE builds, it resolves to the real EE module with EE-specific indexers (e.g., chat history, AI conversations, extension content).

19.2 Schema is shared

app_search_index.object_type is a plain text column, not an enum. CE and EE share the same table. CE-only deploys simply have no rows where object_type is an EE-only type.

19.3 Event subscriber is registry-driven

searchIndexSubscriber subscribes to the union of sourceEvents from allIndexers(). In CE, that union excludes EE event types. In EE, the same code (no fork) subscribes to additional event families because EE's indexers brought them in.

19.4 UI is data-driven

The results page filter chips and group headers iterate over registeredObjectTypes() so EE entity types appear automatically in EE builds without any UI code change. i18n keys for EE entity labels live in the EE locale namespace and are looked up by search.filters.{objectType} / search.groups.{objectType} — CE renders a fallback (humanized object_type) if the key is missing in CE.

19.5 Orphan safety on edition transition

If a deploy transitions from EE to CE (or an EE feature is removed), index rows whose object_type no longer has a registered indexer become invisible: the query layer always restricts object_type = ANY(registeredObjectTypes()). The orphans sit harmlessly in the table. Reconciliation skips object_types without registered indexers — it does NOT attempt to load the source row, because it has no indexer to do so. A maintenance script (out of v1 scope) can purge these rows on demand.

19.6 What EE must implement

For each EE-only entity that should be searchable, EE provides:

  1. An entry in ee/server/src/lib/search/indexers/index.ts (export const eeIndexers: EntityIndexer[]).
  2. A per-entity indexer module under ee/server/src/lib/search/indexers/<entity>.ts, following the same EntityIndexer interface as CE indexers.
  3. Event types in packages/event-schemas (or wherever EE event extensions live) with Zod payload schemas, plus publishes at the relevant EE action sites.
  4. i18n entries for the entity's filter/group label in the EE locale namespace.

EE does not touch the CE registry file, the subscriber, the search action, the query builder, or the UI.


End of PRD.