PSA/docs/superpowers/specs/2026-06-09-huntress-integration-design.md
Hermes 284313f908
Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

# Huntress Integration — Design
Huntress is a managed security platform whose SOC reviews detections and
publishes incident reports. This integration turns those incident reports into
Alga tickets automatically: an MSP connects a Huntress account with an API key,
maps Huntress organizations to Alga clients, and every SOC-reviewed incident
becomes a routed, self-contained ticket — with no one watching the Huntress
portal.
It is the first security-monitoring integration. It reuses the RMM integration
data model (`rmm_integrations`, `rmm_organization_mappings`, `rmm_alerts`) with
`provider = 'huntress'`, ships EE-only like NinjaOne, and appears in the
integrations settings under a new "Security" section.
## Goals
1. **Incident → ticket, automatic.** Open incident reports become tickets via a
polling engine; no manual steps after setup.
2. **Org mapping that fails safe.** Huntress organization → Alga client, with a
mapping screen and exact-name auto-match. Incidents for unmapped
organizations become tickets on a designated fallback client and triage
board — never silently dropped.
3. **Routing config.** Severity → priority mapping, target board, and optional
category/subcategory, so security tickets land on the security board.
4. **Dedup / update-in-place.** One incident = one ticket; later incident
updates append internal notes to the existing ticket.
5. **Self-contained tickets.** Body contains the SOC summary, indicator types,
affected host details, remediation steps, and a deep link into the Huntress
portal.
## Non-goals (deferred)
- **Webhook ingestion.** Huntress webhooks are configured manually in their
portal and their payloads are not part of the public OpenAPI spec. The poller
is the sole ingestion path. If a webhook endpoint is added later, it should
only *trigger* an immediate poll cycle ("poke"), never be trusted as a data
source.
- **Write-back to Huntress** (`POST /v1/incident_reports/{id}/resolution`).
- **Escalations and signals ingestion** (`/v1/escalations`, `/v1/signals`).
The poller's processor is structured so a second entity stream can be added.
- **Per-client routing overrides.** Day-one routing is per-integration. The
existing `rmm_alert_rules` engine
(`ee/server/src/lib/integrations/ninjaone/alerts/alertProcessor.ts`) is a
natural future home for this; it is untouched by this work.
- **Huntress agent → asset sync engine.** Agents are looked up on demand for
ticket context and best-effort asset linking only.
## External API constraints
Source of truth: the Huntress public API (OpenAPI spec, `api.huntress.io`).
- **Auth:** HTTP Basic — `Base64(api_key:api_secret)`. Keys are generated
per-account at `<subdomain>.huntress.io/account/api_credentials`. There is no
OAuth and no partner-tier API.
- **Rate limit:** 60 requests/minute per account, sliding window.
- **Incident reports:** `GET /v1/incident_reports` supports `limit` (≤500),
`page_token`, `sort_field` (`id|created_at|updated_at`), `sort_direction`,
and filters (`status`, `severity`, `platform`, `organization_id`,
`agent_id`, `indicator_type`). **There is no "updated since" filter** — the
poller must walk pages sorted by `updated_at desc` until it passes its
cursor.
- **Incident fields used:** `id`, `organization_id`, `agent_id`, `severity`
(`low|high|critical`), `status` (`sent|closed|dismissed|auto_remediating|
deleting|partner_dismissed`), `subject`, `summary`, `body`,
`indicator_types`, `indicator_counts`, `platform`, `remediations` (first 10
inline), `sent_at`, `status_updated_at`, `closed_at`, `updated_at`.
- **Organizations:** `GET /v1/organizations` → `id`, `name`, `key`.
- **Agents:** `GET /v1/agents/{id}` → `hostname`, `platform`, `os`,
`ipv4_address`, `external_ip`, `serial_number`, `last_callback_at`.
- **Account:** `GET /v1/account` → `name`, `subdomain`. The subdomain is
captured at connect time to build portal deep links.
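
The 60 requests/minute sliding window can be enforced client-side before a request leaves the process. The following is a minimal sketch under stated assumptions, not the actual `huntressClient.ts`: the class name `SlidingWindowLimiter` and its methods are hypothetical, and the clock is passed in as a parameter so the logic can be tested without timers.

```typescript
// Hypothetical helper: tracks request timestamps inside a sliding window.
// Defaults mirror the documented limit of 60 requests per 60 seconds.
class SlidingWindowLimiter {
  private readonly sent: number[] = [];
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number = 60, windowMs: number = 60000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  /** Milliseconds to wait before another request fits in the window (0 = send now). */
  msUntilAllowed(now: number): number {
    this.evict(now);
    if (this.sent.length < this.maxRequests) return 0;
    const oldest = this.sent[0] ?? now;
    // The oldest request leaves the window at oldest + windowMs.
    return Math.max(0, oldest + this.windowMs - now);
  }

  /** Record a request sent at time `now` (epoch milliseconds). */
  record(now: number): void {
    this.evict(now);
    this.sent.push(now);
  }

  /** Drop timestamps that have aged out of the window. */
  private evict(now: number): void {
    const cutoff = now - this.windowMs;
    let drop = 0;
    while (drop < this.sent.length && (this.sent[drop] ?? Infinity) <= cutoff) {
      drop++;
    }
    this.sent.splice(0, drop);
  }
}
```

A caller would sleep for `msUntilAllowed(Date.now())` and then call `record(Date.now())` immediately before each request. A 429 response still needs its own backoff, since other consumers of the same API key share the account-level limit.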
## Data model
No new tables and no migration. The integration reuses the RMM schema
(`server/migrations/20251124000001_create_rmm_integration_tables.cjs`);
`rmm_integrations.provider` is an unconstrained string, so `'huntress'` is
accepted as-is.
### `rmm_integrations` (one row per tenant)
- `provider = 'huntress'`, `instance_url` = API base URL (default
`https://api.huntress.io`), `is_active`, `connected_at`, `sync_status`,
`sync_error`, `last_incremental_sync_at`.
- `settings` JSONB:
```jsonc
{
  "accountName": "Acme MSP",
  "accountSubdomain": "acmemsp",            // for portal deep links
  "incidentCursor": "2026-06-09T14:00:00Z", // max updated_at fully processed
  "pollIntervalMinutes": 5,
  "backfillDays": 7,
  "severityPriorityMap": {                  // huntress severity -> priority_id
    "critical": "<uuid>",
    "high": "<uuid>",
    "low": "<uuid>"
  },
  "boardId": "<uuid>",                      // required: security board
  "categoryId": "<uuid|null>",
  "subcategoryId": "<uuid|null>",
  "fallbackClientId": "<uuid>",             // required before polling activates
  "fallbackBoardId": "<uuid>",              // required: triage board
  "autoCloseTickets": false,
  "closedStatusId": "<uuid|null>"           // used when autoCloseTickets = true
}
```
### `rmm_organization_mappings` (one row per Huntress organization)
- `external_organization_id` = Huntress org id (stringified),
`external_organization_name`, `client_id` (null = unmapped),
`auto_create_tickets` (default true), `metadata.auto_matched` flag.
- Unique on `(tenant, integration_id, external_organization_id)`.
### `rmm_alerts` (one row per incident report)
- `external_alert_id` = Huntress incident id (stringified) — the unique
constraint `(tenant, integration_id, external_alert_id)` is the dedup
guarantee.
- `severity` = Huntress severity, `status` = Huntress status, `message` =
incident `subject`, `device_name` = agent hostname when known,
`external_device_id` = agent id, `asset_id` when an asset match is found,
`ticket_id` once a ticket exists, `triggered_at` = `sent_at`.
- `metadata` JSONB: incident snapshot (`summary`, `indicator_types`,
`indicator_counts`, `platform`, `organization_id`, remediation summaries,
`status_updated_at`, portal URL, processing-error details when a cycle
fails on this incident).
### `tenant_external_entity_mappings`
When an incident's agent is matched to an existing Alga asset:
`integration_type = 'huntress'`, `alga_entity_type = 'asset'`,
`external_entity_id` = agent id, `external_realm_id` = org id.
### Tickets
- `source = 'huntress'`, `source_reference` = incident id.
- `board_id`, `priority_id`, `category_id`/`subcategory_id` from routing
config; `client_id` from the org mapping (or fallback client).
## Components
All new code is EE, under `ee/server/src/lib/integrations/huntress/`,
mirroring the NinjaOne layout (`ee/server/src/lib/integrations/ninjaone/`):
```
huntress/
  huntressClient.ts         REST client: Basic auth, throttle to 60 req/min,
                            429 backoff, page_token pagination
  incidents/
    incidentPoller.ts       cursor walk + dispatch, transport-wrapped
    incidentProcessor.ts    upsert rmm_alerts; create / note / close decisions
    ticketCreator.ts        builds the self-contained ticket (NinjaOne
                            ticketCreator pattern: transaction, default status,
                            ticket number, comment thread + internal note)
  organizations/
    orgSync.ts              org fetch -> mapping upsert + name auto-match
```
Supporting pieces:
- **Server actions:** `ee/server/src/lib/actions/integrations/huntressActions.ts`
— connect (validate via `GET /v1/account`, store secrets, upsert integration
row, initial org sync), disconnect, get status, update settings, sync
organizations, update a mapping row.
- **Secrets:** tenant-scoped via `ISecretProvider`
(`packages/core/src/lib/secrets/ISecretProvider.ts`):
`huntress_api_key`, `huntress_api_secret`.
- **Settings UI:**
`ee/server/src/components/settings/integrations/HuntressIntegrationSettings.tsx`
(connect card, routing config, poll/auto-close settings) and
`.../integrations/huntress/OrganizationMappingManager.tsx` (org table,
client picker, auto-match indicators, per-row ticket toggle, unmapped-count
badge, re-sync). Wired into
`packages/integrations/src/components/settings/integrations/RmmIntegrationsSetup.tsx`
via the `@enterprise` dynamic-import pattern.
- **Provider registry:**
`packages/integrations/src/lib/rmm/providerRegistry.ts` gains a
`category: 'rmm' | 'security'` field; Huntress registers with
`category: 'security'`, `requiresEnterprise: true`. The setup page renders
one card section per category.
- **Scheduling/transport:** the poll entry point runs through
`runRmmSyncWithTransport()`
(`ee/server/src/lib/integrations/rmm/sync/syncOrchestration.ts`):
`HUNTRESS_SYNC_TRANSPORT` → `RMM_SYNC_TRANSPORT` → `'direct'`. A recurring
job registered with the existing job scheduler
(`packages/jobs/src/lib/jobs/jobScheduler.ts`) iterates active Huntress
integrations on each tick and skips tenants whose `pollIntervalMinutes`
hasn't elapsed since `last_incremental_sync_at`.
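
The scheduling bullet above combines two small decisions: which transport to use, and whether a tenant is due for a poll on this tick. The sketch below illustrates both, assuming the three transport names form a fallback chain read in that order and that an empty environment value counts as unset. The function names `resolveHuntressTransport` and `isPollDue` are hypothetical and are not taken from `syncOrchestration.ts` or `jobScheduler.ts`.

```typescript
// Assumed precedence: Huntress-specific override, then the shared RMM
// setting, then the 'direct' default. Empty strings are treated as unset.
function resolveHuntressTransport(
  env: Record<string, string | undefined>,
): string {
  return env["HUNTRESS_SYNC_TRANSPORT"] || env["RMM_SYNC_TRANSPORT"] || "direct";
}

// A tenant is due when it has never synced, or when at least
// pollIntervalMinutes have passed since last_incremental_sync_at.
function isPollDue(
  lastIncrementalSyncAt: Date | null,
  pollIntervalMinutes: number,
  now: Date,
): boolean {
  if (lastIncrementalSyncAt === null) return true;
  const elapsedMs = now.getTime() - lastIncrementalSyncAt.getTime();
  return elapsedMs >= pollIntervalMinutes * 60 * 1000;
}
```

Keeping the due check separate from the scheduler tick means the recurring job can run at a fixed short interval while each tenant keeps its own `pollIntervalMinutes`.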
## Flows
### Connect
1. User enters API key + secret (and optionally a base URL) in the Huntress
settings card.
2. Action calls `GET /v1/account`; on success stores secrets, upserts the
`rmm_integrations` row with `accountName`/`accountSubdomain`, and runs the
initial org sync.
3. Polling stays inactive until routing config is complete: `boardId`,
`fallbackClientId`, `fallbackBoardId`, and a full `severityPriorityMap`
(pre-filled at connect by case-insensitive name match against the tenant's
priorities: Critical/Urgent, High, Medium).
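
The activation gate in step 3 can be expressed as a pure check over the `settings` JSONB. This is a sketch only: `missingRoutingFields` and `canActivatePolling` are hypothetical names, and the check assumes "full" means every Huntress severity (`critical`, `high`, `low`) has a non-empty priority id.

```typescript
const HUNTRESS_SEVERITIES = ["critical", "high", "low"] as const;
type HuntressSeverity = (typeof HUNTRESS_SEVERITIES)[number];

// Subset of the settings JSONB that gates polling.
interface RoutingSettings {
  boardId?: string | null;
  fallbackClientId?: string | null;
  fallbackBoardId?: string | null;
  severityPriorityMap?: Partial<Record<HuntressSeverity, string | null>>;
}

// Returns the names of required fields that are still unset, so the
// settings UI can point at exactly what is missing.
function missingRoutingFields(settings: RoutingSettings): string[] {
  const missing: string[] = [];
  if (!settings.boardId) missing.push("boardId");
  if (!settings.fallbackClientId) missing.push("fallbackClientId");
  if (!settings.fallbackBoardId) missing.push("fallbackBoardId");
  for (const severity of HUNTRESS_SEVERITIES) {
    if (!settings.severityPriorityMap?.[severity]) {
      missing.push(`severityPriorityMap.${severity}`);
    }
  }
  return missing;
}

// Polling may start only when nothing is missing.
function canActivatePolling(settings: RoutingSettings): boolean {
  return missingRoutingFields(settings).length === 0;
}
```

Returning the list of missing fields rather than a bare boolean lets the same function drive both the poller's refusal to start and the validation messages in the settings form.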
### Organization sync & auto-match
1. Fetch all organizations; upsert mapping rows by external org id (names
refresh on every sync).
2. For rows with `client_id IS NULL`, compare normalized names
(lowercase, collapse whitespace, strip punctuation) against the tenant's
clients. An exact normalized match links the client and sets
`metadata.auto_matched = true`; anything weaker is left unmapped for the
user. Auto-matched rows are visibly flagged in the UI and editable.
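
The normalization and exact-match rule from step 2 might look like the sketch below. The helper names and the `ClientRef` shape are hypothetical. Two simplifications are assumptions of this sketch rather than stated design: punctuation stripping is ASCII-only (any character outside `a-z`, `0-9`, and whitespace is removed), and an organization whose normalized name matches more than one client is left unmapped.

```typescript
interface ClientRef {
  id: string;
  name: string;
}

// Lowercase, strip punctuation, collapse whitespace runs, trim.
// ASCII-only simplification: non-ASCII letters are also removed.
function normalizeName(name: string): string {
  return name
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, "")
    .replace(/\s+/g, " ")
    .trim();
}

// Returns the id of the single client whose normalized name equals the
// organization's normalized name, or null when there is no exact match
// (or, by assumption, more than one).
function autoMatchClient(orgName: string, clients: ClientRef[]): string | null {
  const target = normalizeName(orgName);
  if (target === "") return null;
  const [only, ...rest] = clients.filter((c) => normalizeName(c.name) === target);
  return only !== undefined && rest.length === 0 ? only.id : null;
}
```

For example, a Huntress organization named `Acme, Inc.` would match an Alga client named `ACME Inc` because both normalize to `acme inc`, while `Acme Incorporated` would stay unmapped for the user to resolve.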
### Poll cycle (per integration)
1. List `/v1/incident_reports` sorted `updated_at desc`, page size 500,
collecting rows until one is older than `incidentCursor - 60s` (the overlap
absorbs clock skew; dedup makes reprocessing harmless). The first run instead
collects back to `now - backfillDays`.
2. Process collected incidents in ascending `updated_at` order through the
incident processor.
3. Advance `incidentCursor` past each successfully processed incident; stop
advancing at the first failure (that incident and everything newer is
retried next cycle) and record the error on the alert row's metadata.
4. On auth or API failure: `sync_status = 'error'`, `sync_error` set, banner
in the settings UI. Nothing is lost — incidents remain in Huntress and the
cursor resumes when the credential is fixed.
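
Steps 1 to 3 of the poll cycle can be sketched as two functions: one that walks pages newest-first down to a time floor, and one that processes the result oldest-first and advances the cursor until the first failure. The sketch assumes a 60-second overlap below the cursor. All names here (`collectIncidents`, `runPollCycle`, `ListPage`, `nextPageToken`) are hypothetical; the page fetcher and the processor are injected so the walk can be exercised without the real client. Recording the error on the alert row is omitted; the sketch only reports which incident failed.

```typescript
interface IncidentRow { id: number; updated_at: string }
interface IncidentPage { incidents: IncidentRow[]; nextPageToken: string | null }
type ListPage = (pageToken: string | null) => Promise<IncidentPage>;
type ProcessIncident = (incident: IncidentRow) => Promise<void>;

const OVERLAP_MS = 60 * 1000;
const DAY_MS = 24 * 60 * 60 * 1000;

// Walk pages (already sorted updated_at desc) until a row is older than
// the floor: cursor minus overlap, or now minus backfillDays on first run.
async function collectIncidents(
  listPage: ListPage, cursor: string | null, backfillDays: number, now: Date,
): Promise<IncidentRow[]> {
  const floorMs = cursor !== null
    ? Date.parse(cursor) - OVERLAP_MS
    : now.getTime() - backfillDays * DAY_MS;
  const collected: IncidentRow[] = [];
  let token: string | null = null;
  do {
    const page: IncidentPage = await listPage(token);
    for (const incident of page.incidents) {
      if (Date.parse(incident.updated_at) < floorMs) return collected;
      collected.push(incident);
    }
    token = page.nextPageToken;
  } while (token !== null);
  return collected;
}

// Process oldest-first; the cursor only moves forward and stops at the
// first failure so that incident and everything newer is retried.
async function runPollCycle(
  listPage: ListPage, processIncident: ProcessIncident,
  cursor: string | null, backfillDays: number, now: Date,
): Promise<{ cursor: string | null; failedIncidentId: number | null }> {
  const collected = await collectIncidents(listPage, cursor, backfillDays, now);
  const ascending = [...collected].sort(
    (a, b) => Date.parse(a.updated_at) - Date.parse(b.updated_at),
  );
  let next = cursor;
  for (const incident of ascending) {
    try {
      await processIncident(incident);
    } catch {
      return { cursor: next, failedIncidentId: incident.id };
    }
    if (next === null || Date.parse(incident.updated_at) > Date.parse(next)) {
      next = incident.updated_at;
    }
  }
  return { cursor: next, failedIncidentId: null };
}
```

Because rows inside the overlap window are older than the stored cursor, the forward-only comparison keeps a reprocessed incident from moving the cursor backward.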
### Incident processing
For each incident, upsert the `rmm_alerts` row, then:
- **New, status open (`sent`, `auto_remediating`):** resolve the org mapping.
  - Mapped client and `auto_create_tickets` on → create ticket with routing
    config.
  - Unmapped org (no row, or `client_id IS NULL`) → fetch and upsert the org
    mapping row if it's unknown, then create the ticket on
    `fallbackClientId`/`fallbackBoardId` with an `[Unmapped Org]` title prefix
    and a note explaining how to map the organization.
  - `auto_create_tickets` explicitly off on the mapping row (mapped or not) →
    record the alert row only; the user opted that organization out.
- **New, status already closed/dismissed** (backfill case): record the alert
row without a ticket.
- **Status `deleting`:** skip.
- **Existing row, changed** (`updated_at`, status, or remediation state):
append an internal note to the linked ticket describing what changed. If the
new status is `closed`, `dismissed`, or `partner_dismissed` and
`autoCloseTickets` is on, also move the ticket to `closedStatusId`.
Ticket creation and the alert-row `ticket_id` update happen in one
transaction; the dedup constraint plus the `ticket_id` check make creation
idempotent across overlapping polls.
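
The branching above can be captured as a pure decision function that the processor calls after upserting the alert row. This is one possible reading of the rules, not the actual `incidentProcessor.ts`: the type and function names are hypothetical, change detection compares only `updated_at` and status (remediation state is left out), and a row that exists without a `ticket_id` is treated like a new incident so that a retried poll can still create the ticket.

```typescript
type IncidentStatus =
  | "sent" | "closed" | "dismissed"
  | "auto_remediating" | "deleting" | "partner_dismissed";

interface IncidentInput { status: IncidentStatus; updatedAt: string }
interface AlertRow { status: IncidentStatus; updatedAt: string; ticketId: string | null }
interface OrgMapping { clientId: string | null; autoCreateTickets: boolean }

type Decision =
  | { kind: "skip" }
  | { kind: "record_only" }
  | { kind: "create_ticket"; route: "mapped" | "fallback" }
  | { kind: "append_note"; closeTicket: boolean }
  | { kind: "no_change" };

const isOpen = (s: IncidentStatus): boolean =>
  s === "sent" || s === "auto_remediating";
const isTerminal = (s: IncidentStatus): boolean =>
  s === "closed" || s === "dismissed" || s === "partner_dismissed";

function decide(
  incident: IncidentInput,
  existing: AlertRow | null,
  mapping: OrgMapping | null,
  autoCloseTickets: boolean,
): Decision {
  if (incident.status === "deleting") return { kind: "skip" };

  // A ticket already exists: only changes produce work.
  if (existing !== null && existing.ticketId !== null) {
    const changed =
      existing.updatedAt !== incident.updatedAt ||
      existing.status !== incident.status;
    if (!changed) return { kind: "no_change" };
    return {
      kind: "append_note",
      closeTicket: autoCloseTickets && isTerminal(incident.status),
    };
  }

  // No ticket yet: closed or dismissed incidents are recorded only.
  if (!isOpen(incident.status)) return { kind: "record_only" };
  // The organization was explicitly opted out of ticket creation.
  if (mapping !== null && !mapping.autoCreateTickets) return { kind: "record_only" };
  if (mapping !== null && mapping.clientId !== null) {
    return { kind: "create_ticket", route: "mapped" };
  }
  return { kind: "create_ticket", route: "fallback" };
}
```

Keeping the decision free of database calls makes the state matrix in the Testing section straightforward to enumerate, since every combination is a plain function call.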
### Ticket content
Title: `[Huntress] <severity> — <subject>` (fallback tickets additionally get
the `[Unmapped Org]` prefix). Body sections:
1. **Incident** — severity, status, SOC analyst summary, indicator types and
counts, platform, sent/updated timestamps.
2. **Affected host** — hostname, OS, internal/external IPs, serial number,
last callback (from `GET /v1/agents/{id}`; omitted for host-less incidents
such as `microsoft_365` identity cases, which show the org context
instead).
3. **Remediations** — the inline remediation steps with status.
4. **Links** — deep link to the incident in the Huntress portal, built from
`accountSubdomain`. The exact portal path is confirmed against a live
account during implementation; the builder is isolated in one function so
a path correction is a one-line change.
An internal note (NinjaOne `ticketCreator.ts` pattern: comment thread first,
then the internal-note comment) records the raw incident identifiers for
audit.
### Asset linking (best effort)
If the incident has an `agent_id` and the org is mapped: look up the agent,
match `hostname` case-insensitively (tie-break on `serial_number`) against
the mapped client's assets. A unique match links the ticket via
`asset_associations`, sets `rmm_alerts.asset_id`, and upserts the
`tenant_external_entity_mappings` row. Ambiguous or missing matches skip
linking — host details are already in the body.
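
The matching rule can be sketched as follows. The `AgentInfo` fields come from the agent response listed under External API constraints; the `AssetRef` shape and the function name `matchAsset` are hypothetical stand-ins for whatever the asset model exposes. The sketch returns `null` for missing and ambiguous matches alike, which corresponds to skipping the link.

```typescript
interface AgentInfo { hostname: string; serial_number: string | null }
interface AssetRef { id: string; name: string; serialNumber: string | null }

const fold = (value: string | null | undefined): string =>
  (value ?? "").trim().toLowerCase();

// Match on hostname (case-insensitive); when several assets share the
// hostname, use the serial number as a tie-break. Anything that is not
// a unique match returns null.
function matchAsset(agent: AgentInfo, clientAssets: AssetRef[]): string | null {
  const host = fold(agent.hostname);
  if (host === "") return null;

  const byHost = clientAssets.filter((asset) => fold(asset.name) === host);
  if (byHost.length === 0) return null;
  if (byHost.length === 1) return byHost[0]?.id ?? null;

  const serial = fold(agent.serial_number);
  if (serial === "") return null;
  const bySerial = byHost.filter((asset) => fold(asset.serialNumber) === serial);
  return bySerial.length === 1 ? (bySerial[0]?.id ?? null) : null;
}
```

Only assets belonging to the mapped client should be passed in, so a hostname that happens to exist under another client can never be linked.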
## Error handling summary
| Failure | Behavior |
| --- | --- |
| Huntress 401/403 | Integration `sync_status='error'` + UI banner; poll resumes on fix; no data loss |
| Huntress 429 | Client-side throttle should prevent it; on occurrence, back off and finish the cycle late |
| Per-incident processing error | Cursor stops before it; error stored in alert metadata; retried next cycle |
| Unmapped organization | Ticket on fallback client/triage board, flagged — never dropped |
| Unknown org id mid-poll | Org fetched on demand, unmapped mapping row created, fallback routing |
| Fallback config missing | Polling refuses to activate; settings UI requires it during setup |
| Agent lookup failure | Ticket still created; host section omitted with a note |
## Testing
- **Unit (mocked client interface):** cursor walker — pagination, overlap
window, backfill, failure-stops-cursor; incident processor — every state
transition (new/update/close × mapped/unmapped/auto-create-off ×
auto-close on/off); severity→priority resolution; name-normalization
auto-match; ticket-body builder (host-less incidents, remediation
rendering, deep link).
- **Integration (repo harness, seeded tenant):** connect → org sync →
poll → ticket created with correct board/priority/category → second poll
with an updated incident appends a note, not a ticket → closed incident
with auto-close on resolves the ticket; unmapped-org incident lands on the
fallback client/board.
- **Manual smoke:** against a real Huntress account — verify the portal deep
link path and the 60 req/min throttle behavior.