Hermes 284313f908
Some checks are pending
Bidi Control Character Guard / bidi-control-guard (push) Waiting to run
Circular Dependency Check / Check for new circular dependencies (push) Waiting to run
Citus Migration Smoke / Combined migrations on single-node Citus (push) Waiting to run
E2E Fresh Install Tests / fresh-install-e2e (push) Waiting to run
ext-v2 guardrails / Run ext-v2 guard and ESLint (push) Waiting to run
Integration Tests / Check for relevant changes (push) Waiting to run
Integration Tests / ${{ (github.event_name == 'schedule' || github.event.inputs.suite == 'full') && 'Full integration suite' || 'Tier-1 integration subset' }} (push) Blocked by required conditions
Mobile checks / Mobile lint + typecheck (push) Waiting to run
Mobile checks / Mobile unit tests (push) Waiting to run
Mobile checks / Mobile dependency audit (report) (push) Waiting to run
Mobile checks / Mobile reproducibility checks (push) Waiting to run
Secrets guard (env backups) / Ensure no tracked env backup files (push) Waiting to run
Temporal Readiness / fast-readiness (push) Waiting to run
Temporal Readiness / docker-parity (push) Waiting to run
TypeScript Type Check / Nx affected typecheck (push) Waiting to run
Unit Tests / Skipped-test budget (push) Waiting to run
Unit Tests / Nx affected unit tests (push) Waiting to run
Unit Tests / Server unit coverage (informational) (push) Waiting to run
Validate Tenant Management Schema / Check for relevant changes (push) Waiting to run
Validate Tenant Management Schema / Validate Tenant Management Schema (push) Blocked by required conditions
EE Workflows Build Guard / ee-workflows-build-guard (push) Waiting to run
Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

228 lines
11 KiB
Markdown
Raw Permalink Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# PRD: Integration Workflow Modules
- **Status:** Draft
- **Owner:** Robert Isaacs
- **Created:** 2026-06-12
- **Design:** `../2026-06-12-integration-workflow-modules-design.md` (architecture authority)
- **Branch:** `feature/integration-workflow-modules` off `main`
## 1. Problem statement & user value
The workflow designer can host integration-specific "app" modules — palette
tiles exposing an integration's operations as typed workflow actions, shown
only when that integration is connected — but exactly one exists (NinjaOne,
6 actions), its availability check is a hardcoded if-chain, and its module
wiring is inlined in `core.ts`. Meanwhile the operations MSPs actually
orchestrate across tools (run a script on an endpoint, trigger an RMM
automation, resolve a security incident, post to a Teams channel, put a
technician on the schedule) are absent from the palette, so the
"self-healing alert" loop the RMM alert events enable cannot be finished
inside Alga.
This project generalizes the module plumbing so each integration is one
self-contained registration, then ships four new modules (Tactical RMM,
Level, Huntress, Microsoft Teams), expands NinjaOne with script execution,
and adds a core `scheduling.create_entry` action. Incumbent PSAs offer no
user-composable canvas over third-party tools; the market's answer is a
separate $5001,500/mo orchestration product. This puts it in the PSA.
## 2. Goals
- Adding integration #6 requires one file plus one `core.ts` line — no
framework edits, no availability if-chain.
- A connected Tactical RMM tenant can run a stored script or raw command on
an agent from a workflow and use its output in later steps.
- A connected Level tenant can trigger a Level automation (optionally scoped
to specific devices) and check its run status.
- A connected Huntress tenant can enrich tickets from incident/agent/org
data and resolve the Huntress incident when work completes.
- A tenant with the Teams app active can notify a user's activity feed, DM a
user via the bot, and post to a channel where the app is installed.
- Workflows can create dispatch-board schedule entries that ride the
existing calendar sync to a technician's connected external calendar.
- Palette tiles appear if and only if the integration is available for the
tenant; NinjaOne behavior is unchanged (parity regression).
## 3. Non-goals
- QuickBooks Online / Stripe modules (financial follow-up, deliberately
deferred), Tanium (pre-release).
- Email app tile (`email.send` already covers tenant-provider send) and a
direct Calendar app tile (schedule-entry approach chosen instead).
- Huntress remediation approve/reject (sharp edge; revisit on demand).
- DB-backed module catalog or per-tenant module curation.
- Prebuilt workflow recipe gallery (launch follow-up, separate effort).
- New vendor capabilities beyond those listed (e.g. Tactical service
control, NinjaOne patch apply).
## 4. Personas & primary flows
- **Automation-minded MSP engineer:** builds "disk-full self-remediation":
RMM alert trigger → `tacticalrmm.agents.run_script` (cleanup script) →
ticket note with output → `levelio.alerts.resolve` / alert reset → ticket
closes. Builds "Huntress enrichment": ticket-created trigger →
`huntress.incidents.get` + `huntress.agents.get` → ticket fields/note →
on close, `huntress.incidents.resolve`.
- **Dispatcher:** escalation workflow posts to the service-desk Teams
channel and creates a schedule entry for the on-call technician.
- **Tenant admin:** connects/disconnects integrations and sees palette
tiles appear/disappear accordingly; no configuration beyond the
integration itself.
## 5. Framework changes (build first)
### 5.1 Availability resolver registry
New registry in `shared/workflow/runtime/registries/` (sibling of
`integrationModuleRegistry.ts`): resolver type
`(knex, tenantId) => Promise<boolean>` registered under a module's
`availabilityKey`. `loadAvailableFirstPartyIntegrationAppKeys`
(`ee/packages/workflows/src/actions/workflow-runtime-v2-actions.ts`)
replaces its if-chain with registry lookups; a module whose key has no
registered resolver is **not** available (fail closed). Resolver errors are
caught and treated as unavailable (palette listing must not 500 because one
vendor table is missing).
### 5.2 RMM availability factory
`rmmIntegrationAvailability(provider)` returns a resolver checking
`rmm_integrations` for `(tenant, provider)` with `is_active = true` and
`connected_at IS NOT NULL` — used by all four RMM modules.
### 5.3 One-call module registration + NinjaOne migration
`registerIntegrationWorkflowModule({ module, availability, registerActions })`
in `ee/packages/workflows`: registers actions, the module tile, and the
availability resolver, idempotently. `core.ts` becomes one call per
integration. NinjaOne migrates onto the helper and the RMM factory; the
hardcoded `'rmm:ninjaone'` branch is deleted. Regression bar: identical
palette/catalog output for connected and disconnected NinjaOne tenants.
## 6. Modules
Conventions for every action: `provider.noun.verb` ID, Zod input/output
schemas, `ui: { label, description, category, icon }`, correct
`sideEffectful`, `idempotency: { mode: 'engineProvided' }`, handler errors
thrown (engine normalizes). Vendor endpoint paths marked *verify* must be
confirmed against vendor docs during implementation.
### 6.1 NinjaOne expansion (`app:ninjaone`)
`FetchNinjaOneWorkflowClient` gains `runScript` (`POST
/v2/device/{id}/script/run`; type SCRIPT|ACTION, parameters, runAs) and a
scripting-options discovery call (*verify*, expected `GET
/v2/device/{id}/scripting/options`). New actions
`ninjaone.devices.run_script` (side-effectful) and
`ninjaone.devices.scripting_options` (read) join the existing six in
`allowedActionIds`.
### 6.2 Tactical RMM (`app:tacticalrmm`)
Client: extend `TacticalRmmClient`
(`packages/integrations/src/lib/rmm/tacticalrmm/tacticalApiClient.ts`) with
agent-detail, script-list, run-script, run-command, reboot wrappers
(*verify* paths: `/scripts/`, `/agents/{agent_id}/runscript/`,
`/agents/{agent_id}/cmd/`, `/agents/{agent_id}/reboot/`). The workflows
package takes a dependency on `packages/integrations`; runtime support
resolves the integration row + tenant secrets (`tacticalrmm_api_key` or
Knox trio) exactly as the existing sync path does.
Actions: `agents.find`, `agents.get`, `scripts.list`,
`agents.run_script` (returns output), `agents.run_command`,
`agents.reboot` (last three side-effectful).
### 6.3 Level (`app:levelio`)
Client: thin fetch client inside the workflows package (NinjaOne pattern —
`ee/server` client not importable). Tenant secret `levelio_api_key`;
cursor pagination as in the existing client.
Actions: `devices.find`, `devices.get`, `alerts.list_active`,
`alerts.resolve` (side-effectful, `POST /v2/alerts/{id}/resolve`),
`updates.list`, `automations.list` (automations + their webhook tokens),
`automations.trigger` (side-effectful, `POST
/v2/automations/webhooks/{token}` with optional `device_ids[]`; actionable
error when the automation has no webhook trigger configured in Level),
`automations.run_status`.
### 6.4 Huntress (`app:huntress`)
Client: thin fetch client in the workflows package; Basic auth from tenant
secrets `huntress_api_key`/`huntress_api_secret`; replicate the 60 req/min
throttle and 429 backoff of the `ee/server` client.
Actions: `incidents.find`, `incidents.get`, `incidents.resolve`
(side-effectful; Huntress write API, *verify* exact endpoint),
`organizations.list`, `agents.get`, `account.get`.
### 6.5 Microsoft Teams (`app:teams`)
Availability: own resolver — `teams_integrations.install_status = 'active'`
AND the Teams add-on active for the tenant (mirror the delivery path's
checks). The workflows package depends on `ee/packages/microsoft-teams`.
Actions (all side-effectful):
- `teams.notify_user` — Graph `sendActivityNotification` through the
existing delivery path with a generic template; target is an Alga user
with a linked Microsoft account; actionable error otherwise.
- `teams.send_dm` — proactive Bot Framework message (text + optional card)
via stored `teams_conversation_references`; actionable error if the user
has never opened the bot.
- `teams.post_to_channel` — requires **new** `createConversation` support
in `ee/packages/microsoft-teams` (proactive channel conversation via Bot
Framework; Graph app-only channel posting is Microsoft-protected, the bot
route is deliberate). Works in any channel of a team where the Alga
Teams app is installed; actionable error otherwise.
## 7. Core action: `scheduling.create_entry`
Shared (CE+EE) action in the `scheduling.*` business-operations family.
Inputs: assigned user(s), title, start/end, optional ticket/project link,
optional status/notes. Creates a dispatch-board schedule entry through the
same model layer the UI uses; the existing calendar sync pushes it outward
to a connected user calendar. Validation errors (unknown user, end before
start) are explicit.
## 8. Designer surface
Icon tokens for `tacticalrmm`, `levelio`, `huntress`, `teams` added to the
designer icon set, reusing the integration logos already shipped in the
settings UI. Palette grouping/ordering verified for the new `app:*` groups
(extend `PALETTE_CATEGORY_ORDER` handling only if required).
## 9. Error handling conventions
Handlers throw; the Temporal activity layer normalizes to the runtime error
payload and stamps `workflow_action_invocations` FAILED. Vendor HTTP errors
surface status + vendor message, never credentials. Most-common failures
get explicit messages: integration not connected/inactive; Level automation
lacks webhook trigger; Teams user unlinked / no bot conversation / app not
in team. Side-effectful actions rely on engine-provided idempotency so
retries are safe.
## 10. Testing & rollout
Unit: handler tests per integration with mocked clients (NinjaOne handler
test pattern); availability-registry and resolver tests; helper idempotency;
catalog gating matrix. Regression: NinjaOne palette parity. Manual smoke on
the local-test stack: palette gating on connect/disconnect, Tactical mock
run-script round-trip, Teams notify/DM/channel in a test tenant,
schedule entry on the dispatch board. No migrations and no new tables; all
gating is read-path, so rollout is inert until an integration is connected.
## 11. Editor behavior for disconnected integrations
Availability gates ADDING, not VIEWING. The designer catalog keeps
first-party integration records when the integration is disconnected,
flagged `available: false`; extension-app filtering is unchanged. The
palette excludes unavailable records (the original hide-when-not-enabled
decision stands). Existing steps referencing a disconnected integration
keep full group context: the grouped action config section renders with an
amber "not connected" banner naming the integration, the step card shows a
"Disconnected" badge with an explanatory tooltip, and the input-mapping
editor remains fully usable (it always resolved schemas from the
unfiltered registry). Publish stays allowed; runs fail with the handlers'
actionable INTEGRATION_INACTIVE errors. Publish-time and disconnect-time
warnings were considered and deferred.