alldigital/PSA

Fork 0

Hermes 284313f908

Bidi Control Character Guard / bidi-control-guard (push) Waiting to run

Details

Circular Dependency Check / Check for new circular dependencies (push) Waiting to run

Details

Citus Migration Smoke / Combined migrations on single-node Citus (push) Waiting to run

Details

E2E Fresh Install Tests / fresh-install-e2e (push) Waiting to run

Details

ext-v2 guardrails / Run ext-v2 guard and ESLint (push) Waiting to run

Details

Integration Tests / Check for relevant changes (push) Waiting to run

Details

Integration Tests / ${{ (github.event_name == 'schedule' || github.event.inputs.suite == 'full') && 'Full integration suite' || 'Tier-1 integration subset' }} (push) Blocked by required conditions

Details

Mobile checks / Mobile lint + typecheck (push) Waiting to run

Details

Mobile checks / Mobile unit tests (push) Waiting to run

Details

Mobile checks / Mobile dependency audit (report) (push) Waiting to run

Details

Mobile checks / Mobile reproducibility checks (push) Waiting to run

Details

Secrets guard (env backups) / Ensure no tracked env backup files (push) Waiting to run

Details

Temporal Readiness / fast-readiness (push) Waiting to run

Details

Temporal Readiness / docker-parity (push) Waiting to run

Details

TypeScript Type Check / Nx affected typecheck (push) Waiting to run

Details

Unit Tests / Skipped-test budget (push) Waiting to run

Details

Unit Tests / Nx affected unit tests (push) Waiting to run

Details

Unit Tests / Server unit coverage (informational) (push) Waiting to run

Details

Validate Tenant Management Schema / Check for relevant changes (push) Waiting to run

Details

Validate Tenant Management Schema / Validate Tenant Management Schema (push) Blocked by required conditions

Details

EE Workflows Build Guard / ee-workflows-build-guard (push) Waiting to run

Details

Initial import of AlgaPSA codebase from PSA server

Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech

2026-06-22 16:12:17 -05:00

6.3 KiB

Raw Permalink Blame History

Smoke Tests — RMM Alert Handling

Manual validation for everything the automated core (tests.json) doesn't cover: UI surfaces, live RMM round-trips, Temporal schedule lifecycle, email delivery, and migrations against a real stack.

These steps are derived from the design (docs/plans/2026-06-12-rmm-alert-handling-design.md), not a running build — the feature is not implemented yet. Re-ground exact labels and routes against the real screens when executing, and update this file where they differ.

Preflight

A worktree dev stack with migrations run clean (npx knex migrate:latest from server/ exits 0; spot-check that rmm_alerts has dedup_key and rmm_maintenance_windows exists).
A NinjaOne sandbox connected at Settings → Integrations → RMM with at least one organization mapped to a test client and one device synced as an asset.
A catch-all alert rule: create it in the new Alert Rules section (create-ticket on, a board picked, notify yourself). Creating, editing, reordering, and toggling this rule during preflight doubles as the rules-UI compatibility check.
A test condition you can flip on demand on the sandbox device (e.g., a disk-space or stopped-service condition you control).

Risks this smoke is defending

A real client outage produces no ticket — SLA breach nobody notices.
A flapping condition floods the board — techs drown and miss real alerts.
Tickets close out from under a working tech, or stale alert tickets never close — lost work either way.
A maintenance window swallows alerts forever — "suppressed" silently becomes "lost".
Alga writes wrong state into the customer's RMM, or cleared tickets leave stale alerts piling up in the NinjaOne console.
Alert tickets land on the wrong board/client/priority — misdispatched work and a polluted audit trail.
Notify/automation wiring is dead — teams that rely on alert workflows and emails find out during an incident.
A disconnected integration keeps polling — zombie schedule hammering a revoked credential.

Flows

Flow 1 — a real alert becomes a correctly-routed ticket (risks 1, 6)

Trigger the test condition on the sandbox device. Within a minute, a ticket appears on the board your rule chose, with the rule's priority and assignee, linked to the correct client and asset, titled from your template with the device name filled in, and carrying an initial internal comment with the alert details. Check the asset's drawer: the alert shows in the alerts section with the linked ticket.

Flow 2 — a flapping condition doesn't storm the board (risk 2)

With Flow 1's ticket still open, clear and re-trigger the same condition twice. The board's ticket count for that client does not change; the existing ticket gains "re-triggered — 2nd occurrence" / "3rd occurrence" internal comments.

Flow 3 — alert clears: untouched closes, touched survives (risk 3)

Clear the condition in NinjaOne. The untouched ticket gains an "alert resolved" comment and moves to the closed status automatically. Then re-trigger (new ticket), add a tech comment to it, and clear the condition again: this ticket gains the resolution comment but stays open.

Flow 4 — closing a ticket clears the alert in the RMM (risk 5)

Re-trigger to get an open alert ticket, then close the ticket in Alga. In the NinjaOne console's alerts view, the alert is reset/gone within a minute. Then edit the rule to turn off "reset alert in RMM on ticket close", repeat, and confirm the alert stays active in NinjaOne (the opt-out is honored).

Flow 5 — maintenance window suppresses during, surfaces after (risk 4)

In Maintenance Windows, create a one-off window for the test client covering the next 30 minutes. Trigger the condition: no ticket appears, no notification fires, and the alert shows as suppressed on the asset. Edit the window to end now (or wait it out) while the condition is still active. Within one poll cycle (≤15 min) the alert is processed and the ticket appears. This is the highest-value silent-failure check in the set.

Flow 6 — the poller heals missed webhooks (risks 1, 3)

Break webhook delivery deliberately (delete the webhook registration in NinjaOne, or point it at a dead URL). Trigger one condition and clear a different, previously-ticketed one. Within one poll cycle, the new alert has a ticket and the cleared alert's untouched ticket is closed — same outcomes as webhooks, just slower. Restore the webhook afterward.

Flow 7 — notify and automation wiring is live (risk 7)

With your rule's notify-user set to yourself: trigger an alert and confirm both the in-app notification and the email arrive (and that opting out of the rmm-alert category in your notification preferences suppresses the email on a repeat). In the workflow designer, build a trivial workflow triggered by RMM_ALERT_TRIGGERED (e.g., add a ticket comment), trigger an alert, and confirm the workflow execution log shows a run.

Flow 8 — disconnect tears the poller down (risk 8)

Disconnect the NinjaOne integration in settings. The reconciliation schedule is gone from the Temporal UI and no polling errors accrue in server logs afterward. Reconnect and confirm the schedule reappears.

Variant — TacticalRMM parity (risks 1, 2)

Repeat Flows 1 and 2 against a TacticalRMM instance: webhook alert → routed ticket, repeat → occurrence comment. Confirm a reconciliation cycle runs without errors (outbound reset is expected to be skipped if Tactical ships without an adapter).

Pass criteria

Risks 1/6: every triggered test alert produced exactly one ticket, on the right board, client, priority, and asset — on both providers.
Risk 2: repeated firings never created a second open ticket.
Risk 3: untouched tickets closed themselves on alert clear; the touched ticket survived with its history intact.
Risk 5: NinjaOne's console agreed with Alga after every ticket close — cleared when the rule says reset, untouched when opted out.
Risk 4: nothing fired during the window, and the still-active alert became a ticket after it — suppressed never meant lost.
Risk 7: notification arrived on both channels, preferences were honored, and the alert-triggered workflow ran.
Risk 8: disconnect left no schedule and no errors; reconnect restored it.

6.3 KiB Raw Permalink Blame History

Smoke Tests — RMM Alert Handling

Preflight

Risks this smoke is defending

Flows

Flow 1 — a real alert becomes a correctly-routed ticket (risks 1, 6)

Flow 2 — a flapping condition doesn't storm the board (risk 2)

Flow 3 — alert clears: untouched closes, touched survives (risk 3)

Flow 4 — closing a ticket clears the alert in the RMM (risk 5)

Flow 5 — maintenance window suppresses during, surfaces after (risk 4)

Flow 6 — the poller heals missed webhooks (risks 1, 3)

Flow 7 — notify and automation wiring is live (risk 7)

Flow 8 — disconnect tears the poller down (risk 8)

Variant — TacticalRMM parity (risks 1, 2)

Pass criteria

6.3 KiB

Raw Permalink Blame History