Hermes 284313f908
Initial import of AlgaPSA codebase from PSA server
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz

Source: /opt/alga-psa on psa.joliet.tech
2026-06-22 16:12:17 -05:00

PRD — Extension scheduled tasks (endpoint-based)

  • Slug: extension-scheduled-tasks
  • Date: 2026-01-01
  • Status: Draft

Summary

Add “scheduled tasks” for EE extensions by allowing tenant admins to configure cron-based schedules that invoke a manifest-declared API endpoint on an installed extension, without requiring an external caller.

Implementation approach:

  • Schedules are configured in the extension settings UI and via APIs.
  • Each schedule points to an endpoint_id in a normalized endpoint table (strong guarantees).
  • At runtime, a scheduler invokes the Runner using the existing /v1/execute contract with a synthetic http payload.
  • When an extension install is updated to a new version, schedules are remapped to the new version's endpoints by matching (method, path); updates are blocked if any schedule cannot be remapped (unless explicitly overridden by admin action).
  • All cleanup is handled via business logic (no DB cascades, due to Citus constraints).

Problem

Extensions often need to “do something periodically” (sync, reconciliation, refresh caches, send reminders, pull from external systems). Today, extensions can only run when the host receives an HTTP request to /api/ext/[extensionId]/[[...path]], which requires an external trigger.

We need a first-party, tenant-scoped scheduling mechanism that:

  • Works with the Runner/Gateway execution model (out-of-process WASM execution)
  • Uses existing install-scoped config/providers/secrets
  • Is observable, auditable, and controllable by tenant admins
  • Behaves predictably across extension version upgrades

Goals

  • Allow tenant admins to create/edit/enable/disable scheduled invocations for an installed extension.
  • Schedules invoke a selected endpoint from the extension's manifest-declared endpoint list (method + path), using the same Runner execute pipeline as HTTP gateway invocations.
  • Provide safe, tenant-scoped controls: “run now”, view next run, view last run, view history/status.
  • Remap schedules on extension update when possible; block version updates when scheduled endpoints are removed/changed (per policy).
  • Enforce platform constraints: job runner abstraction (Temporal in EE), no cascades, clean deletion, and full tenant isolation.

Non-goals

  • General event bus / arbitrary “events” delivery to extensions (this plan uses synthetic HTTP invocations).
  • Extension-defined dynamic schedule creation via host APIs (e.g., alga.scheduler.register).
  • Sub-minute, high-frequency scheduling and/or large-scale fanout scheduling without explicit quota/limits work.
  • A full UI for complex parameter schemas per endpoint (optional “body template” can be added later).

Users and Primary Flows

Primary persona: MSP / tenant admin configuring an extension install.

Flow: Create a schedule

  1. Admin opens Settings → Extensions → <Extension> → Settings.
  2. Admin navigates to the “Schedules” section.
  3. Admin clicks “Add schedule”.
  4. Admin selects an endpoint (method + path) from a dropdown sourced from the installed version's manifest endpoints.
  5. Admin sets the schedule (cron + timezone) and an optional payload, then enables it.
  6. System creates a durable schedule in the job runner and persists schedule configuration.

Flow: Run now

  1. Admin clicks “Run now” on a schedule.
  2. System immediately triggers an execution using the same endpoint settings; execution is logged and visible in schedule history.

Flow: Update extension version (with schedules)

  1. Admin updates the extension install to a new version.
  2. System attempts to remap schedules to new version endpoints by matching (method,path).
  3. If all schedules remap, update succeeds.
  4. If any scheduled endpoint is missing, update is blocked with a clear list of affected schedules; admin can either edit schedules to valid endpoints or explicitly disable affected schedules and proceed (policy-dependent).

UX / UI Notes

  • Extension settings UI (ee/server/src/components/settings/extensions/ExtensionSettings.tsx) gains a “Schedules” card/section:
    • List schedules: name (optional), enabled, endpoint (method + path), cron + timezone, last run status, next run time (best-effort), actions (Run now / Edit / Disable / Delete).
    • Create/edit modal or inline form:
      • Endpoint dropdown: sourced from extension_api_endpoint for the current installed version_id.
      • Cron string input with validation feedback.
      • Timezone selector (defaults to tenant timezone if available; fallback UTC).
      • Optional JSON payload body (validated) and optional “headers” (likely restricted).
    • Upgrade-block UX: if extension update is blocked due to missing endpoints, present explicit list and remediation actions.

Requirements

Functional Requirements

Endpoint materialization

  • Persist each version's manifest-declared endpoints into a normalized DB table (one row per {version_id, method, path}), producing stable endpoint_id values.
  • Provide an API to list endpoints for an installed extension (current version).
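
As a rough illustration of endpoint materialization, the sketch below turns a version's declared endpoints into one row per (version_id, method, path). The normalization rules (uppercase method, leading slash, no trailing slash) and the names ManifestEndpoint, EndpointRow, and materializeEndpoints are assumptions made for this example; the PRD only says that method and path are normalized.

```typescript
// Hypothetical shapes; only method, path, and handler come from the PRD's column list.
interface ManifestEndpoint {
  method: string;
  path: string;
  handler: string;
}

interface EndpointRow {
  version_id: string;
  method: string;
  path: string;
  handler: string;
}

// Assumed normalization: trim and uppercase the HTTP method.
function normalizeMethod(method: string): string {
  return method.trim().toUpperCase();
}

// Assumed normalization: leading slash, collapsed duplicate slashes, no trailing slash.
function normalizePath(path: string): string {
  const trimmed = path.trim();
  const withSlash = trimmed.startsWith("/") ? trimmed : "/" + trimmed;
  const collapsed = withSlash.replace(/\/{2,}/g, "/");
  return collapsed.length > 1 && collapsed.endsWith("/")
    ? collapsed.slice(0, -1)
    : collapsed;
}

// Produces at most one row per normalized (method, path); the first declaration wins.
function materializeEndpoints(
  versionId: string,
  declared: ManifestEndpoint[],
): EndpointRow[] {
  const byKey = new Map<string, EndpointRow>();
  for (const e of declared) {
    const method = normalizeMethod(e.method);
    const path = normalizePath(e.path);
    const key = method + " " + path;
    if (!byKey.has(key)) {
      byKey.set(key, { version_id: versionId, method, path, handler: e.handler });
    }
  }
  return Array.from(byKey.values());
}
```

Deduplicating before insert keeps the rows compatible with a unique constraint on (version_id, method, path); the id column is left to the database in this sketch.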

Schedule CRUD and execution

  • Create schedule for a tenant extension install:
    • Inputs: install_id, endpoint_id, cron, timezone, enabled, optional payload_json.
    • Output: schedule record including durable runner schedule id and/or associated job id.
  • Update schedule (including changing endpoint and schedule expression).
  • Enable/disable schedule without deleting configuration.
  • Delete schedule:
    • Must delete the underlying durable schedule (Temporal schedule / pg-boss schedule) and remove DB records.
  • Run schedule immediately (“run now”):
    • Must execute using the same endpoint selection and record execution.
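
A minimal sketch of input validation for schedule creation, under stated assumptions: cron is treated as a standard 5-field expression, the timezone check relies on Intl.DateTimeFormat rejecting unknown zones, and the names CreateScheduleInput, EndpointRef, and validateCreateSchedule are hypothetical. The endpoint check mirrors the requirement that endpoint_id must belong to the installed version.

```typescript
// Hypothetical input shape based on the inputs listed for schedule creation.
interface CreateScheduleInput {
  install_id: string;
  endpoint_id: string;
  cron: string;
  timezone: string;
  enabled: boolean;
  payload_json?: unknown;
}

interface EndpointRef {
  id: string;
  version_id: string;
}

// Intl.DateTimeFormat throws a RangeError for an unrecognized time zone name.
function isValidTimezone(timezone: string): boolean {
  if (timezone.trim() === "") return false;
  try {
    new Intl.DateTimeFormat("en-US", { timeZone: timezone });
    return true;
  } catch {
    return false;
  }
}

// Returns a list of human-readable problems; an empty list means the input passed.
function validateCreateSchedule(
  input: CreateScheduleInput,
  installedVersionId: string,
  knownEndpoints: EndpointRef[],
): string[] {
  const errors: string[] = [];
  // Assumption: 5 fields (minute, hour, day of month, month, day of week).
  if (input.cron.trim().split(/\s+/).length !== 5) {
    errors.push("cron must have exactly 5 fields");
  }
  if (!isValidTimezone(input.timezone)) {
    errors.push("timezone is not a recognized zone");
  }
  const endpoint = knownEndpoints.find((e) => e.id === input.endpoint_id);
  if (!endpoint || endpoint.version_id !== installedVersionId) {
    errors.push("endpoint_id does not belong to the installed version");
  }
  return errors;
}
```

Returning all problems at once, rather than failing on the first, is a choice made here so the form can show every validation message in a single round trip.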

Remap on extension update

  • When changing the installed version_id for a tenant install, attempt to remap schedules:
    • For each schedule, determine the old endpoint's (method, path).
    • Find matching endpoint in the new version by (method,path).
    • Update schedule rows to the new endpoint_id if found.
  • If any schedule cannot remap, block the version update (default policy), returning a structured error with affected schedule ids.
  • Provide a controlled override action to proceed by disabling affected schedules (optional but recommended for usability).
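
The remap rules above can be sketched as a two-step plan-then-decide function. This is an illustration only: the type names, the SCHEDULE_REMAP_FAILED error code, and the overrideDisableUnmapped flag are hypothetical, and a real implementation would apply the resulting updates inside the install/update service.

```typescript
interface VersionEndpoint {
  id: string;
  method: string;
  path: string;
}

interface ScheduleRef {
  id: string;
  endpoint_id: string;
}

interface RemapPlan {
  updates: { schedule_id: string; new_endpoint_id: string }[];
  unmapped_schedule_ids: string[];
}

const routeKey = (e: { method: string; path: string }): string =>
  e.method + " " + e.path;

// Step 1: compute the remap without touching any rows.
function planRemap(
  schedules: ScheduleRef[],
  oldEndpoints: VersionEndpoint[],
  newEndpoints: VersionEndpoint[],
): RemapPlan {
  const oldById = new Map<string, VersionEndpoint>();
  for (const e of oldEndpoints) oldById.set(e.id, e);
  const newIdByRoute = new Map<string, string>();
  for (const e of newEndpoints) newIdByRoute.set(routeKey(e), e.id);

  const plan: RemapPlan = { updates: [], unmapped_schedule_ids: [] };
  for (const s of schedules) {
    const old = oldById.get(s.endpoint_id);
    const newId = old ? newIdByRoute.get(routeKey(old)) : undefined;
    if (newId === undefined) plan.unmapped_schedule_ids.push(s.id);
    else plan.updates.push({ schedule_id: s.id, new_endpoint_id: newId });
  }
  return plan;
}

type UpdateDecision =
  | { allowed: true; updates: RemapPlan["updates"]; disable_schedule_ids: string[] }
  | { allowed: false; error: { code: "SCHEDULE_REMAP_FAILED"; schedule_ids: string[] } };

// Step 2: block by default; with the override, proceed and disable unmapped schedules.
function decideVersionUpdate(
  plan: RemapPlan,
  overrideDisableUnmapped = false,
): UpdateDecision {
  if (plan.unmapped_schedule_ids.length === 0) {
    return { allowed: true, updates: plan.updates, disable_schedule_ids: [] };
  }
  if (overrideDisableUnmapped) {
    return {
      allowed: true,
      updates: plan.updates,
      disable_schedule_ids: plan.unmapped_schedule_ids,
    };
  }
  return {
    allowed: false,
    error: { code: "SCHEDULE_REMAP_FAILED", schedule_ids: plan.unmapped_schedule_ids },
  };
}
```

Separating the plan from the decision means the same plan can back both the blocking error (with its list of affected schedule ids) and the override path.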

Cleanup (no cascades)

  • On uninstall/disable/tenant cleanup, delete schedules and any derived job runner resources via business logic.
  • On version deletion or registry cleanup, delete extension_api_endpoint rows via business logic (no DB cascade assumptions).
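
Because there are no DB cascades, cleanup has to enumerate its own steps. The sketch below plans the steps for removing an install's schedules; the step names and the ordering (runner resource first, then the DB row) are assumptions chosen so that a failed runner deletion still leaves a row to retry from.

```typescript
interface ScheduleForCleanup {
  id: string;
  runner_schedule_id: string | null;
}

type CleanupStep =
  | { kind: "delete_runner_schedule"; runner_schedule_id: string }
  | { kind: "delete_schedule_row"; schedule_id: string };

// Produces an ordered list of steps; executing them is left to the caller.
function planInstallCleanup(schedules: ScheduleForCleanup[]): CleanupStep[] {
  const steps: CleanupStep[] = [];
  for (const s of schedules) {
    if (s.runner_schedule_id !== null) {
      steps.push({
        kind: "delete_runner_schedule",
        runner_schedule_id: s.runner_schedule_id,
      });
    }
    steps.push({ kind: "delete_schedule_row", schedule_id: s.id });
  }
  return steps;
}
```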

Non-functional Requirements

  • Tenant isolation: all schedule operations scoped to a tenant and install.
  • Reliability: schedules are durable (Temporal schedules in EE), at-least-once execution semantics.
  • Safety: cron frequency limits and per-tenant caps to prevent abuse.
  • Idempotency: each run has a stable invocation id; “run now” uses idempotency keys to avoid accidental duplicates.
  • Backpressure: no overlapping execution per schedule by default (configurable later).
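
One way to picture the idempotency and no-overlap requirements together is an in-memory gate keyed by invocation id and schedule id. This is a sketch, not the intended storage design: the id formats and the ScheduleRunGate class are hypothetical, and a durable implementation would keep this state in the job runner or the database.

```typescript
// Assumed id format: stable for a given schedule and planned fire time.
function scheduledInvocationId(scheduleId: string, scheduledFor: Date): string {
  return "schedule:" + scheduleId + ":" + scheduledFor.toISOString();
}

// Assumed id format: stable for a given schedule and caller-supplied idempotency key.
function runNowInvocationId(scheduleId: string, idempotencyKey: string): string {
  return "run-now:" + scheduleId + ":" + idempotencyKey;
}

type StartResult = "started" | "duplicate" | "overlap";

class ScheduleRunGate {
  private readonly seenInvocations = new Set<string>();
  private readonly runningSchedules = new Set<string>();

  // Rejects repeated deliveries of the same invocation and concurrent runs
  // of the same schedule. An overlapping invocation is not recorded as seen,
  // so it can be attempted again later.
  tryStart(scheduleId: string, invocationId: string): StartResult {
    if (this.seenInvocations.has(invocationId)) return "duplicate";
    if (this.runningSchedules.has(scheduleId)) return "overlap";
    this.seenInvocations.add(invocationId);
    this.runningSchedules.add(scheduleId);
    return "started";
  }

  finish(scheduleId: string): void {
    this.runningSchedules.delete(scheduleId);
  }
}
```

With at-least-once delivery, the same fire time may be delivered twice; the stable invocation id is what lets the second delivery be recognized as a duplicate.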

Data / API / Integrations

Proposed schema (EE registry/admin DB)

New table: extension_api_endpoint

  • id (uuid)
  • version_id (FK-like reference to extension_version.id; enforced in application logic as needed)
  • method (string; normalized)
  • path (string; normalized)
  • handler (string)
  • created_at, updated_at
  • Unique constraint: (version_id, method, path)

New table: tenant_extension_schedule

  • id (uuid)
  • install_id (uuid; references tenant_extension_install.id via business logic)
  • tenant_id (string; duplicated for query locality + enforcement)
  • endpoint_id (uuid; references extension_api_endpoint.id via business logic)
  • name (string nullable)
  • cron (string)
  • timezone (string)
  • enabled (bool)
  • payload_json (jsonb nullable) — request body template for synthetic HTTP invocation
  • job_id (uuid/string nullable) — if we store our own job record association
  • runner_schedule_id (string nullable) — external schedule id for Temporal/pgboss (if not using jobs table)
  • last_run_at, last_run_status, last_error (nullable)
  • created_at, updated_at, deleted_at (optional soft delete)

Execution logs can reuse/extend extension_execution_log and/or add schedule_id to correlate runs.
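
For orientation, the two proposed tables could be mirrored in TypeScript roughly as follows. The column names come from the lists above; the TypeScript types (for example, string for last_run_status) and the endpointKey helper are assumptions, since the PRD does not fix them.

```typescript
interface ExtensionApiEndpointRow {
  id: string; // uuid
  version_id: string;
  method: string; // normalized
  path: string; // normalized
  handler: string;
  created_at: Date;
  updated_at: Date;
}

interface TenantExtensionScheduleRow {
  id: string; // uuid
  install_id: string;
  tenant_id: string;
  endpoint_id: string;
  name: string | null;
  cron: string;
  timezone: string;
  enabled: boolean;
  payload_json: unknown; // jsonb, nullable
  job_id: string | null;
  runner_schedule_id: string | null;
  last_run_at: Date | null;
  last_run_status: string | null; // assumed to be a free-form status string
  last_error: string | null;
  created_at: Date;
  updated_at: Date;
  deleted_at: Date | null; // optional soft delete
}

// The unique constraint (version_id, method, path) expressed as a lookup key.
function endpointKey(
  e: Pick<ExtensionApiEndpointRow, "version_id" | "method" | "path">,
): string {
  return [e.version_id, e.method, e.path].join("|");
}
```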

APIs (EE)

  • List endpoints for install (current version): GET /api/extensions/{extensionId}/endpoints (tenant-scoped; uses install resolution to find version)
  • Schedule CRUD:
    • GET /api/extensions/{extensionId}/schedules
    • POST /api/extensions/{extensionId}/schedules
    • PATCH /api/extensions/{extensionId}/schedules/{scheduleId}
    • POST /api/extensions/{extensionId}/schedules/{scheduleId}/run-now
    • DELETE /api/extensions/{extensionId}/schedules/{scheduleId}
  • Install update/remap hooks live in install/update service layer (not only UI).

Execution payload to Runner

Use existing /v1/execute request body format with synthetic http:

  • http.method and http.path come from selected endpoint
  • http.body_b64 comes from payload_json serialized as JSON (if provided)
  • context includes schedule_id, scheduled_for, trigger = "schedule" (field naming TBD, but must be present for logs/metrics)
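
The mapping above can be sketched as a small builder. Only the fields named in the list are modeled; the rest of the /v1/execute body is not shown in this document and is omitted here. The type and function names are hypothetical, the context field names are still TBD per the PRD, and representing scheduled_for as an ISO 8601 string is an assumption. The base64 helper is written by hand to keep the example dependency-free; in a Node.js host, Buffer.from(json, "utf8").toString("base64") would do the same job.

```typescript
interface SelectedEndpoint {
  method: string;
  path: string;
}

interface ScheduleForRun {
  id: string;
  payload_json?: unknown;
}

// Only the fields named in the PRD; not the full /v1/execute request body.
interface SyntheticExecuteFields {
  http: { method: string; path: string; body_b64?: string };
  context: { schedule_id: string; scheduled_for: string; trigger: "schedule" };
}

const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// UTF-8 encode via encodeURIComponent: "%XX" escapes are bytes, the rest is ASCII.
function utf8Bytes(text: string): number[] {
  const escaped = encodeURIComponent(text);
  const bytes: number[] = [];
  for (let i = 0; i < escaped.length; i++) {
    if (escaped.charAt(i) === "%") {
      bytes.push(parseInt(escaped.slice(i + 1, i + 3), 16));
      i += 2;
    } else {
      bytes.push(escaped.charCodeAt(i));
    }
  }
  return bytes;
}

// Standard base64 with "=" padding, three input bytes per four output characters.
function toBase64(bytes: number[]): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i] ?? 0;
    const b1 = bytes[i + 1];
    const b2 = bytes[i + 2];
    const n = (b0 << 16) | ((b1 ?? 0) << 8) | (b2 ?? 0);
    out += B64_ALPHABET.charAt((n >> 18) & 63) + B64_ALPHABET.charAt((n >> 12) & 63);
    out += b1 === undefined ? "=" : B64_ALPHABET.charAt((n >> 6) & 63);
    out += b2 === undefined ? "=" : B64_ALPHABET.charAt(n & 63);
  }
  return out;
}

function buildScheduledExecuteFields(
  schedule: ScheduleForRun,
  endpoint: SelectedEndpoint,
  scheduledFor: Date,
): SyntheticExecuteFields {
  const http: SyntheticExecuteFields["http"] = {
    method: endpoint.method,
    path: endpoint.path,
  };
  // body_b64 is only set when a payload was configured on the schedule.
  if (schedule.payload_json !== undefined && schedule.payload_json !== null) {
    http.body_b64 = toBase64(utf8Bytes(JSON.stringify(schedule.payload_json)));
  }
  return {
    http,
    context: {
      schedule_id: schedule.id,
      scheduled_for: scheduledFor.toISOString(),
      trigger: "schedule",
    },
  };
}
```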

Security / Permissions

  • Only users with extension admin privileges can manage schedules (align with extension settings permissions).
  • Validate that endpoint_id belongs to the installed version's version_id (or the remapped version when updating).
  • Restrict which headers can be injected for scheduled calls (default: none; use payload + config/secrets instead).
  • Apply quotas/limits for schedules (per tenant/per install):
    • max schedules per install
    • min interval / cron frequency guardrails
    • max “run now” rate
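
As a hedged sketch of the first two limits, the check below rejects a schedule when the install is at its cap or when the cron minute field implies a firing gap shorter than a configured minimum. The limit names and the checkScheduleLimits function are hypothetical. The gap estimate looks only at the minute field and treats it as repeating every hour, so it is conservative: it can reject some expressions whose other fields would actually space the runs further apart. The "run now" rate limit is not covered.

```typescript
interface ScheduleLimits {
  maxSchedulesPerInstall: number;
  minIntervalMinutes: number;
}

// Expands a cron minute field ("*", "*/15", "5", "0-30/10", "0,30") to sorted minutes.
// Returns null for anything this sketch does not understand.
function expandMinuteField(field: string): number[] | null {
  const minutes = new Set<number>();
  for (const part of field.split(",")) {
    const m = /^(\*|\d{1,2}(?:-\d{1,2})?)(?:\/(\d{1,2}))?$/.exec(part);
    if (!m) return null;
    const range = m[1];
    const stepText = m[2];
    const step = stepText === undefined ? 1 : Number(stepText);
    let lo = 0;
    let hi = 59;
    if (range !== "*") {
      const bounds = range.split("-").map(Number);
      lo = bounds[0];
      // A single value with a step ("5/15") is read as "from 5 to 59".
      hi = bounds.length === 2 ? bounds[1] : stepText === undefined ? lo : 59;
    }
    if (step < 1 || lo > hi || hi > 59) return null;
    for (let v = lo; v <= hi; v += step) minutes.add(v);
  }
  return Array.from(minutes).sort((a, b) => a - b);
}

// Smallest gap between firing minutes, including the wrap into the next hour.
function minGapMinutes(minutes: number[]): number {
  let gap = 60 - minutes[minutes.length - 1] + minutes[0];
  for (let i = 1; i < minutes.length; i++) {
    gap = Math.min(gap, minutes[i] - minutes[i - 1]);
  }
  return gap;
}

function checkScheduleLimits(
  cron: string,
  existingScheduleCount: number,
  limits: ScheduleLimits,
): string[] {
  const violations: string[] = [];
  if (existingScheduleCount >= limits.maxSchedulesPerInstall) {
    violations.push("max schedules per install reached");
  }
  const fields = cron.trim().split(/\s+/);
  // Assumption: 5-field cron only, which also rules out a seconds field.
  if (fields.length !== 5) {
    violations.push("cron must have exactly 5 fields");
    return violations;
  }
  const minutes = expandMinuteField(fields[0]);
  if (minutes === null || minutes.length === 0) {
    violations.push("unsupported minute field");
  } else if (minGapMinutes(minutes) < limits.minIntervalMinutes) {
    violations.push("cron fires more often than the minimum interval");
  }
  return violations;
}
```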

Observability

  • Log schedule CRUD actions (who changed what; audit trail).
  • Log each scheduled invocation with:
    • schedule_id, install_id, registry_id, version_id, content_hash
    • execution start/finish, status, error summary
  • Expose metrics: runs, failures, duration, retries, skipped/disabled counts.

Rollout / Migration

  • Add new DB tables via EE migration.
  • Backfill endpoints for existing versions:
    • Either on first access (lazy materialization) or via a one-time backfill job.
  • Feature flag the UI section initially (optional).
  • Ship read-only endpoint listing first, then schedule creation, then update/remap enforcement.

Open Questions

  • How do we surface tenant timezone today (and where is it stored)?
  • Should update be blocked by default, or should we default to “disable affected schedules and proceed” with explicit confirmation?
  • How should we handle endpoints with path params for schedules (e.g., /things/:id)? (Likely prohibit selection or require static substitutions/payload template.)
  • Do we allow a per-schedule request body template only, or also query string?
  • Do we need “next run time” calculation in-app (cron parser) or via job runner introspection only?

Acceptance Criteria (Definition of Done)

  • A tenant admin can create a scheduled task for an installed extension by selecting a manifest endpoint, providing cron/timezone, and enabling it.
  • Scheduled tasks execute through the existing Runner POST /v1/execute pathway using synthetic HTTP payloads, with install-scoped config/providers/secrets applied.
  • Admin can run-now/disable/delete schedules; deletion cleans up job runner schedules and DB records (no cascades).
  • Extension version update attempts to remap schedules by (method, path); update is blocked when any schedule cannot be remapped (with clear error details and remediation path).
  • Endpoint dropdown is sourced from stored endpoints for the currently installed version, not hard-coded.
  • Logging/metrics provide visibility into schedule runs and failures.