Some checks are pending
Bidi Control Character Guard / bidi-control-guard (push) Waiting to run
Circular Dependency Check / Check for new circular dependencies (push) Waiting to run
Citus Migration Smoke / Combined migrations on single-node Citus (push) Waiting to run
E2E Fresh Install Tests / fresh-install-e2e (push) Waiting to run
ext-v2 guardrails / Run ext-v2 guard and ESLint (push) Waiting to run
Integration Tests / Check for relevant changes (push) Waiting to run
Integration Tests / ${{ (github.event_name == 'schedule' || github.event.inputs.suite == 'full') && 'Full integration suite' || 'Tier-1 integration subset' }} (push) Blocked by required conditions
Mobile checks / Mobile lint + typecheck (push) Waiting to run
Mobile checks / Mobile unit tests (push) Waiting to run
Mobile checks / Mobile dependency audit (report) (push) Waiting to run
Mobile checks / Mobile reproducibility checks (push) Waiting to run
Secrets guard (env backups) / Ensure no tracked env backup files (push) Waiting to run
Temporal Readiness / fast-readiness (push) Waiting to run
Temporal Readiness / docker-parity (push) Waiting to run
TypeScript Type Check / Nx affected typecheck (push) Waiting to run
Unit Tests / Skipped-test budget (push) Waiting to run
Unit Tests / Nx affected unit tests (push) Waiting to run
Unit Tests / Server unit coverage (informational) (push) Waiting to run
Validate Tenant Management Schema / Check for relevant changes (push) Waiting to run
Validate Tenant Management Schema / Validate Tenant Management Schema (push) Blocked by required conditions
EE Workflows Build Guard / ee-workflows-build-guard (push) Waiting to run
Excluded: .git, node_modules, secrets/, compose.env, assemblyscript tgz Source: /opt/alga-psa on psa.joliet.tech
188 lines
12 KiB
Markdown
188 lines
12 KiB
Markdown
# PRD — Extension scheduled tasks (endpoint-based)
|
||
|
||
- Slug: `extension-scheduled-tasks`
|
||
- Date: `2026-01-01`
|
||
- Status: Draft
|
||
|
||
## Summary
|
||
Add “scheduled tasks” for EE extensions by allowing tenant admins to configure cron-based schedules that invoke a *manifest-declared API endpoint* on an installed extension, without requiring an external caller.
|
||
|
||
Implementation approach:
|
||
- Schedules are configured in the extension settings UI and via APIs.
|
||
- Each schedule points to an `endpoint_id` in a normalized endpoint table (strong guarantees).
|
||
- At runtime, a scheduler invokes the Runner using the existing `/v1/execute` contract with a synthetic `http` payload.
|
||
- When an extension install is updated to a new version, schedules are remapped to the new version’s endpoints by matching `(method,path)`; updates are blocked if any schedule cannot be remapped (unless explicitly overridden by admin action).
|
||
- All cleanup is handled via business logic (no DB cascades; Citus constraints).
|
||
|
||
## Problem
|
||
Extensions often need to “do something periodically” (sync, reconciliation, refresh caches, send reminders, pull from external systems). Today, extensions can only run when the host receives an HTTP request to `/api/ext/[extensionId]/[[...path]]`, which requires an external trigger.
|
||
|
||
We need a first-party, tenant-scoped scheduling mechanism that:
|
||
- Works with the Runner/Gateway execution model (out-of-process WASM execution)
|
||
- Uses existing install-scoped config/providers/secrets
|
||
- Is observable, auditable, and controllable by tenant admins
|
||
- Behaves predictably across extension version upgrades
|
||
|
||
## Goals
|
||
- Allow tenant admins to create/edit/enable/disable scheduled invocations for an installed extension.
|
||
- Schedules invoke a selected endpoint from the extension’s manifest-declared endpoint list (method + path), using the same Runner execute pipeline as HTTP gateway invocations.
|
||
- Provide safe, tenant-scoped controls: “run now”, view next run, view last run, view history/status.
|
||
- Remap schedules on extension update when possible; block version updates when scheduled endpoints are removed/changed (per policy).
|
||
- Enforce platform constraints: job runner abstraction (Temporal in EE), no cascades, clean deletion, and full tenant isolation.
|
||
|
||
## Non-goals
|
||
- General event bus / arbitrary “events” delivery to extensions (this plan uses synthetic HTTP invocations).
|
||
- Extension-defined dynamic schedule creation via host APIs (e.g., `alga.scheduler.register`).
|
||
- Sub-minute, high-frequency scheduling and/or large-scale fanout scheduling without explicit quota/limits work.
|
||
- A full UI for complex parameter schemas per endpoint (optional “body template” can be added later).
|
||
|
||
## Users and Primary Flows
|
||
Primary persona: MSP / tenant admin configuring an extension install.
|
||
|
||
### Flow: Create a schedule
|
||
1. Admin opens `Settings → Extensions → <Extension> → Settings`.
|
||
2. Admin navigates to “Schedules” section.
|
||
3. Admin clicks “Add schedule”.
|
||
4. Admin selects an endpoint (method + path) from a dropdown sourced from the installed version’s manifest endpoints.
|
||
5. Admin sets schedule (cron + timezone), optional payload, and enables it.
|
||
6. System creates a durable schedule in the job runner and persists schedule configuration.
|
||
|
||
### Flow: Run now
|
||
1. Admin clicks “Run now” on a schedule.
|
||
2. System immediately triggers an execution using the same endpoint settings; execution is logged and visible in schedule history.
|
||
|
||
### Flow: Update extension version (with schedules)
|
||
1. Admin updates the extension install to a new version.
|
||
2. System attempts to remap schedules to new version endpoints by matching `(method,path)`.
|
||
3. If all schedules remap, update succeeds.
|
||
4. If any scheduled endpoint is missing, update is blocked with a clear list of affected schedules; admin can either edit schedules to valid endpoints or explicitly disable affected schedules and proceed (policy-dependent).
|
||
|
||
## UX / UI Notes
|
||
- Extension settings UI (`ee/server/src/components/settings/extensions/ExtensionSettings.tsx`) gains a “Schedules” card/section:
|
||
- List schedules: name (optional), enabled, endpoint (method + path), cron + timezone, last run status, next run time (best-effort), actions (Run now / Edit / Disable / Delete).
|
||
- Create/edit modal or inline form:
|
||
- Endpoint dropdown: sourced from `extension_api_endpoint` for the current installed `version_id`.
|
||
- Cron string input with validation feedback.
|
||
- Timezone selector (defaults to tenant timezone if available; fallback UTC).
|
||
- Optional JSON payload body (validated) and optional “headers” (likely restricted / optional).
|
||
- Upgrade-block UX: if extension update is blocked due to missing endpoints, present explicit list and remediation actions.
|
||
|
||
## Requirements
|
||
|
||
### Functional Requirements
|
||
#### Endpoint materialization
|
||
- Persist each version’s manifest-declared endpoints into a normalized DB table (one row per `{version_id, method, path}`), producing stable `endpoint_id` values.
|
||
- Provide an API to list endpoints for an installed extension (current version).
|
||
|
||
#### Schedule CRUD and execution
|
||
- Create schedule for a tenant extension install:
|
||
- Inputs: `install_id`, `endpoint_id`, `cron`, `timezone`, `enabled`, optional `payload_json`.
|
||
- Output: schedule record including durable runner schedule id and/or associated job id.
|
||
- Update schedule (including changing endpoint and schedule expression).
|
||
- Enable/disable schedule without deleting configuration.
|
||
- Delete schedule:
|
||
- Must delete the underlying durable schedule (Temporal schedule / PG Boss schedule) and remove DB records.
|
||
- Run schedule immediately (“run now”):
|
||
- Must execute using the same endpoint selection and record execution.
|
||
|
||
#### Remap on extension update
|
||
- When changing the installed `version_id` for a tenant install, attempt to remap schedules:
|
||
- For each schedule, determine old endpoint’s `(method,path)`.
|
||
- Find matching endpoint in the new version by `(method,path)`.
|
||
- Update schedule rows to the new `endpoint_id` if found.
|
||
- If any schedule cannot remap, block the version update (default policy), returning a structured error with affected schedule ids.
|
||
- Provide a controlled override action to proceed by disabling affected schedules (optional but recommended for usability).
|
||
|
||
#### Cleanup (no cascades)
|
||
- On uninstall/disable/tenant cleanup, delete schedules and any derived job runner resources via business logic.
|
||
- On version deletion or registry cleanup, delete `extension_api_endpoint` rows via business logic (no DB cascade assumptions).
|
||
|
||
### Non-functional Requirements
|
||
- Tenant isolation: all schedule operations scoped to a tenant and install.
|
||
- Reliability: schedules are durable (Temporal schedules in EE), at-least-once execution semantics.
|
||
- Safety: cron frequency limits and per-tenant caps to prevent abuse.
|
||
- Idempotency: each run has a stable invocation id; “run now” uses idempotency keys to avoid accidental duplicates.
|
||
- Backpressure: no overlapping execution per schedule by default (configurable later).
|
||
|
||
## Data / API / Integrations
|
||
### Proposed schema (EE registry/admin DB)
|
||
New table: `extension_api_endpoint`
|
||
- `id` (uuid)
|
||
- `version_id` (FK-like reference to `extension_version.id`; enforced in application logic as needed)
|
||
- `method` (string; normalized)
|
||
- `path` (string; normalized)
|
||
- `handler` (string)
|
||
- `created_at`, `updated_at`
|
||
- Unique constraint: `(version_id, method, path)`
|
||
|
||
New table: `tenant_extension_schedule`
|
||
- `id` (uuid)
|
||
- `install_id` (uuid; references `tenant_extension_install.id` via business logic)
|
||
- `tenant_id` (string; duplicated for query locality + enforcement)
|
||
- `endpoint_id` (uuid; references `extension_api_endpoint.id` via business logic)
|
||
- `name` (string nullable)
|
||
- `cron` (string)
|
||
- `timezone` (string)
|
||
- `enabled` (bool)
|
||
- `payload_json` (jsonb nullable) — request body template for synthetic HTTP invocation
|
||
- `job_id` (uuid/string nullable) — if we store our own job record association
|
||
- `runner_schedule_id` (string nullable) — external schedule id for Temporal/pgboss (if not using `jobs` table)
|
||
- `last_run_at`, `last_run_status`, `last_error` (nullable)
|
||
- `created_at`, `updated_at`, `deleted_at` (optional soft delete)
|
||
|
||
Execution logs can reuse/extend `extension_execution_log` and/or add `schedule_id` to correlate runs.
|
||
|
||
### APIs (EE)
|
||
- List endpoints for install (current version): `GET /api/extensions/{extensionId}/endpoints` (tenant-scoped; uses install resolution to find version)
|
||
- Schedule CRUD:
|
||
- `GET /api/extensions/{extensionId}/schedules`
|
||
- `POST /api/extensions/{extensionId}/schedules`
|
||
- `PATCH /api/extensions/{extensionId}/schedules/{scheduleId}`
|
||
- `POST /api/extensions/{extensionId}/schedules/{scheduleId}/run-now`
|
||
- `DELETE /api/extensions/{extensionId}/schedules/{scheduleId}`
|
||
- Install update/remap hooks live in install/update service layer (not only UI).
|
||
|
||
### Execution payload to Runner
|
||
Use existing `/v1/execute` request body format with synthetic `http`:
|
||
- `http.method` and `http.path` come from selected endpoint
|
||
- `http.body_b64` comes from `payload_json` serialized as JSON (if provided)
|
||
- `context` includes `schedule_id`, `scheduled_for`, `trigger = "schedule"` (field naming TBD, but must be present for logs/metrics)
|
||
|
||
## Security / Permissions
|
||
- Only users with extension admin privileges can manage schedules (align with extension settings permissions).
|
||
- Validate that `endpoint_id` belongs to the installed version’s `version_id` (or the remapped version when updating).
|
||
- Restrict which headers can be injected for scheduled calls (default: none; use payload + config/secrets instead).
|
||
- Apply quotas/limits for schedules (per tenant/per install):
|
||
- max schedules per install
|
||
- min interval / cron frequency guardrails
|
||
- max “run now” rate
|
||
|
||
## Observability
|
||
- Log schedule CRUD actions (who changed what; audit trail).
|
||
- Log each scheduled invocation with:
|
||
- `schedule_id`, `install_id`, `registry_id`, `version_id`, `content_hash`
|
||
- execution start/finish, status, error summary
|
||
- Expose metrics: runs, failures, duration, retries, skipped/disabled counts.
|
||
|
||
## Rollout / Migration
|
||
- Add new DB tables via EE migration.
|
||
- Backfill endpoints for existing versions:
|
||
- Either on first access (lazy materialization) or via a one-time backfill job.
|
||
- Feature flag the UI section initially (optional).
|
||
- Ship read-only endpoint listing first, then schedule creation, then update/remap enforcement.
|
||
|
||
## Open Questions
|
||
- How do we surface tenant timezone today (and where is it stored)?
|
||
- Should update be blocked by default, or should we default to “disable affected schedules and proceed” with explicit confirmation?
|
||
- How should we handle endpoints with path params for schedules (e.g., `/things/:id`)? (Likely prohibit selection or require static substitutions/payload template.)
|
||
- Do we allow a per-schedule request body template only, or also query string?
|
||
- Do we need “next run time” calculation in-app (cron parser) or via job runner introspection only?
|
||
|
||
## Acceptance Criteria (Definition of Done)
|
||
- A tenant admin can create a scheduled task for an installed extension by selecting a manifest endpoint, providing cron/timezone, and enabling it.
|
||
- Scheduled tasks execute through the existing Runner `POST /v1/execute` pathway using synthetic HTTP payloads, with install-scoped config/providers/secrets applied.
|
||
- Admin can run-now/disable/delete schedules; deletion cleans up job runner schedules and DB records (no cascades).
|
||
- Extension version update attempts remap schedules by `(method,path)`; update is blocked when any schedule cannot be remapped (with clear error details and remediation path).
|
||
- Endpoint dropdown is sourced from stored endpoints for the currently installed version, not hard-coded.
|
||
- Logging/metrics provide visibility into schedule runs and failures.
|