Workflow Event Sourcing Model
This document provides a comprehensive overview of the event sourcing architecture used in our workflow system. It explains how events are captured, stored, replayed, and used to derive workflow state, with specific focus on action execution, distributed processing, and recovery mechanisms.
Table of Contents
- Introduction to Event Sourcing
- Core Components
- Event Capture and Storage
- Event Replay and State Derivation
- Action Execution Model
- Distributed Execution Across Workers
- Recovery and Resilience
- Optimization Techniques
Introduction to Event Sourcing
Event sourcing is an architectural pattern where all changes to an application's state are captured as a sequence of immutable events. Rather than storing the current state directly, the system derives the current state by replaying these events. This approach provides several benefits:
- Complete audit trail and history of all state changes
- Ability to reconstruct state at any point in time
- Natural fit for distributed and asynchronous processing
- Resilience to system failures and restarts
In our workflow system, event sourcing forms the foundation for workflow execution, enabling complex, long-running processes with seamless recovery capabilities.
Core Components
Our workflow event sourcing implementation consists of these major components:
- Event Models: Definition of event types and structures
  - Key file: /shared/workflow/persistence/workflowInterfaces.ts
- Event Storage: Persistence mechanisms for events
  - Key file: /shared/workflow/persistence/workflowEventModel.ts
- Event Sourcing Engine: Core logic for event replay and state derivation
  - Key file: /shared/workflow/core/workflowEventSourcing.ts
- Workflow Runtime: Execution engine that processes workflows
  - Key file: /shared/workflow/core/workflowRuntime.ts
- Action Registry: Registration and execution of workflow actions
  - Key file: /shared/workflow/core/actionRegistry.ts
- Worker Service: Distributed processing of workflow events
  - Key file: /services/workflow-worker/src/WorkflowWorker.ts
Event Capture and Storage
Event Structure
Events in our system follow a consistent structure defined in workflowInterfaces.ts:
export interface IWorkflowEvent {
  event_id: string;
  tenant: string;
  execution_id: string;
  event_name: string;
  event_type: string;
  from_state: string;
  to_state: string;
  user_id?: string;
  payload?: Record<string, any>;
  created_at: string;
}
Each event represents a significant occurrence in a workflow's lifecycle, such as:
- Workflow state transitions
- Action execution results
- External system interactions
- User-triggered events
Storage Mechanism
Events are stored in the workflow_events table and managed through the WorkflowEventModel component, which provides methods to create, retrieve, and query events.
Key functions in workflowEventModel.ts:
- create: Persists a new event to the database
- getByExecutionId: Retrieves all events for a specific workflow execution
- getByExecutionIdUntil: Retrieves events up to a specific point in time
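To make the three query methods concrete, here is a minimal in-memory sketch. It is not the actual WorkflowEventModel, which works against the workflow_events table. The class name InMemoryEventStore, the reduced StoredEvent shape, and the inclusive cutoff in getByExecutionIdUntil are assumptions made for illustration.

```typescript
// Hypothetical, reduced event shape; the real IWorkflowEvent has more fields.
interface StoredEvent {
  event_id: string;
  tenant: string;
  execution_id: string;
  event_name: string;
  created_at: string; // ISO 8601 timestamp
}

// Illustrative stand-in for WorkflowEventModel; not the real implementation.
class InMemoryEventStore {
  private events: StoredEvent[] = [];

  // Append-only: events are never updated or deleted.
  create(event: StoredEvent): StoredEvent {
    this.events.push(event);
    return event;
  }

  // All events for one execution of one tenant, oldest first.
  getByExecutionId(tenant: string, executionId: string): StoredEvent[] {
    return this.events
      .filter(e => e.tenant === tenant && e.execution_id === executionId)
      .sort((a, b) => Date.parse(a.created_at) - Date.parse(b.created_at));
  }

  // Events for one execution created at or before `until` (inclusive is an assumption).
  getByExecutionIdUntil(tenant: string, executionId: string, until: string): StoredEvent[] {
    const cutoff = Date.parse(until);
    return this.getByExecutionId(tenant, executionId)
      .filter(e => Date.parse(e.created_at) <= cutoff);
  }
}
```

Scoping every query by tenant as well as execution ID mirrors the tenant field carried on each event.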
The system also publishes events to Redis streams for distributed processing:
// From workflowRuntime.ts - enqueueEvent method
await redisStreamClient.publishEvent(streamEvent);
Event Replay and State Derivation
The heart of our event sourcing implementation is the ability to replay events to derive workflow state.
Replay Process
The replay process is implemented in workflowEventSourcing.ts through the replayEvents method:
static async replayEvents(
  knex: Knex,
  executionId: string,
  tenant: string,
  options: EventReplayOptions = {}
): Promise<EventReplayResult>
This method:
- Retrieves all events for a workflow execution
- Optionally starts from a snapshot for performance
- Applies each event sequentially to build the current state
- Returns the derived execution state
The applyEvent method handles the specific logic for updating state based on event type:
static applyEvent(
  state: Record<string, any>,
  event: WorkflowEvent
): Record<string, any>
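The replay steps can be sketched as a fold over the ordered event list. This is a simplified illustration, not the code in workflowEventSourcing.ts: the ReplayEvent and ExecutionState shapes, the 'initial' state name, and the rule that each payload is merged into data are all assumptions.

```typescript
// Hypothetical, simplified shapes; the real event and state types carry more fields.
interface ReplayEvent {
  event_name: string;
  to_state: string;
  payload?: Record<string, unknown>;
}

interface ExecutionState {
  currentState: string;
  data: Record<string, unknown>;
  eventCount: number;
}

// Pure reducer: returns a new state and never mutates its input.
// Merging the payload into `data` is an assumed rule for this sketch.
function applyEvent(state: ExecutionState, event: ReplayEvent): ExecutionState {
  return {
    currentState: event.to_state,
    data: { ...state.data, ...(event.payload ?? {}) },
    eventCount: state.eventCount + 1
  };
}

// Fold the events over a starting state; pass a snapshot as `start` to skip
// the events that the snapshot already reflects.
function replay(events: ReplayEvent[], start?: ExecutionState): ExecutionState {
  const initial: ExecutionState = start ?? { currentState: 'initial', data: {}, eventCount: 0 };
  return events.reduce(applyEvent, initial);
}
```

Because applyEvent is pure, replaying the full history and replaying only the tail on top of a snapshot produce the same state, which is the property that makes snapshots safe to use.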
State Structure
The workflow execution state consists of:
- executionId: Unique identifier for the workflow execution
- tenant: Tenant identifier for multi-tenancy
- currentState: Current state name of the workflow
- data: Data object containing workflow variables and action results
- events: Array of processed events
- isComplete: Flag indicating if the workflow has completed
Action Execution Model
Action Registry
Actions are registered through the ActionRegistry component, which maintains a catalog of available actions and handles their execution:
// From actionRegistry.ts
export class ActionRegistry {
  private actions: Map<string, ActionDefinition> = new Map();

  async executeAction(
    actionName: string,
    context: ActionExecutionContext
  ): Promise<any> {
    // ...
  }
}
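Since the body of executeAction is elided above, the following is only a sketch of the register-then-look-up pattern such a registry typically follows. SketchActionRegistry, SketchContext, SketchHandler, and the register method are hypothetical names and are not taken from actionRegistry.ts.

```typescript
// Hypothetical shapes; the real ActionDefinition and ActionExecutionContext are richer.
interface SketchContext {
  tenant: string;
  executionId: string;
  parameters: Record<string, unknown>;
}

type SketchHandler = (context: SketchContext) => Promise<unknown>;

class SketchActionRegistry {
  private actions = new Map<string, SketchHandler>();

  // Reject duplicate names so one action cannot silently replace another.
  register(name: string, handler: SketchHandler): void {
    if (this.actions.has(name)) {
      throw new Error(`Action "${name}" is already registered`);
    }
    this.actions.set(name, handler);
  }

  // Look the handler up by name and run it with the supplied context.
  async executeAction(name: string, context: SketchContext): Promise<unknown> {
    const handler = this.actions.get(name);
    if (!handler) {
      throw new Error(`Unknown action "${name}"`);
    }
    return handler(context);
  }
}
```

Keying handlers by name is what lets a workflow refer to an action as a string and lets any worker resolve it, provided every worker registers the same set of actions.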
Action Proxy
When a workflow executes, it receives an action proxy that provides access to registered actions:
// From workflowRuntime.ts
private createActionProxy(executionId: string, tenant: string): Record<string, any> {
  const proxy = {};

  // Get all registered actions
  const actions = this.actionRegistry.getRegisteredActions();

  // Create proxy methods for each action
  for (const [actionName, actionDef] of Object.entries(actions)) {
    // ...
    // Create a function that executes the action
    const executeAction = async (params: any) => {
      return this.actionRegistry.executeAction(actionName, {
        tenant,
        executionId,
        parameters: params,
        idempotencyKey: `${executionId}-${actionName}-${Date.now()}-${Math.random().toString(36).substring(2, 9)}`
      });
    };
    // ...
  }

  return proxy;
}
Idempotent Action Execution
To prevent duplicate action execution during replays, the system uses idempotency keys:
- Each action execution gets a unique idempotency key
- The system checks if an action with this key has already been executed
- If found, it returns the stored result rather than re-executing the action
This is implemented in workflowActionResultModel.ts with the getByIdempotencyKey method:
getByIdempotencyKey: async (knex: Knex, tenant: string, idempotencyKey: string): Promise<IWorkflowActionResult | null>
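A stored-result lookup only short-circuits a replay if the replayed invocation presents the same key as the original one. The sketch below assumes a key derived from the execution ID, the action name, and the position of the call within the execution. That derivation, the in-memory storedResults map, and the function names are illustrative and differ from the key format shown in the proxy excerpt.

```typescript
// Illustrative in-memory stand-in for the stored action results.
const storedResults = new Map<string, unknown>();

// Hypothetical deterministic key: same execution, action, and call position => same key.
function makeIdempotencyKey(executionId: string, actionName: string, callIndex: number): string {
  return `${executionId}:${actionName}:${callIndex}`;
}

// Run `action` at most once per key; later calls return the stored result.
// Not safe against concurrent callers using the same key: a real system also
// needs a lock or a unique constraint on the key.
async function executeOnce<T>(key: string, action: () => Promise<T>): Promise<T> {
  if (storedResults.has(key)) {
    return storedResults.get(key) as T;
  }
  const result = await action();
  storedResults.set(key, result);
  return result;
}
```

With this shape, a replay that reaches the first sendEmail call of execution exec-1 again computes the same key and receives the stored result instead of sending a second email.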
Distributed Execution Across Workers
Our workflow system supports distributed execution across multiple worker instances, enabling parallel processing of independent actions.
Event Distribution
Events are distributed through Redis streams, allowing multiple workers to consume and process events:
// From WorkflowWorker.ts
this.redisStreamClient.registerConsumer(
  streamName,
  this.processGlobalEvent.bind(this)
);
Parallel Action Execution
When a workflow contains parallel actions (such as using Promise.all), these actions can be processed by different worker instances:
// Example workflow code
const a = functionA();
const b = functionB();
await Promise.all([a, b]);
In this code:
- functionA and functionB are dispatched as independent actions
- Different worker instances can pick up and process each action
- Results are communicated back through events
- The workflow continues when both actions complete
Communication Mechanism
Results from distributed action execution are communicated back through events:
- Worker completes an action and stores the result
- Worker publishes a completion event
- The workflow runtime receives the event
- Event listeners for the original promise are notified
- The promise resolves with the action result
This pattern is implemented in the notifyEventListeners method:
// From workflowRuntime.ts
private notifyEventListeners(executionId: string, event: WorkflowEvent): void {
  const executionState = this.executionStates.get(executionId);
  if (!executionState) return;

  // Get listeners for this event
  const listeners = executionState.eventListeners?.get(event.name) || [];

  // Notify listeners
  for (const listener of listeners) {
    try {
      listener(event);
    } catch (error) {
      console.error(`Error notifying event listener:`, error);
    }
  }
}
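The other half of this pattern, registering a listener whose invocation resolves the pending promise, is not shown in the excerpt. A minimal sketch of how the two halves could fit together follows. EventWaiter, waitFor, and notify are hypothetical names, and treating listeners as one-shot is an assumption.

```typescript
// Hypothetical simplified event; the real WorkflowEvent has more fields.
interface NamedEvent {
  name: string;
  payload?: unknown;
}

type Listener = (event: NamedEvent) => void;

class EventWaiter {
  private listeners = new Map<string, Listener[]>();

  // Returns a promise that resolves with the next event of the given name.
  // The promise's own resolve function is stored as the listener.
  waitFor(eventName: string): Promise<NamedEvent> {
    return new Promise<NamedEvent>(resolve => {
      const queue = this.listeners.get(eventName) ?? [];
      queue.push(resolve);
      this.listeners.set(eventName, queue);
    });
  }

  // Delivers the event to every waiting listener, then clears them (one-shot).
  // Returns how many listeners were notified.
  notify(event: NamedEvent): number {
    const queue = this.listeners.get(event.name) ?? [];
    this.listeners.delete(event.name);
    for (const listener of queue) {
      listener(event);
    }
    return queue.length;
  }
}
```

Workflow code would await waitFor while the runtime calls notify when a worker's completion event arrives, so the awaiting code resumes with the action result carried in the payload.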
Recovery and Resilience
One of the key benefits of our event sourcing architecture is its resilience to system failures and restarts.
Workflow State Recovery
If the workflow runtime is restarted, workflow execution can resume seamlessly:
- Events are replayed to reconstruct the workflow state
- The execution continues from where it left off
- Promises waiting on action results are re-established
- When action completion events arrive, execution proceeds
This recovery process is handled through the event replay mechanism described earlier.
In-flight Action Handling
Actions that were in-flight when a failure occurred are handled through:
- Idempotency checks to prevent duplicate execution
- Distributed locks to ensure exclusive processing
- Event-based communication for result propagation
Example of distributed locking:
// From workflowRuntime.ts - processQueuedEvent method
const lockKey = `event:${eventId}:processing`;
const lockOwner = `worker:${workerId}`;
const lockAcquired = await acquireDistributedLock(lockKey, lockOwner, {
  waitTimeMs: 5000,
  ttlMs: 60000
});
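The ownership and TTL semantics behind such a lock can be illustrated with a single-process sketch. This is not acquireDistributedLock: a real distributed lock needs a shared store and an atomic set-if-absent operation, and the sketch omits the waitTimeMs retry loop. InMemoryLockTable and its methods are hypothetical.

```typescript
interface LockEntry {
  owner: string;
  expiresAt: number; // epoch milliseconds
}

// Single-process illustration only; it shows the rules, not the distribution.
class InMemoryLockTable {
  private locks = new Map<string, LockEntry>();
  private now: () => number;

  // `now` is injectable so expiry can be exercised without real waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  // Succeeds if the key is free, expired, or already held by the same owner.
  tryAcquire(key: string, owner: string, ttlMs: number): boolean {
    const current = this.locks.get(key);
    const t = this.now();
    if (current && current.expiresAt > t && current.owner !== owner) {
      return false;
    }
    this.locks.set(key, { owner, expiresAt: t + ttlMs });
    return true;
  }

  // Only the current owner may release the lock.
  release(key: string, owner: string): boolean {
    const current = this.locks.get(key);
    if (!current || current.owner !== owner) {
      return false;
    }
    this.locks.delete(key);
    return true;
  }
}
```

The TTL is what keeps a crashed worker from blocking an event forever: once the entry expires, another worker can take the lock and process the event, and the idempotency check then guards against repeating any action the first worker had already completed.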
Optimization Techniques
Snapshots
To improve replay performance, the system can create and use snapshots of workflow state:
// From workflowEventSourcing.ts
if (!replayUntil && eventsProcessed > 20 && !debug) {
  try {
    const newSnapshot: WorkflowStateSnapshot = {
      executionId,
      tenant,
      currentState: executionState.currentState,
      data: executionState.data,
      version: Date.now(),
      timestamp: new Date().toISOString()
    };

    // Create snapshot asynchronously (don't await)
    WorkflowSnapshotModel.create(knex, tenant, newSnapshot)
      .then(() => {
        // Prune old snapshots to avoid excessive storage
        return WorkflowSnapshotModel.pruneSnapshots(knex, tenant, executionId);
      })
      .catch(error => {
        logger.error(`[WorkflowEventSourcing] Error creating snapshot for execution ${executionId}:`, error);
      });
  } catch (error: any) {
    // Log but don't fail the replay if snapshot creation fails
    logger.error(`[WorkflowEventSourcing] Error creating snapshot for execution ${executionId}:`, error);
  }
}
State Caching
The system uses in-memory caching to reduce database load:
// From workflowRuntime.ts
if (!options.debug && !options.replayUntil) {
  const cachedState = this.stateCache.get(executionId);
  if (cachedState && (Date.now() - cachedState.timestamp) < this.STATE_CACHE_TTL_MS) {
    logger.debug(`[TypeScriptWorkflowRuntime] Using cached state for execution ${executionId}`);
    return cachedState.state;
  }
}
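The excerpt shows only the read path. A self-contained sketch of a cache with the same freshness rule (an entry is usable while its age is below the TTL) might look like the following. TtlStateCache and its set and invalidate methods are assumptions, not the runtime's actual cache.

```typescript
interface CacheEntry<T> {
  state: T;
  timestamp: number; // epoch milliseconds when the entry was stored
}

class TtlStateCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;
  private now: () => number;

  // `now` is injectable so expiry can be tested without real waiting.
  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(executionId: string, state: T): void {
    this.entries.set(executionId, { state, timestamp: this.now() });
  }

  // Returns the cached state while its age is below the TTL; drops it otherwise.
  get(executionId: string): T | undefined {
    const entry = this.entries.get(executionId);
    if (!entry) return undefined;
    if (this.now() - entry.timestamp >= this.ttlMs) {
      this.entries.delete(executionId);
      return undefined;
    }
    return entry.state;
  }

  // Call when a new event is applied so readers do not see stale state.
  invalidate(executionId: string): void {
    this.entries.delete(executionId);
  }
}
```

A TTL alone bounds staleness but does not eliminate it, which is why the sketch also exposes invalidate. Whether the real runtime invalidates on new events is not stated in the excerpt.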
Batch Processing
Events can be processed in batches to improve performance:
// From workflowEventSourcing.ts
const {
  useSnapshots = true,
  batchSize = 100,
  replayUntil,
  debug = false
} = options;
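How batchSize is consumed is not shown in the excerpt. One plausible reading is that events are applied in fixed-size chunks, which bounds the work done between yields or database fetches. The sketch below illustrates that reading only. toBatches and foldInBatches are hypothetical helpers.

```typescript
// Split a list into consecutive chunks of at most `batchSize` items.
function toBatches<T>(items: T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new RangeError('batchSize must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

// Apply a reducer batch by batch; the final state equals a plain left fold.
function foldInBatches<S, E>(
  events: E[],
  initial: S,
  apply: (state: S, event: E) => S,
  batchSize: number = 100
): S {
  let state = initial;
  for (const batch of toBatches(events, batchSize)) {
    for (const event of batch) {
      state = apply(state, event);
    }
  }
  return state;
}
```

Batching changes how the work is scheduled, not the result: event order is preserved within and across batches, so the derived state is identical for any batch size.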
Conclusion
The event sourcing architecture of our workflow system provides a robust foundation for building complex, long-running workflows with excellent resilience and scalability. By capturing all changes as events, we gain a complete audit trail and the ability to reconstruct workflow state at any point in time.
This document has covered the key aspects of our implementation, from event capture and storage to replay, action execution, and distributed processing. Understanding these concepts is essential for working with and extending the workflow system effectively.