
Disclaimer: This series is a personal, educational reference architecture. All diagrams, opinions, and frameworks are my own and are not affiliated with, sponsored by, or representative of my employer. I’m publishing it on my own time and without using any confidential information.
© 2026 Sean Miller. All rights reserved.
The AI Gateway is the front door. It authenticates, validates, routes, and normalizes. But it doesn’t execute anything intelligent. Once the gateway accepts a request and selects a workflow, it hands off to the Orchestration Engine: the runtime that turns intent into action.
If the AI Gateway is the front door, the Orchestration Engine is the factory floor. It runs versioned, approved graphs of tool calls, model calls, and human checkpoints with persisted state and step-level auditability. Every run is governed, replayable, and testable.
Ad hoc “agents” typically fail in three ways.
First, they lack determinism around non-determinism. When your agent decides what to do next, there’s no contract for what steps are allowed, what state must be preserved, or how to recover when something fails. The LLM just does whatever it does.
Second, they lack governed tool use. Tools get invoked without allowlists, schemas, retries, or idempotency guarantees. One hallucinated function call can corrupt production data or trigger an irreversible action.
Third, they lack replayability. When something goes wrong, there’s no step-level log showing what context was consumed, what decisions were made, or what outputs were produced. Debugging becomes guesswork. Auditing becomes impossible.
The Orchestration Engine solves all three problems by treating agentic behavior as workflow execution with explicit contracts at every boundary.
Figure 1: Orchestration Engine Component Diagram
The Orchestration Engine owns workflow execution: running graphs and steps, handling branching, timeouts, retries, and fallbacks. It maintains run state and step state, including references to intermediate artifacts. The engine enforces tool invocation discipline through schema validation, idempotency keys, rate limits, and safe retries. It applies model invocation discipline via routing hints, structured outputs, and stop conditions. It manages human-in-the-loop checkpoints by pausing execution, requesting approval, and resuming when granted. And it ensures traceability through step-level logs, causal chains, and replay support.
The engine also enforces runtime policy: tool allowlists, data scope constraints, and context budgets. Every decision passes through policy before execution.
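The executor pseudocode later in this post calls policyEngine.checkStepAllowed before every step. A minimal sketch of what that contract might look like, with illustrative type names and only two of the checks (tool allowlist and cost budget) filled in:

// Illustrative sketch of the runtime policy contract (names are not prescriptive)
interface StepPolicyRequest {
  workflowId: string;
  stepId: string;
  caller: { userId: string; roles: string[] };
  currentBudget?: number;   // cost consumed so far, in USD
  requestedTool?: string;   // set only for tool steps
}

interface StepPolicyDecision {
  allowed: boolean;
  reason?: string;          // populated when allowed === false
}

class PolicyEngine {
  constructor(
    private allowedTools: Set<string>,
    private maxCostUsd: number
  ) {}

  async checkStepAllowed(req: StepPolicyRequest): Promise<StepPolicyDecision> {
    // Tool allowlist: a tool step may only invoke registered, approved tools
    if (req.requestedTool && !this.allowedTools.has(req.requestedTool)) {
      return { allowed: false, reason: `tool ${req.requestedTool} not in allowlist` };
    }
    // Budget ceiling: refuse to start a step once the run's cost budget is exhausted
    if ((req.currentBudget ?? 0) >= this.maxCostUsd) {
      return { allowed: false, reason: 'run cost budget exhausted' };
    }
    return { allowed: true };
  }
}

A real policy engine would also evaluate data scope and context budgets, but the shape stays the same: a structured request in, an explicit allow/deny decision out, before anything executes.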
The Orchestration Engine should not:
- Authenticate, validate, or route incoming requests. That is the AI Gateway's job.
- Call enterprise systems directly. Tool calls go through the Tool Gateway under its own governance.
- Select or host models itself. Model calls go through the Model Gateway, with the engine supplying only routing hints and output schemas.
This separation keeps the engine focused on what it does well: reliable, auditable execution of governed workflows.
When people say “agentic,” you can map it to these internal components:
| Component | Responsibility |
|---|---|
| Workflow Executor | Runs the graph or state machine |
| Planner | Decides next steps when the workflow allows flexibility (optional) |
| Tool Dispatcher | Executes tool calls through the Tool Gateway with constraints |
| Model Invoker | Calls the Model Gateway and enforces output schemas |
| State Store Adapter | Persists run state to the Session Store or durable store |
| Checkpoint Manager | Handles HITL pauses and resumes |
| Trace Emitter | Emits structured run and step events to audit and telemetry |
Each component has a clear boundary. The Workflow Executor coordinates. The Planner advises. The dispatchers and invokers execute under strict contracts. The adapters persist. The emitters observe.
A workflow is a versioned, approved state machine with explicit schemas and policies. Each workflow defines:
- The steps it may run and each step's type (tool, model, checkpoint, or rule)
- Input and output schemas for every step
- Step-level constraints such as token limits and required citations
- Human-in-the-loop checkpoints with triggers, timeouts, and escalation paths
- Run-level policies: allowed tools, data scope, and cost and duration budgets
This is the core architectural move: agents become workflows with guardrails. The workflow definition is code-reviewed, version-controlled, and deployed through your standard CI/CD pipeline. No more ad hoc prompt chains that drift between deploys.
# Example workflow definition (illustrative)
workflow:
  id: case-risk-summary
  version: "1.2.0"
  owner: processing-team
  steps:
    - id: gather-context
      type: tool
      tool: document-retriever
      input_schema: { caseId: string }
      output_schema: { documents: Document[] }
    - id: analyze-risk
      type: model
      model_class: high-accuracy
      input_schema: { documents: Document[], prompt: string }
      output_schema: { riskFactors: RiskFactor[], summary: string }
      constraints:
        max_tokens: 4000
        required_citations: true
    - id: human-review
      type: checkpoint
      trigger: { riskLevel: "high" }
      timeout_hours: 24
      escalation: processing-lead
  policies:
    allowed_tools: [document-retriever, policy-lookup]
    data_scope: [case-documents, policy-rules]
    max_cost_usd: 0.50
    max_duration_minutes: 5
The workflow definition is declarative. The Orchestration Engine interprets it. This separation means you can update policy constraints without touching the execution logic.
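One way to make that separation concrete is to validate definitions at load time, before any run can reference them. A minimal sketch with hand-rolled checks and deliberately thin types (WorkflowDefinitionLite is illustrative, not the full definition shape):

// orchestrator/src/workflow-registry.ts (illustrative sketch, not production code)
interface WorkflowDefinitionLite {
  id: string;
  version: string;
  steps: Record<string, { type: 'tool' | 'model' | 'checkpoint' | 'rule' }>;
  policies: { allowed_tools: string[]; max_cost_usd: number; max_duration_minutes: number };
}

class WorkflowRegistry {
  private byId = new Map<string, WorkflowDefinitionLite>();

  // Register a definition only if it passes structural validation
  register(def: WorkflowDefinitionLite): void {
    if (!def.id || !def.version) {
      throw new Error('workflow definition requires id and version');
    }
    if (Object.keys(def.steps).length === 0) {
      throw new Error(`workflow ${def.id} defines no steps`);
    }
    if (def.policies.max_cost_usd <= 0) {
      throw new Error(`workflow ${def.id} must declare a positive cost budget`);
    }
    this.byId.set(def.id, def); // one active version per id, for simplicity
  }

  async get(workflowId: string): Promise<WorkflowDefinitionLite | undefined> {
    return this.byId.get(workflowId);
  }
}

Because the policies block lives in the definition, tightening max_cost_usd or trimming allowed_tools is a reviewed definition change, not an engine deploy.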
Not all workflows are linear. Sometimes the next step depends on context: the user asks something unexpected, new documents arrive mid-case, a tool call fails and needs a fallback, or the system must choose between summarizing and drilling down based on signal.
The Planner is an optional decision-maker that chooses which step to run next when the workflow permits flexibility. The safest design treats it as a recommendation engine constrained by policy.
The planner suggests; the executor decides. Planning is constrained to selecting among enumerated, approved next steps, and the executor validates every decision against policy and budgets before execution.
Given the current run state summary, the last step's output, the enumerated set of allowed next steps, and the remaining budget constraints, the planner returns a plan decision: the selected next step, a confidence score, a risk level, and its reasoning.
The Workflow Executor then validates that decision against guardrails before proceeding.
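Sketched as types, that contract might look like the following; the field names mirror how the planner output is consumed in the pseudocode later in this post, but they are illustrative rather than prescriptive:

// Illustrative planner contract (field names are not prescriptive)
interface PlannerInput {
  allowedNextSteps: string[];      // enumerated, pre-approved options only
  currentStateSummary: string;     // compact summary, not the raw run state
  lastStepOutput: unknown;
  constraints: {
    maxToolCallsRemaining: number;
    maxCostRemainingUsd: number;
    requiredApproval: boolean;
  };
}

interface PlannerDecision {
  selectedStepId: string;          // must be a member of allowedNextSteps
  confidence: number;              // 0.0 - 1.0; low values route to HITL
  riskLevel: 'low' | 'medium' | 'high';
  reasoning: string;               // logged for audit and replay
}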
Different scenarios call for different planner implementations:
Deterministic Planner (No AI): Best for strict compliance flows. The “plan” is if/else logic encoded in rules. No model involved. Maximum predictability.
Hybrid Planner (Rules First, Model for Ambiguity): Rules handle 80-90% of routing. The model is used for classification, intent extraction, or tie-breaking. This balances predictability with flexibility.
LLM Planner (Model Chooses the Path): The model selects among enumerated next steps. It must output strict JSON, cannot call tools directly, and can be forced to provide confidence and reasoning. Low confidence triggers HITL or a safer fallback path.
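A minimal sketch of the hybrid option, assuming a rules table plus a model-backed classifier that is only consulted when no rule matches (the classify callback is a hypothetical stand-in for that model call):

// Illustrative hybrid planner: rules first, model only for ambiguity
type RoutingRule = { when: (input: PlannerInput) => boolean; target: string };

class HybridPlanner {
  constructor(
    private rules: RoutingRule[],
    // Hypothetical classifier that returns structured JSON from a model call
    private classify: (input: PlannerInput) => Promise<PlannerDecision>
  ) {}

  async decide(input: PlannerInput): Promise<PlannerDecision> {
    // 1. Deterministic rules resolve the common cases with full predictability
    for (const rule of this.rules) {
      if (rule.when(input)) {
        return {
          selectedStepId: rule.target,
          confidence: 1.0,
          riskLevel: 'low',
          reasoning: 'matched deterministic routing rule'
        };
      }
    }
    // 2. Only ambiguous cases reach the model, and its output is still
    //    schema-validated and allowlist-checked by the executor afterwards
    const decision = await this.classify(input);
    if (!input.allowedNextSteps.includes(decision.selectedStepId)) {
      // Fall back to a safe default (assumes at least one allowed step)
      // rather than trusting an out-of-bounds selection
      return {
        selectedStepId: input.allowedNextSteps[0],
        confidence: 0.0,
        riskLevel: 'high',
        reasoning: 'model selected a step outside the allowed set'
      };
    }
    return decision;
  }
}

Returning a zero-confidence decision on an invalid selection means the executor's low-confidence check routes the run to human review instead of silently continuing.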
Figure 2: Planner Decision Point Sequence
The Planner can only choose from approved steps in the workflow graph. It outputs structured JSON that is schema-validated. The Executor enforces budgets and tool allowlists before execution. Planner decisions are logged to the audit trail and are replayable. Low-confidence or high-risk routes trigger human-in-the-loop review.
These constraints prevent the failure modes that plague unconstrained agents. The Planner is powerful because it’s limited.
Figure 3: Orchestration Engine Sequence Diagram with branching and HITL
A typical workflow run proceeds through several phases.
Initialization: The Workflow Executor receives the execution request from the AI Gateway, loads the workflow definition from the Workflow Registry, initializes run state, and begins at the entry step.
Step Execution: For each step, the executor checks policy constraints, invokes the appropriate dispatcher (tool, model, or checkpoint), captures the output, updates state, and emits trace events.
Branching: When the workflow reaches a decision point, the Planner evaluates the current context and selects the next step from the allowed options. The executor validates this selection against policy before proceeding.
HITL Checkpoints: When a step triggers human review, the Checkpoint Manager pauses execution, persists state, notifies the appropriate approver, and waits for a decision. On approval, execution resumes. On rejection or timeout, the workflow follows the defined escalation path.
Completion: When the workflow reaches a terminal state, the executor finalizes the run, emits completion events, and returns the structured result to the AI Gateway.
The orchestrator client call from the AI Gateway triggers execution. Here’s what happens inside:
// Illustrative pseudocode for a reference architecture (not production code)
// orchestrator/src/executor.ts
async function executeWorkflow(request: WorkflowExecuteRequest): Promise<WorkflowExecuteResult> {
  const { workflowId, sessionId, correlationId, input, caller } = request;

  // 1. Load workflow definition from registry
  const workflow = await workflowRegistry.get(workflowId);
  if (!workflow) {
    throw new WorkflowNotFoundError(workflowId);
  }

  // 2. Initialize run state
  const runId = generateRunId();
  const runState: RunState = {
    runId,
    workflowId,
    workflowVersion: workflow.version,
    sessionId,
    correlationId,
    currentStepId: workflow.entryStep,
    stepHistory: [],
    artifacts: {},
    budgetConsumed: 0,
    toolCallCount: 0,
    startedAt: new Date().toISOString(),
    status: 'running'
  };
  await stateStore.saveRunState(runState);
  traceEmitter.emit('run.started', { runId, workflowId, correlationId });

  // 3. Execute steps until terminal state
  while (!isTerminalStep(runState.currentStepId, workflow)) {
    const step = workflow.steps[runState.currentStepId];

    // 4. Check policy before each step
    const policyCheck = await policyEngine.checkStepAllowed({
      workflowId,
      stepId: step.id,
      caller,
      currentBudget: runState.budgetConsumed
    });
    if (!policyCheck.allowed) {
      traceEmitter.emit('step.blocked', { runId, stepId: step.id, reason: policyCheck.reason });
      throw new PolicyViolationError(policyCheck.reason);
    }

    // 5. Execute the step based on type
    const stepResult = await executeStep(step, runState, input, caller);

    // 6. Update state and history
    runState.stepHistory.push({
      stepId: step.id,
      startedAt: stepResult.startedAt,
      completedAt: stepResult.completedAt,
      output: stepResult.output,
      tokensUsed: stepResult.tokensUsed,
      costUsd: stepResult.costUsd
    });
    runState.artifacts = { ...runState.artifacts, ...stepResult.artifacts };
    runState.budgetConsumed += (stepResult.costUsd ?? 0);
    if (step.type === 'tool') {
      runState.toolCallCount += 1;
    }

    // 7. Determine next step (may involve Planner)
    runState.currentStepId = await determineNextStep(step, stepResult, workflow, runState);
    await stateStore.saveRunState(runState);
    traceEmitter.emit('step.completed', { runId, stepId: step.id, nextStepId: runState.currentStepId });
  }

  // 8. Finalize and return
  runState.status = 'completed';
  runState.completedAt = new Date().toISOString();
  await stateStore.saveRunState(runState);
  traceEmitter.emit('run.completed', { runId, workflowId });
  return buildWorkflowResult(runState);
}
The step execution dispatcher routes to the appropriate handler:
// orchestrator/src/step-dispatcher.ts
async function executeStep(
  step: WorkflowStep,
  runState: RunState,
  input: WorkflowInput,
  caller: CallerIdentity
): Promise<StepResult> {
  const startedAt = new Date().toISOString();

  switch (step.type) {
    case 'tool':
      return await toolDispatcher.execute({
        toolId: step.tool,
        input: resolveStepInput(step.input_schema, runState, input),
        caller,
        idempotencyKey: `${runState.runId}-${step.id}`,
        constraints: step.constraints
      });

    case 'model':
      return await modelInvoker.execute({
        modelClass: step.model_class,
        input: resolveStepInput(step.input_schema, runState, input),
        outputSchema: step.output_schema,
        constraints: step.constraints
      });

    case 'checkpoint':
      return await checkpointManager.requestApproval({
        runId: runState.runId,
        stepId: step.id,
        trigger: step.trigger,
        timeout: step.timeout_hours,
        escalation: step.escalation,
        context: buildCheckpointContext(runState)
      });

    case 'rule':
      return await ruleEngine.evaluate({
        ruleId: step.rule,
        input: resolveStepInput(step.input_schema, runState, input)
      });

    default:
      throw new UnknownStepTypeError(step.type);
  }
}
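The requestApproval call above hides the pause-and-resume mechanics. One way to structure it, sketched with illustrative notifier and decision-store interfaces; a real implementation would be event-driven rather than awaiting a decision in process:

// orchestrator/src/checkpoint-manager.ts (illustrative sketch, not production code)
interface ApprovalRequest {
  runId: string;
  stepId: string;
  escalation: string;      // role or queue to notify
  timeoutHours: number;
  context: Record<string, unknown>;
}

interface ApprovalDecision {
  approved: boolean;
  decidedBy: string;
  comment?: string;
}

class CheckpointManager {
  constructor(
    private notifier: { notify: (target: string, req: ApprovalRequest) => Promise<void> },
    private decisions: { waitFor: (runId: string, stepId: string, timeoutMs: number) => Promise<ApprovalDecision | undefined> }
  ) {}

  async requestApproval(req: ApprovalRequest): Promise<ApprovalDecision> {
    // 1. Record the pending checkpoint and notify the approver; run state was
    //    already persisted by the executor before this step started
    await this.notifier.notify(req.escalation, req);

    // 2. Wait for a decision or a timeout
    const decision = await this.decisions.waitFor(
      req.runId, req.stepId, req.timeoutHours * 60 * 60 * 1000
    );

    // 3. Timeout follows the workflow's escalation path (treated as rejection here)
    if (!decision) {
      return { approved: false, decidedBy: 'timeout', comment: `escalated to ${req.escalation}` };
    }
    return decision;
  }
}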
The Planner integration for dynamic routing:
// orchestrator/src/planner.ts
async function determineNextStep(
  currentStep: WorkflowStep,
  stepResult: StepResult,
  workflow: WorkflowDefinition,
  runState: RunState
): Promise<string> {
  // Static routing: explicit next step defined
  if (currentStep.next) {
    return currentStep.next;
  }

  // Conditional routing: evaluate conditions
  if (currentStep.transitions) {
    for (const transition of currentStep.transitions) {
      if (evaluateCondition(transition.condition, stepResult, runState)) {
        return transition.target;
      }
    }
  }

  // Dynamic routing: invoke Planner
  if (currentStep.allowDynamicRouting) {
    const plannerInput: PlannerInput = {
      allowedNextSteps: currentStep.allowedNextSteps,
      currentStateSummary: summarizeState(runState),
      lastStepOutput: stepResult.output,
      constraints: {
        maxToolCallsRemaining: workflow.policies.max_tool_calls - runState.toolCallCount,
        maxCostRemainingUsd: workflow.policies.max_cost_usd - runState.budgetConsumed,
        requiredApproval: workflow.policies.hitl_required
      }
    };
    const plannerOutput = await planner.decide(plannerInput);

    // Validate planner decision
    if (!currentStep.allowedNextSteps.includes(plannerOutput.selectedStepId)) {
      traceEmitter.emit('planner.invalid', {
        runId: runState.runId,
        selected: plannerOutput.selectedStepId,
        allowed: currentStep.allowedNextSteps
      });
      throw new InvalidPlannerDecisionError(plannerOutput.selectedStepId);
    }

    // Trigger HITL if low confidence or high risk
    if (plannerOutput.confidence < 0.7 || plannerOutput.riskLevel === 'high') {
      traceEmitter.emit('planner.hitl-triggered', { runId: runState.runId, reason: 'low-confidence' });
      return 'human-review-step';
    }

    return plannerOutput.selectedStepId;
  }

  // Terminal: no next step
  return 'terminal';
}
This code is intentionally simplified. Production implementations handle concurrency, distributed state, failure recovery, and many edge cases the pseudocode omits. The structure illustrates the separation of concerns and the flow of control.
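One of those omitted concerns is worth sketching because it shapes the tool contract: retries are only safe when the idempotency key lets the Tool Gateway treat a repeated delivery as a duplicate rather than a new side effect. A rough sketch under that assumption:

// Illustrative retry wrapper; assumes the Tool Gateway deduplicates on idempotencyKey
async function executeToolWithRetry(
  dispatch: (call: { toolId: string; input: unknown; idempotencyKey: string }) => Promise<unknown>,
  call: { toolId: string; input: unknown; idempotencyKey: string },
  maxAttempts = 3
): Promise<unknown> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // Same idempotency key on every attempt: a duplicate delivery must not
      // re-execute the side effect downstream
      return await dispatch(call);
    } catch (err) {
      lastError = err;
      // A production version would retry only retryable failures (timeouts, 5xx),
      // never validation errors; exponential backoff before the next attempt
      await new Promise((resolve) => setTimeout(resolve, 250 * 2 ** (attempt - 1)));
    }
  }
  throw lastError;
}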
Every step emits structured trace events. These events flow to the shared audit log and telemetry store, enabling:
Debugging: Trace a failed run step by step. See exactly what context was assembled, what the model received, what it produced, and why the workflow branched where it did.
Auditing: Answer “why did the system do X?” with evidence. Every decision is logged with causal links. Regulators and auditors can follow the chain.
Regression Testing: Replay historical runs against new workflow versions. Catch behavioral changes before they reach production.
Performance Analysis: Identify slow steps, expensive model calls, and frequent HITL triggers. Optimize based on data.
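The regression-testing case deserves a sketch. Assuming a replay executor that answers tool and model steps from recorded outputs instead of live calls (the RecordedStepEvent and ReplayExecutor shapes here are hypothetical), routing drift between workflow versions becomes a simple path comparison:

// Illustrative replay harness: re-run a recorded trace against a new workflow version
interface RecordedStepEvent {
  stepId: string;
  stepType: 'tool' | 'model' | 'checkpoint' | 'rule';
  output: unknown;
}

// Hypothetical signature: executes a workflow, but answers tool and model steps
// from recorded outputs instead of calling live systems
type ReplayExecutor = (
  workflowId: string,
  version: string,
  recorded: RecordedStepEvent[]
) => Promise<{ visitedSteps: string[] }>;

async function detectRoutingDrift(
  replay: ReplayExecutor,
  workflowId: string,
  candidateVersion: string,
  recorded: RecordedStepEvent[]
): Promise<string[]> {
  const rerun = await replay(workflowId, candidateVersion, recorded);
  const originalPath = recorded.map((e) => e.stepId);
  // Any positional divergence in the visited path is a behavioral change to review
  return rerun.visitedSteps.filter((stepId, i) => stepId !== originalPath[i]);
}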
The Trace Emitter produces events like:
{
  "eventType": "step.completed",
  "runId": "run_abc123",
  "correlationId": "corr_xyz789",
  "workflowId": "case-risk-summary",
  "workflowVersion": "1.2.0",
  "stepId": "analyze-risk",
  "stepType": "model",
  "startedAt": "2026-01-09T14:23:45.123Z",
  "completedAt": "2026-01-09T14:23:47.456Z",
  "durationMs": 2333,
  "tokensUsed": 1847,
  "costUsd": 0.037,
  "modelUsed": "gemini-2.5-flash",
  "outputSummary": "Identified 3 risk factors: documentation gap, income verification, DTI threshold",
  "nextStepId": "human-review"
}
These events are the foundation for observability dashboards, alerting, and continuous improvement.
In the next post, we’ll explore The Tool Gateway: the governed execution boundary between AI workflows and enterprise systems. We’ll examine how tools are registered, how RBAC and policy enforcement work at the tool level, and why deterministic tool behavior is essential for reliable AI systems.
This series explores each component of the Lattice architecture in depth.
This series documents architectural patterns for enterprise AI platforms. Diagrams and frameworks are provided for educational purposes.