Workflows
Declarative finite state machine workflows backed by Temporal. State types including ContainerRun for CI/CD steps, Blackboard context, transition rules, and the Forge reference pattern.
A Workflow in AEGIS is a declarative finite state machine (FSM) defined in YAML. It coordinates multiple agent executions, system commands, and human approval gates into a durable, restartable sequence. Workflows are executed by the Temporal-backed workflow engine, which guarantees that execution state survives orchestrator restarts.
Workflow Types: Temporal vs. ToolWorkflows
It is critical to distinguish between the two types of "Workflows" in the AEGIS ecosystem to avoid namespace collisions and conceptual errors.
| Feature | Temporal Workflows (Workflow) | SEAL ToolWorkflows (Macro-Tools) |
|---|---|---|
| Purpose | Async CI/CD pipelines, long-running multi-agent choreography. | Sync REST API macro-tools, context window compression. |
| Backend | Temporal (Durable, stateful, persistent). | SEAL Tooling Gateway (Stateless, sub-second). |
| Primitives | Agent, ContainerRun, HumanApproval. | Sequential REST calls with JSONPath extractors. |
| Duration | Minutes to weeks. | Milliseconds to seconds. |
| Invoked By | Operator or system trigger. | Agent LLM (via standard tools/call). |
This page documents Temporal Workflows. For documentation on REST macro-tools, see the SEAL Tooling Gateway architecture.
Workflow Scope and Visibility
Every workflow has a scope that controls who can discover and execute it:
| Scope | Visibility | Use Case |
|---|---|---|
| User | Only the owning user within their tenant | Personal sandbox, experimentation |
| Tenant | All users within the owning tenant | Team-shared workflows (default) |
| Global | All users across all tenants | Platform-standard workflows managed by operators |
Scope Resolution: When you reference a workflow by name, AEGIS resolves it using a narrowest-first strategy: your User-scoped workflows are checked first, then Tenant-scoped, then Global. This means you can shadow a platform workflow with a personal version of the same name.
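The narrowest-first lookup described above can be sketched in a few lines of Python. This is only an illustration: the `registry` layout, the workflow IDs, and the `resolve_workflow` helper are hypothetical and are not part of the AEGIS API.

```python
from typing import Optional

# Search order taken from the text: narrowest scope first.
SCOPE_ORDER = ("user", "tenant", "global")


def resolve_workflow(name: str, registry: dict) -> Optional[tuple]:
    """Return (scope, workflow_id) for the first scope that defines `name`."""
    for scope in SCOPE_ORDER:
        workflow_id = registry.get(scope, {}).get(name)
        if workflow_id is not None:
            return scope, workflow_id
    return None  # no visible workflow with that name


# Hypothetical registry: a personal "deploy" shadows the platform one.
registry = {
    "user": {"deploy": "wf-user-123"},
    "tenant": {"code-review-pipeline": "wf-tenant-456"},
    "global": {"deploy": "wf-global-789", "the-forge": "wf-global-001"},
}

print(resolve_workflow("deploy", registry))     # ('user', 'wf-user-123')
print(resolve_workflow("the-forge", registry))  # ('global', 'wf-global-001')
```

Because the loop returns on the first hit, the Global `deploy` workflow is never reached while a User-scoped workflow of the same name exists.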
Promotion and Demotion: Workflows can be promoted (User → Tenant → Global) or demoted (Global → Tenant → User) via CLI or API:
# Deploy a personal workflow
aegis workflow deploy my-workflow.yaml --scope user
# Promote to tenant-wide visibility
aegis workflow promote my-workflow --to tenant
# Promote to global (operator role required)
aegis workflow promote my-workflow --to global
# List all visible workflows (user + tenant + global)
aegis workflow list --visible
Direct jumps between User and Global are not allowed; you must promote through Tenant first, which enforces a review step.
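The one-step-at-a-time rule can be expressed as a small check. The sketch below is an assumption about how such validation might look, not the orchestrator's actual code; `validate_scope_change` is a hypothetical helper, and role checks (such as the operator role needed for Global) are left out.

```python
SCOPES = ["user", "tenant", "global"]  # ordered narrowest to widest


def validate_scope_change(current: str, target: str) -> str:
    """Return 'promote' or 'demote' for a one-step move; raise otherwise."""
    step = SCOPES.index(target) - SCOPES.index(current)
    if step == 1:
        return "promote"
    if step == -1:
        return "demote"
    raise ValueError(
        f"cannot move {current} -> {target}: scope changes must move exactly one step"
    )


print(validate_scope_change("user", "tenant"))    # promote
print(validate_scope_change("global", "tenant"))  # demote
```

Calling `validate_scope_change("user", "global")` raises `ValueError`, which mirrors the requirement to pass through Tenant.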
Semantic Discovery (Enterprise)
On nodes with the discovery service configured, registered workflows are automatically indexed for semantic search. Agents and operators can find workflows by natural-language intent (e.g. "deploy Node.js to Kubernetes") using the aegis.workflow.search MCP tool, rather than browsing the full list. This is an enterprise feature; nodes without discovery configured use aegis.workflow.list for enumeration.
Manifest Structure
apiVersion: 100monkeys.ai/v1
kind: Workflow
metadata:
  name: code-review-pipeline
  version: "1.0.0"
  labels:
    category: development
spec:
  context:
    review_threshold: 0.85
  initial_state: ANALYZE
  states:
    ANALYZE:
      kind: Agent
      agent: code-analyzer
      timeout: 120s
      transitions:
        - condition: on_success
          target: REVIEW
        - condition: on_failure
          target: FAILED
    REVIEW:
      kind: Agent
      agent: code-reviewer
      timeout: 180s
      transitions:
        - condition: score_above
          threshold: 0.85
          target: APPROVE
        - condition: score_below
          threshold: 0.85
          target: FAILED
          feedback: "Score too low ({{REVIEW.score}}): {{REVIEW.output.reasoning}}"
    APPROVE:
      kind: Human
      prompt: |
        Review score: {{REVIEW.score}}
        Analysis: {{REVIEW.output}}
        Approve to merge? (yes/no)
      timeout: 86400s
      transitions:
        - condition: input_equals_yes
          target: MERGE
        - condition: input_equals_no
          target: REJECTED
          feedback: "{{human.feedback}}"
    MERGE:
      kind: System
      command: "aegis-merge-tool --pr-id {{input.pr_id}}"
      transitions:
        - condition: exit_code_zero
          target: DONE
        - condition: exit_code_non_zero
          target: FAILED
    DONE:
      kind: System
      command: "echo done"
      transitions: []
    FAILED:
      kind: System
      command: "echo failed"
      transitions: []
    REJECTED:
      kind: System
      command: "echo rejected"
      transitions: []
State Types
Agent
Executes an AEGIS agent using the 100monkeys iterative loop. The state waits for the agent execution to reach completed or failed before evaluating transitions.
ANALYZE:
  kind: Agent
  agent: code-analyzer   # must be a deployed agent name
  timeout: 300s          # cancels execution if not complete within this time
  input: |
    {{workflow.task}}
  transitions:
    - condition: on_success
      target: REVIEW
    - condition: on_failure
      target: FAILED
The agent's final output is available as {{ANALYZE.output}}. Status is available as {{ANALYZE.status}}.
System
Runs a shell command on the orchestrator host. Useful for triggering external processes, sending notifications, or performing cleanup.
NOTIFY:
  kind: System
  command: "curl -X POST https://hooks.example.com/notify -d '{\"status\": \"done\"}'"
  timeout: 30s
  transitions:
    - condition: exit_code_zero
      target: DONE
    - condition: exit_code_non_zero
      target: FAILED
Human
Suspends the workflow and waits for an operator to signal a decision. The workflow persists durably in Temporal — it will wait indefinitely (or until timeout).
APPROVAL_GATE:
  kind: Human
  prompt: |
    Output: {{GENERATE.output}}
    Approve to proceed? (yes/no)
  timeout: 86400s   # 24 hours; default_response taken if not signalled
  default_response: reject
  transitions:
    - condition: input_equals_yes
      target: PROCEED
    - condition: input_equals_no
      target: REDESIGN
      feedback: "{{human.feedback}}"
To resume a suspended workflow, send a signal through the workflow execution API:
POST /v1/workflows/executions/{execution_id}/signal
Content-Type: application/json

{
  "response": "approved"
}
ParallelAgents
Fans out to multiple agent executions simultaneously. All agents run concurrently; the state waits until all have completed (or timeout elapses) before evaluating transitions using the consensus configuration.
PARALLEL_REVIEW:
  kind: ParallelAgents
  agents:
    - agent: security-reviewer
      input: "Security audit: {{GENERATE.output}}"
      weight: 2.0
      timeout_seconds: 300
    - agent: performance-reviewer
      input: "Performance review: {{GENERATE.output}}"
      weight: 1.0
      timeout_seconds: 180
    - agent: style-reviewer
      input: "Style review: {{GENERATE.output}}"
      weight: 1.0
      timeout_seconds: 60
  consensus:
    strategy: weighted_average   # weighted_average | majority | unanimous | best_of_n
    threshold: 0.85
    min_agreement_confidence: 0.75
    min_judges_required: 2
  timeout: 600s
  transitions:
    - condition: consensus
      threshold: 0.85
      agreement: 0.75
      target: DONE
    - condition: score_below
      threshold: 0.85
      target: FAILED
      feedback: "Parallel review failed (score: {{PARALLEL_REVIEW.consensus.score}})"
Individual results are available as {{STATENAME.agents.N.output}} (0-indexed). The aggregated score is {{STATENAME.consensus.score}}.
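To make the weighted_average strategy concrete, here is a minimal Python sketch of one plausible aggregation: each judge's score is multiplied by its weight and the sum is divided by the total weight. The exact formula, the use of `>=` against the threshold, and the `JudgeResult` type are assumptions made for illustration; the agreement-confidence check is not modelled, and the engine's real implementation may differ.

```python
from dataclasses import dataclass


@dataclass
class JudgeResult:
    agent: str
    score: float   # validation score in [0.0, 1.0]
    weight: float  # relative weight from the manifest


def weighted_average(results: list) -> float:
    """Weight-normalised mean of the judges' scores."""
    total_weight = sum(r.weight for r in results)
    if total_weight <= 0:
        raise ValueError("at least one judge with a positive weight is required")
    return sum(r.score * r.weight for r in results) / total_weight


def consensus_passes(results: list, threshold: float, min_judges_required: int) -> bool:
    """Assumed rule: enough judges reported and the weighted score meets the threshold."""
    if len(results) < min_judges_required:
        return False
    return weighted_average(results) >= threshold


# Hypothetical scores for the three reviewers in the manifest above.
results = [
    JudgeResult("security-reviewer", score=0.9, weight=2.0),
    JudgeResult("performance-reviewer", score=0.8, weight=1.0),
    JudgeResult("style-reviewer", score=0.9, weight=1.0),
]

score = weighted_average(results)  # (0.9*2 + 0.8 + 0.9) / 4, about 0.875
print(consensus_passes(results, threshold=0.85, min_judges_required=2))  # True
```

With a weight of 2.0, the security reviewer counts twice as much as either of the other judges, so a low security score drags the aggregate down faster than a low style score would.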
ContainerRun
Executes a deterministic command inside an isolated container — no LLM, no iteration loop. Use ContainerRun for CI/CD steps such as building binaries, running test suites, linting, or pushing deployment artifacts. The container is created, the command runs to completion, and the container is destroyed.
BUILD:
  kind: ContainerRun
  name: "Compile and Build"     # human-readable label for the Synapse UI
  image: "rust:1.75-alpine"     # any Docker image or standard runtime reference
  command: ["cargo", "build", "--release"]
  workdir: "/workspace"
  volumes:
    - name: workspace
      mount_path: /workspace
  resources:
    cpu: 2000                   # millicores (2000 = 2 CPU cores)
    memory: "4Gi"
    timeout: "10m"
  env:
    CARGO_HOME: "/workspace/.cargo"
  transitions:
    - condition: exit_code_zero
      target: TEST
    - condition: exit_code_non_zero
      target: FIX_BUILD_ERRORS
      feedback: "Build stderr: {{BUILD.output.stderr}}"
The step's output is stored on the Blackboard under the state name. Access it with {{BUILD.output.exit_code}}, {{BUILD.output.stdout}}, and {{BUILD.output.stderr}}.
You can also write multi-line scripts using shell: true:
DEPLOY:
  kind: ContainerRun
  name: "Push to Registry"
  image: "docker:24-cli"
  command:
    - sh
    - -c
    - |
      docker build -t {{blackboard.registry}}/{{blackboard.app}}:{{blackboard.version}} /workspace
      docker push {{blackboard.registry}}/{{blackboard.app}}:{{blackboard.version}}
  shell: true
  registry_credentials: "secret:cicd/docker-registry"
  volumes:
    - name: workspace
      mount_path: /workspace
  resources:
    timeout: "15m"
  transitions:
    - condition: exit_code_zero
      target: DONE
    - condition: exit_code_non_zero
      target: NOTIFY_FAILURE
ParallelContainerRun
Runs multiple container commands concurrently within a single workflow state. All steps share the same volume mounts and complete before transitions are evaluated. Use it for running a test matrix, running lint + unit tests + format checks simultaneously, or building multiple build targets.
TEST:
  kind: ParallelContainerRun
  steps:
    - name: unit-tests
      image: "rust:1.75-alpine"
      command: ["cargo", "test", "--workspace"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
      resources:
        timeout: "5m"
    - name: lint
      image: "rust:1.75-alpine"
      command: ["cargo", "clippy", "--all-targets", "--", "-D", "warnings"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
    - name: format-check
      image: "rust:1.75-alpine"
      command: ["cargo", "fmt", "--check"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
  completion: all_succeed   # all_succeed | any_succeed | best_effort
  transitions:
    - condition: on_success
      target: REVIEW
    - condition: on_failure
      target: FIX_TEST_ERRORS
Per-step output is accessible by name: {{TEST.output.unit-tests.stdout}}, {{TEST.output.lint.stderr}}, etc.
The three completion strategies control how individual step outcomes are aggregated:
| Strategy | Behaviour |
|---|---|
| all_succeed | State succeeds only if every step exits with code 0. |
| any_succeed | State succeeds if at least one step exits with code 0. |
| best_effort | State always transitions with on_success; per-step results are available on the Blackboard regardless. |
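The table above maps directly onto a small aggregation function. The Python sketch below is illustrative only: `state_succeeds` and the exit-code dictionary are hypothetical names, and the real engine may represent step results differently.

```python
def state_succeeds(exit_codes: dict, completion: str) -> bool:
    """Aggregate per-step exit codes according to the completion strategy."""
    if completion == "all_succeed":
        return all(code == 0 for code in exit_codes.values())
    if completion == "any_succeed":
        return any(code == 0 for code in exit_codes.values())
    if completion == "best_effort":
        return True  # always takes the on_success transition
    raise ValueError(f"unknown completion strategy: {completion}")


# Hypothetical run of the TEST state: lint failed, the other steps passed.
exit_codes = {"unit-tests": 0, "lint": 1, "format-check": 0}

print(state_succeeds(exit_codes, "all_succeed"))  # False
print(state_succeeds(exit_codes, "any_succeed"))  # True
print(state_succeeds(exit_codes, "best_effort"))  # True
```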
Subworkflow
Invokes another deployed workflow as a child execution. Use Subworkflow to compose workflows — extract reusable pipelines (data processing, deployment, validation) into standalone workflows and call them from a parent.
Two execution modes are supported:
| Mode | Behaviour |
|---|---|
| blocking | The parent state waits for the child workflow to complete. The child's final Blackboard is written to the parent Blackboard under result_key. |
| fire_and_forget | The parent starts the child workflow and immediately transitions to the next state. The child runs independently. |
TRIGGER_CHILD:
  kind: Subworkflow
  workflow_id: data-pipeline-v2
  mode: blocking
  result_key: pipeline_output
  input: "{{workflow.context.task}}"
  transitions:
    - condition: on_success
      target: PROCESS_RESULT
    - condition: on_failure
      target: HANDLE_ERROR
In blocking mode, the child workflow's final output is available as {{TRIGGER_CHILD.result}} and the full result object is stored under the result_key on the parent Blackboard (e.g., {{blackboard.pipeline_output}}). In fire_and_forget mode, only the child's execution ID is written to the Blackboard as {{TRIGGER_CHILD.child_execution_id}}.
The Blackboard
The Blackboard is a mutable key-value store shared across all states in a workflow execution. It is seeded from spec.context, can be overridden at workflow start with --blackboard overrides, and is updated as states complete.
Lifecycle
The Blackboard has three lifecycle phases:
- Seed: Before the workflow starts, the orchestrator writes all spec.context keys into the Blackboard and then merges any startup blackboard overrides. Startup overrides must be a JSON or YAML object and win on same-named top-level keys. The reserved workflow key cannot be overridden.
- Live execution: The Temporal worker owns the Blackboard during execution. It receives the merged startup Blackboard, preserves authoritative workflow.* metadata, and writes each completed state's output back under the state's name (for example REQUIREMENTS.output, REQUIREMENTS.status, REQUIREMENTS.score). All Handlebars templates in subsequent states are resolved against this live state.
- Capture: When the workflow finishes, the orchestrator captures the final Blackboard snapshot from Temporal and persists it to PostgreSQL, where it is available via the execution history API.
Startup Overrides
Use startup Blackboard overrides when you want to inject per-run variables without changing the workflow manifest:
aegis workflow run release-pipeline \
  --input @input.yaml \
  --blackboard @blackboard.yaml
Example blackboard.yaml:
review_threshold: 0.92
deploy_env: staging
release:
  candidate: rc-4
Rules:
- --blackboard accepts inline JSON/YAML or @file input.
- The value must be an object, not a scalar or array.
- Blackboard overrides are merged on top of spec.context.
- workflow.* metadata remains authoritative and cannot be replaced by overrides.
- The live Blackboard is passed into downstream Agent states as structured execution context.
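The seeding rules above amount to a shallow merge with one protected key. The following Python sketch shows one way to read them. It assumes that an attempted override of `workflow` is silently dropped rather than rejected, which the text does not specify, and `seed_blackboard` and its arguments are hypothetical names.

```python
def seed_blackboard(spec_context: dict, overrides: object, workflow_meta: dict) -> dict:
    """Build the startup Blackboard from spec.context plus per-run overrides."""
    if not isinstance(overrides, dict):
        raise TypeError("blackboard overrides must be an object, not a scalar or array")

    blackboard = dict(spec_context)
    # Shallow merge: overrides win on same-named top-level keys.
    # Assumption: an override of the reserved 'workflow' key is ignored.
    blackboard.update({k: v for k, v in overrides.items() if k != "workflow"})
    # Orchestrator-owned metadata stays authoritative.
    blackboard["workflow"] = workflow_meta
    return blackboard


seeded = seed_blackboard(
    spec_context={"review_threshold": 0.85, "deploy_env": "dev"},
    overrides={
        "review_threshold": 0.92,
        "release": {"candidate": "rc-4"},
        "workflow": {"id": "spoofed"},  # dropped by the merge
    },
    workflow_meta={"id": "exec-1"},  # hypothetical metadata shape
)

print(seeded["review_threshold"])  # 0.92
print(seeded["deploy_env"])        # dev
print(seeded["workflow"])          # {'id': 'exec-1'}
```

Because the merge is shallow, overriding `release` replaces the whole nested object rather than merging individual fields inside it.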
Accessing Blackboard Values
Use Handlebars template syntax in input, command, env, prompt, and transition expression/feedback fields:
# Access a prior state's output
command: "process-results --input '{{ANALYZE.output}}'"
# Access a spec.context or startup blackboard override value
command: "run-tests --threshold {{workflow.context.review_threshold}}"
# Access workflow start input
command: "git clone {{input.repo_url}}"
# Access top-level blackboard values used during live execution
command: "deploy --env {{blackboard.deploy_env}}"
Available template variables:
| Variable | Description |
|---|---|
| {{workflow.task}} | task key from the workflow start --input payload. |
| {{workflow.context.KEY}} | Value from spec.context. |
| {{STATE.output}} | Final text output of an Agent state. |
| {{STATE.output.stdout}} | stdout of a System or ContainerRun state. |
| {{STATE.output.stderr}} | stderr of a System or ContainerRun state. |
| {{STATE.output.exit_code}} | Exit code of a System or ContainerRun state. |
| {{STATE.output.STEPNAME.stdout}} | Per-step stdout of a ParallelContainerRun state. |
| {{STATE.output.STEPNAME.stderr}} | Per-step stderr of a ParallelContainerRun state. |
| {{STATE.output.STEPNAME.exit_code}} | Per-step exit code of a ParallelContainerRun state. |
| {{STATE.status}} | Status: success, failed, or timeout. |
| {{STATE.score}} | Validation score of an Agent state (0.0–1.0). |
| {{STATE.consensus.score}} | Aggregated score of a ParallelAgents state. |
| {{STATE.agents.N.output}} | Individual judge output (ParallelAgents, 0-indexed). |
| {{STATE.result}} | Final output of a Subworkflow state (blocking mode). |
| {{STATE.child_execution_id}} | Child execution ID of a Subworkflow state. |
| {{blackboard.KEY}} | Any key from merged startup Blackboard data or any key written during execution. |
| {{state.feedback}} | feedback string from the incoming transition. |
| {{human.feedback}} | Feedback text from the Human signal --feedback argument. |
| {{input.KEY}} | Key from the workflow start --input payload. |
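To show how dotted template variables map onto nested Blackboard data, here is a deliberately simplified resolver in Python. It is not Handlebars and does not reflect how AEGIS renders templates internally; it only handles plain `{{a.b.c}}` lookups, and the `render` and `lookup` helpers and the sample context are hypothetical.

```python
import re

# Matches {{PATH}} where PATH is dot-separated (step names may contain hyphens).
_PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_.\-]+)\s*\}\}")


def lookup(path: str, context: dict):
    """Walk a dotted path through nested dicts; numeric parts index into lists."""
    value = context
    for part in path.split("."):
        if isinstance(value, list):
            value = value[int(part)]
        else:
            value = value[part]  # raises KeyError for unknown variables
    return value


def render(template: str, context: dict) -> str:
    """Replace every {{PATH}} placeholder with its value from the context."""
    return _PLACEHOLDER.sub(lambda m: str(lookup(m.group(1), context)), template)


# Hypothetical live context after BUILD, TEST, and PARALLEL_REVIEW have run.
context = {
    "BUILD": {"output": {"exit_code": 1, "stderr": "error[E0308]"}},
    "TEST": {"output": {"unit-tests": {"stdout": "12 passed"}}},
    "PARALLEL_REVIEW": {"agents": [{"output": "no issues found"}]},
}

print(render("Build stderr: {{BUILD.output.stderr}}", context))
# Build stderr: error[E0308]
print(render("{{TEST.output.unit-tests.stdout}}", context))
# 12 passed
print(render("{{PARALLEL_REVIEW.agents.0.output}}", context))
# no issues found
```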
Transition Rules
Each state has a transitions list. Rules are evaluated in order; the first matching condition wins.
transitions:
  # Score-based routing (Agent states)
  - condition: score_above
    threshold: 0.95
    target: EXCELLENT
  - condition: score_between
    min: 0.70
    max: 0.95
    target: GOOD
  # Exit-code routing (System states)
  - condition: exit_code_zero
    target: SUCCESS
  - condition: exit_code_non_zero
    target: RETRY
    feedback: "Command failed: {{EXECUTE.output.stderr}}"
  # Human approval routing (Human states)
  - condition: input_equals_yes
    target: APPROVED
  - condition: input_equals_no
    target: REJECTED
    feedback: "{{human.feedback}}"
  # Consensus routing (ParallelAgents states)
  - condition: consensus
    threshold: 0.85
    agreement: 0.75
    target: PASSED
  # Custom Handlebars expression
  - condition: custom
    expression: "{{blackboard.retry_count < workflow.context.max_retries}}"
    target: RETRY
  # Unconditional fallback: always place last
  - condition: always
    target: FAILED
A transition with no condition (or condition: always) is an unconditional fallback. Always include one as the last transition to prevent the FSM from stalling. For the complete list of condition types per state kind, see the Workflow Manifest Reference.
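First-match evaluation can be sketched as a loop that returns the target of the first rule whose condition holds. The Python below covers only a few condition types, and the boundary semantics (strict `>` and `<`, inclusive `score_between`) are assumptions; the Workflow Manifest Reference remains the authority. `matches` and `next_state` are hypothetical helpers.

```python
from typing import Optional


def matches(rule: dict, result: dict) -> bool:
    """Evaluate one transition rule against a completed state's result."""
    condition = rule.get("condition", "always")  # missing condition = fallback
    if condition == "always":
        return True
    if condition == "exit_code_zero":
        return result.get("exit_code") == 0
    if condition == "exit_code_non_zero":
        return result.get("exit_code") not in (None, 0)
    if condition == "score_above":
        return result.get("score", 0.0) > rule["threshold"]
    if condition == "score_below":
        return result.get("score", 0.0) < rule["threshold"]
    if condition == "score_between":
        return rule["min"] <= result.get("score", 0.0) <= rule["max"]
    return False  # condition types not modelled in this sketch


def next_state(transitions: list, result: dict) -> Optional[str]:
    """Return the target of the first matching rule, or None if the FSM would stall."""
    for rule in transitions:
        if matches(rule, result):
            return rule["target"]
    return None


transitions = [
    {"condition": "score_above", "threshold": 0.95, "target": "EXCELLENT"},
    {"condition": "score_between", "min": 0.70, "max": 0.95, "target": "GOOD"},
    {"condition": "always", "target": "FAILED"},
]

print(next_state(transitions, {"score": 0.99}))  # EXCELLENT
print(next_state(transitions, {"score": 0.80}))  # GOOD
print(next_state(transitions, {"score": 0.50}))  # FAILED
```

Without the final `always` rule, the third call would return `None`, which is the stall the fallback transition is meant to prevent.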
The Forge Reference Workflow
The Forge is the canonical multi-agent development workflow. It implements a sequential pipeline of seven agent states, with a human approval gate after the requirements state:
RequirementsAI → [Human Approval] → ArchitectureAI → TesterAI → CoderAI → ReviewerAI → CriticAI → SecurityAI
| State | Kind | Agent | Description |
|---|---|---|---|
| requirements | Agent | requirements-agent | Analyzes the input task and writes a structured requirements document. |
| approve_requirements | Human | n/a | Operator reviews and approves requirements before proceeding. |
| architecture | Agent | architect-agent | Designs system architecture based on approved requirements. |
| tests | Agent | tester-agent | Writes test cases and test harness before any implementation. |
| code | Agent | coder-agent | Implements the solution, running tests iteratively via the execution loop. |
| review | Agent | reviewer-agent | Reviews code for correctness, style, and adherence to requirements. |
| critic | Agent | critic-agent | Adversarial review: attempts to find failure cases and security issues. |
| security | Agent | security-agent | Static analysis and security-specific validation. |
Deploy and run the Forge workflow
Prerequisite: Clone the aegis-examples repository to get the the-forge example workflow used below.
aegis workflow deploy ./agents/workflows/the-forge.yaml --force
aegis workflow run the-forge --input '{"task": "Build a REST API for user authentication", "repo_url": "https://github.com/org/repo"}'
Deploying and Managing Workflows
# Deploy a workflow manifest
aegis workflow deploy ./my-workflow.yaml --force
# List all deployed workflows
aegis workflow list
# Start a workflow execution
aegis workflow run <workflow-name> --input '{"key": "value"}'
# Start with startup blackboard overrides from JSON or YAML
aegis workflow run <workflow-name> --blackboard @blackboard.yaml
# Follow workflow logs
aegis workflow logs <execution-id> --follow --verbose
aegis workflow logs <execution-id> --transitions
# Inspect a running or completed execution
aegis workflow executions get <execution-id>
Use --force when you are re-deploying the same workflow name and version.
For execution status and Human-state signaling, use the CLI:
aegis workflow executions get <execution-id>
aegis workflow signal <execution-id> --response approved
The underlying REST API remains available for integrations, but the CLI is the preferred operator path.
See the Workflow Manifest Reference for the complete field specification. See CLI Capability Matrix for currently CLI-exposed workflow operations.