Building CI/CD Pipelines

Mix agent steps and container-based CI/CD steps — build, test, lint, and deploy — in a single unified AEGIS workflow.

AEGIS workflows support two container-based state kinds — ContainerRun and ParallelContainerRun — that let you execute any Docker image as a deterministic, isolated step inside your workflow. This means you can mix AI agent steps and traditional CI/CD steps (compile, test, lint, push) in a single workflow manifest, sharing the same workspace volumes, transition logic, and observability.


Prerequisites

  • AEGIS daemon running (see Getting Started)
  • Docker available on the node running the Temporal worker
  • At least one agent deployed if you want to mix agent and CI/CD steps

Core Concepts

ContainerRun vs Agent

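|                 | ContainerRun              | Agent                             |
| --------------- | ------------------------- | --------------------------------- |
| Runs an LLM?    | No                        | Yes                               |
| Iteration loop? | No                        | Yes (up to max_iterations)        |
| Deterministic?  | Yes                       | No                                |
| Transitions on  | Exit code                 | Validation score                  |
| Best for        | Build, test, lint, deploy | Code generation, review, analysis |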

Volumes Are Shared

All states in a workflow share the same volumes. An agent writes code to /workspace; a ContainerRun step reads and compiles it from /workspace. This is the key primitive that makes unified pipelines possible.
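
A minimal sketch of that hand-off, using only fields that appear in the manifests later in this guide. The state names WRITE and INSPECT and the targets DONE and FAILED are illustrative, and the fragment assumes an agent named coder-v2 is deployed and a workflow-level volume named workspace is declared.

```yaml
# Fragment of spec.states (illustrative state names).
WRITE:
  kind: Agent
  agent: coder-v2                 # assumed to be deployed already
  input: |
    {{workflow.task}}
  volumes:
    - name: workspace
      mount_path: /workspace      # the agent writes files here
  transitions:
    - condition: on_success
      target: INSPECT
    - condition: on_failure
      target: FAILED

INSPECT:
  kind: ContainerRun
  name: "List workspace"
  image: "alpine:3.19"
  command: ["ls", "-la", "/workspace"]   # lists the files WRITE produced
  volumes:
    - name: workspace
      mount_path: /workspace      # same volume, same path
  transitions:
    - condition: exit_code_zero
      target: DONE
    - condition: exit_code_non_zero
      target: FAILED
```

A listing step like this can also serve as a quick debugging aid when a later build step cannot find the files it expects.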

Blackboard Integration

Container step outputs land on the Blackboard under the state name, accessible by subsequent states:

{{BUILD.output.exit_code}}    # 0 on success
{{BUILD.output.stdout}}       # captured stdout (up to 1 MB)
{{BUILD.output.stderr}}       # captured stderr (up to 1 MB)

For ParallelContainerRun, each step is keyed by its name:

{{TEST.output.unit-tests.stdout}}
{{TEST.output.lint.stderr}}

Step 1: A Minimal Build Pipeline

The simplest pipeline takes agent-generated code and compiles it. Create build-pipeline.yaml:

apiVersion: 100monkeys.ai/v1
kind: Workflow
metadata:
  name: build-pipeline
  version: "1.0.0"
  description: "Generate code with an agent, then compile it"
spec:
  initial_state: GENERATE

  volumes:
    - name: workspace
      persistent: false

  states:

    GENERATE:
      kind: Agent
      agent: coder-v2
      input: |
        {{workflow.task}}
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: FAILED

    BUILD:
      kind: ContainerRun
      name: "Compile"
      image: "rust:1.75-alpine"
      command: ["cargo", "build", "--release"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
      resources:
        cpu: 2000
        memory: "4Gi"
        timeout: "10m"
      transitions:
        - condition: exit_code_zero
          target: DONE
        - condition: exit_code_non_zero
          target: FIX_ERRORS
          feedback: "{{BUILD.output.stderr}}"

    FIX_ERRORS:
      kind: Agent
      agent: coder-v2
      input: |
        The build failed:
        {{state.feedback}}

        Fix the code in /workspace.
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: FAILED

    DONE:
      kind: System
      command: "echo 'Build complete'"
      transitions: []

    FAILED:
      kind: System
      command: "echo 'Pipeline failed'"
      transitions: []

Deploy and run it:

aegis workflow deploy ./build-pipeline.yaml
aegis workflow run build-pipeline --input '{"task": "Write a Rust CLI tool that counts lines in a file"}'

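When BUILD exits with a non-zero code, the workflow transitions to FIX_ERRORS. The agent receives the compiler errors via {{state.feedback}} and edits the files in /workspace; on success, the workflow loops back to BUILD automatically. As written, nothing in this manifest caps the number of BUILD and FIX_ERRORS round trips.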


Step 2: Parallel Testing

Running multiple quality checks in parallel reduces total pipeline duration. Use ParallelContainerRun to fan out:

TEST:
  kind: ParallelContainerRun
  steps:
    - name: unit-tests
      image: "rust:1.75-alpine"
      command: ["cargo", "test", "--workspace"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
      resources:
        timeout: "5m"

    - name: lint
      image: "rust:1.75-alpine"
      command: ["cargo", "clippy", "--all-targets", "--", "-D", "warnings"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
      resources:
        timeout: "3m"

    - name: format-check
      image: "rust:1.75-alpine"
      command: ["cargo", "fmt", "--check"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
      resources:
        timeout: "2m"

  completion: all_succeed
  transitions:
    - condition: on_success
      target: REVIEW
    - condition: on_failure
      target: FIX_TEST_ERRORS

When any step fails, the agent in FIX_TEST_ERRORS can inspect exactly which check failed:

FIX_TEST_ERRORS:
  kind: Agent
  agent: coder-v2
  input: |
    One or more quality checks failed. Diagnose and fix the issues.

    Unit test output:
    {{TEST.output.unit-tests.stdout}}
    {{TEST.output.unit-tests.stderr}}

    Lint output:
    {{TEST.output.lint.stderr}}

    Format check output:
    {{TEST.output.format-check.stdout}}
    {{TEST.output.format-check.stderr}}
  volumes:
    - name: workspace
      mount_path: /workspace
  transitions:
    - condition: on_success
      target: BUILD
    - condition: on_failure
      target: FAILED

Completion Strategies

Choose completion based on how strict you want the gate to be:

# All steps must pass (default for CI gates)
completion: all_succeed

# Continue if any step passes (e.g. multi-language build, one succeeds)
completion: any_succeed

# Always proceed; inspect per-step results downstream
completion: best_effort
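
As a sketch of the best_effort case, the fragment below always moves on and lets a downstream agent decide which failures matter. It rests on two assumptions the text above does not confirm: that each parallel step exposes exit_code under its own key in the same way stdout and stderr are keyed, and that on_success is the transition taken under best_effort. The state names CHECKS and TRIAGE and the target DONE are illustrative.

```yaml
CHECKS:
  kind: ParallelContainerRun
  steps:
    - name: unit-tests
      image: "rust:1.75-alpine"
      command: ["cargo", "test", "--workspace"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
    - name: lint
      image: "rust:1.75-alpine"
      command: ["cargo", "clippy", "--all-targets", "--", "-D", "warnings"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
  completion: best_effort          # proceed regardless of per-step results
  transitions:
    - condition: on_success        # assumed to fire under best_effort
      target: TRIAGE

TRIAGE:
  kind: Agent
  agent: coder-v2
  # The per-step exit_code keys below are an assumption, not documented above.
  input: |
    Decide which of these results should block the release.
    Unit tests exit code: {{CHECKS.output.unit-tests.exit_code}}
    Unit tests output:    {{CHECKS.output.unit-tests.stdout}}
    Lint exit code:       {{CHECKS.output.lint.exit_code}}
    Lint output:          {{CHECKS.output.lint.stderr}}
  volumes:
    - name: workspace
      mount_path: /workspace
  transitions:
    - condition: on_success
      target: DONE
    - condition: on_failure
      target: FAILED
```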

Step 3: Deploying Artifacts

Container steps can authenticate to private registries using credentials stored in the secrets vault. Reference them with registry_credentials:

DEPLOY:
  kind: ContainerRun
  name: "Build and Push Image"
  image: "docker:24-cli"
  command:
    - sh
    - -c
    - |
      docker build \
        -t {{blackboard.registry}}/{{blackboard.app_name}}:{{blackboard.version}} \
        /workspace
      docker push \
        {{blackboard.registry}}/{{blackboard.app_name}}:{{blackboard.version}}
  shell: true
  registry_credentials: "secret:cicd/docker-registry"
  env:
    DOCKER_HOST: "tcp://dockerd:2375"
  volumes:
    - name: workspace
      mount_path: /workspace
  resources:
    timeout: "15m"
  transitions:
    - condition: exit_code_zero
      target: NOTIFY_SUCCESS
    - condition: exit_code_non_zero
      target: NOTIFY_FAILURE

registry_credentials is a path in the secrets vault. The orchestrator resolves the credentials at execution time and injects them into the container environment — your workflow manifest never contains raw secrets.


Step 4: Full Delivery Pipeline

Putting it all together — a pipeline where an agent generates code, CI/CD steps compile and test it, AI agents review it, and a final step deploys it:

apiVersion: 100monkeys.ai/v1
kind: Workflow
metadata:
  name: full-delivery-pipeline
  version: "1.0.0"
  description: "Agent-augmented CI/CD: generate → build → test → review → deploy"
  labels:
    purpose: cicd

spec:
  initial_state: GENERATE_CODE

  volumes:
    - name: workspace
      persistent: true

  states:

    # ── Stage 1: AI code generation ──────────────────────────────────────────

    GENERATE_CODE:
      kind: Agent
      agent: coder-v2
      input: |
        Implement: {{workflow.task}}
        Requirements: {{blackboard.requirements}}
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: NOTIFY_FAILURE

    # ── Stage 2: Compile ──────────────────────────────────────────────────────

    BUILD:
      kind: ContainerRun
      name: "Compile"
      image: "rust:1.75-alpine"
      command: ["cargo", "build", "--release"]
      workdir: "/workspace"
      volumes:
        - name: workspace
          mount_path: /workspace
      resources:
        cpu: 2000
        memory: "4Gi"
        timeout: "10m"
      transitions:
        - condition: exit_code_zero
          target: TEST
        - condition: exit_code_non_zero
          target: FIX_BUILD_ERRORS
          feedback: "Build failed:\n{{BUILD.output.stderr}}"

    FIX_BUILD_ERRORS:
      kind: Agent
      agent: coder-v2
      input: |
        Build failed. Fix the compilation errors:
        {{state.feedback}}
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: NOTIFY_FAILURE

    # ── Stage 3: Test + lint in parallel ────────────────────────────────────

    TEST:
      kind: ParallelContainerRun
      steps:
        - name: unit-tests
          image: "rust:1.75-alpine"
          command: ["cargo", "test", "--workspace"]
          workdir: "/workspace"
          volumes:
            - name: workspace
              mount_path: /workspace
          resources:
            timeout: "5m"
        - name: lint
          image: "rust:1.75-alpine"
          command: ["cargo", "clippy", "--all-targets", "--", "-D", "warnings"]
          workdir: "/workspace"
          volumes:
            - name: workspace
              mount_path: /workspace
          resources:
            timeout: "3m"
        - name: format-check
          image: "rust:1.75-alpine"
          command: ["cargo", "fmt", "--check"]
          workdir: "/workspace"
          volumes:
            - name: workspace
              mount_path: /workspace
      completion: all_succeed
      transitions:
        - condition: on_success
          target: REVIEW
        - condition: on_failure
          target: FIX_TEST_ERRORS

    FIX_TEST_ERRORS:
      kind: Agent
      agent: coder-v2
      input: |
        Tests or linting failed. Fix all issues.
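        Unit tests (stdout): {{TEST.output.unit-tests.stdout}}
        Unit tests (stderr): {{TEST.output.unit-tests.stderr}}
        Lint:                {{TEST.output.lint.stderr}}
        Format (stdout):     {{TEST.output.format-check.stdout}}
        Format (stderr):     {{TEST.output.format-check.stderr}}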
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: NOTIFY_FAILURE

    # ── Stage 4: Multi-agent review ──────────────────────────────────────────

    REVIEW:
      kind: ParallelAgents
      agents:
        - agent: security-auditor
          input: "Security audit the code in /workspace."
          weight: 1.5
          timeout_seconds: 300
        - agent: architecture-reviewer
          input: "Review the architecture and SOLID compliance."
          weight: 1.0
          timeout_seconds: 180
      consensus:
        strategy: weighted_average
        threshold: 0.8
        min_judges_required: 2
      transitions:
        - condition: consensus
          threshold: 0.8
          agreement: 0.75
          target: DEPLOY
        - condition: score_below
          threshold: 0.8
          target: FIX_REVIEW_ISSUES
          feedback: "Review score: {{REVIEW.consensus.score}}\n{{REVIEW.agents.0.output}}"

    FIX_REVIEW_ISSUES:
      kind: Agent
      agent: coder-v2
      input: |
        Review feedback:
        {{state.feedback}}
        Address all identified issues.
      volumes:
        - name: workspace
          mount_path: /workspace
      transitions:
        - condition: on_success
          target: BUILD
        - condition: on_failure
          target: NOTIFY_FAILURE

    # ── Stage 5: Deploy ──────────────────────────────────────────────────────

    DEPLOY:
      kind: ContainerRun
      name: "Push to Registry"
      image: "docker:24-cli"
      command:
        - sh
        - -c
        - |
          docker build -t {{blackboard.registry}}/{{blackboard.app_name}}:{{blackboard.version}} /workspace
          docker push {{blackboard.registry}}/{{blackboard.app_name}}:{{blackboard.version}}
      shell: true
      env:
        DOCKER_HOST: "tcp://dockerd:2375"
      volumes:
        - name: workspace
          mount_path: /workspace
      registry_credentials: "secret:cicd/docker-registry"
      resources:
        timeout: "15m"
      transitions:
        - condition: exit_code_zero
          target: NOTIFY_SUCCESS
        - condition: exit_code_non_zero
          target: NOTIFY_FAILURE

    # ── Terminal states ───────────────────────────────────────────────────────

    NOTIFY_SUCCESS:
      kind: System
      command: "echo 'Pipeline completed successfully'"
      transitions: []

    NOTIFY_FAILURE:
      kind: System
      command: "echo 'Pipeline failed'"
      transitions: []

Using Private Registry Images

If your CI/CD steps use images from a private registry (e.g. ghcr.io/myorg/builder:v2), set image_pull_policy and provide registry_credentials:

BUILD:
  kind: ContainerRun
  name: "Custom Build Image"
  image: "ghcr.io/myorg/builder:v2"
  image_pull_policy: Always              # Always | IfNotPresent | Never
  registry_credentials: "secret:cicd/ghcr-token"
  command: ["make", "release"]
  workdir: "/workspace"
  volumes:
    - name: workspace
      mount_path: /workspace
  transitions:
    - condition: exit_code_zero
      target: DONE
    - condition: exit_code_non_zero
      target: FAILED

Store the registry credential in the secrets vault and reference it with the secret: prefix. The orchestrator resolves the credential at runtime and never exposes it to the container process environment or logs.


Retrying Flaky Steps

Container steps support retry configuration independent of the workflow FSM. This is useful for steps like pushing to a remote registry where transient network errors are expected:

PUSH_ARTIFACT:
  kind: ContainerRun
  name: "Push Artifact"
  image: "amazon/aws-cli"
  command: ["s3", "cp", "/workspace/dist/app.tar.gz", "s3://my-bucket/releases/"]
  workdir: "/workspace"
  volumes:
    - name: workspace
      mount_path: /workspace
  retry:
    max_attempts: 3
    backoff: "10s"
  transitions:
    - condition: exit_code_zero
      target: DONE
    - condition: exit_code_non_zero
      target: FAILED

max_attempts is the total number of attempts (including the first). backoff is the delay before the first retry; subsequent retries double the backoff (exponential).
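
To make the doubling rule concrete, here is a worked schedule for a slightly different configuration than the one above. The values are illustrative, and the timings follow only from the description of max_attempts and backoff given here.

```yaml
retry:
  max_attempts: 4    # 1 initial attempt + up to 3 retries
  backoff: "5s"      # wait before the first retry; doubles after that

# Worst-case schedule if every attempt fails:
#   attempt 1: runs immediately
#   attempt 2: after a 5s wait
#   attempt 3: after a 10s wait
#   attempt 4: after a 20s wait
# Total time spent waiting: 5 + 10 + 20 = 35s, plus the run time of each attempt.
```

With the PUSH_ARTIFACT settings (max_attempts: 3, backoff: "10s"), the same rule gives waits of 10s and 20s, for 30s of waiting in the worst case.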


Inspecting Pipeline Output

Synapse UI

The Synapse real-time execution viewer renders each ContainerRun step the same way it renders agent iterations — you can see the container image, command, exit code, stdout, and stderr as they stream in.

CLI

# Follow the event stream
aegis workflow logs <execution-id> --follow

For point-in-time execution status, query the HTTP API instead:

GET /v1/workflows/executions/{execution_id}

Common Images for CI/CD Steps

| Use Case              | Image                   |
| --------------------- | ----------------------- |
| Rust compilation      | rust:1.75-alpine        |
| Node.js build / test  | node:20-alpine          |
| Python test           | python:3.11-slim        |
| Docker build + push   | docker:24-cli           |
| AWS CLI deploy        | amazon/aws-cli          |
| kubectl deploy        | bitnami/kubectl:latest  |
| Helm chart deploy     | alpine/helm:3.14        |
| Terraform apply       | hashicorp/terraform:1.7 |
| Generic shell scripts | alpine:3.19             |

Any image accessible from the orchestrator node's Docker daemon (or from a registry the node can pull from) works.
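
Swapping toolchains is mostly a matter of changing image and command. The sketch below adapts the Rust test step to the Python image from the table; the state name PY_TEST and the targets DONE and FAILED are illustrative, and it assumes the tests in /workspace need only the standard library.

```yaml
PY_TEST:
  kind: ContainerRun
  name: "Python unit tests"
  image: "python:3.11-slim"
  # Standard-library test discovery; assumes no third-party dependencies.
  command: ["python", "-m", "unittest", "discover"]
  workdir: "/workspace"
  volumes:
    - name: workspace
      mount_path: /workspace
  resources:
    timeout: "5m"
  transitions:
    - condition: exit_code_zero
      target: DONE
    - condition: exit_code_non_zero
      target: FAILED
```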


Troubleshooting

Step times out before completion

Increase resources.timeout. The default is 5m. Long builds may need 30m or more:

resources:
  timeout: "30m"

Large build outputs are truncated

stdout and stderr are capped at 1 MB per step. For large outputs, write results to a volume file and read them from a later state:

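BUILD:
  kind: ContainerRun
  image: "rust:1.75-alpine"
  command:
    - sh
    - -c
    # POSIX sh has no PIPESTATUS, so write to the log, save cargo's exit
    # code, replay the log to stdout, then exit with the saved code.
    - "cargo build --release > /workspace/build.log 2>&1; rc=$?; cat /workspace/build.log; exit $rc"
  shell: true
  workdir: "/workspace"
  volumes:
    - name: workspace
      mount_path: /workspace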

A later agent state can then read /workspace/build.log.
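
One way that later state might look, as a sketch built from the Agent fields used elsewhere in this guide. The state name ANALYZE_BUILD_LOG is illustrative, and it assumes the coder-v2 agent can read files from its mounted volume.

```yaml
ANALYZE_BUILD_LOG:
  kind: Agent
  agent: coder-v2
  input: |
    The build output was too large to pass inline.
    Read the full log at /workspace/build.log, find the first error,
    and fix the code in /workspace.
  volumes:
    - name: workspace
      mount_path: /workspace     # same volume the BUILD step wrote the log to
  transitions:
    - condition: on_success
      target: BUILD
    - condition: on_failure
      target: FAILED
```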

Image pull fails

Verify registry_credentials resolves to a valid credential in the vault and that the secret path is correct. Check the Synapse event stream for ContainerRunFailed events with reason: ImagePullFailed.

