Deploying & Running Agents
Full agent lifecycle management via the AEGIS CLI — deploy, pause, resume, delete, execute, and monitor.
This guide covers the complete agent lifecycle using the aegis CLI: deploying manifests, managing agent state, triggering executions, streaming iteration output, and inspecting execution history.
All operations assume the AEGIS daemon is running and reachable. Pass --daemon-addr <host:port> to target a remote daemon; without it, the CLI connects to localhost:9090.
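Since every command depends on the daemon being reachable, a pre-flight check can save confusing failures in scripts. The following is a minimal Python sketch, assuming the daemon accepts plain TCP connections at the given host:port (this guide does not state which protocol it speaks). daemon_reachable is a hypothetical helper, not part of the aegis CLI, and a successful connection only shows that something is listening on that port.

```python
import socket


def daemon_reachable(addr: str = "localhost:9090", timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Assumption: the daemon listens on plain TCP at this address.
    """
    host, _, port = addr.rpartition(":")
    try:
        # strip("[]") tolerates bracketed IPv6 literals such as [::1]:9090
        with socket.create_connection((host.strip("[]"), int(port)), timeout=timeout):
            return True
    except (OSError, ValueError):
        # OSError: refused, unreachable, or timed out; ValueError: malformed port
        return False
```

A script could call daemon_reachable("localhost:9090") before its first aegis command and exit early with a clear message if it returns False.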
Authoring Approaches
There are two ways to supply an agent's runtime:
| Approach | When to Use |
|---|---|
| Generic AEGIS runtime | Most agents. Define all behavior in the manifest via spec.task.instruction. No Dockerfile or custom image needed. |
| Custom container | Agents with complex dependencies, non-standard libraries, or a fully custom bootstrap.py. See Writing Your First Agent. |
Both approaches use identical lifecycle and execution commands. The difference is entirely in the manifest format.
Generic AEGIS Runtime (No Custom Container)
For most use cases you do not need to build or maintain a Docker image. Define the agent's behavior in the manifest and let the orchestrator supply the runtime.
How it works:
- The orchestrator starts a container from the AEGIS-managed Python base image.
- It injects bootstrap.py into the container automatically.
- It renders the final LLM prompt by substituting Handlebars variables in spec.task.prompt_template, replacing {{instruction}} with the manifest's spec.task.instruction text and {{input}} with the JSON passed at execution time.
- The rendered prompt is passed to bootstrap.py as argv[1].
- bootstrap.py calls the LLM proxy at /v1/llm/generate and streams back the response.
- If validation fails, {{previous_error}} is populated with the failure output and the next iteration begins automatically.
No Python file to write. No Docker build. No image registry.
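The retry behavior in the steps above can be pictured as a small loop. The sketch below only illustrates that control flow and is not the orchestrator's actual code. run_iterations, generate, and validate are hypothetical names, and the two fake_* functions stand in for the LLM call and the validator.

```python
from typing import Callable, Optional, Tuple


def run_iterations(
    generate: Callable[[int, str], str],
    validate: Callable[[str], Optional[str]],
    max_iterations: int = 5,
) -> Tuple[bool, int, str]:
    """Call generate() until validate() passes or the iteration limit is hit.

    generate(iteration_number, previous_error) returns the model output.
    validate(output) returns None on success or an error message on failure.
    """
    previous_error = ""  # empty on iteration 1
    output = ""
    for iteration in range(1, max_iterations + 1):
        output = generate(iteration, previous_error)
        error = validate(output)
        if error is None:
            return True, iteration, output
        previous_error = error  # becomes {{previous_error}} on the next attempt
    return False, max_iterations, output


# Stand-ins for the LLM call and the validator (illustration only).
def fake_generate(iteration: int, previous_error: str) -> str:
    return "not json" if iteration == 1 else '{"ok": true}'


def fake_validate(output: str) -> Optional[str]:
    return None if output.startswith("{") else "output is not a JSON object"


print(run_iterations(fake_generate, fake_validate))
```

With the generic runtime you never write this loop yourself; it is shown only to make the role of {{previous_error}} concrete.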
Example Manifest
apiVersion: 100monkeys.ai/v1
kind: Agent
metadata:
  name: pr-reviewer
  version: "1.0.0"
  description: "Reviews pull request diffs and returns structured feedback."
spec:
  runtime:
    language: python
    version: "3.11"
    isolation: docker
  task:
    instruction: |
      Review the provided code diff and return structured feedback covering:
      - Security vulnerabilities or concerns
      - Performance issues or opportunities
      - Code quality and maintainability
      - Idiomatic patterns and best practices
      Provide specific line references where relevant. Be concise and actionable.
    prompt_template: |
      {{instruction}}
      User: {{input}}
      Reviewer:
  execution:
    mode: iterative
    max_iterations: 5
  security:
    network:
      mode: allow
      allowlist:
        - api.github.com
    filesystem:
      read:
        - /workspace
      write:
        - /workspace/output
    resources:
      cpu: 1000
      memory: "1Gi"
      timeout: "300s"
  volumes:
    - name: workspace
      storage_class: ephemeral
      mount_point: /workspace
      access_mode: read-write
      size_limit_mb: 2048
      ttl_hours: 1
Deploy it:
aegis agent deploy ./pr-reviewer.yaml
Run an execution, passing the diff as the {{input}} context:
aegis execute \
--agent pr-reviewer \
--input '{"diff": "<git diff output>", "repo": "my-org/my-repo", "pr": 42}' \
--watch
The full contents of the --input JSON string become the {{input}} value the LLM sees. Structure it in whatever way suits your agent's instruction.
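Real diffs contain quotes and newlines that are awkward to embed in a shell string by hand. One way around this, sketched below in Python, is to serialize the payload with json.dumps and pass it as a single argv element. build_execute_argv is a hypothetical helper and the sample diff text is made up. Only the aegis execute flags shown above (--agent, --input, --watch) are used.

```python
import json
import shlex


def build_execute_argv(agent: str, payload: dict, watch: bool = False) -> list:
    """Build the argv for `aegis execute` with the payload JSON-encoded."""
    argv = ["aegis", "execute", "--agent", agent, "--input", json.dumps(payload)]
    if watch:
        argv.append("--watch")
    return argv


payload = {"diff": "-old line\n+new 'quoted' line", "repo": "my-org/my-repo", "pr": 42}
argv = build_execute_argv("pr-reviewer", payload, watch=True)

# shlex.join renders a copy-pasteable, correctly quoted shell command.
print(shlex.join(argv))
```

Passing argv as a list to subprocess.run avoids shell quoting altogether, which is usually the safer choice in CI scripts.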
Prompt Template Variables
| Variable | Populated With |
|---|---|
| {{instruction}} | spec.task.instruction text from the manifest. |
| {{input}} | JSON string passed via --input at execution time. |
| {{iteration_number}} | Current iteration number (1-based). |
| {{previous_error}} | Validator failure output or error from the previous iteration. Empty on iteration 1; injected automatically on retries. |
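To see how these variables end up in the final prompt, here is a deliberately simplified Python sketch of plain {{name}} substitution. It is not a Handlebars engine and not the orchestrator's renderer: it ignores block helpers and escaping rules. render_prompt is a hypothetical name and the instruction text is shortened for the example.

```python
import json
import re


def render_prompt(template: str, variables: dict) -> str:
    """Replace each {{name}} with variables[name]; unknown names become ''."""
    def substitute(match):
        return str(variables.get(match.group(1), ""))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


template = "{{instruction}}\nUser: {{input}}\nReviewer:"
prompt = render_prompt(template, {
    "instruction": "Review the provided code diff.",
    "input": json.dumps({"pr": 42}),
    "iteration_number": 1,
    "previous_error": "",  # empty on iteration 1
})
print(prompt)
```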
Use {{previous_error}} in the prompt template to give the LLM explicit feedback about what went wrong in earlier attempts:
task:
  prompt_template: |
    {{instruction}}
    User: {{input}}
    {% if previous_error %}
    Your previous attempt failed with the following error. Fix it:
    {{previous_error}}
    {% endif %}
    Reviewer:
Setting a Specific Model
By default the agent uses the default alias defined in aegis-config.yaml. Override per-agent with spec.runtime.model:
spec:
  runtime:
    language: python
    version: "3.11"
    model: reasoning # maps to an alias in aegis-config.yaml llm.aliases
    isolation: docker
Agent Lifecycle Commands
Deploy an Agent
aegis agent deploy ./my-agent/agent.yaml
On success, the agent is registered with status: deployed and assigned a UUID. The manifest is validated before acceptance; invalid manifests are rejected with a specific error.
Deployed agent "python-coder" (id: a1b2c3d4-0000-0000-0000-000000000001)
To deploy and immediately print the full agent record as JSON:
aegis agent deploy ./my-agent/agent.yaml --output json
List Agents
# Table format (default)
aegis agent list
# JSON format — useful for scripting
aegis agent list --output json
Example table output:
ID                 NAME           STATUS    RUNTIME  LABELS
a1b2c3d4-0000-...  python-coder   deployed  docker   type=developer
b2c3d4e5-0000-...  code-reviewer  paused    docker
c3d4e5f6-0000-...  security-scan  deployed  docker   team=security
Filter by status:
aegis agent list --status deployed
aegis agent list --status paused
Inspect an Agent
# By ID
aegis agent get a1b2c3d4-0000-0000-0000-000000000001
# By name
aegis agent get python-coder
Outputs the full agent record, including the stored manifest.
Update an Agent
aegis agent deploy ./my-agent/agent.yaml
Re-deploying an agent with the same metadata.name updates the manifest in place. Running executions are not affected: they continue using the manifest version that was active when they started.
Pause an Agent
Pausing prevents new executions but does not affect currently running ones.
aegis agent pause python-coder
# or by ID:
aegis agent pause a1b2c3d4-0000-0000-0000-000000000001
Resume a Paused Agent
aegis agent resume python-coder
Delete an Agent
Deleting an agent archives it (a soft delete): historical execution records are retained, but the agent itself cannot be restored.
aegis agent delete python-coder
To delete without the confirmation prompt:
aegis agent delete python-coder --yes
Running Executions
Start an Execution
aegis execute \
--agent python-coder \
--input '{"task": "Write a function that reverses a linked list."}'
The command returns the execution ID immediately:
Execution started: a1b2c3d4-1111-0000-0000-000000000001
Stream Execution Output
Use --watch to stream iteration events to the terminal:
aegis execute \
--agent python-coder \
--input '{"task": "Write a function that reverses a linked list."}' \
--watch
Example output:
[2026-02-23T10:00:01Z] Execution a1b2c3d4-1111-... Started
[2026-02-23T10:00:01Z] Iteration 1 Started
[2026-02-23T10:00:03Z] Tool: fs.write /workspace/solution.py (234 bytes)
[2026-02-23T10:00:04Z] Tool: cmd.run python /workspace/solution.py
[2026-02-23T10:00:05Z] Tool: fs.write /workspace/result.json (89 bytes)
[2026-02-23T10:00:06Z] Iteration 1 Completed
[2026-02-23T10:00:06Z] Validation: exit_code=PASS json_schema=PASS (score=1.0)
[2026-02-23T10:00:06Z] Execution a1b2c3d4-1111-... Completed (1 iteration, 5.2s)
Control Max Iterations
Override the default 10-iteration limit for a specific execution:
aegis execute \
--agent python-coder \
--input '{"task": "..."}' \
--max-iterations 3
Inspecting Executions
List Executions
# All recent executions
aegis execution list
# Filter by agent
aegis execution list --agent python-coder
# Filter by status
aegis execution list --status completed
aegis execution list --status failed
aegis execution list --status running
# Limit results
aegis execution list --limit 20
Get Execution Details
aegis execution get a1b2c3d4-1111-0000-0000-000000000001
Returns a full execution record, including:
- Status
- Start and end timestamps
- Total iterations
- Each iteration's status, duration, and validation scores
Get Iteration Logs
# All iterations
aegis execution logs a1b2c3d4-1111-0000-0000-000000000001
# Specific iteration
aegis execution logs a1b2c3d4-1111-0000-0000-000000000001 --iteration 2
Iteration logs include the LLM's final output text and any tool call summaries for that iteration.
Cancel a Running Execution
aegis execution cancel a1b2c3d4-1111-0000-0000-000000000001
Cancellation stops the current iteration's container, releases any held locks, and marks the execution as cancelled.
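Cancellation is useful as a safety net in automation: wait for an execution up to a deadline, then cancel it if it is still running. The Python sketch below assumes that aegis execution get <id> --output json returns an object with a status field, and that completed, failed, and cancelled are the terminal states. wait_or_cancel and its parameters are hypothetical names, and the status and cancel calls are injectable so the logic can be exercised without a daemon.

```python
import json
import subprocess
import time

TERMINAL = {"completed", "failed", "cancelled"}


def get_status(exec_id: str) -> str:
    """Read .status from `aegis execution get <id> --output json`."""
    out = subprocess.run(
        ["aegis", "execution", "get", exec_id, "--output", "json"],
        check=True, capture_output=True, text=True,
    ).stdout
    return json.loads(out)["status"]


def cancel(exec_id: str) -> None:
    """Request cancellation via `aegis execution cancel <id>`."""
    subprocess.run(["aegis", "execution", "cancel", exec_id], check=True)


def wait_or_cancel(exec_id, timeout_s=300.0, poll_s=5.0,
                   status_fn=get_status, cancel_fn=cancel,
                   clock=time.monotonic, sleep=time.sleep) -> str:
    """Poll until a terminal status; cancel once the deadline has passed."""
    deadline = clock() + timeout_s
    while True:
        status = status_fn(exec_id)
        if status in TERMINAL:
            return status
        if clock() >= deadline:
            cancel_fn(exec_id)
            # Not re-checked: assumes the cancel request took effect.
            return "cancelled"
        sleep(poll_s)
```

A call such as wait_or_cancel(exec_id, timeout_s=600) would block for at most roughly ten minutes plus one polling interval.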
Scripting with JSON Output
All CLI commands support --output json for machine-readable output. This is useful for CI/CD pipelines:
# Deploy and capture the agent ID
AGENT_ID=$(aegis agent deploy ./agent.yaml --output json | jq -r '.id')
# Start an execution and capture the execution ID
EXEC_ID=$(aegis execute --agent python-coder \
--input '{"task": "..."}' \
--output json | jq -r '.id')
# Poll until complete
while true; do
  # Quote the ID so the command stays well-formed even if the variable is empty
  STATUS=$(aegis execution get "$EXEC_ID" --output json | jq -r '.status')
  echo "Status: $STATUS"
  if [[ "$STATUS" == "completed" || "$STATUS" == "failed" || "$STATUS" == "cancelled" ]]; then
    break
  fi
  sleep 5
done
echo "Final status: $STATUS"
Configuration and Flags
| Flag | Description |
|---|---|
| --config <path> | Path to aegis-config.yaml. Defaults to ./aegis-config.yaml. |
| --daemon-addr <host:port> | Target a specific daemon. Defaults to localhost:9090. |
| --output json\|table\|yaml | Output format. Defaults to table. |
| --yes | Skip confirmation prompts. |