Security Model

The two-layer AEGIS security model — infrastructure policy enforcement and SEAL protocol-level security.

AEGIS enforces security at two independent layers that work in concert: infrastructure-level policy (declared in the agent manifest) and SEAL (Signed Envelope Attestation Layer, enforced at the tool call level). Both layers must be satisfied for any agent operation to proceed.

Some deployments also enable execution.tool_validation, a semantic pre-dispatch judge for selected tool calls. This is a safety control, not a third hard-security boundary. It can block a low-quality or dangerous tool intent before side effects occur, but it does not grant access, expand capabilities, or replace manifest policy or SEAL authorization.


Layer 1: Infrastructure Policy

Infrastructure policy is declared in the agent manifest under spec.security. It is evaluated by the orchestrator before the container starts and enforced by runtime and network controls during execution.

Network Policy

Controls which external domains the agent container is permitted to reach.

security:
  network:
    mode: allow          # "allow" = allowlist; "deny" = blocklist
    allowlist:
      - api.github.com
      - api.openai.com
      - pypi.org

In allow mode, outbound connections to any domain not in the allowlist are blocked at the container network layer. In deny mode, the list acts as a blocklist: connections to listed domains are blocked and all others are permitted.
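
The two modes can be sketched in a few lines of Python. This is an illustrative model only, not the orchestrator's implementation: the `NetworkPolicy` class and its `permits` method are hypothetical names, and exact hostname matching is an assumption (the source does not say how subdomains or wildcards are handled).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NetworkPolicy:
    """Hypothetical model of spec.security.network (names are assumptions)."""
    mode: str                       # "allow" = allowlist, "deny" = blocklist
    hosts: frozenset = frozenset()  # domains listed in the manifest

    def permits(self, host: str) -> bool:
        # Assumption: hosts are compared by exact, case-insensitive match.
        listed = host.lower().rstrip(".") in self.hosts
        if self.mode == "allow":
            return listed        # only listed domains are reachable
        if self.mode == "deny":
            return not listed    # everything except listed domains is reachable
        raise ValueError(f"unknown network mode: {self.mode!r}")


allow_policy = NetworkPolicy("allow", frozenset({"api.github.com", "api.openai.com", "pypi.org"}))
deny_policy = NetworkPolicy("deny", frozenset({"pypi.org"}))
```

With these two policies, `allow_policy.permits("example.com")` is false while `deny_policy.permits("example.com")` is true, which is the inversion the `mode` field controls.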

Filesystem Policy

Controls which paths inside the container the agent may read from or write to.

security:
  filesystem:
    read:
      - /workspace
      - /agent
    write:
      - /workspace

Path restrictions are enforced by the AEGIS storage gateway at the AegisFSAL layer, not by kernel permissions. The UID and GID of the running process are therefore irrelevant; the gateway checks the path-prefix allowlists on every filesystem operation.
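
A prefix check of this kind might look like the sketch below. The function name `is_path_allowed` is hypothetical, and the normalization step is an assumption about how a gateway would typically guard against `..` traversal and look-alike prefixes such as `/workspace-evil`; the source only states that path prefixes are checked.

```python
import posixpath


def is_path_allowed(path: str, allowlist: list[str]) -> bool:
    """Return True if `path` falls under one of the allowlisted prefixes."""
    if not path.startswith("/"):
        return False  # assumption: only absolute container paths are accepted
    # Collapse "." and ".." segments before comparing prefixes.
    normalized = posixpath.normpath(path)
    for prefix in allowlist:
        root = posixpath.normpath(prefix)
        # Match the root itself, or anything below it on a path-segment boundary.
        if normalized == root or normalized.startswith(root.rstrip("/") + "/"):
            return True
    return False
```

Comparing on a segment boundary (the trailing `/`) is what keeps `/workspace-evil/x` from matching the `/workspace` prefix.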

Resource Limits

CPU and memory ceilings prevent resource exhaustion.

security:
  resources:
    cpu: 1000              # millicores (1000 = 1 CPU core)
    memory: "1Gi"         # human-readable memory limit
    timeout: "300s"       # hard wall-clock timeout for the entire execution
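
To make the units concrete, the following sketch converts the three values into plain numbers. The helper names are hypothetical, and the supported suffixes (binary `Ki`/`Mi`/`Gi` for memory, `s` for timeouts) are an assumption based on the examples shown; the actual parser may accept more forms.

```python
import re

# Assumption: memory uses binary suffixes, as "1Gi" in the example suggests.
_MEMORY_UNITS = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30}


def parse_memory(value: str) -> int:
    """Convert a string such as "1Gi" into a byte count."""
    match = re.fullmatch(r"(\d+)(Ki|Mi|Gi)", value)
    if match is None:
        raise ValueError(f"unsupported memory value: {value!r}")
    return int(match.group(1)) * _MEMORY_UNITS[match.group(2)]


def parse_timeout(value: str) -> int:
    """Convert a string such as "300s" into whole seconds."""
    match = re.fullmatch(r"(\d+)s", value)
    if match is None:
        raise ValueError(f"unsupported timeout value: {value!r}")
    return int(match.group(1))


def millicores_to_cores(millicores: int) -> float:
    """1000 millicores equal one CPU core."""
    return millicores / 1000
```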

Layer 2: SEAL (Signed Envelope Attestation Layer)

SEAL is the protocol-level security layer that governs all MCP tool calls. Every tool invocation from an agent is wrapped in a cryptographically signed SealEnvelope. The orchestrator verifies the signature and evaluates the call against the agent's assigned SecurityContext before forwarding to any tool server.

Key Concepts

| Concept | Description |
| --- | --- |
| SecurityContext | A named set of permitted tool capabilities defined in node config (e.g., "aegis-system-agent-runtime", "research-safe", "aegis-system-operator"). |
| SecurityToken | A short-lived JWT issued by the orchestrator at attestation time, scoping the agent to its SecurityContext. |
| SealEnvelope | Every tool call is wrapped in an envelope containing the SecurityToken, an Ed25519 signature, and the inner MCP payload. |
| PolicyEngine | Cedar-based rule evaluator that checks each tool call against the SecurityContext capabilities. |

Attestation Flow

At agent startup, bootstrap.py performs a one-time attestation:

  1. Bootstrap generates an Ed25519 keypair. The private key exists only in process memory and is never written to disk.
  2. Bootstrap sends an AttestationRequest (public key + container ID) to the orchestrator.
  3. The orchestrator verifies the container ID is a known live execution, then issues a SecurityToken (JWT signed by the orchestrator's root key via OpenBao) scoped to the agent's security_context from the manifest.
  4. Bootstrap receives the SecurityToken and uses it, together with the private key, to sign all subsequent tool call envelopes.
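
The orchestrator's side of steps 2 and 3 can be sketched as follows. This is a simplified model under stated assumptions: `AttestationService`, `TokenClaims`, and the 300-second lifetime are hypothetical, and the JWT encoding and root-key signing via OpenBao are omitted entirely, so the function returns plain claims rather than a signed token.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AttestationRequest:
    public_key: bytes   # Ed25519 public key generated by bootstrap
    container_id: str


@dataclass(frozen=True)
class TokenClaims:
    execution_id: str
    security_context: str
    expires_at: float   # Unix timestamp


class AttestationService:
    """Hypothetical stand-in for the orchestrator's attestation endpoint."""

    def __init__(self, live_executions: dict, ttl_seconds: float = 300.0):
        # container_id -> (execution_id, security_context from the manifest)
        self._live = live_executions
        self._ttl = ttl_seconds  # assumed lifetime; the real value is not documented here
        self.registered_keys: dict = {}  # execution_id -> public key

    def attest(self, request: AttestationRequest, now: Optional[float] = None) -> Optional[TokenClaims]:
        now = time.time() if now is None else now
        execution = self._live.get(request.container_id)
        if execution is None:
            return None  # container ID is not a known live execution
        execution_id, security_context = execution
        # Remember the public key so later envelope signatures can be verified.
        self.registered_keys[execution_id] = request.public_key
        return TokenClaims(execution_id, security_context, now + self._ttl)
```

Registering the public key at this point is what lets the per-call check verify signatures without the private key ever leaving the agent process.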

Per-Call Authorization

On every tool call:

  1. SealMiddleware in the orchestrator receives the SealEnvelope.
  2. It verifies the Ed25519 signature against the public key registered at attestation.
  3. It decodes the SecurityToken and verifies it is not expired and matches the current execution_id.
  4. It passes the tool name and parameters to the PolicyEngine.
  5. The PolicyEngine evaluates Cedar rules for the tool pattern against the SecurityContext capabilities:
    • Does the capability list include a pattern matching this tool name?
    • If the call is a filesystem operation, is the path within the allowlist for this capability?
    • Has the rate limit for this tool been exceeded?
  6. If all checks pass, the tool call is forwarded to the appropriate routing path.
  7. Any failure at steps 2–5 emits a PolicyViolationBlocked event and returns an error to the agent.
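
The ordering of these checks can be sketched as a single function. Everything here is illustrative: `Envelope`, `Decision`, and `authorize` are hypothetical names, signature verification and policy evaluation are passed in as callables rather than implemented, and a denied `Decision` stands in for the PolicyViolationBlocked event plus the error returned to the agent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Envelope:
    token_execution_id: str   # execution_id claim from the SecurityToken
    token_expires_at: float   # expiry claim, as a Unix timestamp
    signature: bytes          # Ed25519 signature over the payload
    tool: str
    params: dict


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def authorize(
    envelope: Envelope,
    current_execution_id: str,
    now: float,
    verify_signature: Callable[[Envelope], bool],
    policy_allows: Callable[[str, dict], bool],
) -> Decision:
    # Step 2: signature must verify against the key registered at attestation.
    if not verify_signature(envelope):
        return Decision(False, "signature verification failed")
    # Step 3: token must be unexpired and bound to the current execution.
    if envelope.token_expires_at <= now:
        return Decision(False, "security token expired")
    if envelope.token_execution_id != current_execution_id:
        return Decision(False, "token does not match execution")
    # Steps 4-5: the policy engine decides on tool name and parameters.
    if not policy_allows(envelope.tool, envelope.params):
        return Decision(False, f"policy denied {envelope.tool}")
    # Step 6: all checks passed, so the call can be forwarded.
    return Decision(True, "forward")
```

The checks run cheapest-and-most-fundamental first, so a call with a bad signature never reaches policy evaluation.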

Credential Isolation

Agents never receive API keys, database credentials, or other secrets. The orchestrator resolves credentials from OpenBao (see Secrets Management) and injects them directly into outbound tool call requests — invisible to the agent process. This is enforced architecturally: the credential resolution happens in the orchestrator host process after the SealEnvelope is verified.
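
The shape of this injection step might look like the sketch below. It assumes a bearer-token `Authorization` header and a `resolve_secret` callable standing in for the OpenBao lookup; both are hypothetical, as is the function name, and real tool servers may expect credentials in other forms.

```python
from typing import Callable


def build_outbound_request(agent_payload: dict, resolve_secret: Callable[[str], str]) -> dict:
    """Build the request sent to the tool server, adding credentials host-side."""
    tool = agent_payload["tool"]
    return {
        "tool": tool,
        # Copy the parameters so the agent-supplied payload is never mutated.
        "params": dict(agent_payload["params"]),
        # The secret exists only in this orchestrator-side structure.
        "headers": {"Authorization": f"Bearer {resolve_secret(tool)}"},
    }
```

Because the credential is added to a new structure after verification, nothing the agent sent or can read back ever contains it.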

Non-Repudiation

Because every tool call is signed with the agent's ephemeral Ed25519 private key, and only the orchestrator can issue SecurityTokens, there is a cryptographic audit trail proving which agent made which tool call. An agent cannot deny a tool invocation it made.


SecurityContext Configuration

SecurityContext definitions are declared in aegis-config.yaml:

security_contexts:
  - name: default
    capabilities:
      - tool: "fs.*"
        path_allowlist:
          - /workspace
        rate_limit:
          calls_per_minute: 100
      - tool: "cmd.run"
        rate_limit:
          calls_per_minute: 30
      - tool: "web.search"
        rate_limit:
          calls_per_minute: 10

  - name: restricted
    capabilities:
      - tool: "fs.read"
        path_allowlist:
          - /workspace
        rate_limit:
          calls_per_minute: 50

  - name: privileged
    capabilities:
      - tool: "fs.*"
        path_allowlist:
          - /workspace
          - /shared
      - tool: "cmd.run"
        rate_limit:
          calls_per_minute: 100
      - tool: "web.*"
        rate_limit:
          calls_per_minute: 50
      - tool: "github.*"
        rate_limit:
          calls_per_minute: 20

An agent's manifest references a context by name:

spec:
  security:
    security_context: default
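
To show how a context of this shape could be evaluated, here is a minimal Python sketch using the `default` context. It is not the Cedar-based PolicyEngine: `ContextEvaluator` is a hypothetical name, glob matching via `fnmatch` is an assumed reading of patterns like `fs.*`, and applying the rate limit per capability entry over a sliding 60-second window is also an assumption.

```python
import fnmatch
import posixpath
from collections import defaultdict, deque

# Mirrors the "default" context declared above.
DEFAULT_CONTEXT = [
    {"tool": "fs.*", "path_allowlist": ["/workspace"], "calls_per_minute": 100},
    {"tool": "cmd.run", "calls_per_minute": 30},
    {"tool": "web.search", "calls_per_minute": 10},
]


def _within(path: str, allowlist: list) -> bool:
    """True if the normalized absolute path sits under an allowlisted prefix."""
    if not path.startswith("/"):
        return False
    normalized = posixpath.normpath(path)
    roots = (posixpath.normpath(root) for root in allowlist)
    return any(normalized == r or normalized.startswith(r.rstrip("/") + "/") for r in roots)


class ContextEvaluator:
    def __init__(self, capabilities: list):
        self._capabilities = capabilities
        self._recent = defaultdict(deque)  # tool pattern -> timestamps of allowed calls

    def allows(self, tool: str, params: dict, now: float) -> bool:
        # 1. Find the first capability whose pattern matches the tool name.
        cap = next((c for c in self._capabilities if fnmatch.fnmatchcase(tool, c["tool"])), None)
        if cap is None:
            return False
        # 2. If the capability carries a path allowlist, the path must fall inside it.
        allowlist = cap.get("path_allowlist")
        if allowlist is not None and not _within(params.get("path", ""), allowlist):
            return False
        # 3. Enforce calls_per_minute over a sliding 60-second window.
        limit = cap.get("calls_per_minute")
        if limit is not None:
            calls = self._recent[cap["tool"]]
            while calls and now - calls[0] >= 60.0:
                calls.popleft()
            if len(calls) >= limit:
                return False
            calls.append(now)
        return True
```

Under this reading, an agent in the `default` context can read `/workspace/notes.txt` but is denied any `github.*` tool, since no capability pattern matches it.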

Operator-Level SecurityContext

Platform operators use the aegis-system-operator SecurityContext, which provides a superset of the enterprise consumer tier. In addition to all safe commands (filesystem, shell, web, agent creation, workflow execution), it grants access to two additional tool categories:

Destructive operations — permanent deletions that are never available to consumer tiers:

  • aegis.agent.delete
  • aegis.workflow.delete
  • aegis.task.remove

Orchestrator commands — system introspection and runtime configuration:

  • aegis.system.info
  • aegis.system.config

Tools are classified into three categories at the SEAL gateway level:

| Category | Consumer tiers | Operator |
| --- | --- | --- |
| Safe commands | ✓ (all tiers) | ✓ |
| Destructive commands | ✗ | ✓ |
| Orchestrator commands | ✗ | ✓ |
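
The category gate can be sketched in Python using the tool names listed above. The names `ToolCategory`, `classify`, and `category_permitted` are hypothetical, and treating every unlisted tool as a safe command is an assumption made for the sketch. The gate covers only the category check; a caller's SecurityContext still has to grant the specific tool.

```python
from enum import Enum


class ToolCategory(Enum):
    SAFE = "safe"
    DESTRUCTIVE = "destructive"
    ORCHESTRATOR = "orchestrator"


DESTRUCTIVE_TOOLS = frozenset({"aegis.agent.delete", "aegis.workflow.delete", "aegis.task.remove"})
ORCHESTRATOR_TOOLS = frozenset({"aegis.system.info", "aegis.system.config"})


def classify(tool: str) -> ToolCategory:
    if tool in DESTRUCTIVE_TOOLS:
        return ToolCategory.DESTRUCTIVE
    if tool in ORCHESTRATOR_TOOLS:
        return ToolCategory.ORCHESTRATOR
    # Assumption for this sketch: anything not listed counts as a safe command.
    return ToolCategory.SAFE


def category_permitted(tool: str, is_operator: bool) -> bool:
    """Operators pass every category; consumer tiers pass only safe commands."""
    return is_operator or classify(tool) is ToolCategory.SAFE
```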

Operator identity is resolved from aegis-system realm JWTs carrying aegis_role claims. See IAM & Identity Federation for the full authentication flow.


Chat Surface vs. Execution Surface

AEGIS applies security contexts across two independent surfaces. The context in effect depends on where the tool call is being made, not who triggered the chain.

Chat / MCP surface

When a user interacts via the Zaru client or any MCP client, their tool calls are evaluated against their tier context (zaru-free, zaru-pro, zaru-business, or zaru-enterprise). These contexts control which aegis tools the user may invoke — for example, whether they can call aegis.task.execute, signal a workflow, or access the agent registry. Zaru tier contexts deliberately do not grant filesystem or shell access; they are scoped to the orchestration API surface only.

Execution surface

When a user calls aegis.task.execute (or any tool that spawns an agent container), the resulting container does not run under the caller's tier context. It runs under the aegis-system-agent-runtime context — a built-in context that grants:

  • fs.* scoped to /workspace via path_allowlist
  • cmd.run — shell command execution inside the container
  • web.* — outbound web access
  • aegis read and execution tools (aegis.agent.get, aegis.workflow.list, aegis.task.execute, etc.)

This separation is the core of the two-surface model: the user's tier governs what they can ask the platform to do; aegis-system-agent-runtime governs what agent containers are allowed to do on the user's behalf. A Free-tier user whose context blocks direct shell access can still have an agent container run shell commands in /workspace — because those commands execute under aegis-system-agent-runtime, not zaru-free.

The aegis-system-agent-runtime context is a platform built-in. It does not need to be defined in aegis-config.yaml.
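
The selection rule can be stated compactly in code. This is a sketch of the rule as described above, not the platform's resolver: `Surface` and `effective_context` are hypothetical names, and only the context names come from the text.

```python
from enum import Enum


class Surface(Enum):
    CHAT = "chat"            # Zaru client or any MCP client
    EXECUTION = "execution"  # inside a spawned agent container


AGENT_RUNTIME_CONTEXT = "aegis-system-agent-runtime"
TIER_CONTEXTS = frozenset({"zaru-free", "zaru-pro", "zaru-business", "zaru-enterprise"})


def effective_context(surface: Surface, caller_tier_context: str) -> str:
    """Pick the SecurityContext from where the call is made, not who started the chain."""
    if caller_tier_context not in TIER_CONTEXTS:
        raise ValueError(f"unknown tier context: {caller_tier_context!r}")
    if surface is Surface.EXECUTION:
        # Agent containers never inherit the caller's tier context.
        return AGENT_RUNTIME_CONTEXT
    return caller_tier_context
```

A Free-tier chat call resolves to `zaru-free`, while any tool call made from the container that user spawned resolves to `aegis-system-agent-runtime`.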


Agent Generation and Context Escalation

When a consumer triggers aegis.agent.generate, the resulting agent-creator-agent execution always runs in the aegis-system-default security context — not the caller's tier context. This is intentional: the generation workflow requires unrestricted tool access (tool_pattern: "*") to author, validate, and register the new agent manifest. The caller's tier determines whether they are permitted to invoke aegis.agent.generate at all, but it does not constrain the execution context of the generation agent itself.

This means the agent-creator-agent can reach any tool permitted by the aegis-system-default context, regardless of whether the consumer that triggered it would normally have access to those tools.


Runtime Isolation

The security model is strengthened by the underlying container runtime. AEGIS supports three isolation tiers, selectable per node via default_isolation in the node configuration:

Docker (development/staging):

  • Container runs with default seccomp profile.
  • Network isolation via Docker bridge networks.
  • No CAP_SYS_ADMIN required — AEGIS volumes are mounted over NFS, not FUSE.

Podman (managed/production):

  • Rootless mode eliminates the privileged daemon — no root-owned socket, no root-running background process. A compromised container cannot escalate to a host-level daemon.
  • User-namespace isolation maps container UID 0 to an unprivileged host UID, providing an additional containment layer beyond what Docker offers by default.
  • Compatible with the same OCI images and container network configuration as Docker. The orchestrator auto-detects the runtime from the configured container_socket_path.
  • Rootful Podman is also supported where rootless is impractical (e.g., NFS mounts requiring root).

Firecracker (hardened production):

  • Each agent execution runs in an independent KVM micro-VM.
  • The VM has no knowledge of the host network, no shared memory with the host, and a strict device model.
  • A compromised agent process is contained within its VM boundary.
  • ~125ms VM boot time; overhead is amortized over the execution duration.

See Podman Deployment and Firecracker Runtime for deployment details.
