SeaweedFS Storage

Deploying SeaweedFS as the AEGIS volume storage backend: single-node dev setup, production HA topology, replication, S3 gateway, and health checks.

SeaweedFS is the default volume storage backend for AEGIS. It provides fast POSIX-style filesystem semantics for small files (critical for workloads involving git, npm install, Python venvs, and build artifacts), per-directory quota enforcement, TTL-based expiration, and an optional S3-compatible API for artifact inspection.

The AEGIS orchestrator communicates with SeaweedFS via the Filer HTTP API on port 8888. Agent containers access volume data exclusively through the orchestrator's NFS Server Gateway — they never communicate with SeaweedFS directly.

For the node configuration keys that point AEGIS at a SeaweedFS cluster, see Node Config Reference — spec.storage.
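For illustration, the Filer's HTTP API is plain REST over the filesystem namespace: a GET on a directory path with an Accept: application/json header returns a JSON listing of that directory. A minimal sketch of how such a request could be addressed (the filer_listing_url helper is hypothetical, not part of AEGIS):

```python
from urllib.parse import quote

def filer_listing_url(filer_url: str, directory: str, limit: int = 100) -> str:
    """Build the URL for a JSON directory listing on a SeaweedFS Filer.

    The Filer lists a directory when it receives a GET on the directory
    path with "Accept: application/json"; "limit" caps the entry count.
    """
    path = quote(directory.strip("/"))
    return f"{filer_url.rstrip('/')}/{path}/?limit={limit}"

# Example against the dev-stack default address:
print(filer_listing_url("http://localhost:8888", "/buckets/vol-abc", limit=10))
# http://localhost:8888/buckets/vol-abc/?limit=10
```

An operator can fetch the resulting URL with curl -H "Accept: application/json" to inspect the same namespace the orchestrator sees.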


SeaweedFS Architecture

SeaweedFS has three distinct components:

| Component | Default Port | Role |
| --- | --- | --- |
| Master Server | 9333 | Distributed coordination via Raft consensus; tracks which volume servers hold which data; leader election |
| Volume Server | 8080 | Stores actual file data as binary blobs; replicates data across nodes according to replication policy |
| Filer | 8888 | Provides a POSIX-style namespace (directory tree) on top of volume servers; stores file metadata (used by AEGIS) |

AEGIS only communicates with the Filer. The Master and Volume Servers are internal SeaweedFS concerns.


Single-Node / Dev Setup

Use aegis init to generate the local stack, then start the SeaweedFS services from that stack:

curl -fsSL https://raw.githubusercontent.com/100monkeys-ai/aegis-examples/main/install.sh | bash
aegis init
docker compose -f ~/.aegis/docker-compose.yml up -d seaweedfs-master seaweedfs-volume seaweedfs-filer

Dev stack services:

| Service | Port | Purpose |
| --- | --- | --- |
| seaweedfs-master | 9333 | Master coordination (single node, no Raft in dev) |
| seaweedfs-volume | 8080 | Data storage (no replication in dev) |
| seaweedfs-filer | 8888 | Filer API (backed by LevelDB metadata store in dev) |

The dev filer uses an embedded LevelDB metadata store. Production deployments should use PostgreSQL (see below).

Configure AEGIS to use the dev stack by setting spec.storage.seaweedfs.filer_url in aegis-config.yaml:

apiVersion: 100monkeys.ai/v1
kind: NodeConfig
metadata:
  name: dev-node
spec:
  storage:
    backend: seaweedfs
    nfs_port: 2049
    seaweedfs:
      filer_url: "http://localhost:8888"
      default_ttl_hours: 24
      default_size_limit_mb: 1000
      gc_interval_minutes: 60

Production HA Topology

A production SeaweedFS deployment for AEGIS requires three tiers for high availability:

                     Load Balancer (HAProxy / Nginx)

               ┌──────────────┼──────────────┐
               ▼              ▼              ▼
          ┌─────────┐   ┌─────────┐   ┌─────────┐
          │ Filer 1 │   │ Filer 2 │   │ Filer 3 │
          └─────────┘   └─────────┘   └─────────┘
               │              │              │
               └──────────────┼──────────────┘

                     PostgreSQL (HA mode)      ← filer metadata

               ┌──────────────┼──────────────┐
               ▼              ▼              ▼
          ┌─────────┐   ┌─────────┐   ┌─────────┐
          │ Volume  │   │ Volume  │   │ Volume  │
          │ Server1 │   │ Server2 │   │ Server3 │
          └─────────┘   └─────────┘   └─────────┘
               │              │              │
               ▼              ▼              ▼
          ┌─────────┐   ┌─────────┐   ┌─────────┐
          │ Master1 │   │ Master2 │   │ Master3 │
          │(Leader) │   │Follower │   │Follower │
          └─────────┘   └─────────┘   └─────────┘

Minimum Sizing

| Node | CPU | RAM | Disk | Count |
| --- | --- | --- | --- | --- |
| Master | 2 cores | 4 GiB | 20 GiB SSD | 3 (Raft quorum) |
| Filer | 4 cores | 8 GiB | 20 GiB SSD | 3+ (stateless; scale for throughput) |
| Volume Server | 4 cores | 8 GiB | 1+ TiB HDD or NVMe | 3+ (scale for storage) |

Filers are stateless (metadata in PostgreSQL) and can be scaled horizontally behind the load balancer without coordination.

Filer PostgreSQL Configuration

Set the following environment variables on each Filer container:

environment:
  - WEED_POSTGRES2_ENABLED=true
  - WEED_POSTGRES2_HOSTNAME=db-host
  - WEED_POSTGRES2_PORT=5432
  - WEED_POSTGRES2_DATABASE=seaweedfs
  - WEED_POSTGRES2_USERNAME=seaweedfs
  - WEED_POSTGRES2_PASSWORD=changeme
  - WEED_POSTGRES2_SSLMODE=require

Point AEGIS at the Filer load balancer address:

spec:
  storage:
    seaweedfs:
      filer_url: "http://seaweedfs-filer-lb:8888"

Data Replication

SeaweedFS replication is specified as a three-digit string XYZ, where:

  • X = number of extra copies in other data centers
  • Y = number of extra copies in other racks within the same data center
  • Z = number of extra copies on other servers within the same rack
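As a sanity check, the total number of physical copies implied by a replication string is one (the original) plus the sum of its digits. A small illustrative helper (not part of AEGIS or SeaweedFS):

```python
def total_copies(placement: str) -> int:
    """Return the physical copy count implied by a replication string.

    Each digit counts extra copies at one placement tier, so the total
    is the original plus the digit sum: "000" -> 1, "001" -> 2, "200" -> 3.
    """
    if len(placement) != 3 or not placement.isdigit():
        raise ValueError(f"invalid replication string: {placement!r}")
    return 1 + sum(int(c) for c in placement)

print(total_copies("001"))  # 2
```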

AEGIS does not set the replication factor on individual volumes. Replication is configured globally via the SeaweedFS Master's defaultReplication setting and can be overridden per SeaweedFS collection.

A common convention for AEGIS deployments:

| Volume storage_class | Recommended Replication | Notes |
| --- | --- | --- |
| ephemeral | 000 (no extra copies) | TTL-based; a single copy is sufficient |
| persistent | 001 (one extra copy on another server in the same rack) | Survives a single volume server failure |

To set the default placement, include -defaultReplication=001 in the Master command arguments:

weed master -defaultReplication=001 -mdir=/data

S3 Gateway

SeaweedFS includes an optional S3-compatible API server that exposes volume data as S3 objects. This is useful for inspecting execution artifacts using standard tools (aws s3 ls, S3 Browser, Cyberduck) without running a new execution.

Enabling the S3 Gateway

Add an S3 gateway service to your deployment:

weed s3 -port=8333 -filer=seaweedfs-filer:8888

Or in Docker Compose:

services:
  seaweedfs-s3:
    image: chrislusf/seaweedfs:3.60
    command: "s3 -port=8333 -filer=seaweedfs-filer:8888"
    ports:
      - "8333:8333"
    depends_on:
      - seaweedfs-filer

Configure AEGIS with the S3 endpoint:

spec:
  storage:
    seaweedfs:
      filer_url: "http://seaweedfs-filer:8888"
      s3_endpoint: "http://seaweedfs-s3:8333"
      s3_region: "us-east-1"

Path Conventions

Volume data is accessible in the S3 namespace using the tenant and volume identifiers:

s3://<tenant_id>/<volume_id>/<file_path>

For example, a file /workspace/main.py on volume vol-abc in the default single-tenant deployment (tenant_id = 00000000-0000-0000-0000-000000000001):

s3://00000000-0000-0000-0000-000000000001/vol-abc/main.py
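The mapping can be sketched as a small helper (hypothetical: artifact_s3_uri and the /workspace mount-point default are illustrative assumptions, not AEGIS API):

```python
def artifact_s3_uri(tenant_id: str, volume_id: str, container_path: str,
                    mount_point: str = "/workspace") -> str:
    """Translate an in-container file path to its S3 gateway location.

    Files appear under the volume root in the S3 namespace, so the
    mount-point prefix that agent containers see (assumed /workspace
    here) is stripped before building the key.
    """
    rel = container_path
    if rel.startswith(mount_point):
        rel = rel[len(mount_point):]
    return f"s3://{tenant_id}/{volume_id}/{rel.lstrip('/')}"

DEFAULT_TENANT = "00000000-0000-0000-0000-000000000001"
print(artifact_s3_uri(DEFAULT_TENANT, "vol-abc", "/workspace/main.py"))
# s3://00000000-0000-0000-0000-000000000001/vol-abc/main.py
```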

The S3 gateway does not enforce manifest FilesystemPolicy rules. Access to the S3 endpoint should be restricted to operators and debugging tooling at the network level; agent containers must not reach it.


Health Checks

AEGIS checks SeaweedFS availability on startup and periodically during the GC cycle using the Filer API:

# Check filer health
curl -s http://seaweedfs-filer:8888/healthz

# Check cluster status (via master)
curl -s http://seaweedfs-master:9333/cluster/status | jq .

If the Filer is unreachable and fallback_to_local: true is set in the node configuration (the default), AEGIS falls back to a local filesystem volume provider. Volumes created on the local fallback are not accessible from other orchestrator nodes and do not support S3 inspection.

To disable the fallback and fail hard when SeaweedFS is unavailable:

spec:
  storage:
    backend: seaweedfs
    fallback_to_local: false
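The startup decision described above can be sketched as follows (a hypothetical outline, not the orchestrator's actual code; the probe callable stands in for the /healthz request):

```python
def select_volume_provider(probe, fallback_to_local: bool = True) -> str:
    """Pick a volume provider based on Filer reachability.

    probe() should return True when the Filer health endpoint answers;
    any exception it raises is treated as "unreachable".
    """
    try:
        healthy = bool(probe())
    except Exception:
        healthy = False
    if healthy:
        return "seaweedfs"
    if fallback_to_local:
        return "local"
    raise RuntimeError("SeaweedFS Filer unreachable and fallback_to_local is false")

print(select_volume_provider(lambda: False))  # local
```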
