Storage Gateway
AegisFSAL architecture, FUSE and NFS transports, FileHandle structure, UID/GID squashing, path canonicalization, and SeaweedFS integration.
The AEGIS Storage Gateway is the security boundary for all filesystem access by agent containers. It is built on AegisFSAL, a transport-agnostic File System Abstraction Layer running in the orchestrator process. Agent containers access their volumes through one of two transports — FUSE (for rootless Podman) or NFS (for rootful Docker) — with identical authorization, audit, and policy enforcement regardless of transport.
Design Philosophy
Traditional container volume mounts (bind mounts, CAP_SYS_ADMIN FUSE mounts) give agent containers unrestricted access to mounted storage once the mount is established. AEGIS takes a different approach:
Every POSIX operation is routed through the orchestrator-controlled AegisFSAL before reaching SeaweedFS. This means:
- Per-operation authorization: The orchestrator validates every read, write, create, and delete against the execution's manifest policies.
- Full audit trail: Every file operation is published as a `StorageEvent` domain event.
- Path traversal prevention: Server-side path canonicalization blocks `..` attempts before they reach SeaweedFS.
- No elevated privileges: Agent containers require zero special capabilities (`CAP_SYS_ADMIN` is not needed).
Component Hierarchy
Agent Container
│
├── [FUSE transport] bind mount from host FUSE mountpoint
│ /workspace → host path → FUSE daemon
│
├── [NFS transport] kernel NFS client
│ mount: addr=orchestrator_host, nfsvers=3, proto=tcp, nolock
│ /workspace → NFS server
▼
Orchestrator Host
├── FUSE Daemon (in-process, translates POSIX ops → FSAL calls)
├── NFS Server Gateway (user-space, tcp, port 2049)
│ NFSv3 protocol handler (nfsserve Rust crate)
▼
AegisFSAL (File System Abstraction Layer)
│ receive: LOOKUP, READ, WRITE, READDIR, GETATTR, CREATE, REMOVE
├──► Decode FileHandle → extract volume_id + execution context
├──► Authorize: does execution or workflow execution own this volume?
├──► Canonicalize path: reject ".." components
├──► Enforce FilesystemPolicy (manifest allowlists)
├──► Apply UID/GID squashing (return agent container's UID/GID, not real ownership)
├──► Enforce quota (size_limit_bytes)
├──► Publish StorageEvent to Event Bus
▼
StorageProvider trait (via StorageRouter)
├── SeaweedFS POSIX API client (default)
├── OpenDalStorageProvider (S3, GCS, Azure)
├── LocalHostStorageProvider (NVMe, bind mounts)
└── SealStorageProvider (Remote Node execution coordination)

AegisFileHandle
The NFSv3 protocol requires servers to return an opaque FileHandle for each file and directory. AEGIS encodes authorization information directly into the FileHandle:
FileHandle layout (48 bytes raw, ~52 bytes serialized, ≤64 bytes NFSv3 limit):
┌──────────────────────────────────────────────────┐
│ execution_id (UUID binary, 16 bytes) │
│ volume_id (UUID binary, 16 bytes) │
│ path_hash (u64, 8 bytes) — FNV hash of path │
│ created_at (i64, 8 bytes) — Unix timestamp │
└──────────────────────────────────────────────────┘

Because NFSv3's 64-byte limit does not allow storing a full file path in the handle, `path_hash` contains a hash of the canonical path. The NFS server maintains a bidirectional in-memory FileHandleTable (`fileid3` ↔ `AegisFileHandle`) that maps numeric file IDs to handles and reconstructs paths on demand. This table is per-execution and discarded when the execution ends.
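The handle layout above can be sketched as a fixed 48-byte encoding. This is an illustrative sketch only: the field names follow the diagram, but the manual byte packing stands in for the real bincode serialization, and the 64-bit FNV-1a variant is an assumption (the diagram says only "FNV hash").

```rust
/// Sketch of the 48-byte AegisFileHandle layout from the diagram above.
/// Manual big-endian packing is used here in place of bincode.
pub struct AegisFileHandle {
    pub execution_id: [u8; 16], // UUID bytes
    pub volume_id: [u8; 16],    // UUID bytes
    pub path_hash: u64,         // FNV hash of the canonical path
    pub created_at: i64,        // Unix timestamp
}

impl AegisFileHandle {
    /// Pack into a fixed 48-byte buffer, well under the 64-byte NFSv3 limit.
    pub fn encode(&self) -> [u8; 48] {
        let mut buf = [0u8; 48];
        buf[0..16].copy_from_slice(&self.execution_id);
        buf[16..32].copy_from_slice(&self.volume_id);
        buf[32..40].copy_from_slice(&self.path_hash.to_be_bytes());
        buf[40..48].copy_from_slice(&self.created_at.to_be_bytes());
        buf
    }

    /// Recover the handle fields from the opaque buffer.
    pub fn decode(buf: &[u8; 48]) -> Self {
        let mut execution_id = [0u8; 16];
        let mut volume_id = [0u8; 16];
        execution_id.copy_from_slice(&buf[0..16]);
        volume_id.copy_from_slice(&buf[16..32]);
        Self {
            execution_id,
            volume_id,
            path_hash: u64::from_be_bytes(buf[32..40].try_into().unwrap()),
            created_at: i64::from_be_bytes(buf[40..48].try_into().unwrap()),
        }
    }
}

/// 64-bit FNV-1a, assumed here as the path hash function.
pub fn fnv1a_64(data: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in data {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}
```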
On every NFS operation, AegisFSAL decodes the FileHandle, extracts volume_id, and verifies that the requesting execution context is authorized to access that volume. For the FUSE transport, authorization uses the dual execution context carried in each FsalService gRPC request: an execution_id (for agent-owned volumes) or a workflow_execution_id (for workflow-owned volumes used by container steps). Exactly one is set per request. If the caller does not own the volume, the operation fails with NFS3ERR_ACCES and an UnauthorizedVolumeAccess event is published.
The 64-byte NFSv3 FileHandle size limit is a hard protocol constraint enforced by the kernel NFS client. The current layout serializes to ~52 bytes via bincode, safely within the limit.
UID/GID Squashing
When SeaweedFS stores files, they carry a real POSIX UID/GID. Agent containers run as varying user IDs. Without squashing, file ownership mismatches would cause permission errors.
AegisFSAL overrides all file metadata returned by GETATTR to report the agent container's UID/GID rather than the real file ownership:
- All `GETATTR` responses return `uid = agent_container_uid`, `gid = agent_container_gid`.
- POSIX permission bit checks (e.g., `chmod 600`) are not enforced by the NFS server.
- Authorization is handled entirely by the manifest `FilesystemPolicy`, not kernel permission bits.
The agent_container_uid and agent_container_gid are stored in the Execution metadata when the container is created and retrieved by AegisFSAL during each operation.
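The squashing step can be sketched as a pure rewrite of the returned attributes. The `FileAttr` type here is a simplified illustration, not the real GETATTR attribute struct:

```rust
/// Minimal sketch of GETATTR squashing: whatever ownership SeaweedFS
/// reports, the attributes handed back to the agent always carry the
/// container's UID/GID. `FileAttr` is a hypothetical simplified type.
#[derive(Debug, Clone, PartialEq)]
pub struct FileAttr {
    pub uid: u32,
    pub gid: u32,
    pub size: u64,
    pub mode: u32,
}

/// Replace real ownership with the agent container's UID/GID,
/// leaving all other attributes untouched.
pub fn squash_ownership(real: FileAttr, agent_uid: u32, agent_gid: u32) -> FileAttr {
    FileAttr { uid: agent_uid, gid: agent_gid, ..real }
}
```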
Path Canonicalization
All incoming paths are canonicalized before reaching the StorageProvider:
- Resolve any `.` components.
- Detect any `..` components.
- If `..` is detected, reject the entire operation with `NFS3ERR_ACCES` and publish a `PathTraversalBlocked` event.
- Strip the volume's root prefix to produce a path relative to the SeaweedFS bucket.
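The steps above can be sketched as a single pure function, assuming paths use `/`-separated components; `canonicalize` here is illustrative, not the real implementation:

```rust
/// Sketch of server-side canonicalization: resolve "." components and
/// reject any ".." before the path reaches the StorageProvider.
pub fn canonicalize(path: &str) -> Result<String, &'static str> {
    let mut parts: Vec<&str> = Vec::new();
    for comp in path.split('/') {
        match comp {
            "" | "." => continue, // resolve empty and "." components
            ".." => return Err("NFS3ERR_ACCES: path traversal blocked"),
            other => parts.push(other),
        }
    }
    Ok(format!("/{}", parts.join("/")))
}
```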
Example:
Incoming: /workspace/../etc/passwd
Step 2: ".." detected
Step 3: REJECTED → NFS3ERR_ACCES
PathTraversalBlocked event published

Filesystem Policy Enforcement
Each WRITE, CREATE, and REMOVE operation is validated against the manifest's FilesystemPolicy:
spec:
security:
filesystem:
read:
- /workspace
- /agent
write:
- /workspace

If an agent attempts to write to `/agent/config.py` but only `/workspace` is in `write`, the operation is blocked with `NFS3ERR_PERM` and a `FilesystemPolicyViolation` event is published.
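The allowlist check can be sketched as prefix matching on canonical paths. `is_allowed` is a hypothetical helper, not the real FilesystemPolicy API:

```rust
/// Sketch of allowlist prefix matching for the manifest FilesystemPolicy.
/// A write is permitted only if the canonical path equals, or sits
/// under, one of the declared write prefixes.
pub fn is_allowed(path: &str, allowlist: &[&str]) -> bool {
    allowlist.iter().any(|prefix| {
        path == *prefix
            // "/workspace/x" matches "/workspace", but "/workspaces/x" does not
            || path.starts_with(&format!("{}/", prefix.trim_end_matches('/')))
    })
}
```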
Quota Enforcement
When size_limit is set in the volume declaration, AegisFSAL tracks cumulative bytes written to the volume. Before each WRITE:
current_volume_size + write_size > parsed(size_limit)?
→ YES: fail with NFS3ERR_NOSPC, emit VolumeQuotaExceeded event
→ NO: proceed with write

Quota accounting is maintained in-memory per execution and persisted to PostgreSQL. It is not affected by file deletions in Phase 1 (quota only tracks bytes written, not net storage used).
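The pre-write check above can be sketched with an illustrative `VolumeQuota` type (the real accounting also persists to PostgreSQL, which is omitted here):

```rust
/// Sketch of the pre-write quota check: cumulative bytes written,
/// compared against the parsed size_limit before each WRITE proceeds.
pub struct VolumeQuota {
    pub written_bytes: u64,
    pub size_limit_bytes: u64,
}

impl VolumeQuota {
    /// Returns Err (NFS3ERR_NOSPC) if the write would exceed the limit;
    /// otherwise records the bytes and lets the write proceed. Deletions
    /// do not reduce the counter (Phase 1 behavior).
    pub fn check_write(&mut self, write_size: u64) -> Result<(), &'static str> {
        if self.written_bytes + write_size > self.size_limit_bytes {
            return Err("NFS3ERR_NOSPC: VolumeQuotaExceeded");
        }
        self.written_bytes += write_size;
        Ok(())
    }
}
```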
Storage Routing
The StorageRouter sits between AegisFSAL and the concrete StorageProvider implementations, allowing different volumes to use distinct backends (such as OpenDAL or LocalHost). For every POSIX operation, AegisFSAL asks the StorageRouter to resolve the correct StorageProvider for the volume_id carried in the FileHandle.
For details on the individual backends and their operational modes, see Storage Backends.
AegisFSAL is designed as a transport-agnostic core. All transports — FUSE, NFS, and virtio-fs — share the same authorization logic, path canonicalization, UID/GID squashing, quota tracking, and event publishing with zero code duplication:
FUSE transport (rootless Podman):
FUSE Daemon → AegisFSAL → StorageProvider
NFS transport (rootful Docker):
NFSv3 Frontend → AegisFSAL → StorageProvider
virtio-fs transport (Firecracker):
virtio-fs Frontend → AegisFSAL → StorageProvider

FUSE Transport
The FUSE transport enables native POSIX filesystem access to workspace volumes in rootless container runtimes where NFS mounts are unavailable. It runs as a host-side daemon process separate from the orchestrator, translating POSIX operations (read, write, readdir, getattr, create, remove, etc.) into gRPC calls to the orchestrator's FsalService, which delegates to AegisFSAL.
Because the FUSE daemon proxies all operations through the same AegisFSAL entry points as the NFS server, all security guarantees are identical: per-operation authorization, manifest policy enforcement, path traversal prevention, UID/GID squashing, quota enforcement, and StorageEvent audit trail.
Host-Side Daemon Architecture
The FUSE daemon runs as a standalone process on the host, outside of any container or pod. This design eliminates the need for CAP_SYS_ADMIN or /dev/fuse access inside the orchestrator pod.
Host OS
├── aegis-fuse-daemon (standalone process)
│ ├── Listens on 127.0.0.1:50053 (FuseMountService)
│ ├── Connects to orchestrator gRPC on :50051 (FsalService)
│ └── Manages FUSE mountpoints under /tmp/aegis-fuse/
│
├── aegis-orchestrator (pod-core or daemon)
│ ├── Calls FuseMountService.Mount on execution start
│ ├── Bind-mounts host FUSE path into container
│ └── Calls FuseMountService.Unmount on execution end
│
└── Agent Container
└── /workspace → bind mount from /tmp/aegis-fuse/<volume_id>

The FUSE daemon acts as a gRPC FSAL proxy: every POSIX system call from an agent container flows through the kernel FUSE layer to the daemon, which forwards the operation to the orchestrator's FsalService gRPC endpoint. Each FsalService request carries a dual execution context -- either execution_id (for agent-owned volumes) or workflow_execution_id (for workflow-owned volumes accessed by container steps). The orchestrator applies all AegisFSAL security checks using whichever context is present and routes the operation to the appropriate StorageProvider.
Mount Lifecycle
1. An execution starts. The orchestrator calls `FuseMountService.Mount(execution_id, volume_id, tenant_id)` on the FUSE daemon.
2. The FUSE daemon creates a FUSE mountpoint at `<mount_prefix>/<volume_id>` (e.g., `/tmp/aegis-fuse/<volume_id>`).
3. The daemon returns the host mount path in the `MountResponse`.
4. The orchestrator creates the agent container with a bind mount from the host FUSE path to the container's `/workspace` (or the manifest-declared `mount_path`).
5. All POSIX operations from the container flow through the kernel FUSE layer to the daemon, which proxies them via gRPC to the orchestrator's `FsalService`.
6. When the execution ends, the orchestrator calls `FuseMountService.Unmount(execution_id, volume_id)`. The daemon unmounts the FUSE filesystem and cleans up.
systemd Deployment
In production deployments, the FUSE daemon runs as a systemd service that starts before the orchestrator:
# /etc/systemd/system/aegis-fuse-daemon.service (system service)
# or ~/.config/systemd/user/aegis-fuse-daemon.service (user service for rootless)
[Unit]
Description=AEGIS FUSE Daemon
Before=aegis.service
[Service]
ExecStart=/usr/local/bin/aegis fuse-daemon start \
--orchestrator-url grpc://localhost:50051 \
--mount-prefix /tmp/aegis-fuse \
--listen-addr 127.0.0.1:50053
Restart=on-failure
RestartSec=5s
[Install]
WantedBy=default.target

The `Before=aegis.service` ordering ensures the FUSE daemon is ready to accept mount requests before the orchestrator starts processing executions.
Bidirectional Mount Propagation
The FUSE mount directory uses bidirectional mount propagation so that FUSE-backed volumes mounted by the daemon are visible to execution containers on the same host. This is required because rootless container runtimes isolate mount namespaces per container -- without propagation, FUSE mounts would be invisible to other containers.
Rootless Podman
In rootless Podman deployments, the kernel NFS client is not available because mounting NFS requires root privileges. The FUSE transport solves this:
- The FUSE daemon mounts a FUSE filesystem at a host path (e.g., `/tmp/aegis-fuse/<volume_id>`).
- The container is started with a bind mount from the host FUSE mountpoint into the container's `/workspace`.
- All POSIX operations from the container flow through the kernel's FUSE layer to the FUSE daemon, which proxies them via gRPC to `AegisFSAL`.
No privileged mount operations or special capabilities are required. The entire path -- from container I/O to storage backend -- runs unprivileged.
Firecracker (virtio-fs)
The same FUSE daemon supports virtio-fs transport for Firecracker microVMs. In this configuration, the daemon acts as the vhost-user-fs backend, and the guest kernel communicates via the virtio-fs device. The AegisFSAL security boundary is preserved identically -- the VM guest cannot bypass authorization any more than a container can.
Transport Selection
The orchestrator selects the filesystem transport based on configuration:
| Configuration | Transport | Runtime |
|---|---|---|
| `fuse_daemon_address` set | FUSE + bind mount | Rootless Podman |
| `fuse_daemon_address` unset (default) | NFS volume driver | Rootful Docker |
| Firecracker runtime | virtio-fs (via FUSE daemon) | Firecracker microVM |
Transport selection is per-node. All transports produce the same observable behavior from the agent's perspective -- a POSIX filesystem at /workspace (or the manifest-declared mount_path) backed by the same AegisFSAL authorization and the same StorageProvider backends.
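The selection table can be sketched as a match over node configuration. `NodeConfig` and its field names are illustrative, not the real configuration schema:

```rust
/// Sketch of per-node transport selection from the table above.
#[derive(Debug, PartialEq)]
pub enum Transport { Fuse, Nfs, VirtioFs }

/// Hypothetical configuration shape; field names are assumptions.
pub struct NodeConfig {
    pub fuse_daemon_address: Option<String>,
    pub firecracker_runtime: bool,
}

pub fn select_transport(cfg: &NodeConfig) -> Transport {
    if cfg.firecracker_runtime {
        Transport::VirtioFs // Firecracker microVMs: virtio-fs via the FUSE daemon
    } else if cfg.fuse_daemon_address.is_some() {
        Transport::Fuse // rootless Podman: FUSE mount + bind mount
    } else {
        Transport::Nfs // default: NFS volume driver for rootful Docker
    }
}
```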
Volume Lifecycle
Volumes follow a deterministic state machine managed by the orchestrator:
Creating ──► Available ──► Attached ──► Detached
│ │ │
│ └─────────────┤
│ │
└──────────────────────────────────► Deleting ──► Deleted
Any non-terminal state ──► Failed

| State | Meaning |
|---|---|
| `Creating` | Directory being provisioned in SeaweedFS and quota being set |
| `Available` | SeaweedFS directory ready; no container has mounted it yet |
| `Attached` | NFS export active; container is mounted and I/O is live |
| `Detached` | Container stopped; NFS export removed; volume data intact in SeaweedFS |
| `Deleting` | Delete request accepted; SeaweedFS directory removal in progress |
| `Deleted` | SeaweedFS directory confirmed removed; record retained briefly for audit trail |
| `Failed` | A state transition failed (e.g., SeaweedFS unreachable during creation or deletion) |
Available → Attached occurs when the container starts and the NFS mount is confirmed active. Attached → Detached occurs when the container stops or is killed. Ephemeral volumes with no active execution proceed immediately from Detached to Deleting. Persistent volumes remain in Detached until explicitly deleted via the CLI or API.
Failed volumes are surfaced through volume management APIs and can be retried by reissuing the delete request through the orchestrator API.
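The state machine can be sketched as an enum plus a transition predicate. The edge set below is an interpretation of the diagram and prose above, not the orchestrator's actual code:

```rust
/// Volume states from the table above.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum VolumeState { Creating, Available, Attached, Detached, Deleting, Deleted, Failed }

/// Sketch of the allowed edges: the happy path plus deletion, with
/// Failed reachable from any non-terminal state.
pub fn is_valid_transition(from: VolumeState, to: VolumeState) -> bool {
    use VolumeState::*;
    match (from, to) {
        (Creating, Available)
        | (Available, Attached)
        | (Attached, Detached)
        | (Available, Deleting)
        | (Detached, Deleting)
        | (Deleting, Deleted) => true,
        // any non-terminal state may fail
        (Creating | Available | Attached | Detached | Deleting, Failed) => true,
        _ => false,
    }
}
```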
Volume Ownership Types
Every volume has an ownership type that determines its lifecycle scope and automatic cleanup behavior. The orchestrator uses the ownership to decide when a volume's data should be garbage-collected.
| Ownership | Scope | Cleanup |
|---|---|---|
| `Execution` | Scoped to a single agent execution | Auto-deleted when the execution ends |
| `WorkflowExecution` | Shared across all executions within a single workflow run | Auto-deleted when the workflow execution completes |
| `Persistent` | User-owned, not tied to any execution | Manual — deleted only when the user explicitly removes it via the CLI or API |
Workflow Workspace Sharing
When a workflow runs, its workspace volume is created with WorkflowExecution ownership. This means the volume persists across all states in the workflow, allowing multiple agents and containers to access the same files through NFS.
For example, in a code-generation workflow:
- A code-writing agent runs in the first workflow state and writes source files to
/workspace. - The workflow transitions to a build-and-test state, which launches a container to compile and run tests.
- Both the agent and the container mount the same workspace volume — the build container sees the files the agent wrote.
Because the volume is owned by the workflow execution (not by any individual agent execution), it is not cleaned up when the first agent finishes. It remains available until the entire workflow completes, at which point the orchestrator garbage-collects it automatically.
Persistent volumes, by contrast, survive indefinitely and are managed through the volume management API. They are useful for user workspaces that should retain files across multiple independent workflow runs.
Phase 1 Constraints
nolock Mount Option
All NFS mounts in Phase 1 use nolock. This disables the NLM (Network Lock Manager) protocol, meaning POSIX advisory file locks (flock, fcntl) are not coordinated across agents.
This is safe for the common case of single-agent-per-volume. For multi-agent coordination (swarms), use the ResourceLock mechanism provided by the swarm coordination context instead of POSIX locks.
Single-Writer Constraint
Persistent volumes with ReadWrite access can only be mounted by one execution at a time. Attempting a second ReadWrite mount on the same volume returns VolumeAlreadyMounted. Multiple executions may hold ReadOnly mounts simultaneously.
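The single-writer rule can be sketched with a hypothetical `MountTable`; the real mount tracking lives in the orchestrator and is not shown in this document:

```rust
use std::collections::HashMap;

/// Access modes from the constraint above.
#[derive(Clone, Copy, PartialEq)]
pub enum Access { ReadOnly, ReadWrite }

/// Sketch: one ReadWrite mount per volume, any number of ReadOnly mounts.
#[derive(Default)]
pub struct MountTable {
    // volume_id -> (has_rw_mount, readonly_mount_count)
    mounts: HashMap<String, (bool, usize)>,
}

impl MountTable {
    pub fn mount(&mut self, volume_id: &str, access: Access) -> Result<(), &'static str> {
        let entry = self.mounts.entry(volume_id.to_string()).or_insert((false, 0));
        match access {
            // a second ReadWrite mount on the same volume is rejected
            Access::ReadWrite if entry.0 => Err("VolumeAlreadyMounted"),
            Access::ReadWrite => { entry.0 = true; Ok(()) }
            Access::ReadOnly => { entry.1 += 1; Ok(()) }
        }
    }
}
```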
SRE & Performance Tuning
To optimize the AegisFSAL NFS Server Gateway for varied agent workloads, operators should tune the kernel NFS client mount options. By default, the orchestrator mounts volumes with the following options:
addr=<orchestrator_host>,nfsvers=3,proto=tcp,soft,timeo=10,nolock,acregmin=3,acregmax=60

- Graceful Degradation (`soft,timeo=10`): A soft mount ensures that if the NFS server crashes or becomes unreachable, the agent container's I/O operations return an `EIO` error rather than hanging indefinitely in a D-state. The `timeo=10` parameter specifies a 1-second timeout (measured in deciseconds).
- Client Caching (`acregmin`, `acregmax`): The kernel NFS client caches file attributes. Lowering these values (e.g., `acregmin=1`) reduces cache staleness at the cost of more `GETATTR` calls to the orchestrator. For high-throughput artifact generation, keeping the defaults (`acregmin=3`, `acregmax=60`) reduces load on the orchestrator.
- Latency Overhead: Because every POSIX operation routes through the `AegisFSAL` orchestrator process for authorization, expect roughly 1-2 ms of added latency per operation compared to a direct FUSE mount. This is generally acceptable for agent-driven code generation, but may affect high-frequency I/O workloads.
Export Path Routing
Each volume gets a unique NFS export path derived from its tenant and volume identifiers:
/{tenant_id}/{volume_id}

The orchestrator maintains a runtime NfsVolumeRegistry — a concurrent map of VolumeId → NfsVolumeContext. When an execution mounts a volume, its export path is registered. When the execution ends and the volume is detached, the entry is removed. Volumes that are not currently mounted have no active export and cannot be reached via NFS.
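The registry can be sketched as a mutex-guarded map. The `NfsVolumeContext` fields and the locking strategy here are illustrative; the document says only that it is a concurrent map:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical export context; field names are assumptions.
#[derive(Clone)]
pub struct NfsVolumeContext {
    pub tenant_id: String,
    pub export_path: String, // "/{tenant_id}/{volume_id}"
}

/// Sketch of the NfsVolumeRegistry: a Mutex-wrapped HashMap stands in
/// for whatever concurrent map the real orchestrator uses.
#[derive(Clone, Default)]
pub struct NfsVolumeRegistry {
    inner: Arc<Mutex<HashMap<String, NfsVolumeContext>>>,
}

impl NfsVolumeRegistry {
    /// Register the export when an execution mounts the volume;
    /// returns the derived export path.
    pub fn register(&self, volume_id: &str, tenant_id: &str) -> String {
        let export_path = format!("/{}/{}", tenant_id, volume_id);
        let ctx = NfsVolumeContext {
            tenant_id: tenant_id.to_string(),
            export_path: export_path.clone(),
        };
        self.inner.lock().unwrap().insert(volume_id.to_string(), ctx);
        export_path
    }

    /// Remove the export on detach; unmounted volumes are unreachable via NFS.
    pub fn deregister(&self, volume_id: &str) {
        self.inner.lock().unwrap().remove(volume_id);
    }

    pub fn lookup(&self, volume_id: &str) -> Option<NfsVolumeContext> {
        self.inner.lock().unwrap().get(volume_id).cloned()
    }
}
```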
The agent container's NFS mount is configured to target the orchestrator host at the volume's export path:
addr=<orchestrator_host>,nfsvers=3,proto=tcp,soft,timeo=10,nolock
device: :/<tenant_id>/<volume_id>
target: /workspace (or mount_path from manifest)

Storage Events
Every file operation handled by AegisFSAL publishes a StorageEvent to the event bus. These events are persisted to PostgreSQL by a background StorageEventPersister task and form the complete file-level audit trail for each execution.
| Event | Trigger |
|---|---|
| `FileOpened` | Agent opens a file (`open()` / `create()`) |
| `FileRead` | Bytes are read from a file — includes offset and `bytes_read` |
| `FileWritten` | Bytes are written to a file — includes offset and `bytes_written` |
| `FileClosed` | File handle is released |
| `DirectoryListed` | `readdir` is called on a directory |
| `FileCreated` | A new file is created |
| `FileDeleted` | A file is removed |
| `PathTraversalBlocked` | A `..` component was detected in the incoming path |
| `FilesystemPolicyViolation` | An operation violated a manifest read/write allowlist |
| `QuotaExceeded` | A write would exceed the volume's `size_limit` |
| `UnauthorizedVolumeAccess` | The requesting execution does not own the volume |
All events carry volume_id and a timestamp. File operation events additionally carry the canonicalized path, byte counts, and latency in milliseconds.
:::note Agent vs. Workflow Storage Events
Storage events originating from agent FSAL calls carry an execution_id referencing the agent execution.
Events from ContainerStep FUSE filesystem reads carry a workflow_execution_id instead.
The execution_id and workflow_execution_id fields are mutually exclusive — exactly one is always set.
:::
SeaweedFS Integration
SeaweedFS is the default StorageProvider. The orchestrator communicates with SeaweedFS through its HTTP Filer API for two distinct purposes:
| Interface | Used For |
|---|---|
| HTTP Filer API (port 8888) | Directory lifecycle (create, delete, set quota, get usage, list) — called by VolumeManager during volume provisioning and GC |
| HTTP Filer API (port 8888) | POSIX file operations (open, read, write, stat, readdir, create, rename, delete) — called by AegisFSAL on every NFS LOOKUP, READ, WRITE, READDIR, GETATTR, etc. |
Volume data is stored in SeaweedFS at the following path structure:
/{tenant_id}/{volume_id}/{file_path}

For example, a file /workspace/main.py written by an execution exec-abc on volume vol-xyz in the default single-tenant environment is stored at:
/00000000-0000-0000-0000-000000000001/vol-xyz/main.py

Replication
SeaweedFS replication is configured independently of AEGIS at the SeaweedFS layer. The AEGIS orchestrator does not set the replication factor on volume directories — this is controlled by the SeaweedFS default replication setting and can be overridden in SeaweedFS collection configuration.
A common convention for AEGIS deployments is:
| Storage Class | SeaweedFS Replication | Rationale |
|---|---|---|
| Ephemeral | 000 (no replication) | TTL-based; durability not required |
| Persistent | 001 (one replica on a different server in the same rack) | Survives single node failure |
Health Checks
The orchestrator checks SeaweedFS health via the Filer API on startup and periodically thereafter. If SeaweedFS is unreachable and fallback_to_local is enabled in the node configuration, the orchestrator falls back to a local filesystem StorageProvider. The local fallback does not support S3 artifact inspection or multi-node access.