The operator/agent boundary
Design rationale for hyperhive's two-principal trust model. The
implementation work — container network isolation, the unifying
gateway, core-daemon privsep — is tracked as area:ops issues on
the forge.
Today "the operator surface" and "the agent surface" are a
convention, not a boundary — nothing stops a container from
curling the core daemon on localhost:<port>, or another agent's
web UI. Network isolation, the gateway, and privsep together turn
that convention into an enforced boundary.
Two principals, two paths
- Operator — reaches every UI (the dashboard + every per-agent page) through the gateway, on one origin. Operator-authority actions (approve / deny, answer-as-operator, lifecycle POSTs) are served by the core daemon and only reachable via the gateway.
- Agent — speaks only for itself, only over its per-agent
unix socket. The socket's identity is the agent (see
docs/conventions.md, "identity = socket"). An agent must not be able to reach the core daemon's HTTP surface, another agent's socket, or another agent's web UI.
Design rule
Operator-authority actions never get a per-agent-socket entry point. They live on the core backend.
Worked example — answering an operator-targeted question is a
POST /answer-question/{id} on the core dashboard, never an
AgentRequest variant. If it were a per-agent-socket request, an
agent could curl its own socket and spoof an operator answer.
The per-agent web UI POSTs cross-origin to the core for these
(see the inline-answer feature — the loose-ends section on each
agent page).
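The rule can be pictured as two disjoint entry-point tables. What follows is a minimal Python sketch under stated assumptions, not hyperhive's actual code: the AgentRequest variant names (SEND, RECV, STATUS) and both helper functions are hypothetical. Only POST /answer-question/{id} comes from this note.

```python
from enum import Enum


class AgentRequest(Enum):
    """Requests an agent may send over its own socket (variant names are hypothetical)."""
    SEND = "send"
    RECV = "recv"
    STATUS = "status"


# Operator-authority routes live only on the core backend.
# This is the one route named in the design note.
CORE_OPERATOR_ROUTES = {("POST", "/answer-question/{id}")}


def parse_agent_request(raw: str) -> AgentRequest:
    """Parse a wire value; anything outside the enum raises ValueError."""
    return AgentRequest(raw)


def handle_agent_request(agent: str, request: AgentRequest) -> str:
    """Dispatch a per-agent-socket request.

    The agent name comes from which socket the request arrived on,
    never from the request payload.
    """
    return f"{agent}:{request.value}"
```

Because no variant for answering a question exists, an agent that curls its own socket has nothing to call: parse_agent_request("answer-question") fails before any handler runs.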
Why network isolation is the load-bearing step
Containers currently share the host network namespace, so a
container can reach localhost:<core-port>, the dashboard, and
every other agent's web port. Until that changes, the
operator/agent split is on the honour system — every boundary
claim above is aspirational. Network isolation is what makes the
boundary real; the gateway and privsep are ergonomics and
defence-in-depth layered on top.
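One way to check whether the boundary is enforced is to probe from inside an agent container. The sketch below is illustrative only: can_reach is a hypothetical helper, and the port to probe is whatever the core daemon listens on, which this note leaves unspecified.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable networks, and timeouts.
        return False
```

While containers share the host network namespace, a probe of the core port on 127.0.0.1 from a container would be expected to return True. Once isolation is in place, the same probe should return False.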
The area:ops issues followed this sequencing:
- Gateway: pure ergonomics win, unblocks same-origin (lets the
cross-origin CORS shim on /answer-question/{id} go away), no
behavioural risk. An nginx nixos-container now sits in front of
all surfaces; per-agent UIs are proxied under /agent/<name>/.
- Network isolation: the load-bearing step that turns the
honour-system split into an enforced boundary. In progress.
- Privsep: defence in depth on the core process; hive-c0re runs
as the unprivileged hive-core user and delegates root operations
to hive-priv, a narrow socket-activated helper. See
docs/security.md for the privilege boundary table.
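The gateway's path layout can be modelled as a small routing function. This is a Python sketch of the idea, not the nginx configuration itself. It assumes, without confirmation from this note, that the /agent/<name> prefix is stripped before the request is forwarded and that every other path goes to the core dashboard. The function name route and the backend labels are hypothetical.

```python
from typing import Optional, Tuple

AGENT_PREFIX = "/agent/"


def route(path: str) -> Tuple[str, Optional[str], str]:
    """Map a gateway request path to (backend, agent_name, upstream_path)."""
    if path.startswith(AGENT_PREFIX):
        name, sep, rest = path[len(AGENT_PREFIX):].partition("/")
        if name and sep:
            # Assumption: the prefix is stripped before proxying to the agent UI.
            return ("agent", name, "/" + rest)
    # Assumption: anything else is served by the core dashboard.
    return ("core", None, path)
```

Serving both backends from one origin is what makes the cross-origin shim unnecessary: a per-agent page and /answer-question/{id} share scheme, host, and port.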
hive-priv socket activation
hive-priv is always socket-activated by the hive-priv.socket
systemd unit. The unit binds /run/hive/priv.sock with
SocketGroup=hive-core and mode 0660 and passes the ready listener
to the helper as fd 3 (LISTEN_FDS). The helper requires this and
bails if it isn't socket-activated — there is intentionally no
self-bind fallback.
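The "require activation, never self-bind" behaviour can be sketched in Python as below. This illustrates the systemd fd-passing convention (LISTEN_PID, LISTEN_FDS, first fd is 3) and is not hive-priv's actual code. The function names are hypothetical, and the checks on LISTEN_PID and on exactly one listener are assumptions about how strict the helper is.

```python
import os
import socket
from typing import Mapping

SD_LISTEN_FDS_START = 3  # systemd hands over inherited sockets starting at fd 3


def activated_fd(environ: Mapping[str, str], pid: int) -> int:
    """Return the fd of the single listener passed by systemd, or raise."""
    if environ.get("LISTEN_PID") != str(pid):
        raise RuntimeError("not socket-activated: LISTEN_PID does not match")
    try:
        count = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        raise RuntimeError("not socket-activated: LISTEN_FDS is not a number")
    if count != 1:
        # Assumption: the helper expects exactly one listener.
        raise RuntimeError(f"expected exactly one listener, got {count}")
    return SD_LISTEN_FDS_START


def open_listener() -> socket.socket:
    """Wrap the inherited fd; there is deliberately no bind() fallback."""
    fd = activated_fd(os.environ, os.getpid())
    return socket.socket(fileno=fd)
```

Started by hand without the socket unit, open_listener() raises instead of creating its own socket, which mirrors the "bails if it isn't socket-activated" behaviour described above.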
Dropping the old fallback removed a dev/prod divergence: when
hive-priv bound the socket itself it created the file owned by
root's primary group rather than hive-core, so a hive-core client
couldn't connect the way the socket unit's SocketGroup grant
intends. Requiring socket activation everywhere means dev and prod
take the exact same path and the group grant always holds.
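The divergence comes down to the ordinary Unix permission check on the socket file: on Linux, connecting to a pathname socket requires write permission on it. The sketch below models that check in Python. The function name and the numeric uid/gid values in the usage note are hypothetical, and it ignores directory search permissions, ACLs, and capabilities.

```python
import stat
from typing import Set


def can_connect(mode: int, owner_uid: int, socket_gid: int,
                uid: int, gids: Set[int]) -> bool:
    """Return True if a client (uid, gids) has write access to the socket file."""
    if uid == 0:
        return True  # root bypasses the mode bits
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if socket_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)
```

Suppose hive-core is uid and gid 990 (made-up numbers). A socket owned by root with root's primary group gives can_connect(0o660, 0, 0, 990, {990}) == False. The same socket with SocketGroup=hive-core gives can_connect(0o660, 0, 990, 990, {990}) == True.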