Multi-hive swarms

A swarm is a collection of agents that share an identity and coordinate across one or more hives. A single hyperhive instance running on one host is already a swarm (one hive). This doc covers the additional config needed when the swarm spans multiple hosts.

Terminology

Hive identity config

services.hyperhive = {
  domain    = "pr1ma.example.com";  # machine-addressable DNS domain
  hiveName  = "pr1ma";              # human display name (optional)
  swarmName = "constellat1on";      # shared swarm display name (optional)
};

domain is required when matrix federation is on (matrix.enable); it drives HYPERHIVE_HIVE_DOMAIN in every container so agents can form qualified labels (iris@pr1ma.example.com). hiveName and swarmName are purely display — they surface in the dashboard chrome header and per-agent system prompts. Federated hives at different domains can share a swarmName.

See docs/conventions.md § Hive identity for the env-var chain and qualify() / qualified_label() semantics.
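As a rough illustration of the qualified-label shape described above, the sketch below joins an agent name and HYPERHIVE_HIVE_DOMAIN. The helper name qualified is hypothetical and is not hyperhive's qualify() or qualified_label(); the only thing taken from this doc is the name@domain form.

```shell
# Hypothetical helper, not hyperhive's qualify(): joins an agent name and the
# hive domain into a qualified label, failing loudly when the domain is unset.
qualified() {
  printf '%s@%s\n' "$1" "${HYPERHIVE_HIVE_DOMAIN:?HYPERHIVE_HIVE_DOMAIN is not set}"
}

HYPERHIVE_HIVE_DOMAIN="pr1ma.example.com"   # set by hive-c0re inside real containers
qualified iris                              # prints iris@pr1ma.example.com
```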

Declaring peer hives

services.hyperhive.swarm.peers = {
  "lab.example.com"  = { };                               # CA-trusted (Let's Encrypt etc.)
  "edge.corp"        = { certFingerprint = "sha256:…"; }; # self-signed TLS
};

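The attrset key is the peer's DNS domain. certFingerprint is optional: omit it when the peer serves a CA-trusted certificate (Let's Encrypt etc.), or set it to pin a self-signed TLS certificate.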

certFingerprint applies only to hive-c0re's own peer HTTPS checks (the P33RS dashboard links and agent peer discovery below). Matrix federation never consults it: tuwunel validates a peer's federation certificate against the system CA bundle independently (see Matrix federation below), so pinning a fingerprint here does nothing for a self-signed matrix gateway cert.

Fingerprint format

The value is the string sha256: followed by exactly 64 hexadecimal digits — the SHA-256 digest of the peer's DER-encoded TLS leaf certificate. The hex is case-insensitive (upper or lower both parse), carries no colon separators between bytes, and any value not matching this shape is ignored with a warning rather than weakening trust.

sha256:b1946ac92492d2347c6235b4d2611184a3f5b6cae6c19d6e3c2f0a8e7d4c9f12
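A quick shape check can catch a malformed pin before it is committed. This is a minimal sketch: the helper name is_valid_pin is hypothetical, and the regex only mirrors the format described above (lowercase sha256: prefix assumed); it is not hive-c0re's actual parser.

```shell
# Hypothetical helper: succeeds only for "sha256:" followed by exactly 64 hex digits.
is_valid_pin() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9A-Fa-f]{64}$'
}

is_valid_pin "sha256:b1946ac92492d2347c6235b4d2611184a3f5b6cae6c19d6e3c2f0a8e7d4c9f12" \
  && echo "pin shape ok"
```

A value that still carries openssl's colon separators, or that is missing the prefix, fails this check.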

Generate it from the peer's certificate with openssl. The -fingerprint -sha256 output is uppercase and colon-separated, so strip the colons, lowercase, and prepend the sha256: prefix:

# from a PEM/CRT file
openssl x509 -in peer.crt -noout -fingerprint -sha256 \
  | sed 's/^.*=//; s/://g' | tr 'A-Z' 'a-z' | sed 's/^/sha256:/'

# straight from the live endpoint (port 443)
echo | openssl s_client -connect peer.example.com:443 -servername peer.example.com 2>/dev/null \
  | openssl x509 -noout -fingerprint -sha256 \
  | sed 's/^.*=//; s/://g' | tr 'A-Z' 'a-z' | sed 's/^/sha256:/'

Pin the leaf certificate, not an intermediate or the CA — the digest must match the exact cert the peer serves on its HTTPS endpoint. When the peer rotates its cert, update the pin to the new fingerprint (or switch the peer to a CA-trusted cert and drop the field).
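To notice a rotation before trust breaks, the configured pin can be compared against the certificate currently served. The sketch below assumes the same openssl output format as the commands above; pin_matches and the PIN variable are hypothetical names used only for illustration.

```shell
# Hypothetical helper: reads `openssl x509 -noout -fingerprint -sha256` output on
# stdin and succeeds when it equals the configured pin (hex compared case-insensitively).
pin_matches() {
  want=$(printf '%s' "$1" | tr 'A-F' 'a-f')
  got=$(sed 's/^.*=//; s/://g' | tr 'A-F' 'a-f' | sed 's/^/sha256:/')
  [ "$want" = "$got" ]
}

# Usage, with PIN holding the value from the nix config:
#   openssl x509 -in peer.crt -noout -fingerprint -sha256 | pin_matches "$PIN" \
#     || echo "pin no longer matches peer.crt"
```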

The nix module serialises the attrset to a HYPERHIVE_PEERS JSON array ([{ domain, cert_fingerprint }]) injected into the c0re environment and forwarded to agent containers.
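For orientation, the sketch below shows what a HYPERHIVE_PEERS value might look like and how to list it with jq. The sample value is illustrative only: the real one is generated by the nix module, and representing an unset certFingerprint as null is an assumption (the jq filter also tolerates a missing key). The helper name list_peers is hypothetical, and jq must be installed.

```shell
# Illustrative value only; the real one is generated by the nix module.
# Assumption: an unset certFingerprint serialises as null.
HYPERHIVE_PEERS='[{"domain":"lab.example.com","cert_fingerprint":null},{"domain":"edge.corp","cert_fingerprint":"sha256:b1946ac92492d2347c6235b4d2611184a3f5b6cae6c19d6e3c2f0a8e7d4c9f12"}]'

# Hypothetical helper: prints one "domain<TAB>trust" line per peer, where trust
# is the pinned fingerprint or "ca-trusted" when none is set.
list_peers() {
  printf '%s' "$HYPERHIVE_PEERS" \
    | jq -r '.[] | [.domain, (.cert_fingerprint // "ca-trusted")] | @tsv'
}
```

Running list_peers against the sample prints lab.example.com with ca-trusted and edge.corp with its pin.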

What the config does at runtime

  1. Dashboard P33RS tab: parse_peer_hives() in dashboard.rs reads HYPERHIVE_PEERS and includes peer_hives: Vec<{ name, url }> in /api/state. The dashboard shows a P33RS tab (hidden when the list is empty) with a card per peer linking to https://{domain}/. See docs/web-ui/dashboard.md § P33RS tab.

  2. Agent identity — the same HYPERHIVE_PEERS env var is forwarded to agent containers by meta.rs; agent code can call identity::peers() to discover peer hives and address them with qualified names (agent@domain).

  3. Matrix federation — when matrix.enable is on, tuwunel federates with the peer's matrix server (discovered via the peer's .well-known/matrix/server delegation, which the gateway serves). Federation validates the peer's TLS certificate against the system CA bundle — independently of certFingerprint, which it never consults. A self-signed gateway certificate therefore won't federate even with a fingerprint pinned above: the peers need CA-issued certs (ACME) or a shared private CA trusted on both gateway hosts. See docs/matrix.md for federation firewall + TLS requirements.

Bilateral setup

Each hive must declare the other. If hive A lists hive B as a peer, B must also list A for agents on B to see A in their peer list:

# hive A (pr1ma.example.com)
services.hyperhive.swarm.peers."edge.corp" = { };

# hive B (edge.corp)
services.hyperhive.swarm.peers."pr1ma.example.com" = { certFingerprint = "sha256:…"; };

Mixed trust is fine: A trusts B via CA bundle (no fingerprint), B pins A's self-signed cert.
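A one-sided declaration is easy to miss, so a reciprocity check over the two hives' HYPERHIVE_PEERS values can help. This is a sketch under assumptions: lists_peer and check_bilateral are hypothetical helpers, the JSON samples show only the domain field, and jq must be installed.

```shell
# lists_peer PEERS_JSON DOMAIN: succeeds when DOMAIN appears in the JSON array.
lists_peer() {
  printf '%s' "$1" | jq -e --arg d "$2" 'any(.[]; .domain == $d)' >/dev/null
}

# check_bilateral DOMAIN_A PEERS_A DOMAIN_B PEERS_B: reports the first missing side.
check_bilateral() {
  lists_peer "$2" "$3" || { echo "$1 does not list $3"; return 1; }
  lists_peer "$4" "$1" || { echo "$3 does not list $1"; return 1; }
  echo "$1 and $3 list each other"
}

# Usage (other fields omitted for brevity):
#   check_bilateral pr1ma.example.com '[{"domain":"edge.corp"}]' \
#                   edge.corp         '[{"domain":"pr1ma.example.com"}]'
```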

WireGuard inter-hive mesh (optional)

The peer config above uses public HTTPS for all inter-hive traffic. For private deployments — or to reduce latency and TLS overhead on intra-swarm traffic — hive-c0re can configure a host-to-host WireGuard mesh.

Generating keys

On each hive host:

wg genkey | install -m 0400 /dev/stdin /etc/wireguard/hive.key
wg pubkey < /etc/wireguard/hive.key   # → share this with peer operators

Config example (two hives)

# hive A (pr1ma.example.com, mesh IP 10.100.0.1)
services.hyperhive = {
  swarm.wireguard = {
    enable         = true;
    privateKeyFile = "/etc/wireguard/hive.key";
    address        = "10.100.0.1/24";
    listenPort     = 51820;          # optional, default 51820
  };

  swarm.peers."edge.corp" = {
    certFingerprint    = "sha256:…";   # TLS trust (unchanged)
    wireguardPublicKey = "base64key="; # peer's wg pubkey
    wireguardEndpoint  = "203.0.113.42:51820"; # peer's public IP:port
    wireguardAddress   = "10.100.0.2/32";      # peer's mesh IP
  };
};

# hive B (edge.corp, mesh IP 10.100.0.2)
services.hyperhive = {
  swarm.wireguard = {
    enable         = true;
    privateKeyFile = "/etc/wireguard/hive.key";
    address        = "10.100.0.2/24";
  };

  swarm.peers."pr1ma.example.com" = {
    wireguardPublicKey = "base64key="; # hive A's wg pubkey
    wireguardEndpoint  = "198.51.100.1:51820";
    wireguardAddress   = "10.100.0.1/32";
  };
};
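In the two configs above, each peer's wireguardAddress sits inside the other host's /24. Whether hyperhive requires that is not stated here, so treat the following as an optional sanity check for layouts that follow the example; ip_to_int and in_subnet are hypothetical helpers and handle IPv4 only.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1            # split the address into its four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# in_subnet ADDR CIDR: succeeds when ADDR (its own prefix length is ignored)
# falls inside the network described by CIDR.
in_subnet() {
  addr=${1%/*}
  net=${2%/*}
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( $(ip_to_int "$addr") & mask )) -eq $(( $(ip_to_int "$net") & mask )) ]
}

# hive A's view: peer edge.corp at 10.100.0.2/32, local address 10.100.0.1/24
in_subnet 10.100.0.2/32 10.100.0.1/24 && echo "peer is inside the mesh subnet"
```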

What the mesh does

NAT / one-sided endpoints

If one host is behind NAT and can't accept incoming connections, set wireguardEndpoint to null only in the peer entry that describes that host (that is, on the reachable side). The NATed host keeps the reachable host's endpoint and initiates the tunnel. With keepalive on, the NAT hole stays open.

If both hosts are behind NAT, a STUN relay or a third host (exit node) is required. Out of scope for v0.

Cross-references