hive-ci: Forgejo Actions Runner

The hive-ci module runs a Forgejo Actions runner in a hive-ci nixos-container, executing CI jobs from .forgejo/workflows/ci.yml (e.g., nix flake check on every PR).

Operator bootstrap

Set services.hyperhive.ci.enable = true in the host NixOS config. That's it — no manual token provisioning.
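As a minimal sketch, the host-side change amounts to a single option. This assumes the hyperhive NixOS module is already imported into the host's module list; how it gets imported is not shown here.

```nix
{
  # Enables the hive-ci runner described in this document.
  # Assumption: the hyperhive module is already imported by this host,
  # otherwise the option below does not exist and evaluation fails.
  services.hyperhive.ci.enable = true;
}
```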

Requirements:

Container design

Auto-registration flow

hive-ci-prefetch.service is a host-side oneshot that runs on every boot before nixos-container@hive-ci.service. It handles both first-run registration and stale-credential detection. The core admin token is accessed only on the host and never bind-mounted into the container.

Every boot

  1. Check for the core admin token at /var/lib/hyperhive/forge-core-token. If absent (forge still initialising), write TOKEN=placeholder to /run/hive-ci/runner-token and exit; the runner service will fail gracefully until the next boot.
  2. If .runner exists at /var/lib/nixos-containers/hive-ci/var/lib/gitea-runner/hive/.runner, validate the runner ID against GET /api/v1/admin/runners/{id} using the core token:
    • 200: runner still registered — write TOKEN=placeholder to /run/hive-ci/runner-token and exit; runner reuses .runner credentials.
    • 404: runner was deleted from forge (e.g. after a wipe) — delete .runner, proceed to re-registration below.
    • 000 (forge unreachable): keep existing .runner; write placeholder token; the runner itself will surface the connectivity error.
    • other non-200: treat as stale, delete .runner and re-register.
    • malformed .runner (no id field): delete and re-register.
  3. If .runner is absent (first boot or purged above): fetch a fresh registration token from GET /api/v1/admin/runners/registration-token. Retries for 30s in case forge is still starting. Writes TOKEN=<real> to /run/hive-ci/runner-token.
  4. Container starts with /run/hive-ci/runner-token bind-mounted read-only. gitea-actions-runner reads the token, registers itself, and persists credentials to .runner. On subsequent boots step 2 validates these credentials and fast-paths past registration.
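For orientation, a read-only bind mount like the one in step 4 is usually expressed with the standard NixOS containers.<name>.bindMounts option. The fragment below is a sketch of that shape, not the module's actual source; the module is described as setting this up itself, so it should not need to be repeated in a host config.

```nix
{
  # Sketch only: illustrates how the token file could be exposed to the
  # hive-ci container. The attribute name is the mount point inside the
  # container; hostPath is the file written by hive-ci-prefetch.service.
  containers.hive-ci.bindMounts."/run/hive-ci/runner-token" = {
    hostPath = "/run/hive-ci/runner-token";
    # Read-only, so a job inside the container cannot rewrite the token file.
    isReadOnly = true;
  };
}
```

Only the runner token path is mounted in this sketch; the core admin token at /var/lib/hyperhive/forge-core-token stays outside the container, matching the flow above.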

CI workflow

The single CI job is defined in .forgejo/workflows/ci.yml:

name: CI
on:
  pull_request:
    branches: ["**"]
jobs:
  check:
    name: nix flake check
    runs-on: [hive-ci]
    steps:
      - uses: actions/checkout@v3
      - name: check
        run: nix flake check

This runs on every PR, executing all flake checks (treefmt, rustfmt, cargo test, cargo clippy, module evaluation). The workflow deliberately omits --no-build: the checks' derivations are the canonical source of truth, so they are actually built rather than only evaluated.

Security: unsandboxed builds and trusted contributors

hive-ci should only run CI for trusted contributors, because the security boundary is weaker than it looks.

What unsandboxed builds mean

nspawn containers cannot create user namespaces, so nix.settings.sandbox-fallback = true is set in the container, which lets Nix fall back to unsandboxed builds when it cannot set up a sandbox. As a result, every nix build (and nix flake check) runs without a build sandbox: the build process has full access to the container filesystem, network, and any bind-mounts during the build phase.
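For reference, this is roughly how such a setting is expressed for a declarative NixOS container. It is a sketch under the assumption that the container is defined as containers.hive-ci; the module is described as setting this itself, so the fragment is illustrative rather than something to add by hand.

```nix
{
  # Sketch only: NixOS configuration evaluated inside the hive-ci container.
  containers.hive-ci.config = {
    # sandbox-fallback is a standard nix.conf setting: when a build sandbox
    # cannot be created, Nix proceeds without one instead of failing.
    nix.settings.sandbox-fallback = true;
  };
}
```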

A malicious default.nix or build script in a PR can therefore:

The core admin token (forge-core-token) is not bind-mounted into the container. It is accessed only by the host-side hive-ci-prefetch.service before the container starts. A build process can still reach forge over the network, but cannot use the admin token to issue privileged API calls.

Note: nix flake check --no-build (eval-only) reduces the attack surface but does not eliminate it — builtins.fetchGit, builtins.fetchurl, and import-from-derivation can reach the network and filesystem during evaluation. The default CI workflow runs full nix flake check (builds derivations), which is the higher-risk path.

Mitigation

For a hive used by a single operator or a small trusted team, the risk is low — all contributors are already trusted with forge access anyway.

For repos with external contributors or fork PRs:

The current design is appropriate for a trusted-team hive where all contributors have implicit forge access.

The CI runner builds derivations through the host nix-daemon — the hive-ci container shares the host store and has no daemon of its own. Build outputs accumulate in /nix/store with no automatic collection, and a busy CI day can fill the disk until every job fails fast with ENOSPC.

Store GC is a host-level concern, so it belongs in the host's own NixOS configuration, not in the hyperhive service modules — a single service should not reach out and change the host's global nix-daemon options. Add the following to your host config:

{
  # Daily GC: delete store paths not referenced by a live root and older
  # than a day. Keeps the store bounded between builds.
  nix.gc = {
    automatic = true;
    dates = "daily";
    options = "--delete-older-than 1d";
  };

  # Disk-pressure GC: when free space drops below min-free mid-build, the
  # daemon collects garbage up to max-free before continuing. This is the
  # real-time net the daily timer can't provide — a same-day build burst is
  # what fills the disk. Tune to your disk size.
  nix.settings.min-free = 20 * 1024 * 1024 * 1024;  # 20 GiB
  nix.settings.max-free = 50 * 1024 * 1024 * 1024;  # 50 GiB
}

Remote builders: if CI dispatches builds to a remote builder (e.g. via nix.buildMachines / ssh-ng://), the build outputs land in that host's store, so the same GC config should be applied wherever the builder runs — GC on the coordinator host won't reclaim space on the builder.
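One way to keep the two hosts in step is to move the GC settings into a shared module and import it on both. The sketch below assumes a hypothetical file ./store-gc.nix holding the nix.gc, min-free and max-free settings from the host config above, and a placeholder builder hostname; the system and maxJobs values are examples only.

```nix
let
  # Hypothetical shared module file containing the GC settings.
  storeGc = ./store-gc.nix;
in
{
  # Module for the coordinator host (the one running hive-ci).
  coordinator = {
    imports = [ storeGc ];
    nix.distributedBuilds = true;
    nix.buildMachines = [
      {
        hostName = "builder.example"; # placeholder hostname
        protocol = "ssh-ng";
        system = "x86_64-linux"; # example value
        maxJobs = 4; # example value
      }
    ];
  };

  # Module for the remote builder: build outputs land in its store,
  # so it needs the same GC policy.
  builder = {
    imports = [ storeGc ];
  };
}
```

Each attribute is a plain NixOS module value, so coordinator and builder would be added to the module list of their respective host configurations.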
