On 2026-05-14, PR #1154 merged into gonka-ai/gonka, contributed by @pixelplex. It adds a sticky-routing layer in front of versiond and wires a multi-instance deployment shape into both the production deploy and the local test net.

Some background. Devshards are the off-chain units that coordinate inference work, part of the inference shards architecture introduced in v0.2.11. versiond is the service that handles devshard requests. Until now a deployment ran a single versiond instance behind the proxy.

What changed

  • A new versiond-router/ service. It is an nginx proxy that hashes each incoming request on escrowID — the identifier that ties a request to its session — and forwards it to a fixed upstream. Same session, same instance, every time. The service ships with its own sub-Makefile mirroring the existing proxy/ layout, and the top-level Makefile gains versiond-router-build-docker and versiond-router-release targets.
  • Production overlay. deploy/join/docker-compose.versiond.yml is a compose overlay that brings up devshard-postgres, two versiond instances (versiond and versiond2, sharing config through an x-versiond YAML anchor), and the new versiond-router. It overrides the proxy environment so /devshard/* traffic routes through the router. The router image is pinned to ghcr.io/product-science/versiond-router:0.2.12. The base docker-compose.yml is left untouched.
  • Local test net. local-test-net/docker-compose.versiond.yml uses the same shape as the production overlay, but with three versiond instances behind the router instead of two, so testermint exercises the multi-instance path.

The PR touched 12 files: 441 additions, 43 deletions.

Why it matters

A single versiond instance is both a throughput bottleneck and a single point of failure. But devshard sessions carry state: a request mid-session cannot simply be handed to a different instance that has never seen the session. Plain round-robin load balancing would do exactly that, and the session would break.
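
The failure mode can be illustrated with a toy simulation. Everything in it is hypothetical: the Instance class, the "escrow-123" identifier, and the in-memory session set are stand-ins for whatever state versiond actually keeps, not code from the PR.

```python
from itertools import cycle


class Instance:
    """Toy stand-in for a stateful backend; not the real versiond."""

    def __init__(self, name):
        self.name = name
        self.sessions = set()  # escrow IDs this instance has already seen

    def handle(self, escrow_id, first_request):
        if first_request:
            self.sessions.add(escrow_id)
            return "ok"
        # A follow-up request only works where the session state lives.
        return "ok" if escrow_id in self.sessions else "unknown session"


def round_robin_demo():
    instances = [Instance("versiond"), Instance("versiond2")]
    rotation = cycle(instances)  # plain round-robin: the session is ignored
    results = []
    for i in range(3):
        target = next(rotation)
        outcome = target.handle("escrow-123", first_request=(i == 0))
        results.append((target.name, outcome))
    return results


print(round_robin_demo())
```

In this sketch the second request of the session lands on an instance that never saw the first one and is rejected, even though both instances are healthy.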

The router solves this with consistent hashing on escrowID: each session is deterministically pinned to one upstream, so an operator can run several versiond instances side by side and still keep every session coherent. Because the base compose file is untouched, single-instance deployments keep working exactly as before — the multi-instance shape is opt-in through the overlay.
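
As a rough illustration of the idea, here is a minimal consistent-hash ring keyed on an escrow ID. It is a sketch under stated assumptions, not the router's nginx configuration (which is not reproduced here): the HashRing class, the SHA-256 choice, the virtual-node count, and the sample escrow IDs are all hypothetical.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # Stable across processes, unlike Python's built-in hash() for strings.
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, upstreams, vnodes=100):
        # Each upstream is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (_hash(f"{name}#{i}"), name)
            for name in upstreams
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def route(self, escrow_id: str) -> str:
        # Pick the first ring point clockwise from the key, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(escrow_id))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["versiond", "versiond2"])
target = ring.route("escrow-123")
# The same escrow ID resolves to the same upstream on every call.
assert all(ring.route("escrow-123") == target for _ in range(5))
```

One property worth noting in this sketch: if a third upstream is added to the ring, a session either stays where it was or moves to the new upstream; it never shuffles between the existing ones. A plain "hash modulo instance count" scheme does not give that guarantee.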

For the test net, the three-instance setup means the routing path gets exercised in regular test runs before it reaches production hosts.