On 2026-05-14, PR #1154 merged into gonka-ai/gonka, contributed by @pixelplex. It adds a sticky-routing layer in front of versiond and wires a multi-instance deployment shape into both the production deploy and the local test net.
Some background. Devshards are the off-chain units that coordinate inference work, part of the inference shards architecture introduced in v0.2.11. versiond is the service that handles devshard requests. Until now a deployment ran a single versiond instance behind the proxy.
What changed
- A new versiond-router/ service. It is an nginx proxy that hashes each incoming request on escrowID (the identifier that ties a request to its session) and forwards it to a fixed upstream. Same session, same instance, every time. The service ships with its own sub-Makefile mirroring the existing proxy/ layout, and the top-level Makefile gains versiond-router-build-docker and versiond-router-release targets.
- Production overlay. deploy/join/docker-compose.versiond.yml is a compose overlay that brings up devshard-postgres, two versiond instances (versiond and versiond2, sharing config through an x-versiond YAML anchor), and the new versiond-router. It overrides the proxy environment so /devshard/* traffic routes through the router. The router image is pinned to ghcr.io/product-science/versiond-router:0.2.12. The base docker-compose.yml is left untouched.
- Local test net. local-test-net/docker-compose.versiond.yml mirrors the same shape with three versiond instances behind the router, so testermint exercises the multi-instance path.
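To make "same session, same instance" concrete, here is a minimal Python sketch of hash-based sticky routing. It is an illustration only: the real router is an nginx proxy, and the SHA-256-modulo scheme and the pick_upstream helper below are assumptions, not code from the PR. Only the upstream names versiond and versiond2 come from the overlay.

```python
import hashlib

# Upstream names taken from the overlay description. How the router addresses
# them (hosts, ports) is not covered here and is deliberately left out.
UPSTREAMS = ["versiond", "versiond2"]


def pick_upstream(escrow_id: str, upstreams: list[str]) -> str:
    """Map an escrowID to one upstream, deterministically.

    Assumption: SHA-256 modulo the upstream count stands in for whatever
    hash the nginx router actually applies.
    """
    digest = hashlib.sha256(escrow_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(upstreams)
    return upstreams[index]


if __name__ == "__main__":
    # The repeated escrowID always resolves to the same upstream.
    for escrow_id in ["escrow-a", "escrow-b", "escrow-a"]:
        print(escrow_id, "->", pick_upstream(escrow_id, UPSTREAMS))
```

Because the mapping depends only on the escrowID and the upstream list, no routing table or shared state is needed in the router itself.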
The PR touched 12 files: 441 additions, 43 deletions.
Why it matters
A single versiond instance is both a throughput bottleneck and a single point of failure. But devshard sessions carry state — a request mid-session cannot simply be handed to a different instance that has never seen it. Plain round-robin load balancing would break that.
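The failure mode is easy to reproduce in a toy model. The sketch below is hypothetical throughout: the Instance class is a stand-in with in-memory session state, not versiond's actual storage or API. It sends a session's opening request and its follow-up through a round-robin rotation, then through a router keyed on the session identifier.

```python
from itertools import cycle


class Instance:
    """Hypothetical stand-in for one versiond instance holding session state."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: set[str] = set()

    def open_session(self, escrow_id: str) -> None:
        self.sessions.add(escrow_id)

    def knows(self, escrow_id: str) -> bool:
        return escrow_id in self.sessions


def round_robin_demo() -> bool:
    """Return whether the follow-up request reaches an instance that knows the session."""
    instances = [Instance("versiond"), Instance("versiond2")]
    rotation = cycle(instances)
    next(rotation).open_session("escrow-a")  # first request lands on versiond
    return next(rotation).knows("escrow-a")  # follow-up lands on versiond2


def sticky_demo() -> bool:
    """Same two requests, but routed by a deterministic function of the escrowID."""
    instances = [Instance("versiond"), Instance("versiond2")]

    def route(escrow_id: str) -> Instance:
        # Any deterministic function of the ID works for this illustration.
        return instances[sum(escrow_id.encode("utf-8")) % len(instances)]

    route("escrow-a").open_session("escrow-a")
    return route("escrow-a").knows("escrow-a")


if __name__ == "__main__":
    print("round-robin keeps session:", round_robin_demo())  # False
    print("sticky keeps session:", sticky_demo())            # True
```

Under round-robin the follow-up reaches an instance that never saw the session. Under keyed routing it cannot.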
The router solves this with consistent hashing on escrowID: each session is deterministically pinned to one upstream, so an operator can run several versiond instances side by side and still keep every session coherent. Because the base compose file is untouched, single-instance deployments keep working exactly as before — the multi-instance shape is opt-in through the overlay.
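What the "consistent" part buys is easiest to see when the instance count changes. The following is a generic hash-ring sketch, not nginx's implementation. The HashRing class, the 100 virtual nodes per instance, and the instance name versiond3 are all assumptions made for illustration. It checks which sessions would be reassigned when a third instance joins two existing ones.

```python
import bisect
import hashlib


def _point(label: str) -> int:
    """Place a label on the ring as a 64-bit integer."""
    return int.from_bytes(hashlib.sha256(label.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes: list[str], vnodes: int = 100) -> None:
        self._ring = sorted(
            (_point(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def lookup(self, key: str) -> str:
        # First ring point clockwise from the key, wrapping around at the end.
        idx = bisect.bisect_right(self._points, _point(key)) % len(self._ring)
        return self._ring[idx][1]


def moved_keys(before: HashRing, after: HashRing, keys: list[str]) -> list[str]:
    """Keys whose assigned node differs between the two rings."""
    return [k for k in keys if before.lookup(k) != after.lookup(k)]


if __name__ == "__main__":
    keys = [f"escrow-{i}" for i in range(1000)]
    two = HashRing(["versiond", "versiond2"])
    three = HashRing(["versiond", "versiond2", "versiond3"])  # versiond3 is hypothetical
    moved = moved_keys(two, three, keys)
    print(f"{len(moved)} of {len(keys)} sessions move")
    print("all moved sessions go to the new node:",
          all(three.lookup(k) == "versiond3" for k in moved))
```

In this model, the only sessions that change owner are the ones the new instance takes over, and sessions never shuffle between the existing instances. A plain hash-modulo-N scheme gives no such guarantee when N changes.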
For the test net, the three-instance setup means the routing path gets exercised in regular test runs before it reaches production hosts.