Gonka's integration test harness picked up a scheduling fix on 2026-06-11. PR #1333, authored by @patimen and merged into the upgrade-v0.2.14 branch, teaches the upgrade rehearsal workflow to place its test upgrade inside a safe inference window instead of at an arbitrary block height. The change adds 227 lines across 4 files.
What changed
Testermint is the Kotlin test harness that exercises a full Gonka cluster, and one of its jobs is rehearsing chain software upgrades end-to-end before they reach MainNet. Until this PR, the rehearsal picked its upgrade height naively: the current block plus a configurable lead (UPGRADE_REHEARSAL_LEAD_BLOCKS, default 80). Depending on where the epoch cycle happened to be, that height could land inside a Proof of Compute or validation stage — exactly the windows where an upgrade should not fire.
The fix replaces that arithmetic with stage-aware scheduling:
- A new helper, findStageSafeInferenceBlock in testermint/src/main/kotlin/Epochs.kt, takes the earliest acceptable block and returns a StageSafeInferenceBlock that sits inside an inference window with at least 3 blocks of slack (INFERENCE_STAGE_SLACK_BLOCKS) before the next PoC start.
- When the requested lead overshoots the current window, the helper projects one epoch forward and anchors the upgrade on the next inference window instead of failing.
- The existing safeForInference flag in data/epoch.kt now reuses the shared slack constant instead of a hard-coded 3, so the two checks cannot drift apart.
- A new 155-line test file, EpochSchedulingHelpersTest.kt, pins down four scenarios: a lead that fits the current window, an unsafe window tail that gets skipped, scheduling from the validation phase, and a long lead that lands past the next epoch.
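To make the difference between the old arithmetic and stage-aware scheduling concrete, here is a minimal Kotlin sketch. It is not the code from the PR: EpochLayout, its fields, SLACK_BLOCKS, naiveUpgradeHeight, isStageSafe and stageSafeHeight are all hypothetical names invented for illustration, and the sketch assumes a simplified epoch in which PoC and validation occupy a fixed prefix and every epoch has the same length. It also loops over as many epochs as needed, which may differ from how the real helper projects forward.

```kotlin
// Hypothetical, simplified epoch model; the real Testermint types are not shown in the source.
data class EpochLayout(
    val pocStart: Long,        // first block of the epoch, where PoC begins
    val inferenceStart: Long,  // first block after PoC and validation have finished
    val epochLength: Long,     // blocks from one PoC start to the next (assumed constant)
) {
    val nextPocStart: Long get() = pocStart + epochLength

    // Same layout, moved forward by a whole number of epochs.
    fun shifted(epochs: Long): EpochLayout = copy(
        pocStart = pocStart + epochs * epochLength,
        inferenceStart = inferenceStart + epochs * epochLength,
    )
}

// Plays the role the article attributes to INFERENCE_STAGE_SLACK_BLOCKS (assumed value: 3).
const val SLACK_BLOCKS = 3L

// The old behaviour as described: current block plus a fixed lead, ignoring stages.
fun naiveUpgradeHeight(currentBlock: Long, leadBlocks: Long): Long = currentBlock + leadBlocks

// A block is safe if it is in the inference window and at least SLACK_BLOCKS before the next PoC.
fun isStageSafe(block: Long, epoch: EpochLayout): Boolean =
    block >= epoch.inferenceStart && block <= epoch.nextPocStart - SLACK_BLOCKS

// Earliest stage-safe block that is not before `earliest`.
fun stageSafeHeight(earliest: Long, current: EpochLayout): Long {
    require(current.epochLength > 0) { "epochLength must be positive" }
    require(current.inferenceStart <= current.nextPocStart - SLACK_BLOCKS) {
        "inference window is too short to leave any slack"
    }
    var epoch = current
    // If `earliest` falls past the last safe block of this window, move to the next epoch.
    while (earliest > epoch.nextPocStart - SLACK_BLOCKS) {
        epoch = epoch.shifted(1)
    }
    // Inside PoC or validation: wait for inference to start. Otherwise keep `earliest`.
    return maxOf(earliest, epoch.inferenceStart)
}
```

With an epoch that starts at block 100, opens inference at 130 and runs 100 blocks, a rehearsal at block 125 with a lead of 80 would naively target block 205, which falls in the next epoch's PoC stage. The sketch moves that target to block 230, the start of the next inference window.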
Why it matters
Every Gonka epoch cycles through stages: a Proof of Compute sprint where hosts prove their GPU capacity, a validation phase where those proofs are checked, and a long inference window where the network serves regular AI workloads. Real on-chain upgrades activate during inference, when nodes can restart without disrupting proof deadlines. A rehearsal that fired mid-PoC was testing a scenario the network would never deliberately execute — and could fail for reasons that had nothing to do with the upgrade being rehearsed.
With v0.2.14 preparation underway, the rehearsal workflow is part of the release gate. Deterministic, stage-safe scheduling removes a source of flaky failures from that gate; per the PR description, the upgrade rehearsal workflow runs green on this branch.