Gonka's integration test harness picked up a scheduling fix on 2026-06-11. PR #1333, authored by @patimen and merged into the upgrade-v0.2.14 branch, teaches the upgrade rehearsal workflow to place its test upgrade inside a safe inference window instead of at an arbitrary block height. The change adds 227 lines across 4 files.

What changed

Testermint is the Kotlin test harness that exercises a full Gonka cluster, and one of its jobs is rehearsing chain software upgrades end-to-end before they reach MainNet. Until this PR, the rehearsal picked its upgrade height naively: the current block plus a configurable lead (UPGRADE_REHEARSAL_LEAD_BLOCKS, default 80). Depending on where the epoch cycle happened to be, that height could land inside a Proof of Compute or validation stage, which are exactly the windows where an upgrade should not fire.
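
To make the failure mode concrete, here is a minimal sketch of that arithmetic under an assumed, simplified epoch layout: fixed-length epochs that run PoC, then validation, then inference. The names Stage, stageAt and naiveUpgradeHeight and the layout numbers (100-block epochs, 20 blocks of PoC, 10 of validation) are hypothetical illustrations, not Testermint code. Only the lead default of 80 comes from the PR.

```kotlin
// Illustrative sketch only: the epoch layout and every name except the lead
// default are assumptions made for this example, not Testermint code.
const val UPGRADE_REHEARSAL_LEAD_BLOCKS = 80L

enum class Stage { POC, VALIDATION, INFERENCE }

// Hypothetical fixed layout: each epoch starts at a multiple of epochLength and
// runs PoC, then validation, then inference until the next epoch begins.
fun stageAt(
    height: Long,
    epochLength: Long = 100,
    pocLength: Long = 20,
    validationLength: Long = 10,
): Stage {
    require(height >= 0) { "height must be non-negative" }
    val offset = height % epochLength
    return when {
        offset < pocLength -> Stage.POC
        offset < pocLength + validationLength -> Stage.VALIDATION
        else -> Stage.INFERENCE
    }
}

// The pre-fix arithmetic as described above: current block plus a fixed lead.
fun naiveUpgradeHeight(
    currentBlock: Long,
    lead: Long = UPGRADE_REHEARSAL_LEAD_BLOCKS,
): Long = currentBlock + lead

fun main() {
    for (current in listOf(0L, 30L, 40L)) {
        val target = naiveUpgradeHeight(current)
        println("current=$current -> upgrade at $target (${stageAt(target)})")
    }
}
```

In this toy layout, a rehearsal started at block 0 lands in inference, while one started at block 30 or 40 lands in PoC or validation respectively. The outcome depends only on when the rehearsal happened to start.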

The fix replaces that arithmetic with stage-aware scheduling:

  • A new helper, findStageSafeInferenceBlock in testermint/src/main/kotlin/Epochs.kt, takes the earliest acceptable block and returns a StageSafeInferenceBlock that sits inside an inference window with at least 3 blocks of slack (INFERENCE_STAGE_SLACK_BLOCKS) before the next PoC start.
  • When the requested lead overshoots the current window, the helper projects one epoch forward and anchors the upgrade on the next inference window instead of failing.
  • The existing safeForInference flag in data/epoch.kt now reuses the shared slack constant instead of a hard-coded 3, so the two checks cannot drift apart.
  • A new 155-line test file, EpochSchedulingHelpersTest.kt, pins down four scenarios: a lead that fits the current window, an unsafe window tail that gets skipped, scheduling from the validation phase, and a long lead that lands past the next epoch.
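
The scheduling rule in the bullets above can be sketched roughly as follows. This is not the code from Epochs.kt: the real signature and fields of findStageSafeInferenceBlock and StageSafeInferenceBlock are not shown in this article, so EpochLayout, SafeBlock and findSafeInferenceBlockSketch are hypothetical stand-ins. The sketch assumes fixed-length epochs, and it reads "3 blocks of slack" as "the next PoC start is at least 3 blocks away". Only the constant name and its value of 3 come from the PR.

```kotlin
// Slack constant named in the PR; everything else here is an assumption.
const val INFERENCE_STAGE_SLACK_BLOCKS = 3L

// Hypothetical layout: epochs of equal length, each beginning at a PoC start,
// with inference running from inferenceOffset until the next PoC start.
data class EpochLayout(
    val firstPocStart: Long,
    val epochLength: Long,
    val inferenceOffset: Long,
) {
    init {
        require(epochLength > 0) { "epochLength must be positive" }
        require(inferenceOffset >= 0 && inferenceOffset <= epochLength - INFERENCE_STAGE_SLACK_BLOCKS) {
            "inference window must contain at least one safe block"
        }
    }
}

// Hypothetical result type: the chosen height and the epoch it falls in.
data class SafeBlock(val height: Long, val epochIndex: Long)

fun findSafeInferenceBlockSketch(layout: EpochLayout, earliest: Long): SafeBlock {
    require(earliest >= layout.firstPocStart) { "earliest precedes the first epoch" }

    // Epoch that contains the earliest acceptable block.
    val epochIndex = (earliest - layout.firstPocStart) / layout.epochLength
    val pocStart = layout.firstPocStart + epochIndex * layout.epochLength
    val inferenceStart = pocStart + layout.inferenceOffset
    // Last height that still leaves the required slack before the next PoC.
    val lastSafe = pocStart + layout.epochLength - INFERENCE_STAGE_SLACK_BLOCKS

    return if (earliest <= lastSafe) {
        // Fits this epoch: wait for inference to open if it has not yet.
        SafeBlock(maxOf(earliest, inferenceStart), epochIndex)
    } else {
        // Unsafe tail of the window: project one epoch forward.
        SafeBlock(inferenceStart + layout.epochLength, epochIndex + 1)
    }
}

fun main() {
    val layout = EpochLayout(firstPocStart = 0, epochLength = 100, inferenceOffset = 30)
    for (earliest in listOf(50L, 98L, 25L, 250L)) {
        println("earliest=$earliest -> ${findSafeInferenceBlockSketch(layout, earliest)}")
    }
}
```

With this assumed layout, the four inputs mirror the four tested scenarios: 50 fits the current window and is kept, 98 falls in the unsafe tail and moves to 130, 25 sits in validation and waits until 30, and 250 lands in a later epoch's window unchanged.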

Why it matters

Every Gonka epoch cycles through stages: a Proof of Compute sprint where hosts prove their GPU capacity, a validation phase where those proofs are checked, and a long inference window where the network serves regular AI workloads. Real on-chain upgrades activate during inference, when nodes can restart without disrupting proof deadlines. A rehearsal that fired mid-PoC was testing a scenario the network would never deliberately execute, and it could fail for reasons that had nothing to do with the upgrade being rehearsed.

With v0.2.14 preparation underway, the rehearsal workflow is part of the release gate. Deterministic, stage-safe scheduling removes a source of flaky failures from that gate; per the PR description, the upgrade rehearsal workflow runs green on this branch.