Recovery guide: Consensus failure after missing the upgrade

If your node did not apply the latest upgrade in time, it may halt with a consensus failure at block 2459189. This happens because the node is running an outdated binary that is no longer compatible with the network.

To recover, follow the steps below.

  1. Make sure the node is fully stopped before proceeding.

shell

docker stop node
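To confirm that the container has actually stopped before touching the binary, a small check along these lines may help. This is only a sketch: `is_stopped` is a hypothetical helper name, and it assumes the container is called `node`, as in the command above.

```shell
# Sketch: succeed only if the named container is not running.
# is_stopped is a hypothetical helper, not part of docker or inferenced.
is_stopped() {
  # .State.Running prints "true" or "false".
  # If docker inspect fails (e.g. the container does not exist),
  # there is nothing running under that name, so treat it as stopped.
  state="$(docker inspect -f '{{.State.Running}}' "$1" 2>/dev/null)" || return 0
  [ "$state" != "true" ]
}
```

Usage: `is_stopped node && echo "node is stopped"`. If nothing is printed, the container is still running and should be stopped again before continuing.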

  2. Replace the old binary with the latest released version.

shell

# Download Binary
sudo rm -rf inferenced.zip .inference/cosmovisor/upgrades/v0.2.9-post2/ .inference/data/upgrade-info.json
sudo mkdir -p  .inference/cosmovisor/upgrades/v0.2.9-post2/bin/
wget -q -O  inferenced.zip 'https://github.com/product-science/race-releases/releases/download/release%2Fv0.2.9-post2/inferenced-amd64.zip' && \
echo "8de51bdd1d2c0af5f1da242e10b39ae0ceefd215f94953b9d95e9276f7aa70c7  inferenced.zip" | sha256sum --check && \
sudo unzip -o -j  inferenced.zip -d .inference/cosmovisor/upgrades/v0.2.9-post2/bin/ && \
sudo chmod +x .inference/cosmovisor/upgrades/v0.2.9-post2/bin/inferenced && \
echo "Inference Installed and Verified"

# Link Binary
sudo rm -rf .inference/cosmovisor/current && \
sudo ln -sf upgrades/v0.2.9-post2 .inference/cosmovisor/current && \
echo "--- Final Verification ---" && \
echo "75410178a4c3b867c0047d0425b48f590f39b9e9bc0f3cf371d08670d54e8afe  .inference/cosmovisor/current/bin/inferenced" | sudo sha256sum --check

Verify the binary checksum:

sha256sum .inference/cosmovisor/current/bin/inferenced


The SHA-256 checksum must be 75410178a4c3b867c0047d0425b48f590f39b9e9bc0f3cf371d08670d54e8afe.
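If you prefer not to compare the hash by eye, the comparison can be scripted. The following is a minimal sketch; `verify_sha256` is a hypothetical helper name introduced here for illustration, not something shipped with `inferenced`.

```shell
# Sketch: compare a file's SHA-256 checksum against an expected value.
# verify_sha256 is a hypothetical helper name.
verify_sha256() {
  file="$1"
  expected="$2"
  if [ ! -f "$file" ]; then
    echo "MISSING: $file" >&2
    return 2
  fi
  # sha256sum prints "<hash>  <file>"; keep only the hash field.
  actual="$(sha256sum "$file" | awk '{print $1}')"
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual)" >&2
    return 1
  fi
}
```

Usage: `verify_sha256 .inference/cosmovisor/current/bin/inferenced 75410178a4c3b867c0047d0425b48f590f39b9e9bc0f3cf371d08670d54e8afe`. Add `sudo` in front of `sha256sum` inside the function if the file is not readable by your user. A non-zero exit status means the binary should not be used.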

  3. Because the node stopped mid-consensus, the inference state must be rolled back to the previous block. Run the rollback command:

shell

source config.env && docker compose run --rm --no-deps -ti node /root/.inference/cosmovisor/current/bin/inferenced rollback

  4. Start the node again:

shell

source config.env && docker compose up node --no-deps --force-recreate -d

  5. Check the logs to confirm that the node is producing blocks and no longer failing consensus:

shell

docker logs --tail=100 -f node

In the logs, you should see:

  • the node catching up to the network
  • no repeated consensus failure errors
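The second check can be automated by scanning recent log output. This is a rough sketch: `has_consensus_failure` is a hypothetical helper name, and the pattern `consensus failure` is an assumption about how the error is worded in your logs, so adjust it to match what your node actually prints.

```shell
# Sketch: read log text on stdin and succeed if it mentions a consensus failure.
# has_consensus_failure is a hypothetical helper name.
# The pattern is an assumption about the log wording; matching is case-insensitive.
has_consensus_failure() {
  grep -qi 'consensus failure'
}
```

Usage: `docker logs --tail=100 node 2>&1 | has_consensus_failure && echo "still failing consensus"`. No output means the pattern was not found in the last 100 log lines.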