Release recovery
What to do when a release goes wrong — a CI build fails, a publish half-completes, a tag needs re-running. Companion to Releasing.
The one rule
Published versions are immutable. Once a file exists on PyPI or a version exists on npm, you can never overwrite it. So every recovery is one of two moves:
- Retry the same version: safe, and it only fills gaps. It requires the artifacts to be byte-identical (they are: same tag → same commit → same build). This is what `skip-existing` on PyPI and re-running the failed jobs are for.
- Bump to a new version: required whenever the content must change. You cannot "fix and re-publish `0.3.0`"; you publish `0.3.1`.
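The two moves can be sketched as a small planning function. This is a conceptual model only, not how PyPI or the upload step actually decides what to skip; the function name `plan_retry` and the hash comparison are assumptions made for illustration.

```python
import hashlib


def plan_retry(built: dict[str, bytes], published: dict[str, str]) -> list[str]:
    """Return the filenames a retry of the same version should upload.

    built:     filename -> artifact bytes produced by this build
    published: filename -> sha256 hex digest of the file already on the index
    (Hypothetical inputs; a real index is queried over the network.)
    """
    to_upload = []
    for name, data in sorted(built.items()):
        digest = hashlib.sha256(data).hexdigest()
        if name not in published:
            # A gap left by the failed run: safe to fill.
            to_upload.append(name)
        elif published[name] != digest:
            # Same filename, different bytes: a retry cannot fix this.
            raise ValueError(f"{name} differs from the published file; bump the version")
        # Otherwise the identical file is already out, so skip it.
    return to_upload
```

A retry only ever returns files that are missing; any attempt to change an existing file surfaces as an error that points at the second move, a new version.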
Retrying: re-run vs re-dispatch
The tag is cut by CI (the prepare job), not pushed by hand — so recovery is about the
run, not the tag:
- Retry the same version → "Re-run failed jobs" on the failed run (Actions → the run → Re-run failed jobs). Same tag → same commit → byte-identical artifacts; PyPI `skip-existing` fills whatever didn't upload and the GitHub Release step is idempotent. This is the move for a transient failure (a flaky network, a secret you just added).
- A fresh dispatch always cuts a NEW version. `prepare` computes `bump(last tag, bump)`, and the failed run already pushed `vX.Y.Z`, so re-dispatching yields `vX.Y.(Z+1)` (or higher); it will not retry `X.Y.Z`. Use a fresh dispatch only when you need to change shipped content (immutability rule): you cannot fix and re-publish `0.3.0`; you release `0.3.1`.
A `git push` can't re-trigger anything (there is no tag trigger); everything is a `workflow_dispatch` or a job re-run from the Actions UI. If `prepare` failed before tagging, nothing was pushed: just fix and dispatch again (it computes the same version).
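Why a fresh dispatch never retries the same version is easy to see in a minimal sketch of the version computation. The real `prepare` job may compute this differently; the function below assumes plain `vMAJOR.MINOR.PATCH` tags and the bump names `major`, `minor` and `patch`, none of which are confirmed by the workflow itself.

```python
def bump(last_tag: str, part: str) -> str:
    """Compute the next tag from the last pushed tag (illustrative only)."""
    major, minor, patch = (int(n) for n in last_tag.removeprefix("v").split("."))
    if part == "major":
        return f"v{major + 1}.0.0"
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump: {part!r}")


# The failed run already pushed v0.3.0, so a fresh dispatch moves past it.
after_failed_publish = bump("v0.3.0", "patch")

# If prepare failed before tagging, the last tag is unchanged,
# so dispatching again computes the same version as before.
first_attempt = bump("v0.2.9", "minor")
second_attempt = bump("v0.2.9", "minor")
```

Because the input is the last tag rather than the last successful release, the only way to stay on `X.Y.Z` is to re-run jobs inside the run that created that tag.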
Why the pipeline is recoverable by design
- npm publishes before PyPI. In `release.yml`, `pypi-publish` needs `npm-publish`, so an npm failure aborts the run before any immutable PyPI version is burned. (npm runs in parallel with the slow wheel builds, so this costs almost no wall-clock.)
- PyPI uses `skip-existing: true`: re-runs never error on already-uploaded files.
- Build jobs gate the publish jobs: a build break can never half-publish.
- The GitHub Release is gated on both publishes: the "release" only appears once the packages are actually out.
- The `+dev` finalize is gated on both publishes: `main` returns to the dev marker only after a real publish; if only `finalize` fails, the release itself already succeeded.
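The gating described above can be modelled as a small dependency graph. The sketch below is a simplified stand-in for `release.yml`, not a copy of it: the expanded build job names (`build-nvenc`, `build-vtenc`) and the exact `needs` edges are assumptions, and the `prepare` and `gate` jobs are left out.

```python
# job -> jobs it needs (assumed edges, simplified from the description above)
NEEDS: dict[str, list[str]] = {
    "build-rfb": [],
    "build-nvenc": [],
    "build-vtenc": [],
    "npm-publish": [],
    "pypi-publish": ["npm-publish", "build-rfb", "build-nvenc", "build-vtenc"],
    "github-release": ["npm-publish", "pypi-publish"],
    "finalize": ["npm-publish", "pypi-publish"],
}


def blocked_by(failed: str) -> set[str]:
    """Return every job that cannot run because `failed` did not succeed."""
    blocked: set[str] = set()
    changed = True
    while changed:  # repeat until no new job becomes blocked
        changed = False
        for job, deps in NEEDS.items():
            if job not in blocked and any(d == failed or d in blocked for d in deps):
                blocked.add(job)
                changed = True
    return blocked
```

Under these assumed edges, `blocked_by("npm-publish")` contains `pypi-publish`, which is the property that keeps a PyPI version from being burned, while `blocked_by("finalize")` is empty because nothing depends on the `+dev` bump.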
Failure playbook
| Where it failed | State | Do this |
|---|---|---|
| `gate` (CI not green / not run / timed out) | Nothing done | This is a pre-flight: the commit's `ci.yml` isn't green. Fix CI and re-dispatch, wait for CI then re-dispatch, or (if you accept the risk) re-dispatch with `skip_ci_check=true`. |
| A build job (`build-rfb/nvenc/vtenc`) | Nothing published | Fix the cause, then Re-run failed jobs (transient), or dispatch a fresh release (a code fix needs a new version). Safe: no artifacts are out. |
| `npm-publish` | Maybe some npm packages out | Re-run failed jobs. Already-published packages error (npm has no skip-existing), so publish only the missing ones by hand: `pnpm --dir widgets --filter "@habemus-papadum/rfb-<pkg>" publish --provenance --no-git-checks`. PyPI hasn't run yet (it's gated on npm), so nothing to undo there. |
| `pypi-publish` (e.g. rfb ok, nvenc failed) | Some PyPI files out (immutable) | Re-run failed jobs. `skip-existing` skips the uploaded files and completes the rest. Never rebuild-and-overwrite an existing file. |
| `github-release` | Packages published, no Release | Re-run the job, or locally: `gh release create vX.Y.Z --generate-notes` (or `gh release edit vX.Y.Z` if a draft exists). |
| `finalize` (the `+dev` bump) | Release is fine | Re-run the `finalize` job, or locally: `python3 scripts/_versioning.py set X.Y.Z+dev` (X.Y.Z = the release just cut) → `uv lock` → commit + push `main`. |
| Need to change shipped content | Version burned | Dispatch a fresh release (pick the bump); `prepare` computes the next version from the last tag. Do not attempt to overwrite. Avoid `npm unpublish` (72-hour window, discouraged, breaks anyone who installed it). |
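For the `npm-publish` row, working out which packages still need a manual publish is a set difference. The helper below is a sketch that builds the command from the table for each missing package; the function name and the package names in the usage lines are hypothetical placeholders, and the list of already-published packages is assumed to come from checking the registry by hand.

```python
def publish_commands(expected: list[str], published: set[str]) -> list[list[str]]:
    """Build one pnpm invocation per package that is not yet on npm."""
    return [
        ["pnpm", "--dir", "widgets", "--filter", name,
         "publish", "--provenance", "--no-git-checks"]
        for name in expected
        if name not in published  # already-published versions would error
    ]


# Hypothetical package names, for illustration only.
expected = ["@habemus-papadum/rfb-one", "@habemus-papadum/rfb-two"]
commands = publish_commands(expected, published={"@habemus-papadum/rfb-one"})
```

Each returned list is an argument vector that can be reviewed before running, which suits a step that cannot be undone.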
Auth gotchas (token publishing)
CI currently publishes with token secrets (not OIDC — the maintainer is locked out of PyPI 2FA; see the migration proposal).
If pypi-publish fails with a 403 / auth error, the PYPI_API_TOKEN secret is missing,
revoked, or not scoped to that project — check gh secret list and that the token covers all
three projects. (Once trusted publishing is restored, an OIDC error there instead means the
publisher isn't registered — add it per the migration proposal, then re-run.)
If `npm-publish` fails on provenance (a missing "repository" field or an OIDC error), the prerequisites are: the published `package.json` files must carry a `repository` field, the repo must be public, and the job must have `permissions: id-token: write`. All three are already set, so a failure here usually means a missing or incorrectly scoped `NPM_TOKEN` secret (it needs write access on the whole `@habemus-papadum` scope).
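The `repository` prerequisite is the one that is easy to verify locally before re-running. A minimal sketch, assuming the manifests have already been parsed into dictionaries; the function name and the paths in the usage lines are hypothetical.

```python
import json


def missing_repository(manifests: dict[str, dict]) -> list[str]:
    """Return the manifest paths that lack a non-empty "repository" field."""
    return sorted(
        path for path, manifest in manifests.items()
        if not manifest.get("repository")
    )


# Hypothetical manifests, parsed the way json.load() would return them.
manifests = {
    "widgets/a/package.json": json.loads('{"name": "a", "repository": {"type": "git"}}'),
    "widgets/b/package.json": json.loads('{"name": "b"}'),
}
problems = missing_repository(manifests)
```

An empty result rules out the manifest side of a provenance failure and points back at the token or the job permissions.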
Break-glass fallback
If CI is unavailable, publish out-of-band from a maintainer box with `scripts/publish.sh` (see Break-glass fallback). The same immutability rule applies: use `SKIP_*` / `*_WHEEL_DIR` to complete a partial release; bump the version to change content.