etcd v3.6 upgrade: Stop zombie cluster members

Evgeny Anikiev December 27, 2025 k8s

If you're planning the move to etcd v3.6, there's a gotcha.

The issue: when upgrading from v3.5 to v3.6, some clusters hit a nasty bug. Members that were removed long ago reappear as "zombies" and rejoin consensus. The cluster goes down and stays down until you manually kill them.

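Why? In v3.5 and earlier, v2store was the source of truth for cluster membership. In v3.6, that job moves to v3store. The problem is that in some older clusters the two stores drifted out of sync, so v3store still lists members that v2store had already dropped. When you upgrade, v3.6 trusts v3store and boom, the dead members come back to life.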
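To make the mismatch concrete, here's a toy model in Python. It is not etcd's actual code: the member names and the find_zombies helper are made up for illustration. It treats each store as a plain set of member names and shows which ones would come back once v3store becomes the source of truth.

```python
def find_zombies(v2store_members: set[str], v3store_members: set[str]) -> set[str]:
    """Members still recorded in v3store that v2store has already removed."""
    return v3store_members - v2store_members


# Hypothetical cluster: "etcd-old" was removed long ago, but only from v2store.
v2store = {"etcd-a", "etcd-b", "etcd-c"}
v3store = {"etcd-a", "etcd-b", "etcd-c", "etcd-old"}

zombies = find_zombies(v2store, v3store)
print(sorted(zombies))  # ['etcd-old']
```

On v3.5 nobody notices "etcd-old" because v2store wins. On v3.6 the same leftover entry is suddenly a voting member.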

The fix is stupidly simple: always upgrade to v3.5.26 or later first. That release automatically syncs membership between v2store and v3store, so the cluster repairs itself while v2store is still the source of truth. Then you upgrade to v3.6 safely.
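If you script your upgrades, a small guard can enforce that rule. Below is a minimal sketch in Python, assuming you already have the running etcd version as a string (from your inventory, your vendor's API, wherever). The function names are hypothetical, and it only accepts plain x.y.z versions, so pre-release suffixes raise an error instead of being guessed at.

```python
import re

MIN_SAFE_V35 = (3, 5, 26)  # per the post: the first v3.5 release with the membership sync


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'v3.5.26' or '3.5.26' into (3, 5, 26)."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def safe_to_upgrade_to_v36(current: str) -> bool:
    """True only if the cluster runs v3.5.26 or a later v3.5 patch release."""
    version = parse_version(current)
    return version[:2] == (3, 5) and version >= MIN_SAFE_V35


print(safe_to_upgrade_to_v36("v3.5.21"))  # False
print(safe_to_upgrade_to_v36("v3.5.26"))  # True
```

Run it per member, not per cluster: one node left behind on an older patch is enough to block the v3.6 step.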

The three known triggers for this mess:

1. Old etcdctl snapshot restore bugs (v3.4 and earlier) that didn't properly remove members

2. --force-new-cluster in v3.5 and earlier sometimes left zombies behind (fixed in v3.5.22)

3. --unsafe-no-fsync enabled (which, yeah, is sketchy anyway): with fsync off, a crash can lose membership changes and leave the stores inconsistent

Real talk: there might be other triggers nobody's found yet. So don't assume you're safe just because you didn't do those three things. The only guarantee is going through v3.5.26.

Safe upgrade path:

→ Upgrade to v3.5.26 or later
→ Wait. Confirm all members are healthy
→ Then upgrade to v3.6
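The "confirm all members are healthy" step can be automated. Here's a rough Python sketch that wraps etcdctl endpoint health --cluster -w json. Assumptions: etcdctl is on your PATH with endpoints and TLS configured via environment variables, and the JSON output is a list of objects with "endpoint" and "health" keys. Check that against your etcdctl version before trusting it.

```python
import json
import subprocess


def unhealthy_endpoints(health_json: str) -> list[str]:
    """Return the endpoints that did not report healthy.

    Assumes the JSON list printed by `etcdctl endpoint health -w json`,
    where each entry has "endpoint" and "health" keys.
    """
    entries = json.loads(health_json)
    return [e.get("endpoint", "<unknown>") for e in entries if not e.get("health", False)]


def check_cluster() -> list[str]:
    """Query every member via etcdctl and return the unhealthy endpoints."""
    result = subprocess.run(
        ["etcdctl", "endpoint", "health", "--cluster", "-w", "json"],
        capture_output=True,
        text=True,
        check=False,  # don't raise on a non-zero exit; inspect the output instead
    )
    if not result.stdout.strip():
        raise RuntimeError(f"etcdctl produced no output: {result.stderr.strip()}")
    return unhealthy_endpoints(result.stdout)
```

Call check_cluster() after the v3.5.26 rollout and only move on to v3.6 when it returns an empty list. It's also worth eyeballing etcdctl member list at this point to make sure the member count is what you expect.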

If v3.5.26 isn't available from your vendor yet, wait. Don't risk it.

Props to Christian Baumann for reporting this—a bug that's been lurking for years.
