Two possible data inconsistency issues in etcd v3.4.[20-21] and v3.5


Benjamin Wang

Nov 21, 2022, 3:00:47 PM
to etcd...@googlegroups.com, d...@kubernetes.io

Tl;dr - Two issues below:  

1 - etcd v3.5.[0-5] data inconsistency issue for a case when etcd crashes while processing a defragmentation operation

2 - etcd v3.4.[20-21] and v3.5.5 data inconsistency issue for a case when auth is enabled and a new member is added to the cluster

 

Following the recently discovered consistency problems in etcd 3.5, the etcd maintainers are investing in extensive testing of data consistency under different etcd crash modes. As part of this process we discovered the following issues:

 

Issue 1: etcd v3.5.[0-5] data inconsistency issue for a case when etcd crashes while processing a defragmentation operation

If etcd crashes during an online defragmentation operation, then when the instance starts again it might reapply some entries which have already been applied. This can result in the member's data becoming inconsistent with that of the other members.

This issue does not occur when performing the defragmentation operation offline using etcdutl.

Usually there is no data loss, and clients can still get the latest correct data; the only symptom is that the problematic member's revision may be slightly larger than that of the other members. However, if etcd reapplies conditional transactions, the replay can cause real data inconsistency (see the sketch below). Please see the discussion in issue 14685 (fixed in #14730) for more details.
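To make the conditional-transaction risk concrete, here is a minimal sketch using the clientv3 Txn API. The endpoint, key names and values are hypothetical placeholders, purely for illustration:

```go
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Endpoint is an illustrative placeholder.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"http://127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatal(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()

	// Hypothetical compare-and-swap: bump "counter" from "1" to "2".
	// Every member that applies this entry once ends up with counter=2.
	// If a member that crashed mid-defragmentation replays the same entry
	// after restart, the If clause now evaluates to false, the Else branch
	// runs, and that member's data diverges from its peers.
	resp, err := cli.Txn(ctx).
		If(clientv3.Compare(clientv3.Value("counter"), "=", "1")).
		Then(clientv3.OpPut("counter", "2")).
		Else(clientv3.OpPut("counter-conflict", "replayed")).
		Commit()
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("txn succeeded=%v", resp.Succeeded)
}
```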

 

The affected versions are 3.5.0, 3.5.1, 3.5.2, 3.5.3, 3.5.4 and 3.5.5. The issue was resolved in 3.5.6.

Recommendations:

  1. Enable the data corruption check with the `--experimental-initial-corrupt-check` and `--experimental-compact-hash-check-enabled` flags. Please refer to enabling-data-corruption-detection. This helps even if you enable it after a crash (a client-side spot check is sketched after this list).
  2. If you are running a multi-node etcd 3.5.[0-5] cluster, please do NOT use etcdctl or the clientv3 API to perform defragmentation. If you must run online defragmentation anyway, take extra precautions to prevent a node crash in the middle of the operation, such as checking that the node has plenty of memory available, and do not SIGKILL etcd during defragmentation. The safer approach is to defragment offline with etcdutl, but this requires taking each member offline one at a time. If you already ran into this issue, please follow https://etcd.io/docs/v3.5/op-guide/data_corruption/#restoring-a-corrupted-member to restore the problematic member.
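
In addition to the server-side flags above, the clientv3 Maintenance API exposes HashKV, which lets you compare the keyspace hash reported by each member. The following is only a rough sketch under the assumption of a three-member cluster with the placeholder endpoints shown; hashes are only comparable when computed at the same revision, so a shared revision is taken from a linearizable read first.

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Placeholder endpoints for an illustrative three-member cluster.
	endpoints := []string{"http://10.0.0.1:2379", "http://10.0.0.2:2379", "http://10.0.0.3:2379"}

	cli, err := clientv3.New(clientv3.Config{Endpoints: endpoints, DialTimeout: 5 * time.Second})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Pick one revision to hash at, taken from a linearizable read, so the
	// per-member hashes are comparable. The key itself is irrelevant; only
	// the response header's revision is used.
	getResp, err := cli.Get(ctx, "any-key")
	if err != nil {
		panic(err)
	}
	rev := getResp.Header.Revision

	// Ask each member directly for its keyspace hash at that revision.
	// Differing hashes indicate the kind of divergence described above.
	for _, ep := range endpoints {
		resp, err := cli.HashKV(ctx, ep, rev)
		if err != nil {
			fmt.Printf("%s: HashKV failed: %v\n", ep, err)
			continue
		}
		fmt.Printf("%s: hash=%d at revision %d (compact revision %d)\n",
			ep, resp.Hash, rev, resp.CompactRevision)
	}
}
```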

 

Issue 2: etcd v3.4.[20-21] and v3.5.5 data inconsistency issue for a case when auth is enabled and a new member is added to the cluster

 

This issue only affects etcd clusters where auth is enabled.

 

The recent issue 14571 surfaced a data inconsistency problem for a specific case, as detailed in this note. When auth is enabled, a newly added member might fail to apply data due to permission-denied errors and eventually become inconsistent with the other members.

 

In this situation, clients (e.g. etcdctl) connected to the problematic member (the newly added member) will fail to read or write any data due to permission denied. Restarting the new etcd member can resolve the permission failures, but afterwards clients might get stale data from that member.
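
One rough way to spot such a lagging member from the client side is to query each endpoint's Status and compare the reported revisions and applied indexes; a member that stays well behind its peers is a candidate for the stale reads described above. A minimal sketch follows; the endpoints and credentials are placeholders, not a recommended procedure.

```go
package main

import (
	"context"
	"fmt"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	// Placeholder endpoints; the last one stands in for the newly added member.
	endpoints := []string{"http://10.0.0.1:2379", "http://10.0.0.2:2379", "http://10.0.0.4:2379"}

	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   endpoints,
		DialTimeout: 5 * time.Second,
		Username:    "root",     // illustrative; credentials are needed when auth is enabled
		Password:    "password", // illustrative
	})
	if err != nil {
		panic(err)
	}
	defer cli.Close()

	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()

	// Query each member directly and print the revision and raft indexes it
	// reports. A member whose revision keeps lagging behind its peers may be
	// serving stale data.
	for _, ep := range endpoints {
		st, err := cli.Status(ctx, ep)
		if err != nil {
			fmt.Printf("%s: status failed: %v\n", ep, err)
			continue
		}
		fmt.Printf("%s: revision=%d raftIndex=%d raftAppliedIndex=%d\n",
			ep, st.Header.Revision, st.RaftIndex, st.RaftAppliedIndex)
	}
}
```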

 

Please see the discussion in issue 14571 for more details.

 

The affected versions are 3.4.20, 3.4.21 and 3.5.5. The issue was resolved in 3.4.22 and 3.5.6.

 

Recommendations:

  1. If you are on an old version and auth is enabled, do not upgrade to 3.4.20, 3.4.21 or 3.5.5.
  2. If you are already on the affected versions and auth is enabled, then please do not add new members until you upgrade to 3.4.22 or 3.5.6.
  3. If you already ran into this issue, then please follow https://etcd.io/docs/v3.5/op-guide/data_corruption/#restoring-a-corrupted-member to restore the problematic member.

 

Thanks to veshij@, who reported and resolved this issue.

 

Thanks,
etcd-maintainers

 

Alexis Richardson

Nov 21, 2022, 3:25:24 PM
to wac...@vmware.com, etcd...@googlegroups.com, d...@kubernetes.io
How many people are working on this issue?



Benjamin Wang

Nov 21, 2022, 3:33:08 PM
to dev, ale...@weave.works, etcd...@googlegroups.com, d...@kubernetes.io, Benjamin Wang
Both issues have already been resolved.

Issue 1:
    The affected versions are 3.5.0, 3.5.1, 3.5.2, 3.5.3, 3.5.4 and 3.5.5. The issue was resolved in 3.5.6.
Issue 2:
    The affected versions are 3.4.20, 3.4.21 and 3.5.5. The issue was resolved in 3.4.22 and 3.5.6.

Marek Siarkowicz

Nov 22, 2022, 9:16:09 AM
to dev, wac...@vmware.com, ale...@weave.works, etcd...@googlegroups.com, d...@kubernetes.io
One positive note.

The first data inconsistency issue, 14685, was discovered by a new linearizability testing framework developed by the etcd maintainers.
This is a huge step, as all previous issues were discovered via user reports rather than by testing.
By expanding test scenarios we can now find issues proactively, instead of just reactively waiting for users to report them.
There is still a lot of work to make the framework fully operational, but it has already shown the ability to both reproduce historical issues and find new, previously unknown ones.
 
If you are interested in validating the correctness of distributed systems, or want to help us improve etcd reliability, please follow the work at #14045.
New contributors are welcome.

Thanks,
Marek