diff --git a/Filer-Active-Active-cross-cluster-continuous-synchronization.md b/Filer-Active-Active-cross-cluster-continuous-synchronization.md
index 3fc9f72..28dca81 100644
--- a/Filer-Active-Active-cross-cluster-continuous-synchronization.md
+++ b/Filer-Active-Active-cross-cluster-continuous-synchronization.md
@@ -37,9 +37,9 @@ If there are 3 or more clusters, you can choose fully connected setup or chained
 
 In a sense, you can mix and match to setup filer synchronization as you wish.
 
-### Fully Connected Setup
+### Fully Connected Setup?
 
-Run one `weed filer.sync` for each pair of clusters. The fully connected topology provides redundancy in case of network failures.
+It is tempting to create a fully connected network topology. E.g., run one `weed filer.sync` for each pair of clusters. The fully connected topology may seem to be able to provide redundancy in case of network failures.
 
 ```
 cluster1 <-- filer.sync --> cluster2
@@ -47,6 +47,14 @@ cluster2 <-- filer.sync --> cluster3
 cluster3 <-- filer.sync --> cluster1
 ```
 
+However, this topology has a loop.
+
+Every filer will leave a signature on each message. The `filer.sync` uses the signatures to avoid processing the same message twice. But for any node within a loop, the same message can come from two different neighbors. So this mechanism cannot identify the duplicate messages.
+
+Because most metadata messages are idempotent, the network loop is not efficient but still works OK.
+
+But for directory renaming, the execution order matters. So the loop should be avoided, or the directories will be inconsistent.
+
 ### Chained Setup
 ```
 cluster1 <-- filer.sync --> cluster2 <-- filer.sync --> cluster3
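
Note on the second hunk: the loop argument can be checked with a toy model. The sketch below is not SeaweedFS code. It assumes that every copy of a metadata event carries the set of cluster signatures it has already passed through, and that a sync link skips a peer whose signature is already on that copy. The function name `simulate` and the cluster labels are hypothetical, chosen only for illustration.

```python
from collections import deque


def simulate(links, origin, nodes):
    """Count how many times each node applies one metadata event.

    links:  (a, b) pairs, each modelling one bidirectional sync link (assumed model).
    origin: the node where the event is first written.
    """
    neighbors = {n: set() for n in nodes}
    for a, b in links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    applied = {n: 0 for n in nodes}
    applied[origin] = 1  # the original write counts as one application

    # Each queue item is (node holding a copy, signatures on that copy).
    queue = deque([(origin, frozenset([origin]))])
    while queue:
        node, sigs = queue.popleft()
        for peer in sorted(neighbors[node]):
            if peer in sigs:
                continue  # peer already signed this copy, so it is skipped
            applied[peer] += 1
            queue.append((peer, sigs | {peer}))
    return applied


nodes = ["cluster1", "cluster2", "cluster3"]
triangle = [("cluster1", "cluster2"), ("cluster2", "cluster3"), ("cluster3", "cluster1")]
chain = [("cluster1", "cluster2"), ("cluster2", "cluster3")]

print(simulate(triangle, "cluster1", nodes))
print(simulate(chain, "cluster1", nodes))
```

Under these assumptions, the triangle delivers the event to `cluster2` and `cluster3` twice each (once directly and once via the other neighbor), while the chain delivers it exactly once per cluster. A repeated delivery is harmless for idempotent messages, but it is the situation the new text warns about for directory renames.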