I experimented with volume replication but could not get it working, or perhaps I misunderstood something. These are the steps to reproduce the problem (Ubuntu, with SeaweedFS running in Docker containers):
Initialize a test directory:
rm -fr /tmp/weedtest/ ; mkdir -p /tmp/weedtest/m1 /tmp/weedtest/m2 /tmp/weedtest/m3 /tmp/weedtest/v1 /tmp/weedtest/v2 /tmp/weedtest/v3
Start one master server and three volume servers:
docker run -it --rm --name master1 -v /tmp/weedtest/m1:/data weed master -defaultReplication=001
docker run -it --rm --name volume1 -v /tmp/weedtest/v1:/data --link master1:master weed volume
docker run -it --rm --name volume2 -v /tmp/weedtest/v2:/data --link master1:master weed volume
docker run -it --rm --name volume3 -v /tmp/weedtest/v3:/data --link master1:master weed volume
Run a benchmark against this setup:
docker run -it --rm --link master1:master weed benchmark -server master:9333
While the benchmark is running, I stop one of the volume servers. These are the log files combined in chronological order (originally color-coded: master red, volume server blue, benchmark green):
I0913 05:57:44 8 volume_server.go:117] Shutting down volume server...
I0913 05:57:44 8 volume_server.go:119] Shut down successfully!
Completed 24655 of 1048576 requests, 2.4% 5061.0/s 5.1MB/s
[…]
Completed 28164 of 1048576 requests, 2.7% 3508.6/s 3.5MB/s
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
I0913 05:58:00 8 topology_event_handling.go:58] Removing Volume 3 from the dead volume server Node:topo:DefaultDataCenter:DefaultRack:172.17.0.3:8080, volumes:map[3:Id:3, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false 5:Id:5, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false], Ip:172.17.0.3, Port:8080, PublicUrl:172.17.0.3:8080, Dead:true
I0913 05:58:00 8 volume_layout.go:181] Volume 3 has 1 replica, less than required 2
I0913 05:58:00 8 volume_layout.go:157] Volume 3 becomes unwritable
I0913 05:58:00 8 topology_event_handling.go:58] Removing Volume 5 from the dead volume server Node:topo:DefaultDataCenter:DefaultRack:172.17.0.3:8080, volumes:map[5:Id:5, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false 3:Id:3, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false], Ip:172.17.0.3, Port:8080, PublicUrl:172.17.0.3:8080, Dead:true
I0913 05:58:00 8 volume_layout.go:181] Volume 5 has 1 replica, less than required 2
I0913 05:58:00 8 volume_layout.go:157] Volume 5 becomes unwritable
I0913 05:58:00 8 node.go:220] topo:DefaultDataCenter:DefaultRack removes Node:172.17.0.3:8080, volumes:map[3:Id:3, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false 5:Id:5, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false], Ip:172.17.0.3, Port:8080, PublicUrl:172.17.0.3:8080, Dead:true volumeCount = 10
I0913 05:58:00 8 topology_event_handling.go:39] DataNode Node:172.17.0.3:8080, volumes:map[3:Id:3, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false 5:Id:5, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false], Ip:172.17.0.3, Port:8080, PublicUrl:172.17.0.3:8080, Dead:true is dead!
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
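The stall shows up in the benchmark output as repeated progress lines with a rate of 0.0/s. For anyone reproducing this, here is a minimal sketch that counts such lines from saved benchmark output. The function name count_stalled is hypothetical, and it assumes the progress line format stays exactly as shown in the log above.

```shell
#!/bin/sh
# count_stalled (hypothetical helper): reads benchmark output on stdin and
# prints how many "Completed ..." progress lines report a rate of 0.0/s.
# Assumes the format: Completed N of M requests, P% R/s S MB/s
count_stalled() {
    awk '/^Completed / && $7 == "0.0/s" { stalled++ } END { print stalled + 0 }'
}

# Example with two progress lines, one of them stalled.
printf '%s\n' \
    'Completed 28164 of 1048576 requests, 2.7% 3508.6/s 3.5MB/s' \
    'Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s' \
    | count_stalled
```

A count that keeps growing while the completed request number stays the same is what I mean by "does not recover".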
Although I waited quite a long time, the benchmark does not recover from the failure of one volume server. The master reports that volumes 3 and 5 become unwritable because each has only 1 replica instead of the required 2. Is this the expected behaviour?
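For reference, this is how I read the replication setting: the three digits are the number of extra copies in other data centers, on other racks, and on other servers in the same rack, so the total number of copies is their sum plus one. That matches the "required 2" in the log for 001. A minimal sketch of that arithmetic follows; the function name required_copies is hypothetical and not part of weed.

```shell
#!/bin/sh
# required_copies (hypothetical helper): prints the total number of copies
# implied by a three-digit replication setting "xyz", assuming
#   x = extra copies in other data centers
#   y = extra copies on other racks in the same data center
#   z = extra copies on other servers in the same rack
required_copies() {
    code=$1
    case $code in
        [0-9][0-9][0-9]) ;;
        *) echo "invalid replication setting: $code" >&2; return 1 ;;
    esac
    x=${code%??}      # first digit
    rest=${code#?}    # last two digits
    y=${rest%?}       # second digit
    z=${code#??}      # third digit
    echo $((x + y + z + 1))
}

required_copies 001   # prints 2
```

With 001 and one of the two servers holding a volume gone, only 1 of the 2 required copies remains, which is the condition the master logs before marking the volume unwritable.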