volume replication problem?


Jesper Zedlitz

Sep 13, 2016, 2:10:25 AM
to Seaweed File System
I ran some experiments with volume replication, but I could not get it working, or perhaps I misunderstood something. These are the steps to reproduce the problem (Ubuntu, with SeaweedFS running in Docker containers):

Initialize a test directory:
rm -fr /tmp/weedtest/ ; mkdir -p /tmp/weedtest/m1 /tmp/weedtest/m2 /tmp/weedtest/m3 /tmp/weedtest/v1 /tmp/weedtest/v2 /tmp/weedtest/v3 

Start one master server and three volume servers:
docker run -it --rm --name master1 -v /tmp/weedtest/m1:/data weed master -defaultReplication=001
docker run -it --rm --name volume1 -v /tmp/weedtest/v1:/data --link master1:master weed volume
docker run -it --rm --name volume2 -v /tmp/weedtest/v2:/data --link master1:master weed volume
docker run -it --rm --name volume3 -v /tmp/weedtest/v3:/data --link master1:master weed volume

Run a benchmark against this setup:
docker run -it --rm --link master1:master weed benchmark -server master:9333

While the benchmark is running, I stop one of the volume servers. These are the log files combined in chronological order (master in red, volume server in blue, benchmark in green):
I0913 05:57:44     8 volume_server.go:117] Shutting down volume server...
I0913 05:57:44     8 volume_server.go:119] Shut down successfully!
Completed 24655 of 1048576 requests, 2.4% 5061.0/s 5.1MB/s
I0913 05:57:44     7 upload_content.go:78] failing to upload to http://172.17.0.3:8080/3,6d06676042f1 Post http://172.17.0.3:8080/3,6d06676042f1: read tcp 172.17.0.6:34548->172.17.0.3:8080: read: connection reset by peer
Failed to write with error:Post http://172.17.0.3:8080/3,6d06676042f1: read tcp 172.17.0.6:34548->172.17.0.3:8080: read: connection reset by peer
I0913 05:57:44     7 upload_content.go:78] failing to upload to http://172.17.0.3:8080/3,6cff1f740c40 Post http://172.17.0.3:8080/3,6cff1f740c40: EOF
Failed to write with error:Post http://172.17.0.3:8080/3,6cff1f740c40: EOF
I0913 05:57:44     7 upload_content.go:78] failing to upload to http://172.17.0.3:8080/5,6d0d9c4d0e16 Post http://172.17.0.3:8080/5,6d0d9c4d0e16: read tcp 172.17.0.6:34428->172.17.0.3:8080: read: connection reset by peer
I0913 05:57:44     7 upload_content.go:78] failing to upload to http://172.17.0.3:8080/5,6d0416c297dc Post http://172.17.0.3:8080/5,6d0416c297dc: read tcp 172.17.0.6:34400->172.17.0.3:8080: read: connection reset by peer
Failed to write with error:Post http://172.17.0.3:8080/5,6d0d9c4d0e16: read tcp 172.17.0.6:34428->172.17.0.3:8080: read: connection reset by peer
I0913 05:57:44     7 upload_content.go:78] failing to upload to http://172.17.0.3:8080/5,6d0fffb1a443 Post http://172.17.0.3:8080/5,6d0fffb1a443: dial tcp 172.17.0.3:8080: getsockopt: connection refused
[…]
I0913 05:57:44     7 upload_content.go:78] failing to upload to http://172.17.0.3:8080/5,6e4c8880a7c5 Post http://172.17.0.3:8080/5,6e4c8880a7c5: dial tcp 172.17.0.3:8080: getsockopt: connection refused
Failed to write with error:Post http://172.17.0.3:8080/5,6e4c8880a7c5: dial tcp 172.17.0.3:8080: getsockopt: connection refused
Completed 28164 of 1048576 requests, 2.7% 3508.6/s 3.5MB/s
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
I0913 05:58:00     8 topology_event_handling.go:58] Removing Volume 3 from the dead volume server Node:topo:DefaultDataCenter:DefaultRack:172.17.0.3:8080, volumes:map[3:Id:3, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false 5:Id:5, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false], Ip:172.17.0.3, Port:8080, PublicUrl:172.17.0.3:8080, Dead:true
I0913 05:58:00     8 volume_layout.go:181] Volume 3 has 1 replica, less than required 2
I0913 05:58:00     8 volume_layout.go:157] Volume 3 becomes unwritable
I0913 05:58:00     8 topology_event_handling.go:58] Removing Volume 5 from the dead volume server Node:topo:DefaultDataCenter:DefaultRack:172.17.0.3:8080, volumes:map[5:Id:5, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false 3:Id:3, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false], Ip:172.17.0.3, Port:8080, PublicUrl:172.17.0.3:8080, Dead:true
I0913 05:58:00     8 volume_layout.go:181] Volume 5 has 1 replica, less than required 2
I0913 05:58:00     8 volume_layout.go:157] Volume 5 becomes unwritable
I0913 05:58:00     8 node.go:220] topo:DefaultDataCenter:DefaultRack removes Node:172.17.0.3:8080, volumes:map[3:Id:3, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false 5:Id:5, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false], Ip:172.17.0.3, Port:8080, PublicUrl:172.17.0.3:8080, Dead:true volumeCount = 10
I0913 05:58:00     8 topology_event_handling.go:39] DataNode Node:172.17.0.3:8080, volumes:map[3:Id:3, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false 5:Id:5, Size:0, ReplicaPlacement:001, Collection:benchmark, Version:2, FileCount:0, DeleteCount:0, DeletedByteCount:0, ReadOnly:false], Ip:172.17.0.3, Port:8080, PublicUrl:172.17.0.3:8080, Dead:true is dead!
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s
Completed 28164 of 1048576 requests, 2.7% 0.0/s 0.0MB/s

Although I waited quite a long time, the benchmark did not recover from the failure of one volume server. The master reports that the affected volumes become unwritable. Is this the expected behaviour?
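
For reference, the master's message "Volume 3 has 1 replica, less than required 2" follows from how the replication string is read: each of the three digits counts extra copies (in other data centers, in other racks, and on other servers in the same rack), so 001 asks for two copies in total. Below is a minimal Python sketch of that arithmetic. The function names are hypothetical and the code is not taken from the SeaweedFS source; it only mirrors what the log lines above report.

```python
def copies_required(replica_placement: str) -> int:
    """Total number of copies implied by a three-digit replication string.

    Digits, left to right: extra copies in other data centers, in other
    racks of the same data center, and on other servers in the same rack.
    """
    if len(replica_placement) != 3 or any(c not in "0123456789" for c in replica_placement):
        raise ValueError(f"expected three digits, got {replica_placement!r}")
    other_dcs, other_racks, same_rack = (int(c) for c in replica_placement)
    # One original copy plus every requested extra copy.
    return 1 + other_dcs + other_racks + same_rack


def is_writable(live_replicas: int, replica_placement: str) -> bool:
    """Assumed rule, matching the log: writable only with all copies present."""
    return live_replicas >= copies_required(replica_placement)
```

With replication 001, stopping one of the two servers that hold a volume leaves one live replica, so `is_writable(1, "001")` is false, which is consistent with "Volume 3 becomes unwritable" in the master log.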

Chris Lu

Sep 13, 2016, 12:54:25 PM
to Seaweed File System
The benchmark tool does not have any retry logic for when an error happens.

Chris
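
A client that should survive a volume server failure would need to add that retry step itself. The following is a minimal Python sketch of one way to do it, not the benchmark tool's code: on a connection error it asks for a fresh assignment and tries again. `upload_with_retry`, `assign`, and `upload` are hypothetical names; `assign` is assumed to wrap a call to the master's `/dir/assign` endpoint and return a `(url, fid)` pair, and `upload` is assumed to POST the data to `http://<url>/<fid>`.

```python
import time
from typing import Callable, Optional, Tuple


def upload_with_retry(
    assign: Callable[[], Tuple[str, str]],
    upload: Callable[[str, str, bytes], None],
    data: bytes,
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> str:
    """Upload data, requesting a new assignment after each failed attempt.

    Returns the file id of the successful upload, or raises RuntimeError
    once all attempts are used up.
    """
    last_error: Optional[BaseException] = None
    for _ in range(attempts):
        try:
            # Ask again on every attempt: the previous target may be dead.
            url, fid = assign()
            upload(url, fid, data)
            return fid
        except OSError as err:
            # Connection reset/refused errors are OSError subclasses in Python.
            last_error = err
            time.sleep(delay_seconds)
    raise RuntimeError(f"upload failed after {attempts} attempts") from last_error
```

In the log above, the master took about 16 seconds (05:57:44 to 05:58:00) to declare the stopped server dead, so assignments requested during that window may still point at it. The retries therefore probably need a delay or enough attempts to outlast that interval.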
