Hi there,
while experimenting with SeaweedFS (via S3), I noticed that when
uploading somewhat larger blobs (~4 MiB) with the MinIO warp
benchmark, the volume process's asynchronous replication sporadically
logs an error:
-- 8< --
E1128 16:59:57 1 upload_content.go:234] upload 4194304 bytes to http://10.132.15.198:8080/46,06edfbb86bdefa?ts=1606582797&ttl=&type=replicate: Post http://10.132.15.198:8080/46,06edfbb86bdefa?ts=1606582797&ttl=&type=replicate: read tcp 172.17.0.3:37852->10.132.15.198:8080: read: connection reset by peer
goroutine 1011466 [running]:
runtime/debug.Stack(0x109, 0x0, 0x0)
	/usr/lib/go/src/runtime/debug/stack.go:24 +0x9d
runtime/debug.PrintStack()
	/usr/lib/go/src/runtime/debug/stack.go:16 +0x22
github.com/chrislusf/seaweedfs/weed/operation.upload_content(0xc000a1c200, 0x4d, 0xc0012d3928, 0x0, 0x0, 0x100, 0x400000, 0x0, 0x0, 0xc0012d3ba0, ...)
	/go/src/github.com/chrislusf/seaweedfs/weed/operation/upload_content.go:235 +0xd36
github.com/chrislusf/seaweedfs/weed/operation.doUploadData(0xc000a1c200, 0x4d, 0x0, 0x0, 0xc000a1c200, 0xc00be9e000, 0x400000, 0x7ffe00, 0x0, 0x0, ...)
	/go/src/github.com/chrislusf/seaweedfs/weed/operation/upload_content.go:169 +0x49d
github.com/chrislusf/seaweedfs/weed/operation.retriedUploadData(0xc000a1c200, 0x4d, 0x0, 0x0, 0x0, 0xc00be9e000, 0x400000, 0x7ffe00, 0xc000ac9000, 0x0, ...)
	/go/src/github.com/chrislusf/seaweedfs/weed/operation/upload_content.go:96 +0x1ba
github.com/chrislusf/seaweedfs/weed/operation.UploadData(...)
	/go/src/github.com/chrislusf/seaweedfs/weed/operation/upload_content.go:69
github.com/chrislusf/seaweedfs/weed/topology.ReplicatedWrite.func1(0xc007dfe7a0, 0x12, 0xc007dfe7c0, 0x12, 0x7243a3, 0xc0035f8940)
	/go/src/github.com/chrislusf/seaweedfs/weed/topology/store_replicate.go:85 +0x670
github.com/chrislusf/seaweedfs/weed/topology.distributedOperation.func1(0xc00097a2d0, 0xc007dfe7a0, 0x12, 0xc007dfe7c0, 0x12, 0xc001b2c300)
	/go/src/github.com/chrislusf/seaweedfs/weed/topology/store_replicate.go:152 +0x55
created by github.com/chrislusf/seaweedfs/weed/topology.distributedOperation
	/go/src/github.com/chrislusf/seaweedfs/weed/topology/store_replicate.go:151 +0xda
W1128 16:59:57 1 upload_content.go:100] uploading to http://10.132.15.198:8080/46,06edfbb86bdefa?ts=1606582797&ttl=&type=replicate: upload 4194304 bytes to http://10.132.15.198:8080/46,06edfbb86bdefa?ts=1606582797&ttl=&type=replicate: Post http://10.132.15.198:8080/46,06edfbb86bdefa?ts=1606582797&ttl=&type=replicate: read tcp 172.17.0.3:37852->10.132.15.198:8080: read: connection reset by peer
-- 8< --
The volume process on the other end does not log an error. Will this
failed replication be retried, or compensated for in some other way?
A second question concerns the consistency properties of multiple
filer instances backed by a leveldb2 store and configured for
active-active replication (via `-peers`): am I correct in assuming
that this replication is eventually consistent? I have occasionally
observed that a file that had just been uploaded was not retrievable
via `get` shortly afterwards; the request presumably arrived at a
different filer instance. Is there a way to compensate for this (other
than setting up a "real" distributed filer store)?
Thanks,
Thilo