LeoFS and Cloudberry

Cliff Cook

Jun 13, 2017, 3:39:18 PM
to LeoProject.LeoFS
Has anyone had any success with LeoFS and Cloudberry? We are using 1.3.4 on Ubuntu. We can connect and start the backup, but it never completes. Only the boot-sector files copy, for a total of 149K; none of the data ever gets there. Cloudberry tries forever and gets up to about 44%, but none of the data ever appears on the LeoFS side. I have tried modifying the various app.config files to change timeouts and file sizes, to no avail. In the leo_gateway logs I see entries similar to the following:

 [E] gate...@127.0.0.1 2017-06-09 20:06:19.49780 -0400 1497053179 leo_gateway_http_commons:put_large_object_1/2 833 [{key,"backup/MBS-8b64a841-3881-4666-b0af-2d4d67e1a801/CBB_SPACE-AD1/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001:/20170609223849/0.cbrevision\n734"},{cause,timeout}]

On the Cloudberry side I see:

2017-06-13 15:32:04,130 [S3] [4] INFO  - Uploading part, bucket: backup, key: MBS-8b64a841-3881-4666-b0af-2d4d67e1a801/CBB_SPACE-AD1/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001:/20170613050109/0.cbrevision, uploadId: 3f7dd49e7f56b9d000de643029613621, partNumber: 3616, length 10 MB(10485760)

2017-06-13 15:32:04,692 [CL] [9] INFO  - Chunk 3618 for backup/MBS-8b64a841-3881-4666-b0af-2d4d67e1a801/CBB_SPACE-AD1/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001:/20170613050109/0.cbrevision is ready for transfer. Size: 10485760. Start offset: 37926993920

2017-06-13 15:32:07,858 [CL] [6] ERROR - Command::Run failed:

UploadChunk; Source:05263e28-0ea3-4b27-91e2-8d77d314fcb4; Destination:/backup/MBS-8b64a841-3881-4666-b0af-2d4d67e1a801/CBB_SPACE-AD1/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001:/20170613050109/

System.NotSupportedException: The stream does not support concurrent IO read or write operations.

  at System.Net.ConnectStream.InternalWrite(Boolean async, Byte[] buffer, Int32 offset, Int32 size, AsyncCallback callback, Object state)


Is this a config issue? Any help is appreciated.

yoshiyuki kanno

Jun 13, 2017, 8:39:38 PM
to Cliff Cook, LeoProject.LeoFS
Hi,

Could you try setting large_object.reading_chunked_obj_len in
leo_gateway.conf to a smaller value like 131072 (i.e. 128KB)?
The default (5242880 bytes, i.e. 5MB) was chosen for relatively
high-bandwidth environments, so with the default a timeout between
LeoFS and Cloudberry can happen on a narrower link.
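
For reference, the change would look like this in leo_gateway.conf (a minimal sketch; 131072 is just the value suggested above, not a universal recommendation, and the gateway node would need a restart to pick it up):

## Reading length of a chunked object
##   * default: "large_object.chunked_obj_len" (5242880 - 5MB)
large_object.reading_chunked_obj_len = 131072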

Please let me know if you face any other trouble.

Best,
Kanno.



--
Yoshiyuki Kanno
LeoFS Committer (http://www.leofs.org)
--------------------------------------------------
Stoic Corp.
URL: http://www.stoic.co.jp/
E-mail: yoshiyu...@stoic.co.jp

Vladimir Mosgalin

Jun 14, 2017, 8:05:52 AM
to LeoProject.LeoFS, cwcoo...@gmail.com

On Wednesday, June 14, 2017 at 3:39:38 UTC+3, mocchira wrote:
Hi,

Could you try setting large_object.reading_chunked_obj_len in
leo_gateway.conf to a smaller value like 131072 (i.e. 128KB)?
The default (5242880 bytes, i.e. 5MB) was chosen for relatively
high-bandwidth environments,

I'm interested in this as well.

Could you please explain the difference between lowering "large_object.reading_chunked_obj_len" and raising "http.timeout_for_body"? I mean, why lower the first instead of raising the second? Are there some advantages? Can problems arise from having "large_object.reading_chunked_obj_len" a few times lower than "large_object.chunked_obj_len" (the first 128KB, the second 5MB, as per your suggestion)?

To be exact, I'm interested in a mix of high-speed (1 Gbps) and lower-speed (100 Mbps and below, limited by other factors, not just network speed) clients. I understand how to tweak these values so that slow clients won't have problems, but which is best for high-speed clients: a raised timeout or a lowered reading_chunked_obj_len?

yoshiyuki kanno

Jun 14, 2017, 10:41:38 PM
to Vladimir Mosgalin, LeoProject.LeoFS, Cliff Cook
Hi Vladimir,

> Could you please explain the difference between lowering "large_object.reading_chunked_obj_len" and raising "http.timeout_for_body"? I mean, why lower the first instead of raising the second? Are there some advantages? Can problems arise from having "large_object.reading_chunked_obj_len" a few times lower than "large_object.chunked_obj_len" (the first 128KB, the second 5MB, as per your suggestion)?

Good point.
Please check the preview docs I'm now writing up here:
https://mocchira.github.io/leofs/faq/administration/#what-should-i-do-when-a-timeout-error-happen-during-upload-a-very-large-file
That said, logically, lowering the buffer size and raising the
timeout are the same.
However, there is one benefit to the former (tweaking the buffer size)
over the latter.
Since memory allocation requests in the Erlang VM happen in units of
**large_object.reading_chunked_obj_len**, the larger the buffer size
we set, the more memory the host running leo_gateway needs at once.
So if leo_gateway accepts lots of connections that try to upload very
large files in parallel, the odds of an OOM increase.
That's why I'd recommend lowering the buffer size.
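For example, if 1,000 connections upload large files in parallel and
each holds one read buffer at a time, then roughly (illustrative
figures, not measurements):

1,000 x 5,242,880 bytes (5MB)  ≈ 5.2 GB of buffers held at once
1,000 x 131,072 bytes (128KB)  ≈ 0.13 GB of buffers held at once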
I will add this context to the docs, thank you!

> To be exact, I'm interested in a mix of high-speed (1 Gbps) and lower-speed (100 Mbps and below, limited by other factors, not just network speed) clients. I understand how to tweak these values so that slow clients won't have problems, but which is best for high-speed clients: a raised timeout or a lowered reading_chunked_obj_len?

As I said in the reply above, lowering the buffer size would be the
recommendation in terms of safety (less memory usage). However, for
high-speed clients, raising the timeout could be the more suitable
choice in theory, because the larger the buffer size we set, the
fewer interactions happen between
- LeoFS <-> Erlang VM
- Erlang VM <-> OS

That said, I'm not sure there is a meaningful difference between the
two solutions in terms of latency/throughput in any case, so I would
recommend benchmarking against your actual use cases.
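
For completeness, the alternative tuning for fast clients would look
like this in leo_gateway.conf (60000 is only an illustrative value,
not a tested recommendation):

## HTTP timeout for reading body (default: 15000)
http.timeout_for_body = 60000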

Best,
Kanno.

Vladimir Mosgalin

Jun 19, 2017, 4:36:49 PM
to LeoProject.LeoFS, vladi...@gmail.com, cwcoo...@gmail.com

On Thursday, June 15, 2017 at 5:41:38 UTC+3, mocchira wrote:
Hi Vladimir,

> Could you please explain the difference between lowering "large_object.reading_chunked_obj_len" and raising "http.timeout_for_body"? I mean, why lower the first instead of raising the second? Are there some advantages? Can problems arise from having "large_object.reading_chunked_obj_len" a few times lower than "large_object.chunked_obj_len" (the first 128KB, the second 5MB, as per your suggestion)?

Good point.
Please check the preview docs I'm now writing up here:
https://mocchira.github.io/leofs/faq/administration/#what-should-i-do-when-a-timeout-error-happen-during-upload-a-very-large-file
That said, logically, lowering the buffer size and raising the
timeout are the same.
However, there is one benefit to the former (tweaking the buffer size)
over the latter.

Thank you for the explanation! The fact that "reading_chunked_obj_len" is *just* a buffer size between the gateway and the client was the piece I was missing to fully understand the situation. Now that I know it, the rest is pretty obvious, yes.

Maybe some comments in the config file should be changed to clear up the confusion? Let me elaborate: there is this section

## HTTP timeout for reading header
## http.timeout_for_header = 5000
## HTTP timeout for reading body
## http.timeout_for_body = 15000
## HTTP sending chunk length
## http.sending_chunked_obj_len = 5242880

where it's pretty obvious that sending_chunked_obj_len is the buffer size for sending from the gateway to the client.

However, with these
## Threshold of length of a chunked object
large_object.threshold_of_chunk_len = 5767168
## Reading length of a chuncked object
##   * If happening timeout when copying a large object,
##     you will solve to set this value as less than 5MB.
##   * default: "large_object.chunked_obj_len" (5242880 - 5MB)
large_object.reading_chunked_obj_len = 5242880

you'd usually think that there is some relation between these parameters (but what exactly?), while in fact they turn out to be unrelated: the first one mostly affects how data is split into chunks / stored (right?), and the second is just a network buffer to the client. Adding to the confusion is the fact that the second one isn't prefixed by "http" (like sending_chunked_obj_len), so it's impossible to tell whether it is about reading from the client or reading from storage (before sending to the client). In fact, I was totally sure that the "timeout" mentioned here is the timeout between leo_gateway and leo_storage.
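
For example, wording along these lines (just a suggestion, not an existing patch) would make the distinction explicit:

## Buffer size used when reading an upload from an HTTP client
## (unrelated to large_object.chunked_obj_len, which controls how
##  objects are split into chunks for storage)
large_object.reading_chunked_obj_len = 5242880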

yoshiyuki kanno

Jun 19, 2017, 9:59:42 PM
to Vladimir Mosgalin, LeoProject.LeoFS, Cliff Cook
Hi,

> you'd usually think that there is some relation between these parameters (but what exactly?), while in fact they turn out to be unrelated: the first one mostly affects how data is split into chunks / stored (right?), and the second is just a network buffer to the client. Adding to the confusion is the fact that the second one isn't prefixed by "http" (like sending_chunked_obj_len), so it's impossible to tell whether it is about reading from the client or reading from storage (before sending to the client). In fact, I was totally sure that the "timeout" mentioned here is the timeout between leo_gateway and leo_storage.

Yep, absolutely right regarding "isn't prefixed by http".
I will file this on GitHub later, thanks.

Best,
Kanno.