Strange error handling of elliptics C++ and Python bindings when cache flag is set


Artem Savinov

Apr 7, 2015, 5:27:05 AM
to rever...@googlegroups.com

Hi,


I use elliptics 2.25.6.13 and eblob 0.22.11.

config: https://gist.github.com/reclosedev/cac5234c4b538d3fd3f0#file-ioserv-json

blob_size is 1G and "blob_flags" is "582".
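Since blob_flags is a bitmask, it can help to see which bits "582" actually sets. The snippet below is only a small arithmetic sketch: the helper name set_bits is hypothetical, and the meaning of each bit is defined by eblob and is not stated in this thread.

```python
def set_bits(value):
    """Return the individual powers of two that make up a bitmask."""
    return [1 << i for i in range(value.bit_length()) if (value >> i) & 1]


# The config stores the flags as a decimal string.
flags = int("582")
bits = set_bits(flags)

print(bits)        # [2, 4, 64, 512]
print(hex(flags))  # 0x246
```

Comparing these bit values against the flag definitions of the eblob version in use shows which options are enabled.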


When the client can’t write to the cluster because of an error, e.g. “No space left on device: -28”, I expect the write to fail. Instead, when the session is created with the cache flag, no error is raised and a LookupResultEntry is returned.


The easiest way to reproduce this is to make elliptics unable to write by placing the blob on a tmpfs with a 1500M limit, e.g.

# tailf /etc/fstab

tmpfs   /var/blob/    tmpfs   size=1500M         0  0


See example script: https://gist.github.com/reclosedev/cac5234c4b538d3fd3f0#file-write_expect_errors-py


Output:

https://gist.github.com/reclosedev/cac5234c4b538d3fd3f0#file-z_output1-with-cache-txt


So when the ‘cache’ flag is used, no errors are raised. It also sometimes affects subsequent commands: append commands don’t raise or return errors, and commands with flags = 0 time out instead of failing immediately.


Without the ‘cache’ flag (with do_read_write(elliptics.io_flags.append | elliptics.io_flags.cache) commented out) it works as expected:

https://gist.github.com/reclosedev/cac5234c4b538d3fd3f0#file-z_output2-no-cache-txt

thanks

Artem Savinov

Apr 7, 2015, 7:50:22 AM
to rever...@googlegroups.com
This behaviour also reproduces on elliptics 2.26, with a few modifications to the script and config for backward-incompatible changes.
However, after a write with the cache flag, the 2.26 client starts to raise/return errors sooner than 2.25 does.

Evgeniy Polyakov

Apr 7, 2015, 9:44:23 AM
to Artem Savinov, rever...@googlegroups.com
Hi Artem

07.04.2015, 14:50, "Artem Savinov" <asav...@asdco.ru>:
> This behaviour also reproduces on elliptics 2.26, with a few modifications to the script and config for backward-incompatible changes.
> However, after a write with the cache flag, the 2.26 client starts to raise/return errors sooner than 2.25 does.

You didn't provide server logs, but it looks like cache write succeeded, and it was exactly what you asked.

Moreover, you write data with the 'append' flag, which means the cache only stores smaller chunks of data,
not the whole file, while the on-disk structure may exceed the maximum blob size and return -28 'no space left on device'.

Also, elliptics 2.25 is about 2-3 years old. Don't use it; it's really time to upgrade.

Artem Savinov

Apr 7, 2015, 10:10:57 AM
to rever...@googlegroups.com, asav...@asdco.ru, z...@ioremap.net


> You didn't provide server logs, but it looks like cache write succeeded, and it was exactly what you asked.

I'm sorry, the logs are attached:
log-2.26
log-2.25

Evgeniy Polyakov

Apr 7, 2015, 10:16:29 AM
to Artem Savinov, rever...@googlegroups.com


07.04.2015, 17:11, "Artem Savinov" <asav...@asdco.ru>:
>> You didn't provide server logs, but it looks like cache write succeeded, and it was exactly what you asked.
>
> i'm sorry, attach.

Indeed, the cache writes succeeded (the small chunks to be appended fit in the cache),
but they could not be synced to disk, and non-cache writes also failed because of the lack of free space.
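
To illustrate that sequence, here is a toy model of a memory cache in front of a size-limited disk. It is only a sketch under stated assumptions: ToyBackend and all of its methods are hypothetical names and this is not elliptics or eblob code. It shows how a cache write can be acknowledged while a direct write, and the later sync of the cached data, both fail with -28.

```python
import errno


class ToyBackend:
    """Hypothetical model: an in-memory cache in front of a size-limited disk."""

    def __init__(self, disk_capacity):
        self.disk_capacity = disk_capacity
        self.disk_used = 0
        self.cache = {}  # key -> bytes not yet synced to disk

    def write(self, key, data, use_cache):
        if use_cache:
            # Only memory is touched, so the write is acknowledged
            # even when the disk has no room left.
            self.cache[key] = self.cache.get(key, b"") + data
            return 0
        return self._write_disk(len(data))

    def _write_disk(self, size):
        if self.disk_used + size > self.disk_capacity:
            return -errno.ENOSPC  # -28, "No space left on device"
        self.disk_used += size
        return 0

    def sync(self):
        """Try to flush cached entries to disk; return a status per key."""
        results = {}
        for key in list(self.cache):
            status = self._write_disk(len(self.cache[key]))
            results[key] = status
            if status == 0:
                del self.cache[key]
        return results


backend = ToyBackend(disk_capacity=8)
full = backend.write("a", b"12345678", use_cache=False)  # fills the disk
cached = backend.write("b", b"data", use_cache=True)     # acknowledged by the cache
direct = backend.write("c", b"data", use_cache=False)    # hits the full disk
synced = backend.sync()                                  # cached data cannot be flushed

print(cached, direct, synced)
```

In this model the client only sees the failure for the direct write; the failed sync happens later on the server side, which matches the explanation above but says nothing about how elliptics implements it.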

Artem Savinov

Apr 7, 2015, 10:29:48 AM
to rever...@googlegroups.com, asav...@asdco.ru, z...@ioremap.net
OK, thanks. I will not use the cache flag then; without it, it works as expected.
 
But it still seems strange to me that the next async write with append (no cache) doesn't return an error, and that the write with flags = 0 times out instead of failing immediately:
 
Flags: elliptics.io_flags.append | elliptics.io_flags.cache
Writing using async Session.write_data()
  FAILED, expected no entries, got 1 [<elliptics.core.LookupResultEntry object at 0x7f8565926c80>]
  size: 4, error: ''
 
 
Flags: elliptics.io_flags.append
Writing using async Session.write_data()
  FAILED, expected no entries, got 1 [<elliptics.core.LookupResultEntry object at 0x7f8565926c08>]
  size: 4, error: ''
 
 
Flags: 0
Writing using async Session.write_data()
  OK, write failed 1:927a008f7f94...1d90c7fc8a67: Failed to process WRITE command: Connection timed out: -110

Evgeniy Polyakov

Apr 7, 2015, 10:32:19 AM
to Artem Savinov, rever...@googlegroups.com


07.04.2015, 17:29, "Artem Savinov" <asav...@asdco.ru>:
> Ok, thanks. I will not use cache flag then, it works as expected.
>
> But it still seems strange to me that the next async write with append (no cache) doesn't return an error, and that the write with flags = 0 times out instead of failing immediately:

There are no timeout errors in the 2.26 logs you provided.
The cache changed heavily between 2.25 and 2.26, so the two simply cannot be compared.