tons of 'had to retry' messages and a crash

Brian C. Hill

Mar 16, 2022, 12:16:57 PM
to s3...@googlegroups.com
Hello,

I am using s3ql 3.8.1, Python 3.9.2 w/ pyfuse 3.2.1 on CentOS 7 (with kernel-ml 15.16.12).

I am seeing tons of these during a copy (rdiff-backup, specifically):

Mar 15 16:03:38 myhost mount.s3ql[1455]: Server did not provide Content-Type, assuming XML
Mar 15 16:03:38 myhost mount.s3ql[1455]: Had to retry 470 times over the last 60 seconds, server or network problem?
Mar 15 16:03:39 myhost mount.s3ql[1455]: Had to retry 471 times over the last 60 seconds, server or network problem?
Mar 15 16:03:39 myhost mount.s3ql[1455]: Had to retry 472 times over the last 60 seconds, server or network problem?
Mar 15 16:03:39 myhost mount.s3ql[1455]: Had to retry 466 times over the last 60 seconds, server or network problem?

This is new; the setup was working fine until now.

During one run, mount.s3ql exited:

Mar 15 21:02:23 myhost mount.s3ql[1455]: Unmounting file system...
Mar 15 21:02:29 myhost mount.s3ql[1455]: Uncaught top-level exception:
Mar 15 21:02:29 myhost mount.s3ql[1455]: Traceback (most recent call last):
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "/opt/python/3.9.2/lib/python3.9/site-packages/s3ql/block_cache.py", line 687, in _get_entry
Mar 15 21:02:29 myhost mount.s3ql[1455]: el = self.cache[(inode, blockno)]
Mar 15 21:02:29 myhost mount.s3ql[1455]: KeyError: (2815633, 0)
Mar 15 21:02:29 myhost mount.s3ql[1455]: During handling of the above exception, another exception occurred:
Mar 15 21:02:29 myhost mount.s3ql[1455]: Traceback (most recent call last):
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "/usr/local/bin/mount.s3ql", line 8, in <module>
Mar 15 21:02:29 myhost mount.s3ql[1455]: sys.exit(main())
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "/opt/python/3.9.2/lib/python3.9/site-packages/s3ql/mount.py", line 131, in main
Mar 15 21:02:29 myhost mount.s3ql[1455]: trio.run(main_async, options, stdout_log_handler)
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "/opt/python/3.9.2/lib/python3.9/site-packages/trio/_core/_run.py", line 1946, in run
Mar 15 21:02:29 myhost mount.s3ql[1455]: raise runner.main_task_outcome.error
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "/opt/python/3.9.2/lib/python3.9/site-packages/s3ql/mount.py", line 275, in main_async
Mar 15 21:02:29 myhost mount.s3ql[1455]: await pyfuse3.main()
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "/opt/python/3.9.2/lib/python3.9/site-packages/_pyfuse3.py", line 43, in wrapper
Mar 15 21:02:29 myhost mount.s3ql[1455]: await fn(*args, **kwargs)
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "src/pyfuse3.pyx", line 781, in main
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "/opt/python/3.9.2/lib/python3.9/site-packages/trio/_core/_run.py", line 813, in __aexit__
Mar 15 21:02:29 myhost mount.s3ql[1455]: raise combined_error_from_nursery
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "/opt/python/3.9.2/lib/python3.9/site-packages/_pyfuse3.py", line 43, in wrapper
Mar 15 21:02:29 myhost mount.s3ql[1455]: await fn(*args, **kwargs)
Mar 15 21:02:29 myhost mount.s3ql[1455]: File "src/internal.pxi", line 272, in _session_loop

I have no idea how to interpret either issue.

What is/are the problem(s)?

Brian


Nikolaus Rath

Mar 17, 2022, 4:54:37 AM
to s3...@googlegroups.com
On Mar 16 2022, "Brian C. Hill" <bch...@bch.net> wrote:
> Hello,
>
> I am using s3ql 3.8.1, Python 3.9.2 w/ pyfuse 3.2.1 on CentOS 7 (with kernel-ml 15.16.12).
>
> I am seeing tons of these during a copy (rdiff-backup, specifically):
>
> Mar 15 16:03:38 myhost mount.s3ql[1455]: Server did not provide
> Content-Type, assuming XML
> Mar 15 16:03:38 myhost mount.s3ql[1455]: Had to retry 470 times over
> the last 60 seconds, server or network problem?
> Mar 15 16:03:39 myhost mount.s3ql[1455]: Had to retry 471 times over
> the last 60 seconds, server or network problem?
> Mar 15 16:03:39 myhost mount.s3ql[1455]: Had to retry 472 times over
> the last 60 seconds, server or network problem?
> Mar 15 16:03:39 myhost mount.s3ql[1455]: Had to retry 466 times over
> the last 60 seconds, server or network problem?

This means that S3QL encountered network errors or temporary server
errors, so it had to resend requests many times before it got a
successful response from the server.
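
To illustrate the general idea, here is a minimal sketch of a retry loop that counts retries in a sliding 60-second window. This is not S3QL's actual code: `RetryCounter`, `call_with_retry`, the backoff values, and the use of `ConnectionError` as the "temporary error" are all assumptions made for the example. With a sliding window, old retries drop out as time passes, which would be one plausible reason for the reported number going down (472, then 466) between two messages.

```python
import time
from collections import deque


class RetryCounter:
    """Count retries inside a sliding time window.

    Hypothetical helper for illustration only, not part of S3QL.
    """

    def __init__(self, window=60.0, clock=time.monotonic):
        self.window = window
        self.clock = clock
        self.events = deque()  # timestamps of recent retries

    def record(self):
        """Register one retry and return the count inside the window."""
        now = self.clock()
        self.events.append(now)
        # Drop retries that are older than the window.
        while self.events and now - self.events[0] > self.window:
            self.events.popleft()
        return len(self.events)


def call_with_retry(request, counter, max_attempts=5, base_delay=0.1,
                    sleep=time.sleep):
    """Call request(), retrying on ConnectionError with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return request()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # give up after the last attempt
            count = counter.record()
            print(f"Had to retry {count} times over the last "
                  f"{counter.window:.0f} seconds")
            sleep(base_delay * 2 ** attempt)
```

A request that fails twice and then succeeds would log two retry messages and return normally, so the messages on their own do not mean that data was lost, only that the backend or the network was unreliable during that period.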

> During one run, mount.s3ql exited:
[...]
> 43, in wrapper
> Mar 15 21:02:29 myhost mount.s3ql[1455]: await fn(*args, **kwargs)
> Mar 15 21:02:29 myhost mount.s3ql[1455]: File "src/pyfuse3.pyx",
> line 781, in main
> Mar 15 21:02:29 myhost mount.s3ql[1455]: File
> "/opt/python/3.9.2/lib/python3.9/site-packages/trio/_core/_run.py",
> line 813, in __aexit__
> Mar 15 21:02:29 myhost mount.s3ql[1455]: raise
> combined_error_from_nursery
> Mar 15 21:02:29 myhost mount.s3ql[1455]: File
> "/opt/python/3.9.2/lib/python3.9/site-packages/_pyfuse3.py", line
> 43, in wrapper
> Mar 15 21:02:29 myhost mount.s3ql[1455]: await fn(*args, **kwargs)
> Mar 15 21:02:29 myhost mount.s3ql[1455]: File "src/internal.pxi",
> line 272, in _session_loop

This looks incomplete. Are you sure there weren't more lines of output?
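
For reference, a small sketch of what a complete chained traceback looks like in Python. The `lookup` function and the `RuntimeError` are stand-ins invented for the example; the second exception in the posted log is not known. The point is only that each traceback in the chain ends with a line naming the exception type and message, and that final line is absent from the second traceback above.

```python
import traceback


def lookup(cache, key):
    """Hypothetical cache lookup that fails while handling a KeyError."""
    try:
        return cache[key]
    except KeyError:
        # Raising inside an except block chains the new exception to the
        # KeyError, producing the "During handling of the above
        # exception, another exception occurred:" separator.
        raise RuntimeError("backend unavailable")  # stand-in error


def format_failure():
    """Return the full chained traceback as a single string."""
    try:
        lookup({}, (2815633, 0))
    except RuntimeError as exc:
        return "".join(
            traceback.format_exception(type(exc), exc, exc.__traceback__)
        )
```

In the string returned by `format_failure()`, the first traceback ends with the `KeyError` line and the second ends with the `RuntimeError` line. A log that stops at a `File ... in _session_loop` line has most likely been cut off before that last line.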

Best,
-Nikolaus

--
GPG Fingerprint: ED31 791B 2C5C 1613 AF38 8B8A D113 FCAC 3C4E 599F

»Time flies like an arrow, fruit flies like a Banana.«