`bup save` fails with unexpected EOF

Stefan Monnier

Mar 11, 2023, 12:40:30 PM

to bup-...@googlegroups.com

A cron job of mine has been failing for the last two days.
Here's the output of the `sh -x` run of the script:

+ bup on <MYHOST> index --exclude=<BLABLA>
+ bup on <MYHOST> save -n <NAME> <DIRS>
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/usr/lib/bup/bup/main.py", line 417, in <module>
    main()
  File "/usr/lib/bup/bup/main.py", line 414, in main
    wrap_main(lambda : run_subcmd(cmd_module, subcmd))
  File "/usr/lib/bup/bup/compat.py", line 98, in wrap_main
    sys.exit(main())
             ^^^^^^
  File "/usr/lib/bup/bup/main.py", line 414, in <lambda>
    wrap_main(lambda : run_subcmd(cmd_module, subcmd))
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/bup/bup/main.py", line 409, in run_subcmd
    run_module_cmd(module, args)
  File "/usr/lib/bup/bup/main.py", line 292, in run_module_cmd
    import_and_run_main(module, args)
  File "/usr/lib/bup/bup/main.py", line 287, in import_and_run_main
    module.main(args)
  File "/usr/lib/bup/bup/cmd/on.py", line 60, in main
    for line in iter(dmc.readline, b''):
  File "/usr/lib/bup/bup/helpers.py", line 497, in readline
    return self._readline()
           ^^^^^^^^^^^^^^^^
  File "/usr/lib/bup/bup/helpers.py", line 682, in _readline
    return b''.join(self._read_parts(find_eol))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/bup/bup/helpers.py", line 663, in _read_parts
    while self._load_buf(None):
          ^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/bup/bup/helpers.py", line 653, in _load_buf
    if not self._next_packet(timeout):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/bup/bup/helpers.py", line 629, in _next_packet
    ns = b''.join(checked_reader(self.infd, 5))
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/bup/bup/helpers.py", line 571, in checked_reader
    if not buf: raise Exception("Unexpected EOF reading %d more bytes" % n)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Exception: Unexpected EOF reading 5 more bytes

AFAICT the problem is not due to the underlying SSH connection being
interrupted (I was sharing that same SSH connection via mux-ing at the
same time in an interactive session to double-check).

The error seems deterministic.

A few weeks ago I updated Bup to Debian testing's 0.33 on the machine
on which I run the script, while <MYHOST> is still running Debian stable's
0.32, but that combination worked fine until two days ago, so I don't
know if it's relevant.

Any idea where the problem could lie?

Stefan

Greg Troxel

Mar 11, 2023, 2:13:40 PM

to 'Stefan Monnier' via bup-list

"'Stefan Monnier' via bup-list" <bup-...@googlegroups.com> writes:

> A cron job of mine has been failing for the last two days.
> Here's the output of the `sh -x` run of the script:

> Exception: Unexpected EOF reading 5 more bytes
>
> AFAICT the problem is not due to the underlying SSH connection being
> interrupted (I was sharing that same SSH connection via mux-ing at the
> same time in an interactive session to double-check).

I had this problem for a while and was unable to figure it out, and now
I don't have it. I don't remember the trajectory in between. I dimly
recall that the consensus was very suspicious of ssh even though it
really seemed it was ok. I know this isn't that helpful, but you are
not alone.

Rob Browning

Mar 11, 2023, 2:34:01 PM

to Stefan Monnier, bup-...@googlegroups.com

"'Stefan Monnier' via bup-list" <bup-...@googlegroups.com> writes:

> AFAICT the problem is not due to the underlying SSH connection being
> interrupted (I was sharing that same SSH connection via mux-ing at the
> same time in an interactive session to double-check).
>
> The error seems deterministic.
>
> A few weeks ago I updated Bup to Debian testing's 0.33 on the machine
> on which I run the script, while <MYHOST> is still running Debian stable's
> 0.32, but that combination worked fine until two days ago, so I don't
> know if it's relevant.
>
> Any idea where the problem could lie?

Hmm, not offhand, and not sure I'm reading the error right, but wondered
if the remote could be failing for some reason, provoking the short
read. Though if so, I'd hope the stderr would end up somewhere we could
see it. (Suppose it's also possible something's not flushing on error.)
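
To illustrate the short-read hypothesis: the traceback ends in a reader that wants exactly 5 bytes and gets none, which is what happens when the other side closes the stream before sending the next packet header. The following is only a minimal sketch, not bup's actual code; `checked_read`, `read_packet`, and the header layout (a 4-byte length plus a 1-byte channel id) are assumptions made for illustration.

```python
import io
import struct


def checked_read(stream, n):
    """Read exactly n bytes from stream, or raise if EOF arrives first."""
    chunks = []
    while n > 0:
        buf = stream.read(n)
        if not buf:
            # The peer closed the stream before the expected bytes arrived.
            raise EOFError("Unexpected EOF reading %d more bytes" % n)
        chunks.append(buf)
        n -= len(buf)
    return b"".join(chunks)


def read_packet(stream):
    """Read one length-prefixed packet (hypothetical 5-byte header)."""
    # Assumed layout: 4-byte big-endian payload length, 1-byte channel id.
    header = checked_read(stream, 5)
    length, channel = struct.unpack("!IB", header)
    return channel, checked_read(stream, length)


if __name__ == "__main__":
    # A well-formed packet is returned intact.
    good = io.BytesIO(struct.pack("!IB", 3, 1) + b"abc")
    print(read_packet(good))

    # A stream that ends at a packet boundary fails on the header read.
    try:
        read_packet(io.BytesIO(b""))
    except EOFError as exc:
        print(exc)
```

If the remote process exits early for any reason, the local reader would see exactly this kind of failure regardless of whether the SSH transport itself is healthy.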

And I suppose it's possible it's the version mismatch. While
mismatched versions might work, I don't think we have much, if any,
testing across versions (with respect to the remote communication).

--
Rob Browning
rlb @defaultvalue.org and @debian.org
GPG as of 2011-07-10 E6A9 DA3C C9FD 1FF8 C676 D2C4 C0F0 39E9 ED1B 597A
GPG as of 2002-11-03 14DD 432F AE39 534D B592 F9A0 25C8 D377 8C7E 73A4

Stefan Monnier

Mar 15, 2023, 3:09:36 PM

to bup-list

I circumvented the problem with:

mv BUPDIR BUPDIR.problem
BUP_DIR=BUPDIR bup init
ln BUPDIR.problem/objects/pack/pack* BUPDIR/objects/pack
BUP_DIR=BUPDIR bup get -s BUPDIR.problem --ff FOO

and now my backups work again.
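
For readers following along, the `ln` step above creates hard links, so the fresh repository shares the existing pack files instead of copying them. A rough Python equivalent of that single step might look like the sketch below; `link_packs` is a hypothetical helper, not part of bup, and it assumes both repositories live on the same filesystem (hard links cannot cross filesystems).

```python
import glob
import os


def link_packs(src_repo, dst_repo):
    """Hard-link every objects/pack/pack* file from src_repo into dst_repo."""
    src_dir = os.path.join(src_repo, "objects", "pack")
    dst_dir = os.path.join(dst_repo, "objects", "pack")
    linked = []
    for src in sorted(glob.glob(os.path.join(src_dir, "pack*"))):
        dst = os.path.join(dst_dir, os.path.basename(src))
        # os.link creates a second directory entry for the same data,
        # so no pack contents are copied.
        os.link(src, dst)
        linked.append(dst)
    return linked
```

Calling `link_packs("BUPDIR.problem", "BUPDIR")` would mirror the shell command, leaving the old repository untouched.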

I still have no idea what's going on there (both `git fsck` and `bup
fsck` are happy with the state of BUPDIR.problem). I still have
BUPDIR.problem around in case you want me to try and diagnose the
problem further.

Stefan

Rob Browning

Mar 18, 2023, 2:08:57 PM

to Stefan Monnier, bup-list

"'Stefan Monnier' via bup-list" <bup-...@googlegroups.com> writes:

> I still have no idea what's going on there (both `git fsck` and `bup
> fsck` are happy with the state of BUPDIR.problem). I still have
> BUPDIR.problem around in case you want me to try and diagnose the
> problem further.

Only because we've had a number of issues with it, I vaguely wondered if
it could be the index. If so, clearing it on the remote might also have
fixed the problem (at the cost of regenerating it from scratch).

(I still want to get back to the fix I was toying with (new indexing
approach), but haven't yet.)

Glad you got it working again.

Stefan Monnier

Mar 18, 2023, 6:10:20 PM

to Rob Browning, bup-list

>> I still have no idea what's going on there (both `git fsck` and `bup
>> fsck` are happy with the state of BUPDIR.problem). I still have
>> BUPDIR.problem around in case you want me to try and diagnose the
>> problem further.
>
> Only because we've had a number of issues with it, I vaguely wondered if
> it could be the index. If so, clearing it on the remote might also have
> fixed the problem (at the cost of regenerating it from scratch).

I did try to throw away the remote's index, but it did not make
any difference.

Stefan
