About 8 days ago I started seeing an issue while performing backup-fetches. It looks something like this (the wrapper script only sets a few Swift auth and container parameters):
$ envs/wal-e/bin/wal-e-wrapper-restore.sh --terse backup-fetch /var/lib/pgsql/9.2/data_next LATEST
keystoneclient.httpclient WARNING Failed to retrieve management_url from token
keystoneclient.httpclient WARNING Failed to retrieve management_url from token
keystoneclient.httpclient WARNING Failed to retrieve management_url from token
lzop: Invalid argument: <stdin>
lzop: <stdin>: Compressed data violation
wal_e.retries WARNING MSG: retrying after encountering exception
DETAIL: Exception information dump:
Traceback (most recent call last):
  File "/var/lib/pgsql/envs/wal-e/lib/python2.7/site-packages/wal_e/retries.py", line 62, in shim
    return f(*args, **kwargs)
  File "/var/lib/pgsql/envs/wal-e/lib/python2.7/site-packages/wal_e/worker/swift/swift_worker.py", line 73, in fetch_partition
    TarPartition.tarfile_extract(pl.stdout, self.local_root)
  File "/var/lib/pgsql/envs/wal-e/lib/python2.7/site-packages/wal_e/tar_partition.py", line 261, in tarfile_extract
    bufsize=pipebuf.PIPE_BUF_BYTES)
  File "/usr/lib64/python2.7/tarfile.py", line 1690, in open
    **kwargs)
  File "/usr/lib64/python2.7/tarfile.py", line 1574, in __init__
    self.firstmember = self.next()
  File "/usr/lib64/python2.7/tarfile.py", line 2338, in next
    raise ReadError("empty file")
ReadError: empty file
HINT: A better error message should be written to handle this exception. Please report this output and, if possible, the situation under which it arises.
STRUCTURED: time=2015-03-12T02:25:52.919447-00 pid=24675
wal_e.retries WARNING MSG: retrying after encountering exception
DETAIL: Exception information dump:
Traceback (most recent call last):
  File "/var/lib/pgsql/envs/wal-e/lib/python2.7/site-packages/wal_e/retries.py", line 62, in shim
    return f(*args, **kwargs)
  File "/var/lib/pgsql/envs/wal-e/lib/python2.7/site-packages/wal_e/worker/swift/swift_worker.py", line 78, in fetch_partition
    raise exc
AssertionError: This socket is already used by another greenlet: <bound method Waiter.switch of <gevent.hub.Waiter object at 0x7f1e581b76e0>>
HINT: A better error message should be written to handle this exception. Please report this output and, if possible, the situation under which it arises.
STRUCTURED: time=2015-03-12T02:25:53.429872-00 pid=24675
wal_e.retries WARNING MSG: retrying after encountering exception
DETAIL: Exception information dump:
Traceback (most recent call last):
  File "/var/lib/pgsql/envs/wal-e/lib/python2.7/site-packages/wal_e/retries.py", line 62, in shim
    return f(*args, **kwargs)
  File "/var/lib/pgsql/envs/wal-e/lib/python2.7/site-packages/wal_e/worker/swift/swift_worker.py", line 78, in fetch_partition
    raise exc
OSError: [Errno 32] Broken pipe
The AssertionError suggests that two greenlets are trying to use the same socket at the same time, i.e. some sort of greenlet synchronization issue.
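As for the first exception, my guess (not confirmed) is that "ReadError: empty file" is a follow-on symptom: if lzop bails out before writing anything to stdout, tarfile is handed a stream with no data in it. Here is a minimal stdlib-only sketch of that behaviour. It is not wal-e code, and the helper names try_extract_stream and make_tar_bytes are made up for illustration:

```python
import io
import tarfile


def try_extract_stream(stream):
    """Open `stream` as a streaming tar archive (mode "r|", as used for pipes).

    Returns the list of member names, or the error text if tarfile rejects it.
    """
    try:
        with tarfile.open(fileobj=stream, mode="r|") as tar:
            return [member.name for member in tar]
    except tarfile.ReadError as exc:
        return "ReadError: %s" % exc


def make_tar_bytes():
    """Build a tiny in-memory tar archive holding one file, for comparison."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        data = b"hello"
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(data)
        tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


# An empty stream stands in for a decompressor that produced no output.
empty_result = try_extract_stream(io.BytesIO(b""))
# A valid archive read through the same code path, for contrast.
ok_result = try_extract_stream(io.BytesIO(make_tar_bytes()))

print(empty_result)
print(ok_result)
```

If that reading is right, the tarfile error is just noise and the real question is why lzop is receiving data it can't decompress.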
This is on a RHEL 6.6 machine, with the following dependencies installed in a Python 2.7 virtualenv:
$ pip list
argparse (1.2.1)
azure (0.8.4)
Babel (1.3)
boto (2.32.1)
futures (2.2.0)
gevent (1.0.1)
greenlet (0.4.4)
iso8601 (0.1.10)
lockfile (0.10.2)
netaddr (0.7.12)
oslo.config (1.4.0)
pbr (0.10.0)
pip (1.5.6)
prettytable (0.7.2)
python-daemon (1.6.1)
python-keystoneclient (0.11.1)
python-swiftclient (2.3.1.60.gc9f79e6)
pytz (2014.7)
requests (2.4.3)
setuptools (3.6)
simplejson (3.6.4)
six (1.8.0)
stevedore (1.0.0)
wal-e (0.8c2)
wsgiref (0.1.2)
I tried downgrading wal-e in case this was related to a recent release, but the same behaviour is present in 0.8a1 and 0.7.0.
I haven't had a chance to dig any deeper into this yet, but figured I'd post it here in case anyone else has run into it, or in case the cause is more obvious to other group members.
Thanks,
Dave