prefetch appears to cause fatal error archive file "X" has wrong size: 0 instead of Y (in 0.8c1)

hun...@napofearth.com

Oct 21, 2014, 8:27:17 PM

to wa...@googlegroups.com

Hi,

After upgrading from 0.7.2 to 0.8c1, I've noticed that my typical run of wal-e backup-fetch + wal-fetch in recovery.conf now fails unless I disable prefetching. This ultimately means that PostgreSQL will fail to start after a backup-fetch. The logs from a successful restore / recovery without prefetch and a failed restore / recovery with it are included below.

So far as environment goes, both runs were done from a clean Ubuntu 12.04 VM with an empty PostgreSQL 9.3 cluster directory, and both runs pulled from the same S3 endpoint. I initially suspected that the S3 endpoint was to blame -- it's a local Ceph mirror of what's in AWS -- but the error also occurred when pointing to the original bucket in AWS. I was also able to verify that the WAL file in question (0000000300000042000000F5) does not exist in S3.

So, apart from my own user error, my guess is that the prefetch is writing out something it shouldn't have when trying (and necessarily failing) to fetch the non-existing, N + 1th WAL file. But, that's just a guess. I need to spend a little more time reading through the directories that prefetch creates (and how PostgreSQL interacts with them) to offer any better opinion. My apologies for not doing so before posting this -- especially since fully understanding PostgreSQL's restore_command is Worth the Trouble.

Please let me know if this should be filed as a bug on Github, and also if there are any more details I can provide about the state of this particular WAL-E archive. It's private data, of course, but it is a static archive at this point (i.e. no more WAL files are being written to this particular prefix, nor have they been for 4-5 days), so I should at least be able to provide consistent answers to any questions you might have. This WAL-E restore from a fresh VM is also already automated, so it's easy to re-run this again with extra logging or whatnot, if that helps.

Kind regards,

HJB

with default (`--prefetch 8`). recovery.conf:

restore_command = '/usr/bin/env AWS_ACCESS_KEY_ID=XXX \

AWS_SECRET_ACCESS_KEY=YYY WALE_S3_ENDPOINT=https+path://Z \

/opt/wal-e/virtualenv/bin/wal-e \

--s3-prefix s3://BUCKET/PREFIX/ wal-fetch "%f" "%p"'

log:

wal_e.operator.backup INFO MSG: promoted prefetched wal segment

STRUCTURED: time=2014-10-21T23:43:31.730119-00 pid=15828 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000EF.lzo prefix=PREFIX/ seg=0000000300000042000000EF

2014-10-21 23:43:31 UTC LOG: restored log file "0000000300000042000000EF" from archive

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:43:32.319833-00 pid=15854

wal_e.operator.backup INFO MSG: promoted prefetched wal segment

STRUCTURED: time=2014-10-21T23:43:32.365143-00 pid=15854 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F0.lzo prefix=PREFIX/ seg=0000000300000042000000F0

2014-10-21 23:43:32 UTC LOG: restored log file "0000000300000042000000F0" from archive

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:43:32.942442-00 pid=15872

wal_e.operator.backup INFO MSG: promoted prefetched wal segment

STRUCTURED: time=2014-10-21T23:43:32.986935-00 pid=15872 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F1.lzo prefix=PREFIX/ seg=0000000300000042000000F1

2014-10-21 23:43:33 UTC LOG: restored log file "0000000300000042000000F1" from archive

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:43:33.507799-00 pid=15893

wal_e.operator.backup INFO MSG: promoted prefetched wal segment

STRUCTURED: time=2014-10-21T23:43:33.549269-00 pid=15893 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F2.lzo prefix=PREFIX/ seg=0000000300000042000000F2

2014-10-21 23:43:33 UTC LOG: restored log file "0000000300000042000000F2" from archive

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:43:34.103761-00 pid=15907

wal_e.operator.backup INFO MSG: promoted prefetched wal segment

STRUCTURED: time=2014-10-21T23:43:34.155012-00 pid=15907 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F3.lzo prefix=PREFIX/ seg=0000000300000042000000F3

2014-10-21 23:43:34 UTC LOG: restored log file "0000000300000042000000F3" from archive

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:43:34.677411-00 pid=15935

wal_e.operator.backup INFO MSG: promoted prefetched wal segment

STRUCTURED: time=2014-10-21T23:43:34.726169-00 pid=15935 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F4.lzo prefix=PREFIX/ seg=0000000300000042000000F4

2014-10-21 23:43:34 UTC LOG: restored log file "0000000300000042000000F4" from archive

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:43:35.191866-00 pid=15947

wal_e.operator.backup INFO MSG: promoted prefetched wal segment

STRUCTURED: time=2014-10-21T23:43:35.257994-00 pid=15947 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F5.lzo prefix=PREFIX/ seg=0000000300000042000000F5

2014-10-21 23:43:35 UTC FATAL: archive file "0000000300000042000000F5" has wrong size: 0 instead of 16777216

2014-10-21 23:43:35 UTC LOG: startup process (PID 15621) exited with exit code 1

2014-10-21 23:43:35 UTC LOG: terminating any other active server processes

with `--prefetch 0`. recovery.conf:

restore_command = '/usr/bin/env AWS_ACCESS_KEY_ID=XXX \

AWS_SECRET_ACCESS_KEY=YYY WALE_S3_ENDPOINT=https+path://Z \

/opt/wal-e/virtualenv/bin/wal-e \

--s3-prefix s3://BUCKET/PREFIX/ \

wal-fetch --prefetch 0 "%f" "%p"'

log:

STRUCTURED: time=2014-10-21T23:57:09.337312-00 pid=15932 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F5.lzo prefix=PREFIX/ seg=0000000300000042000000F5 state=complete

2014-10-21 23:57:09 UTC LOG: WAL file is from different database system: WAL file database system identifier is 6072803227874029436, pg_control database system identifier is 5943550808455905278.

2014-10-21 23:57:09 UTC LOG: redo done at 42/F456F620

2014-10-21 23:57:09 UTC LOG: last completed transaction was at log time 2014-10-16 05:30:02.567065+00

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:57:09.533006-00 pid=15939

wal_e.operator.backup INFO MSG: begin wal restore

STRUCTURED: time=2014-10-21T23:57:09.563344-00 pid=15939 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F4.lzo prefix=PREFIX/ seg=0000000300000042000000F4 state=begin

wal_e.blobstore.s3.s3_util INFO MSG: completed download and decompression

DETAIL: Downloaded and decompressed "s3://BUCKET/PREFIX/wal_005/0000000300000042000000F4.lzo" to "pg_xlog/RECOVERYXLOG"

STRUCTURED: time=2014-10-21T23:57:10.137669-00 pid=15939

wal_e.operator.backup INFO MSG: complete wal restore

STRUCTURED: time=2014-10-21T23:57:10.138563-00 pid=15939 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/0000000300000042000000F4.lzo prefix=PREFIX/ seg=0000000300000042000000F4 state=complete

2014-10-21 23:57:10 UTC LOG: restored log file "0000000300000042000000F4" from archive

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:57:10.353051-00 pid=15959

wal_e.operator.backup INFO MSG: begin wal restore

STRUCTURED: time=2014-10-21T23:57:10.385350-00 pid=15959 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000004.history.lzo prefix=PREFIX/ seg=00000004.history state=begin

lzop: <stdin>: not a lzop file

wal_e.blobstore.s3.s3_util WARNING MSG: could no longer locate object while performing wal restore

DETAIL: The absolute URI that could not be located is s3://BUCKET/PREFIX/wal_005/00000004.history.lzo.

HINT: This can be normal when Postgres is trying to detect what timelines are available during restoration.

STRUCTURED: time=2014-10-21T23:57:10.514732-00 pid=15959

wal_e.operator.backup INFO MSG: complete wal restore

STRUCTURED: time=2014-10-21T23:57:10.515860-00 pid=15959 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000004.history.lzo prefix=PREFIX/ seg=00000004.history state=complete

2014-10-21 23:57:10 UTC LOG: selected new timeline ID: 4

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-21T23:57:10.728339-00 pid=15966

wal_e.operator.backup INFO MSG: begin wal restore

STRUCTURED: time=2014-10-21T23:57:10.761165-00 pid=15966 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000003.history.lzo prefix=PREFIX/ seg=00000003.history state=begin

wal_e.blobstore.s3.s3_util INFO MSG: completed download and decompression

DETAIL: Downloaded and decompressed "s3://BUCKET/PREFIX/wal_005/00000003.history.lzo" to "pg_xlog/RECOVERYHISTORY"

STRUCTURED: time=2014-10-21T23:57:10.988149-00 pid=15966

wal_e.operator.backup INFO MSG: complete wal restore

STRUCTURED: time=2014-10-21T23:57:10.989599-00 pid=15966 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000003.history.lzo prefix=PREFIX/ seg=00000003.history state=complete

2014-10-21 23:57:11 UTC LOG: restored log file "00000003.history" from archive

2014-10-21 23:57:11 UTC LOG: archive recovery complete

2014-10-21 23:57:11 UTC LOG: database system is ready to accept connections

Daniel Farina

Oct 21, 2014, 8:59:23 PM

to Hunter Blanks, wa...@googlegroups.com

On Tue, Oct 21, 2014 at 5:27 PM, <hun...@napofearth.com> wrote:
> Hi,
>
> After upgrading from 0.7.2 to 0.8c1, I've noticed that my typical run of
> wal-e backup-fetch + wal-fetch in recovery.conf now fails unless I disable
> prefetching. This ultimately means that PostgreSQL will fail to start after
> a backup-fetch. The logs from a successful restore / recovery without
> prefetch and a failed restore / recovery with it are included below.

Uh oh. That sounds like a stop-ship bug although I haven't encountered
it myself. Looks like somehow what is a 404 gets translated to an
empty file and then hilarity ensues.

Are you using standby_mode=on by any chance?

Hunter Blanks

Oct 21, 2014, 10:16:57 PM

to Daniel Farina, wa...@googlegroups.com

Daniel,

We do have hot_standby = on. I ought to have included the postgresql.conf. Notable settings on this particular box are:

wal_level = archive

# archive_mode (default)
# archive_command (default)

max_wal_senders = 5 # max number of walsender processes (if master)

hot_standby = on # "on" allows queries during recovery (if not master)

Thus, when in recovery, we do allow queries, since in production, we want a hot standby brought up with WAL-E (and replicating indefinitely due to recovery.conf that also contains a hot standby host) to be able to serve queries.

-HJB

Daniel Farina

Oct 21, 2014, 10:38:23 PM

to hun...@napofearth.com, Daniel Farina, wal-e

On Tue, Oct 21, 2014 at 7:04 PM, Hunter Blanks <hun...@napofearth.com> wrote:
> Daniel,
>
> We do have hot_standby = on. I ought to have included the postgresql.conf.

hot_standby=on is not the same as standby_mode=on in recovery.conf.
Yeah yeah, it's one of those things:
http://www.postgresql.org/docs/9.3/static/standby-settings.html

The reason for this query is that I believe postgres will *retry* when
it gets a bad wal segment in this situation; otherwise it may blow up
as you relate. This is somewhat important to avoid having WAL-E flush
prefetched WAL to disk, which I noted as a serious bottleneck.

Hunter Blanks

Oct 22, 2014, 12:49:26 AM

to Daniel Farina, Daniel Farina, wal-e

Daniel,

Ah. To be clear, the recovery.conf in the failing case contains only restore_command.

Although we do use hot_standby = on when bringing up a replicated slave, so that it can pull WAL files either by WAL-E or by streaming replication, hot_standby is not on in the failing case.

Thus, the only difference between the passing and failing cases is the addition of "--prefetch 0" to restore_command, the only config option passed into recovery.conf.

-HJB

Daniel Farina

Oct 22, 2014, 1:13:26 AM

to Hunter Blanks, wal-e

On Tue, Oct 21, 2014 at 9:49 PM, Hunter Blanks <hun...@napofearth.com> wrote:
> Daniel,
>

> Ah. To be clear, the recovery.conf in the failing case contains only restore_command.

So, no standby_mode = on in recovery.conf? Can you give that a try
and use the "trigger file" to come out of recovery?

> Although we do use hot_standby = on when bringing up a replicated slave, so
> that it can pull WAL files either by WAL-E or by streaming replication,
> hot_standby is not on in the failing case.
>
> Thus, the only difference between the passing and failing cases is the addition
> of "--prefetch 0" to restore_command, the only config option passed into
> recovery.conf.

I can see why that may happen, because of a faithful rendering of the
404 as a proper exit code rather than a bogus empty file.

Hunter Blanks

Oct 22, 2014, 1:31:07 AM

to Daniel Farina, wal-e

Daniel,

On Tue, Oct 21, 2014 at 10:12 PM, Daniel Farina <dan...@heroku.com> wrote:

> So, no standby_mode = on in recovery.conf? Can you give that a try
> and use the "trigger file" to come out of recovery?

I'll give that a try in the AM and let you know how it goes. I'd imagine it should work fine, although it probably doesn't fix the root problem. For development environments, we almost always automate WAL-E recovery up to the last checkpoint and then kick it out of recovery. Requiring standby_mode = on makes it so the provisioner has to figure out when to take the machine out of recovery. Doing that right seems a little tricky.

-HJB

Daniel Farina

Oct 22, 2014, 1:40:09 AM

to Hunter Blanks, wal-e

Yeah, it probably should be fixed, but there are Reasons.

I personally can't work up any enthusiasm for the standby_mode=off
contract, because any interesting exit code kicks Postgres out of
recovery. OOM of wal-e or the shell that spawns it, Python segfaults,
any WAL-E bug, you name it: unfortunately there is no contract whereby
the de-archiver can say "there really was no file there" rather than
"I exited non-normally for any reason" (and, if one has problems with
shell, envdir, or whatever, then WAL-E may not even get a chance to
execute).

I think for the sake of "it should just work" this behavior can be
fixed...there is no reason to promote empty files or create them in
404 situations, but therein lies why this bug was not found by Heroku
Postgres via me.

Daniel Farina

Oct 22, 2014, 1:57:38 PM

to Hunter Blanks, wal-e

Yeah, looking at this more, I'm pretty sure I flubbed this:

def __exit__(self, exc_type, exc_val, exc_tb):
    try:
        if exc_type is None:
            # Success. Mark the segment as complete.
            #
            # In event of a crash, this os.link() without an fsync
            # can leave a corrupt file in the prefetch directory,
            # but given Postgres retries corrupt archive logs
            # (because it itself makes no provisions to sync
            # them), that is assumed to be acceptable.
            os.link(self.tf.name, path.join(
                self.prefetch_dir.prefetched_dir, self.segment.name))
    finally:
        shutil.rmtree(self.prefetch_dir.seg_dir(self.segment))

In combination with:

def wal_prefetch(self, base, segment_name):
    url = '{0}://{1}/{2}'.format(
        self.layout.scheme, self.layout.store_name(),
        self.layout.wal_path(segment_name))
    pd = prefetch.Dirs(base)
    seg = WalSegment(segment_name)
    pd.create(seg)
    with pd.download(seg) as d:
        logger.info(
            msg='begin wal restore',
            structured={'action': 'wal-prefetch',
                        'key': url,
                        'seg': segment_name,
                        'prefix': self.layout.path_prefix,
                        'state': 'begin'})

        ret = do_lzop_get(self.creds, url, d.dest,
                          self.gpg_key_id is not None, do_retry=False)

        logger.info(
            msg='complete wal restore',
            structured={'action': 'wal-prefetch',
                        'key': url,
                        'seg': segment_name,
                        'prefix': self.layout.path_prefix,
                        'state': 'complete'})

        return ret

Note how the code immediately above uses do_lzop_get, which uses its
return value to signify a 404. So the __exit__ sees no exception and
won't clean up as anticipated.
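
To make the failure mode concrete, here is a minimal, self-contained sketch. It is not WAL-E's code: FakeDownload and fake_get are hypothetical stand-ins for the prefetch download context manager and do_lzop_get, and the directory layout is simplified. It only illustrates that when a miss is reported through a return value, __exit__ is called with exc_type set to None and the empty temporary file still gets linked into place.

```python
import os
import tempfile
from os import path


class FakeDownload(object):
    """Hypothetical stand-in for the prefetch download context manager."""

    def __init__(self, prefetched_dir, name):
        self.prefetched_dir = prefetched_dir
        self.name = name

    def __enter__(self):
        # Create an empty temporary file for the fetcher to write into.
        fd, self.dest = tempfile.mkstemp(dir=self.prefetched_dir,
                                         prefix='tmp-')
        os.close(fd)
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        try:
            if exc_type is None:
                # No exception was raised, so the segment is treated
                # as complete and promoted.
                os.link(self.dest, path.join(self.prefetched_dir, self.name))
        finally:
            os.unlink(self.dest)


def fake_get(dest):
    """Hypothetical fetcher: simulates a 404 by writing nothing."""
    return False


def prefetch(prefetched_dir, name):
    with FakeDownload(prefetched_dir, name) as d:
        # The False return value never reaches __exit__.
        return fake_get(d.dest)
```

Calling prefetch() on an empty directory returns False yet leaves a zero-byte file under the segment name, which matches the shape of the "wrong size: 0" symptom.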

In the prior code, on re-thinking, the comment is probably wrong or
Postgres has a bug: Postgres, if it does not already, should never trust a
RECOVERYXLOG (semi-temporary file) in pg_xlog that is available
a priori, and should always run the restore_command once. Whereas, since
WAL-E can't currently figure out if the system has been online
continuously since it starts and exits so frequently, WAL-E's
promotion logic is liable to commit such a mistake.

The fix that is apparent to me is to find a way to ensure continuous
system operation even between executions, such as spitting out
boot-time to the ".wal-e" directory somewhere. This is slightly
non-portable and a bit grotty but I don't have a better idea right
now.
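
A rough sketch of that idea, under several assumptions that are mine and not WAL-E's: Linux only, the kernel's per-boot identifier in /proc/sys/kernel/random/boot_id is used in place of a literal boot time, and the marker file name "boot-id" and both function names are hypothetical. The paths are parameters so the logic can be exercised without touching /proc.

```python
import os

# Linux-specific: a random UUID regenerated by the kernel at every boot.
BOOT_ID_PATH = '/proc/sys/kernel/random/boot_id'


def read_boot_id(boot_id_path=BOOT_ID_PATH):
    with open(boot_id_path) as f:
        return f.read().strip()


def same_boot_as_last_run(state_dir, boot_id_path=BOOT_ID_PATH):
    """Return True if the marker in state_dir matches the current boot.

    When the marker is missing or stale it is rewritten, so that the
    next invocation compares against the current boot.
    """
    current = read_boot_id(boot_id_path)
    marker = os.path.join(state_dir, 'boot-id')  # hypothetical file name
    try:
        with open(marker) as f:
            previous = f.read().strip()
    except IOError:
        previous = None

    if previous != current:
        # Write to a temporary name and rename, so a crash cannot leave
        # a half-written marker behind. Only the file is fsynced here,
        # not the containing directory.
        tmp = marker + '.tmp'
        with open(tmp, 'w') as f:
            f.write(current + '\n')
            f.flush()
            os.fsync(f.fileno())
        os.rename(tmp, marker)

    return previous == current
```

A caller could discard everything in the prefetch directory whenever this returns False, and only promote prefetched segments when it returns True. It shares the caveat that it says nothing about mounts changing underneath.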

Hunter Blanks

Oct 22, 2014, 4:08:56 PM

to Daniel Farina, wal-e

Daniel,

Thanks for writing. Setting standby_mode = on and then triggering end of recovery with trigger_file does work, regardless of whether prefetch is enabled. More logs follow.

I share your sympathies on the limitations of "exit-code-as-interface" we get from PostgreSQL's restore_command. Of course, since we've daemonized wal-prefetch, it doesn't actually matter if we exit non-zero? Couldn't we just raise an exception if do_lzop_get() returns false?
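
As a sketch of what I mean (not a finished patch): turn the false return into an exception inside the with block, so the cleanup path sees a failure and skips the promotion. SegmentNotFound, fetch_or_raise and promote_on_success are hypothetical names, the context manager is a simplified stand-in for the real prefetch one, and fetch stands in for do_lzop_get, whose real signature takes more arguments.

```python
import contextlib
import os
import tempfile


class SegmentNotFound(Exception):
    """Hypothetical exception for a segment missing from the archive."""


def fetch_or_raise(fetch, url, dest):
    # Convert a falsy "not found" return value into an exception.
    if not fetch(url, dest):
        raise SegmentNotFound(url)
    return True


@contextlib.contextmanager
def promote_on_success(prefetched_dir, name):
    """Simplified stand-in for the prefetch download context manager."""
    fd, tmp = tempfile.mkstemp(dir=prefetched_dir, prefix='tmp-')
    os.close(fd)
    try:
        yield tmp
        # Reached only if the with-body did not raise.
        os.link(tmp, os.path.join(prefetched_dir, name))
    finally:
        os.unlink(tmp)


def prefetch(fetch, url, prefetched_dir, name):
    try:
        with promote_on_success(prefetched_dir, name) as dest:
            return fetch_or_raise(fetch, url, dest)
    except SegmentNotFound:
        # Nothing was promoted; report the miss to the caller.
        return False
```

With a fetch that returns False, nothing is left in the prefetch directory; with one that writes the file and returns True, the segment is linked into place as before.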

-HJB

Logs:

2014-10-22 19:52:08 UTC LOG: restored log file "0000000300000042000000F4" from archive

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-22T19:52:08.937034-00 pid=18312

wal_e.operator.backup INFO MSG: begin wal restore

STRUCTURED: time=2014-10-22T19:52:08.970987-00 pid=18312 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000004.history.lzo prefix=PREFIX/ seg=00000004.history state=begin

lzop: <stdin>: not a lzop file

wal_e.blobstore.s3.s3_util WARNING MSG: could no longer locate object while performing wal restore

DETAIL: The absolute URI that could not be located is s3://BUCKET/PREFIX/wal_005/00000004.history.lzo.

HINT: This can be normal when Postgres is trying to detect what timelines are available during restoration.

STRUCTURED: time=2014-10-22T19:52:09.098063-00 pid=18312

wal_e.operator.backup INFO MSG: complete wal restore

STRUCTURED: time=2014-10-22T19:52:09.099036-00 pid=18312 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000004.history.lzo prefix=PREFIX/ seg=00000004.history state=complete

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-22T19:52:09.290002-00 pid=18332

wal_e.operator.backup INFO MSG: begin wal restore

STRUCTURED: time=2014-10-22T19:52:09.323112-00 pid=18332 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000005.history.lzo prefix=PREFIX/ seg=00000005.history state=begin

lzop: <stdin>: not a lzop file

wal_e.blobstore.s3.s3_util WARNING MSG: could no longer locate object while performing wal restore

DETAIL: The absolute URI that could not be located is s3://BUCKET/PREFIX/wal_005/00000005.history.lzo.

HINT: This can be normal when Postgres is trying to detect what timelines are available during restoration.

STRUCTURED: time=2014-10-22T19:52:09.457994-00 pid=18332

wal_e.operator.backup INFO MSG: complete wal restore

STRUCTURED: time=2014-10-22T19:52:09.458903-00 pid=18332 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000005.history.lzo prefix=PREFIX/ seg=00000005.history state=complete

2014-10-22 19:52:09 UTC LOG: selected new timeline ID: 5

wal_e.main INFO MSG: starting WAL-E

DETAIL: The subcommand is "wal-fetch".

STRUCTURED: time=2014-10-22T19:52:09.649697-00 pid=18339

wal_e.operator.backup INFO MSG: begin wal restore

STRUCTURED: time=2014-10-22T19:52:09.682317-00 pid=18339 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000003.history.lzo prefix=PREFIX/ seg=00000003.history state=begin

wal_e.blobstore.s3.s3_util INFO MSG: completed download and decompression

DETAIL: Downloaded and decompressed "s3://BUCKET/PREFIX/wal_005/00000003.history.lzo" to "pg_xlog/RECOVERYHISTORY"

STRUCTURED: time=2014-10-22T19:52:09.915124-00 pid=18339

wal_e.operator.backup INFO MSG: complete wal restore

STRUCTURED: time=2014-10-22T19:52:09.916287-00 pid=18339 action=wal-fetch key=s3://BUCKET/PREFIX/wal_005/00000003.history.lzo prefix=PREFIX/ seg=00000003.history state=complete

Daniel Farina

Oct 22, 2014, 4:28:01 PM

to Hunter Blanks, wal-e

On Wed, Oct 22, 2014 at 1:08 PM, Hunter Blanks <hun...@napofearth.com> wrote:
> Daniel,
>

> Thanks for writing. Setting standby_mode = on and then triggering end of
> recovery with trigger_file does work, regardless of whether prefetch is
> enabled. More logs follow.
>
> I share your sympathies on the limitations of "exit-code-as-interface" we
> get from PostgreSQL's restore_command. Of course, since we've daemonized
> wal-prefetch, it doesn't actually matter if we exit non-zero? Couldn't we
> just raise an exception if do_lzop_get() returns false?

Yeah. That fix I think is quite tractable in a number of ways,
including that one. Care to whip up a test (if not a huge pain) and
patch? If you don't have time to do it soon, I will, since there is a
release pending.

The other matter of dealing with crashes (unfortunately, by experiment
in my first design, syncing the WAL prefetched in this way is a
meaningful bottleneck in practical scenarios) is a bit more
troublesome. I have my design above (boot-time based), which is not
even quite 100% in events like messing with mounts. And I'm not keen
on something heavy-handed that punishes the common case for a
vanishingly small risk (or zero, in the case of standby_mode=on) just to
solve the problem "completely".

I think I may leave the crash recovery case as a defect for now.

Hunter Blanks

Oct 22, 2014, 6:00:40 PM

to Daniel Farina, wal-e

Daniel,

OK! I gave it a few tries, but I've not been able to get tox up and running (pip install -e . consistently fails with message "error: None"). So this pull request has only been tested by running a restore with prefetch:

https://github.com/wal-e/wal-e/pull/144

I'll attach logs from that restore to the PR for reference. The commit message at least offers a guess as to where prefetch might be tested.

-HJB

Daniel Farina

Oct 22, 2014, 6:21:12 PM

to Hunter Blanks, wal-e

On Wed, Oct 22, 2014 at 3:00 PM, Hunter Blanks <hun...@napofearth.com> wrote:
> Daniel,
>

> OK! I gave it a few tries, but I've not been able to get tox up and running
> (pip install -e . consistently fails with message "error: None"). So this
> pull request has only been tested by running a restore with prefetch:
>
> https://github.com/wal-e/wal-e/pull/144

Are you on Ubuntu or something (in which case tox has a package like
python-tox)? Is there a reason you are not using "pip install tox"?

If memory serves, tox is taken as a given to bootstrap the entire test
process. WAL-E's setup.py and requirements.txt are ignorant of it for now.

Jeff Frost

Oct 28, 2014, 5:12:24 PM

to wa...@googlegroups.com, dan...@heroku.com, hun...@napofearth.com

On Wednesday, October 22, 2014 3:00:40 PM UTC-7, Hunter Blanks wrote:

> Daniel,
>
> OK! I gave it a few tries, but I've not been able to get tox up and running (pip install -e . consistently fails with message "error: None"). So this pull request has only been tested by running a restore with prefetch:
>
> https://github.com/wal-e/wal-e/pull/144
>
> I'll attach logs from that restore to the PR for reference. The commit message at least offers a guess as to where prefetch might be tested.

For what it's worth, this fixed the same problem for me when restoring GPG encrypted WAL files.

Daniel Farina

Oct 28, 2014, 5:24:03 PM

to Jeff Frost, wal-e, Hunter Blanks

Nice. Thanks for catching and writing the symptom and its solution.
