On Tue, Aug 12, 2014 at 4:16 PM, Paul Tiseo <paulx...@gmail.com> wrote:
> I kinda looked there already and didn't see any notable error messages. When
> I do a manual backup-push, I have to hit a ctrl-C to interrupt it, so syslog
> seems to only contain non-error messages:
>
> Aug 12 22:44:00 ip-10-123-148-194 wal_e.operator.backup INFO MSG: start
> upload postgres version metadata#012 DETAIL: Uploading to
> s3://production_postgresql/basebackups_005/base_000000010000000000000020_00000040/extended_version.txt.#012
> STRUCTURED: time=2014-08-12T22:44:00.429147-00 pid=4964
> Aug 12 22:44:00 ip-10-123-148-194 wal_e.operator.backup INFO MSG:
> postgres version metadata upload complete#012 STRUCTURED:
> time=2014-08-12T22:44:00.553103-00 pid=4964
> Aug 12 22:44:00 ip-10-123-148-194 wal_e.worker.upload INFO MSG:
> beginning volume compression#012 DETAIL: Building volume 0.#012
> STRUCTURED: time=2014-08-12T22:44:00.844788-00 pid=4964
> Aug 12 22:44:02 ip-10-123-148-194 wal_e.worker.upload INFO MSG: begin
> uploading a base backup volume#012 DETAIL: Uploading to
> "s3://production_postgresql/basebackups_005/base_000000010000000000000020_00000040/tar_partitions/part_00000000.tar.lzo".#012
> STRUCTURED: time=2014-08-12T22:44:02.007589-00 pid=4964
> Aug 12 22:44:02 ip-10-123-148-194 wal_e.worker.upload INFO MSG: finish
> uploading a base backup volume#012 DETAIL: Uploading to
> "s3://production_postgresql/basebackups_005/base_000000010000000000000020_00000040/tar_partitions/part_00000000.tar.lzo"
> complete at 15728.8KiB/s. #012 STRUCTURED:
> time=2014-08-12T22:44:02.459143-00 pid=4964
These are for base backups, so not very interesting given your problem.
> The log in /var/log/postgresql is likewise uninformative. The error from
> archive_command is repeated instances of this:
>
> 2014-08-12 23:11:00.520 UTC,,,30174,,53ea0337.75de,2679,,2014-08-12 12:06:15
> UTC,,0,LOG,00000,"archive command failed with exit code 111","The failed
> archive command was: envdir /etc/wal-e.d/env wal-e wal-push
> pg_xlog/000000010000000000000009.00000028.backup",,,,,,,,""
> 2014-08-12 23:11:00.520 UTC,,,30174,,53ea0337.75de,2680,,2014-08-12 12:06:15
> UTC,,0,WARNING,01000,"archiving transaction log file
> ""000000010000000000000009.00000028.backup"" failed too many times, will try
> again later",,,,,,,,,""
Exit code 111: this is probably emitted by the shell attempting to run
WAL-E rather than by WAL-E itself. Like Cody wrote, consider using an
absolute path to wal-e if $PATH isn't quite right from the perspective
of the postgres user. The log shows you are relying on $PATH being
configured for Postgres: envdir /etc/wal-e.d/env **wal-e** wal-push
(double-star emphasis mine). Consider not doing that.
It is also possible that the executable bit on wal-e is not set for the relevant user.
To test the theory that this is a $PATH resolution problem, you may
want to adjust your archive_command like this to get more output:
"type wal-e; [...the rest...]"
or possibly:
"which wal-e; [...the rest...]"