
Need help with: pg_dump and non-interactive passwords (~/.pgpass) on Apache.


Erwin Moller

Apr 23, 2010, 7:06:28 AM
Hi,

Goal: I am trying to offer a client of mine a full database export
download for their database (schema and data).

The environment: PHP5/Postgres8.1.17/Apache2
The place where I want to offer this download is secured: a PHP page
which is only accessible after a successful login (enforced via $_SESSION).

I can execute shell commands in PHP and catch the output.
Right now I am trying something along these lines:

/usr/bin/pg_dump -h localhost -p 5432 -D -O -x -U someuser somedb

The problem is that pg_dump doesn't accept a password on the command line.
(The reason why the password is not welcome on the command line is clear:
if it did accept a password, it would be visible to anyone who can run ps
on the same machine.)

But how should I circumvent this restriction in a smart/safe way?
I read the advice in the Postgres forums to use .pgpass instead.

If I understand it right, I should create a .pgpass file in the
user's home directory (with 600 permissions).
The format should be: hostname:port:database:username:password
(http://www.postgresql.org/docs/current/interactive/libpq-pgpass.html)
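For the record, a minimal sketch of the mechanics (the path and credentials here are made up, not my real ones):

```shell
# Sketch with hypothetical credentials: create a .pgpass-style file in the
# documented hostname:port:database:username:password format, readable only
# by its owner. libpq refuses to use the file if the permissions are looser
# than 600.
PGPASSFILE=./pgpass.demo
printf 'localhost:5432:somedb:someuser:secret\n' > "$PGPASSFILE"
chmod 600 "$PGPASSFILE"
ls -l "$PGPASSFILE"    # should show -rw-------
```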

Now to my problem/confusion:
The user is www-data in my case (that is Apache running PHP).
That is the same user that runs all other websites on the same machine.
Besides the fact that www-data doesn't have a home directory, it would be
an unsafe solution to put my db credentials in there, because all other
websites could then access the same database without knowing the password(!).

So in my opinion that solution would be more unsafe than allowing a
password as an argument to pg_dump: the latter would only be visible
for a short time, whereas a .pgpass would open up my database for all,
all the time.

I must be missing something important here.
Can anybody clarify it a little more for me?

Is my above analysis correct and how should I solve this problem?

Thanks for your time!

Regards,
Erwin Moller


--
"There are two ways of constructing a software design: One way is to
make it so simple that there are obviously no deficiencies, and the
other way is to make it so complicated that there are no obvious
deficiencies. The first method is far more difficult."
-- C.A.R. Hoare

Marco Mariani

Apr 23, 2010, 8:56:01 AM
On 04/23/2010 01:06 PM, Erwin Moller wrote:

> /usr/bin/pg_dump -h localhost -p 5432 -D -O -x -U someuser somedb
>
> The problem is that pg_dump doesn't want a password on the commandline.
> (The reason why the password is not welcome on the commandline is clear:
> If it did accept a password it would be visible for all who can ps on
> the same machine.)


Would it be feasible to modify pg_dump to read a password from a
specified file? ldapsearch and other programs do it.

That way you could write the password to a pipe from your application,
and feed its output to pg_dump -y /dev/fd/0

Erwin Moller

Apr 23, 2010, 10:09:46 AM
Marco Mariani schreef:

Hi Marco,

Thanks for your suggestion.
That would surely be better than sharing the password with the rest of
the users on the machine. :-)

But I have never heard of that -y option, nor does my 'man pg_dump' know it.
Possibly I misunderstand your suggestion.

I posted my question in a different form to comp.lang.php, and Captain
Paralytic had an easy solution:

(From PHP)
$command = "/usr/lib/postgresql/8.1/bin/pg_dump -h localhost -p 5432 -D
-O -x -U someuser somedb < /path/to/dbpassword.txt";
$handle = popen($command, 'r');
... read the stream ...

But I like the part of your suggestion that writes the password to a
pipe. It is safer than having it lying open for www-data to read.

Now I only have to make a pipe of that file and I am ready to go. :-)
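A sketch of that pipe idea (with `cat` standing in for pg_dump, since I can't test against the real server here; one caveat I'm not sure about: pg_dump's password prompt may read from /dev/tty rather than stdin on some builds, so this needs testing before relying on it):

```shell
# Sketch: feed a secret through a named pipe so it never sits in a
# world-readable file. `cat` stands in for the consumer; in real use
# the pipe would be redirected to pg_dump's stdin.
FIFO=./pass.fifo
mkfifo -m 600 "$FIFO"
printf 'secret\n' > "$FIFO" &   # writer blocks until a reader opens the pipe
PASS=$(cat "$FIFO")             # reader side; stand-in for pg_dump
rm -f "$FIFO"                   # the secret leaves no file behind
echo "read: $PASS"              # prints: read: secret
```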

Thanks.

Marco Mariani

Apr 23, 2010, 10:15:17 AM
On 04/23/2010 04:09 PM, Erwin Moller wrote:

> But I never heard of that -y option, nor does my 'man pg_dump' know it.

Yes, I was mildly proposing to add the option. It's not there.

Erwin Moller

Apr 23, 2010, 10:47:03 AM
Marco Mariani schreef:

> On 04/23/2010 04:09 PM, Erwin Moller wrote:
>
>> But I never heard of that -y option, nor does my 'man pg_dump' know it.
>
> Yes, I was mildly proposing to add the option. It's not there.
>

Well, to be honest, I simply don't have the balls to patch the default
pg_dump on my production environment. ;-)

By the way: I just told a co-worker of mine about the problem and he
said that I had solved that problem years ago in another project. (I
totally forgot. I clearly need a memory upgrade.)

An alternative solution is this command:

$command = "export PGPASSWORD=theneededpassword
/usr/lib/postgresql/8.1/bin/pg_dump -D -O -x -c -h localhost -p 5433 -U
someuser somedatabase";

I like that solution even better.
Simple and clean.
I bet somebody in here gave me that solution a few years ago. ;-)

Thanks for your help.

Erwin Moller

Apr 23, 2010, 10:54:26 AM
Erwin Moller schreef:

> Marco Mariani schreef:
>> On 04/23/2010 04:09 PM, Erwin Moller wrote:
>>
>>> But I never heard of that -y option, nor does my 'man pg_dump' know it.
>>
>> Yes, I was mildly proposing to add the option. It's not there.
>>
>
> Well, to be honest, I simply don't have the balls to expand the default
> pg_dump on my production environment. ;-)
>
> By the way: I just told a co-worker of mine about the problem and he
> said that I solved that problem years ago in another project. (I totally
> forgot. I clearly need a memory upgrade.)
>
> An alternative solution is this command:
>
> $command = "export PGPASSWORD=theneededpassword
> /usr/lib/postgresql/8.1/bin/pg_dump -D -O -x -c -h localhost -p 5433 -U
> someuser somedatabase";
>

The above line was scrambled. Here is a working one:
$command = "export PGPASSWORD=theneededpassword\n/usr/lib/postgresql/8.1/bin/pg_dump -D -O -x -c -h localhost -p 5433 -U someuser somedatabase";

All on one line.
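The nice property of this approach is that the password ends up in the environment, not in the argument list, so it never shows up in ps. A quick way to convince yourself (`sh -c` standing in for pg_dump):

```shell
# Demonstration: a variable set as a prefix to a single command is
# inherited by the child process's environment, but does not appear in
# its argv (which is what ps shows). `sh -c` stands in for pg_dump.
PGPASSWORD=secret sh -c 'echo "child sees: $PGPASSWORD"'
# prints: child sees: secret
```

The single-command prefix form is even tighter than export: the variable doesn't linger in the shell for later commands. (Other processes of the same user, or root, can still read a process's environment, so it is not perfect either.)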

Regards,
Erwin Moller

Mladen Gogala

Apr 24, 2010, 2:02:10 PM
On Fri, 23 Apr 2010 13:06:28 +0200, Erwin Moller wrote:

> I must be missing something important here. Can anybody clarify it a
> little more for me?

Yup. Dumping a database through the web interface is not a really good
idea. For some things, the CLI is king. For everything else, there is a
Mastercard(TM).

--
http://mgogala.byethost5.com

Erwin Moller

Apr 25, 2010, 11:24:30 AM
Mladen Gogala schreef:


Hi,

Well, personally I think that is for the developer to decide.
For example: I don't want my clients to have shell access. That would
give me more security headaches than offering a backup via a (secured)
web interface.
If they insist on backing up their database at regular intervals
themselves, I want to offer that to them, even though I make daily backups.

And I don't use Mastercard. ;-)

Mladen Gogala

Apr 25, 2010, 12:42:05 PM
On Sun, 25 Apr 2010 17:24:30 +0200, Erwin Moller wrote:

>> Yup. Dumping database through the web interface is not a really good
>> idea. For some things, CLI the king. For everything else, there is a
>> Mastercard(TM).
>
>
> Hi,
>
> Well, personally I think that is for the developer to decide.

Actually, no it is not for the developer to decide. Database backup is a
sensitive operation which should not be a part of the application.

> For
> example: I don't want my clients to have shell access.

Neither do I. Why would your clients have to do database backup? Database
backup is a part of the DBA job. I am an Oracle DBA working with Oracle
since 1989, lately into Postgres, and have never offered such capability.
In the world of commercial database applications, backup is offered only
as a DBA tool. Clients don't do backups, period. DBA does backups.
Second, pg_dump is not a good backup tool. It offers a logically
consistent snapshot of your data, but it doesn't allow you to do
recovery. There is a strong possibility of data loss.
Do you allow your users to have their database schemas? You are probably
an ISP, offering hosting? If your users are using Postgres, as they
should because it's a great database, they can do it from their own
client tools:

pg_dump -U mgogala -W -h lpo-postgres-01 scott >/tmp/scott.sql
Password:

It works like a charm. I needed -W because I am a trusted user on this
host, being a DBA, so it wouldn't have asked me for my password without
"-W". You get some perks from being able to edit pg_hba.conf.
The bottom line is this: if the client is using Postgres, he or she
should be proficient with psql and pg_dump. No need for web dump.
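For illustration, a hypothetical pg_hba.conf fragment (not from any real setup) showing the kind of entry I mean:

```
# Hypothetical pg_hba.conf fragment. A 'trust' line lets the named user
# connect without a password (which is why -W was needed above to force
# the prompt); 'md5' would demand one.
#
# TYPE   DATABASE   USER      ADDRESS          METHOD
host     scott      mgogala   192.168.1.0/24   trust
host     all        all       0.0.0.0/0        md5
```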

> That would give
> me more security headaches than offering a backup via a (secured)
> webinterface.

If they need to back up their application data, which is, by the way, not
instilling confidence, you could always offer to dump it into CSV
format.

--
http://mgogala.byethost5.com

Erwin Moller

Apr 26, 2010, 5:31:43 AM
Mladen Gogala schreef:

> On Sun, 25 Apr 2010 17:24:30 +0200, Erwin Moller wrote:
>
>>> Yup. Dumping database through the web interface is not a really good
>>> idea. For some things, CLI the king. For everything else, there is a
>>> Mastercard(TM).
>>
>> Hi,
>>
>> Well, personally I think that is for the developer to decide.


Hi Mladen Gogala,

(Sorry for the lengthy response, but I don't agree with a lot of things
you say and want to make clear how and why that is.)

>
> Actually, no it is not for the developer to decide. Database backup is a
> sensitive operation which should not be a part of the application.

That is just your opinion, not a fact.
Why shouldn't a database backup be a part of an application?
I agree that in most cases it is nonsense, but not always. (Read on.)

>
>> For
>> example: I don't want my clients to have shell access.
>
> Neither do I. Why would your clients have to do database backup?

There can be many reasons.
To name a few:
1) Client feels more secure when they have their own copy.
2) They want more backups than are currently configured by me (daily).
3) They want to try something really wild, and want a backup before
doing that.

Possibly a lot more reasons.

I don't know. I don't care.

I built their application, they paid for it. They own it (I always give
the intellectual property to my clients).
Bottom line: if the owner of the software I make for them wants it, they
will get it.

If I think it is a really stupid request I will of course tell them, but
in this case I don't think it is stupid.


> Database
> backup is a part of the DBA job.

Here is the problem: we are not talking about some Fortune 500 company
here with dedicated DBAs.
This is a small company that needed a highly specific solution. I built
that for them. There is no DBA to be found anywhere.
I am the closest thing to one in this situation, and my Postgres backups
are configured and running fine, but 'only' on a daily basis.


> I am an Oracle DBA working with Oracle
> since 1989, lately into Postgres, and have never offered such capability.
> In the world of commercial database applications, backup is offered only
> as a DBA tool. Clients don't do backups, period. DBA does backups.

Well, you make it sound like that is carved in stone. It isn't.
Don't get me wrong: of course a dedicated DBA is preferred over the
solution I made, but a dedicated DBA is overkill (= too expensive).
So we settle for daily backups plus the option to create a backup on the
fly if the client sees some reason for that.

> Second, pg_dump is not a good backup tool. It offers a logically
> consistent snapshot of your data, but it doesn't allow you to do
> recovery. There is a strong possibility of data loss.


'strong possibility of data loss'?
This is what Postgres website says about pg_dump:

source: http://www.postgresql.org/docs/8.3/static/app-pgdump.html
---------------------------
Description

pg_dump is a utility for backing up a PostgreSQL database. It makes
consistent backups even if the database is being used concurrently.
pg_dump does not block other users accessing the database (readers or
writers).

Dumps can be output in script or archive file formats. Script dumps are
plain-text files containing the SQL commands required to reconstruct the
database to the state it was in at the time it was saved. To restore
from such a script, feed it to psql. Script files can be used to
reconstruct the database even on other machines and other architectures;
with some modifications even on other SQL database products.
---------------------------

And a little further down:
--------------------------------
Notes

If your database cluster has any local additions to the template1
database, be careful to restore the output of pg_dump into a truly empty
database; otherwise you are likely to get errors due to duplicate
definitions of the added objects. To make an empty database without any
local additions, copy from template0 not template1, for example:

CREATE DATABASE foo WITH TEMPLATE template0;

pg_dump has a limitation; when a data-only dump is chosen and the option
--disable-triggers is used, pg_dump emits commands to disable triggers
on user tables before inserting the data and commands to re-enable them
after the data has been inserted. If the restore is stopped in the
middle, the system catalogs might be left in the wrong state.

Members of tar archives are limited to a size less than 8 GB. (This is
an inherent limitation of the tar file format.) Therefore this format
cannot be used if the textual representation of any one table exceeds
that size. The total size of a tar archive and any of the other output
formats is not limited, except possibly by the operating system.

The dump file produced by pg_dump does not contain the statistics used
by the optimizer to make query planning decisions. Therefore, it is wise
to run ANALYZE after restoring from a dump file to ensure good
performance. The dump file also does not contain any ALTER DATABASE ...
SET commands; these settings are dumped by pg_dumpall, along with
database users and other installation-wide settings.

Because pg_dump is used to transfer data to newer versions of
PostgreSQL, the output of pg_dump can be loaded into newer PostgreSQL
databases. It also can read older PostgreSQL databases. However, it
usually cannot read newer PostgreSQL databases or produce dump output
that can be loaded into older database versions. To do this, manual
editing of the dump file might be required.
---------------------------------

I can live with these limitations.
More importantly: if I receive the output of pg_dump I can restore the
database to that state.

Where is this 'strong possibility of data loss'?
I have used pg_dump on many, many occasions and never encountered any
data loss.

Am I missing something important here?


> Do you allow your users to have their database schemas?

No. They are laymen. I designed the database and created all interaction
with the database. They don't even have the password, nor do they know
how I named their database (unless they look into the pg_dump output
I just created for them).

> You are probably an ISP, offering hosting?

Yes and no. We are not actively selling hosting, but we have our own
dedicated servers on which we run applications for clients.

The main reason for this is that we can control the environment in great
detail and keep things safe.
We had our share of frustrations with hosting providers, so we decided a
long time ago to do our own hosting.
And most big ISPs don't offer Postgres at all.

> If your users are using Postgres, as they
> should because it's a great database, they can do it from their own
> client tools:
>
> pg_dump -U mgogala -W -h lpo-postgres-01 scott >/tmp/scott.sql
> Password:

My client doesn't understand that, nor do they need to.
And they would need shell access for that, wouldn't they?
I don't offer that to them. Even if I did, they wouldn't know how to use it.

In short: the reason I offer this db backup via a web interface is for
the following scenario:
1) Client wants to try something wild with the app and is afraid he/she
will screw up important stuff.
2) They download a snapshot (which runs via pg_dump behind the scenes).
3) They try their wild things.
4a) If they are happy, no further actions.
4b) If they indeed screwed up (for example: deleted important stuff),
they can ask me to roll back to the exact state the app was in before
they started.

I don't see anything bad in that approach.
(Of course, they could ask me every time to do the backup, but that
would cost them. Now they can play around in the knowledge I am only
needed if they need to restore.)


>
> It works like a charm. I needed -W because I am a trusted user on this
> host, being a DBA, so it wouldn't have asked me for my password without
> "-W". You get some perks from being able to edit pg_hba.conf.
> The bottom line is this: if the client is using Postgres, he or she
> should be proficient with psql and pg_dump. No need for web dump.


Well, the client doesn't know how to tell SQL apart from JavaScript. ;-)
And they don't have a bash shell or anything. I don't offer that to them.


>
>> That would give
>> me more security headaches than offering a backup via a (secured)
>> webinterface.
>
> If they need to backup their application data, which is, by the way, not
> instilling confidence, you could always offer to dump them into CSV
> format.

CSV instead of SQL?
What difference would that make?
Sounds like more work for me, so maybe it is a good idea. ;-)

The way I see it:
- It is THEIR data. Not mine. I am not going to keep them away from
their own data.
- If the pg_dump output makes them happy because they can experiment
more and have the knowledge I can restore it: fine. No harm done here.
- I don't see any advantage in CSV or XML over the SQL export. They
won't understand either format.


As always: feel free to completely disagree. :-)
I am curious what you have to say (honestly) and why you think the way
you do.
But arguments like 'Database backup is a sensitive operation which
should not be a part of the application.' are simply not always true.
That is too dogmatic.
If you think I have it all backwards, please correct me. I am always
willing to learn!

Thomas Kellerer

Apr 26, 2010, 5:43:50 AM
Mladen Gogala, 25.04.2010 18:42:

> Second, pg_dump is not a good backup tool. It offers a logically
> consistent snapshot of your data, but it doesn't allow you to do
> recovery.

pg_dump *is* Postgres' backup tool (apart from a PITR backup).

What do you mean with "doesn't allow you to do recovery"?

The file that pg_dump creates (either the SQL script or the custom format) will allow me to restore the database to the state it was in when I took the backup.


> There is a strong possibility of data loss.

Can you give some hard facts why you consider it to have a "strong possibility of data loss"?
It is the equivalent of Oracle's (obsolete) "exp" tool.

Thomas

Mladen Gogala

Apr 26, 2010, 8:16:18 AM
On Mon, 26 Apr 2010 11:43:50 +0200, Thomas Kellerer wrote:

> Mladen Gogala, 25.04.2010 18:42:
>> Second, pg_dump is not a good backup tool. It offers a logically
>> consistent snapshot of your data, but it doesn't allow you to do
>> recovery.
>
> pg_dump *is* Postgres' backup tool (apart from a PITR backup).

Apologies for the lengthy post, but this is an important topic.

pg_dump is a tool to take snapshots, not backups. Backup is done using OS
tools like tar, cpio or Bacula. What you call "PITR backup" is actually a
db interface to the OS backup, which causes the database to log the entire
block into the WAL file, not just the change record. You switch the
database to backup mode by issuing pg_start_backup('Label') and then do
the OS backup:


http://www.postgresql.org/docs/current/static/continuous-archiving.html


The main difference between a backup and a snapshot is that backup allows
you to do point in time recovery. Snapshot does not.
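Sketched out, the procedure looks roughly like this (hypothetical paths; it obviously needs a running server, so treat it as an outline only):

```
# Outline of the continuous-archiving setup described above.
#
# postgresql.conf:
#   archive_command = 'cp %p /mnt/backup/wal/%f'
#
# 1. Tell the server a base backup is starting (forces full-page WAL writes):
#      psql -c "SELECT pg_start_backup('nightly');"
# 2. Copy the data directory with an OS tool:
#      tar czf /mnt/backup/base.tar.gz /var/lib/postgresql/8.1/main
# 3. End backup mode; the WAL segments archived since step 1 complete the set:
#      psql -c "SELECT pg_stop_backup();"
```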

>
> What do you mean with "doesn't allow you to do recovery"?

There is no way to recover the data between the time the dump was taken
and the present.


>
> The file that pg_dump creates (either the SQL script or the custom
> format) will allow me to restore the database to the state it was in
> when I took the backup.
>
>
>> There is a strong possibility of data loss.
>
> Can you give some hard facts why you consider it having a "strong
> possibility of data loss"? It is the equivalent to Oracle's (obsolete)
> "exp" tool
>
> Thomas


And exp is not backup, either. BTW, I prefer pg_dump to exp because
pg_dump produces plain, readable SQL, unlike exp, which produces a binary
file. To tell the truth, pg_dump -F c also produces a binary format,
suitable for parallel recovery, but that's another story.

Basically, it is now 07:45 AM on Monday, April 26th here in New York
City. If I take a pg_dump now, I have no chance of recovering data entered
into my database at 08:00 AM. The pg_dump utility will issue a SQL
query, constrained by the ACID requirements, and produce an output from
which I can recover my database to that state. That is called a "snapshot";
it's like a photograph: still, frozen in time, you can't do much with it.

If you want to recover all of your data, you must do a proper online
backup, with the WAL archives. As for the export utility, here is what
Tom Kyte has to say about that: http://tinyurl.com/2domn66
Let me quote him:
"Export can make a logical copy of data, but that is all - never will it
create a backup"

The same, of course, holds true for the pg_dump utility. It's a logical
copy, a snapshot, a still photo, not a backup. Backup is taken with tar,
cpio or the like. Backup allows you to do a point-in-time recovery. Snapshot
does not.

Snapshots make sense when you want a *logical* state of your database at
some moment like the end of a month, the end of a quarter, a project
delivery or some other significant event. Snapshots are usually taken so
that you can re-create the database in that state. That, for instance, may
be required by law. It's much cheaper to have a backup tape (or DVD) in a
fire-proof cabinet than to hold the entire database around for years.
Backup is what you take daily to be able to recover from disastrous events
like a disk crash. The principal purpose of backup is to allow you to
recover all of your data, losing only the transactions that weren't
committed at the time of the failure.
If Postgres wants to find its way into the corporate server room, then this
is a distinction that has to be made. A corporate DR plan cannot rely on
the pg_dump utility alone; it must include backup. Being a DBA, my primary
task is to make sure that the databases I am in charge of are available
to my users and that users do not lose data. I can't do that with pg_dump
alone.

--
http://mgogala.byethost5.com

Mladen Gogala

unread,
Apr 26, 2010, 8:18:43 AM4/26/10
to
On Mon, 26 Apr 2010 11:31:43 +0200, Erwin Moller wrote:


> Am I missing something important here?

Yes. Please, read my reply to Thomas Kellerer. Once again, pg_dump is
*NOT* a backup tool.

--
http://mgogala.byethost5.com

Thomas Kellerer

Apr 26, 2010, 8:27:33 AM
Mladen Gogala, 26.04.2010 14:16:

>> pg_dump *is* Postgres' backup tool (apart from a PITR backup).
>
> Apologies for the lengthy post, but this is an important topic.
>
> pg_dump is a tool to take snapshots, not backups.

It all depends on your requirements. If you can live with the fact that you might lose data between the last backup (or "snapshot" in your terms) and "now", then pg_dump *is* a valid "backup" tool.

If your requirement is, that you may not lose *any* committed transaction, then pg_dump is indeed the wrong tool.

Thomas

Mladen Gogala

Apr 26, 2010, 8:49:19 AM

This is a discussion of terminology. While the usual definition of
"backup" is just to store the data offline, I believe that a "real
backup" should allow me to restore the database to its most current
state. The distinction between a "backup" and a "snapshot" is indeed
something from the world of Oracle. I propose to adopt similar
terminology in the world of Postgres, because the distinction is valid.
I may appear to be nitpicking, but I am a DBA, please forgive me.

--
http://mgogala.byethost5.com

Erwin Moller

Apr 26, 2010, 9:13:55 AM
Mladen Gogala schreef:


Hi again,

I read your response.
The difference between 'backup' and 'snapshot' you described is new to
me. (I use 'snapshot' and 'backup' interchangeably, but that is
apparently wrong.)

So, the bottom line is (using your terminology in my wording):
-> As a snapshot tool pg_dump suffices.
Reason: it can take a certain point-in-time snapshot.

-> As a backup tool it doesn't suffice.
Reason: a backup tool also takes into account all committed
transactions performed after the backup, allowing you to go back to
*any* point in time. Actually, I formulated that poorly, because there
is no 'after the backup': the backing up is a continuous process.

Do I understand that right?

Anyway, for my client's purposes a snapshot is good enough, so pg_dump
is also good enough for me. Lucky me, otherwise I would have to get an
extra job to pay for Oracle's horrible licensing.

Thanks for the explanation. I wasn't aware of the distinction.

Mladen Gogala

Apr 26, 2010, 5:11:18 PM
On Mon, 26 Apr 2010 15:13:55 +0200, Erwin Moller wrote:

> Do I understand that right?

Yup.

>
> Anyway, for my client's purposes a snapshot is good enough, so pg_dump
> is also good enough for me.

Good, just don't do it over the web. Let your clients get the Postgres
client and back up their databases using their own client tools.

> Luckily me, otherwise I should get an extra
> job to pay for Oracle's horrible licensing.

Why would you need Oracle? Postgres can do a perfectly good DB backup. I
tested it on a NetApp, using snapshots, and it worked like a charm. There
is a function pg_start_backup('label') which tells Postgres when it should
start writing the entire block into the WAL archive, not just the
change record, and you're good to go.


--
http://mgogala.freehostia.com

Erwin Moller

Apr 28, 2010, 1:51:32 AM
Mladen Gogala schreef:

Hi,

Yes, I found the WAL archive functionality shortly after reading your
messages.
I have never worked with it before, so I'll have to study it more before
using it.

For those like me (who never heard of it before), this is a good
introduction:
http://www.postgresql.org/docs/8.3/static/continuous-archiving.html

Thanks Mladen, for explaining the difference between snapshot and
continuous archiving. I never gave it much thought before.
