
Comparing two (largish) tables on different servers

Gregory S. Williamson

Nov 9, 2004, 5:41:00 PM
This is probably a silly question.

Our runtime deployment of database servers (7.4) includes some redundant/duplicate databases. To compare tables (about 5 GB each) on different servers, I unload them (which takes a while), sort the output with a UNIX sort, and then run cksum on each result.

Is there any way to do this from inside postgres that anyone knows of? I looked through the manual and the contrib stuff and didn't see much ...

Thanks,

Greg Williamson
DBA
GlobeXplorer LLC

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Pierre-Frédéric Caillaud

Nov 9, 2004, 7:12:08 PM
Idea:
Write a program which connects to the two databases, creates a cursor on
each to return the rows in order, then compares them as they come (row 1
from cursor 1 == row 1 from cursor 2, etc). Fetch in batches. If there's a
difference, you then know which row it is.
I hope you have an index to sort on, to save you a huge disk sort.
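
The idea above could be sketched roughly as follows. This is only an illustration, not a tested tool: it uses Python's generic DB-API `fetchmany` call, and the table `t`, the helper names, and the in-memory SQLite databases are hypothetical stand-ins so the sketch runs anywhere. Against PostgreSQL you would presumably connect with a driver such as psycopg2 and use a server-side (named) cursor so that rows really are streamed in batches.

```python
import sqlite3
from itertools import zip_longest


def iter_rows(cursor, batch_size=1000):
    """Yield rows from a DB-API cursor, fetching batch_size rows at a time."""
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            return
        yield from batch


def first_difference(rows_a, rows_b):
    """Return (position, row_a, row_b) for the first mismatch, or None.

    position is 1-based; if one side runs out of rows first, its row is None.
    """
    pairs = zip_longest(rows_a, rows_b)
    for position, (row_a, row_b) in enumerate(pairs, start=1):
        if row_a != row_b:
            return position, row_a, row_b
    return None


def make_demo_db(rows):
    """Hypothetical stand-in for one server: an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return conn


# Both sides must return rows in the same order, hence the ORDER BY
# (ideally on an indexed column, as noted above).
QUERY = "SELECT id, val FROM t ORDER BY id"

db1 = make_demo_db([(1, "a"), (2, "b"), (3, "c")])
db2 = make_demo_db([(1, "a"), (2, "x"), (3, "c")])

diff = first_difference(
    iter_rows(db1.execute(QUERY), batch_size=2),
    iter_rows(db2.execute(QUERY), batch_size=2),
)
print(diff)  # position and both versions of the first differing row
```

Because only one batch per side is held at a time, memory use stays flat regardless of table size; the cost is one ordered scan on each server.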

Sam Mason

Nov 10, 2004, 4:18:21 AM
Gregory S. Williamson wrote:
>Is there any way to do this from inside postgres that anyone knows of
>? I looked through the manual and the contrib stuff and didn't see
>much ...

Not really "inside postgres"; but could you do something like:

mkfifo db1 db2
# run each psql in the background; a writer blocks until diff opens the FIFO
psql -h "db1" -t -q -c "$query" > db1 &
psql -h "db2" -t -q -c "$query" > db2 &
diff -u -0 db1 db2

That should work with most shells under Unix...

Have fun,
Sam

Michael Fuhr

Nov 10, 2004, 2:32:50 PM
On Wed, Nov 10, 2004 at 09:18:21AM +0000, Sam Mason wrote:
>
> mkfifo db1
> psql -h "db1" -t -q -c "$query" > db1
> mkfifo db2
> psql -h "db2" -t -q -c "$query" > db2
> diff -u -0 db1 db2

This should work for small data sets, but the OP said the tables
were about 5G. Unless you use a cursor, psql will fetch the entire
result before writing anything. Also, some implementations of diff
might read all of the data from one file before reading much from
the other file, especially if the files have differences. Hope
you have lots of memory....
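
One way around both memory concerns is to stream rows through a cursor and fold them into a running checksum, so that only a row count and a digest per server need to be compared. The following is a minimal sketch under assumptions: `table_digest`, the table `t`, and the in-memory SQLite databases are hypothetical and exist only to make the example runnable; `repr()` is used as a crude row serialisation and would only give comparable results if both servers are read with the same driver and return the same types.

```python
import hashlib
import sqlite3


def table_digest(cursor, batch_size=1000):
    """Return (row_count, hex_digest) for rows streamed from a DB-API cursor."""
    digest = hashlib.sha256()
    count = 0
    while True:
        # Only batch_size rows are held in memory at any time.
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            # Crude serialisation for the sketch: one repr() per line.
            digest.update(repr(row).encode("utf-8"))
            digest.update(b"\n")
            count += 1
    return count, digest.hexdigest()


def make_demo_db(rows):
    """Hypothetical stand-in for one server: an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    return conn


# The digest depends on row order, so the query must sort deterministically.
QUERY = "SELECT id, val FROM t ORDER BY id"

same_a = table_digest(make_demo_db([(1, "a"), (2, "b")]).execute(QUERY))
same_b = table_digest(make_demo_db([(2, "b"), (1, "a")]).execute(QUERY))
other = table_digest(make_demo_db([(1, "a"), (2, "x")]).execute(QUERY))

print(same_a == same_b)  # identical contents give identical digests
print(same_a == other)   # a changed value gives a different digest
```

A matching digest only says the tables agree; it does not say where they differ, so a row-by-row comparison would still be needed to locate a mismatch.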

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/