Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

merge two big tables

0 views
Skip to first unread message

Pet

unread,
Jun 18, 2009, 4:11:57 AM6/18/09
to
I have one table A, which should be updated sometimes. I.e. I download
copy of A with some new records. I cannot just drop table A, because
there may be records from other sources. What is the best way to do
such update?

One option could be to create temporary table and do UNION on this two
tables. That would append new records without duplicates. Than create
all indexes on temp table, drop original table and rename temporary
table to original one.

I wonder if there is better way doing this.

Thanks, Pet

Laurenz Albe

unread,
Jun 18, 2009, 4:46:32 AM6/18/09
to

Let's call the original table which you want to update "A" and
the copy of A that has some additional records "B".
You want to copy the additional records from B to A.

A manual way to do it would be:
- Load B into the same database as A.
- INSERT INTO a
(SELECT b.* FROM b LEFT OUTER JOIN
a USING(id)
WHERE a.id IS NULL);

Here "id" is the primary key.

If the amount of data is too large, or if you want it
in a more automated fashion, you might consider alternate
solutions, e.g. storing an update timestamp in each row
and extracting only the newly changed ones, or keeping a
boolean flag per row that is TRUE if the row was already
exported for import in A.

For the latter idea, you could use a statement like
UPDATE b SET exported=TRUE WHERE exported=FALSE RETURNING b.*
to export the new rows in B.

The really tough problem in a setup like yours is dealing
with conflicts: what if there is already a row in A coming
from some other source that has the same primary key as the
new row from B? Have you thought of that?

Yours,
Laurenz Albe


Tep

unread,
Jun 18, 2009, 5:27:57 AM6/18/09
to
On 18 Jun., 10:46, "Laurenz Albe" <inv...@spam.to.invalid> wrote:
> Pet wrote:
> > I have one table A, which should be updated sometimes. I.e. I download
> > copy of A with some new records. I cannot just drop table A, because
> > there may be records from other sources. What is the best way to do
> > such update?
>
> > One option could be to create temporary table and do UNION on this two
> > tables. That would append new records without duplicates. Than create
> > all indexes on temp table, drop original table and rename temporary
> > table to original one.
>
> > I wonder if there is better way doing this.
>
> Let's call the original table which you want to update "A" and
> the copy of A that has some additional records "B".
> You want to copy the additional records from B to A.
>
> A manual way to do it would be:
> - Load B into the same database as A.
> - INSERT INTO a
>      (SELECT b.* FROM b LEFT OUTER JOIN
>                       a USING(id)
>       WHERE a.id IS NULL);
>
> Here "id" is the primary key.

That's it probably what I am looking for

>
> If the amount of data is too large, or if you want it
> in a more automated fashion, you might consider alternate
> solutions, e.g. storing an update timestamp in each row
> and extracting only the newly changed ones, or keeping a
> boolean flag per row that is TRUE if the row was already
> exported for import in A.

I have no control on table B. There are no timestamps

>
> For the latter idea, you could use a statement like
> UPDATE b SET exported=TRUE WHERE exported=FALSE RETURNING b.*
> to export the new rows in B.
>
> The really tough problem in a setup like yours is dealing
> with conflicts: what if there is already a row in A coming
> from some other source that has the same primary key as the
> new row from B? Have you thought of that?

Yes, it is really a problem. Table A is not so big and wouldn't grow
fast. So, Id's or records are all below say 5 50000000. I insert
records from other sources with ID's above this upper bound. It is
very unlikely, that table to be imported ever reaches this upper bound
of rows. I have no idea how to solve this in other way...


Thanks for your help!!!

Pet

>
> Yours,
> Laurenz Albe

0 new messages