Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

Removing LOADed duplicates

0 views

Skip to first unread message

Bill Goss

unread,

Feb 24, 1996, 3:00:00 AM2/24/96

I want to know if there are any good ways of solving the
problem of loading duplicates into a table (I'm using
DB2 -- MVS, naturally!).

- Here's the situation:
- The table already contains rows
- I LOAD RESUME some additional rows (or accidentally the same
data again!)
- Some (or all) of the new rows are duplicates of the original
data
- The duplicates are now in the database. DB2 doesn't catch
the problem until it's building the indexes.
- I can't use the REPAIR utility or SQL to remove the
duplicates, because there are too many of them.

The only way I know of resolving the problem is:
- Drop the unique indexes and primary key
- Unload the data
- Remove duplicates (using some form of sort)
- Reload the data
- Add the primary key and unique indexes
- Replace all dropped foreign keys
- CHECK all Check Pending tables
- Rebind all dependent plans

This is an enormous amount of work! Have you got a better
solution?

Bill...@zeta.org.au

Steve Booth

unread,

Feb 26, 1996, 3:00:00 AM2/26/96

In article <4gmdhp$1...@gidora.kralizec.net.au>, Bill Goss
<bill...@zeta.org.au> writes

Hi Bill,

Princeton Softech has a product that will allow you to do what you need.
It is called Move for DB2, one of their Relational Tools, and is an
extract/migration tool.

Baiscally, Move allows you to extract rows from multiple DB2 tables,
based on selction criteria if you wish. The rows extracted can be be
inserted into another DB2 environment with the options of:
a) remove all existing rows in the tables and insert the new ones
b) insert new rows and replace existing ones
c) insert new rows and ignore existing ones.

Move for DB2 is obviously much more sophisticated and functional than
this description - it makes pulling out and using related subsets very
easy and quick and you can populate databases on other platforms with
it, too.

I suggest you contact Murray Firze at Princeton Softech Pty on 61-8-364-
2211 for further info.

Best regards,

Steve Booth
Tarragon Software Ltd (UK Distributor for Princeton Softech)
44-1487-815815

Kenneth Lahn

unread,

Jun 12, 1996, 3:00:00 AM6/12/96

I'm confused. Are you dropping the indexes prior to LOAD? Why? Let
BLDINDEX put the dupes into a discard file. Otherwise, load into an
unindexed table and issue:
INSERT INTO some_temp_table
SELECT key-column, COUNT(*) FROM table GROUP BY key-column HAVING COUNT(*)
> 1

This will give you the keys of all rows which are NOT unique.

0 new messages