Could any body tell me how to delete duplicate rows in a table. I was
able to list the duplicate rows in table like this:
select rowid, a.*
from a, b {a & b are aliases for sam table. basically a self join}
where a.rowid != b.rowid;
I can't this select statement as subquery because we can not the update
the table in a subquery. Any ideas ... please. BTW we have not yet
installed 4gl yet.
Thanks in advance.
Siva Pothireddy
AT&T
This is a quick version of what I have done in the past. Please check
it carefully as I don't have my old code with me. I assume you want to
save one of the duplicates and delete all the other duplicates.
{ select the duplicates }
select keyfield, count(*)
from table
group by keyfield haveing count(*) > 1
into temp A;
{ select rowids of duplicates }
select rowid, keyfield
from table
where keyfield in ( select keyfield from A)
into temp B;
{ save one of the duplicates so it does not get deleted }
select keyfield, min(rowid) save_rowid
from B
group by keyfield
into temp C;
{ delete the duplicates }
delete from table
where rowid in
( select rowid from B where rowid not in
( select save_rowid from C ));
Again, test this carefully and make a backup before you delete.
Regards - lester
#############################################################################
# Lester Knutsen les...@access.digex.net #
# Advanced DataTools Corporation Voice: 703-256-0267 #
# Grant group privileges for Informix databases with DB Privileges #
#############################################################################
One method I use, which is long winded an inelegant, but does work is as
follows:
In SQL do
unload to "dupl.unl"
select unique * from dup_table;
delete from dup_table where 1=1;
load from "dupl.unl"
insert into dup_table;
Dup_table is the name of the table woth the duplicates.
You can speed up this process by dropping you indexes first nad recreating
them afterward. It is also a good idea to turn off logging or have
continous logging on as this process fills up the logs incredibly quikly.
Yours
Wayne
|\/\/ayne
|Wayne Harlech-Jones
|Windhoek, Namibia, Africa
|wa...@jones.mac.alt.na
>Could any body tell me how to delete duplicate rows in a table. I was
>able to list the duplicate rows in table like this:
>
>select rowid, a.*
> from a, b {a & b are aliases for sam table. basically a self
join}
> where a.rowid != b.rowid;
>
Use another table hold your unique rows. Two choices:
1) Use a temp table to hold the unique rows, delete from the
permanent
table, and insert from the temp table
or 2) Use a permanent table, insert the unique rows, drop the
original,
and rename the other.
Hope this helps.
John Prideaux
ala...@ix.netcom.com
Here's one solution:
select min(rowid) tmp_rowid from _tablename_ group by _key_fields_
into temp tmp_table with no log;
delete from _tablename_
where rowid not in
(select tmp_rowid from tmp_table);
Replace _tablename_ and _key_fields_ with appropriate names.
Catalin.
--
===========================================================================
Catalin Badea _/_/_/ _/_/_/ _/_/_/ _/_/_/ _/ _/ _/_/_/
_/ _/ _/ _/ _/_/ _/
Senior Technical Architect _/ _/_/ _/ _/_/_/ _/ _/_/_/
_/ _/ _/ _/ _/ _/
Tecsys Inc. _/ _/_/_/ _/_/_/ _/_/_/ _/ _/_/_/
====== c...@tecsys.com =========== include system "disclaimer.4gh" ======
Thanx.
Ravi.