We need to delete over 100,000 rows out of a several-million-row table
held in DB2/2 v2.1.2.
Is it possible to do this job without the transaction log switched on?
DB2 CONNECT TO BIGDB
DB2 DELETE FROM BIGTABLE WHERE ...
DB2 CONNECT RESET
Thanks in advance.
No one seems to know the answer to this one, but I'm very curious about it.
I have a pretty large database in which I run into the same problem myself.
Too bad.
Sacha
(work) spr...@businessnet.net
(priv@) sa...@prins.net
http://www.businessnet.net/people/sacha/
http://www.businessnet.net/
-----------------------------------------------------------------------
"Arnie ist zur…k und er hat ziemlich schlecht geschlafen. Da hilft nur
eines: die Kanone laden und drauf los ballern. Viel Spass, Arnie!" - From
a dutch video store's movie review of 'Eraser'.
I had a similar problem. I had a table full of junk that was growing fast, and
I couldn't remove the table because that would break our apps. It had approx.
700,000 rows to delete. We had an extended weekend coming up, so I
wrote a Rexx program that used a cursor to delete the rows one at a time.
No problem, except that after 10,000 rows the database access became terrible.
So I changed it to delete 5,000 rows one at a time, then do a Reorg, then
delete 5,000 more, and so on until the table was empty.
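For illustration, a rough command-line version of that idea -- using key-range DELETEs
instead of a row-at-a-time Rexx cursor -- might look like this (database, schema, table
and key column names are all made up, and REORG details may differ by level):

DB2 CONNECT TO MYDB
DB2 "DELETE FROM STEVE.JUNKTAB WHERE ID BETWEEN 1 AND 5000"
DB2 COMMIT
DB2 REORG TABLE STEVE.JUNKTAB
DB2 "DELETE FROM STEVE.JUNKTAB WHERE ID BETWEEN 5001 AND 10000"
DB2 COMMIT
DB2 REORG TABLE STEVE.JUNKTAB
DB2 CONNECT RESET

... and so on, moving the key range forward on each pass until the table is empty.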
----Steve
Stephen Amadei
Director of MIS
Dandy Connections, Inc.
Atlantic City, NJ
>so I
>wrote a Rexx program that used a cursor to delete the rows one at a time.
>No problem, except that after 10,000 rows the database access became terrible.
>So I changed it to delete 5,000 rows one at a time, then do a Reorg, then
>delete 5,000 more, and so on until the table was empty.
thanks for your answer. I'm pretty sure that I read a hint about mass-deleting data in a DB2/2
manual years ago, but I don't remember the details. At least there used to be a way to switch off
the transaction log for such a job ...
Visit me at http://ourworld.compuserve.com/homepages/Harald_Wilhelm
Harald (HaWi) Wilhelm
>"Harald Wilhelm" <Harald_...@ibm.net> writes: > On 13 Mar 1997 03:18:29 GMT, dand...@digex.net wrote:
>>
>> >so I
>> >wrote a Rexx program that used a cursor to delete the rows one at a time.
>> >No problem, except that after 10,000 rows the database access became terrible.
>> >So I changed it to delete 5,000 rows one at a time, then do a Reorg, then
>> >delete 5,000 more, and so on until the table was empty.
>>
Can't you do something like a drop table space?
In all other DB2-databases this is the fastest way to erase data.
Michel de Bokx
10600...@compuserve.com
Please be careful with LOAD REPLACE. The LOAD utility works at
the TABLESPACE level, so if you have more than one table in the
TABLESPACE, your LOAD REPLACE will delete all the tables and
then load the data into the desired table.
What you can do is execute 'DELETE FROM tablename' first.
This will delete all the data from this table only. Also, if
you are using a segmented tablespace this is very fast, as it
updates only the space map pages and not the entire
tablespace. After this step is carried out, you can LOAD RESUME
the table with the saved data.
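On DB2/2 (common server) the rough equivalent of this save-and-reload approach would be to
EXPORT the rows you want to keep and then reload the table from that file. A sketch only,
assuming a date column SALEDATE and that your level has the LOAD command; table, column,
database and file names are made up:

DB2 CONNECT TO BIGDB
DB2 "EXPORT TO keep.del OF DEL SELECT * FROM BIGTABLE WHERE SALEDATE >= '1997-03-01'"
DB2 "LOAD FROM keep.del OF DEL REPLACE INTO BIGTABLE"
DB2 CONNECT RESET

LOAD REPLACE here empties and reloads just that one table without logging the individual
rows, but check the backup/pending-state implications for your log configuration first.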
Hope this helps.
Sunit
-------------------==== Posted via Deja News ====-----------------------
http://www.dejanews.com/ Search, Read, Post to Usenet
>On Wed, 19 Mar 1997 20:47:58 GMT, Michel de Bokx wrote:
>[SNIP]
>Can't you do something like a drop table space?
>In all other DB2-databases this is the fastest way to erase data.
>Michel de Bokx
>10600...@compuserve.com
The reason that (1) you can't just drop the tablespace is that the
plans (packages) would all be invalidated and need rebinding. If you indeed don't
need ANY of the data, then issue DELETE FROM table (no predicates) which,
if you are using a segmented tablespace (which you should be), will run
very fast -- it's mass delete processing.

The reason that (2) individual deletes run slow is that after a while the
logging process necessary to "remember" all those deletes becomes
cumbersome. Try one or more of the following:
 - commit more frequently
 - build an IN (list) for the deletes if you are deleting
by key, e.g. delete from table where c1 in (:hv1,:hv2,:hv3,...), and fill
in the variables with keys. Commit after each "batch" (a sketch follows below).
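A minimal sketch of that last idea from the command line, with literal keys standing in
for the host variables (table, column and key values are invented):

DB2 "DELETE FROM BIGTABLE WHERE C1 IN (100001, 100002, 100003, 100004, 100005)"
DB2 COMMIT

An application would fill :hv1..:hvn with the next batch of keys and commit after every batch.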
--
Kenneth Lahn
http://www.connix.com/~lahnkv
>What you can do is execute 'DELETE FROM tablename' first.
thanks for your answer. Sorry, but that won't work for us.
We have in fact a 50-million-row database in DB2/2. Every day we remove the oldest day's data
from it and insert the current day's data. One day's data is around 2 million rows, so we have to
delete (and insert) 2 million rows daily.
One row of the table in question is 70 bytes of raw data. Our transaction log is 8 primary and 120
secondary log files of 4095 pages (16 MB) each. During the delete step and the insert job
afterwards, nobody else is working with the database.
The interesting point is that deleting 2 million rows of 70 bytes each doesn't fit into this
huge transaction log space. The docs say that 2.5 times the raw data is needed in the
transaction logs, but currently it seems that we need 100 times the raw data.
Why???
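For reference, the arithmetic on those figures:

   available log space : 128 files x 4095 pages x 4 KB = roughly 2 GB
   raw data to delete  : 2,000,000 rows x 70 bytes     = roughly 140 MB
   documented estimate : about 2.5 x the raw data      = roughly 350 MB

So on paper the whole delete should use well under a fifth of the free log space.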
Visit me at http://www.hawi.de
Harald (HaWi) Wilhelm
Create a partitioned tablespace or tablespaces and then have one
partition per day over the tablespace or multiple tablespaces (like one TS
per month and one part per day). You then get a partitioning tool like
'PARTITION EXPERT' which would then manage the TS like a GDG, rolling the
partitions off. This would then relieve you of the requirement to do
deletes and, of course, of the logging problem.
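To make the idea concrete, a rough sketch of that kind of DDL on host DB2 (DB2/2 v2.x itself
has no partitioned tablespaces, so this is DB2 for MVS-style syntax; all names, dates and the
three-partition size are invented):

CREATE TABLESPACE DAYTS IN DAILYDB NUMPARTS 3;

CREATE TABLE DAYROWS
  (LOGDATE DATE     NOT NULL,
   PAYLOAD CHAR(66) NOT NULL)
  IN DAILYDB.DAYTS;

CREATE INDEX DAYIX ON DAYROWS (LOGDATE)
  CLUSTER
  (PART 1 VALUES ('1997-04-01'),
   PART 2 VALUES ('1997-04-02'),
   PART 3 VALUES ('1997-04-03'));

The oldest day's partition can then be emptied with a partition-level LOAD REPLACE (or rolled
off by a tool such as the one mentioned above) instead of a logged DELETE.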
hmmmmmm.
>Try to split the delete into smaller portions using more fields of the
>key (not only the date field but also the second one - if it's a multi
>field key) and commit after every group.
that's not the question. The question is how to calculate the required size of the transaction
log space. Deleting 2 million rows of 70 bytes each should fit into 2 GB of free transaction log
space.
I think there's something wrong with the transaction log manager of DB2/2 v2.1.2 CSD 8122.
>Create a partitioned tablspace or tablespaces and then have one
>partition per day over the tablespace or mutiple tablespace (like one TS
>per month and one part per day). You then get a partitioning tool like
>'PARTITION EXPERT' which would then manage the TS like a GDG rolling the
>partitions off. This would then releive you of the requirement to do
>deletes and of course the logging problem.
good idea, will have a look at it.
>I believe the original poster
>was using DB2 1.2 (as I was when I posted my (slow) solution done with Rexx).
no, I started the thread and we're using DB2/2 v2.1.2 (CSD 8122).
>So Harald could delete 2 million of the 50 million rows by exporting the 48
>million rows he wants, then importing them back in. But, by my calculations,
>Harald's 48 million rows (at 70 bytes each) would require exporting and
>importing 3.2 gigs of data.
oh, it's a pain <g>. If we want to delete 2,000,000 rows of 70 bytes each, we need
transaction log space of nearly 3 GB (god knows why). If I use your approach ... same space.
I would like to switch off the transaction log manager until this DELETE has passed. This was
possible with DB2/2 v1.2 (as far as I remember) but is no longer available in DB2/2 v2.x.
My mistake. I must have come in late.
>>So Harald could delete 2 million of the 50 million rows by exporting the 48
>>million rows he wants, then importing them back in. But, by my calculations,
>>Harald's 48 million rows (at 70 bytes each) would require exporting and
>>importing 3.2 gigs of data.
>
>oh, it's a pain <g>. If we want to delete 2,000,000 rows of 70 bytes each, we need
>transaction log space of nearly 3 GB (god knows why). If I use your approach ... same space.
>
>I would like to switch off the transaction log manager until this DELETE has passed. This was
>possible with DB2/2 v1.2 (as far as I remember) but is no longer available in DB2/2 v2.x.
I was never able to do this with 1.2. I wish I could. But by the time I find out how,
we'll probably go the rest of the way and finish upgrading to 2.1.2. Oh well.
This thread has gone on a long time now. I would have thought that
deleting 100,000 70-byte rows with a plain DELETE statement (no
intermediate COMMITs) should not pose any real problem (in this day and
age!). Moreover, it should outperform unloading and reloading several
million rows (minus the deleted ones).
Can't somebody help with the real issue - why does Harald's transaction
log apparently require so much space?
--
Jeremy Rickard
>[SNIP]
>Can't somebody help with the real issue - why does Harald's transaction
>log apparently require so much space?
>--
>Jeremy Rickard
Is it possible that Harald has some monitoring functions activated
causing exceptionally verbose logging?
Willem Wals (wa...@usoft.nl)
>I would have thought that
>deleting 100,000 70-byte rows with a plain DELETE statement (no
>intermediate COMMITs) should not pose any real problem (in this day and
>age!)
in fact we have to delete 2,000,000 rows out of 50,000,000 rows of 70 bytes each. But even for
this, our 2 GB of free transaction log space should be enough.
>Can't somebody help with the real issue - why does Harald's transaction
>log apparently require so much space?
Yes, this morning I received two mails from IBM Network (CompuServe). The first message said "we
will check it". The second message said that "IBMDB2" (the CompuServe support forum) is not a
support forum. Ha!
>Is it possible that Harald has some monitoring functions activated
>causing exceptionally verbose logging?
no, definitely not.
Why does no one from IBM respond to this question? I asked on CIS:IBMDB2, in several newsgroups,
etc. It seems to me that nobody knows what the transaction log manager really does.
>On Sat, 29 Mar 1997 11:11:04 +0000, Jeremy Rickard wrote:
>>I would have thought that deleting 100,000 70-byte rows with a plain
>>DELETE statement (no intermediate COMMITs) should not pose any real
>>problem (in this day and age!)
>in fact we have to delete 2,000,000 rows out of 50,000,000 rows with 70Bytes
>each.
So much for the title, then!? Okay, so I agree that a more frequent COMMIT
approach would be a good idea for these volumes.
>But even for this, our 2 GB of free transaction log space should be enough.
I've had a go at replicating your problem (more or less) on DB2 v2.1.2
for Windows 95. I used a 1,050,000 row x 70 byte table (20 million was
a bit too much effort!), and the same log settings as you.
An update of all rows took 22 log files - about twice the space required
simply to store a before and after copy of each row. I don't know what
this overhead is for. Looking at the log files in WinWord, some seemed
to contain roughly what I might have expected - before and after copies
with little else - but other files contained no row copies at all, just
unreadable characters (nearly all squares) throughout. Can this be
intentional?
A delete of all rows (except one) took 29 log files. I would have
thought it should be fewer than the update (no after copy required)?
Even so, on this basis your 2 million row delete should take "only" 1
gigabyte.
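For what it's worth, the rough arithmetic behind those figures (at 16 MB per log file):

   update : 22 files x 16 MB = about 350 MB, versus 1,050,000 rows x 70 bytes x 2
            (before and after image) = about 147 MB
   delete : 29 files x 16 MB = about 464 MB; scaled up to 2,000,000 rows that is
            roughly 900 MB, hence the "only" 1 gigabyte above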
>Yes, this morning I received two mails from IBM Network (Compuserve).
>The first message said "we will check it".
Good. Keep us posted on progress.
P.S. If any of you IBM techies would care to clarify any of these
questions, *despite* the fact that this isn't an officially supported
IBM newsgroup, we'd be very interested in your response.
--
Jeremy Rickard
>For Windows 95. I used a 1,050,000 row x 70 byte table (20 million was
>a bit too much effort!), and the same log settings as you.
no, we delete 2M rows only.
The docs say that a delete transaction log entry needs 2.5 times the raw data plus a small
header of 21 bytes (I think).
>A delete of all rows (except one) took 29 log files.
Yep, and we end up at the maximum of 128 files.
>For Windows 95.
Seems that DB2 for OS/2 has a problem there. Perhaps it's HPFS???
BTW, we had some very serious problems under heavy load. Our machines crashed regularly. After
applying CSD 8130, all the problems went away.
>On Thu, 3 Apr 1997 01:57:48 +0100, Jeremy Rickard wrote:
>The docs say that a delete transaction log entry needs 2.5 times the
>raw data plus a small header of 21 bytes (I think).
Where? I can't find the 2.5 times figure (just header details).
>>A delete of all rows (except one) took 29 log files.
>Yep and we end up with the max of 128 files.
>>For Windows 95.
>Seems that DB2 for OS/2 has a problem there. Perhaps it's HPFS???
Or just a much worse problem perhaps. Even 29 files is still over twice
the size your information suggests we would expect.
--
Jeremy Rickard
>Where? I can't find the 2.5 times figure (just header details).
>Or just a much worse problem perhaps. Even 29 files is still over twice
>the size your information suggests we would expect.
it's in:
API Reference for common servers, S20H-4984-01, pages 581 and 582.
It starts with the headers and ends with the data size needed for every data type.
You have to calculate the amount of data you need for every data type (it's on page 582), add the
header, and you have exactly the size you need for a DELETE job.
These pages are broken down by operation (INSERT RECORD, DELETE RECORD, ... you get
the idea).
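Taken at face value (and if the 21-byte header figure is right), that gives something like:

   (2.5 x 70 bytes) + 21 bytes  =  about 196 bytes of log per deleted row
   196 bytes x 2,000,000 rows   =  about 390 MB, i.e. around 25 of our 16 MB log files

which is nowhere near the 128 files we actually fill up.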