The dbimport took about 16 h in total: 13 h for data loading and 3 h for
index creation.
During this time one CPU is completely busy, so the system's
CPU utilization is about 55 %. There is no I/O bottleneck or anything like that.
Any hints on how to improve the time for the data loading?
Parallelism is probably not available during the load, but are there
options to decrease the CPU usage and improve the throughput?
Some settings from my onconfig:
VPCLASS cpu,num=2,max=4,aff=(0,1),noage
NETTYPE ipcstr,1,32,NET
BUFFERPOOL size=2K,buffers=262144,lrus=8,
lru_min_dirty=5.0,lru_max_dirty=10.0
CKPTINTVL 86400
AUTO_CKPTS 1
MAX_PDQPRIORITY 100
RA_PAGES 64
RA_THRESHOLD 16
LOGBUFF 512
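For comparison, a load-oriented variant of a few of these settings is sometimes discussed for one-off imports; the numbers below are illustrative assumptions only, not values verified on this machine:

```
LOGBUFF    1024      # KB; a bigger log buffer means fewer log flushes during the load
CKPTINTVL  300       # more frequent checkpoints keep each individual one short
RA_PAGES   128       # more aggressive read-ahead for the sequential scans
```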
The online.log file for the load phase:
Sat Feb 27 02:14:06 2010
02:14:06 Checkpoint Completed: duration was 1 seconds.
02:14:06 Sat Feb 27 - loguniq 167, logpos 0x3184c018, timestamp:
0xc0170b7 Interval: 288
02:14:06 Maximum server connections 21
02:14:06 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns
blocked 1, Plog used 882282, Llog used 257057
03:50:59 Logical Log 167 Complete, timestamp: 0x174c472d.
05:25:52 Performance Advisory: Long transactions are causing blocking
checkpoints.
05:25:52 Results: Transactions are being blocked until the checkpoint
is complete.
05:25:52 Action: Increase the logical log space size.
05:25:53 Logical Log 168 Complete, timestamp: 0x20502d29.
05:25:54 Checkpoint Completed: duration was 2 seconds.
05:25:54 Sat Feb 27 - loguniq 169, logpos 0x18, timestamp: 0x20502d2f
Interval: 289
05:25:54 Maximum server connections 21
05:25:54 Checkpoint Statistics - Avg. Txn Block Time 0.021, # Txns
blocked 0, Plog used 2959, Llog used 845694
06:43:35 Logical Log 169 Complete, timestamp: 0x225577b7.
08:21:29 Logical Log 170 Complete, timestamp: 0x24154389.
08:21:33 Checkpoint Completed: duration was 4 seconds.
08:21:33 Sat Feb 27 - loguniq 171, logpos 0x18, timestamp: 0x2415438f
Interval: 290
08:21:33 Maximum server connections 21
08:21:33 Checkpoint Statistics - Avg. Txn Block Time 0.001, # Txns
blocked 0, Plog used 2899, Llog used 1048522
10:28:16 Logical Log 171 Complete, timestamp: 0x25d52dfe.
13:22:45 Logical Log 172 Complete, timestamp: 0x27c927e0.
13:22:49 Checkpoint Completed: duration was 4 seconds.
13:22:49 Sat Feb 27 - loguniq 173, logpos 0x18, timestamp: 0x27c927e5
Interval: 291
13:22:49 Maximum server connections 21
13:22:49 Checkpoint Statistics - Avg. Txn Block Time 0.001, # Txns
blocked 0, Plog used 1601, Llog used 1048522
13:52:48 Checkpoint Completed: duration was 0 seconds.
13:52:48 Sat Feb 27 - loguniq 173, logpos 0x2b683018, timestamp:
0x28f1730e Interval: 292
13:52:48 Maximum server connections 21
13:52:48 Checkpoint Statistics - Avg. Txn Block Time 0.000, # Txns
blocked 0, Plog used 2314, Llog used 177795
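The performance advisory above suggests increasing the logical-log space; on IDS that is usually done with onparams. A hedged sketch (the dbspace name and sizes are illustrative, sizes are in KB):

```
# Add logical logs so a big load doesn't fill the log mid-transaction.
onparams -a -d rootdbs -s 1000000
onparams -a -d rootdbs -s 1000000   # repeat until there is enough log space
```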
Not really an answer to speeding up a dbimport, but it might be worth
looking at using the High-Performance Loader for part of the load,
especially if the bulk of the data is in a couple of tables...
Any reason for using 8 LRUs for your buffer pool? By setting
lru_max_dirty to 10 you are encouraging the database to do a lot of LRU
cleaning.
By increasing the number of LRUs and cleaners you might get better throughput?
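If that is the direction taken, a hedged sketch of an adjusted pool; the numbers are illustrative guesses, not tested values:

```
BUFFERPOOL size=2K,buffers=262144,lrus=16,
           lru_min_dirty=20.0,lru_max_dirty=60.0
CLEANERS   16        # roughly one page-cleaner thread per LRU queue
```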
Just a thought, not particularly experienced with IDS :)
James
Fonterra NZ
_______________________________________________
Informix-list mailing list
Inform...@iiug.org
http://www.iiug.org/mailman/listinfo/informix-list
dbaccess sysmaster <<!
create procedure generatechkpt()
    -- Poll the buffer pool and force a checkpoint whenever it gets
    -- too dirty, so the load never stalls on blocking checkpoints.
    define dirty decimal(4,3);
    while (1=1)
        select (sum(lru_nmod) / sum(lru_nfree + lru_nmod))
          into dirty
          from syslrus;
        if (dirty < 0.75) then
            system "sleep 1";
        else
            system "onmode -c";   -- request a checkpoint
        end if
    end while;
end procedure;
execute procedure generatechkpt();
!
I would set PDQPRIORITY to 100 in the environment so the index builds
can benefit from it.
Also set SHMVIRTSIZE to a large value and the PDQ memory to a large
value, and make sure the latter fits in SHMVIRTSIZE;
nowadays that means these two parameters: DS_NONPDQ_QUERY_MEM and DS_TOTAL_MEMORY.
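A hedged sketch of those settings; the sizes are illustrative assumptions, not tuned values:

```
# Environment of the session running dbimport:
export PDQPRIORITY=100

# onconfig (KB values; DS_TOTAL_MEMORY must fit inside SHMVIRTSIZE):
SHMVIRTSIZE          1048576
DS_TOTAL_MEMORY       524288
DS_NONPDQ_QUERY_MEM   131072
```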
When the import is completed, force a checkpoint manually! Then make
sure update statistics is run
(using Art's stuff...)
and that update statistics for routines/procedures is run with PDQPRIORITY set
to 0!
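In SQL, that last step might look like this (a sketch; run it as the routine owner):

```sql
SET PDQPRIORITY 0;
UPDATE STATISTICS FOR PROCEDURE;
```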
The above SPL cut load times in half in the past... don't know what it does
now though; have fun with it.
I must admit that HPL is faster.
Superboer.
A new dbimport with the same export and the same parameters is running now.
Tomorrow I'll know the run-time improvement.
Thanks for spending time on this issue.
6 CPU VPs for only 2 single-core CPUs without hyperthreading?
I read about configuring 2 CPU VPs per physical CPU/core if the clock
speed is high enough, but 3 VPs per core?
And because the standard dbimport is single-threaded during the data load, I
would not expect a benefit from that.
I changed some parameters (see my reply to myself).
But I'll have a look at your dbimport replacement. I was not aware that
it is capable of doing things in parallel.
Frank
I'm sorry, but I did not see a performance improvement for my dbimport
after changing those values.
The time for the data load of the dbimport was again about 13 hours.
The index creation afterwards took 3.5 hours.
As I said, during data loading one CPU is completely busy.
The dbimport process is consuming about 50% of this CPU, and the two CPU VPs
the other half.
In total the dbimport process used about 6.25 h of CPU time, nearly
all of it during the load.
So when ftp.iiug.org is available again, I'll take a look at Art's
export/import replacement.
Just out of interest, post the
onstat -g ioq
output (i.e. are you using KAIO, and are you getting the performance out of the disks that you should be getting)?
This was one of the feature requests that I voted on in one of the
surveys that appeared in this group.
I have written a parallel tool similar to this for cross-platform
migrations using insert/select; the enhancements would be:
- unload/load to file (including gzip via a mknod pipe to get around the v7
2 GB limits).
- use HPL (with or without gzip to a mknod pipe) when
systables.nrows indicates the table contains more than a certain number of
rows.
(onpladm in the current releases of each product family makes this
easier.)
Let's hope this feature request makes it into later versions. Oh, and
full cross-platform dump/load for databases (like Sybase already has).
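The mknod-pipe trick mentioned above can be sketched as follows; the paths and data are illustrative, and a plain printf stands in for the engine writing the unload:

```shell
# Unload through a compressed named pipe to dodge the v7 2 GB file limit.
cd "$(mktemp -d)"
mkfifo customer.unl                          # the pipe the unload writes to
gzip -c < customer.unl > customer.unl.gz &   # background reader compresses it
# Real use: UNLOAD TO 'customer.unl' SELECT * FROM customer;  (or HPL)
printf '1|Smith|\n2|Jones|\n' > customer.unl # stand-in for the engine's unload
wait                                         # let gzip drain the pipe and exit
gzip -dc customer.unl.gz                     # round-trip check
```

The writer never needs more than pipe-buffer space on disk, so only the compressed file grows.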