
User Experiences with Data Pump Versus Legacy EXP/IMP


EdLong

Feb 24, 2008, 8:08:31 PM
Hi everyone.
We are on 10.2.0.3 running on Windows 2003. I have to move about a
terabyte of Oracle data from one disk array to another; it's split
amongst 5 schemas or so. I'd like to solicit some user experiences
with DataPump. You may infer that we have not had the best experience
so far.

I started with the largest subset, a single schema of about 200 GB.
Using traditional exp, sneaker net, import, this took about 24 hours
to move. Of this, nearly 20 hours was the import. Target machine
consists of 4 3ghz engines with beaucoup memory and little if any
competing workload.

I've tried several different cuts at DataPump but found it somewhat
unsatisfactory.

It appears that, at least in our shop at this level, the target
directory for the expdp and the source directory for the impdp cannot
be either a mapped drive or a USB drive. I could not find any
documentation for these apparent limitations.

The expdp says it can't open and write to the log file if it's a mapped
drive. expdp appears to work when pointed at a USB drive. The
impdp fails on a USB drive with a 'hard' i/o error after an hour or
two of processing. I tried the latter at a parallel level of 4 and 2;
I didn't try 1.
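For what it's worth, Data Pump only reads and writes through server-side DIRECTORY objects, so the path has to be visible to the database server process itself, not just your client session; on Windows a mapped drive letter belongs to a logon session, which is one plausible reason the Oracle service can't see it. A minimal sketch of the local-disk setup (directory name, path, and schema are made up for illustration):

```shell
# Hypothetical sketch: create a server-side DIRECTORY on a local disk
# and export one schema through it. Names and paths are invented.
sqlplus / as sysdba <<'SQL'
CREATE OR REPLACE DIRECTORY dp_dir AS 'D:\dp_exports';
GRANT READ, WRITE ON DIRECTORY dp_dir TO scott;
SQL

# Data Pump resolves dp_dir on the server, so the path must exist there
expdp scott/tiger DIRECTORY=dp_dir DUMPFILE=scott.dmp \
      LOGFILE=scott_exp.log SCHEMAS=scott
```

This is a command fragment, not something runnable without an Oracle instance; the point is only that both the dump file and the log file go through a DIRECTORY the server process can reach.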

What leads me to risk the wrath of Achilles and query this group is
that a highly informal survey of genuine Oracle gurus in the
neighborhood (think video thievery in the NFL) revealed a similar
frustration with Data Pump. I also read the most excellent "Oracle
Automatic Storage Management" book (Vengurlekar, Vallath, Long (no
relation)). In the intro, they list a number of supported disk
hardware interfaces. Guess which two don't make the cut, USB and
mapped drives aka NFS. Even if Oracle 10g won't run with tablespaces
on a USB drive, it should be able to reliably read and write to it as
an 'external' file or table.

Before anyone mentions transportable tablespaces, I also tested that
option. Much like the fellow in the recent thread, I found that in our
shop at this maintenance level, we got odd results. I didn't take it
to his level of doing an endian to endian conversion where source and
target were Intel. I'm very willing to try again though since 24 hours
for 200 GB is pretty gauche!

Thanks for taking the time to read this note and for considering our
problem.

hpuxrac

Feb 25, 2008, 7:39:30 AM

A couple of thoughts. I have had some very nice results using impdp
with a link between the databases and parallel. Of course you need
good network bandwidth between the systems and some way to make sure
your schema that you are cloning isn't changing in the process.
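The link-based approach above skips the dump file (and the flaky intermediate drive) entirely: impdp pulls rows straight from the source over a database link. A hedged sketch, with link, user, and TNS names all hypothetical:

```shell
# Hypothetical sketch of impdp over a database link: no dump file is
# written, data streams directly from source to target. All names below
# are made up.
sqlplus system/manager@target <<'SQL'
CREATE DATABASE LINK src_link
  CONNECT TO system IDENTIFIED BY manager
  USING 'SOURCE_TNS_ALIAS';
SQL

# NOLOGFILE avoids needing a DIRECTORY object just for the log
impdp system/manager@target NETWORK_LINK=src_link \
      SCHEMAS=app_schema PARALLEL=4 NOLOGFILE=YES
```

As noted above, this only gives a clean copy if the source schema isn't changing while the pull runs.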

Using a usb drive and/or firewire drive to expdp out then impdp in
also has worked for me. If you are having "read errors" against the
drive doing the import well that's a problem and I would first guess
that it is something dodgy with the drive. Maybe windows has problems
reading files of this size is another possibility?

You could do some testing of the usb out and in on systems that are
not running windows to rule out that as a factor.

joel garry

Feb 25, 2008, 1:26:47 PM
On Feb 24, 5:08 pm, EdLong <rdhm...@prodigy.net> wrote:
> Hi everyone.
> We are 10.2.0.3 running on Windows 2003. I have to move about a
> terabyte of Oracle data from one disk array to another; its split
> amongst 5 schemas or so. I'd like to solicit some user experiences
> with DataPump. You may infer that we have not had the best experience
> so far.
>
> I started with the largest subset, a single schema of about 200gb.
> Using traditional exp, sneaker net, import, this took about 24 hours
> to move. Of this, nearly 20 hours was the import. Target machine
> consists of 4 3ghz engines with beaucoup memory and little if any
> competing workload.

That seems slow; have you done everything you can to speed it up?
http://www.oracledba.co.uk/tips/import_speed.htm In particular, using
direct path for both exp and imp, and doing the imp with noarchivelog,
then switching to archivelog and taking a backup can make a big
difference. Have you checked to see what is slowing it down?
Sometimes the I/O is just overwhelmed, especially if you are dependent
on some RAID-5 with write buffers that get saturated.
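The noarchivelog dance described above might look like the following sketch; it requires downtime, SYSDBA access, and a fresh backup afterwards, since nothing is recoverable past the last backup while redo isn't being archived. The imp flags are illustrative, not a tuned recipe:

```shell
# Sketch: disable archiving for the duration of a big import, then
# re-enable it and back up. Credentials and file names are hypothetical.
sqlplus / as sysdba <<'SQL'
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE NOARCHIVELOG;
ALTER DATABASE OPEN;
SQL

# Big array insert buffer, statistics deferred until after the load
imp system/manager FILE=big.dmp LOG=big_imp.log \
    FROMUSER=app TOUSER=app BUFFER=30000000 STATISTICS=NONE

sqlplus / as sysdba <<'SQL'
SHUTDOWN IMMEDIATE
STARTUP MOUNT
ALTER DATABASE ARCHIVELOG;
ALTER DATABASE OPEN;
SQL
# ...and take a backup immediately after switching back.
```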

>
> I've tried several different cuts at DataPump but found it somewhat
> unsatisfactory.
>
> It appears that at least in our shop at this level, The target
> directory for the expdp and the source directory for the impdp cannot
> be either a mapped drive or a USB drive. I could not find any
> documentation for these apparent limitations.
>
> The expdp says I can't open and write to the log file if its a map
> drive. The EXPDP appears to work when pointed at a USB drive. The
> impdp fails on a USB drive with a 'hard' i/o error after an hour or
> two of processing. I tried the latter at a parallel level of 4 and 2;
> I didn't try 1.
>
> What leads me to risk the wrath of Achilles and query this group is

Aw, all we want are decent questions! :-)

jg
--
@home.com is bogus.
What's in your database? http://ap.google.com/article/ALeqM5ghPenZUJTE7BfSfgQbj6RX597DEAD8V019TG0


EdLong

Feb 25, 2008, 4:15:15 PM
On Feb 25, 1:26 pm, joel garry <joel-ga...@home.com> wrote:
> On Feb 24, 5:08 pm, EdLong <rdhm...@prodigy.net> wrote:
>
> > Hi everyone.
> > We are 10.2.0.3 running on Windows 2003. I have to move about a
> > terabyte of Oracle data from one disk array to another; its split
> > amongst 5 schemas or so. I'd like to solicit some user experiences
> > with DataPump. You may infer that we have not had the best experience
> > so far.
>
> > I started with the largest subset, a single schema of about 200gb.
> > Using traditional exp, sneaker net, import, this took about 24 hours
> > to move. Of this, nearly 20 hours was the import. Target machine
> > consists of 4 3ghz engines with beaucoup memory and little if any
> > competing workload.
>
> That seems slow, have you done everything you can to speed it up?
> http://www.oracledba.co.uk/tips/import_speed.htm In particular, using

> direct path for both exp and imp, and doing the imp with noarchivelog,
> then switching to archivelog and taking a backup can make a big
> difference.  Have you checked to see what is slowing it down?
> Sometimes the I/O is just overwhelmed, especially if you are dependent
> on some RAID-5 with write buffers that get saturated.

Thank you both for the great suggestions.
1: I'm confident the USB drive is ok; brand new disk, just formatted
and chkdsk /f'd with no problems. Not sure if it's a Windows problem; I
have seen similar 'dodgy' results when Windows tried to compress a
file that ended up > 32Gb. No compression here though.
2: I tried the noarchivelog option on the destination; it made a
slight difference.
3: The target disk is an EMC CLARiiON CX3-10 configured as a RAID 5
group, so RAID write delay could be a factor.
4: I don't control the source instances so CONSISTENT=Y and DIRECT=Y
were both problematic due to Undo size and other factors. I actually
went back to the original dump sent in by a customer some months ago
making the Export moot for the first schema. I'm retrying Direct and
consistent for the second schema.
5: Thank you Jg for the link to the best practices stuff on export/
import. I've actually been considering a variant on this idea,
breaking up the export by tablesize then doing a series of imports
much like you describe.
6: Watching the import run, the single gating issue appears to be
that it's single-threaded. The HBA gets to about 35% busy and stays
there. 35% busy is right at the point in a single-server queue where
there is always something using the queue but little queuing.
The processors on the target server are loafing.
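The 35%-busy observation squares with back-of-envelope queueing arithmetic, assuming an M/M/1 (single-server, random-arrival) model, which is an idealization of the HBA rather than a claim about it:

```shell
# At utilization rho, an M/M/1 queue has an average of
#   Lq = rho^2 / (1 - rho)
# jobs waiting. At rho = 0.35 that is well under one job queued: the
# server is regularly in use but almost nothing waits behind it.
awk 'BEGIN {
  rho = 0.35
  lq  = rho * rho / (1 - rho)   # avg jobs waiting in queue
  printf "utilization=%.2f avg_queue=%.3f\n", rho, lq
}'
```

In other words, a single-threaded import leaves the device comfortably underdriven, consistent with the loafing CPUs.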

mix...@gmail.com

Feb 26, 2008, 2:01:11 AM
Hello all.
I think the best way in this case is to use Data Pump only.
I ran some tests showing that Data Pump is faster than the original
exp/imp:

exp & imp vs Data Pump test setup:
Sun Fire V890
disk array: NetApp (20 disks, RAID-DP, a RAID 4 variant)
test database ~ 70 GB

--===============================================================
exp
exp xxx/xxx@test file=/yyy/dump_exp.dmp log=/yyy/dump_exp.log full=y
buffer=30000000 direct=y
time 00:40
avg write speed 8mb/s
--===============================================================
expdp
expdp xxx/xxx@test directory=datapump DUMPFILE=dump_dp.dmp
logfile=dump_dp.log full=y
00:10
avg write speed 60 mb/s
--*****************************
expdp xxx/xxx@test PARALLEL=2 directory=datapump DUMPFILE=dump_dp.dmp
logfile=dump_dp.log full=y
00:10
avg write speed 60 MB/s



--===============================================================
imp (maximum BUFFER and RECORDLENGTH)
imp xxx/xxx@test file=/yyy/dmp.dmp LOG=/yyy.log BUFFER=300000000
IGNORE=y recordlength=65535 STATISTICS=none FULL=Y
01:46
avg write speed 15-16 MB/s
--===============================================================
impdp
00:38
date; impdp xxx/xxx@test directory=datapump DUMPFILE=dump_dp.dmp
logfile=dump_dp_imp.log full=y;date
avg write speed 10-60 MB/s


So Data Pump is faster than exp/imp either way.
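Making the speedups in the timings above explicit (exp 00:40 vs expdp 00:10, imp 01:46 vs impdp 00:38, all in minutes):

```shell
# Quick arithmetic on the reported wall-clock times, in minutes
awk 'BEGIN {
  printf "export speedup: %.1fx\n", 40 / 10    # exp 00:40 vs expdp 00:10
  printf "import speedup: %.1fx\n", 106 / 38   # imp 01:46 vs impdp 00:38
}'
```

Roughly a 4x export and a near-3x import improvement on this hardware, though whether that carries over to a saturated Windows/CLARiiON setup is a separate question.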

p.s. sorry for my english :)

joel garry

Feb 26, 2008, 1:22:31 PM
On Feb 25, 1:15 pm, EdLong <rdhm...@prodigy.net> wrote:
> On Feb 25, 1:26 pm, joel garry <joel-ga...@home.com> wrote:
> Thank you both for the great suggestions.
> 1: I'm confident the USB drive is ok; brand new disk, just formatted
> and chkdsk /f'd with no problems. Not sure if its a Windows problem; I
> have seen similar 'dodgy' results when Windows tried to compress a
> file that ended up > 32Gb. No compression here though.
> 2: I tried the noarchivelog option on the destination; it made a
> slight difference.
> 3: The target disk is an EMC Clarion CX-310 configured as a Raid 5
> disk so Raid delay could be a factor.
> 4: I don't control the source instances so CONSISTENT=Y and DIRECT=Y
> were both problematic due to Undo size and other factors. I actually
> went back to the original dump sent in by a customer some months ago
> making the Export moot for the first schema.  I'm retrying Direct and
> consistent for the second schema.
> 5: Thank you Jg for the link to the best practices stuff on export/
> import. I've actually been considering a variant on this idea,
> breaking up the export by tablesize then doing a series of imports
> much like you describe.
> 6: In watching the Import run the single gating issue appears to be
> that its single threaded. The HBA gets to about 35% busy and stays
> there. 35% busy, is right at the point in a single server queue where
> there is always something using the queue but little queueing(sp?).
> The processors on the target server are loafing.

Sounds like you may simply be hitting your hardware limits then,
though if noarchivelog doesn't make much difference, I'm not sure what
is going on (unless your arcs go to a different device or
controller?). You might try roll-your-own parallelization by simply
starting up a couple of imps into different tablespaces and seeing what
your HBA and clock time say about that. I would expect your busy
percentage to stay the same and the queues to grow, but I really don't
know anything about the CLARiiON; it may have some smartypants
adaptiveness that you need to convince it to use, or it may be much
worse. If the latter, I would suspect the results mixadba posted
wouldn't apply to your situation.
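The roll-your-own parallelization above might look like this sketch: several classic imp sessions running at once, each restricted to a different set of tables so the OS can overlap the I/O. Table lists and credentials are hypothetical:

```shell
# Hypothetical: two concurrent imp sessions against disjoint table sets.
# On Windows you'd launch each in its own window (e.g. with `start`)
# rather than backgrounding with `&`.
imp system/manager FILE=big.dmp LOG=imp_a.log \
    TABLES="(ORDERS,ORDER_LINES)" IGNORE=Y &
imp system/manager FILE=big.dmp LOG=imp_b.log \
    TABLES="(CUSTOMERS,ADDRESSES)" IGNORE=Y &
wait   # block until both background imports finish
```

Watching the HBA busy percentage while both run would show whether the device has headroom or the queues just grow.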

DIRECT should be faster.

And now for something completely different: http://asktom.oracle.com/tkyte/flat/

jg
--
@home.com is bogus.

Download Jonathan Lewis's its_the_data in rtf format. Open it with
the notepad editor. Smoothly scroll left and right until you see
square white blocks on either side, then moderately quickly scroll
left and right between those areas. See the Pong?

Joey.D...@gmail.com

Feb 28, 2008, 9:19:54 AM
Can you possibly put the source tablespaces in read-only mode for the
move? This seems like the perfect time to use transportable
tablespaces.
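A rough sketch of that transportable-tablespace route, assuming same-endian source and target (both Intel here) and made-up tablespace, user, and file names:

```shell
# Hypothetical transportable-tablespace flow on 10g, same endianness.
sqlplus / as sysdba <<'SQL'
-- 1. Freeze the tablespace on the source
ALTER TABLESPACE app_data READ ONLY;
-- 2. Verify it is self-contained (violations land in
--    TRANSPORT_SET_VIOLATIONS)
EXEC DBMS_TTS.TRANSPORT_SET_CHECK('APP_DATA', TRUE);
SQL

# 3. Export just the metadata -- the data stays in the datafiles
expdp system/manager DIRECTORY=dp_dir DUMPFILE=tts.dmp \
      TRANSPORT_TABLESPACES=app_data

# 4. Copy the datafiles plus tts.dmp to the target, then plug in:
impdp system/manager DIRECTORY=dp_dir DUMPFILE=tts.dmp \
      TRANSPORT_DATAFILES='E:\oradata\app_data01.dbf'
```

Since the data is never unloaded and reloaded, the cost is essentially the file copy, which is why this tends to beat both exp/imp and Data Pump for whole-tablespace moves.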

EdLong

Mar 1, 2008, 12:57:28 PM

Hi Everyone.
Thank you for the great suggestions. My progress report:
1: The suggestion about hardware limits got me thinking. So, I broke
the mirror on the C: drive and used the now spare disk as a third disk
to store tablespaces. This seems to have improved import performance.
2: I changed the tablespace creates to preallocate very large files,
thinking that extent management could be a factor.
3: I'm going to do some comparisons between the PATA drive test box
and the CLARiiON-backed target system to see if the SAN makes any
difference.
4: The import is still flaky; some tables import ok sometimes and fail
other times with various errors, most commonly, precision failures. I
did run all the upgrade stuff to get from 10.2 to 10.2.0.3.
Progress at least on IMPORT. Thanks for the help.
1: I watched the PERFMON reports for Windows while the IMP ran.
