On May 18, 1:24 pm, sc...@slp53.sl.home (Scott Lurndal) wrote:
> Adam Skutt <ask...@gmail.com> writes:
> >> Using pread/pwrite with 1MB transfers is probably the best you'll be able
> >> to do.
>
> >I'm not sure why you think 1MB [sic] is the right transfer size. I'm
> >not aware of any cp(1) implementation that uses a buffer that large,
> >because it's pointless.
>
> Twenty years of measuring and optimizing disk performance, particularly
> on large Oracle installations.
I'm not sure why you think that is remotely relevant here. The OP is
plainly not writing a part of, nor a replacement for, Oracle.
> >> And you may want to open the files with O_DIRECT to avoid the kernel
> >> buffer/file cache, which may help avoid filling the OS file cache with
> >> "access once" data. On POSIX systems anyway.
>
> >Nope. There's no guarantee that it does any of those things, and it
> >has terrible semantics for correct usage. Again, why are you
> >recommending things that cp(1) doesn't do?
>
> Optimal performance, of course. cp is hardly optimal. dd(1) or xdd(1)
> are much better from a performance standpoint.
It hits the I/O limit on my systems with no problems. He's
manipulating text, so optimal performance in the I/O stack is very
likely to be pointless anyway. His largest source of overhead is
likely to be in whatever processing his application is actually
doing. Your suggestions are both factually incorrect and
irrelevant. dd(1) also doesn't do direct I/O unless asked, FWIW.
>
> O_DIRECT is guaranteed to DMA the data from the device into the
> application buffer, completely bypassing any kernel or glibc buffers.
Nope. All it does is disable any _caching_ done by the kernel, if
possible. On modern Linux, that means that I/O skips the page cache
for the file data only. It doesn't necessarily result in direct DMA
because that's simply not possible on some systems. If the I/O
requires bounce buffers, then there still will be a copy operation in
the kernel. If the file is open anywhere else, then you still get
manipulation of the page cache in order to guarantee cache coherency.
If the filesystem needs to do some magic to ensure consistency (e.g.,
block-level checksums, ordered writes) then those will still happen,
and they may or may not involve a copy operation or some other I/O
"performance killing" behavior.
That's ignoring all the other implementation requirements for using
O_DIRECT:
* You're not guaranteed it's supported. Some file systems simply do
not support it, and the open(2) call will fail. Some operating
systems simply do not support the flag. There's no portable way to
determine whether the flag is supported.
* You have to align everything correctly, and there is no portable way
to find out the correct alignment, never mind the optimal one.
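
To make those two requirements concrete, here is a rough sketch in C of what a caller ends up writing. The helper names (open_maybe_direct, alloc_io_buffer) and the 4096-byte alignment are my own assumptions for illustration, not anything a standard promises. It treats EINVAL from open(2) as "not supported here", which is what Linux documents; other systems may behave differently or lack the flag entirely.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE           /* glibc only exposes O_DIRECT with this set */
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* ASSUMPTION: 4096 is only a guess at an acceptable alignment. The real
 * requirement depends on the device and filesystem. */
#define GUESSED_ALIGN 4096

/* Hypothetical helper: try O_DIRECT, fall back to an ordinary open.
 * Sets *is_direct to 1 only if the O_DIRECT open succeeded.
 * Returns a file descriptor, or -1 with errno set. */
int open_maybe_direct(const char *path, int *is_direct)
{
    *is_direct = 0;
#ifdef O_DIRECT
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd >= 0) {
        *is_direct = 1;
        return fd;
    }
    if (errno != EINVAL)      /* a real error, not "flag unsupported" */
        return -1;
#endif
    /* Flag missing at compile time or rejected at run time. */
    return open(path, O_RDONLY);
}

/* Hypothetical helper: allocate a buffer aligned to GUESSED_ALIGN.
 * Release it with free(). Returns NULL on failure. */
void *alloc_io_buffer(size_t size)
{
    void *buf = NULL;
    if (posix_memalign(&buf, GUESSED_ALIGN, size) != 0)
        return NULL;
    return buf;
}
```

Even when the open succeeds, the file offset and the transfer length usually have to be aligned as well as the buffer, so every read in the program has to be written around a number you had to guess.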
All of that for a rather dubious performance benefit, and direct I/O
as you describe it is essentially impossible on some modern
filesystems (e.g., ZFS, BTRFS). Moreover, I would suspect the VM
gymnastics required to allow the FS to work correctly would kill a lot
of the performance, especially when compared to just copying the
buffer. TLB flushes and messing with page tables are expensive.
Making copies is not expensive up to a pretty large point.
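
For comparison, the plain buffered approach is just a read/write loop through an ordinary buffer. This is a minimal sketch, not what any particular cp(1) does; the function name copy_fd and the 64 KiB buffer size are assumptions I picked for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* ASSUMPTION: 64 KiB is an arbitrary, modest buffer size. */
#define COPY_BUF_SIZE (64 * 1024)

/* Hypothetical helper: copy everything from in_fd to out_fd using
 * ordinary buffered I/O. Returns 0 on success, -1 with errno set. */
int copy_fd(int in_fd, int out_fd)
{
    char buf[COPY_BUF_SIZE];

    for (;;) {
        ssize_t n = read(in_fd, buf, sizeof buf);
        if (n == 0)
            return 0;                 /* end of input */
        if (n < 0) {
            if (errno == EINTR)
                continue;             /* interrupted, retry the read */
            return -1;
        }

        /* write(2) may accept fewer bytes than asked, so loop. */
        char *p = buf;
        while (n > 0) {
            ssize_t w = write(out_fd, p, (size_t)n);
            if (w < 0) {
                if (errno == EINTR)
                    continue;         /* interrupted, retry the write */
                return -1;
            }
            p += w;
            n -= w;
        }
    }
}
```

No alignment rules, no feature probing, and it works on any file descriptor.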
Adam