
I/O-benchmark


Bonita Montero

May 18, 2021, 7:50:13 PM
I've got some code which isn't plain C++ but Windows / POSIX code.
I just wanted to compare the efficiency of read() and ReadFile() by
reading from the filesystem cache with increasing block sizes. In doing
so I noticed a huge difference in the cost of a kernel call when reading
with small block sizes.
So here's the code:

#if defined(_MSC_VER)
#include <Windows.h>
#elif defined(__unix__)
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#endif
#include <iostream>
#include <vector>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cerrno>
#include <chrono>
#include <iomanip>

using namespace std;
using namespace chrono;

int main( int argc, char **argv )
{
    using sc_tp = time_point<steady_clock>;
    if( argc < 2 )
        return EXIT_FAILURE;
    size_t const MIN_BLOCK_SIZE = 64,
                 MAX_BLOCK_SIZE = (size_t)1024 * 1024;
    vector<char> buf( MAX_BLOCK_SIZE );
    // open the file read-only with the native API of each platform
#if defined(_MSC_VER)
    HANDLE hFile = CreateFileA( argv[1], GENERIC_READ,
                                FILE_SHARE_READ | FILE_SHARE_WRITE,
                                nullptr, OPEN_EXISTING,
                                FILE_ATTRIBUTE_NORMAL, NULL );
    if( hFile == INVALID_HANDLE_VALUE )
        return EXIT_FAILURE;
#elif defined(__unix__)
    int file = open( argv[1], O_RDONLY );
    if( file == -1 )
        return EXIT_FAILURE;
#endif
    for( size_t blockSize = MIN_BLOCK_SIZE; blockSize <= MAX_BLOCK_SIZE;
         blockSize *= 2 )
    {
        // rewind so that every block size reads the whole file
#if defined(_MSC_VER)
        if( SetFilePointer( hFile, 0, nullptr, FILE_BEGIN ) != 0 )
            return EXIT_FAILURE;
#elif defined(__unix__)
        if( lseek( file, 0, SEEK_SET ) != 0 )
            return EXIT_FAILURE;
#endif
        uint64_t total = 0;
        sc_tp start = steady_clock::now();
#if defined(_MSC_VER)
        DWORD dwRead;
        BOOL rfRet;
        do
            dwRead = 0,
            rfRet = ReadFile( hFile, &buf[0], (DWORD)blockSize, &dwRead, nullptr ),
            total += dwRead;
        while( rfRet && dwRead );
        // GetLastError() is only meaningful if ReadFile actually failed
        if( !rfRet && GetLastError() != ERROR_HANDLE_EOF )
            return EXIT_FAILURE;
#elif defined(__unix__)
        for( ; ; )
        {
            ssize_t readRet = read( file, &buf[0], blockSize );
            if( readRet == -1 )
            {
                // retry interrupted or temporarily unavailable reads
                if( errno == EINTR || errno == EAGAIN )
                    continue;
                return EXIT_FAILURE;
            }
            if( readRet == 0 )
                break;
            total += readRet;
        }
#endif
        double seconds = (int64_t)duration_cast<nanoseconds>(
                             steady_clock::now() - start ).count() / 1.0e9;
        double mbps = (int64_t)total / (1.0e6 * seconds);
        // output: block size in kB (1024 bytes), throughput in MB/s (10^6 bytes)
        cout << (ptrdiff_t)blockSize / 1024.0 << " " << setw( 5 ) << (int)mbps
             << endl;
    }
}

These are the results from an Ubuntu PC with a Phenom X4 945 (3 GHz
quad-core), reading from the filesystem cache (first column: block
size in kB, second column: throughput in MB/s):

0.0625 175
0.125 344
0.25 634
0.5 1114
1 1850
2 2788
4 3689
8 4195
16 4438
32 4391
64 4305
128 4370
256 4328
512 3946
1024 3969

These are the results from a Windows 10 PC with a Ryzen Threadripper
3990X (2.9 GHz base clock, 64 cores):

0.0625 48
0.125 96
0.25 189
0.5 364
1 679
2 1198
4 1982
8 3257
16 4752
32 6444
64 7789
128 8499
256 8984
512 9159
1024 9271

As you can see, the ten-year-older processor with 1/16 of the number
of cores is faster up to a block size of 8 kB.
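
To make the per-call cost visible, the throughput figures can be converted into an average time per read call. The following is only a sketch: `nsPerCall` is a hypothetical helper that is not part of the benchmark, and the figure it yields includes the copy into the user buffer, not just the kernel transition (at 64 bytes the copy should be a small share). It assumes 1 MB = 10^6 bytes, as in the benchmark's output.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not part of the original benchmark.
// Converts a block size in bytes and a measured throughput in MB/s
// (1 MB = 10^6 bytes) into the average time per read call in nanoseconds.
double nsPerCall( std::size_t blockSize, double mbPerSec )
{
    assert( blockSize > 0 && mbPerSec > 0.0 );
    // calls per second = bytes per second / bytes per call
    double callsPerSec = mbPerSec * 1.0e6 / (double)blockSize;
    return 1.0e9 / callsPerSec;
}
```

With the 64-byte rows above, `nsPerCall( 64, 175 )` gives about 366 ns per call on the Ubuntu machine and `nsPerCall( 64, 48 )` about 1333 ns on the Windows machine.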

So show me your numbers.

Bonita Montero

May 18, 2021, 7:55:02 PM
I was reading from a 1GB-file repeatedly so that the file was
inside the cache ...

Öö Tiib

May 18, 2021, 9:13:02 PM
On Wednesday, 19 May 2021 at 02:50:13 UTC+3, Bonita Montero wrote:
> I've got some code which isn't plain C++ but Windows / POSIX code.
> I just wanted to compare the efficiency of read() and ReadFile() by
> reading from the filesystem cache with increasing block sizes. In doing
> so I noticed a huge difference in the cost of a kernel call when reading
> with small block sizes.
> So here's the code:

There are two programs that you have intertwined with each
other using the preprocessor like mad. Looks funny, but why
did you do it?

Branimir Maksimovic

May 18, 2021, 10:15:22 PM
NTFS is much slower than EXT for small files.


--
current job title: senior software engineer
skills: x86 aasembler,c++,c,rust,go,nim,haskell...

press any key to continue or any other to quit...

Bonita Montero

May 19, 2021, 12:07:37 AM
> NTFS is much slower than EXT for small files.

It's about small, linear reads from large files.

Bonita Montero

May 19, 2021, 12:08:10 AM
> There are two programs that you have intertwined with each
> other using the preprocessor like mad. Looks funny, but why
> did you do it?

I think the readability is good.
If there were more #ifdefs, I'd have written an I/O class.
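
For illustration, such an I/O class might look like the sketch below. It is not code from this thread: `SeqFile` and its members are hypothetical names, and it assumes that `_WIN32` selects the Win32 branch and that every other target offers the POSIX calls.

```cpp
#if defined(_WIN32)
#include <Windows.h>
#else
#include <fcntl.h>
#include <unistd.h>
#include <cerrno>
#endif
#include <cstddef>

// Hypothetical wrapper: hides the platform split behind
// open / rewind / read so that a benchmark loop needs no #ifdefs.
class SeqFile
{
public:
    explicit SeqFile( char const *path )
    {
#if defined(_WIN32)
        m_h = CreateFileA( path, GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE, nullptr,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr );
#else
        m_fd = open( path, O_RDONLY );
#endif
    }
    ~SeqFile()
    {
#if defined(_WIN32)
        if( m_h != INVALID_HANDLE_VALUE )
            CloseHandle( m_h );
#else
        if( m_fd != -1 )
            close( m_fd );
#endif
    }
    SeqFile( SeqFile const & ) = delete;
    SeqFile &operator =( SeqFile const & ) = delete;

    bool isOpen() const
    {
#if defined(_WIN32)
        return m_h != INVALID_HANDLE_VALUE;
#else
        return m_fd != -1;
#endif
    }
    // Moves the file position back to the start of the file.
    bool rewind()
    {
#if defined(_WIN32)
        return SetFilePointer( m_h, 0, nullptr, FILE_BEGIN ) == 0;
#else
        return lseek( m_fd, 0, SEEK_SET ) == 0;
#endif
    }
    // Returns the number of bytes read, 0 at end of file, -1 on error.
    long long read( void *buf, std::size_t n )
    {
#if defined(_WIN32)
        DWORD got = 0;
        if( !ReadFile( m_h, buf, (DWORD)n, &got, nullptr ) )
            return -1;
        return got;
#else
        for( ; ; )
        {
            ssize_t r = ::read( m_fd, buf, n );
            if( r >= 0 )
                return r;
            if( errno != EINTR ) // retry only interrupted calls
                return -1;
        }
#endif
    }
private:
#if defined(_WIN32)
    HANDLE m_h;
#else
    int m_fd;
#endif
};
```

With a wrapper like this, the timed loop would shrink to a call to `rewind()` followed by `while( f.read( &buf[0], blockSize ) > 0 )`, at the price of one more layer between the measurement and the system call.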

Scott Lurndal

May 19, 2021, 10:05:53 AM
For your comparison to be valid, it requires a single point of
difference. You have at least five points of difference:
a different operating system, a different compiler, a different
processor, a different I/O controller and a different storage
device.

It's likely that the prime difference is that Linux has a much
better I/O subsystem than Windows.

Bonita Montero

May 19, 2021, 11:54:06 AM
> For your comparison to be valid, it requires a single point of
> difference. You have at least five points of difference:
> a different operating system, a different compiler, a different
> processor, a different I/O controller and a different storage
> device.

The compiler doesn't matter. The code performs according to the
efficiency of the operating system. The I/O controller doesn't
matter since I asked for repeated results from the cache.
But comparing different CPUs under different operating systems
is still meaningful. I compared a four-core Phenom X4 945 with a
64-core Ryzen Threadripper 3990X, and the first one is faster
up to a block size of 8 kB. Since the much older Phenom is faster
in part of the range, the difference must be mostly rooted in the
efficiency of the operating system.

> It's likely that the prime difference is that Linux has a much
> better I/O subsystem than Windows.

Of course - and kernel-calls are supposed to be more efficient.
