
High throughput disk write: CreateFile/WriteFile?


Brandon

Jun 28, 2007, 6:14:45 PM
Hello all,

I have an application where I need to write text and binary (numeric)
data to a file at ~200 MB/s (8MB/40ms). I am currently using fprintf
and fwrite, respectively, but I'm not achieving the desired
throughput. I've been looking into the Windows CreateFile based
methods instead.

I am using a PCI-X hardware RAID adapter with RAID0 across 4 drives. I
have benchmarked my system using h2benchw v3.6 with good results. For
my desired file sizes, anywhere from 8MB to 64MB (I'm flexible on how
many 8MB data sets are stored in a single file), I am able to achieve
up to 275MB/s.

I'm hoping it's possible to get close to that using alternatives to
the standard C options. Any suggestions?

For now I'm trying CreateFile/WriteFile, without success. My code
looks like this:

<SNIP>
HANDLE hWriteFile = NULL;           // File handle
LPDWORD lpNumBytesWritten = NULL;   // Number of bytes written (WriteFile)
char szTextToAppendToLog[1024];     // Temp char buffer
char szFilePath[1024];              // Output file name
char szTimeStamp[32];               // Time character string
char szTemp[128];                   // Temp character string

// Open the output file for write.
hWriteFile = CreateFile(
    szFilePath,                 // File path
    GENERIC_WRITE,              // Open for write
    NULL,                       // Do not share
    NULL,                       // Default security
    CREATE_ALWAYS,              // Overwrite existing files
    FILE_FLAG_WRITE_THROUGH,    //FILE_FLAG_OVERLAPPED, // Normal file
    NULL);                      // No template
if (hWriteFile == INVALID_HANDLE_VALUE)
{
    sprintf_s(szTextToAppendToLog, sizeof(szTextToAppendToLog),
        "ERROR: Output file %s failed to open.",
        szFilePath);
    pThis->UI->AppendToStatLog(szTextToAppendToLog);
}

...

// Write ASCII header to file.
sprintf_s(szTemp, sizeof(szTemp), "MyData: Date:%s\r\n", szTimeStamp);
WriteFile(
    hWriteFile,
    szTemp,
    (DWORD) sizeof(szTemp),
    lpNumBytesWritten,
    NULL);
</SNIP>

As soon as I hit WriteFile(...) my application hits an unhandled
exception:
"Unhandled exception at 0x7c810e0c in DataCapture.exe: 0xC0000005:
Access violation writing location 0x00000000."

I'm guessing I'm not using CreateFile correctly?

Thanks,
-Brandon

Tom Serface

Jun 28, 2007, 6:25:13 PM
I did a project recently where I had to read and write files pretty quickly.
I used the following code which seemed to work the fastest for me.
Hopefully this makes sense just from the code... I'm just copying bytes so
I used the char on purpose. The program itself is compiled for Unicode, but
I specifically did not want a TCHAR here.

Tom

// Copy a file to the destination either creating a new one or adding
// to the one that is already there
#define BUF_SIZE (8192*2)
char buf[BUF_SIZE];
CRITICAL_ERROR CMyAppDlg::CopyFile(CString &csDestination, CString &csSource,
                                   bool bTruncate)
{
    // DWORD err;
    HANDLE hFileInput;
    HANDLE hFileOutput;

    DWORD64 nBytesCompleted = 0;

    if(m_bCancelCopy)
        return CRITICAL_ABORT;

    hFileOutput = CreateFile(csDestination, GENERIC_WRITE | GENERIC_READ, 0,
        NULL,
        bTruncate?CREATE_ALWAYS:OPEN_EXISTING,
        FILE_FLAG_RANDOM_ACCESS, NULL);

    if(hFileOutput == INVALID_HANDLE_VALUE)
        return CRITICAL_ABORT;

    hFileInput = CreateFile(csSource, GENERIC_READ, 0, NULL,
        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);

    if(hFileInput == INVALID_HANDLE_VALUE) {
        CloseHandle(hFileOutput);
        return CRITICAL_ABORT;
    }

    if(!bTruncate) {
        // Appending: move to the end of the existing file.
        LARGE_INTEGER li;
        li.QuadPart = 0;
        SetFilePointerEx(hFileOutput, li, &li, FILE_END);
        // err = GetLastError();
        nBytesCompleted = li.QuadPart;
    }

    DWORD nBytesRead, nBytesWritten;
    int nTimes = 20;
    SetLastError(0);
    while(ReadFile(hFileInput,buf,BUF_SIZE,&nBytesRead,NULL) && nBytesRead > 0)
    {
        if(WriteFile(hFileOutput,buf,nBytesRead,&nBytesWritten,NULL)) {
            nBytesCompleted += nBytesRead;
            UpdateTotalProgress(m_nTotalBytesCompleted + nBytesCompleted,
                m_nTotalBytesToCopy);
            if(m_bCancelCopy) {
                CloseHandle(hFileOutput);
                CloseHandle(hFileInput);
                return CRITICAL_ABORT;
            }
            if(--nTimes <= 0) {
                GiveTime(); // Allow UI to update every so often
                nTimes = 20;
            }
        }
        else { // Write failed
            DWORD nError = GetLastError();
            if(nError != ERROR_SUCCESS) {
                CloseHandle(hFileOutput);
                CloseHandle(hFileInput);
                return DisplayCriticalError(nError);
            }
        }
        SetLastError(0);
    }

    // The loop also ends when ReadFile fails, so check the last error.
    DWORD nError = GetLastError();
    if(nError != ERROR_SUCCESS) {
        CloseHandle(hFileOutput);
        CloseHandle(hFileInput);
        return DisplayCriticalError(nError);
    }
    CloseHandle(hFileOutput);
    CloseHandle(hFileInput);
    return CRITICAL_NONE;
}

"Brandon" <kille...@gmail.com> wrote in message
news:1183068885.5...@q75g2000hsh.googlegroups.com...

William DePalo [MVP VC++]

Jun 28, 2007, 8:27:36 PM
"Brandon" <kille...@gmail.com> wrote in message
news:1183068885.5...@q75g2000hsh.googlegroups.com...
> I have an application where I need to write text and binary (numeric)
> data to a file at ~200 MB/s (8MB/40ms). I am currently using fprintf
> and fwrite, respectively, but I'm not achieving the desired
> throughput. I've been looking into the Windows CreateFile based
> methods instead.

I'd try opening the file with FILE_FLAG_NO_BUFFERING. If you do that, you'd
need to ensure that you always write blocks which are exact multiples of the
sector size and that the buffers themselves are aligned properly - see the
remarks for the flag in the help entry for CreateFile().
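
A minimal sketch of those two constraints (write length a multiple of the sector size, buffer address aligned to it), written as portable C++ so the arithmetic can be checked without touching a disk. The helper names RoundUpToSector and IsSectorAligned are made up for illustration, and any concrete sector size such as 512 is only an assumed example; on Windows the real value can be queried with GetDiskFreeSpace.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Round `bytes` up to the next multiple of `sectorSize`.
// Hypothetical helper; `sectorSize` must be non-zero.
std::size_t RoundUpToSector(std::size_t bytes, std::size_t sectorSize) {
    assert(sectorSize != 0);
    return ((bytes + sectorSize - 1) / sectorSize) * sectorSize;
}

// True when both the buffer address and the write length are
// multiples of `sectorSize`, as unbuffered I/O expects.
bool IsSectorAligned(const void* buffer, std::size_t bytes,
                     std::size_t sectorSize) {
    assert(sectorSize != 0);
    const auto address = reinterpret_cast<std::uintptr_t>(buffer);
    return address % sectorSize == 0 && bytes % sectorSize == 0;
}
```

With an assumed 512-byte sector, a 128-byte header buffer fails the length check, and RoundUpToSector(128, 512) gives the 512 bytes it would have to be padded to.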

And while you are at it, you can benchmark both synchronous and asynchronous
operation. If you are wading in waters unknown to you, you may want to scan
an advanced text on Windows like any of Jeff Richter's books.

Regards,
Will
www.ivrforbeginners.com


Scott McPhillips [MVP]

Jun 28, 2007, 8:44:14 PM

The reason for the error is passing a NULL value for lpNumBytesWritten.
You must pass the address of a DWORD variable so WriteFile can fill in
the DWORD with a value.

For top speed you will almost certainly need to use FILE_FLAG_OVERLAPPED
and FILE_FLAG_NO_BUFFERING. Both require quite a bit of additional
code. The overlapped option will let you queue multiple writes, so the
driver will not have to return to you between writes, and it will let
you work on filling more buffers while the disk is being written. The
no buffering option eliminates some unnecessary copying of the data on
its way to the disk.

--
Scott McPhillips [MVP VC++]

Ulrich Eckhardt

Jun 29, 2007, 9:04:43 AM
Brandon wrote:
> I have an application where I need to write text and binary (numeric)
> data to a file at ~200 MB/s (8MB/40ms). I am currently using fprintf
> and fwrite, respectively, but I'm not achieving the desired
> throughput.

Just wondering, but what throughput did you manage to achieve?

> I've been looking into the Windows CreateFile based methods instead.

This usually reduces the overhead a bit, because those are the native APIs
while fopen/fprintf/fwrite are just wrappers around them. However, even the
latter don't have to be slow, it still depends on how and what you are
doing.

> char szTextToAppendToLog[1024]; // Temp char buffer
> char szFilePath[1024]; // Output file name
> char szTimeStamp[32]; // Time character string
> char szTemp[128]; // Temp character string

I'm always wondering if hackers think that by using powers of two their
programs will somehow magically work correctly... (SCNR)

> hWriteFile = CreateFile(
> szFilePath, // File path
> GENERIC_WRITE, // Open for write
> NULL, // Do not share
> NULL, // Default security
> CREATE_ALWAYS, // Overwrite existing files
> FILE_FLAG_WRITE_THROUGH,//FILE_FLAG_OVERLAPPED, // Normal file
> NULL); // No template

One thing here: FILE_FLAG_WRITE_THROUGH means that this function will not do
any buffering. If you have short bursts of data to write, this will impact
performance negatively.

> if (hWriteFile == INVALID_HANDLE_VALUE)
> {
> sprintf_s(szTextToAppendToLog,sizeof(szTextToAppendToLog),
> "ERROR: Output file %s failed to open.",
> szFilePath);
> pThis->UI->AppendToStatLog(szTextToAppendToLog);
> }

You should throw an exception here, continuing here is just plain wrong.

> // Write ASCII header to file.
> sprintf_s(szTemp, sizeof(szTemp), "MyData: Date:%s\r\n",szTimeStamp);
> WriteFile(
> hWriteFile,
> szTemp,
> (DWORD) sizeof(szTemp),
> lpNumBytesWritten,
> NULL);

Apart from the NULL pointer for the number of bytes written, here might be a
reason for performance problems. The problem here is that you are writing
short pieces of data but without intermediate buffering. Also, I'd suggest
not doing any C-style casts ("(DWORD) sizeof(szTemp)") because those bear
the danger of hiding errors.
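
One way to picture the intermediate buffering mentioned here: collect the short header strings and the numeric data in memory and pass them on only in large blocks. This is a rough sketch, not code from the thread. StagingBuffer and its sink callback are hypothetical names, and the sink stands in for whatever actually performs the write (WriteFile, fwrite, and so on).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical helper: gathers many small writes and passes them on in
// large blocks. `sink` stands in for whatever performs the real write.
class StagingBuffer {
public:
    using Sink = std::function<void(const char*, std::size_t)>;

    StagingBuffer(std::size_t capacity, Sink sink)
        : buffer_(capacity), used_(0), sink_(std::move(sink)) {
        assert(capacity > 0);
    }

    // Copy `size` bytes in; the sink runs only when the buffer fills up.
    void Append(const void* data, std::size_t size) {
        const char* src = static_cast<const char*>(data);
        while (size > 0) {
            const std::size_t room = buffer_.size() - used_;
            const std::size_t chunk = size < room ? size : room;
            std::memcpy(buffer_.data() + used_, src, chunk);
            used_ += chunk;
            src += chunk;
            size -= chunk;
            if (used_ == buffer_.size()) {
                Flush();
            }
        }
    }

    // Hand any pending bytes to the sink (call once before closing the file).
    void Flush() {
        if (used_ > 0) {
            sink_(buffer_.data(), used_);
            used_ = 0;
        }
    }

private:
    std::vector<char> buffer_;
    std::size_t used_;
    Sink sink_;
};
```

With a capacity of, say, 8 MB, the header line and one data set would reach the disk as a single large write instead of several small ones. Note that the final Flush() can be any length, so this alone does not satisfy the sector-multiple rule of FILE_FLAG_NO_BUFFERING.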

Uli


Brandon

Jun 29, 2007, 9:41:21 AM
On Jun 29, 9:04 am, Ulrich Eckhardt <eckha...@satorlaser.com> wrote:
> Brandon wrote:
> > I have an application where I need to write text and binary (numeric)
> > data to a file at ~200 MB/s (8MB/40ms). I am currently using fprintf
> > and fwrite, respectively, but I'm not achieving the desired
> > throughput.
>
> Just wondering, but what throughput did you manage to achieve?
>

40 MB/s. I'm opening and closing a file every 8 datasets, at 8 MB per
dataset. This takes approx. 1.6 s, so 64 MB/file / 1.6 s = 40 MB/s.
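
For reference, the arithmetic behind these figures as a small sketch. The function names are made up for illustration; the numbers are the ones quoted in this thread (64 MB in 1.6 s measured, 8 MB per 40 ms required).

```cpp
#include <cassert>

// Sustained throughput in MB/s for `megabytes` written in `seconds`.
double ThroughputMBps(double megabytes, double seconds) {
    assert(seconds > 0.0);
    return megabytes / seconds;
}

// Time budget in seconds for writing `megabytes` at `targetMBps`.
double SecondsAtRate(double megabytes, double targetMBps) {
    assert(targetMBps > 0.0);
    return megabytes / targetMBps;
}
```

ThroughputMBps(64, 1.6) is the measured 40 MB/s and ThroughputMBps(8, 0.040) is the 200 MB/s target. SecondsAtRate(64, 200) comes to about 0.32 s per 64 MB file, so the file has to be written roughly five times faster than it is now.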

> > I've been looking into the Windows CreateFile based methods instead.
>
> This usually reduces the overhead a bit, because those are the native APIs
> while fopen/fprintf/fwrite are just wrappers around them. However, even the
> latter don't have to be slow, it still depends on how and what you are
> doing.
>
> > char szTextToAppendToLog[1024]; // Temp char buffer
> > char szFilePath[1024]; // Output file name
> > char szTimeStamp[32]; // Time character string
> > char szTemp[128]; // Temp character string
>
> I'm always wondering if hackers think that by using powers of two their
> programs will somehow magically work correctly... (SCNR)
>

Well, I was always taught that it was a good practice to allocate
memory in powers of 2 along byte boundaries.

> > hWriteFile = CreateFile(
> > szFilePath, // File path
> > GENERIC_WRITE, // Open for write
> > NULL, // Do not share
> > NULL, // Default security
> > CREATE_ALWAYS, // Overwrite existing files
> > FILE_FLAG_WRITE_THROUGH,//FILE_FLAG_OVERLAPPED, // Normal file
> > NULL); // No template
>
> One thing here: FILE_FLAG_WRITE_THROUGH means that this function will not do
> any buffering. If you have short bursts of data to write, this will impact
> performance negatively.

Short bursts? Is 8MB/40 ms a short burst? 200 MB/s is pretty ambitious
imo, and not even attainable on a non RAID disk to my knowledge.

>
> > if (hWriteFile == INVALID_HANDLE_VALUE)
> > {
> > sprintf_s(szTextToAppendToLog,sizeof(szTextToAppendToLog),
> > "ERROR: Output file %s failed to open.",
> > szFilePath);
> > pThis->UI->AppendToStatLog(szTextToAppendToLog);
> > }
>
> You should throw an exception here, continuing here is just plain wrong.

Indeed, but I'm not such a good programmer and for now I'm not worried
about corner cases, I just want the performance.

>
> > // Write ASCII header to file.
> > sprintf_s(szTemp, sizeof(szTemp), "MyData: Date:%s\r\n",szTimeStamp);
> > WriteFile(
> > hWriteFile,
> > szTemp,
> > (DWORD) sizeof(szTemp),
> > lpNumBytesWritten,
> > NULL);
>
> Apart from the NULL pointer for the number of bytes written, here might be a
> reason for performance problems. The problem here is that you are writing
> short pieces of data but without intermediate buffering. Also, I'd suggest
> not doing any C-style casts ("(DWORD) sizeof(szTemp)") because those bear
> the danger of hiding errors.
>
> Uli

I'm not having performance problems with CreateFile/WriteFile. It just
doesn't work for me. I'm only having performance limitations with
fopen/fwrite.

Alex Blekhman

Jun 29, 2007, 11:56:49 AM
"Ulrich Eckhardt" wrote:
> I'm always wondering if hackers think that by using powers
> of two their
> programs will somehow magically work correctly...

Oh, didn't you hear? Recent research found that when you use
powers of two, then CPU's registers are massaged in a certain
way, which is very pleasant to the crystal. And, you know,
when the crystal is pleased, it can do wonders for you.


Brandon

Jun 29, 2007, 3:03:04 PM
I'm able to now write my numeric data to the file, but I am having
trouble writing my string header data. What gives? I've verified my
numeric data, so it appears that WriteFile is working for that, but
it's failing on my char strings.

<SNIP>


// Write ASCII header to file.
sprintf_s(szTemp, sizeof(szTemp),
    "MyData: Date:%s\r\n",
    szTimeStamp);

if (! WriteFile(
        hWriteFile,
        szTemp,
        strlen(szTemp),
        &dwNumBytesWritten,
        NULL) )
{
    pThis->UI->AppendToStatLog("WARNING: FileWriteThread - Writing header line 1 to file!");
}
</SNIP>

Tim Roberts

Jul 2, 2007, 1:09:23 AM
Brandon <kille...@gmail.com> wrote:
>
>I'm able to now write my numeric data to the file, but I am having
>trouble writing my string header data. What gives? I've verified my
>numeric data, so it appears that WriteFile is working for that, but
>it's failing on my char strings.

How can we possibly know that without knowing what error you are getting?
There was nothing inherently wrong with the code you posted.

Unless, that is, you opened the file with FILE_FLAG_OVERLAPPED. If you
open a file overlapped, then EVERY I/O function must specify an overlap
structure.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.

Brandon

Jul 2, 2007, 9:54:05 AM
On Jul 2, 1:09 am, Tim Roberts <t...@probo.com> wrote:
> How can we possibly know that without knowing what error you are getting?
> There was nothing inherently wrong with the code you posted.

I apologize. When I use FormatMessage(...) and GetLastError(),
WriteFile(...) is returning:

"The parameter is incorrect." Seems ambiguous...

> Unless, that is, you opened the file with FILE_FLAG_OVERLAPPED. If you
> open a file overlapped, then EVERY I/O function must specify an overlap
> structure.

I'm only opening the file with the FILE_FLAG_NO_BUFFERING flag.

Here's my current code snippet:

<SNIP>
// Write ASCII header to file.
sprintf_s(szTemp, sizeof(szTemp),
    "MyData: Date:%s\r\n", szTimeStamp);
if (! WriteFile(
        hWriteFile,
        szTemp,
        sizeof(szTemp),
        &dwNumBytesWritten,
        NULL) )
{
    // Get the error message.
    DWORD dwChars = FormatMessage(
        FORMAT_MESSAGE_FROM_SYSTEM |
        FORMAT_MESSAGE_ALLOCATE_BUFFER,
        0,
        GetLastError(),
        MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
        (LPSTR) &lpszErr,
        0,
        0);
    // Append it to the log.
    pThis->UI->AppendToStatLog(lpszErr);
}
</SNIP>

Alexander Nickolov

Jul 3, 2007, 2:16:14 PM
In that case the failure is due to your not writing in blocks of
multiples of the sector size. That at least should be clear from
the documentation on FILE_FLAG_NO_BUFFERING!

--
=====================================
Alexander Nickolov
Microsoft MVP [VC], MCSD
email: agnic...@mvps.org
MVP VC FAQ: http://vcfaq.mvps.org
=====================================

"Brandon" <kille...@gmail.com> wrote in message

news:1183384445.6...@c77g2000hse.googlegroups.com...

Tim Roberts

Jul 3, 2007, 11:06:22 PM
Brandon <kille...@gmail.com> wrote:
>
>Here's my current code snippet:
>
><SNIP>
> // Write ASCII header to file.
> sprintf_s(szTemp, sizeof(szTemp),
> "MyData: Date:%s\r\n", szTimeStamp);
> if (! WriteFile(
> hWriteFile,
> szTemp,
> sizeof(szTemp),
> &dwNumBytesWritten,
> NULL) )

Alexander pointed out the real reason for your error, but I wanted to point
out one other thing. In your previous snippet, you had "strlen(szTemp)",
not "sizeof(szTemp)". You need to be very aware of the difference between
them. Once you eliminate the NO_BUFFERING flag, "strlen" is what you will
want here.
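
To make the strlen/sizeof difference concrete, here is a small portable sketch. FormatHeader is a hypothetical helper, not part of the original code, and it uses snprintf rather than sprintf_s so that it builds outside Visual C++. It returns the number of text bytes actually produced, which is the value to pass as the write length on a normally buffered handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Format the header line into `out` and return the number of text bytes
// produced (excluding the terminating NUL), or 0 if it did not fit.
// Hypothetical helper for illustration only.
std::size_t FormatHeader(char* out, std::size_t outSize,
                         const char* timeStamp) {
    assert(out != nullptr && outSize > 0);
    const int n = std::snprintf(out, outSize, "MyData: Date:%s\r\n", timeStamp);
    if (n < 0 || static_cast<std::size_t>(n) >= outSize) {
        return 0;
    }
    return static_cast<std::size_t>(n);
}
```

For a 128-byte szTemp, sizeof(szTemp) is always 128, so writing that many bytes also writes whatever follows the terminating NUL in the buffer. The returned length (equal to strlen(szTemp) on success) covers only the header text.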
