I have an application where I need to write text and binary (numeric)
data to a file at ~200 MB/s (8MB/40ms). I am currently using fprintf
and fwrite, respectively, but I'm not achieving the desired
throughput. I've been looking into the Windows CreateFile based
methods instead.
I am using a PCI-X hardware RAID adapter with RAID0 across 4 drives. I
have benchmarked my system using h2benchw v3.6 with good results. For
my desired file sizes, anywhere from 8MB to 64MB (I'm flexible on how
many 8MB data sets are stored in a single file), I am able to achieve
up to 275MB/s.
I'm hoping it's possible to get close to that using alternatives to
the standard C options. Any suggestions?
For now I'm trying CreateFile/WriteFile, without success. My code
looks like this:
<SNIP>
HANDLE hWriteFile = NULL; // File handle
LPDWORD lpNumBytesWritten = NULL; // Number of bytes written (WriteFile)
char szTextToAppendToLog[1024]; // Temp char buffer
char szFilePath[1024]; // Output file name
char szTimeStamp[32]; // Time character string
char szTemp[128]; // Temp character string
// Open the output file for write.
hWriteFile = CreateFile(
szFilePath, // File path
GENERIC_WRITE, // Open for write
NULL, // Do not share
NULL, // Default security
CREATE_ALWAYS, // Overwrite existing files
FILE_FLAG_WRITE_THROUGH,//FILE_FLAG_OVERLAPPED, // Normal file
NULL); // No template
if (hWriteFile == INVALID_HANDLE_VALUE)
{
sprintf_s(szTextToAppendToLog,sizeof(szTextToAppendToLog),
"ERROR: Output file %s failed to open.",
szFilePath);
pThis->UI->AppendToStatLog(szTextToAppendToLog);
}
...
// Write ASCII header to file.
sprintf_s(szTemp, sizeof(szTemp), "MyData: Date:%s\r\n",szTimeStamp);
WriteFile(
hWriteFile,
szTemp,
(DWORD) sizeof(szTemp),
lpNumBytesWritten,
NULL);
</SNIP>
As soon as I hit WriteFile(...) my application hits an unhandled
exception:
"Unhandled exception at 0x7c810e0c in DataCapture.exe: 0xC0000005:
Access violation writing location 0x00000000."
I'm guessing I'm not using CreateFile correctly?
Thanks,
-Brandon
Tom
// Copy a file to the destination either creating a new one or adding
// to the one that is already there
#define BUF_SIZE (8192*2)
char buf[BUF_SIZE];
CRITICAL_ERROR CMyAppDlg::CopyFile(CString &csDestination, CString
&csSource, bool bTruncate)
{
// DWORD err;
HANDLE hFileInput;
HANDLE hFileOutput;
DWORD64 nBytesCompleted = 0;
if(m_bCancelCopy)
return CRITICAL_ABORT;
hFileOutput = CreateFile(csDestination, GENERIC_WRITE | GENERIC_READ, 0,
NULL,
bTruncate?CREATE_ALWAYS:OPEN_EXISTING,
FILE_FLAG_RANDOM_ACCESS, NULL);
if(hFileOutput == INVALID_HANDLE_VALUE)
return CRITICAL_ABORT;
hFileInput = CreateFile(csSource, GENERIC_READ, 0, NULL,
OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if(hFileInput == INVALID_HANDLE_VALUE) {
CloseHandle(hFileOutput);
return CRITICAL_ABORT;
}
if(!bTruncate) {
LARGE_INTEGER li;
li.QuadPart = 0;
SetFilePointerEx(hFileOutput, li, &li, FILE_END);
// err = GetLastError();
nBytesCompleted = li.QuadPart;
}
DWORD nBytesRead, nBytesWritten;
int nTimes = 20;
SetLastError(0);
while(ReadFile(hFileInput,buf,BUF_SIZE,&nBytesRead,NULL) && nBytesRead > 0)
{
if(WriteFile(hFileOutput,buf,nBytesRead,&nBytesWritten,NULL)) {
nBytesCompleted += nBytesRead;
UpdateTotalProgress(m_nTotalBytesCompleted + nBytesCompleted,
m_nTotalBytesToCopy);
if(m_bCancelCopy) {
CloseHandle(hFileOutput);
CloseHandle(hFileInput);
return CRITICAL_ABORT;
}
if(--nTimes <= 0) {
GiveTime(); // Allow UI to update every so often
nTimes = 20;
}
}
else { // Write failed
DWORD nError = GetLastError();
if(nError != ERROR_SUCCESS) {
CloseHandle(hFileOutput);
CloseHandle(hFileInput);
return DisplayCriticalError(nError);
}
}
SetLastError(0);
}
DWORD nError = GetLastError();
if(nError != ERROR_SUCCESS) {
CloseHandle(hFileOutput);
CloseHandle(hFileInput);
return DisplayCriticalError(nError);
}
CloseHandle(hFileOutput);
CloseHandle(hFileInput);
return CRITICAL_NONE;
}
"Brandon" <kille...@gmail.com> wrote in message
news:1183068885.5...@q75g2000hsh.googlegroups.com...
I'd try opening the file with FILE_FLAG_NO_BUFFERING. If you do that, you'd
need to ensure that you always write blocks which are exact multiples of the
sector size and that the buffers themselves are aligned properly - see the
remarks for the flag in the help entry for CreateFile().
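
As a rough illustration of those two requirements, here is a minimal sketch of the arithmetic involved. The helper names RoundUpToSector and IsSectorAligned are made up for this example, and the 512-byte sector size used in the usage note is only an assumption; on Windows the real value can be queried with GetDiskFreeSpace, and VirtualAlloc is one way to obtain page-aligned memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: round byteCount up to the next multiple of sectorSize.
// sectorSize should be whatever the target volume actually reports.
std::size_t RoundUpToSector(std::size_t byteCount, std::size_t sectorSize)
{
    assert(sectorSize != 0);
    return ((byteCount + sectorSize - 1) / sectorSize) * sectorSize;
}

// Hypothetical helper: true if the buffer address is a multiple of sectorSize.
bool IsSectorAligned(const void* buffer, std::size_t sectorSize)
{
    assert(sectorSize != 0);
    return reinterpret_cast<std::uintptr_t>(buffer) % sectorSize == 0;
}
```

With an assumed 512-byte sector, RoundUpToSector(128, 512) gives 512, so a 128-byte header would have to be padded to a full sector before an unbuffered write could accept it.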
And while you are at it, you can benchmark both synchronous and asynchronous
operation. If you are wading in waters unknown to you, you may want to scan
an advanced text on Windows like any of Jeff Richter's books.
Regards,
Will
www.ivrforbeginners.com
The reason for the error is passing a NULL value for lpNumBytesWritten.
You must pass the address of a DWORD variable so WriteFile can fill in
the DWORD with a value.
For top speed you will almost certainly need to use FILE_FLAG_OVERLAPPED
and FILE_FLAG_NO_BUFFERING. Both require quite a bit of additional
code. The overlapped option will let you queue multiple writes, so the
driver will not have to return to you between writes, and it will let
you work on filling more buffers while the disk is being written. The
no buffering option eliminates some unnecessary copying of the data on
its way to the disk.
--
Scott McPhillips [MVP VC++]
Just wondering, but what throughput did you manage to achieve?
> I've been looking into the Windows CreateFile based methods instead.
This usually reduces the overhead a bit, because those are the native APIs
while fopen/fprintf/fwrite are just wrappers around them. However, even the
latter don't have to be slow, it still depends on how and what you are
doing.
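
One way the stdio route might be tuned, as a sketch only: give the stream a large, fully buffered stdio buffer and hand each data set to fwrite in a single call. The helper names OpenForBulkWrite and WriteDataSet are invented for this example, the buffer size is a tuning guess, and whether this helps at all depends on the C runtime in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: open path for binary writing and request a large,
// fully buffered stdio buffer. bufferSize is a tuning guess, not a known-good value.
std::FILE* OpenForBulkWrite(const char* path, std::size_t bufferSize)
{
    assert(path != nullptr && bufferSize > 0);
    std::FILE* fp = std::fopen(path, "wb");
    if (fp == nullptr)
        return nullptr;
    // setvbuf must be called before any other operation on the stream.
    // A null buffer asks the library to allocate its own; the size is then a hint.
    if (std::setvbuf(fp, nullptr, _IOFBF, bufferSize) != 0) {
        std::fclose(fp);
        return nullptr;
    }
    return fp;
}

// Hypothetical helper: write one whole data set with a single fwrite call.
// Returns true only if every byte was accepted.
bool WriteDataSet(std::FILE* fp, const void* data, std::size_t size)
{
    assert(fp != nullptr && data != nullptr);
    return std::fwrite(data, 1, size, fp) == size;
}
```

A caller would open the file once, call WriteDataSet for each 8 MB block, and fclose the stream at the end; the return value of fclose is worth checking too, since buffered data is flushed at that point.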
> char szTextToAppendToLog[1024]; // Temp char buffer
> char szFilePath[1024]; // Output file name
> char szTimeStamp[32]; // Time character string
> char szTemp[128]; // Temp character string
I'm always wondering if hackers think that by using powers of two their
programs will somehow magically work correctly... (SCNR)
> hWriteFile = CreateFile(
> szFilePath, // File path
> GENERIC_WRITE, // Open for write
> NULL, // Do not share
> NULL, // Default security
> CREATE_ALWAYS, // Overwrite existing files
> FILE_FLAG_WRITE_THROUGH,//FILE_FLAG_OVERLAPPED, // Normal file
> NULL); // No template
One thing here: FILE_FLAG_WRITE_THROUGH means that this function will not do
any buffering. If you have short bursts of data to write, this will impact
performance negatively.
> if (hWriteFile == INVALID_HANDLE_VALUE)
> {
> sprintf_s(szTextToAppendToLog,sizeof(szTextToAppendToLog),
> "ERROR: Output file %s failed to open.",
> szFilePath);
> pThis->UI->AppendToStatLog(szTextToAppendToLog);
> }
You should throw an exception here, continuing here is just plain wrong.
> // Write ASCII header to file.
> sprintf_s(szTemp, sizeof(szTemp), "MyData: Date:%s\r\n",szTimeStamp);
> WriteFile(
> hWriteFile,
> szTemp,
> (DWORD) sizeof(szTemp),
> lpNumBytesWritten,
> NULL);
Apart from the NULL-pointer for the number of written bytes, Here might be a
reason for performance problems. The problem here is that you are writing
short pieces of data but without intermediate buffering. Also, I'd suggest
not doing any C-style casts ("(DWORD) sizeof(szTemp)") because those bear
the danger of hiding errors.
Uli
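
To make the intermediate-buffering point concrete, here is a minimal sketch that assembles the ASCII header and the binary payload into one contiguous buffer, so that a single large write can be issued instead of several short ones. BuildRecord is a hypothetical name, and snprintf is used as a portable stand-in for the sprintf_s call in the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical helper: build one contiguous record holding the ASCII header
// followed by the binary payload. Returns an empty vector if the header
// could not be formatted or did not fit.
std::vector<char> BuildRecord(const char* timeStamp,
                              const void* payload, std::size_t payloadSize)
{
    assert(timeStamp != nullptr);
    assert(payload != nullptr || payloadSize == 0);

    char header[128];
    // snprintf stands in for sprintf_s here so the sketch is portable.
    int n = std::snprintf(header, sizeof(header),
                          "MyData: Date:%s\r\n", timeStamp);
    if (n < 0 || static_cast<std::size_t>(n) >= sizeof(header))
        return std::vector<char>();

    const std::size_t headerLen = static_cast<std::size_t>(n);
    std::vector<char> record(headerLen + payloadSize);
    std::memcpy(record.data(), header, headerLen);   // text bytes only, no terminator
    if (payloadSize != 0)
        std::memcpy(record.data() + headerLen, payload, payloadSize);
    return record;
}
```

The resulting record.data() and record.size() can then be passed to one WriteFile or fwrite call. Copying 8 MB per record has its own cost, so this is a trade-off to measure rather than a guaranteed win.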
40 MB/s. I'm opening and closing a file every 8 datasets, at 8 MB per
dataset. This takes approx. 1.6 s, so 64 MB/file / 1.6 s = 40 MB/s per file.
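
For comparison with the 200 MB/s target (8 MB every 40 ms), the arithmetic can be written out as a tiny helper. ThroughputMBps is a hypothetical name used only for this sketch.

```cpp
#include <cassert>

// Hypothetical helper: average throughput in MB/s for a given amount of data
// written in a given time.
double ThroughputMBps(double megabytes, double seconds)
{
    assert(seconds > 0.0);
    return megabytes / seconds;
}
```

ThroughputMBps(64.0, 1.6) comes to about 40 MB/s for the measured case, while ThroughputMBps(8.0, 0.040) comes to about 200 MB/s for the target, so the current fwrite path is roughly a factor of five short.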
> > I've been looking into the Windows CreateFile based methods instead.
>
> This usually reduces the overhead a bit, because those are the native APIs
> while fopen/fprintf/fwrite are just wrappers around them. However, even the
> latter don't have to be slow, it still depends on how and what you are
> doing.
>
> > char szTextToAppendToLog[1024]; // Temp char buffer
> > char szFilePath[1024]; // Output file name
> > char szTimeStamp[32]; // Time character string
> > char szTemp[128]; // Temp character string
>
> I'm always wondering if hackers think that by using powers of two their
> programs will somehow magically work correctly... (SCNR)
>
Well, I was always taught that it was a good practice to allocate
memory in powers of 2 along byte boundaries.
> > hWriteFile = CreateFile(
> > szFilePath, // File path
> > GENERIC_WRITE, // Open for write
> > NULL, // Do not share
> > NULL, // Default security
> > CREATE_ALWAYS, // Overwrite existing files
> > FILE_FLAG_WRITE_THROUGH,//FILE_FLAG_OVERLAPPED, // Normal file
> > NULL); // No template
>
> One thing here: FILE_FLAG_WRITE_THROUGH means that this function will not do
> any buffering. If you have short bursts of data to write, this will impact
> performance negatively.
Short bursts? Is 8MB/40 ms a short burst? 200 MB/s is pretty ambitious
imo, and not even attainable on a non RAID disk to my knowledge.
>
> > if (hWriteFile == INVALID_HANDLE_VALUE)
> > {
> > sprintf_s(szTextToAppendToLog,sizeof(szTextToAppendToLog),
> > "ERROR: Output file %s failed to open.",
> > szFilePath);
> > pThis->UI->AppendToStatLog(szTextToAppendToLog);
> > }
>
> You should throw an exception here, continuing here is just plain wrong.
Indeed, but I'm not such a good programmer and for now I'm not worried
about corner cases, I just want the performance.
>
> > // Write ASCII header to file.
> > sprintf_s(szTemp, sizeof(szTemp), "MyData: Date:%s\r\n",szTimeStamp);
> > WriteFile(
> > hWriteFile,
> > szTemp,
> > (DWORD) sizeof(szTemp),
> > lpNumBytesWritten,
> > NULL);
>
> Apart from the NULL-pointer for the number of written bytes, Here might be a
> reason for performance problems. The problem here is that you are writing
> short pieces of data but without intermediate buffering. Also, I'd suggest
> not doing any C-style casts ("(DWORD) sizeof(szTemp)") because those bear
> the danger of hiding errors.
>
> Uli
I'm not having performance problems with CreateFile/WriteFile. It just
doesn't work for me. I'm only having performance limitations with
fopen/fwrite.
Oh, didn't you hear? Recent research found that when you use
powers of two, the CPU's registers are massaged in a certain
way, which is very pleasant to the crystal. And, you know,
when the crystal is pleased, it can do wonders for you.
<SNIP>
// Write ASCII header to file.
sprintf_s(szTemp, sizeof(szTemp),
"MyData: Date:%s\r\n",
szTimeStamp);
if (! WriteFile(
hWriteFile,
szTemp,
strlen(szTemp),
&dwNumBytesWritten,
NULL) )
{
pThis->UI->AppendToStatLog("WARNING: FileWriteThread - Writing header line 1 to file!");
}
</SNIP>
How can we possibly know that without knowing what error you are getting?
There was nothing inherently wrong with the code you posted.
Unless, that is, you opened the file with FILE_FLAG_OVERLAPPED. If you
open a file overlapped, then EVERY I/O function must specify an overlap
structure.
--
Tim Roberts, ti...@probo.com
Providenza & Boekelheide, Inc.
I apologize. When I use FormatMessage(...) and GetLastError(),
WriteFile(...) is returning:
"The parameter is incorrect." Seems ambiguous...
> Unless, that is, you opened the file with FILE_FLAG_OVERLAPPED. If you
> open a file overlapped, then EVERY I/O function must specify an overlap
> structure.
I'm only opening the file with the FILE_FLAG_NO_BUFFERING flag.
Here's my current code snippet:
<SNIP>
// Write ASCII header to file.
sprintf_s(szTemp, sizeof(szTemp),
"MyData: Date:%s\r\n", szTimeStamp);
if (! WriteFile(
hWriteFile,
szTemp,
sizeof(szTemp),
&dwNumBytesWritten,
NULL) )
{
// Get the error message.
DWORD dwChars = FormatMessage(
FORMAT_MESSAGE_FROM_SYSTEM |
FORMAT_MESSAGE_ALLOCATE_BUFFER,
0,
GetLastError(),
MAKELANGID(LANG_NEUTRAL, SUBLANG_DEFAULT),
(LPSTR) &lpszErr,
0,
0);
// Append it to the log.
pThis->UI->AppendToStatLog(lpszErr);
--
=====================================
Alexander Nickolov
Microsoft MVP [VC], MCSD
email: agnic...@mvps.org
MVP VC FAQ: http://vcfaq.mvps.org
=====================================
"Brandon" <kille...@gmail.com> wrote in message
news:1183384445.6...@c77g2000hse.googlegroups.com...
Alexander pointed out the real reason for your error, but I wanted to point
out one other thing. In your previous snippet, you had "strlen(szTemp)",
not "sizeof(szTemp)". You need to be very aware of the difference between
them. Once you eliminate the NO_BUFFERING flag, "strlen" is what you will
want here.
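
A small sketch of that difference, assuming the same 128-byte szTemp buffer as in the snippets above. FormatHeader is a hypothetical helper, and snprintf is used in place of sprintf_s so the example is portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: format the header into dst and return how many bytes
// of dst actually hold text (what strlen reports), or 0 on failure.
std::size_t FormatHeader(char* dst, std::size_t dstSize, const char* timeStamp)
{
    assert(dst != nullptr && dstSize > 0 && timeStamp != nullptr);
    int n = std::snprintf(dst, dstSize, "MyData: Date:%s\r\n", timeStamp);
    if (n < 0 || static_cast<std::size_t>(n) >= dstSize)
        return 0;             // formatting error or truncated output
    return std::strlen(dst);  // bytes of real text, excluding the terminator
}
```

With char szTemp[128], sizeof(szTemp) is always 128 no matter what was formatted into it, whereas the value returned here is only the length of the header text. Passing sizeof(szTemp) to WriteFile would therefore also write the terminator and whatever unused bytes follow it in the buffer.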