On Tue, Sep 29, 2015 at 6:05 AM, Ehsan Akhgari <ehsan....@gmail.com> wrote:
> On 2015-09-29 12:52 AM, Gregory Szorc wrote:
>
>> On Mon, Sep 28, 2015 at 6:45 PM, Ehsan Akhgari <ehsan....@gmail.com
>> <mailto:ehsan....@gmail.com>> wrote:
>>
>> On 2015-09-28 5:41 PM, Gregory Szorc wrote:
>>
>> When writing thousands of files in rapid succession, this 1+ms pause
>> (assuming synchronous I/O) piles up. Assuming a 1ms pause, writing
>> 100,000 files spends 100s in CloseFile()! The process profile also
>> shows the bulk of the time in CloseFile(), so this is a real hot spot.
>>
>>
>> There is no CloseFile() on Windows. Did you mean CloseHandle()?
>>
>>
>> While this is probably something I should know, I confess to blindly
>> copying results from Sysinternals' procmon utility, which reports file
>> closes as the "CloseFile()" "operation." I reckon it is being
>> intelligent and converting CloseHandle() to something more useful for
>> reporting purposes. In my defense, procmon does report "operations" that
>> I know are actual Windows functions. Kinda weird it is inconsistent. Who
>> knows.
>>
>
> Fair! Honestly I haven't used procmon in years, I don't even remember it
> having any profiling tools when I last saw it. :-) But it probably tracks
> which handles are being passed to CloseHandle().
>
It has some very limited profiling tools built in. I had to dump the output
and write a script to perform the analysis I needed :) It does in fact
track various arguments so you can get filename-level activity for all I/O
operations.
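
For illustration only, here is a minimal sketch of that kind of analysis script, not the one actually used. It assumes the procmon events were exported to CSV with "Operation", "Path" and "Duration" columns (the column names are an assumption; adjust them to match the real export), and the sample rows are made up.

```python
import csv
import io
from collections import defaultdict


def summarize_by_operation(csv_text):
    """Return {operation: (event_count, total_duration_seconds)}.

    Assumes a header row containing "Operation" and "Duration" columns.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = totals[row["Operation"]]
        entry[0] += 1                       # number of events
        entry[1] += float(row["Duration"])  # accumulated time in seconds
    return {op: (count, total) for op, (count, total) in totals.items()}


# Made-up sample rows standing in for a real export.
SAMPLE = r'''"Operation","Path","Duration"
"CreateFile","C:\tmp\a","0.0002"
"CloseFile","C:\tmp\a","0.0011"
"CreateFile","C:\tmp\b","0.0003"
"CloseFile","C:\tmp\b","0.0013"
'''

summary = summarize_by_operation(SAMPLE)
# List the most expensive operations first.
for op, (count, total) in sorted(summary.items(),
                                 key=lambda kv: kv[1][1], reverse=True):
    print(f"{op:12} {count:6d} events {total:.4f}s")
```

Grouping on "Path" instead of "Operation" would give the filename-level view.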
>
> The reason I'm asking is that CloseHandle() can close various types
>> of kernel objects, and if that is showing up in profiles, it's worth
>> verifying that the handle passed to it is actually coming from
>> CreateFile(Ex).
>>
>>
>> Procmon is reporting lots of CreateFile() calls. And I'm 100% certain
>> the underlying C code is calling CreateFile().
>>
>
> Good. I'm assuming you mean CreateFile() directly, not wrappers such as
> _open or fopen.
>
We're calling CreateFile() or CreateFileA() directly. However...
>
> Closing handles on a background thread doesn't help with performance
>> if you're invoking sub-processes that need to close a handle and
>> wait for the operation to finish. It would help if you provided
>> more details on the constraints you're dealing with, e.g., where do
>> these handles come from? Are they being created by one long running
>> process or by several short lived ones? etc. Another idea to
>> experiment with is leaking the handles and letting the kernel close
>> them for you when your process is terminated. I _think_ (but I'm
>> not sure) that won't count towards the handle of the process to
>> become signaled so if you're spawning a process that needs to close
>> the file and wait for that to finish, that may be faster.
>>
>>
>> I'm dealing with a single threaded single long-running process that
>> performs synchronous I/O, 1 open file at a time. CreateFile,
>> CloseHandle, CreateFile, CloseHandle, ... I'm pretty sure leaking
>> handles is out of the question, as we need to write to thousands or even
>> tens of thousands of files and this will exhaust open files limits.
>>
>
> You'd be surprised. :-)
>
> Windows doesn't really have a notion of open file limits similar to Unix.
> File handles opened using _open can go up to a maximum of 2048. fopen has a
> cap of 512 which can be raised up to 2048 using _setmaxstdio(). *But*
> these are just CRT limits, and if you use Win32 directly, you can open up
> to 2^24 handles all at once
> <https://technet.microsoft.com/en-us/library/bb896645.aspx>. Since we
> will never need to open that many file handles, you may very well be able
> to use this approach.
>
I experimented with a background thread for just processing file closes.
This drastically increases performance! However, the queue periodically
accumulates and I was seeing errors for too many open files - despite using
CreateFile()! We do make a call to _open_osfhandle() after CreateFile().
I'm guessing the file limit is on file descriptors (not handles) and
_open_osfhandle() triggers the 512 default ceiling? This call is necessary
because Python file objects speak in terms of file descriptors. Not calling
_open_osfhandle() would mean re-implementing Python's file object, which
I'm going to say is too much work for the interim.

Buried in that last paragraph is that a background thread closing files
resulted in significant performance wins - ~5:00 wall on an operation that
was previously ~16:00 wall! And I'm pretty sure it would go faster if
multiple closing threads were used. Still not as fast as Linux. But much
better than the 3x increase from before.
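
For reference, a minimal sketch of the close-on-a-background-thread idea in plain Python. This is not the actual patch: the BackgroundCloser class, its parameters and the default of 256 pending files are all hypothetical. The bounded queue is one possible way to avoid the "too many open files" errors described above, on the assumption that they come from too many files waiting to be closed at once. When the queue is full the writer blocks until a closer thread catches up.

```python
import os
import queue
import tempfile
import threading


class BackgroundCloser:
    """Close file objects on worker threads (hypothetical helper)."""

    def __init__(self, threads=1, max_pending=256):
        # Bounded queue: put() blocks once max_pending files are waiting,
        # which caps how many descriptors are open at the same time.
        self._queue = queue.Queue(maxsize=max_pending)
        self.errors = []  # close() failures, collected for inspection
        self._workers = [threading.Thread(target=self._run, daemon=True)
                         for _ in range(threads)]
        for worker in self._workers:
            worker.start()

    def _run(self):
        while True:
            f = self._queue.get()
            if f is None:  # sentinel: no more files for this worker
                return
            try:
                f.close()
            except OSError as exc:
                self.errors.append(exc)

    def close(self, f):
        """Hand a file object to the closer threads."""
        self._queue.put(f)

    def shutdown(self):
        """Wait until every queued file has been closed."""
        for _ in self._workers:
            self._queue.put(None)  # one sentinel per worker
        for worker in self._workers:
            worker.join()


# Usage: write many small files, deferring each close to the workers.
with tempfile.TemporaryDirectory() as tmpdir:
    closer = BackgroundCloser(threads=2, max_pending=100)
    for i in range(1000):
        f = open(os.path.join(tmpdir, f"{i}.txt"), "wb")
        f.write(b"data")
        closer.close(f)
    closer.shutdown()
```

With max_pending=100 and two workers, at most roughly a hundred descriptors are open at any moment, well under the 512 figure guessed at above.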