The upcoming new major release of Cygwin finally supports long path
names of up to 32K chars in length. Our coreutils maintainer has already
created a new, matching release and reported that a certain test in the
coreutils testsuite apparently shows quadratic timing behaviour
depending on the directory path length or the number of directory
levels.
Naturally I tried to find the problem in Cygwin's path handling but
after some testing it turned out that the NT call NtQueryFileAttributes
was the major culprit. I ran my tests on XP SP3 and the timing of
NtQueryFileAttributes got worse and worse the longer the path was.
However, further tests on other Windows NT versions showed that this
quadratic behaviour is restricted to Windows 2000, XP, and 2003, i.e.
NT 5.0, 5.1, and 5.2. Windows NT 4 as well as Vista/2008 (6.0) show
linear timing and are for that reason generally much faster than the 5.x
variants.
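One rough way to tell linear from quadratic growth in per-call timings is to compare the time at some depth d with the time at depth 2d: a ratio near 2 points to linear growth, a ratio near 4 to quadratic growth. The following is only a sketch of that check. The struct and function names are made up for illustration, and single measurements are noisy, so real data would need averaging over several runs.

```c
#include <stddef.h>

/* One measurement: tree depth and the time of one call in microseconds. */
struct sample {
    unsigned long depth;
    double usec;
};

/*
 * Return time(2d) / time(d) for the given depth d, or -1.0 if the
 * samples do not contain both depths (or time(d) is not positive).
 * A value near 2 suggests linear growth, a value near 4 quadratic growth.
 */
double doubling_ratio(const struct sample *s, size_t n, unsigned long d)
{
    double t1 = -1.0, t2 = -1.0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (s[i].depth == d)
            t1 = s[i].usec;
        else if (s[i].depth == 2 * d)
            t2 = s[i].usec;
    }
    if (t1 <= 0.0 || t2 < 0.0)
        return -1.0;
    return t2 / t1;
}
```

Checking several values of d and looking at how the ratio behaves is more telling than a single pair.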
I created a very simple test application which builds fine with Visual
C++ and gcc, see below. It creates a directory tree with a path length
of up to about 32K, with directory names of 9 chars, thus it creates up
to 3200 directory levels starting at C:. While doing that, it measures
the timing of the CreateDirectoryW call using the HiRes performance
counter and prints the current tree depth, the current path length, and
the time measured for the CreateDirectoryW call in microseconds. I ran
the tests on all systems from NT4 to Windows 2008 on equivalent virtual
hardware running on the same host system.
On that system under Windows 2008, creating the last directory level
3200 (32000 chars) takes about 2,500 us. On Windows XP, including the
latest service pack 3, creating that last directory level takes
500,000(!) us. NT4 and Vista give numbers similar to 2008; 2000, 2003
and 2003 R2 give numbers similar to XP.
The timing obviously depends on the number of path components. When
using really long directory names of 240 chars, Windows 2008 gets
slightly slower. The last dir level 134 (32166 chars) takes about 3,000
us. Windows XP on the other hand gets much faster. The last dir level
134 takes about 35,000 us.
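The level counts and path lengths quoted above follow from the loop condition in the test program (keep appending while the path is shorter than 32000 chars). Here is a small sketch of that arithmetic. The names are hypothetical, and it assumes that the 6-char prefix \\?\C: is counted and that the 240 chars per level include the leading backslash, which is what matches the reported 32166.

```c
#include <stddef.h>

struct tree_shape {
    unsigned long levels;   /* directory levels created */
    size_t final_len;       /* length of the deepest path in characters */
};

/*
 * Mirror the loop of the test program: append one component of
 * per_level characters (backslash included) for as long as the
 * current path is shorter than limit.
 */
struct tree_shape compute_tree_shape(size_t prefix_len, size_t per_level,
                                     size_t limit)
{
    struct tree_shape t = { 0, 0 };

    t.final_len = prefix_len;
    if (per_level == 0)     /* avoid an endless loop on bad input */
        return t;
    while (t.final_len < limit) {
        t.final_len += per_level;
        t.levels++;
    }
    return t;
}
```

With a prefix of 6 and a limit of 32000, 10 chars per level give 3200 levels and a final length of 32006, while 240 chars per level give 134 levels and a final length of 32166.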
What this test does *not* show is the fact that accessing a file or dir
the second time is generally very fast, probably due to caching.
However, this does not help a lot for, say, recursive deleting of a
directory tree.
To me this looks like a bug in the path handling of the NTFS driver in
all NT 5.x versions. While I assume that this won't be fixed anymore, I
thought that it won't hurt to ask, at least. So, here's the question:
Is there any chance that this can be fixed, at least for XP and Windows
2003?
If you want to test it yourself, just use the source below.
Corinna
=========== SNIP ============
#include <windows.h>
#include <stdio.h>
#include <wchar.h>

int
main (int argc, char **argv)
{
  WCHAR path[32768];     /* deepest directory created so far */
  WCHAR newpath[32768];  /* directory to create next */
  LARGE_INTEGER freq, cnt1, cnt2;
  unsigned long path_components = 0;

  QueryPerformanceFrequency (&freq);
  wcscpy (newpath, L"\\\\?\\C:");
  wcscpy (path, newpath);
  while (wcslen (newpath) < 32000)
    {
      wcscat (newpath, L"\\perftestd");
      /* Time only the CreateDirectoryW call itself. */
      QueryPerformanceCounter (&cnt1);
      if (!CreateDirectoryW (newpath, NULL))
        {
          /* wcslen returns size_t, so cast to match the format. */
          fprintf (stderr, "CreateDirectoryW failed at path length %lu\n",
                   (unsigned long) wcslen (newpath));
          break;
        }
      QueryPerformanceCounter (&cnt2);
      /* Output: tree depth, path length, elapsed microseconds. */
      printf ("%4lu,%5lu,%6lu\n", ++path_components,
              (unsigned long) wcslen (newpath),
              (unsigned long) (((cnt2.QuadPart - cnt1.QuadPart) * 1000000ULL)
                               / freq.QuadPart));
      wcscpy (path, newpath);
    }
  /* Remove all dirs since it's not nice to keep them. */
  while (path[wcslen (path) - 1] != L':')
    {
      RemoveDirectoryW (path);
      *wcsrchr (path, L'\\') = L'\0';
    }
  return 0;
}
=========== SNAP ============
--
Corinna Vinschen
Cygwin Project Co-Leader
Red Hat
Given the nature of this issue, it would require intensive
troubleshooting and a review of the Windows source code, which can be done
quickly and effectively with direct assistance from a Microsoft Support
Professional through Microsoft Product Support Services. I would recommend
that you contact Microsoft CSS for help with this issue. Maybe they can work
with the NTFS team to provide you with an official answer.
Additionally, I have searched the internal database for
NtQueryFileAttributes, but did not find any related performance issue
records.
You can contact Microsoft Product Support directly to discuss additional
support options you may have available, by contacting us at 1-(800)936-5800
or by choosing one of the options listed at:
http://www.microsoft.com/services/microsoftservices/srv_support.mspx
Thanks for your kind understanding.
Also, if any other experienced community member has comments to share, I
would be glad to hear them.
Best regards,
Jeffrey Tan
Microsoft Online Community Support
Delighting our customers is our #1 priority. We welcome your comments and
suggestions about how we can improve the support we provide to you. Please
feel free to let my manager know what you think of the level of service
provided. You can send feedback directly to my manager at:
msd...@microsoft.com.
This posting is provided "AS IS" with no warranties, and confers no rights.
Jeffrey Tan[MSFT] wrote:
> Hi Corinna ,
>
> Given the nature of this issue, it would require intensive
> troubleshooting and a review of the Windows source code, which can be done
> quickly and effectively with direct assistance from a Microsoft Support
> Professional through Microsoft Product Support Services. I would recommend
> that you contact Microsoft CSS for help with this issue. Maybe they can work
> with the NTFS team to provide you with an official answer.
Well, it's a generic timing problem in all NTFS 5.x versions; it's
not exactly a personal problem for me or my company. But I see your
point. Should I find the time to look further into this issue, can
I handle this using one of my MSDN subscription support incidents?
> Additionally, I have searched the internal database for
> NtQueryFileAttributes, but did not find any related performance issue
> records.
Sorry, I screwed it up again :( The function is called
NtQueryAttributesFile or ZwQueryAttributesFile.
But that's just the function Cygwin happens to call. The actual call
doesn't matter. The quadratic timing behaviour on NT 5.x shows up with
*any* file access function used on long path names. The very simple
testcase attached to my original posting shows it by using
CreateDirectoryW, because that's the easiest way to show it.
Thanks,
Corinna
Hmm, quadratic behaviour... Can it be that each path component
(or directory level) is checked in both long and short variants?
Regards,
--PA
"Corinna Vinschen" <cor...@community.nospam> wrote in message
news:fv9gbq$rg9$2...@perth.hirmke.de...
Oh, sorry, I should have checked NtQueryFileAttributes before searching.
Yes, I see that this should be a generic problem affecting many file
operations on NT 5.x systems. However, since we have no official channel
for reporting bugs to the NTFS team, contacting Microsoft CSS (you can use
one of your MSDN subscription support incidents) should be a better way to
report it. (Actually, I sent an email to the NTFS team yesterday, but I
have not received any response yet. I assume they are still discussing
another TxF issue of yours.)
Once this issue is confirmed as a bug by Microsoft CSS, the support
incident will be free. That means no support incident will be deducted from
your MSDN subscription.
Finally, I have searched the internal database for "NtQueryAttributesFile"
in connection with performance issues; I got a lot of records with
"NtQueryAttributesFile" in the stack trace, but none of them are about our
issue. I suspect this issue has not been reported yet.
Thanks.
Best regards,
Jeffrey Tan
Microsoft Online Community Support
"Jeffrey Tan[MSFT]" wrote:
> Hi Corinna,
>
> Oh, sorry, I should have checked NtQueryFileAttributes before searching.
>
> Yes, I see that this should be a generic problem affecting many file
> operations on NT 5.x systems. However, since we have no official channel
> for reporting bugs to the NTFS team, contacting Microsoft CSS (you can use
> one of your MSDN subscription support incidents) should be a better way to
> report it. (Actually, I sent an email to the NTFS team yesterday, but I
> have not received any response yet. I assume they are still discussing
> another TxF issue of yours.)
>
> Once this issue is confirmed as a bug by Microsoft CSS, the support
> incident will be free. That means no support incident will be deducted from
> your MSDN subscription.
>
> Finally, I have searched the internal database for "NtQueryAttributesFile"
> in connection with performance issues; I got a lot of records with
> "NtQueryAttributesFile" in the stack trace, but none of them are about our
> issue. I suspect this issue has not been reported yet.
Ok, thanks. I suspect that not many people are trying to use paths up
to 32K length so far, given the artificial restrictions of the Win32
API (like, for instance, the restriction of the working directory to
260 chars).
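For readers who have not used paths beyond the 260 char limit: the wide-character Win32 file functions accept extended-length paths only when the path is absolute and starts with the \\?\ prefix, as the test program in the original posting does. The helper below is just an illustrative sketch, not code from Cygwin. The function name is made up, and it only handles drive-letter paths of the form X:\..., not UNC paths.

```c
#include <stddef.h>
#include <wchar.h>

/*
 * Copy an absolute drive path such as C:\dir\file into out, adding the
 * \\?\ prefix if it is missing. Returns 0 on success, -1 if out is too
 * small or the input is neither prefixed nor of the form X:\...
 */
int add_long_prefix(const wchar_t *in, wchar_t *out, size_t out_chars)
{
    static const wchar_t prefix[] = L"\\\\?\\";
    const size_t plen = 4;
    size_t ilen = wcslen(in);

    if (wcsncmp(in, prefix, plen) == 0) {
        /* Already prefixed: plain copy. */
        if (ilen + 1 > out_chars)
            return -1;
        wcscpy(out, in);
        return 0;
    }
    if (ilen < 3 || in[1] != L':' || in[2] != L'\\')
        return -1;              /* relative or otherwise unsupported */
    if (plen + ilen + 1 > out_chars)
        return -1;
    wcscpy(out, prefix);
    wcscat(out, in);
    return 0;
}
```

Note that prefixed paths are passed to the file system more or less as they are, so they have to use backslashes and must not contain "." or ".." components.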
That might be possible. I have no idea what causes that behaviour,
but the testcase in my original posting clearly shows that
this is not a Cygwin or coreutils issue. It's a plain NT 5.x
problem.
Corinna
Alexander Grigoriev wrote:
> Do you by any chance have an antivirus on your XP test machine (which
> exhibits the problem)?
No. I'm running stock versions of all OSes in VMs with no 3rd party
software installed (except Cygwin, of course) for testing purposes.
As outlined in my OP, the problem can be reproduced with all NT 5.x
releases from Windows 2000 up to 2003 R2.
I've noticed something very similar while compiling a quite complex
source tree under WinXP and 2003.
My instinct was to keep the tree flat and directory names short,
but the management (of linuxoidal origin) insisted on pushing everything
to Windows' limits.
We noticed a significant increase in build time that we couldn't explain
then. So maybe it was exactly this symptom. Good to know it's fixed in NT6.
Regards,
--PA
Corinna Vinschen wrote:
> "Jeffrey Tan[MSFT]" wrote:
>> Hi Corinna,
>>
>> Oh, sorry, I should have checked NtQueryFileAttributes before searching.
>>
>> Yes, I see that this should be a generic problem affecting many file
>> operations on NT 5.x systems. However, since we have no official channel
>> for reporting bugs to the NTFS team, contacting Microsoft CSS (you can use
>> one of your MSDN subscription support incidents) should be a better way to
>> report it. (Actually, I sent an email to the NTFS team yesterday, but I
>> have not received any response yet. I assume they are still discussing
>> another TxF issue of yours.)
Was there any response from the NTFS team yet? Did you send them the
simple testcase source code from my original posting?
It's surely a good thing that NT 6 doesn't have this problem anymore,
but Windows XP and 2003 will remain in use for a long time...
Thanks for your feedback.
Sorry, no response yet. That's why I recommend that you contact Microsoft
CSS case support to report this bug, because they have a more direct
channel to the product team. Thanks.
Will do at some point. I was just hoping there was already some input
which might help to get along quicker when contacting CSS.
Yes, I am hoping so too. However, it seems that I get very few email
responses from the product team to your questions; I think this is because
your questions are mostly non-trivial (it may cost them a lot of time
to find the root cause) and need a high level of technical
knowledge :-).
If you get the root cause or a confirmation from the CSS/product team,
please feel free to share it in the newsgroup. Thanks!
"Jeffrey Tan[MSFT]" wrote:
> Hi Corinna,
>
> Yes, I am hoping so too. However, it seems that I get very few email
> responses from the product team to your questions; I think this is because
> your questions are mostly non-trivial (it may cost them a lot of time
> to find the root cause) and need a high level of technical
> knowledge :-).
>
> If you get the root cause or a confirmation from the CSS/product team,
> please feel free to share it in the newsgroup. Thanks!
Yes, I'll certainly share the results with the newsgroup.
It took a long time, but here we go.
No debugging, no root cause, no fix. The guys responsible for the file
system are of the opinion that the bad timing behaviour is the result of
a natural evolution of the file system over time, and not actually a
bug. Case closed. :(
"Corinna Vinschen" <cor...@community.nospam> wrote in message
news:g4b1r1$hrn$1...@perth.hirmke.de...
I can't really tell, but I assume he had no source access. He did
reproduce the problem and, from what he told me, followed up with the
people working on kernel and file system. The reply is supposedly from
them. I tried multiple times to make my point clear that quadratic
timing behaviour in the FS should be considered a bug, but to no avail.
Even *if* it had been considered a bug, there would probably be no
fix because "it is a very niche case and has a high "risk" factor as it
could involve changing very fundamental and stable parts of the
operating system." From the whole discussion I'm under the impression
that nobody even really tried to find the root cause. I guess that it's
just not important enough, given that 99% of the users are only
using paths up to 260 chars anyway.
What do you mean by "Time to raise a stink" and how would I do that?
I have written code to create pathnames using paths that repeatedly
recurse down a string that just gets repeated as a subdirectory entry until
I get 20 or 30 levels deep, then using streams to get a total of about 20K
characters in the full path name. It is a real pain to manually delete the
tree since most of the OS-provided tools don't work at that depth/length.
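For a tree that is a single chain of directories, the cleanup loop in the test program from the original posting is enough: remove the deepest directory, cut the last component off the path, repeat until only the drive is left. The path-trimming step can be sketched as follows. The function name is made up, and unlike the loop in the posting it checks the result of wcsrchr.

```c
#include <wchar.h>

/*
 * Cut the last component off path in place. Returns 1 if a component
 * was removed, 0 if path is empty, already ends at the drive (last
 * character is ':') or contains no backslash.
 */
int strip_last_component(wchar_t *path)
{
    size_t len = wcslen(path);
    wchar_t *sep;

    if (len == 0 || path[len - 1] == L':')
        return 0;
    sep = wcsrchr(path, L'\\');
    if (sep == NULL)
        return 0;
    *sep = L'\0';
    return 1;
}
```

On Windows one would call RemoveDirectoryW (path) before each call, starting from the deepest \\?\ path. A tree with more than one entry per directory needs a real recursive walk instead, for instance with FindFirstFileW and FindNextFileW.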
"Corinna Vinschen" <cor...@community.nospam> wrote in message
news:g4dg1f$k3i$1...@perth.hirmke.de...
Ok, I have just re-opened the case. I have no idea if anything comes out
of it, but I'll try.
Nope. I don't know what else to try. Apparently there's no chance that
this will be fixed. The final reply is that this is not a security
related problem and that a workaround for affected users exists. The
proposed workaround is to upgrade from 2000/XP to Vista and from
Server 2003 to Server 2008.