PHP and Unix-style path strings

Lewis G Rosenthal

Nov 9, 2009, 10:35:56 PM
to Apache2 Mailing List
Greetings, all...

I ran into an issue with MediaWiki over the weekend that I wanted to
share with the group, in case it came up to bite someone else. :-)

This has been reported in the MediaWiki forums, where a number of people
have had mysterious file upload failures, typically resulting in a
message similar to "unable to create directory /public/7/7b."

As it turns out, in includes/GlobalFunctions.php, there is a subroutine
to create directory paths which is not handling the slashes well for
non-Unix systems (under MW 1.15.0 & 1.15.1, at least):

/**
 * Make directory, and make all parent directories if they don't exist
 *
 * @param string $dir Full path to directory to create
 * @param int $mode Chmod value to use, default is $wgDirectoryMode
 * @param string $caller Optional caller param for debugging.
 * @return bool
 */
function wfMkdirParents( $dir, $mode = null, $caller = null ) {
    global $wgDirectoryMode;

    if ( !is_null( $caller ) ) {
        wfDebug( "$caller: called wfMkdirParents($dir)" );
    }

    if( strval( $dir ) === '' || file_exists( $dir ) )
        return true;

    if ( is_null( $mode ) )
        $mode = $wgDirectoryMode;

    return mkdir( $dir, $mode, true ); // PHP5 <3
}

The fix is to insert before the return statement something to swap the
slashes:

$dir = str_replace('/', '\\', $dir);
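
A hard-coded swap like this would also fire on Unix-like hosts, where "\" is a legal filename character, so a guarded variant may be safer. The following is only a sketch: toNativeSeparators() is a hypothetical helper, not part of MediaWiki, and it assumes the PHP build reports a backslash in DIRECTORY_SEPARATOR on the affected platform.

```php
<?php
// Hypothetical helper, not part of MediaWiki: convert forward slashes to the
// platform's native separator before the path is handed to mkdir().
// Assumption: DIRECTORY_SEPARATOR is '\' on the platform that needs the fix.
function toNativeSeparators( $dir, $separator = DIRECTORY_SEPARATOR ) {
    if ( $separator === '/' ) {
        // Unix-like host: nothing to do, and '\' may be part of a real name.
        return $dir;
    }
    return str_replace( '/', $separator, $dir );
}

// Passing the separator explicitly shows the result on any host.
echo toNativeSeparators( '/public/7/7b', '\\' ), "\n"; // \public\7\7b
echo toNativeSeparators( '/public/7/7b', '/' ), "\n";  // /public/7/7b
```

Inside wfMkdirParents(), the call would go just before the return statement, in place of the bare str_replace().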

So far, the file upload & delete functions have been the only ones I've
seen so affected. BTW, this was not my hack; I stumbled upon it by sheer
luck (okay, hard hitting, investigative work):
http://www.mediawiki.org/wiki/Project:Support_desk/Archives/Uploading/002#Potential_bug_in_FSRepo.php_causes_image_uploads_to_fail_on_Windows_server:_Internal_error_-_Could_not_create_directory


What surprised me was the fact that mkdir is evidently getting passed to
the shell directly, and there is no check on the delimiter used by the
shell. Older MW code (e.g., 1.13.2) *does* appear to check to see which
path delimiter is in use.

Anyway, just wanted to share the wealth, as it were. Now back to your
regularly scheduled programming. ;-)

--
Lewis
-------------------------------------------------------------
Lewis G Rosenthal, CNA, CLP, CLE
Rosenthal & Rosenthal, LLC www.2rosenthals.com
Need a managed Wi-Fi hotspot? www.hautspot.com
Secure, stable, operating system www.ecomstation.com
-------------------------------------------------------------

Steven Levine

Nov 11, 2009, 6:36:42 PM
to apa...@googlegroups.com
In <4AF8DF9C...@2rosenthals.com>, on 11/09/09
at 10:35 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:

Hi,

>This has been reported in the MediaWiki forums, where a number of people
>have had mysterious file upload failures, typically resulting in a
>message similar to "unable to create directory /public/7/7b."

>The fix is to insert before the return statement something to swap the
>slashes:

> $dir = str_replace('/', '\\', $dir);

>http://www.mediawiki.org/wiki/Project:Support_desk/Archives/Uploading/002#Potential_bug_in_FSRepo.php_causes_image_uploads_to_fail_on_Windows_server:_Internal_error_-_Could_not_create_directory

>What surprised me was the fact that mkdir is evidently getting passed to
>the shell directly, and there is no check on the delimiter used by the
>shell.

I don't think this is what happened. mkdir() is a php built-in. The 3rd
arg tells it to make parents as needed. Mkdir is probably assuming that
the directory pathname has native path component separators and did not
split the components correctly.

There is probably a php function to normalize pathnames, but I don't
recall its name.

>Older MW code (e.g., 1.13.2) *does* appear to check to see which
>path delimiter is in use.

This does look like a regression.

Typically, the problem is the other way around. *nix code often has
problems with backslash path separators.

Steven

--
----------------------------------------------------------------------
"Steven Levine" <ste...@earthlink.net> eCS/Warp/DIY etc.
www.scoug.com www.ecomstation.com
----------------------------------------------------------------------

Lewis G Rosenthal

Nov 11, 2009, 8:53:01 PM
to apa...@googlegroups.com
Hey...

On 11/11/09 06:36 pm, Steven Levine thus wrote :
> In <4AF8DF9C...@2rosenthals.com>, on 11/09/09
> at 10:35 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:
>
> Hi,
>
>
>> This has been reported in the MediaWiki forums, where a number of people
>> have had mysterious file upload failures, typically resulting in a
>> message similar to "unable to create directory /public/7/7b."
>>
>
>
>> The fix is to insert before the return statement something to swap the
>> slashes:
>>
>
>
>> $dir = str_replace('/', '\\', $dir);
>>
>
>
>> http://www.mediawiki.org/wiki/Project:Support_desk/Archives/Uploading/002#Potential_bug_in_FSRepo.php_causes_image_uploads_to_fail_on_Windows_server:_Internal_error_-_Could_not_create_directory
>>
>
>
>> What surprised me was the fact that mkdir is evidently getting passed to
>> the shell directly, and there is no check on the delimiter used by the
>> shell.
>>
>
> I don't think this is what happened. mkdir() is a php built-in. The 3rd
> arg tells it to make parents as needed. Mkdir is probably assuming that
> the directory pathname has native path component separators and did not
> split the components correctly.
>
>
I thought that was the function getting called as well, but in that
case, wouldn't our port tolerate (translate) the slashes?

In fact, there are some interesting comments on mkdir() in the php
manual: http://php.net/manual/en/function.mkdir.php
> There is probably a php function to normalize pathnames, but I don't
> recall its name.
>
>
The constant is "DIRECTORY_SEPARATOR". The function is realpath():
http://us.php.net/manual/en/function.realpath.php
> Older MW code (e.g., 1.13.2) *does* appear to check to see which
> path delimiter is in use.
>
> This does look like a regression.
>
>
Indeed. The code is quite different between the two versions.
> Typically, the problem is the other way around. *nix code often has
> problems with backslash path separators.
>
As "\" is a legal character in a Unix-like filename, and neither "/" nor
"\" are legal under DOS, it's easy to see who must do the greatest
amount of adjusting. :-)

Cheers, and thanks for the reminder about the mkdir() function. I had
been under the impression that this passed to the shell, though as not
all shells are capable of creating recursive directories with one
command, I should have thought about this some more...!

Steven Levine

Nov 12, 2009, 3:05:25 PM
to apa...@googlegroups.com
In <4AFB6A7D...@2rosenthals.com>, on 11/11/09
at 08:53 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:

Hey/2,

>I thought that was the function getting called as well, but in that
>case, wouldn't our port tolerate (translate) the slashes?

It would, if the code was fully ported. See main\streams\plain_wrapper.c.
Paul needs to tweak the ifdef near line 1137. I recommend you submit a
ticket at mantis.smedley.info.

>The constant is "DIRECTORY_SEPARATOR" . The function is realpath():
>http://us.php.net/manual/en/function.realpath.php

That would work, if it did not fail when the path does not exist.
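
Since realpath() returns false for a path that does not exist yet, a purely string-based normalizer is one way around that. This is a rough sketch only: normalizePath() is a hypothetical helper, not a PHP built-in, and it deliberately ignores drive letters and UNC prefixes.

```php
<?php
// Hypothetical helper: normalize separators and resolve "." and ".." purely as
// string operations, so it also works for paths that do not exist yet.
// Drive letters and UNC prefixes are deliberately not handled here.
function normalizePath( $path, $separator = DIRECTORY_SEPARATOR ) {
    $unified = str_replace( '\\', '/', $path );
    $absolute = ( $unified !== '' && $unified[0] === '/' );
    $parts = array();
    foreach ( explode( '/', $unified ) as $segment ) {
        if ( $segment === '' || $segment === '.' ) {
            continue; // skip empty segments (doubled slashes) and "."
        }
        if ( $segment === '..' ) {
            if ( count( $parts ) > 0 && end( $parts ) !== '..' ) {
                array_pop( $parts ); // step back one level
            } elseif ( !$absolute ) {
                $parts[] = '..'; // keep leading ".." on relative paths
            }
            continue;
        }
        $parts[] = $segment;
    }
    return ( $absolute ? $separator : '' ) . implode( $separator, $parts );
}

echo normalizePath( '/public/7/../7/7b', '\\' ), "\n"; // \public\7\7b
echo normalizePath( 'a//b/./c', '/' ), "\n";           // a/b/c
```

The result can then be passed to mkdir() with the recursive flag, since nothing here touches the filesystem.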

>As "\" is a legal character in a Unix-like filename, and neither "/" nor
>"\" are legal under DOS, it's easy to see who must do the greatest
>amount of adjusting. :-)

The slashes can be normalized easily. Drive letters are much more evil.
*nix code is going to use the if (*buf == '/') paradigm to check for
absolute paths and it's easy to miss these when porting.
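
As a rough illustration of what the ported check has to cover, here is the same test written in PHP. isAbsolutePath() is a hypothetical helper, not an existing PHP or MediaWiki function, and it treats a leading separator without a drive letter as absolute even though on DOS-like systems that is really the root of the current drive.

```php
<?php
// Hypothetical helper: a portable stand-in for the Unix-only test
// "first character is '/'".
function isAbsolutePath( $path ) {
    if ( $path === '' ) {
        return false;
    }
    // Leading slash or backslash: absolute on Unix, root of the current
    // drive on DOS-like systems.
    if ( $path[0] === '/' || $path[0] === '\\' ) {
        return true;
    }
    // Drive letter, colon, then a separator: "C:\dir" or "C:/dir".
    return preg_match( '#^[A-Za-z]:[\\\\/]#', $path ) === 1;
}

var_dump( isAbsolutePath( '/usr/local' ) );  // bool(true)
var_dump( isAbsolutePath( 'C:\\wiki' ) );    // bool(true)
var_dump( isAbsolutePath( 'docs/readme' ) ); // bool(false)
```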

>been under the impression that this passed to the shell, though as not
>all shells are capable of creating recursive directories with one
>command, I should have thought about this some more...!

The real reason for not using the shell is the nasty performance hit.

Lewis G Rosenthal

Nov 12, 2009, 3:49:04 PM
to apa...@googlegroups.com
Hey/2, to you, too... :-)

On 11/12/09 03:05 pm, Steven Levine thus wrote :
> In <4AFB6A7D...@2rosenthals.com>, on 11/11/09
> at 08:53 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:
>
> Hey/2,
>
>
>> I thought that was the function getting called as well, but in that
>> case, wouldn't our port tolerate (translate) the slashes?
>>
>
> It would, if the code was fully ported. See main\streams\plain_wrapper.c.
> Paul needs to tweak the ifdef near line 1137. I recommend you submit a
> ticket at mantis.smedley.info.
>
>
It would help if I actually perused the code once in a while, now
wouldn't it? :-) Good suggestion; I'll follow up on that. Thanks.
>> The constant is "DIRECTORY_SEPARATOR". The function is realpath():
>> http://us.php.net/manual/en/function.realpath.php
>>
>
> That would work, if it did not fail when the path does not exist.
>
>
Indeed.
>> As "\" is a legal character in a Unix-like filename, and neither "/" nor
>> "\" are legal under DOS, it's easy to see who must do the greatest
>> amount of adjusting. :-)
>>
>
> The slashes can be normalized easily. Drive letters are much more evil.
> *nix code is going to use the if (*buf == '/') paradigm to check for
> absolute paths and it's easy to miss these when porting.
>
>
True. Drive letters are a pain, but easily sniffed out when looking for
a ":" separator, I guess, although IIRC, ":" is a legal character in a
*nix name.
>> been under the impression that this passed to the shell, though as not
>> all shells are capable of creating recursive directories with one
>> command, I should have thought about this some more...!
>>
>
> The real reason for not using the shell is the nasty performance hit.
>
>
That makes perfect sense, too, not to mention the fact that all shells
have not been created equal(ly). Much as reinventing the wheel is more
work, at least one knows the capabilities and limitations of such a
function instead of leaving it up to whatever lies beneath. ;-)

Steven Levine

Nov 12, 2009, 5:02:06 PM
to apa...@googlegroups.com
In <4AFC74C0...@2rosenthals.com>, on 11/12/09
at 03:49 PM, Lewis G Rosenthal <lgros...@2rosenthals.com> said:

Hi/2,

>It would help if I actually perused the code once in a while, now
>wouldn't it? :-)

Not enough hours in the day. You just managed to get me curious.


>True. Drive letters are a pain, but easily sniffed out when looking for
>a ":" separator, I guess,

Not as easily as you might think. If you take a look at the .diff for the
rsync port (it's attached to a mantis ticket), you'll notice a large number
of indirect drive letter checks. There's also the issue that relative
paths of the form x:path need special handling.
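
To make the x:path case concrete, here is a small sketch in PHP. splitDrive() is a hypothetical helper made up for illustration, not something taken from the rsync port or from PHP itself.

```php
<?php
// Hypothetical helper: split an optional drive prefix from a path and report
// whether the remainder is rooted. "x:path" is relative to the current
// directory of drive x:, so it is neither absolute nor a plain relative path.
function splitDrive( $path ) {
    $drive = '';
    $rest = $path;
    if ( preg_match( '#^([A-Za-z]):(.*)$#s', $path, $m ) === 1 ) {
        $drive = strtoupper( $m[1] ); // drive letters are case-insensitive
        $rest = $m[2];
    }
    $rooted = ( $rest !== '' && ( $rest[0] === '/' || $rest[0] === '\\' ) );
    return array( 'drive' => $drive, 'rest' => $rest, 'rooted' => $rooted );
}

$info = splitDrive( 'x:path' );
var_dump( $info['drive'] );  // string(1) "X"
var_dump( $info['rooted'] ); // bool(false)
```

A caller that sees a drive with rooted set to false has to combine the remainder with that drive's current directory, which is the special handling a simple ":" sniff misses.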

>although IIRC, ":" is a legal character in a
>*nix name.

Strictly speaking it is legal, but it's best avoided. It's used as the
PATH_SEPARATOR and as part of URL schemes.

>Much as reinventing the wheel is more
>work, at least one knows the capabilities and limitations of such a
>function instead of leaving it up to whatever lies beneath. ;-)

It's not really reinventing. It's more about creating an abstract
programming environment with identical performance on a variety of
platforms. The goal is that the php code should not need to know what the
underlying OS is.