Doubled backslashes in Windows paths

Oz-in-DFW

Oct 7, 2016, 3:59:04 AM
I'm using Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC
v.1900 32 bit (Intel)] on Windows 7

I'm trying to write some file processing that looks at file size,
extensions, and several other things and I'm having trouble getting a
reliably usable path to files.

The problem *seems* to be doubled backslashes in the path, but I've read
elsewhere that this is just an artifact of the way the interpreter
displays the strings.

I'm getting an error message on an os.path.getsize call;

Path: -
"C:\Users\Rich\Desktop\2B_Proc\2307e60da6451986dd8d23635b845386.jpg" -
Traceback (most recent call last):
  File "C:\Users\Rich\workspace\PyTest\test.py", line 19, in <module>
    if os.path.getsize(path)>10000:
  File "C:\Python32\lib\genericpath.py", line 49, in getsize
    return os.stat(filename).st_size
WindowsError: [Error 123] The filename, directory name, or volume
label syntax is incorrect:
'"C:\\Users\\Rich\\Desktop\\2B_Proc\\2307e60da6451986dd8d23635b845386.jpg"'

From (snippet)

path = '"'+dirpath+name+'"'
path = os.path.normpath(path)
print("Path: -",path,"-")
if os.path.getsize(path)>10000:
    print("Path: ",path," Size: ",os.path.getsize(dirpath+name))

but if I manually use a console window and cut and paste the path I
print, it works;

C:\>dir
"C:\Users\Rich\Desktop\2B_Proc\2307e60da6451986dd8d23635b845386.jpg"
Volume in drive C is Windows7_OS

Directory of C:\Users\Rich\Desktop\2B_Proc

10/03/2016 08:35 AM 59,200
2307e60da6451986dd8d23635b845386.jpg
1 File(s) 59,200 bytes
0 Dir(s) 115,857,260,544 bytes free

So the file is there and the path is correct. I'm adding quotes to the
path to deal with directories and filenames that have spaces in them.
If I drop the quotes, everything works as I expect *until* I encounter
the first file/path with spaces.

I'll happily RTFM, but I need some hints as to which FM to R

--
mailto:o...@ozindfw.net
Oz
POB 93167
Southlake, TX 76092 (Near DFW Airport)



Stephen Tucker

Oct 7, 2016, 5:27:17 AM
Hi Oz,

This might only be tangential to your actual issue, but, there again, it
might be the tiny clue that you actually need.

In Python, I use raw strings and single backslashes in folder hierarchy
strings to save the problem of the backslash in ordinary strings. Even with
this policy, however, there is a slight "gotcha": although r" ... "
suspends the escape interpretation of the backslash in the string, a raw
string literal cannot end with a single backslash (more precisely, with an
odd number of backslashes):

myraw = r"\a\b\"

provokes the error message:

SyntaxError: EOL while scanning string literal
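
One common workaround, offered here as a sketch rather than anything from the thread itself, is to keep the trailing backslash out of the raw literal and append it from an ordinary escaped literal. The name folder is purely illustrative.

```python
# A raw string literal cannot end in an odd number of backslashes,
# so the trailing separator comes from a normal (escaped) literal.
# Adjacent string literals are concatenated at compile time.
folder = r"C:\a\b" "\\"

print(folder)        # C:\a\b\
print(len(folder))   # 7
```

os.path.join() sidesteps the issue as well, since it inserts separators for you.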

To see how well this approach deals with folder hierarchies with spaces in
their names, I created the following file:

c:\Python27\ArcGIS10.4\Lib\idlelib\sjt\sp in\test.txt

Note the space in the folder name "sp in".

In IDLE, I then issued the following statement:

infile= open (r"c:\Python27\ArcGIS10.4\Lib\idlelib\sjt\sp in\test.txt", "r")

Note that I didn't need to get enclosing quotes into my folder hierarchy
string. It begins with "c" and ends with "t".

The statement worked as you might expect, and granted me access to the
lines in the file.

It seems that it is only necessary to enclose a folder hierarchy string in
quotes when defining the string in a situation in which spaces would
otherwise be interpreted as terminators. The classic case of this is in a
command with parameters in a batch file. If the string has been defined
before it is presented to the Windows Command interpreter, the spaces are
accepted as part of it without the need then of enclosing quotes.

Hope this helps.

Yours,

Stephen Tucker.

Peter Otten

Oct 7, 2016, 5:28:08 AM
You have to omit the extra quotes. That the non-working path has spaces in
it is probably not the cause of the problem. If a string *literal* is used,
e. g. "C:\Path\test file.txt" there may be combinations of the backslash and
the character that follows that are interpreted specially -- in the example
\P is just \ followed by a P whereas \t is a TAB (chr(9)):

>>> print("C:\Path\test file.txt")
C:\Path est file.txt

To fix the problem either use forward slashes (which are understood by
Windows, too) or double all backslashes,

>>> print("C:\\Path\\test file.txt")
C:\Path\test file.txt

or try the r (for "raw string") prefix:

>>> print(r"C:\Path\test file.txt")
C:\Path\test file.txt

This applies only to string literals in Python source code; if you read a
filename from a file or get it as user input there is no need to process the
string:

>>> input("Enter a file name: ")
Enter a file name: C:\Path\test file.txt
'C:\\Path\\test file.txt'
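
As a quick check of the points above, here is a small sketch comparing the three spellings. It uses ntpath (the module that os.path refers to on Windows) so that it behaves the same on any platform; the variable names are only illustrative.

```python
import ntpath  # the Windows flavour of os.path, importable on any platform

doubled = "C:\\Path\\test file.txt"   # escaped backslashes
raw = r"C:\Path\test file.txt"        # raw string literal
forward = "C:/Path/test file.txt"     # forward slashes

# Both literals describe exactly the same characters.
print(doubled == raw)                    # True
# normpath converts forward slashes to backslashes.
print(ntpath.normpath(forward) == raw)   # True

# A careless literal silently turns "\t" into a TAB character.
careless = "C:\temp"
print("\t" in careless)                  # True
print(len(careless))                     # 6
```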

> I'll happily RTFM, but I need some hints as to which FM to R
>

https://docs.python.org/3.5/tutorial/introduction.html#strings
https://docs.python.org/3.5/reference/lexical_analysis.html#string-and-bytes-literals

PS: Unrelated, but a portable way to combine paths is os.path.join() which
will also insert the platform specific directory separator if necessary:

>>> print(os.path.join("foo", "bar\\", "baz"))
foo\bar\baz
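
Applied to the original question, that might look like the sketch below. The values of dirpath and name are assumptions (the original post does not show them); they are reconstructed from the path in the traceback. ntpath and PureWindowsPath are used so the result is the same on any platform; on Windows, os.path.join and pathlib.Path behave the same way.

```python
import ntpath
from pathlib import PureWindowsPath

# Assumed values, reconstructed from the traceback in the question.
dirpath = r"C:\Users\Rich\Desktop\2B_Proc"   # note: no trailing separator
name = "2307e60da6451986dd8d23635b845386.jpg"

# join() inserts the separator that plain dirpath + name would miss.
joined = ntpath.join(dirpath, name)
print(joined)

# pathlib offers the same thing through the / operator.
same = str(PureWindowsPath(dirpath) / name)
print(same == joined)   # True
```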


Steve D'Aprano

Oct 7, 2016, 6:46:23 AM
On Fri, 7 Oct 2016 04:30 pm, Oz-in-DFW wrote:

> I'm using Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC
> v.1900 32 bit (Intel)] on Windows 7
>
> I'm trying to write some file processing that looks at file size,
> extensions, and several other things and I'm having trouble getting a
> reliably usable path to files.
>
> The problem *seems* to be doubled backslashes in the path, but I've read
> elsewhere that this is just an artifact of the way the interpreter
> displays the strings.

Indeed.

Why don't you show us the actual path you use? You show us this snippet:

> path = '"'+dirpath+name+'"'
> path = os.path.normpath(path)
> print("Path: -",path,"-")
> if os.path.getsize(path)>10000:
>     print("Path: ",path," Size: ",os.path.getsize(dirpath+name))


but apparently dirpath and name are secrets, because you don't show us what
they are *wink*

However, using my awesome powers of observation *grin* I see that your
filename (whatever it is) begins and ends with quotation marks. So instead
of having a file name like:

C:\Users\Rich\Desktop\2B_Proc\2307e60da6451986dd8d23635b845386.jpg


you actually have a file name like:

"C:\Users\Rich\Desktop\2B_Proc\2307e60da6451986dd8d23635b845386.jpg"


In this example, the quotation marks " at the beginning and end are *NOT*
the string delimiters, they are the first and last characters in the
string. Which might explain why Windows complains about the label:

> WindowsError: [Error 123] The filename, directory name, or volume
> label syntax is incorrect:


That's because

"C:

is an illegal volume label (disk name? I'm not really a Windows user, and
I'm not quite sure what the correct terminology here would be).


So I suggest you change your code from this:

> path = '"'+dirpath+name+'"'
> path = os.path.normpath(path)

to this instead:

path = dirpath + name
path = os.path.normpath(path)


> So the file is there and the path is correct. I'm adding quotes to the
> path to deal with directories and filenames that have spaces in them.

The os.path and os functions do not need you to escape the file name with
quotes to handle spaces.
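
A minimal sketch to check this for yourself: it creates a throwaway file whose folder and file names both contain spaces, then asks for its size with and without the added quotes. The names "sp in" and "test file.txt" are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Both the folder and the file name contain a space.
    spaced_dir = os.path.join(root, "sp in")
    os.mkdir(spaced_dir)
    path = os.path.join(spaced_dir, "test file.txt")
    with open(path, "wb") as f:
        f.write(b"x" * 12345)

    # No quoting needed: the string is passed to the OS as-is.
    size = os.path.getsize(path)
    print(size > 10000)

    # With literal quote characters added, the name no longer matches.
    quoted = '"' + path + '"'
    try:
        os.path.getsize(quoted)
        quoted_failed = False
    except OSError as exc:
        quoted_failed = True
        print("quoted path rejected:", exc)
```

The exact error for the quoted path depends on the platform, but it should always be some kind of OSError.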

It is very likely that if you call out to another Windows program using the
os.system() call you will need to worry about escaping the spaces, but
otherwise, you don't need them.
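
If you do need to call another program, one way to avoid hand-written quoting (a suggestion beyond what is said above, so treat it as a sketch) is to pass the arguments to subprocess as a list. The path below is hypothetical, and the child process is just Python echoing its first argument back.

```python
import subprocess
import sys

# Hypothetical path containing spaces; it does not need to exist here.
path_with_space = r"C:\Users\Rich\Desktop\2B Proc\photo 1.jpg"

# With a list of arguments, subprocess takes care of any quoting that
# the platform requires; the path is passed without added quote characters.
child_code = "import sys; print(sys.argv[1])"
result = subprocess.run(
    [sys.executable, "-c", child_code, path_with_space],
    stdout=subprocess.PIPE,
    universal_newlines=True,
    check=True,
)

received = result.stdout.strip()
print(received == path_with_space)   # True
```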

> If I drop the quotes, everything works as I expect

Indeed.

> *until* I encounter the first file/path with spaces.

And what happens then? My guess is that you get a different error, probably
in a different place.



--
Steve
“Cheer up,” they said, “things could be worse.” So I cheered up, and sure
enough, things got worse.

BartC

Oct 7, 2016, 8:39:55 AM
On 07/10/2016 06:30, Oz-in-DFW wrote:
> I'm using Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 25 2016, 22:01:18) [MSC
> v.1900 32 bit (Intel)] on Windows 7
>
> I'm trying to write some file processing that looks at file size,
> extensions, and several other things and I'm having trouble getting a
> reliably usable path to files.
>
> The problem *seems* to be doubled backslashes in the path, but I've read
> elsewhere that this is just an artifact of the way the interpreter
> displays the strings.
>
> I'm getting an error message on an os.path.getsize call;
>
> Path: -
> "C:\Users\Rich\Desktop\2B_Proc\2307e60da6451986dd8d23635b845386.jpg" -
> Traceback (most recent call last):
> File "C:\Users\Rich\workspace\PyTest\test.py", line 19, in <module>
> if os.path.getsize(path)>10000:
> File "C:\Python32\lib\genericpath.py", line 49, in getsize
> return os.stat(filename).st_size
> WindowsError: [Error 123] The filename, directory name, or volume
> label syntax is incorrect:
> '"C:\\Users\\Rich\\Desktop\\2B_Proc\\2307e60da6451986dd8d23635b845386.jpg"'

I tried to recreate this error and it seems the getsize function doesn't
like quotes in the path.

Whether \ is correctly written as \\ in a string literal, or a raw
string is used with the r prefix and a single \, or a \ has been put
into path by any other means, then these will still be displayed as \\
in the error message, which is strange. The error handler is expanding \
characters to \\.

But the main error appears to be due to the presence of quotes, whether
at each end, or inside the path, enclosing an element with spaces for
example. Try using len(path)>10000 instead; it might be near enough (the
10000 sounds arbitrary anyway).

--
Bartc

eryk sun

Oct 7, 2016, 9:01:42 AM
On Fri, Oct 7, 2016 at 9:27 AM, Peter Otten <__pet...@web.de> wrote:
> To fix the problem either use forward slashes (which are understood by
> Windows, too)

Using forward slash in place of backslash is generally fine, but you
need to be aware of common exceptions, such as the following:

(1) Paths with forward slashes aren't generally accepted in command lines.

(2) Forward slash is just another name character in the kernel object
namespace. Thus "Global/Spam" is a local name with a slash in it,
while r"Global\Spam" is a "Spam" object created in the global
r"\BaseNamedObjects" directory because "Global" is a local object
symbolic link to the global r"\BaseNamedObjects".

(3) Extended paths (i.e. paths prefixed by \\?\) bypass the user-mode
path normalization that replaces slash with backslash. Such paths
cannot use forward slashes to delimit filesystem path components.
Given (2), forward slash potentially could be parsed as part of a
filename. Fortunately, slash is a reserved character in all Microsoft
filesystems. Thus "dir1/dir2" is an invalid 9-character name.
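
To illustrate point (3), here is a rough sketch of a helper that builds an extended path and converts slashes first, since Windows will not do so for such paths. The helper name to_extended is hypothetical, and it deliberately handles only absolute drive-letter paths (UNC paths need a different prefix and are rejected).

```python
import ntpath  # Windows path rules, importable on any platform

EXTENDED_PREFIX = "\\\\?\\"   # the four characters \\?\

def to_extended(path):
    """Return the extended-path form of an absolute drive-letter path."""
    if path.startswith(EXTENDED_PREFIX):
        return path  # already extended; leave untouched
    # Slashes must be converted here, because extended paths skip the
    # user-mode normalization that would otherwise do it.
    normalized = ntpath.normpath(path)
    drive, rest = ntpath.splitdrive(normalized)
    if len(drive) != 2 or drive[1] != ":" or not rest.startswith("\\"):
        raise ValueError("expected an absolute drive-letter path: %r" % path)
    return EXTENDED_PREFIX + normalized

print(to_extended("C:/Program Files/Python35/Doc"))
```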

You may ask why Windows doesn't hard-code replacing slash with
backslash, even for extended paths. Some filesystems may allow slash
in names, or for some other purpose (not that I know of any). Also,
device names may contain slashes, just like any other Windows object
name. For example, let's define a DOS pseudo-device (really an object
symbolic link) named "Eggs/Spam" to reference Python's installation
directory:

import os, sys, ctypes
kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)

device_name = 'Eggs/Spam'
target_path = os.path.dirname(sys.executable)
kernel32.DefineDosDeviceW(0, device_name, target_path)

>>> os.listdir(r'\\?\Eggs/Spam\Doc')
['python352.chm']

Note that the final component, "Doc", must be delimited by a
backslash. If it instead used slash, Windows would look for a device
named "Eggs/Spam/Doc". For example:

>>> os.listdir(r'\\?\Eggs/Spam/Doc')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
FileNotFoundError: [WinError 3] The system cannot find the
path specified: '\\\\?\\Eggs/Spam/Doc'

For the case of r"\\?\Eggs/Spam\Doc/python352.chm", parsing the path
succeeds up to the last component, "Doc/python352.chm", which is an
invalid name:

>>> os.stat(r'\\?\Eggs/Spam\Doc/python352.chm')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
OSError: [WinError 123] The filename, directory name, or volume
label syntax is incorrect: '\\\\?\\Eggs/Spam\\Doc/python352.chm'

Windows reparses the object symbolic link "Eggs/Spam" to its target
path. In my case that's r"\??\C:\Program Files\Python35". The "C:"
drive in my case is an object symbolic link to
r"\Device\HarddiskVolume2" (typical for Windows 10), so the path gets
reparsed as r"\Device\HarddiskVolume2\Program
Files\Python35\Doc/python352.chm". The volume device "HarddiskVolume2"
contains an NTFS filesystem, so the NTFS driver parses the remaining
path, r"\Program Files\Python35\Doc/python352.chm", and rejects
"Doc/python352.chm" as an invalid name.

Ned Batchelder

Oct 7, 2016, 9:58:01 AM
For error messages destined for developers, it is good practice to use
%r or {!r} to get the repr() of a string. This will show quotes around
the string, and use backslash escapes to present the contents
unambiguously. That's what's expanding \ to \\, and adding the single
quotes you see in the message.
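
A short sketch of the difference, using an arbitrary example path:

```python
path = r"C:\Users\Rich"

print(path)                  # C:\Users\Rich
print(repr(path))            # 'C:\\Users\\Rich'
print("{!r}".format(path))   # same text as repr(path)
print("%r" % path)           # same text as repr(path)

# The string itself holds single backslashes; only the repr doubles them.
print(len(path))             # 13
print(path.count("\\"))      # 2
```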

--Ned.

eryk sun

Oct 7, 2016, 10:03:22 AM
On Fri, Oct 7, 2016 at 10:46 AM, Steve D'Aprano
<steve+...@pearwood.info> wrote:
> That's because
>
> "C:
>
> is an illegal volume label (disk name? I'm not really a Windows user, and
> I'm not quite sure what the correct terminology here would be).

It's not an illegal device name, per se. A DOS device can be defined
with the name '"C:'. For example:

>>> kernel32.DefineDosDeviceW(0, '"C:', 'C:')
1
>>> os.path.getsize(r'\\.\"C:\Windows\py.exe')
889504

However, without the DOS device prefix (either \\.\ or \\?\), Windows
has to normalize the path as a classic DOS path before passing it to
the kernel. Let's see how Windows 10 normalizes this path by setting a
breakpoint on the NtCreateFile system call:

>>> os.path.getsize(r'"C:\Windows\py.exe"')
Breakpoint 0 hit
ntdll!NtCreateFile:
00007ffb`a6c858e0 4c8bd1 mov r10,rcx

A kernel path is stored in an OBJECT_ATTRIBUTES structure, which has
the path, a handle (for opening relative to another object), and flags
such as whether or not the path is case insensitive. The debugger's
!obja extension command shows the contents of this structure:

0:000> !obja @r8
Obja +000000a6c8bef038 at 000000a6c8bef038:
Name is "C:\Windows\py.exe"
OBJ_CASE_INSENSITIVE

We see that the user-mode path normalization code doesn't know what to
make of a path starting with '"', so it just punts the path to the
kernel object manager. In turn the object manager rejects this path
because it's not rooted in the object namespace (i.e. it's not of the
form "\??\..." or "\Device\...", etc):

0:000> pt; r rax
rax=00000000c0000033

The kernel status code 0xC0000033 is STATUS_OBJECT_NAME_INVALID.

Note that a path ending in '"' is still illegal even if we explicitly
use the r'\\.\"C:' DOS device. For example:

>>> os.path.getsize(r'\\.\"C:\Windows\py.exe"')
Breakpoint 0 hit
ntdll!NtCreateFile:
00007ffb`a6c858e0 4c8bd1 mov r10,rcx

0:000> !obja @r8
Obja +000000a6c8bef038 at 000000a6c8bef038:
Name is \??\"C:\Windows\py.exe"
OBJ_CASE_INSENSITIVE

0:000> pt; r rax
rax=00000000c0000033

In this case it fails because the final '"' in the name (after .exe)
is reserved by the I/O manager, along with '<' and '>', respectively
as DOS_DOT, DOS_STAR, and DOS_QM. These characters aren't allowed in
filesystem names. They get used to implement the semantics of DOS
wildcards in NT system calls (the regular '*' and '?' wildcards are
also reserved). See FsRtlIsNameInExpression [1], which, for example, a
filesystem may use in its implementation of the NtQueryDirectoryFile
[2] system call to handle the optional FileName parameter with
wildcard matching.

[1]: https://msdn.microsoft.com/en-us/library/ff546850
[2]: https://msdn.microsoft.com/en-us/library/ff567047

BartC

Oct 7, 2016, 12:50:21 PM
On 07/10/2016 13:39, BartC wrote:
> On 07/10/2016 06:30, Oz-in-DFW wrote:

>> I'm getting an error message on an os.path.getsize call;

> But the main error appears to be due to the presence of quotes, whether
> at each end, or inside the path, enclosing an element with spaces for
> example. Try using len(path)>10000 instead; it might be near enough (the
> 10000 sounds arbitrary anyway).

Forget that. Apparently .getsize returns the size of the file not the
length of the path! (That os.path. bit misled me.)

In that case just leave out the quotes. I don't think you need them even
if the file-path contains embedded spaces. That would be an issue on a
command-line (as spaces separate parameters), not inside a string.

But if quotes are present in user-input that represents a file-name,
then they might need to be removed.
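
Something along these lines might do for that case. It is only a sketch: the helper name clean_user_path is made up, and it removes just one pair of enclosing double quotes.

```python
def clean_user_path(text):
    """Strip surrounding whitespace and one pair of enclosing double quotes."""
    text = text.strip()
    if len(text) >= 2 and text[0] == '"' and text[-1] == '"':
        text = text[1:-1]
    return text

# A path pasted from a console, quotes and all.
print(clean_user_path('  "C:\\Path\\test file.txt" '))   # C:\Path\test file.txt
```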

--
Bartc

eryk sun

Oct 28, 2016, 8:42:14 PM
On Fri, Oct 28, 2016 at 8:04 PM, Gilmeh Serda
<gilmeh...@nothing.here.invalid> wrote:
>
> You can use forward slash to avoid the messy problem.

There are cases in which you need to use backslash, such as extended
paths and command lines. Python 3's pathlib automatically normalizes a
Windows path to use backslash. Otherwise you can use
os.path.normpath().
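
For example, a small sketch using PureWindowsPath and ntpath so that it runs the same on any platform (on Windows, pathlib.Path and os.path.normpath give the same results):

```python
import ntpath
from pathlib import PureWindowsPath

target_dir = "Desktop/2B_proc"

# pathlib stores Windows paths with backslashes.
as_pathlib = str(PureWindowsPath(target_dir))
print(as_pathlib)    # Desktop\2B_proc

# normpath also collapses doubled separators and "." components.
as_normpath = ntpath.normpath("C:/Users//Rich/./Desktop")
print(as_normpath)   # C:\Users\Rich\Desktop
```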

> >>> target_dir = 'Desktop/2B_proc'
> >>> full_target = os.path.join(os.path.expanduser('~'), target_dir)

Don't assume the default location of a user's known folders. They can
all be relocated, either by domain group policy or individually using
the folder properties. Instead call SHGetKnownFolderPath or
SHGetFolderPath, e.g. to look up the value of FOLDERID_Desktop or
CSIDL_DESKTOP, respectively.

expanduser('~') is also used to locate configuration and data files,
such as "~\.python_history". On Windows, such files belong in a
subfolder of the user's hidden AppData folder. If they should roam
with a roaming profile, use %AppData%; otherwise use %LocalAppData%,
such as "%LocalAppData%\Python\python_history.txt". I prefer calling
SHGetKnownFolderPath instead of using the potentially stale
environment variables, but relocating these folders in a session is
uncommon.
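
As a rough illustration of the environment-variable approach (not the SHGetKnownFolderPath call, which needs ctypes and only works on Windows), a helper might look like the sketch below. The function name history_path is hypothetical, and it takes the environment as a parameter so it can be tried anywhere; in real use you would pass os.environ.

```python
import ntpath  # Windows path rules, importable on any platform

def history_path(env):
    """Build a per-user, non-roaming history file path from an environment mapping."""
    base = env.get("LOCALAPPDATA")
    if base is None:
        raise KeyError("LOCALAPPDATA is not set")
    return ntpath.join(base, "Python", "python_history.txt")

# Example with a made-up environment.
fake_env = {"LOCALAPPDATA": r"C:\Users\Rich\AppData\Local"}
print(history_path(fake_env))
```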