EOF while scanning triple-quoted string literal


bussiere bussiere

Oct 11, 2010, 5:37:09 AM
to pytho...@python.org

I've looked on the web and here, but I didn't find an answer. Here is my code:

zlib.decompress("""
xワᆳヤ=ラᄇHナs~Ʀᄑç\ムîà
Z@ÑÁÔQÇlxÇÆïPP~ýVãì゙M6ÛÐ|ê֭ᄁᄂヤ=)}éÓUe﬿ö3ᄎᄌú"
}ʿïÿ÷1þ8ñ́U÷ᄏñíLÒVi:`ᄈᄎL!Ê҆p6-%Fë^ヘ÷à,Q.K!ユô`ÄA!ÑêweÌ ÊÚAYøøÂjôóᅠÂcñ䊧fᆴùテúN :nüzAÝ7%ᄌcdUタᄌ3ôPۂタlyHᆲᄑ$/yzᄒíàヌ'ÕÓ&`|S!<'ᄂ÷Zļᄐ2ホモ;ニ(ÅÛfb!úü$ナテᄒ,9ßhàPᄎᄄێフÑbØὛホQᄍ-Ü}(n;ᄄホLヤ\^ï9ᆭᄍラDdВéÞ|åPOGᄂÐÙ%â&AÔë)ÎTÐC ᄐïc枢í%Èï!フᄋëiq*ᄌVKÐNᄡ[ᄁfOq{OᆭÆÊ,0GᄂリmtツᄈOᄌΥ$#îヘqbYᄆメUニᄉÞáP`
ヨ×ᆵÃPwaレǩâ×)ハFcêÚ=!Åöᄊ
)AFñᄈ/cMᄃ!NóNΈór?pàÜòXw
Bvæ0ïçIÉoマ>5pᆭ-ØWÚNᄆùFᄆØPçÃþdᅠ;ル1[Oᄈホ~6ツᄈᆬŕìᄄޠ=øð@ネV﾿ᄅ)÷%ユÜib{HKŅVlDCテîfÑWì÷ìáár.ワîv﾿<dn~ú*ÁÕ7ýá}EsYWᄂÈ:R×ãQңメ?Ø1vヘäツ~èR1ᄉÜ*ᄡónAjmNoツユᄈÌښᆬf[8ᆭÛ>゙OWラ|ÌbDᄁÖ녡M=Ð÷èâミム'ÂÝÐ ;ë mᄎQÂäԤۢ:モᄆdᄎᄑLȂ1ᄈ_÷YZᆲNòÛ â\ロxÐlݵᆵムᆱøm5Ëá=ïoÍlMᆪ[×#Ypᅠトx[ÉÊyæツoモナz)ᆭᄀÝÏìò
""")

So it was a string that I got by calling zlib.compress on another string. How can I decompress this string? Regards, Bussiere

Google Fan boy

Rhodri James

Oct 11, 2010, 5:59:32 PM
On Mon, 11 Oct 2010 10:37:09 +0100, bussiere bussiere <buss...@gmail.com>
wrote:

It helps to say what your problem is more explicitly than just hinting at
it in the title. Assuming that you are running on Windows and the Python
traceback really does single this line out, my guess is that one of those
random binary characters is a Ctrl-Z. Windows regards that as the end of
a text file. How you get out of that one, I'm not sure, but frankly
putting arbitrary binary into a literal string is rather asking for
something like this to come and bite you.
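
One way to sidestep text-mode handling altogether is to keep the compressed bytes out of the source file and read them back in binary mode. The following is a minimal sketch, assuming Python 3; the file name payload.z and the sample data are hypothetical and only serve the demonstration.

```python
import os
import tempfile
import zlib

original = b"some text worth compressing " * 20
compressed = zlib.compress(original)

# Hypothetical file name, placed in a temporary directory for this demo.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "payload.z")

    # "wb" and "rb" bypass newline translation and any text-mode
    # end-of-file handling, so every byte is stored and read as-is.
    with open(path, "wb") as f:
        f.write(compressed)
    with open(path, "rb") as f:
        restored = zlib.decompress(f.read())

print(restored == original)
```

Because the data never passes through the Python tokenizer or a text-mode file, control bytes such as Ctrl-Z should not matter here.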


--
Rhodri James *-* Wildebeest Herder to the Masses

Ian Kelly

Oct 11, 2010, 6:23:52 PM
to Python
Option 1: Replace the binary bytes with the proper escape codes (incidentally, I see some backslashes already in there that most likely also need to be escaped).

Option 2: Move that ugly mess out of the source and into an auxiliary data file.

Option 3: Encode it in base64 and add a decoding step before the decompression step.
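
Option 3 might look roughly like this. It is a sketch that assumes Python 3 (the original post predates it), and the sample data and variable names are illustrative rather than taken from the thread.

```python
import base64
import zlib

original = b"hello world, " * 10

# One-off step: produce an ASCII-only string that is safe to paste
# into a source file as an ordinary string literal.
encoded = base64.b64encode(zlib.compress(original)).decode("ascii")

# In the program: decode from base64 first, then decompress.
restored = zlib.decompress(base64.b64decode(encoded))

print(encoded)
print(restored == original)
```

The base64 alphabet contains no quotes, backslashes, or control characters, so the pasted literal cannot be cut short by the tokenizer or by the editor.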

Cheers,
Ian

Lawrence D'Oliveiro

Oct 15, 2010, 12:30:20 AM
In message <op.vkfl1i1na8ncjz@gnudebst>, Rhodri James wrote:

> ... frankly putting arbitrary binary into a literal string is rather
> asking for something like this to come and bite you.

It normally works fine on sensible OSes.

Steven D'Aprano

Oct 15, 2010, 9:40:21 AM

What does it have to do with the OS? Surely it's a question of the
editor, interpreter and related tools.

In the Unix world, which includes OS X, text tools tend to have
difficulty with tabs. Or try naming a file with a newline or carriage
return in the file name, or a NULL byte. "Works fine" is not how I would
describe it.


--
Steven

Ian

Oct 15, 2010, 12:55:17 PM
On Oct 14, 10:30 pm, Lawrence D'Oliveiro <l...@geek-

Which OSes would those be? It doesn't work in Linux:

$ python -c "print 'print \'hello\0world\''" > test.py
$ cat test.py
print 'helloworld'
$ python test.py
File "test.py", line 1
print 'hello
^
SyntaxError: EOL while scanning string literal
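
The same check can be made without a shell. This is a sketch assuming Python 3, where the exact exception type for a NUL byte in source has differed between versions, so both candidates are caught; the helper name compiles is made up for the example.

```python
def compiles(source):
    """Return True if the interpreter accepts `source` as a module."""
    try:
        compile(source, "<test>", "exec")
        return True
    except (SyntaxError, ValueError):
        # Depending on the Python 3 version, a raw NUL byte in the
        # source is reported as one or the other.
        return False

print(compiles("print('hello world')"))     # ordinary source
print(compiles("print('hello\0world')"))    # raw NUL byte inside the literal
print(compiles("print('hello\\0world')"))   # NUL written as an escape code
```

Only the middle case, with a literal NUL byte in the source text, should be rejected; writing the byte as an escape code keeps the source clean.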

Cheers,
Ian

Grant Edwards

Oct 15, 2010, 1:02:07 PM
On 2010-10-15, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:

> In the Unix world, which includes OS X, text tools tend to have
> difficulty with tabs. Or try naming a file with a newline or carriage
> return in the file name, or a NULL byte.

How do you create a file with a name that contains a NULL byte?

--
Grant Edwards   grant.b.edwards at gmail.com
Yow! Catsup and Mustard all over the place! It's the Human Hamburger!

Martin Gregorie

Oct 15, 2010, 1:24:11 PM
On Fri, 15 Oct 2010 17:02:07 +0000, Grant Edwards wrote:

> On 2010-10-15, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au>
> wrote:
>
>> In the Unix world, which includes OS X, text tools tend to have
>> difficulty with tabs. Or try naming a file with a newline or carriage
>> return in the file name, or a NULL byte.
>
> How do you create a file with a name that contains a NULL byte?

Use a language or program that doesn't use null-terminated strings.

It's quite easy in many BASICs, which often delimit strings by preceding
them with a byte count, and you hit Ctrl-SPACE by accident....


--
martin@ | Martin Gregorie
gregorie. | Essex, UK
org |

Grant Edwards

Oct 15, 2010, 2:14:13 PM

I don't see what the in-program string representation has to do with
it. The Unix system calls that create files only accept NULL
terminated strings for the path parameter.

Are you saying that there are BASIC implementations for Unix that
create Unix files by directly accessing the disk rather than using the
Unix system calls?

--
Grant Edwards   grant.b.edwards at gmail.com
Yow! I'm sitting on my SPEED QUEEN ... To me, it's ENJOYABLE ... I'm WARM
... I'm VIBRATORY ...

Seebs

Oct 15, 2010, 2:49:55 PM
On 2010-10-15, Grant Edwards <inv...@invalid.invalid> wrote:
> On 2010-10-15, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
>> In the Unix world, which includes OS X, text tools tend to have
>> difficulty with tabs. Or try naming a file with a newline or carriage
>> return in the file name, or a NULL byte.

> How do you create a file with a name that contains a NULL byte?

So far as I know, in canonical Unix, you don't -- the syscalls all work
with something like C strings under the hood, meaning that no matter what
path name you send, the first null byte actually terminates it.

-s
--
Copyright 2010, all wrongs reversed. Peter Seebach / usenet...@seebs.net
http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
http://en.wikipedia.org/wiki/Fair_Game_(Scientology) <-- get educated!
I am not speaking for my employer, although they do rent some of my opinions.

Grant Edwards

Oct 15, 2010, 2:56:22 PM
On 2010-10-15, Seebs <usenet...@seebs.net> wrote:
> On 2010-10-15, Grant Edwards <inv...@invalid.invalid> wrote:
>> On 2010-10-15, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
>>> In the Unix world, which includes OS X, text tools tend to have
>>> difficulty with tabs. Or try naming a file with a newline or carriage
>>> return in the file name, or a NULL byte.
>
>> How do you create a file with a name that contains a NULL byte?
>
> So far as I know, in canonical Unix, you don't -- the syscalls all work
> with something like C strings under the hood, meaning that no matter what
> path name you send, the first null byte actually terminates it.

Yes, all of the Unix syscalls use NULL-terminated path parameters (AKA
"C strings"). What I don't know is whether the underlying filesystem
code also uses NULL-terminated strings for filenames or if they have
explicit lengths. If the latter, there might be some way to bypass
the normal Unix syscalls and actually create a file with a NULL in its
name -- a file that then couldn't be accessed via the normal Unix
system calls. My _guess_ is that the underlying filesystem code in
most all Unices also uses NULL-terminated strings, but I haven't
looked yet.

--
Grant Edwards   grant.b.edwards at gmail.com
Yow! What UNIVERSE is this, please??

Martin Gregorie

Oct 15, 2010, 3:13:16 PM
On Fri, 15 Oct 2010 18:14:13 +0000, Grant Edwards wrote:

> On 2010-10-15, Martin Gregorie <mar...@address-in-sig.invalid> wrote:
>> On Fri, 15 Oct 2010 17:02:07 +0000, Grant Edwards wrote:
>>
>>> On 2010-10-15, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au>
>>> wrote:
>>>
>>>> In the Unix world, which includes OS X, text tools tend to have
>>>> difficulty with tabs. Or try naming a file with a newline or carriage
>>>> return in the file name, or a NULL byte.
>>>
>>> How do you create a file with a name that contains a NULL byte?
>>
>> Use a language or program that doesn't use null-terminated strings.
>>
>> It's quite easy in many BASICs, which often delimit strings by
>> preceding them with a byte count, and you hit Ctrl-SPACE by
>> accident....
>
> I don't see what the in-program string representation has to do with it.
> The Unix system calls that create files only accept NULL terminated
> strings for the path parameter.
>

Well, obviously you can't have null in a filename if the program is using
null-terminated strings.

> Are you saying that there are BASIC implementations for Unix that create
> Unix files by directly accessing the disk rather than using the Unix
> system calls?
>

I'm saying that the only BASIC implementations I've looked at the guts of
have used count-delimited strings. None were on *nixen but it's a safe bet
that if they were ported to a UNIX they'd retain their count-delimited
nature.

Another language that will certainly do this is COBOL, which only uses
fixed length, and therefore undelimited, strings.

The point I'm making is that in both fixed length and counted string
representations you can put any character value at all into the string
unless whatever mechanism you're using to read in the values recognises
something, i.e. TAB, CR, LF, CRLF as a delimiter, and even then the
program can generate a string containing arbitrary gibberish.

If you then use the string as a file name you can end up with a file that
can't be accessed or deleted if the name flouts the OS's file naming
conventions. I've done it in the past with BASIC programs and finger
trouble under FLEX09 and CP/M. In both cases I had to use a disk editor
to fix the file name before the file could be deleted or accessed.

Seebs

Oct 15, 2010, 3:41:17 PM
On 2010-10-15, Grant Edwards <inv...@invalid.invalid> wrote:
> Yes, all of the Unix syscalls use NULL-terminated path parameters (AKA
> "C strings"). What I don't know is whether the underlying filesystem
> code also uses NULL-terminated strings for filenames or if they have
> explicit lengths. If the latter, there might be some way to bypass
> the normal Unix syscalls and actually create a file with a NULL in its
> name -- a file that then couldn't be accessed via the normal Unix
> system calls. My _guess_ is that the underlying filesystem code in
> most all Unices also uses NULL-terminated strings, but I haven't
> looked yet.

There's some dire magic there. The classic V7 or so filesystem had 16-byte
file names which were null terminated unless they were 16 characters, in
which case they weren't but were still only 16 characters. Apart from that,
though, so far as I know everything is always null terminated.

The weird special case is slashes; you can never have a slash in a file name,
but at least one NFS implementation was able to create file names containing
slashes, and if you had a Mac client (where slash was valid in file names),
it could then create files with names that you could never use on the Unix
side, because the path resolution code kept trying to find directories
instead. This was, worse yet, common, because so many people used
"mm/dd/yy" in file names! Later implementations changed to silently
translating between colons and slashes. (I think this still happened under
the hood in at least some OS X, because the HFS filesystem really uses
colons somewhere down in there.)

... But so far as I know, there's never been a Unix-type system where it
was actually possible to get a null byte into a file name. Spaces, newlines,
sure. Slashes, under rare and buggy circumstances. But I've never heard
of a null byte in a file name.

Chris Torek

Oct 15, 2010, 3:49:33 PM
>> On 2010-10-15, Grant Edwards <inv...@invalid.invalid> wrote:
>>> How do you create a [Unix] file with a name that contains a NULL byte?

>On 2010-10-15, Seebs <usenet...@seebs.net> wrote:
>> So far as I know, in canonical Unix, you don't -- the syscalls all work
>> with something like C strings under the hood, meaning that no matter what
>> path name you send, the first null byte actually terminates it.

In article <i9a84m$rp9$1...@reader1.panix.com>
Grant Edwards <inv...@invalid.invalid> wrote:
>Yes, all of the Unix syscalls use NULL-terminated path parameters (AKA
>"C strings"). What I don't know is whether the underlying filesystem
>code also uses NULL-terminated strings for filenames or if they have
>explicit lengths. If the latter, there might be some way to bypass
>the normal Unix syscalls and actually create a file with a NULL in its
>name -- a file that then couldn't be accessed via the normal Unix
>system calls. My _guess_ is that the underlying filesystem code in
>most all Unices also uses NULL-terminated strings, but I haven't
>looked yet.

Multiple common on-disk formats (BSD's UFS variants and Linux's
EXTs, for instance) use counted strings, so it is possible -- via
disk corruption or similar -- to get "impossible" file names (those
containing either an embedded NUL or an embedded '/').
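
To illustrate why a counted name can hold bytes that the path-based API never accepts, here is a simplified sketch in Python. The header layout (inode number, record length, name length, file type) is modeled on an ext2-style directory entry, but it is an illustration only, not a faithful reader or writer for any real filesystem; pack_entry and unpack_name are hypothetical helpers.

```python
import struct

# Simplified header modeled on an ext2-style directory entry:
# inode (u32), record length (u16), name length (u8), file type (u8).
HEADER = struct.Struct("<IHBB")

def pack_entry(inode, name, file_type=1):
    """Build one directory record whose name is stored with an explicit length."""
    # Round the record up to a multiple of 4 bytes.
    rec_len = (HEADER.size + len(name) + 3) & ~3
    body = HEADER.pack(inode, rec_len, len(name), file_type) + name
    return body.ljust(rec_len, b"\0")

def unpack_name(entry):
    """Recover the name using the stored length, not a terminator."""
    inode, rec_len, name_len, file_type = HEADER.unpack_from(entry)
    return entry[HEADER.size:HEADER.size + name_len]

# A name with an embedded NUL and a slash survives the round trip,
# because nothing in the record format treats those bytes specially.
raw = pack_entry(12, b"bad\0na/me")
print(unpack_name(raw))
```

The restriction on NUL and '/' therefore lives in the path-handling layer above the on-disk format, which is why corruption or a non-standard writer can produce such names.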

More notoriously, earlier versions of NFS could create files with
embedded slashes when serving non-Unix clients. These were easily
removed with the same non-Unix client, but not on the server! :-)

None of this has anything to do with the original problem, in which
a triple-quoted string is left to contain arbitrary binary data
(up to, of course, the closing triple-quote). Should that arbitrary
binary data itself happen to include a triple-quote, this trivial
encoding technique will fail. (And of course, as others have noted,
it fails on some systems that distinguish between text and binary
file formats in the first place.) This is why using some
"text-friendly" encoding scheme, such as base64, is a good idea.
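
If base64 is not wanted, an escaped literal is another text-friendly form. The sketch below assumes Python 3, where repr() of a bytes object yields an ASCII-only literal with non-printable bytes written as escape codes; the sample data is made up for the demonstration.

```python
import ast
import zlib

compressed = zlib.compress(b'text with """ triple quotes """ inside')

# repr() gives a one-line, ASCII-only bytes literal; quotes and
# non-printable bytes inside the data are escaped as needed.
literal = repr(compressed)

# Pasting `literal` into a source file is equivalent to evaluating it here.
restored = zlib.decompress(ast.literal_eval(literal))

print(literal)
print(restored)
```

Unlike a raw triple-quoted string, this form does not depend on the data happening to avoid quote characters, NUL bytes, or Ctrl-Z.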
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: gmail (figure it out) http://web.torek.net/torek/index.html

Grant Edwards

Oct 15, 2010, 3:59:13 PM
On 2010-10-15, Martin Gregorie <mar...@address-in-sig.invalid> wrote:
>> On 2010-10-15, Martin Gregorie <mar...@address-in-sig.invalid> wrote:
>>> On Fri, 15 Oct 2010 17:02:07 +0000, Grant Edwards wrote:
>>>> On 2010-10-15, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au>:
>>>>
>>>>> In the Unix world, which includes OS X, text tools tend to have
>>>>> difficulty with tabs. Or try naming a file with a newline or carriage
>>>>> return in the file name, or a NULL byte.
>>>>
>>>> How do you create a file with a name that contains a NULL byte?
>>>
>>> Use a language or program that doesn't use null-terminated strings.
>>>
>>> It's quite easy in many BASICs, [...]
>>
>> I don't see what the in-program string representation has to do with
>> it. The Unix system calls that create files only accept NULL
>> terminated strings for the path parameter.
>
> Well, obviously you can't have null in a filename if the program is
> using null-terminated strings.

Obviously.

Just as obviously, you can't have a null in a filename if the OS
filesystem API uses null-terminated strings -- which the Linux
filesystem API does. I just verified that by looking at the kernel
sources -- I can post the relevant code if you like.

I'm pretty sure all the other Unices are the same. I've got BSD
sources lying around somewhere...

>> Are you saying that there are BASIC implementations for Unix that
>> create Unix files by directly accessing the disk rather than using
>> the Unix system calls?
>
> I'm saying that the only BASIC implementations I've looked at the
> guts of have used count-delimited strings. None were on *nixen but
> it's a safe bet that if they were ported to a UNIX they'd retain their
> count-delimited nature.

And I'm saying _that_doesn't_matter_. The _OS_ uses NULL-terminated
strings. You can use a language that represents strings as braille
images encoded as in-memory PNG files if you want. That still doesn't
let you create a Unix file whose name contains a NULL byte.

> Another language that will certainly do this is COBOL, which only
> uses fixed length, and therefore undelimited, strings.

Again, what difference does it make?

If the OS uses null-terminated strings for filenames, what difference
does it make how the user-space program represents filenames internally?

> The point I'm making is that in both fixed length and counted string
> representations you can put any character value at all into the
> string unless whatever mechanism you're using to read in the values
> recognises something, i.e. TAB, CR, LF, CRLF as a delimiter, and even
> then the program can generate a string containing arbitrary
> gibberish.

I don't care how the program represents strings.

The OS doesn't care.

The filesystem doesn't care.

Please explain how to pass a filename containing a NULL byte to a Unix
syscall like creat() or open(). You don't even have to use the C
library API -- feel free to use the real syscall API for whatever Unix
on whatever architecture you want.
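
The same boundary is visible from Python. This is a sketch assuming Python 3, which refuses an embedded NUL before the name ever reaches the system call; the helper try_create and the file names are hypothetical.

```python
import os
import tempfile

def try_create(directory, name):
    """Return True if a file with this name could be created."""
    try:
        with open(os.path.join(directory, name), "w"):
            pass
        return True
    except (ValueError, TypeError, OSError):
        # An embedded NUL is rejected by Python itself; other bad
        # names would surface as OSError from the operating system.
        return False

with tempfile.TemporaryDirectory() as tmpdir:
    plain_ok = try_create(tmpdir, "plain.txt")
    nul_ok = try_create(tmpdir, "with\0null")

print(plain_ok, nul_ok)
```

However the program represents the string internally, the name has to cross an interface that ends at the first NUL, so the second call cannot succeed.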

> If you then use the string as a file name you can end up with a file
> that can't be accessed or deleted if the name flouts the OS's file
> naming conventions. I've done it in the past with BASIC programs and
> finger trouble under FLEX09 and CP/M. In both cases I had to use a
> disk editor to fix the file name before the file could be deleted or
> accessed.

We're talking about Unix.

We're not talking about CP/M, DOS, RSX-11m, Apple-SOS, etc.

--
Grant Edwards   grant.b.edwards at gmail.com
Yow! I put aside my copy of "BOWLING WORLD" and think about GUN CONTROL
legislation...

Grant Edwards

Oct 15, 2010, 4:07:37 PM
On 2010-10-15, Seebs <usenet...@seebs.net> wrote:
> On 2010-10-15, Grant Edwards <inv...@invalid.invalid> wrote:

>> Yes, all of the Unix syscalls use NULL-terminated path parameters
>> (AKA "C strings"). What I don't know is whether the underlying
>> filesystem code also uses NULL-terminated strings for filenames or if
>> they have explicit lengths. If the latter, there might be some way
>> to bypass the normal Unix syscalls and actually create a file with a
>> NULL in its name -- a file that then couldn't be accessed via the
>> normal Unix system calls. My _guess_ is that the underlying
>> filesystem code in most all Unices also uses NULL-terminated strings,
>> but I haven't looked yet.
>
> There's some dire magic there. The classic V7 or so filesystem had
> 16-byte file names which were null terminated unless they were 16
> characters, in which case they weren't but were still only 16
> characters. Apart from that, though, so far as I know everything is
> always null terminated.

I've just verified in the Linux sources that filenames passed to the Linux
syscall API (as opposed to the C library API) are indeed null
terminated. Even if they're not stored as null-terminated strings in
the actual filesystem data-structures, there's no way to get a null
byte in there using standard syscalls. I have found a few places in
the filesystem code where there's a structure field that seems to be a
"name length", but it's not obvious at first glance if that's a file
name.

> The weird special case is slashes; you can never have a slash in a
> file name, but at least one NFS implementation was able to create
> file names containing slashes, and if you had a Mac client (where
> slash was valid in file names), it could then create files with names
> that you could never use on the Unix side, because the path
> resolution code kept trying to find directories instead.

Fun!

> This was, worse yet, common, because so many people used "mm/dd/yy"
> in file names! Later implementations changed to silently translating
> between colons and slashes. (I think this still happened under the
> hood in at least some OS X, because the HFS filesystem really uses
> colons somewhere down in there.)
>
> ... But so far as I know, there's never been a Unix-type system where
> it was actually possible to get a null byte into a file name. Spaces,
> newlines, sure.

And even more fun, backspaces and ANSI escape sequences. :)

Back in the day when everybody was sitting at a terminal (or at least
an xterm), you could confuse somebody for days with judicious use of
filenames containing escape sequences. Not that I'd ever do such a
thing.

> Slashes, under rare and buggy circumstances. But I've never heard of
> a null byte in a file name.

Nor I, which is why I was confused by the statement that in the "Unix
world" a lot of programs misbehaved when presented with files whose
names contained a null byte.

--
Grant Edwards   grant.b.edwards at gmail.com
Yow! I want my nose in lights!

Grant Edwards

Oct 15, 2010, 4:10:37 PM
On 2010-10-15, Chris Torek <nos...@torek.net> wrote:
>>> On 2010-10-15, Grant Edwards <inv...@invalid.invalid> wrote:
>>>> How do you create a [Unix] file with a name that contains a NULL byte?
>
>>On 2010-10-15, Seebs <usenet...@seebs.net> wrote:
>>> So far as I know, in canonical Unix, you don't -- the syscalls all work
>>> with something like C strings under the hood, meaning that no matter what
>>> path name you send, the first null byte actually terminates it.
>
> In article <i9a84m$rp9$1...@reader1.panix.com> Grant Edwards <inv...@invalid.invalid> wrote:
>
>>Yes, all of the Unix syscalls use NULL-terminated path parameters (AKA
>>"C strings"). What I don't know is whether the underlying filesystem
>>code also uses NULL-terminated strings for filenames or if they have
>>explicit lengths. If the latter, there might be some way to bypass
>>the normal Unix syscalls and actually create a file with a NULL in its
>>name -- a file that then couldn't be accessed via the normal Unix
>>system calls. My _guess_ is that the underlying filesystem code in
>>most all Unices also uses NULL-terminated strings, but I haven't
>>looked yet.
>
> Multiple common on-disk formats (BSD's UFS variants and Linux's EXTs,
> for instance) use counted strings, so it is possible -- via disk
> corruption or similar -- to get "impossible" file names (those
> containing either an embedded NUL or an embedded '/').

That appeared it might be the case after a quick browsing of some of
the fs source code, but I wasn't sure.

> More notoriously, earlier versions of NFS could create files with
> embedded slashes when serving non-Unix clients. These were easily
> removed with the same non-Unix client, but not on the server! :-)
>
> None of this has anything to do with the original problem,

No, we left that track miles back...

--
Grant Edwards   grant.b.edwards at gmail.com
Yow! I'm having an emotional outburst!!

Martin Gregorie

Oct 15, 2010, 7:02:15 PM
On Fri, 15 Oct 2010 19:59:13 +0000, Grant Edwards wrote:

>
> We're talking about Unix.
> We're not talking about CP/M, DOS, RSX-11m, Apple-SOS, etc.
>

That's just your assumption. Track back up the thread and you'll see that
the OP didn't mention an OS. He merely said that he was using zlib, and
getting unfortunate results when he handled its output, so he could have
been using any OS.

Rhodri James assumed that the OS was Windows, but it wasn't until the
6th post that Steven D'Aprano mentioned Unix and null characters.

I got sucked into the null trap - sorry - because I actually meant to
generalise the discussion into ways of getting a range of unwanted
characters into a file name and why it's unwise to use a filename without
checking it for characters the OS doesn't like before creating a file
with it.
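
A check of that kind could be sketched as follows. The rules are an assumption chosen for a Unix-like system (other systems forbid different characters), and problems_with is a hypothetical helper, not part of any library.

```python
def problems_with(name):
    """List reasons a single path component is unsafe on a Unix-like system."""
    problems = []
    if name in ("", ".", ".."):
        problems.append("reserved or empty name")
    if "\0" in name:
        problems.append("contains a NUL byte")
    if "/" in name:
        problems.append("contains a path separator")
    # Newlines, backspaces, and escape characters are legal on Unix
    # but confuse terminals and line-oriented tools.
    if any(ord(ch) < 32 or ord(ch) == 127 for ch in name):
        problems.append("contains control characters")
    return problems

for candidate in ("report.txt", "10/15/10", "two\nlines", "esc\x1b[2J"):
    print(repr(candidate), problems_with(candidate))
```

Calling such a function before creating the file turns an unusable or undeletable name into an ordinary validation error.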

Grant Edwards

Oct 16, 2010, 12:46:18 AM
On 2010-10-15, Martin Gregorie <mar...@address-in-sig.invalid> wrote:
> On Fri, 15 Oct 2010 19:59:13 +0000, Grant Edwards wrote:
>
>>
>> We're talking about Unix.
>> We're not talking about CP/M, DOS, RSX-11m, Apple-SOS, etc.
>
> That's just your assumption.

If you go back and look at my original posting in this thread, here's
what I was replying to:

In the Unix world, which includes OS X, text tools tend to have
difficulty with tabs. Or try naming a file with a newline or
carriage return in the file name, or a NULL byte. "Works fine" is
not how I would describe it.

I think that was pretty much the only quoted text in my posting, and
my question about how to create such a file was immediately below that
paragraph, so I'm surprised that somebody would infer I was replying
to something else.



> Track back up the thread and you'll see that the OP didn't mention an
> OS.

True, but I wasn't replying to the OP. I was replying to a statement
about how applications "in the Unix world" behave when presented with
a filename containing a null byte. I thought it obvious that my
question was regarding "the unix world".

> He merely said that he was using zlib, and getting unfortunate
> results when he handled its output, so he could have been using any
> OS.
>
> Rhodri James assumed that the OS was Windows, but it wasn't until the
> 6th post that Steven D'Aprano mentioned Unix and null characters.

And it was Steven D'Aprano's post to which I replied with a
question about how such a file could be created.

> I got sucked into the null trap - sorry - because I actually meant to
> generalise the discussion into ways of getting a range of unwanted
> characters into a file name and why it's unwise to use a filename
> without checking it for characters the OS doesn't like before
> creating a file with it.

I'm not disagreeing that in Unix you can create filenames with all
sorts of inadvisable properties. 30 years ago I found backspaces and
cursor control escape sequences to be particularly amusing the first time
I realized you could put them in filenames (that was under VMS, but
you could do the same thing under Unix or most of the other OSes I've
used).

--
Grant

Steven D'Aprano

Oct 16, 2010, 2:56:58 AM
On Fri, 15 Oct 2010 20:07:37 +0000, Grant Edwards wrote:

> Nor I, which is why I was confused by the statement that in the "Unix
> world" a lot of programs misbehaved when presented with files whose
> names contained a null byte.

That's not what I said. I said, TRY to create a file with a null byte in
the name, the implication being that you *can't do it*.

Okay, I accept that I muddied the water by also mentioning file names
with newlines in them, which can be created but cause havoc. But the
distinction was clear in my head, and if I can't expect people to read my
mind, then the terrorists will have won!


--
Steven

Grant Edwards

Oct 16, 2010, 10:13:35 AM
On 2010-10-16, Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
> On Fri, 15 Oct 2010 20:07:37 +0000, Grant Edwards wrote:
>
>> Nor I, which is why I was confused by the statement that in the "Unix
>> world" a lot of programs misbehaved when presented with files whose
>> names contained a null byte.
>
> That's not what I said. I said, TRY to create a file with a null byte in
> the name, the implication being that you *can't do it*.

You're right. I missed that.

> Okay, I accept that I muddied the water by also mentioning file names
> with newlines in them, which can be created but cause havoc. But the
> distinction was clear in my head, and if I can't expect people to
> read my mind, then the terrorists will have won!

No, you were clear enough, I just misread it as Unix programs don't
deal well with filenames containing newlines or null bytes.

It was all much ado about nothing -- except I've learned that some of
the more common underlying Unix filesystems _do_ allow null bytes in
filenames, but the intervening API won't.

--
Grant

