I installed a brand new Mandrake Linux and noticed something strange when
using the shebang in a Python script.
$ which python
/usr/bin/python
$ python
...
>>>
$ /usr/bin/python
...
>>>
myscript.py:
----------
#!/usr/bin/python
...
----------
$ chmod +x myscript.py
$ ./myscript.py
error : interpreter not found. (!!!)
The shebang works for #!/bin/sh (and others)
Did I miss something ?
Thanks in advance for any hint.
--Gilles
There's a very subtle bug (feature?) in bash (and maybe other shells?)
that will generate this error if the line is terminated with a CR/LF
pair instead of just a linefeed.
Your script came from a DOS/Windows system and the CR's weren't
stripped. If you copied using FTP make sure to use "text" mode. If you
extracted a zip archive using the "unzip" command use "-a" to convert
text files.
Van
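One way to check for the condition described above is to look at the raw bytes of the first line. This is only a minimal sketch; the function name is made up for illustration, and the file has to be read in binary mode so that the CR is not translated away.

```python
def shebang_has_cr(data: bytes) -> bool:
    """Return True if the first line is a shebang line that ends in CR LF."""
    # Everything before the first linefeed is the first line.
    first_line, _, _ = data.partition(b"\n")
    return first_line.startswith(b"#!") and first_line.endswith(b"\r")


# Hypothetical usage with the script from the original post:
#     with open("myscript.py", "rb") as f:
#         print(shebang_has_cr(f.read(256)))
```

A True result means the interpreter name the system sees ends in an invisible CR.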
Ensure your very first line _is_ exactly the one you _see_ , i.e.
`#!/usr/bin/python\n', without any additional invisible character;
trailing `\x00\n', for instance, would make you believe your script
runs fine but does nothing.
I bet bash's nerves will be soothed if you delete this line and then quietly
retype it.
This script is a copy from a FAT32 (Windows) partition. Emacs silently kept
saving this file with DOS-style line endings.
A "dos2unix" fixed this.
--Gilles
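For anyone without "dos2unix" at hand, the same conversion can be sketched in a few lines of Python. This is not the dos2unix program itself, just a rough stand-in; the function names are invented here, and lone CRs (old Mac files) are deliberately left untouched.

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CR LF pair with a single LF; lone CRs are left alone."""
    return data.replace(b"\r\n", b"\n")


def convert_file(path: str) -> bool:
    """Rewrite the file at `path` in place; return True if it was changed."""
    with open(path, "rb") as f:      # binary mode: no newline translation
        original = f.read()
    converted = crlf_to_lf(original)
    if converted != original:
        with open(path, "wb") as f:
            f.write(converted)
    return converted != original
```

Calling convert_file("myscript.py") would then play the role of the dos2unix run mentioned above.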
> This script is a copy from a FAT32 (Windows) partition. Emacs silently kept
> saving this file with DOS-style line endings.
> A "dos2unix" fixed this.
Not silently. You'll be seeing a "(DOS)" in your emacs mode line if you edit
a file in DOS mode.
Ganesan
--
Ganesan R (rganesan at debian dot org) | http://www.debian.org/~rganesan/
1024D/5D8C12EA, fingerprint F361 84F1 8D82 32E7 1832 6798 15E0 02BA 5D8C 12EA
> There's a very subtle bug (feature?) in bash (and maybe other shells?)
> that will generate this error if the line is terminated with a CR/LF
> pair instead of just a linefeed.
Yes, it's common to other shells. It's not a feature, but it's not a
"bug" per se -- using CR LF terminated text files on Unix is operator
error.
--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
__ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
/ \ It's soulful music. It doesn't necessarily sound like ... soul ...
\__/ Sade Adu
Well guess what, it happens. It even happens when the "operator" is
aware of the problem. So let's not call it a bug, and instead call it
poor programming because the error message is not only incorrect, but
will also waste a fair amount of time of someone trying to debug the
problem because it points them in the wrong direction.
> Well guess what, it happens.
The other platforms don't handle different end-of-line sequences when
the operating system counts on it, either. This really has nothing
specific to do with Unix.
--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
__ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
/ \ Success and failure are equally disastrous.
\__/ Tennessee Williams
It's understandable once you realize that the shell
thinks the '\r' is part of the filename. Just like
os.execv ('/usr/bin/python\r', ('myfile.py',))
Regards. Mel.
It may be the kernel rather than the shell...
--
Greg Ewing, Computer Science Dept,
University of Canterbury,
Christchurch, New Zealand
http://www.cosc.canterbury.ac.nz/~greg
IMHO, it's "buggy" :
* most modern Unix/Windows/Mac OS tools accept any end-of-line mark, as
Python does.
* the error message points the user in the wrong direction.
--Gilles
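To illustrate the first point with Python itself: in current Python 3, text-mode reading with newline=None translates all three conventions to '\n'. The sketch below uses an in-memory buffer instead of a real file, and the helper name is only for illustration.

```python
import io


def read_lines(data: bytes) -> list:
    """Decode `data` as text with universal newline translation."""
    # newline=None makes the text layer turn '\r\n' and '\r' into '\n'.
    text = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=None)
    return text.read().split("\n")


# One line in each convention: Unix (LF), DOS (CR LF), old Mac (CR).
mixed = b"unix\ndos\r\nmac\rlast"
print(read_lines(mixed))
```

The shebang line never gets this treatment, since it is read by the system before Python starts.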
So fix it. Something like this should do the trick:
$ su -
Password: 1234
% for d in /bin /usr/bin /usr/local/bin; do cd $d; for i in *; do ln -s $i `printf '%s\r' $i`; done; done
% exit
Jeff
But how many people use \r at the end of filenames? Or are even aware
that they can?
Even if it isn't a bug, it's a feature that causes more harm than
good.
You seem to be under the impression that this is some "process the \r at
the end of the filename" feature. It isn't. The kernel will treat
everything from the shebang to the linefeed as the command-line to be
used; there's no special "feature" specifically spotting a rogue
character and tripping the foolish user up.
This is simple, known, documented behaviour. If other systems place
foreign characters in the shebang line, it's up to the *user* to know
that; the kernel does what it's told. I certainly don't want the kernel
having special-case, workaround code for line-ending confusions that are
nothing to do with it.
When writing shell scripts, there are many things to learn; line endings
is but one of them. When moving files between operating systems, there
are many things to learn; the differences in line endings is but one of
them.
It's not the job of the kernel to protect the user from herself. That's
the job of userspace programs -- or meatspace processes :-)
--
\ "A man may be a fool and not know it -- but not if he is |
`\ married." -- Henry L. Mencken |
_o__) |
http://bignose.squidly.org/ 9CFE12B0 791A4267 887F520C B7AC2E51 BD41714B
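The behaviour described above can be modelled roughly as follows. This is a simplified sketch, not actual kernel code: it assumes that only the linefeed ends the line and that only spaces and tabs count as whitespace, and it ignores length limits and per-system differences in argument handling. The function name is invented for illustration.

```python
def parse_shebang(first_bytes: bytes):
    """Return (interpreter, optional_arg) from a script's leading bytes, or None.

    Simplified model: the line ends at the first linefeed, and only spaces
    and tabs are trimmed, so a carriage return stays in the result.
    """
    if not first_bytes.startswith(b"#!"):
        return None
    line = first_bytes[2:].split(b"\n", 1)[0].strip(b" \t")
    if not line:
        return None
    # Split at the first space or tab by hand; bytes.split() with no
    # separator would also split on CR and hide the problem.
    for i, byte in enumerate(line):
        if byte in b" \t":
            return line[:i], line[i:].strip(b" \t")
    return line, None


print(parse_shebang(b"#!/usr/bin/python\r\nprint(1)\r\n"))
```

With a CR LF line the interpreter comes out as b'/usr/bin/python\r', a file that does not exist, hence the "interpreter not found" message.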
> But how many people use \r at the end of filenames? Or are even aware
> that they can?
>
> Even if it isn't a bug, it's a feature that causes more harm than
> good.
It's simply an end-of-line issue. The "bug" here is that DOS chose to
use CR LF as the end-of-line terminator. (Mac gets even fewer points,
since it deliberately chose to do something even more different.)
This has nothing to do with Unix, it's an inherent difference between
platforms. The platforms are not the same; if you pretend like they are
then you'll continually run into problems.
--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
__ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
/ \ I go out with actresses because I'm not apt to marry one.
\__/ Henry Kissinger
>It's simply an end-of-line issue. The "bug" here is that DOS chose to
>use CR LF as the end-of-line terminator. (Mac gets even fewer points,
>since it deliberately chose to do something even more different.)
>This has nothing to do with Unix, it's an inherent difference between
>platforms. The platforms are not the same; if you pretend like they are
>then you'll continually run into problems.
From another angle, I've been using the program ws-ftp to
move files between M$ systems and *nix systems, and it fixes
all the line-end problems for me. I only get bitten now if I
move files another way.
Regards. Mel.
All FTP programs have considerations for encoding translations,
from simple ones such as ASCII and BINARY/IMAGE, to others
more obscure such as TENEX, EBCDIC, etc.
-gustavo
> From another angle, I've been using the program ws-ftp to
> move files between M$ systems and *nix systems, and it fixes
> all the line-end problems for me. I only get bitten now if I
> move files another way.
Any FTP program will work, if you transfer them in ASCII mode.
--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
__ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
/ \ I just want to create my own life. I can just be me.
\__/ Ekene Nwokoye
That's debatable. At least the Mac still only uses *one*
character for each end-of-line...
> Erik Max Francis wrote:
>
> > (Mac gets even fewer points,
> > since it deliberately chose to do something even more different.)
>
> That's debatable. At least the Mac still only uses *one*
> character for each end-of-line...
At the time the Mac was created, line endings of LF (Unix) and CR LF
(CP/M, DOS) were common. The only reason you'd choose CR is to do
something different. Their implementation of data forks vs. resource
forks is another example of Apple doing something different simply for
its own sake, which introduces massive interplatform compatibility
problems.
--
Erik Max Francis && m...@alcyone.com && http://www.alcyone.com/max/
__ San Jose, CA, USA && 37 20 N 121 53 W && &tSftDotIotE
/ \ The doors of Heaven and Hell are adjacent and identical.
\__/ Nikos Kazantzakis
That's all fine, but the Mac does exhibit one problem I can't fathom:
If you have two or more Python installations, the first one in
your path gets invoked no matter what the shebang line says.
If the first line of a script is #!/usr/local/bin/python, I expect the
interpreter located in /usr/local/bin to execute the script, not the
one in /usr/bin, or the one in /sw/bin, but that is what you get if
you run the script as an executable.
The process list shows why - python is called without a path, e.g. as
"python". The same behavior occurs if the shell is bash or tcsh.
As far as I know, OS X is the only "modern" Unix to behave this way.
This behavior caused me a large amount of angst until I figured out
what the problem was. I had to resort to writing a script that
renames the pythons in /usr/bin/ and /sw/bin when I don't want them
and renames them back if and when I do.
I mostly use Python 2.3 from CVS, but I haven't been able to get
mod_python working with a Framework build yet, so I need to switch
to Fink's python once in a while.
>"Greg Ewing (using news.cis.dfn.de)" wrote:
>
>> Erik Max Francis wrote:
>>
>> > (Mac gets even fewer points,
>> > since it deliberately chose to do something even more different.)
>>
>> That's debatable. At least the Mac still only uses *one*
>> character for each end-of-line...
>
>At the time the Mac was created, line endings of LF (Unix) and CR LF
>(CP/M, DOS) were common. The only reason you'd choose CR is to do
>something different. Their implementation of data forks vs. resource
>forks is another example of Apple doing something different simply for
>its own sake, which introduces massive interplatform compatibility
>problems.
I'm not sure it's entirely fair to jump to the conclusion that Apple chose CR
as line end only to be different.
My interpretation is that going from CRLF to a single character signalled
a switch from hardware-control semantics to symbolic semantics. I.e., CR and LF
originally literally referred to printing hardware with a physical "carriage" like
a typewriter or -- keeping paper handling stationary -- a moving print head,
and line feeding actually fed paper a line at a time. They're still used when
dealing with devices and emulated/simulated devices, of course.
The CR is what you get when you hit the Enter key, so Apple did the most direct thing
in using that key code as an EOL symbol. Perhaps they thought that was "cleaner" and
that they would lead the way to a cleaner standard way of doing things when they
achieved market dominance ;-)
Of course, LF has (IMO) a better semantic relationship to the EOL meaning, so translating
the Enter key to LF seems better than CR. Either way, ISTM yet another symptom of what happens
when the hardware-oriented evolves towards the abstract. You wind up with vestigial hardware
semantics in abstract contexts where they don't really belong, e.g., as in one of my pet peeves:
drive letters in file paths.
Regards,
Bengt Richter
Tru64 (5.1) also shows this behavior (which recently bit me too), but
it's arguably a bug in Python rather than in the OS. If you look
carefully, I think you'll find that the correct binary (e.g.,
/usr/local/bin/python) is in fact being invoked, but that that binary
then uses the libraries associated with the first python in your PATH.
The reason this is happening is that python determines where all of
its libraries live by examining argv[0], if a more suitable method is
not available. If this gives the full path, everything is fine, but
if only the basename is given ("python"), then the startup code walks
the PATH to guess. As you've noticed, in some cases, this guess is
wrong.
Mike
--
Mike Coleman, Scientific Programmer, +1 816 926 4419
Stowers Institute for Biomedical Research
1000 E. 50th St., Kansas City, MO 64110
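The guessing described above can be sketched like this. It is a rough model only, not the interpreter's actual startup code: the function names are invented, and the lookup merely checks that a file exists rather than that it is executable.

```python
import os
import tempfile


def guess_interpreter(argv0: str, path: str):
    """Guess the interpreter's location from argv[0] (simplified model).

    A name containing a directory separator is taken as given; a bare name
    is looked up in each PATH directory in order, and the first hit wins.
    """
    if os.sep in argv0:
        return os.path.abspath(argv0)
    for directory in path.split(os.pathsep):
        candidate = os.path.join(directory or os.curdir, argv0)
        if os.path.isfile(candidate):
            return candidate
    return None


def demo():
    """Two fake installations; the one meant by the shebang is second in PATH."""
    with tempfile.TemporaryDirectory() as root:
        first = os.path.join(root, "usr_bin")
        second = os.path.join(root, "usr_local_bin")
        for d in (first, second):
            os.mkdir(d)
            open(os.path.join(d, "python"), "w").close()
        search_path = os.pathsep.join([first, second])
        wanted = os.path.join(second, "python")
        by_full_path = guess_interpreter(wanted, search_path)
        by_basename = guess_interpreter("python", search_path)
        return (by_full_path == os.path.abspath(wanted),
                by_basename == os.path.join(first, "python"))
```

In the demo, the full path resolves to the intended installation, while the bare name "python" resolves to whichever copy comes first in the search path.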
I never use Tru64, but my company does, so I'm glad you identified
the same problem on that platform. argv[0] should contain the full
path to the interpreter and it does not, which makes me believe
this is an OS error, not a Python error, except you could argue that
relying on argv is not a platform independent way to find the
correct path.
If I could get the Panther install CD to boot on my PowerBook, I
could see if this is still going to be a problem in the future.
I think the terminology is not taken from typewriters, but from some
old printers where you needed both characters to start a new line.
CR moved the print head to the beginning of the line and LF moved the
paper one line. It can't be compared with a typewriter, where the
[Enter] key did both operations. The Microsoft (other operating
systems also had similar EOF) way is actually the "correct" way, since
the "cursor" needs to move down one line and start at the beginning.
The Unix way is of course more elegant, because you have a digital
computer and not some mechanical device. It doesn't matter if it's CR
or LF, because each character does only half of the operation. Apple
should have chosen LF to preserve compatibility.
CR stands for "carriage return". If you're talking about a print head
moving across the paper, you're no longer talking about a carriage
"returning", so the terminology obviously didn't come from electric
printers.
Carriage Return is a direct reference to the paper carriage on a manual
typewriter. These predate electric printing machines, and thus the
terminology was borrowed when teletypes needed control codes to control
their print head.
On such typewriters, the "line feed" function was also separate; once
the carriage was returned to the start of the line, one could cause
the paper to feed up a line at a time to introduce more vertical space;
this didn't affect the position of the paper carriage, so was
conceptually a separate operation.
So, it was teletypes that needlessly preserved the CR and LF as separate
control operations, due to the typewriter-based thinking of their
designers. If they'd been combined into the one operation, we would
have all the same functionality but none of the confusion over line
ending controls.
--
\ "I installed a skylight in my apartment. The people who live |
`\ above me are furious!" -- Steven Wright |
> CR moved the print head to the beginning of the line and LF moved the
> paper one line. It can't be compared with a typewriter, where the
> [Enter] key did both operations.
What [Enter] key? In a *proper* typewriter the act of ending one line and
starting another was effected by using the "carriage return lever", which
physically moved the platen back to the left margin, and incidentally also
fed the paper through the platen. Most typewriters could be set to feed one,
one-and-a-half or two lines. Remember, in these devices the printing
position was fixed (no print head) and the paper had to be moved along each
time a character was printed.
> The Microsoft (other operating systems also had similar EOF)
EOF means end of file, usually. You seem to be a bit confused. Don't worry,
me too.
> way is actually the "correct" way, since
> the "cursor" needs to move down one line and start at the beginning.
Ah, so Microsoft are "correct" because they choose a system that corresponds
to a typewriting device you don't understand and are probably too young to
remember. I see.
> The Unix way is of course more elegant, because you have a digital
> computer and not some mechanical device. It doesn't matter if it's CR
> or LF, because each character does only half of the operation. Apple
> should have chosen LF to preserve compatibility.
For that matter it might just as well have been ESC or any other arbitrary
character value - clearly a single character will suffice to delimit a line.
I fail to see why Apple should have chosen LF to preserve compatibility with
Unix if Microsoft are "correct". But then I'm just a crotchety old fartbot,
and you're just a mad surfer.
mess-with-old-farts-at-your-peril-ly y'rs - steve
--
Steve Holden http://www.holdenweb.com/
Python Web Programming http://pydish.holdenweb.com/pwp/
> CR stands for "carriage return". If you're talking about a print head
> moving across the paper, you're no longer talking about a carriage
> "returning", so the terminology obviously didn't come from electric
> printers.
>
> Carriage Return is a direct reference to the paper carriage on a manual
> typewriter. These predate electric printing machines, and thus the
> terminology was borrowed when teletypes needed control codes to control
> their print head.
let's see: the first typewriters arrived in 1873 or so, and the baudot tele-
printer was patented in 1874. sounds like parallel development to me.
the first electrical teletypes were, as far as I can tell, modified typewriters.
> So, it was teletypes that needlessly preserved the CR and LF as separate
> control operations, due to the typewriter-based thinking of their designers.
I think you're seriously underestimating the effort it took to build an
electromechanical telegraph machine at the very beginning of the 20th
century.
</F>
Although in actual fact the KSR33 teletype did need a fifth of a second to
guarantee that the print head would have returned to the left margin from
column 72 [...] characters was a "feature". Sometimes you would (all right, *I*
would) depress the two keys in the wrong order, and the result was that you
would see a single character printed in the middle of the new line during
the "flyback" period.
mobile-mine-of-useless-information-ly y'rs - steve
Further highlighting the foolishness of keeping them as separate
operations. If they interacted in this non-intuitive and damaging way,
the "go to the start of the next line" should have been a single
transaction for the user, with the implementation deciding how to carry
it out.
--
\ "I spent all my money on a FAX machine. Now I can only FAX |
`\ collect." -- Steven Wright |
Having CR/LF terminate a line makes reasonable sense for a teletype,
or for a simple printer. Effects such as underline, strikethrough, and
double-print (aka boldface) are accomplished by outputting new
characters on the same line, which can be accomplished by outputting a
CR without a LF.
Similarly, this places double-spacing (two LFs per CR) in the hands of
the controlling machine, rather than the (presumably dumb) slave
device.
One might argue that special control characters would be more
appropriate for these uncommon cases. Then the more common case would
take only a single character, and the special cases two. Most of the
manual typewriters I've used employ a variation on this scheme: a
single "global variable" mechanical switch controls double or single
spacing, and you can overtype only by executing a CR/LF and then
unrolling the LF.
The CR/LF model does have a simpler implementation, though: the
problem of managing linefeeds is passed up to the application. And, as
we know, worse is better (http://www.jwz.org/doc/worse-is-better.html).
--G.
--
Geoff Gerrietts "A little sincerity is a dangerous thing,
geoff @ gerrietts.net and a great deal of it is absolutely fatal."
http://www.gerrietts.net/ --Oscar Wilde
What I am trying to remember is how the Friden flexowriter that was connected
to the LGP-30 I once coded for worked re CR/LF. It definitely had a moving carriage
and moved like a typewriter, but ISTR that one key would do the CRLF. But there was
more than one model, and I suspect there was one that had separate CR/LF codes/functions.
Maybe one we used with a PDP-8 later ;-)
Regards,
Bengt Richter
>So, it was teletypes that needlessly preserved the CR and LF as separate
>control operations, due to the typewriter-based thinking of their
>designers. If they'd been combined into the one operation, we would
>have all the same functionality but none of the confusion over line
>ending controls.
It wasn't needless. A CR with no LF was often used for overstriking,
as a way of extending the rather limited character set. 'O'
overstruck with '-' would make a passable \Theta, for instance.
Overstriking is amply catered for with the BS (BackSpace) control code.
For the "theta" example: emit a 'O', emit a BS, emit a '-'. In fact,
this method continues today in some *roff outputs, for bold or digraph
characters. The CR is completely redundant for this purpose.
I maintain that the CR and LF were needlessly preserved as separate
operations, with no benefit.
--
\ "Yesterday I parked my car in a tow-away zone. When I came back |
`\ the entire area was missing." -- Steven Wright |
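The two overstriking methods being argued about can be compared with a toy model of a single print line. This sketch ignores line feeds and all timing questions, and the function name is made up for illustration.

```python
def strike(stream: str) -> list:
    """Return, for each column, the characters struck there, in order."""
    columns = []
    position = 0
    for ch in stream:
        if ch == "\r":        # carriage return: back to column 0, same line
            position = 0
        elif ch == "\b":      # backspace: one column to the left
            position = max(position - 1, 0)
        else:
            while len(columns) <= position:
                columns.append([])
            columns[position].append(ch)
            position += 1
    return columns


# The "theta" example, once with CR and once with BS.
print(strike("O\r-"))
print(strike("O\b-"))
```

On paper both streams end up the same; the difference is only in how far the mechanism has to travel.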
Yes, I was thinking about teletypewriter, but couldn't remember the
name.
> EOF means end of file, usually. You seem to be a bit confused. Don't worry,
> me too.
>
Not confused, just a typo :-(
> > way is actually the "correct" way, since
> > the "cursor" needs to move down one line and start at the beginning.
>
> Ah, so Microsoft are "correct" because they choose a system that corresponds
> to a typewriting device you don't understand and are probably too young to
> remember. I see.
>
I was told this by a person I assumed had knowledge of this (he had
used these devices). I should learn to do some research before making
such statements in public.
> > The Unix way is of course more elegant, because you have a digital
> > computer and not some mechanical device. It doesn't matter if it's CR
> > or LF, because each character does only half of the operation. Apple
> > should have chosen LF to preserve compatibility.
>
> For that matter it might just as well have been ESC or any other arbitrary
> character value - clearly a single character will suffice to delimit a line.
> I fail to see why Apple should have chosen LF to preserve compatibility with
> Unix if Microsoft are "correct". But then I'm just a crotchety old fartbot,
> and you're just a mad surfer.
>
I said that using just one character is more elegant, and I think
Apple was right when they chose a single character. I have no real
knowledge of the different line-endings that were used at the time,
but I assume that LF (alone) was more frequently used than CR (alone)
or other characters. If I'm right, then Apple's choice was less
compatible.
Doesn't really matter - I already told you I'm a crotchety old fartbot :-)
So, FWIW, I agree with you that a single character makes hugely more sense
than CRLF. Apple are well-known for making self-limiting decisions :-)
regards
Although the first is correct (using BS to get the effect), it implies
that the printer/typewriter on which you want to produce this effect is
able to backspace _precisely_, whereas the CR method can be made simple
and easy with a mechanical(?) stopper that ensures that you always
start at the same point.
>
> I maintain that the CR and LF were needlessly preserved as separate
> operations, with no benefit.
>
--
Med Venlig Hilsen / Regards
Kim Petersen - Kyborg A/S (Udvikling)
IT - Innovationshuset
Havneparken 2
7100 Vejle
Tlf. +4576408183 || Fax. +4576408188
I disagree. IIRC there were plenty of printers where the carriage
return operation took a disproportionately long time, so therefore
backspacing was the preferred method for overstriking, but at the same
time that meant line feeds were a big performance win for smart
applications.
I still have some saved "graphics" output from my NEC Spinwriter
generated by printing a dot "." combined with micro-linefeeds. Ah, the
joys of doing "screenshots" of UCSD Pascal turtlegraphics on an Apple
II... hmmm, think I'm dating myself?
Van
Fair enough then; the operations were kept separate for a reason
justifiable at the time.
Sadly, now we have to live with the legacy of the resulting confusion,
long after those benefits are obsolete.
--
\ "Here is a test to see if your mission on earth is finished. If |
`\ you are alive, it isn't." -- Francis Bacon |