This morning I was working through Building Skills in Python and was
having problems with string.strip.
Then I found that the input file I was using was in DOS format, and I
thought it would be best to convert it to UNIX, so I started to type perl
-i -pe 's/ and then I thought: wait, I'm learning Python, I have to
think in Python. As I'm a Python newbie, I fired up Google and typed:
+python convert dos to unix +one +liner
Found perl, sed, awk but no python on the first page
So I tried
+python dos2unix +one +liner -perl
Same thing..
But then I found http://wiki.python.org/moin/Powerful%20Python%20One-Liners
and tried this:
cat file.dos | python -c "import sys,re;
[sys.stdout.write(re.compile('\r\n').sub('\n', line)) for line in
sys.stdin]" >file.unix
And it works..
[10:31:11 incc-imac-intel ~/python] cat -vet file.dos
one^M$
two^M$
three^M$
[10:32:10 incc-imac-intel ~/python] cat -vet file.unix
one$
two$
three$
But it is long and just like sed does not do it in place.
Is there a better way in Python or is this kind of thing best done in
Perl ?
Thanks,
Jerry
- A one-line throwaway script with regexes as built-in syntax: yup, that
sounds like Perl to me.
- What is wrong with making an executable script (not limited to one line)
and calling that? Invoking it is even shorter.
- ... wait a minute, you are building something in python (problem with
string.strip - why don't you use the built-in string strip method
instead?) which barfs on the input (win/unix line ending), should the
actual solution not be in there, i.e. parsing the line first to check
for line-endings? .. But wait another minute, why are you getting \r\n
in the first place, python by default uses universal new lines?
Hope that helps a bit. Maybe you could post the part of the code you
are working on, to get some better suggestions.
--
mph
> But then I found
> http://wiki.python.org/moin/Powerful%20Python%20One-Liners
> and tried this:
>
> cat file.dos | python -c "import sys,re;
> [sys.stdout.write(re.compile('\r\n').sub('\n', line)) for line in
> sys.stdin]" >file.unix
>
> And it works..
- Don't build list comprehensions just to throw them away, use a for-loop
instead.
- You can often use string methods instead of regular expressions. In this
case line.replace("\r\n", "\n").
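The two points above can be combined into a short sketch. It is written for Python 3 rather than the Python 2 used elsewhere in this thread, and the function name `dos2unix_lines` and the sample data are made up for illustration.

```python
import io

def dos2unix_lines(lines, out):
    """Write each line to `out` with CRLF endings replaced by LF."""
    # A plain for-loop: no throwaway list is built, and str.replace
    # handles the literal substitution without a regex.
    for line in lines:
        out.write(line.replace("\r\n", "\n"))

# Demonstration on in-memory data (illustrative sample lines).
sample = ["one\r\n", "two\r\n", "three\r\n"]
dst = io.StringIO()
dos2unix_lines(sample, dst)
```

Any iterable of lines and any object with a write() method will do, so the same function could be fed a file opened without newline translation instead of the sample list.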
> But it is long and just like sed does not do it in place.
>
> Is there a better way in Python or is this kind of thing best done in
> Perl ?
open(..., "U") ("universal" mode) converts arbitrary line endings to just
"\n"
$ cat -e file.dos
alpha^M$
beta^M$
gamma^M$
$ python -c'open("file.unix", "wb").writelines(open("file.dos", "U"))'
$ cat -e file.unix
alpha$
beta$
gamma$
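A note for readers on Python 3: recent releases no longer accept "U" in the mode string, but text mode translates line endings on read by default, so the same idea still applies. A minimal sketch, assuming the file is in the platform's default text encoding; the function name `convert_universal` is hypothetical.

```python
def convert_universal(src_path, dst_path):
    """Copy src_path to dst_path, normalising all line endings to LF."""
    # newline=None (the default) turns "\r\n" and bare "\r" into "\n" on read.
    # newline="\n" on the output side writes "\n" unchanged on every platform.
    with open(src_path, "r", newline=None) as src, \
         open(dst_path, "w", newline="\n") as dst:
        dst.writelines(src)
```

Called as convert_universal("file.dos", "file.unix"), this should mirror the one-liner above, with the caveat that bare "\r" characters are converted too.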
But still, if you want very short (and often cryptic) code Perl is hard to
beat. I'd say that Python doesn't even try.
Peter
> cat file.dos | python -c "import sys,re;
> [sys.stdout.write(re.compile('\r\n').sub('\n', line)) for line in
> sys.stdin]" >file.unix
Holy cow!!!!!!! Calling a regex just for a straight literal-to-literal
string replacement! You've been infected by too much Perl coding!
*wink*
Regexes are expensive, even in Perl, but more so in Python. When you
don't need the 30 pound sledgehammer of regexes, use lightweight string
methods.
import sys; sys.stdout.write(sys.stdin.read().replace('\r\n', '\n'))
ought to do it. It's not particularly short, but Python doesn't value
extreme brevity -- code golf isn't terribly exciting in Python.
[steve@sylar ~]$ cat -vet file.dos
one^M$
two^M$
three^M$
[steve@sylar ~]$ cat file.dos | python -c "import sys; sys.stdout.write(sys.stdin.read().replace('\r\n', '\n'))" > file.unix
[steve@sylar ~]$ cat -vet file.unix
one$
two$
three$
[steve@sylar ~]$
Works fine. Unfortunately it still doesn't work in-place, although I
think that's probably a side-effect of the shell, not Python. To do it in
place, I would pass the file name:
# Tested and working in the interactive interpreter.
import sys
filename = sys.argv[1]
text = open(filename, 'rb').read().replace('\r\n', '\n')
open(filename, 'wb').write(text)
Turning that into a one-liner isn't terribly useful or interesting, but
here we go:
python -c "import sys;open(sys.argv[1], 'wb').write(open(sys.argv[1],
'rb').read().replace('\r\n', '\n'))" file
Unfortunately, this does NOT work: I suspect it is because the file gets
opened for writing (and hence emptied) before it gets opened for reading.
Here's another attempt:
python -c "import sys;t=open(sys.argv[1], 'rb').read().replace('\r\n',
'\n');open(sys.argv[1], 'wb').write(t)" file
[steve@sylar ~]$ cp file.dos file.txt
[steve@sylar ~]$ python -c "import sys;t=open(sys.argv[1], 'rb').read().replace('\r\n', '\n');open(sys.argv[1], 'wb').write(t)" file.txt
[steve@sylar ~]$ cat -vet file.txt
one$
two$
three$
[steve@sylar ~]$
Success!
Of course, none of these one-liners are good practice. The best thing to
use is a dedicated utility, or write a proper script that has proper
error testing.
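As a rough idea of what such a "proper script" might look like, here is a sketch in Python 3. It keeps the read-everything-then-write approach from the one-liner and adds basic error reporting; the names `dos2unix_inplace` and `main` are illustrative, not from any existing tool.

```python
import sys

def dos2unix_inplace(filename):
    """Convert CRLF to LF in `filename`; return True if the file changed."""
    # Read everything first so the file is closed before it is reopened
    # for writing (opening with 'wb' truncates it).
    with open(filename, "rb") as f:
        data = f.read()
    converted = data.replace(b"\r\n", b"\n")
    if converted == data:
        return False  # nothing to do; leave the file untouched
    with open(filename, "wb") as f:
        f.write(converted)
    return True

def main(argv):
    """Convert each named file; return a shell-style exit status."""
    status = 0
    for filename in argv[1:]:
        try:
            dos2unix_inplace(filename)
        except OSError as err:
            # Report the failure and carry on with the remaining files.
            sys.stderr.write("%s: %s\n" % (filename, err))
            status = 1
    return status
```

To use it as a script, one would end the file with sys.exit(main(sys.argv)) under the usual if __name__ == "__main__": guard. The whole file is held in memory, so this suits small text files only.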
> Is there a better way in Python or is this kind of thing best done in
> Perl ?
If by "this kind of thing" you mean text processing, then no, Python is
perfectly capable of doing text processing. Regexes aren't as highly
optimized as in Perl, but they're more than good enough for when you
actually need a regex.
If you mean "code golf" and one-liners, then, yes, this is best done in
Perl :)
--
Steven
> On Sat, 27 Feb 2010 10:36:41 +0100, @ Rocteur CC wrote:
>
>> cat file.dos | python -c "import sys,re;
>> [sys.stdout.write(re.compile('\r\n').sub('\n', line)) for line in
>> sys.stdin]" >file.unix
>
> Holy cow!!!!!!! Calling a regex just for a straight literal-to-literal
> string replacement! You've been infected by too much Perl coding!
Thanks for the replies, I'm looking at them now. However, for those who
misunderstood, the above cat file.dos pipe python does not come from
Perl but comes from:
http://wiki.python.org/moin/Powerful%20Python%20One-Liners
> Apply regular expression to lines from stdin
> [another command] | python -c "import sys,re;
> [sys.stdout.write(re.compile('PATTERN').sub('SUBSTITUTION', line))
> for line in sys.stdin]"
Nothing to do with Perl. Perl only takes a handful of characters to do
this and certainly does not require the creation of an intermediate file.
I simply found the above example on wiki.python.org whilst searching
Google for a quick conversion solution.
Thanks again for the replies. I've learned a few things and I
appreciate your help.
Jerry
Perl may be better for you for throw-away code. Use Python for the code you want to keep (and read and understand later).
S
See:
http://partmaps.org/era/unix/award.html
Stefan
> Nothing to do with Perl, Perl only takes a handful of characters to do
> this and certainly does not require the creation of an intermediate file,
Are you sure about that?
Or does it just hide the intermediate file from you the way
that sed -i does?
--
Grant
Amusing how long those Python toes can be. In several replies I have
noticed (often clueless) opinions on Perl. When do people learn that a
language is just a tool to do a job?
--
John Bokma j3b
Hacking & Hiking in Mexico - http://johnbokma.com/
http://castleamber.com/ - Perl & Python Development
Steven is right with the "Holy Cow" and multiple exclamation marks.
For those unfamiliar with that, just google "multiple exclamation marks", I
think that should work... ;-)
Not only is a regular expression overkill & inefficient, but the snippet also
needlessly constructs a list whose size is the number of lines.
Consider instead e.g.
<hack>
import sys; sum(int(bool(sys.stdout.write(line.replace('\r\n','\n')))) for line
in sys.stdin)
</hack>
But better, consider that it's less work to save the code in a file than copying
and pasting it in a command interpreter, and then it doesn't need to be 1 line.
>> Apply regular expression to lines from stdin
>> [another command] | python -c "import
>> sys,re;[sys.stdout.write(re.compile('PATTERN').sub('SUBSTITUTION',
>> line)) for line in sys.stdin]"
>
>
> Nothing to do with Perl, Perl only takes a handful of characters to do
> this and certainly does not require the creation of an intermediate file, I
> simply found the above example on wiki.python.org whilst searching
> Google for a quick conversion solution.
>
> Thanks again for the replies I've learned a few things and I appreciate
> your help.
Cheers,
- Alf
> On 27 Feb 2010, at 12:44, Steven D'Aprano wrote:
>
>> On Sat, 27 Feb 2010 10:36:41 +0100, @ Rocteur CC wrote:
>>
>>> cat file.dos | python -c "import sys,re;
>>> [sys.stdout.write(re.compile('\r\n').sub('\n', line)) for line in
>>> sys.stdin]" >file.unix
>>
>> Holy cow!!!!!!! Calling a regex just for a straight literal-to-literal
>> string replacement! You've been infected by too much Perl coding!
>
> Thanks for the replies I'm looking at them now, however, for those who
> misunderstood, the above cat file.dos pipe python does not come from
> Perl but comes from:
>
> http://wiki.python.org/moin/Powerful%20Python%20One-Liners
Whether it comes from Larry Wall himself, or a Python wiki, using regexes
for a simple string replacement is like using an 80 lb sledgehammer to
crack a peanut.
>> Apply regular expression to lines from stdin [another command] | python
>> -c "import sys,re;
>> [sys.stdout.write(re.compile('PATTERN').sub('SUBSTITUTION', line)) for
>> line in sys.stdin]"
And if PATTERN is an actual regex, rather than just a simple substring,
that would be worthwhile. But if PATTERN is a literal string, then string
methods are much faster and use much less memory.
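Anyone who wants to check the speed claim on their own machine could use something like the following Python 3 sketch. The sample data and function names are made up for illustration, and no timings are claimed here because they vary by machine and Python version.

```python
import re
import timeit

text = "one\r\ntwo\r\nthree\r\n" * 1000  # illustrative sample data
crlf = re.compile("\r\n")

def with_regex(s):
    """Literal replacement done through the regex engine."""
    return crlf.sub("\n", s)

def with_replace(s):
    """The same replacement done with a plain string method."""
    return s.replace("\r\n", "\n")

# Both approaches produce the same result ...
same = with_regex(text) == with_replace(text)

# ... so the only question is cost.
t_regex = timeit.timeit(lambda: with_regex(text), number=100)
t_replace = timeit.timeit(lambda: with_replace(text), number=100)
print("regex: %.4fs  replace: %.4fs" % (t_regex, t_replace))
```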
> Nothing to do with Perl, Perl only takes a handful of characters to do
> this
I'm sure it does. If I were interested in code-golf, I'd be impressed.
> and certainly does not require the creation of an intermediate file,
The solution I gave you doesn't use an intermediate file either.
*slaps head and is enlightened*
Oh, I'm an idiot!
Since you're reading text files, there's no need to call
replace('\r\n','\n'). Since there shouldn't be any bare \r characters in
a DOS-style text file, just use replace('\r', '').
Of course, that's an unsafe assumption in the real world. But for a quick
and dirty one-liner (and all one-liners are quick and dirty), it should
be good enough.
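To make the "unsafe assumption" concrete, here is a small Python 3 sketch comparing the two replacements. The sample strings and function names are invented for illustration.

```python
dos_text = "one\r\ntwo\r\n"
odd_text = "progress 10%\rprogress 20%\r\n"  # contains a bare "\r"

def strip_cr(s):
    """Drop every carriage return, wherever it appears."""
    return s.replace("\r", "")

def crlf_to_lf(s):
    """Only touch carriage returns that are part of a CRLF pair."""
    return s.replace("\r\n", "\n")

# On a clean DOS-style file the two give identical output.
agree = strip_cr(dos_text) == crlf_to_lf(dos_text)

# With a bare "\r" they differ: strip_cr silently deletes it.
differ = strip_cr(odd_text) != crlf_to_lf(odd_text)
```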
--
Steven
> "sste...@gmail.com" <sste...@gmail.com> writes:
>
>> On Feb 27, 2010, at 10:01 AM, @ Rocteur CC wrote:
>>> Nothing to do with Perl, Perl only takes a handful of characters to
>>> do this and certainly does not require the creation of an intermediate
>>> file
>>
>> Perl may be better for you for throw-away code. Use Python for the
>> code you want to keep (and read and understand later).
>
> Amusing how long those Python toes can be. In several replies I have
> noticed (often clueless) opinions on Perl. When do people learn that a
> language is just a tool to do a job?
I'm not sure how "use it for what it's good for" has anything to do with toes.
I've written lots of both Python and Perl and sometimes, for one-offs, Perl is quicker, if you know it.
I sure don't want to maintain Perl applications though; even ones I've written.
When all you have is a nail file, everything looks like a toe; that doesn't mean you want to have to maintain it. Or something.
S
In _theory_ you can do a simple string-replace in situ as long
as the replacement string is shorter than the original string.
But I have a hard time believing that Perl actually does it
that way. Since I don't speak line-noise, will you please post the
Perl script that you claim does the conversion without creating
an intermediate file?
The only way I can think of to do a general in-situ file
modification is to buffer the entire file's worth of output in
memory and then overwrite the file after all of the processing
has finished. Python can do that too, but it's not generally a
very good approach.
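For comparison, the alternative to buffering everything in memory is the intermediate-file approach mentioned earlier in connection with sed -i. A Python 3 sketch of that idea follows; the function name `rewrite_via_tempfile` and its `transform` parameter are hypothetical, and this is not a claim about how sed or Perl implement it.

```python
import os
import tempfile

def rewrite_via_tempfile(path, transform):
    """Stream `path` through `transform` line by line, then swap files."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory so that the final
    # os.replace() is a rename within one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp, open(path, "rb") as src:
            for line in src:
                tmp.write(transform(line))
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        os.unlink(tmp_path)
        raise
```

A call such as rewrite_via_tempfile("file.txt", lambda line: line.replace(b"\r\n", b"\n")) would do the DOS-to-UNIX conversion. Only one line is held in memory at a time, but the original file's permissions and ownership are not carried over in this sketch.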
--
Grant
> I'm not sure how "use it for what it's good for" has anything to do
> with toes.
I've the feeling that some people who use Python are easily offended by
everything Perl related. Which is silly; zealotism in general is, for
that matter.
> I've written lots of both Python and Perl and sometimes, for
> one-off's, Perl is quicker; if you know it.
>
> I sure don't want to maintain Perl applications though; even ones I've
> written.
Ouch, I am afraid that that tells a lot about your Perl programming
skills.
>> I sure don't want to maintain Perl applications though; even ones I've
>> written.
>
> Ouch, I am afraid that that tells a lot about your Perl programming
> skills.
Nah, it tells you about my preferences.
I can, and have, written maintainable things in many languages, including Perl.
However, I *choose* Python.
S
When do people learn that language makes a difference? I used to be a
Perl programmer; these days, you'd have to triple my not-small salary to
get me to even think about programming in Perl.
--
Aahz (aa...@pythoncraft.com) <*> http://www.pythoncraft.com/
"Many customs in this life persist because they ease friction and promote
productivity as a result of universal agreement, and whether they are
precisely the optimal choices is much less important." --Henry Spencer
> When do people learn that a
> language is just a tool to do a job?
When do people learn that there are different sorts of tools? A
professional wouldn't use a screwdriver when they need a hammer.
Perl has strengths: it can be *extremely* concise, regexes are optimized
much more than in Python, and you can do some things as a one-liner short
enough to use from the command line easily. Those are values, as seen by
the millions of people who swear by Perl, but they are not Python's
values.
If you want something which can make fine cuts in metal, you would use a
hacksaw, not a keyhole saw or a crosscut saw. If you want to cut through
a three-foot tree trunk, you would use a ripsaw or a chainsaw, and not a
hacksaw. If you want concise one-liners, you would use Perl, not Python,
and if you want readable, self-documenting code, you're more likely to
get it from Python than from Perl.
If every tool is the same, why aren't we all using VB? Or C, or
Javascript, or SmallTalk, or Forth, or ... ? In the real world, all these
languages have distinguishing characteristics and different strengths and
weaknesses, which is why there are still people using PL/I and Cobol as
well as people using Haskell and Lisp and Boo and PHP and D and ...
Languages are not just nebulous interchangeable "tools", they're tools
for a particular job with particular strengths and weaknesses, and
depending on what strengths you value and what weaknesses you dislike,
some tools simply are better than other tools for certain tasks.
--
Steven
dude, you nailed it. many times, if not _always_, the correct output
is important. the method used to produce the output is irrelevant.
Oh really?
Then by that logic, you would consider that these two functions are both
equally good. Forget readability, forget maintainability, forget
efficiency, we have no reason for preferring one over the other since the
method is irrelevant.
def greet1(name):
    """Print 'Hello <name>' for any name."""
    print "Hello", name

def greet2(name):
    """Print 'Hello <name>' for any name."""
    count = 0
    for i in range(0, ("Hello", name).__len__(), 1):
        word = ("Hello", name).__getitem__(i)
        for i in range(0, word[:].__len__(), 1):
            c = word.__getitem__(i)
            import sys
            import string
            empty = ''
            maketrans = getattr.__call__(string, 'maketrans')
            chars = maketrans.__call__(empty, empty)
            stdout = getattr.__call__(sys, 'stdout')
            write = getattr.__call__(stdout, 'write')
            write.__call__(c)
        count = count.__add__(1)
        import operator
        eq = getattr.__call__(operator, 'eq')
        ne = getattr.__call__(operator, 'ne')
        if eq.__call__(count, 2):
            pass
        elif not ne.__call__(count, 2):
            continue
        write.__call__(chr.__call__(32))
    write.__call__(chr.__call__(10))
    return None
There ought to be some kind of competition for the least efficient
solution to programming problems-ly y'rs,
--
Steven
That wouldn't be very interesting. You could just write a code generator
that spits out tons of garbage code including a line that solves the
problem, and then let it execute the code afterwards. That beast would
always win.
Stefan
The idea of a code generator is solid, though. But instead of generating
garbage, it produces a virtual machine that implements a generator that
produces a virtual machine, etc. etc.
--
mph
> On Sat, 27 Feb 2010 11:27:04 -0600, John Bokma wrote:
>
>> When do people learn that a
>> language is just a tool to do a job?
>
> When do people learn that there are different sorts of tools? A
> professional wouldn't use a screwdriver when they need a hammer.
[...]
> Languages are not just nebulous interchangeable "tools", they're tools
A hammer is just a tool to do a job. Doesn't mean one must or should use
a hammer to paint a wall.
> for a particular job with particular strengths and weaknesses, and
> depending on what strengths you value and what weaknesses you dislike,
> some tools simply are better than other tools for certain tasks.
In short, we agree.
Obfuscated code competitions could do the same: insert your simple,
straight-forward, completely unobfuscated algorithm somewhere in the
middle of 15 GB of garbage code. Who would even find it?
But they don't, because human judges decide the winner, not some silly
rule of "the most lines of code wins".
In any case, I wasn't serious. It would be a bit of fun, if you like that
sort of thing, and you might even learn a few things (I never knew that
ints don't have an __eq__ method), but I can't see it taking off. I
prefer to use my powers for inefficiency to be sarcastic to strangers on
Usenet.
--
Steven
Thinking of the International Obfuscated C Code Contest (IOCCC).
It is easy to make a mess of a program using the preprocessor.
It is also easy to preprocess then prettyprint the program.
If the result is not obfuscated, it impresses nobody.
Likewise the judges would think nothing of a program with garbage,
and would rate it low, so such a rule is unnecessary.
>
>Though the idea of a code generator is solid, but instead of generating
>garbage, produces a virtual machine that implements a generator that
>produces a virtual machine, etc. etc.
That was actually done by Lennart Benschop. He made a Forth program
run by an interpreter written in C.
Although Forthers thought it was straightforward, comprehensible
code, it was a winner in the IOCCC.
>
>--
>mph
Groetjes Albert
--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst