I was wondering if anyone here has a moment of time to help me with 2
things that have been bugging me.
1. Multidimensional arrays - how do you load them in Python?
For example, if I had:
-------
1 2 3
4 5 6
7 8 9

10 11 12
13 14 15
16 17 18
-------
with "i" being the row number, "j" the column number, and "k" the ..
uhmm, well, the "group" number, how would you load this ?
In Fortran 90 you would just do:
      do 10 k=1,2
        do 20 i=1,3
          read(*,*)(a(i,j,k),j=1,3)
   20   continue
   10 continue
How would the python equivalent go ?
2. I've read the help on the next one but I just find it difficult
understanding it.
I have;
a=2.000001
b=123456.789
c=1234.0001
How do you print them with the same number of decimals ?
(eg. 2.000, 123456.789, 1234.000)
and how do you print them with the same number of significant
decimals?
(eg. 2.000001, 123456.7, 1234.000 - always 8 decimals) ?
Is something like this possible (built-in) in python ?
Really grateful for all the help and time you can spare.
--
Ivan
>>> print '%0.3f' % 2.000001
2.000
>>> print '%0.3f' % 123456.789
123456.789
>>> print '%0.3f' % 1234.0001
1234.000
> and how do you print them with the same number of significant
> decimals?
> (eg. 2.000001, 123456.7, 1234.000 - always 8 decimals) ?
Your examples have 7 significant digits (and you're not rounding).
Here's what 8 looks like (note that it's %0.7e because there
is always one digit to the left of the decimal point):
>>> print '%0.7e' % 2.000001
2.0000010e+00
>>> print '%0.7e' % 123456.789
1.2345679e+05
>>> print '%0.7e' % 1234.0001
1.2340001e+03
If you actually meant 7, then use %0.6e:
>>> print '%0.6e' % 2.000001
2.000001e+00
>>> print '%0.6e' % 123456.789
1.234568e+05
>>> print '%0.6e' % 1234.0001
1.234000e+03
>
> Is something like this possible (built-in) in python ?
You can do more with gmpy.
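If exponent notation is not wanted, the `g` conversion with the `#` flag
is one way to get a fixed number of significant digits while keeping
trailing zeros. A minimal sketch, written in Python 3 syntax (print as a
function); the helper names `fixed` and `significant` are made up for
illustration. Note that `g` rounds (so 123456.789 becomes 123456.8) and
still switches to exponent notation for very large or very small values.

```python
def fixed(value, decimals=3):
    """Format with a fixed number of digits after the decimal point."""
    return '%.*f' % (decimals, value)


def significant(value, digits=7):
    """Format with a fixed number of significant digits.

    The '#' flag stops the 'g' conversion from stripping trailing zeros.
    """
    return '%#.*g' % (digits, value)


values = [2.000001, 123456.789, 1234.0001]
print([fixed(v) for v in values])        # ['2.000', '123456.789', '1234.000']
print([significant(v) for v in values])  # ['2.000001', '123456.8', '1234.000']
```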
>> 2. I've read the help on the next one but I just find it difficult
>> understanding it.
>> I have;
>> a=2.000001
>> b=123456.789
>> c=1234.0001
>>
Hello Mensanator, thank you for answering in such a short time.
< snip >
>If you actually meant 7, then use %0.6e:
Sorry about that; I have the habit of counting the point as a decimal
place too.
>
>>>> print '%0.6e' % 2.000001
>2.000001e+00
>>>> print '%0.6e' % 123456.789
>1.234568e+05
>>>> print '%0.6e' % 1234.0001
>1.234000e+03
>
I understood the above from help, but it's not what's been bugging me.
Mea culpa, I defined the question in a confusing way, I see that
now. What I meant to ask was: when I have 3 numbers, how would you
print them with one format that applies to all 3 numbers?
for example, I have
print a,b,c
now if I print them with
print '%12.3f' %a,b,c
the format will apply only to a, and not to b and c. I could of course
write
print '%12.3f %12.3f ... 3 times
but that is just impractical.
Is there a way to just do something like this (not normal syntax, just
my wishful thinking):
print 3*'%12.3f' %a,b,c
(meaning - use this format for the next 3 real numbers that come
along)
--
Ivan
The Python genie grants you that wish. You were almost right:
>>> a = 2.000001
>>> b = 123456.789
>>> c = 1234.0001
>>> print (3 * '%12.3f') % (a, b, c)
       2.000  123456.789    1234.000
>>> print 3 * '%12.3f' % (a, b, c)
       2.000  123456.789    1234.000
>>> print 3 * '%12.3f' % a, b, c
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: not enough arguments for format string
(Note the spaces and parentheses. Python programmers will thank you if
you add them, since they improve readability a little.)
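When the number of values is not known in advance, the same idea can be
wrapped in a small helper. This is only a sketch in Python 3 syntax, and
`format_row` is a hypothetical name, not something from the standard
library:

```python
def format_row(values, spec='%12.3f'):
    """Apply the same format spec to every value and join the pieces."""
    return ''.join(spec % v for v in values)


a, b, c = 2.000001, 123456.789, 1234.0001
row = format_row([a, b, c])
print(row)

# Same result as repeating the spec and formatting a tuple in one go.
same = (3 * '%12.3f') % (a, b, c)
print(row == same)  # True
```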
Bye,
bearophile
Hello bearophile, thank you for replying.
>The Python genie grants you that wish. You were almost right:
>>>> print (3 * '%12.3f') % (a, b, c)
> 2.000 123456.789 1234.000
>>>> print 3 * '%12.3f' % (a, b, c)
> 2.000 123456.789 1234.000
Works beautifully :-) Thank you!
>>>> print 3 * '%12.3f' % a, b, c
>Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
>TypeError: not enough arguments for format string
Just one more question - it's actually an extension to this one
(forgive my curiosity, but I really need this info, and searching
google always gives me the same stuff again and again) ...
a = 2.000001
b = 123456.789
c = 1234.0001
d = 98765.4321
# same as above except for d
print (3 * '%12.3f') % (a, b, c)
#this works beautifully
How to add d at the end but with a different format now, since I've
"used" the "format part" ?
Again, my weird wishful-thinking code:
print (3*'%12.3f', '%5.3f') %(a,b,c),d
>(Note the spaces and parentheses. Python programmers thank you if put
>them improving readability a little).
Yes, ok. I can agree with that - separating the format from the
variable list part sounds reasonable.
>
>Bye,
>bearophile
--
Ivan
Again, very close to the correct code:
print (3*'%12.3f' + '%5.3f') %(a,b,c,d)
Regards,
Chris
--
Follow the path of the Iguana...
http://rebertia.com
> a = 2.000001
> b = 123456.789
> c = 1234.0001
> d = 98765.4321
> # same as above except for d
>
> print (3 * '%12.3f') % (a, b, c)
> #this works beautifully
>
> How to add d at the end but with a different format now, since I've
> "used" the "format part" ?
>
> Again, my weird wishful-thinking code: print (3*'%12.3f', '%5.3f')
> %(a,b,c),d
Maybe you should stop that wishful thinking and programming by accident
and start actually thinking about what the code does, then it's easy to
construct something working yourself.
The ``%`` operator on strings expects a string on the left with format
specifiers in it and, on the right, a tuple of objects to substitute
for those specifiers. So you want
'%12.3f%12.3f%12.3f%5.3f' % (a, b, c, d)
But without repeating the '%12.3f' literally. So you must construct that
string dynamically by repeating the '%12.3f' and adding the '%5.3f':
In [27]: 3 * '%12.3f'
Out[27]: '%12.3f%12.3f%12.3f'
In [28]: 3 * '%12.3f' + '%5.3f'
Out[28]: '%12.3f%12.3f%12.3f%5.3f'
Now you can use the ``%`` operator on that string:
In [29]: (3 * '%12.3f' + '%5.3f') % (a, b, c, d)
Out[29]: '       2.000  123456.789    1234.00098765.432'
(I guess there should be at least a space before the last format string.)
This time you *have* to put parentheses around the construction of the
format string BTW because ``%`` has a higher priority than ``+``. So
the implicit parentheses look like this:
3 * '%12.3f' + '%5.3f' % (a, b, c, d)
<=> 3 * '%12.3f' + ('%5.3f' % (a, b, c, d))
And there are of course not enough formatting place holders for four
objects in '%5.3f'.
It's also important to learn why your incorrect code fails. In your
wishful thinking example you will get a `TypeError` saying "unsupported
operand type(s) for %: 'tuple' and 'tuple'". That's because on the left
side of the ``%`` operator you wrote a tuple:
In [34]: (3 * '%12.3f', '%5.3f')
Out[34]: ('%12.3f%12.3f%12.3f', '%5.3f')
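Both failure modes can be reproduced side by side. The sketch below uses
Python 3 syntax and adds the space before the last specifier that was
suggested above; the variable names holding the error messages are made
up for the example. Under Python 3 the first failure is reported as "not
all arguments converted during string formatting".

```python
a, b, c, d = 2.000001, 123456.789, 1234.0001, 98765.4321

# Parenthesised: the whole format string is built first, then filled in.
good = (3 * '%12.3f' + ' %5.3f') % (a, b, c, d)
print(good)

# Without parentheses, ' %5.3f' % (a, b, c, d) is evaluated before the
# '+', so a single placeholder meets four values.
try:
    bad = 3 * '%12.3f' + ' %5.3f' % (a, b, c, d)
except TypeError as exc:
    precedence_error = str(exc)

# A tuple on the left of '%' is not a format string at all.
try:
    bad = (3 * '%12.3f', '%5.3f') % (a, b, c)
except TypeError as exc:
    tuple_error = str(exc)

print(precedence_error)
print(tuple_error)
```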
Ciao,
Marc 'BlackJack' Rintsch
> 1. Multi dimensional arrays - how do you load them in python For
> example, if I had:
> -------
> 1 2 3
> 4 5 6
> 7 8 9
>
> 10 11 12
> 13 14 15
> 16 17 18
> -------
> with "i" being the row number, "j" the column number, and "k" the ..
> uhmm, well, the "group" number, how would you load this ?
>
> If fortran90 you would just do:
>
> do 10 k=1,2
> do 20 i=1,3
>
> read(*,*)(a(i,j,k),j=1,3)
>
> 20 continue
> 10 continue
>
> How would the python equivalent go ?
Well, I don't know if this qualifies as equivalent:
=====
from __future__ import with_statement
from functools import partial
from itertools import islice
from pprint import pprint


def read_group(lines, count):
    return [map(int, s.split()) for s in islice(lines, count)]


def main():
    result = list()
    with open('test.txt') as lines:
        #
        # Filter empty lines.
        #
        lines = (line for line in lines if line.strip())
        #
        # Read groups until end of file.
        #
        result = list(iter(partial(read_group, lines, 3), list()))
    pprint(result, width=30)


if __name__ == '__main__':
    main()
=====
The output is:
[[[1, 2, 3],
  [4, 5, 6],
  [7, 8, 9]],
 [[10, 11, 12],
  [13, 14, 15],
  [16, 17, 18]]]
`k` is the first index here, not the last, and the code doesn't use
fixed values for the ranges of `i`, `j`, and `k`. In fact it doesn't use
index variables at all but simply reads what's in the file. Only the
group length is hard-coded in the source code.
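For readers who want the pieces spelled out one at a time, here is a
step-by-step sketch of the same idea in Python 3 syntax. It is not the
code above verbatim: the file is replaced by an in-memory string (an
assumption so the example runs on its own), and `map` is replaced by a
list comprehension because `map` returns an iterator in Python 3.

```python
from functools import partial
from io import StringIO
from itertools import islice

# Stand-in for the contents of test.txt (an assumption for this sketch).
text = """1 2 3
4 5 6
7 8 9

10 11 12
13 14 15
16 17 18
"""


def read_group(lines, count):
    """Consume up to `count` lines and turn each into a list of ints."""
    return [[int(s) for s in line.split()] for line in islice(lines, count)]


# Generator that skips blank lines; it is consumed lazily.
lines = (line for line in StringIO(text) if line.strip())

# partial(...) freezes the arguments, giving a zero-argument callable.
next_group = partial(read_group, lines, 3)

# iter(callable, sentinel) calls next_group() until it returns [].
result = list(iter(next_group, []))

# k (group) is the first index, then i (row), then j (column), all from 0.
print(result[1][0][2])  # 12
```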
Ciao,
Marc 'BlackJack' Rintsch
I'm not sure if this is applicable to your multi-dimensional list
problem, but it sounded a bit sudoku-like (with rows, columns and
groups), so I thought I'd share a bit of code I developed for
solving sudoku puzzles.
Given a list of nine lists, each with nine elements (let's call it
sudoku_grid), the following list comprehensions produce lists of indexes
into sudoku_grid:
vgroups = [[(x,y) for y in xrange(9)] for x in xrange(9)]
hgroups = [[(x,y) for x in xrange(9)] for y in xrange(9)]
lgroups = [[(x,y) for x in xrange(a,a+3) for y in xrange(b,b+3)]
           for a in xrange(0,9,3) for b in xrange(0,9,3)]
where sudoku_grid[y][x] yields the value at position (x,y), assuming the
top left corner is indexed as (0,0).
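A quick way to check what each index group selects is to fill a grid
with values that encode their own position. The sketch below is in
Python 3 syntax (`range` instead of `xrange`); the grid contents and the
`values` helper are invented for the example.

```python
# A made-up 9x9 grid whose cell at (x, y) holds the value 10*y + x,
# so each value reveals where it came from.
sudoku_grid = [[10 * y + x for x in range(9)] for y in range(9)]

vgroups = [[(x, y) for y in range(9)] for x in range(9)]   # columns
hgroups = [[(x, y) for x in range(9)] for y in range(9)]   # rows
lgroups = [[(x, y) for x in range(a, a + 3) for y in range(b, b + 3)]
           for a in range(0, 9, 3) for b in range(0, 9, 3)]  # 3x3 boxes


def values(group):
    """Look up the grid values for a list of (x, y) positions."""
    return [sudoku_grid[y][x] for x, y in group]


print(values(hgroups[0]))  # first row
print(values(vgroups[0]))  # first column
print(values(lgroups[0]))  # top-left box
```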
HTH
Hello Marc, thanks for answering (on both subjects). I understand now
the logic which lies behind what you were explaining in the other one.
It cleared things up quite a bit.
>Well, I don't know if this qualifies as equivalent:
>
>=====
>from __future__ import with_statement
>from functools import partial
>from itertools import islice
>from pprint import pprint
>
>
>def read_group(lines, count):
>    return [map(int, s.split()) for s in islice(lines, count)]
>
>def main():
>    result = list()
>    with open('test.txt') as lines:
>        lines = (line for line in lines if line.strip())
>        result = list(iter(partial(read_group, lines, 3), list()))
>    pprint(result, width=30)
>if __name__ == '__main__':
>    main()
>=====
I'm afraid I must admit I find the code above totally incomprehensible
(I can't even see where the array here is mentioned - "result"?) and
impractical for any kind of engineering practice I had in mind.
Does python, perchance, have some wrapper functions or something,
which would allow one to load an array in a more natural "technical"
way ? Like something mentioned above in my post (now deleted) ?
Also, is there a way to change the starting index for arrays from 0 to 1
(so that the first element has index 1)?
(probably not since that seems like a language implementation thing,
but it doesn't hurt to ask)
--
Ivan
> On 30 Sep 2008 07:07:52 GMT, Marc 'BlackJack' Rintsch <bj_...@gmx.net>
> wrote:
>>=====
>>from __future__ import with_statement
>>from functools import partial
>>from itertools import islice
>>from pprint import pprint
>>
>>
>>def read_group(lines, count):
>>    return [map(int, s.split()) for s in islice(lines, count)]
>>
>>def main():
>>    with open('test.txt') as lines:
>>        lines = (line for line in lines if line.strip())
>>        result = list(iter(partial(read_group, lines, 3), list()))
>>    pprint(result, width=30)
>>
>>if __name__ == '__main__':
>>    main()
>>=====
>
> I'm afraid I must admit I find the code above totally incomprehensible (I
> can't even see where the array here is mentioned - "result"?) and
> impractical for any kind of engineering practice I had in mind.
Learn Python then to understand that code. ;-)
There is no array. The data type is called "list" in Python, so `result`
is a nested list. And in Python it is quite unusual to build lists by
creating them with the final size filled with placeholder objects and
then filling the real values in. Instead lists are typically created by
appending values to existing lists, using a list comprehension, or
calling the `list()` function with some iterable object.
Typical Python code tries to minimize the use of index variables. Python
is not Fortran (or C, or Pascal, …).
> Does python, perchance, have some wrapper functions or something, which
> would allow one to load an array in a more natural "technical" way ?
> Like something mentioned above in my post (now deleted) ?
>
> Also, is there a way to change counter for arrays to go from 0 to 1 ?
You can write your own sequence type but that would be odd because the
rest of the language expects zero as the first index, so you will be
constantly fighting the language by adding or subtracting 1 all the time
at the "border" between your custom sequence type and the rest of
Python.
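To make the "border" problem concrete, here is a deliberately small
sketch of such a sequence type in Python 3 syntax. `OneBased` is a
hypothetical class written for this example, not an existing library
type, and it only supports plain integer indices.

```python
class OneBased:
    """Hypothetical wrapper that exposes a list with indices 1..n."""

    def __init__(self, items):
        self._items = list(items)

    def _shift(self, index):
        # Translate a 1-based index to the 0-based index Python uses.
        if not 1 <= index <= len(self._items):
            raise IndexError('index %d out of range 1..%d'
                             % (index, len(self._items)))
        return index - 1

    def __getitem__(self, index):
        return self._items[self._shift(index)]

    def __setitem__(self, index, value):
        self._items[self._shift(index)] = value

    def __len__(self):
        return len(self._items)


row = OneBased([10, 20, 30])
print(row[1], row[3])  # 10 30

# Friction at the border: generic iteration probes index 0 first, gets an
# IndexError, and concludes the sequence is empty.
print(list(row))  # []
```

Fixing the iteration would need an explicit `__iter__`, and every other
piece of code that assumes zero-based indexing would need similar care.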
Ciao,
Marc 'BlackJack' Rintsch
Since you're coming from the FORTRAN world (thank you for
that stroll down Memory Lane), you might be doing scientific
computations, and so might be interested in the SciPy
package (Google scipy), which gives you arrays and matrices.
Don't expect to be able to use it without learning some Python,
though.
--
To email me, substitute nowhere->spamcop, invalid->net.
You would drag yourself out of the 1960s, install numpy, and
then do something like this:
a = read_array(open("filename.dat","r"))
> Since you're coming from the FORTRAN world (thank you for that
> stroll down Memory Lane), you might be doing scientific
> computations, and so might be interested in the SciPy package
> (Google scipy), which gives you arrays and matrices. Don't
> expect to be able to use it without learning some Python,
> though.
If not full-up scipy (which provides all sorts of scientific
and numerical-analysis stuff), then at least numpy (which
provides the basic array/matrix operations).
Though the software is free, the documentation isn't. You've
got to buy the book if you want something to read. IMO, it's
definitely worth it, and a good way to support the project even
if you don't really need something to keep your bookends apart.
Scientific Python is something else the OP might be interested
in. Yes, Scientific Python is different than SciPy:
http://dirac.cnrs-orleans.fr/plone/software/scientificpython/overview/
If you're a Windows user, I can recommend the Enthought Python
distribution. It has all sorts of numerical and scientific
"batteries included".
http://www.enthought.com/products/epd.php
It includes both scipy and scientific python as well as several
options for data visualization (e.g. matplotlib, VTK).
There's also an Enthought Python distro for Linux, but I've
never tried it. I run Gentoo Linux, and there are standard
ebuilds for pretty much all of the stuff in EPD.
--
Grant Edwards, grante at visi.com
Yow! I've read SEVEN MILLION books!!
In [64]: a = numpy.fromfile('test.txt', dtype=int, sep=' ')
In [65]: a
Out[65]:
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18])
In [66]: a.reshape(2, 3, 3)
Out[66]:
array([[[ 1,  2,  3],
        [ 4,  5,  6],
        [ 7,  8,  9]],

       [[10, 11, 12],
        [13, 14, 15],
        [16, 17, 18]]])
Ciao,
Marc 'BlackJack' Rintsch
>
>Since you're coming from the FORTRAN world (thank you for
>that stroll down Memory Lane), you might be doing scientific
>computations, and so might be interested in the SciPy
>package (Google scipy), which gives you arrays and matrices.
>Don't expect to be able to use it without learning some Python,
>though.
Actually, no (regarding memory lane :-). I'm helping a friend to
translate some of my old routines to python so he can use them in his
programs.
I'm still using fortran84, and mean to continue doing so as long as
something better doesn't come along.
But as I said, I've got a job that's got to be done, so I'm trying to
figure out how to do array operations as easily as possible in Python,
since they are necessary for all my calculations.
Best regards
Ivan
> But as I said, got a job that't got to be done, so I'm trying
> to figure out how to do array operations as easily as possible
> in python, which are necessary for all my calculations.
numpy
--
Grant Edwards, grante at visi.com
Yow! TONY RANDALL! Is YOUR life a PATIO of FUN??
>>> How would the python equivalent go ?
>
> You would drag yourself out of the 1960s, install numpy, and then do
> something like this:
I think that was thoughtlessly rude to somebody who is asking a perfectly
reasonable question.
--
Steven
I would weaken that claim a tad... I'd say it is "usual" to write
something like this:
alist = []
for x in some_values:
    alist.append(something_from_x)
but it is not uncommon (at least not in my code) to write something like
this equivalent code instead:
alist = [None]*len(some_values)
for i, x in enumerate(some_values):
    alist[i] = something_from_x
Most often the first way is most natural, but the second way is sometimes
more natural. It will also be more familiar to somebody coming from
Fortran, and it is a micro-optimization for large lists because it
doesn't need to resize the list as it grows.
I stress the *micro*-optimization, because Python lists are highly
optimized to resize as rarely as possible and as quickly as possible, so
you're not saving much time.
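A runnable comparison of the two idioms, plus the list comprehension
mentioned earlier in the thread, might look like this (Python 3 syntax).
The `transform` function and the sample data are placeholders standing in
for "something_from_x" and "some_values".

```python
some_values = [3, 1, 4, 1, 5]


def transform(x):
    """Placeholder computation standing in for 'something_from_x'."""
    return x * x


# Idiom 1: grow the list by appending.
appended = []
for x in some_values:
    appended.append(transform(x))

# Idiom 2: pre-allocate the slots, then fill them in by index.
prefilled = [None] * len(some_values)
for i, x in enumerate(some_values):
    prefilled[i] = transform(x)

# Idiom 3: a list comprehension, with no explicit append or index.
comprehended = [transform(x) for x in some_values]

print(appended == prefilled == comprehended)  # True
```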
And Marc, I think you're being a little unfair to the OP, who is clearly
unfamiliar with Python. I've been using Python for perhaps ten years, and
I still find your code above dense and hard to comprehend. It uses a
number of "advanced Python concepts" that a newbie is going to have
trouble with:
- the with statement acts by magic; if you don't know what it does, it's
an opaque black box.
- you re-use the same name for different uses, which can cause confusion.
- generator expressions.
- functional programming using partial.
- you call a function that uses a list comprehension with both map and
iterator slicing inside it.
No wonder the OP had trouble with it. *I* have trouble with it, and would
need to sit down at the interactive interpreter and play around with it
for a while to be sure what it actually does. If it was your intention to
make Python look as obtuse and mysterious as possible, you almost
succeeded. The one thing you missed was to replace the read_group
function with a lambda in the partial.
--
Steven
> 1. Multi dimensional arrays - how do you load them in python
> For example, if I had:
> -------
> 1 2 3
> 4 5 6
> 7 8 9
>
> 10 11 12
> 13 14 15
> 16 17 18
> -------
> with "i" being the row number, "j" the column number, and "k" the ..
> uhmm, well, the "group" number, how would you load this ?
I agree that using NumPy is the way to go if you're going to do lots of
array calculations. But a plain Python code would look like this (more
comprehensible than other posted versions, I hope):
--- begin code ---
def read_group(fin, rows_per_group):
    rows = []
    for i in range(rows_per_group):
        line = fin.readline()
        row = [float(x) for x in line.split()]
        rows.append(row)
    fin.readline() # skip blank line
    return rows

# simulate a file using a string instead
# actual code would use: fin = open(filename)
from StringIO import StringIO
fin = StringIO("""1 2 3
4 5 6
7 8 9

10 11 12
13 14 15
16 17 18
""")

# read 2 groups of 3 lines each
matrix = [read_group(fin, 3) for k in range(2)]
print matrix
--- end code ---
A more compact version of read_group (collapsing all rows.append onto the
outer list comprehension):
--- begin code ---
def read_group(fin, rows_per_group):
    rows = [[float(x) for x in fin.readline().split()]
            for i in range(rows_per_group)]
    fin.readline() # skip blank line
    return rows
--- end code ---
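If neither the number of groups nor the rows per group is known ahead of
time, one possible variation is to let the blank lines decide where a
group ends. This is a sketch in Python 3 syntax, not part of the code
above; `read_groups` is a made-up name and it takes the text directly
rather than a file object.

```python
def read_groups(text):
    """Split text into groups on blank lines; each group is a list of rows."""
    groups = []
    current = []
    for line in text.splitlines():
        if line.strip():
            # Non-blank line: parse it as one row of the current group.
            current.append([float(x) for x in line.split()])
        elif current:
            # Blank line after some rows: the current group is complete.
            groups.append(current)
            current = []
    if current:
        # The last group may not be followed by a blank line.
        groups.append(current)
    return groups


sample = "1 2 3\n4 5 6\n7 8 9\n\n10 11 12\n13 14 15\n16 17 18\n"
matrix = read_groups(sample)
print(len(matrix), matrix[1][2])  # 2 [16.0, 17.0, 18.0]
```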
--
Gabriel Genellina
Sheesh. I guess I should have added a smiley face.
So much for trying to be helpful.
--
Grant
> On Tue, 30 Sep 2008 14:34:31 +0000, Marc 'BlackJack' Rintsch wrote:
>
>> There is no array. The data type is called "list" in Python, so
>> `result` is a nested list. And in Python it quite unusual to build
>> lists by creating them with the final size filled with place holder
>> objects and then fill the real values in. Instead lists are typically
>> created by appending values to existing lists, using list comprehension
>> or the `list()` function with some iterable object.
>
> I would weaken that claim a tad... I'd say it is "usual" to write
> something like this:
>
> alist = []
> for x in some_values:
>     alist.append(something_from_x)
>
>
> but it is not uncommon (at least not in my code) to write something like
> this equivalent code instead:
>
> alist = [None]*len(some_values)
> for i, x in enumerate(some_values):
>     alist[i] = something_from_x
I have never done this, except when I first started using Python, and --
maybe more importantly -- I've never seen this in others' code. It really
looks like a construct from someone who is still programming in some
other language(s).
> Most often the first way is most natural, but the second way is
> sometimes more natural.
When will it be more natural to introduce an unnecessary index?
> And Marc, I think you're being a little unfair to the OP, who is clearly
> unfamiliar with Python. I've been using Python for perhaps ten years,
> and I still find your code above dense and hard to comprehend. It uses a
> number of "advanced Python concepts" that a newbie is going to have
> trouble with:
>
> - the with statement acts by magic; if you don't know what it does, it's
> an opaque black box.
Everything acts by magic unless you know what it does. The Fortran
read(*,*)(a(i,j,k),j=1,3)
in the OP's first post looks like magic too. I admit that my code shows
off advanced Python features but I don't think ``with`` is one of them.
It makes it easier to write robust code and maybe even understandable
without documentation by just reading it as "English text".
> - you re-use the same name for different uses, which can cause
> confusion.
Do you mean `lines`? Then I disagree because the (duck) type is always
"iterable over lines". I just changed the content by filtering.
> - generator expressions.
>
> - functional programming using partial.
>
> - you call a function that uses a list comprehension with both map and
> iterator slicing inside it.
>
>
> No wonder the OP had trouble with it. *I* have trouble with it, and
> would need to sit down at the interactive interpreter and play around
> with it for a while to be sure what it actually does. If it was your
> intention to make Python look as obtuse and mysterious as possible, you
> almost succeeded. The one things you missed was to replace the
> read_group function with a lambda in the partial.
Well that would make the code harder to understand. ;-)
Seriously, I think it should be easy to understand the code for someone
who knows Python. Yes, a newbie will have trouble understanding this, but
Python is not Fortran and IMHO I haven't used something really exotic or
strange, nor did I write convoluted and hard to understand things like
deeply nested list comprehensions.
Ciao,
Marc 'BlackJack' Rintsch
Oh the rest of your post was helpful. I think you were trying to be
funny, but I think you failed.
--
Steven
>> I would weaken that claim a tad... I'd say it is "usual" to write
>> something like this:
>>
>> alist = []
>> for x in some_values:
>>     alist.append(something_from_x)
>>
>>
>> but it is not uncommon (at least not in my code) to write something
>> like this equivalent code instead:
>>
>> alist = [None]*len(some_values)
>> for i, x in enumerate(some_values):
>>     alist[i] = something_from_x
>
> I have never done this, except in the beginning I used Python, and --
> maybe more importantly -- I've never seen this in others code. I really
> looks like a construct from someone who is still programming in some
> other language(s).
It occurs at least twice in the 2.5 standard library, once in
sre_parse.py:
    groups = []
    groupsappend = groups.append
    literals = [None] * len(p)
    for c, s in p:
        if c is MARK:
            groupsappend((i, s))
            # literal[i] is already None
        else:
            literals[i] = s
and another time in xdrlib.py:
    succeedlist = [1] * len(packtest)
    count = 0
    for method, args in packtest:
        print 'pack test', count,
        try:
            method(*args)
            print 'succeeded'
        except ConversionError, var:
            print 'ConversionError:', var.msg
            succeedlist[count] = 0
        count = count + 1
>> Most often the first way is most natural, but the second way is
>> sometimes more natural.
>
> When will it be more natural to introduce an unnecessary index?
We can agree that the two idioms are functionally equivalent. Appending
is marginally less efficient, because the Python runtime engine has to
periodically resize the list as it grows, and that can in principle take
an arbitrary amount of time if it causes virtual memory paging. But
that's unlikely to be a significant factor for any but the biggest lists.
So in the same way that any while-loop can be rewritten as a recursive
function, and vice versa, so these two idioms can be trivially re-written
from one form to the other. When should you use one or the other?
When the algorithm you have is conceptually about growing a list by
appending to the end, then you should grow the list by appending to the
end. And when the algorithm is conceptually about dropping values into
pre-existing pigeon holes, then you should initialize the list and then
walk it, modifying the values in place.
And if the algorithm is indifferent to which idiom you use, then you
should use whichever idiom you are most comfortable with, and not claim
there's Only One True Way to build a list.
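As an illustration of the "pigeon hole" case, consider values that arrive
tagged with their final position but in no particular order. The data
below is made up for the example (Python 3 syntax); appending would give
the wrong order here, while pre-allocated slots fall out naturally.

```python
# (position, value) pairs arriving out of order; made-up data.
arrivals = [(2, 'c'), (0, 'a'), (3, 'd'), (1, 'b')]

# Create one slot per expected value, then drop each value into its slot.
slots = [None] * len(arrivals)
for position, value in arrivals:
    slots[position] = value

print(slots)  # ['a', 'b', 'c', 'd']
```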
>> And Marc, I think you're being a little unfair to the OP, who is
>> clearly unfamiliar with Python. I've been using Python for perhaps ten
>> years, and I still find your code above dense and hard to comprehend.
>> It uses a number of "advanced Python concepts" that a newbie is going
>> to have trouble with:
>>
>> - the with statement acts by magic; if you don't know what it does,
>> it's an opaque black box.
>
> Everything acts by magic unless you know what it does. The Fortran
>
> read(*,*)(a(i,j,k),j=1,3)
>
> in the OP's first post looks like magic too.
It sure does. My memories of Fortran aren't good enough to remember what
that does.
But I think you do Python a disservice. One of my Perl coders was writing
some Python code the other day, and he was amazed at how guessable Python
was. You can often guess the right way to do something. He wanted a set
with all the elements of another set removed, so he guessed that s1-s2
would do the job -- and it did. A lot of Python is amazingly readable to
people with no Python experience at all. But not everything.
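The set example is short enough to show in full. The sets below are
invented for illustration; the set-literal syntax needs Python 2.7 or
later (on 2.5 one would write set([1, 2, 3, 4, 5]) instead).

```python
s1 = {1, 2, 3, 4, 5}
s2 = {2, 4}

# Set difference: everything in s1 that is not also in s2.
remaining = s1 - s2
print(sorted(remaining))  # [1, 3, 5]
```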
> I admit that my code shows
> off advanced Python features but I don't think ``with`` is one of them.
> It makes it easier to write robust code and maybe even understandable
> without documentation by just reading it as "English text".
The first problem with "with" is that it looks like the Pascal "with"
statement, but acts nothing like it. That may confuse anyone with Pascal
experience, and there are a lot of us out there.
The second difficulty is that:
with open('test.txt') as lines:
binds the result of open() to the name "lines". How is that different
from "lines = open('test.txt')"? I know the answer, but we shouldn't
expect newbies coming across it to be anything but perplexed.
Now that the newbie has determined that lines is a file object, the very
next thing you do is assign something completely different to 'lines':
lines = (line for line in lines if line.strip())
So the reader needs to know that brackets aren't just for grouping like
in most other languages, but also that (x) can be equivalent to a for-
loop. They need to know, or guess, that iterating over a file object
returns lines of the file, and they have to keep the two different
bindings of "lines" straight in their head in a piece of code that uses
"lines" twice and "line" three times.
And then they hit the next line, which includes a function called
"partial", which has a technical meaning taken from functional languages,
and I am sure it will mean nothing whatsoever to anyone unfamiliar with
it. It's not something that is guessable, unlike open() or len() or
append().
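For anyone who has not met it, `functools.partial` takes a callable plus
some of its arguments and returns a new callable that remembers them. A
small sketch in Python 3 syntax; `power`, `square` and `square_by_hand`
are names made up for this example.

```python
from functools import partial


def power(base, exponent):
    """Raise base to the given exponent."""
    return base ** exponent


# Freeze the keyword argument; the result takes only the remaining one.
square = partial(power, exponent=2)
print(square(7))  # 49


# Roughly the same thing written out by hand.
def square_by_hand(base):
    return power(base, exponent=2)


print(square_by_hand(7))  # 49
```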
>> - you re-use the same name for different uses, which can cause
>> confusion.
>
> Do you mean `lines`? Then I disagree because the (duck) type is always
> "iterable over lines". I just changed the content by filtering.
Nevertheless, for people coming from less dynamic languages than Python
(such as Fortran), it is a common idiom to never use the same variable
for two different things. It's not a bad choice really: imagine reading a
function where the name "lines" started off as an integer number of
lines, then became a template string, then was used for a list of
character positions...
Of course I'm not suggesting that your code was that bad. But rebinding a
name does make code harder to understand.
--
Steven
I guess the first falls into the "micro optimization" category because it
binds `groups.append` to a name to spare the attribute look up within the
loop.
Both have in common that not every iteration changes the list, i.e. the
preset values are not just place holders but values that are actually
used sometimes. That is different from creating a list of place holders
that are all overwritten in any case.
>>> - the with statement acts by magic; if you don't know what it does,
>>> it's an opaque black box.
>>
>> Everything acts by magic unless you know what it does. The Fortran
>>
>> read(*,*)(a(i,j,k),j=1,3)
>>
>> in the OP's first post looks like magic too.
>
> It sure does. My memories of Fortran aren't good enough to remember what
> that does.
>
> But I think you do Python a disservice. One of my Perl coders was
> writing some Python code the other day, and he was amazed at how
> guessable Python was. You can often guess the right way to do something.
I think my code would be as guessable to a Lisp, Scheme, or Haskell
coder. Okay, Lispers and Schemers might object to the ugly syntax. ;-)
>> I admit that my code shows off advanced Python features but I don't
>> think ``with`` is one of them. It makes it easier to write robust code
>> and maybe even understandable without documentation by just reading it
>> as "English text".
>
> The first problem with "with" is that it looks like the Pascal "with"
> statement, but acts nothing like it. That may confuse anyone with Pascal
> experience, and there are a lot of us out there.
But Python is not Pascal either. Nonetheless a Pascal coder might guess
what the ``with`` does. Not all the gory details but that it opens a
file and introduces `lines` should be more or less obvious to someone who
has programmed before.
> The second difficulty is that:
>
> with open('test.txt') as lines:
>
> binds the result of open() to the name "lines". How is that different
> from "lines = open('test.txt')"? I know the answer, but we shouldn't
> expect newbies coming across it to be anything but perplexed.
Even if newbies don't understand all the details they should be
introduced to ``with`` right away IMHO. Because if you explain all the
details, even if they understand them, they likely will ignore the
knowledge because doing it right is a lot of boiler plate code. So
usually people write less robust code and ``with`` is a simple way to
solve that problem.
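The boilerplate in question is the try/finally needed to guarantee that
the file is closed. The sketch below (Python 3 syntax) puts both versions
next to each other; the temporary file and its contents are assumptions
made so the example runs on its own.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'w') as out:
    out.write('1 2 3\n4 5 6\n')

# Manual version: the boilerplate people tend to leave out.
manual = open(path)
try:
    manual_rows = [line.split() for line in manual]
finally:
    manual.close()

# Same thing with ``with``: the file is closed on leaving the block,
# even if an exception is raised inside it.
with open(path) as lines:
    with_rows = [line.split() for line in lines]

print(manual.closed, lines.closed)  # True True
os.remove(path)
```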
> Now that the newbie has determined that lines is a file object, the very
> next thing you do is assign something completely different to 'lines':
>
> lines = (line for line in lines if line.strip())
>
> So the reader needs to know that brackets aren't just for grouping like
> in most other languages, but also that (x) can be equivalent to a for-
> loop. They need to know, or guess, that iterating over a file object
> returns lines of the file, and they have to keep the two different
> bindings of "lines" straight in their head in a piece of code that uses
> "lines" twice and "line" three times.
Yes the reader needs to know a basic Python syntax construct to
understand this. And some knowledge from the tutorial about files. So
what?
> And then they hit the next line, which includes a function called
> "partial", which has a technical meaning out of functional languages and
> I am sure it will mean nothing whatsoever to anyone unfamiliar to it.
> It's not something that is guessable, unlike open() or len() or
> append().
Why on earth does everything have to be guessable for someone who doesn't
know Python or even programming at all?
>>> - you re-use the same name for different uses, which can cause
>>> confusion.
>>
>> Do you mean `lines`? Then I disagree because the (duck) type is always
>> "iterable over lines". I just changed the content by filtering.
>
> Nevertheless, for people coming from less dynamic languages than Python
> (such as Fortran), it is a common idiom to never use the same variable
> for two different things. It's not a bad choice really: imagine reading
> a function where the name "lines" started off as an integer number of
> lines, then became a template string, then was used for a list of
> character positions...
Which I'm not doing at all. It has the same duck type all the time:
"iterable of lines".
> Of course I'm not suggesting that your code was that bad. But rebinding
> a name does make code harder to understand.
Introducing a new name here would be worse IMHO because then the file
object would still be reachable by a name, which it shouldn't be:
rebinding documents that it won't be used anymore in the following code.
Again, I don't think I have written something deliberately obfuscated,
but rather readable, concise, and straightforward code -- for people who
know the language of course.
If someone asks how to write this code from language X in Python, I
actually write Python, and not something that is a 1:1 almost literal
translation of the code in language X.
*I* think I would do Python a disservice if I encouraged people to
continue writing Python code as if it were language X or pretending
Python is all about "readable, executable Pseudocode for anyone". Python
has dynamic typing, first class functions, "functional" syntax
constructs, and it seems the developers like iterators and generators.
Those are the basic building blocks of the language, so I use them, even
in public. :-)
Ciao,
Marc 'BlackJack' Rintsch
>> The first problem with "with" is that it looks like the Pascal "with"
>> statement, but acts nothing like it. That may confuse anyone with
>> Pascal experience, and there are a lot of us out there.
>
> But Python is not Pascal either. Nonetheless a Pascal coder might guess
> what the ``with`` does. Not all the gory details but that it opens a
> file and introduces `lines` should be more or less obvious to someone
> who has programmed before.
But that's not what the with statement does. It doesn't open a file and
it doesn't introduce lines. That's what open() does. So what you say is
"obvious" is actually wrong. To a newbie who knows nothing about context
managers, the statement
    with open(filename) as lines
will look like "syntactic fat" (like syntactic sugar but harder to digest
and more fattening) for the simpler code:
    lines = open(filename)
[snip]
> Even if newbies don't understand all the details they should be
> introduced to ``with`` right away IMHO. Because if you explain all the
> details, even if they understand them, they likely will ignore the
> knowledge because doing it right is a lot of boilerplate code. So
> usually people write less robust code and ``with`` is a simple way to
> solve that problem.
So what you're saying is that we should encourage cargo-cult coding.
"Write this boilerplate, because I tell you that if you do, good things
will happen."
Newbies aren't going to be writing robust code anyway. The ability to
write robust code is one of the things which distinguishes experienced
coders from newbies. If they don't understand what the code is actually
doing, they're going to make mistakes like these:
import urllib2
try:
    result = urllib2.open('http://www.python.org')
except IOError, URLError:
    print "Can't reach website"
except HTTPError:
    print "Page not found"
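For reference, one reading of what goes wrong in that snippet: `urllib2` has `urlopen()`, not `open()`; `URLError` and `HTTPError` are never imported; in Python 2, `except IOError, URLError:` catches only `IOError` and binds the exception object to the name `URLError`; and since `HTTPError` is a subclass of `URLError` (itself a subclass of `IOError`), the last clause can never run. A sketch of a corrected version, assuming Python 3, where the module is `urllib.request`; the `fetch` name and the `opener` parameter are made up for illustration:

```python
# Python 3 sketch; urllib.request replaces the Python 2 urllib2 module.
from urllib.error import HTTPError, URLError
from urllib.request import urlopen

def fetch(url, opener=urlopen):
    """Return (status_text, body); `opener` is injectable so this can be tried offline."""
    try:
        with opener(url) as response:
            return "ok", response.read()
    except HTTPError as err:
        # Most specific exception first: the server answered with an error code.
        return "Page error: %d" % err.code, None
    except URLError:
        # Then the more general case: the server could not be reached at all.
        return "Can't reach website", None
```

The order of the `except` clauses is the point: the subclass has to come before its base class, otherwise the base class clause swallows everything.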
[much more snippage]
> Why on earth has everything to be guessable for someone who doesn't
> know Python or even programming at all?
Oh please. Don't take my words out of context. I'm not talking about
"everything", and I'm not suggesting that advanced programming features
should be prohibited and we should write to the level my grandmother
would understand.
The context was that a Fortran programmer asked for some help in writing
a piece of code in Python. Your answer was entirely opaque and
undecipherable to the OP. If your intention in answering was to teach the
OP how to write Python code, you failed, because the OP couldn't
understand your code! You can argue with me until Doomsday and it won't
change that basic fact.
Your answer may have solved the OP's *technical* problem, but it didn't
do anything to solve the OP's *actual* problem, which was that he didn't
know enough basic Python techniques to solve a simple problem. And that's
the issue I was commenting on.
[more snippage]
> > Nevertheless, for people coming from less dynamic languages than
> > Python (such as Fortran), it is a common idiom to never use the same
> > variable for two different things. It's not a bad choice really:
> > imagine reading a function where the name "lines" started off as an
> > integer number of lines, then became a template string, then was used
> > for a list of character positions...
>
> Which I'm not doing at all. It has the same duck type all the time:
> "iterable of lines".
It has nothing to do with duck typing and everything to do with re-use of
variables (or in Python, names) for different "things". Just because
"lines" has the same duck-type doesn't mean they are conceptually the
same things. If they were, the assignments would be null-ops.
There is a programming principle that says never re-use variables. It
makes it harder for the programmer to figure out what the variable
represents and for some languages, it can defeat compiler optimizations.
Now, I personally wouldn't treat this principle as a law. I'd treat it as
a guideline with just as many exceptions as examples. But there's no
doubt in my mind that reuse of names can lead to hard to understand code,
particularly if the reader is not used to the language and is already
struggling to understand it.
[snippity snip]
> *I* think I would do Python a disservice if I encourage people to
> continue writing Python code as if it were language X or pretending
> Python is all about "readable, executable Pseudocode for anyone".
There's no "pretending". Python is excellent for writing readable,
executable pseudo-code for anyone. With Python 3.0, GvR had the
opportunity to strip Python of all the features that make Python easy to
learn, and he didn't. Python still has features that are easy for
newbies, and features that are powerful for experienced coders, and that
friendliness for newbies isn't going away. That's a good thing.
--
Steven
> On Wed, 01 Oct 2008 10:38:12 +0000, Marc 'BlackJack' Rintsch wrote:
>
>> Even if newbies don't understand all the details they should be
>> introduced to ``with`` right away IMHO. Because if you explain all the
>> details, even if they understand them, they likely will ignore the
>> knowledge because doing it right is a lot of boilerplate code. So
>> usually people write less robust code and ``with`` is a simple way to
>> solve that problem.
>
> So what you're saying is that we should encourage cargo-cult coding.
> "Write this boilerplate, because I tell you that if you do, good things
> will happen."
It's not cargo cult programming if you tell people to use the ``with``
statement to make sure the file will be closed after the block is left,
for whatever reason the block was left.
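To make that concrete, here is a small sketch (Python 3, with a throwaway temporary file so it is self-contained) showing that the file is closed both on normal exit and when an exception leaves the block, next to the `try`/`finally` boilerplate it replaces:

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("1 2 3\n4 5 6\n")

# The with statement closes the file when the block is left ...
with open(path) as lines:
    first = next(lines)
closed_after_normal_exit = lines.closed

# ... even if the block is left because of an exception.
try:
    with open(path) as lines:
        raise ValueError("something went wrong mid-block")
except ValueError:
    closed_after_exception = lines.closed

# Roughly what one would have to write by hand without it.
lines = open(path)
try:
    first_again = next(lines)
finally:
    lines.close()

os.remove(path)
```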
>> Why on earth has everything to be guessable for someone who doesn't
>> know Python or even programming at all?
>
> Oh please. Don't take my words out of context. I'm not talking about
> "everything", and I'm not suggesting that advanced programming features
> should be prohibited and we should write to the level my grandmother
> would understand.
>
> The context was that a Fortran programmer asked for some help in writing
> a piece of code in Python. Your answer was entirely opaque and
> undecipherable to the OP. If your intention in answering was to teach
> the OP how to write Python code, you failed, because the OP couldn't
> understand your code! You can argue with me until Doomsday and it won't
> change that basic fact.
My intention wasn't to teach the OP how to write Python but to give a
concise, easy and straightforward solution in Python. Yes, I really
believe I have written such a thing. I'm well aware that a Fortran
programmer will not understand this without learning Python.
> Your answer may have solved the OP's *technical* problem, but it didn't
> do anything to solve the OP's *actual* problem, which was that he didn't
> know enough basic Python techniques to solve a simple problem. And
> that's the issue I was commenting on.
If he doesn't know enough basic Python techniques to solve *a simple
problem* I think this is the wrong forum and he should work through the
tutorial from the documentation to learn the basics first. The tutorial
includes `map()`, list comprehensions, methods in strings, the fact that
files are iterable, and generator expressions.
> [more snippage]
>> > Nevertheless, for people coming from less dynamic languages than
>> > Python (such as Fortran), it is a common idiom to never use the same
>> > variable for two different things. It's not a bad choice really:
>> > imagine reading a function where the name "lines" started off as an
>> > integer number of lines, then became a template string, then was used
>> > for a list of character positions...
>>
>> Which I'm not doing at all. It has the same duck type all the time:
>> "iterable of lines".
>
> It has nothing to do with duck typing and everything to do with re-use
> of variables (or in Python, names) for different "things". Just because
> "lines" has the same duck-type doesn't mean they are conceptually the
> same things.
Of course it means they are the same "things", that is what duck typing
is about. In a statically typed language `lines` would be declared as
`Iterable<str>` or similar. Files opened for reading have that interface
and the generator expression has the very same type. A hypothetical
statically typed Python variant with a ``declare`` statement should
compile the following without problems because `generator` objects would
implement `Iterable<A>` and `line` is of type `str`:
declare lines as Iterable<str>
lines = open('test.txt')
lines = (line for line in lines if line.strip())
#...
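Something close to this can be written for real with the optional type annotations of Python 3, which postdate this discussion. A sketch, assuming `typing.Iterable` and using `io.StringIO` with made-up text in place of the file; a static checker such as mypy should accept both assignments, since text streams and generators of `str` both count as `Iterable[str]`:

```python
import io
from typing import Iterable

# io.StringIO stands in for open('test.txt'); the text is invented.
lines: Iterable[str] = io.StringIO("a\n\nb\n")

# Rebinding to the filtered generator keeps the declared type.
lines = (line for line in lines if line.strip())

result = list(lines)
```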
>> *I* think I would do Python a disservice if I encourage people to
>> continue writing Python code as if it were language X or pretending
>> Python is all about "readable, executable Pseudocode for anyone".
>
> There's no "pretending". Python is excellent for writing readable,
> executable pseudo-code for anyone.
Yes, but that's not what Python is all about. I use it for programming
and not for writing code with the primary goal of being newbie-friendly
or pseudocode-like.
Ciao,
Marc 'BlackJack' Rintsch
I think that by making range, dict.keys, dict.values, filter, map, etc.
return lazy iterables, GvR has made the language a little less easy to
understand for newbies.
What's a range(n)? A function that returns a list of n items, from 0
to n-1. This is easy to understand, while xrange(n) is a bit less easy
to understand (a generator or generators).
Python is growing toward being more fit for medium-large programs, and
less fit for being small, simple and easy. Lua for example is now
maybe better than Python if you need something light to script a large
C++ program, so the niche partially left free by Python that has gone
"up" is being partially replaced by Lua.
Bye,
bearophile
> What's a range(n)? A function that returns a list of n items, from 0 to
> n-1. This is easy to understand, while xrange(n) is a bit less easy to
> understand (a generator or generators).
<nitpick>
`xrange()` doesn't return a generator or iterator but an object that
implements the sequence protocol:
In [159]: a = xrange(0, 10, 2)
In [160]: len(a)
Out[160]: 5
In [161]: a[0]
Out[161]: 0
In [162]: a[2]
Out[162]: 4
</nitpick>
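The same distinction carries over to Python 3, where `range()` behaves like the old `xrange()`. A short sketch contrasting it with an actual generator:

```python
r = range(0, 10, 2)          # Python 3 range is lazy, like Python 2's xrange
length = len(r)              # a sequence knows its length
third = r[2]                 # and supports indexing
twice = (list(r), list(r))   # and can be iterated more than once

g = (n for n in r)           # a generator, by contrast ...
try:
    len(g)
    has_len = True
except TypeError:            # ... has no len()
    has_len = False

# ... and is used up after a single pass.
first_pass, second_pass = list(g), list(g)
```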
Ciao,
Marc 'BlackJack' Rintsch
On Thu, 02 Oct 2008 23:49:16 +0000, Marc 'BlackJack' Rintsch wrote:
> On Thu, 02 Oct 2008 14:51:29 +0000, Steven D'Aprano wrote:
>
>> On Wed, 01 Oct 2008 10:38:12 +0000, Marc 'BlackJack' Rintsch wrote:
>>
>>> Even if newbies don't understand all the details they should be
>>> introduced to ``with`` right away IMHO. Because if you explain all
>>> the details, even if they understand them, they likely will ignore the
>>> knowledge because doing it right is a lot of boilerplate code. So
>>> usually people write less robust code and ``with`` is a simple way to
>>> solve that problem.
>>
>> So what you're saying is that we should encourage cargo-cult coding.
>> "Write this boilerplate, because I tell you that if you do, good things
>> will happen."
>
> It's not cargo cult programming if you tell people to use the ``with``
> statement to make sure the file will be closed after the block is left,
> for whatever reason the block was left.
You are right. If you explain what "with" blocks do, it isn't cargo cult
programming.
>>> Why on earth has everything to be guessable for someone who doesn't
>>> know Python or even programming at all?
>>
>> Oh please. Don't take my words out of context. I'm not talking about
>> "everything", and I'm not suggesting that advanced programming features
>> should be prohibited and we should write to the level my grandmother
>> would understand.
>>
>> The context was that a Fortran programmer asked for some help in
>> writing a piece of code in Python. Your answer was entirely opaque and
>> undecipherable to the OP. If your intention in answering was to teach
>> the OP how to write Python code, you failed, because the OP couldn't
>> understand your code! You can argue with me until Doomsday and it won't
>> change that basic fact.
>
> My intention wasn't to teach the OP how to write Python but to give a
> concise, easy and straightforward solution in Python. Yes, I really
> believe I have written such a thing. I'm well aware that a Fortran
> programmer will not understand this without learning Python.
I'm curious what the point of answering the OP's question was if you knew
he wouldn't understand the answer. You might have saved us both a lot of
time if you started your post with "You aren't expected to understand
this".
>> Your answer may have solved the OP's *technical* problem, but it didn't
>> do anything to solve the OP's *actual* problem, which was that he
>> didn't know enough basic Python techniques to solve a simple problem.
>> And that's the issue I was commenting on.
>
> If he doesn't know enough basic Python techniques to solve *a simple
> problem* I think this is the wrong forum and he should work through the
> tutorial from the documentation to learn the basics first. The tutorial
> includes `map()`, list comprehensions, methods in strings, the fact that
> files are iterable, and generator expressions.
Then you should have said so.
>> [more snippage]
>>> > Nevertheless, for people coming from less dynamic languages than
>>> > Python (such as Fortran), it is a common idiom to never use the same
>>> > variable for two different things. It's not a bad choice really:
>>> > imagine reading a function where the name "lines" started off as an
>>> > integer number of lines, then became a template string, then was
>>> > used for a list of character positions...
>>>
>>> Which I'm not doing at all. It has the same duck type all the time:
>>> "iterable of lines".
>>
>> It has nothing to do with duck typing and everything to do with re-use
>> of variables (or in Python, names) for different "things". Just because
>> "lines" has the same duck-type doesn't mean they are conceptually the
>> same things.
>
> Of course it means they are the same "things", that is what duck typing
> is about.
No, you still don't understand me. Let me give you a more extreme example
to help clarify:
average_age = 64.7
width_of_page = 20.2
speed_of_car = 35.2
concentration_of_acid = 1.03
children_per_family = 2.3
All of the above have not just the same duck-type, but the same actual
type (floats), and yet they are COMPLETELY different things. Imagine a
piece of code like this:
def foo():
    x = 64.7 # x is the average age of a person
    # ... more lines of code here
    x = 2.3 # x is now the average number of children per family
    # ...
    return something
Would you defend the above code on the basis that x had the same duck-
type in both places? I hope not.
A decade or so ago, one of the Mars spaceships crashed because a coder
used a variable that was a float in inches when they were supposed to use
a variable that was a float in millimetres (or vice versa, I forget).
Because of this mistake, the retro-rockets fired too late, and the
spaceship flew into the surface of Mars at some thousands of miles an
hour. And yet both variables were not just the same duck-type, but the
same actual type. You cannot conclude that two things are the same kind
of thing just because they have the same type.
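A small sketch of the same point in code, assuming Python 3's `typing.NewType`; the unit names and the helper `to_millimetres` are hypothetical and have nothing to do with the actual spacecraft software:

```python
from typing import NewType

# Hypothetical unit types for illustration; at runtime both are plain floats.
Inches = NewType("Inches", float)
Millimetres = NewType("Millimetres", float)

def to_millimetres(length: Inches) -> Millimetres:
    return Millimetres(length * 25.4)

width_in = Inches(2.0)
width_mm = to_millimetres(width_in)

# Same actual type, different meaning.
same_runtime_type = type(width_in) is type(width_mm) is float

# Nothing at runtime stops this meaningless sum of inches and millimetres.
nonsense = width_in + width_mm
```

Only a static checker that knows about the two declarations would complain about a call such as `to_millimetres(width_mm)`; the interpreter itself sees two floats and carries on.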
The difference between the above and re-using the same variable for
lines_of_text_before_filtering and lines_of_text_after_filtering is one
of degree.
Now, in practice, I personally think that what you did was perfectly
acceptable. I do it myself. I think the coders who refuse to EVER re-use
variables in a code block are being over-strict. But I am aware that the
cost of re-using variables is to increase the risk of confusing the two
different meanings of the variable name.
When I'm reading code within the boundaries of my understanding, that
risk is tiny. But when I'm reading code that is complicated or in a
language I don't understand, then the risk is magnified greatly. That's
all I was trying to get across.
I don't think I'm making an unreasonable claim.
[snip]
> Yes, but that's not what Python is all about. I use it for programming
> and not for writing code with the primary goal of being newbie-friendly
> or pseudocode-like.
I never suggested that being newbie friendly was the only acceptable use
of Python. But I think that when you are replying to a newbie's direct
question, there are only two ways to answer: newbie-friendly, or newbie-
hostile.
--
Steven
I meant "a generator of generators", a typo, sorry. -.-
Bye,
bearophile