Switching from Perl here, and having a hard time letting go...
Suppose I have an "array" foo, and that I'm interested in the 4th, 8th,
second, and last element in that array. In Perl I could write:
my @wanted = @foo[3, 7, 1, -1];
I was a bit surprised when I got this in Python:
>>> wanted = foo[3, 7, 1, -1]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: list indices must be integers
Granted, Perl's syntax is often obscure and hard-to-read, but in
this particular case I find it quite transparent and unproblematic,
and the fictional "pythonized" form above even more so.
The best I've been able to come up with in Python are the somewhat
Perl-like-in-its-obscurity:
>>> wanted = map(foo.__getitem__, (3, 7, 1, -1))
or the clearer but unaccountably sesquipedalian
>>> wanted = [foo[i] for i in 3, 7, 1, -1]
>>> wanted = [foo[3], foo[7], foo[1], foo[-1]]
Are these the most idiomatically pythonic forms? Or am I missing
something better?
TIA!
kynn
There is only so much room in the syntax for common cases before you
end up with ... perl (no offense intended, I'm a perl monk[1]). The
Python grammar isn't as context sensitive or irregular as the perl
grammar, so in mylist[1,2,3] the "1,2,3" is always interpreted as a
tuple, and the square brackets always expect an int or a slice.
Not including special cases everywhere means there isn't a short way
to handle special cases but it also means human readers have to
remember fewer special cases. Perl and Python make different
tradeoffs in that respect.
-Jack
You've just tried to index a list with a tuple...
If foo was a dictionary then this might make sense.
> Granted, Perl's syntax is often obscure and hard-to-read, but in
> this particular case I find it quite transparent and unproblematic,
> and the fictional "pythonized" form above even more so.
>
> The best I've been able to come up with in Python are the somewhat
> Perl-like-in-its-obscurity:
>
> >>> wanted = map(foo.__getitem__, (3, 7, 1, -1))
>
> or the clearer but unaccountably sesquipedalian
>
> >>> wanted = [foo[i] for i in 3, 7, 1, -1]
> >>> wanted = [foo[3], foo[7], foo[1], foo[-1]]
>
> Are these the most idiomatically pythonic forms? Or am I missing
> something better?
First, run "import this" at the python interactive interpreter to
remind yourself of the philosophical differences between perl and
python. I think the philosophy of python is the major reason why it
is such a good language.
As I transitioned from perl to python it took me a while to let go of
perlisms, and get used to writing a little bit more code (usually
only a few characters more) which was much, much clearer.
Perl is full of cleverness which gives you great pleasure to write as a
programmer. However when you or someone else looks at that code later
they don't get that same pleasure - horror is more likely the
reaction! Python just isn't like that.
I'd probably write
wanted = foo[3], foo[7], foo[1], foo[-1]
(assuming you didn't mind having a tuple rather than a list)
or maybe this
wanted = [ foo[i] for i in 3, 7, 1, -1 ]
However I can't think of the last time I wanted to do this - array
elements having individual purposes are usually a sign that you should
be using a different data structure.
--
Nick Craig-Wood <ni...@craig-wood.com> -- http://www.craig-wood.com/nick
Could you explain your use case? It could be that a list isn't the
appropriate data structure.
Cheers,
Brian
You're missing operator.itemgetter:
>>> from operator import itemgetter
>>> foo = "spam & eggs"
>>> itemgetter(3, 7, 1, -1)(foo)
('m', 'e', 'p', 's')
>>>
--
Arnaud
>There is only so much room in the syntax for common cases before you
>end up with ... perl (no offense intended, I'm a perl monk[1]). The
>Python grammar isn't as context sensitive or irregular as the perl
>grammar, so in mylist[1,2,3] the "1,2,3" is always interpreted as a
>tuple, and the square brackets always expect an int or a slice.
>Not including special cases everywhere means there isn't a short way
>to handle special cases but it also means human readers have to
>remember fewer special cases. Perl and Python make different
>tradeoffs in that respect.
OK, I see: if Python allowed foo[3,7,1,-1], then foo[3] would be
ambiguous: does it mean the fourth element of foo, or the tuple
consisting of this element alone? I suppose that's good enough
reason to veto this idea...
Thanks for all the responses.
kynn
>However I can't think of the last time I wanted to do this - array
>elements having individual purposes are usually a sign that you should
>be using a different data structure.
In the case I was working with, foo was a stand-in for the value returned
by some_match.groups(). The match comes from a standard regexp
defined elsewhere and that captures more groups than I need. (This
regexp is applied to every line of a log file.)
kj
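For the some_match.groups() case described above, operator.itemgetter (suggested earlier in the thread) can be applied directly to the tuple of groups. This is only a minimal sketch: the log line, the pattern, and the names LINE, PATTERN and pick are invented for illustration and are not the actual regexp from the post.

```python
import re
from operator import itemgetter

# Hypothetical log line and pattern, made up for this example.
LINE = "2009-06-11 14:02:33 host42 sshd 1234 accepted alice"
PATTERN = re.compile(r"(\S+) (\S+) (\S+) (\S+) (\d+) (\S+) (\S+)")

match = PATTERN.match(LINE)
groups = match.groups()      # a plain tuple of all captured strings

# Pick the 3rd, 1st and last captured group (0-based indices into the tuple).
pick = itemgetter(2, 0, -1)
host, date, user = pick(groups)
print(host, date, user)
```

Because itemgetter(...) returns a reusable callable, it can be built once and applied to every line of the log file.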
That looks to me like the best solution to the OP's specific
question. It's amazing how many cool things are tucked into the
standard library. While it's easy to miss these things, I appreciate
the effort to keep Python's core relatively small. (Not knowing about
itemgetter, I would have gone with the list comprehension.)
John
The common idiom for this sort of thing is:
_, _, _, val1, _, _, _, val2, ..., val3 = some_match.groups()
Cheers,
Brian
val1, val2, val3 = some_match.group(4, 8, something)
Actually, now that I think about it, naming the groups seems like it
would make this code a lot less brittle.
Cheers,
Brian
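A small sketch of both suggestions, assuming a made-up pattern and input (none of the names below come from the original regexp): match.group() accepts several group numbers or names at once and returns a tuple, and named groups remove the dependence on positions.

```python
import re

# Hypothetical pattern and input, invented for this example.
pattern = re.compile(r"(?P<date>\S+) (?P<time>\S+) (?P<host>\S+) (?P<msg>.*)")
m = pattern.match("2009-06-11 14:02:33 host42 disk almost full")

# group() takes several group numbers (1-based) and returns a tuple.
date, host = m.group(1, 3)

# With named groups the call no longer breaks if a group is added or moved.
host_by_name, msg = m.group("host", "msg")

# groupdict() gives every named group as a dictionary.
fields = m.groupdict()
```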
>OK, I see: if Python allowed foo[3,7,1,-1], then foo[3] would be
>ambiguous: does it mean the fourth element of foo, or the tuple
>consisting of this element alone? I suppose that's good enough
>reason to veto this idea...
Hmmm, come to think of it, this argument is weaker than I thought
at first. Python already has cases where it has to deal with this
sort of ambiguity, and does so with a trailing comma. So the two
cases could be disambiguated: foo[3] for the single element, and
foo[3,] for the one-element tuple.
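As far as I can tell, the syntax already makes exactly this distinction; it is the list type that rejects the tuple. A minimal sketch, using a hypothetical class Probe whose __getitem__ simply hands back whatever object it receives as the index:

```python
class Probe(object):
    """Hypothetical helper: returns the index object it is given."""
    def __getitem__(self, index):
        return index

p = Probe()
print(p[3])             # 3
print(p[3,])            # (3,)
print(p[3, 7, 1, -1])   # (3, 7, 1, -1)
print(p[3, 4:10:2, 7])  # (3, slice(4, 10, 2), 7)
```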
Also, the association of this idiom with Perl is a bit unfair:
tuple-index is very common in other languages, and in pure math as
well.
As I said in my original post, Perl code often has very obscure
expressions, but I would never describe tuple indexing as one of
them.
By the same token, the design of Python does not entirely disregard
considerations of ease of writing. Who could argue that
foo[:]
is intrinsically clearer, or easier to read than
foo[0:len(foo)]
?
Please don't misunderstand me here. I don't want to criticize, let
alone change, Python. I'm sure there is a good reason for why
Python doesn't support foo[3,7,1,-1], but I have not figured it
out yet. I still find it unconvincing that it would be for the
sake of keeping the code easy to read, because I don't see how
foo[3,7,1,-1] is any more confusing than foo[:].
kj
>k> Switching from Perl here, and having a hard time letting go...
>k> Suppose I have an "array" foo, and that I'm interested in the 4th, 8th,
>k> second, and last element in that array. In Perl I could write:
>k> my @wanted = @foo[3, 7, 1, -1];
>k> I was a bit surprised when I got this in Python:
>>>>> wanted = foo[3, 7, 1, -1]
>k> Traceback (most recent call last):
>k> File "<stdin>", line 1, in <module>
>k> TypeError: list indices must be integers
>k> Granted, Perl's syntax is often obscure and hard-to-read, but in
>k> this particular case I find it quite transparent and unproblematic,
>k> and the fictional "pythonized" form above even more so.
>k> The best I've been able to come up with in Python are the somewhat
>k> Perl-like-in-its-obscurity:
>>>>> wanted = map(foo.__getitem__, (3, 7, 1, -1))
>k> or the clearer but unaccountably sesquipedalian
>>>>> wanted = [foo[i] for i in 3, 7, 1, -1]
>>>>> wanted = [foo[3], foo[7], foo[1], foo[-1]]
>k> Are these the most idiomatically pythonic forms? Or am I missing
>k> something better?
Do it yourself:
class MyList(list):
    def __getitem__(self, indx):
        if isinstance(indx, tuple):
            # A tuple index selects several items; each element is looked
            # up through self[...], so nested tuples and slices work too.
            return [self[i] for i in indx]
        else:
            return list.__getitem__(self, indx)

l = MyList(range(10))
print(l[3, 7, 1, -1])
print(l[3])
print(l[3:7])
# and now for something completely different
print(l[3, (7, 1), -1])
duck :=)
--
Piet van Oostrum <pi...@cs.uu.nl>
URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
Private email: pi...@vanoostrum.org
> the square brackets always expect an int or a slice.
true only for lists :)
In [1]: mydict = {}
In [2]: mydict[1,2,3] = 'hi'
In [3]: print mydict[1,2,3]
------> print(mydict[1,2,3])
hi
--
By ZeD
even better:
print l[3, 4:10:2, 7]
> OK, I see: if Python allowed foo[3,7,1,-1], then foo[3] would be
> ambiguous: does it mean the fourth element of foo, or the tuple
> consisting of this element alone? I suppose that's good enough
> reason to veto this idea...
There's nothing ambiguous about it. obj.__getitem__(x) already accepts two
different sorts of objects for x: ints and slice-objects:
>>> range(8)[3]
3
>>> range(8)[slice(3)]
[0, 1, 2]
>>> range(8)[slice(3, None)]
[3, 4, 5, 6, 7]
>>> range(8)[slice(3, 4)]
[3]
Allowing tuple arguments need not be ambiguous:
range(8)[3] => 3
range(8)[(3,)] => [3]
range(8)[(3,5,6)] => [3, 5, 6]
I've rarely needed to grab arbitrary items from a list in this fashion. I
think a more common case would be extracting fields from a tuple. In any
case, there are a few alternatives:
Grab them all, and ignore some of them (ugly for more than one or two
ignorable items):
_, x, _, _, y, _, _, _, z, _ = range(10) # x=1, y=4, z=9
Subclass list to allow tuple slices:
>>> class MyList(list):
...     def __getitem__(self, obj):
...         if type(obj) is tuple:
...             return [self[i] for i in obj]
...         else:
...             return list.__getitem__(self, obj)
...
>>> L = MyList(range(10))
>>> L[(1, 4, 8)]
[1, 4, 8]
Write a helper function:
def getitems(L, *indexes):
    if len(indexes) == 1:
        indexes = indexes[0]
    return [L[i] for i in indexes]
But I think this is an obvious enough extension to the __getitem__ protocol
that I for one would vote +1 on it being added to Python sequence objects
(lists, tuples, strings).
--
Steven
Whoops! Your example is broken:
>>> cars = ['Ford', 'Toyota', 'Edsel']
>>> getitems(cars, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in getitems
TypeError: 'int' object is not iterable
>>>
I think you meant to apply that [0] to the created list instead.
Something like:
def getitems(L, *indexes):
    new_list = [L[i] for i in indexes]
    if len(indexes) == 1:
        new_list = new_list[0]
    return new_list
But I'm not sure that would be the best idea anyway. Just let getitems
always return a list. That way the caller doesn't have to test the
length to figure out what to do with it. If you know you want a single
item, you can use regular old .__getitem__ (or .get) methods, or direct
indexing.
Then getitems can just be:
def getitems(L, *indexes):
    return [L[i] for i in indexes]
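A quick usage sketch of that last version, reusing the cars list from above and the "spam & eggs" string from the itemgetter example; the function is repeated here only so the snippet runs on its own:

```python
def getitems(L, *indexes):
    # Always return a list, even when only one index is given.
    return [L[i] for i in indexes]

cars = ['Ford', 'Toyota', 'Edsel']
print(getitems(cars, 1))                      # ['Toyota']
print(getitems(cars, 2, 0, -1))               # ['Edsel', 'Ford', 'Edsel']
print(getitems("spam & eggs", 3, 7, 1, -1))   # ['m', 'e', 'p', 's']
```

The single-index call returns a one-element list, so the caller never has to check the length before using the result.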
>
> But I think this is an obvious enough extension to the __getitem__ protocol
> that I for one would vote +1 on it being added to Python sequence objects
> (lists, tuples, strings).
>
I'd be +0. It won't change my life, but it seems like a decent idea.
>
> --
> Steven
>
Cheers,
Cliff
>>> a = np.arange(10)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
>>> a[[3,7,1,-1]]
array([3, 7, 1, 9])
hth,
Alan Isaac
What's np.arange?
--
Steven
import numpy as np
--
Pierre "delroth" Bourdon <del...@gmail.com>
Étudiant à l'EPITA / Student at EPITA
Perfect example of why renaming namespaces should be done only when
absolutely required, that is, almost never.
Jean-Michel
Actually, "np." is quite commonly used in the numpy community,
so it is a bit of a "term of art". Since you can often use
several numpy elements in an expression, brevity is appreciated,
and at least they've stopped assuming "from numpy import *" in
their documents. Unfortunately, if you work in a numpy world long
enough, you'll forget that not everyone uses numpy.
--Scott David Daniels
Scott....@Acm.Org
I disagree. Renaming namespaces should always be done if it will help
stop people from doing a 'from package import *'. However, example code
should always include relevant imports.
Cheers,
Cliff
Agreed. It was a cut and paste failure.
Apologies.
Alan Isaac
I was about to suggest that too, but it sounds like the OP has little
or no control, in this case, over the RE itself. Another thing I
would suggest is using the (?:...) syntax--it allows creating a syntactic
group that isn't returned in the list of match groups.
br
Jean-Michel
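To illustrate the difference, here is a minimal sketch with two made-up patterns (the names capturing and non_capturing are only for this example): the same text is matched both ways, and only the parenthesised groups without ?: show up in groups().

```python
import re

# Hypothetical patterns, invented for this example.
capturing = re.compile(r"(\d+)-(\w+)-(\d+)")
non_capturing = re.compile(r"(\d+)-(?:\w+)-(\d+)")

text = "12-abc-34"
print(capturing.match(text).groups())      # ('12', 'abc', '34')
print(non_capturing.match(text).groups())  # ('12', '34')
```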
Technically, no. But we're dealing with people, who are notoriously
*un*technical in their behavior. A person is much more likely to
develop bad habits if the alternative means more work for them. The
reason people do `from foo import *` is that they don't want to type
more than they have to. If they can write a one or two letter
namespace, they're likely to be happy with that trade-off. If the
alternative is to write out long module names every time you use a
variable, they'll tend to develop bad habits. To paraphrase Peter
Maurin, coding guidelines should have the aim of helping to "bring about
a world in which it is easy to be good."
I don't really see much problem with renaming namespaces: for people
reading the code, everything is explicit, as you can just look at the
top of the module to find out what module a namespace variable
represents; the local namespace doesn't get polluted with God knows what
from God knows where; and code remains succinct.
I've found in my own code that using, for example, the full name
`sqlalchemy` means that I have to go through painful contortions to get
the code down to the PEP-8 recommended 79 characters per line. The
resulting mess of multi-line statements is significantly less readable
than the same code using the abbreviation `sa`.
Do you have an argument for avoiding renaming namespaces? So far the
only example you provided is a code fragment that doesn't run. I don't
disagree with you on that example; referring to numpy as np without
telling anyone what np refers to is a bad idea, but no functioning piece
of code could reasonably do that.
Cheers,
Cliff
Maybe I've been a little bit too dictatorial when I was saying that
renaming namespaces should be avoided.
Sure, your way of doing it makes sense. In fact there are 2 main purposes
of having strong coding rules:
1/ ease the coder's life
2/ ease the reader's life
The perfect rule satisfies both the coder and the reader, but when I have
to choose, I prefer the reader. Renaming packages, especially those that
are used worldwide, may confuse the reader and force him to browse
through more code.
From the OP's example, I was just pointing out the fact that **he alone**
gains 3 characters when **all** the readers need to ask what "np" means.
Renaming namespaces with a well-chosen (meaningful) name is harmless.
br
Jean-Michel
<snip>
> Maybe I've been a little bit too dictatorial when I was saying that
> renaming namespaces should be avoided.
> Sure your way of doing make sense. In fact they're 2 main purposes of
> having strong coding rules:
> 1/ ease the coder's life
> 2/ ease the reader's life
>
> The perfect rule satisfies both of them, but when I have to choose, I
> prefer number 2. Renaming packages, especially those who are world wide
> used, may confuse the reader and force him to browse into more code.
>
> From the OP example, I was just pointing the fact that **he alone**
> gains 3 characters when **all** the readers need to ask what means "np".
> Renaming namespaces with a well chosen name (meaningful) is harmless.
As long as you keep all import statements at the head of the file, there
are no readability problems with renaming a namespace.
Glance at the head of the file, see:
import numpy as np
and it's not hard to mentally read np as numpy...
well, as long as your header doesn't look like this:
import numpy as np
import itertools as it
import Tkinter as Tk
from time import time as t
yep, your example is good, no namespace renaming ... :o)
I would gladly accept the following renaming:
import theMostEfficientPythonPackageInTheWorld as meppw
Fortunately, package names are often usable as provided.
Moreover, writing numpy instead of np is not harder for the coder than
switching mentally from np to numpy is for the reader. It's just about
whose life you want to make easier: the coder's or the reader's?
br
Jean-Michel
> Moreover, writing numpy instead of np is not harder for the coder than
> switching mentally from np to numpy for the reader. It's just about who
> you want to make the life easier, the coder or the reader ?
<shrug> It depends on the audience. For those familiar with numpy and the np
convention, it's easier to read code that uses np because there are many lines
with several numpy functions called in each.
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
My point was: use namespace renaming whenever it improves readability;
however, like all tools, don't overuse it.
Another use case might be when you have two similarly named packages,
which might cause confusion about which is which if left as is.