[Python-ideas] Optional kwarg making attrgetter & itemgetter always return a tuple


Masklinn

Sep 13, 2012, 9:15:03 AM
to python-ideas
attrgetter and itemgetter are both very useful functions, but both have
a significant pitfall if the arguments passed in are validated but not
controlled: if receiving the arguments (list of attributes, keys or
indexes) from an external source and *-applying it, if the external
source passes a sequence of one element both functions will in turn
return an element rather than a singleton (1-element tuple).

This means such code, for instance code "slicing" a matrix of some sort
to get only some columns and getting the slicing information from its
caller (in situation where extracting a single column may be perfectly
sensible) will have to implement a manual dispatch between a "manual"
getitem (or getattr) and an itemgetter (resp. attrgetter) call, e.g.

    slicer = (operator.itemgetter(*indices) if len(indices) > 1
              else lambda ar: (ar[indices[0]],))

This makes for more verbose and less straightforward code, I think it
would be useful to such situations if attrgetter and itemgetter could be
forced into always returning a tuple by way of an optional argument:

# works the same no matter what len(indices) is
slicer = operator.itemgetter(*indices, force_tuple=True)

In terms of the example equivalences in the docs [0], this would amount to
overriding the `len` check (forcing it to False): `len(items) == 1` would
become `len(items) == 1 and not force_tuple`.

The argument is backward-compatible as neither function currently
accepts any keyword argument.
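
A minimal sketch of the proposed behaviour, adapted from the pure-Python equivalent shown in the operator docs. The name `itemgetter_sketch` is made up for illustration and `force_tuple` is only the hypothetical flag proposed here, not an existing parameter of `operator.itemgetter`:

```python
def itemgetter_sketch(*items, force_tuple=False):
    """Illustrative stand-in for operator.itemgetter with a hypothetical flag."""
    if len(items) == 1 and not force_tuple:
        item = items[0]

        def g(obj):
            # Current single-argument behaviour: return the bare element.
            return obj[item]
    else:
        def g(obj):
            # Multi-argument (or forced) behaviour: always build a tuple.
            return tuple(obj[item] for item in items)
    return g


row = ['a', 'b', 'c', 'd']
print(itemgetter_sketch(1)(row))                       # b
print(itemgetter_sketch(1, force_tuple=True)(row))     # ('b',)
print(itemgetter_sketch(1, 2, force_tuple=True)(row))  # ('b', 'c')
```

In this sketch an empty argument list simply yields an empty tuple; whether the real functions should do that is the open question raised below.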

Open question: should force_tuple (or whatever it ends up being called)
also silence the error raised when len(indices) == 0, returning an
empty tuple rather than raising a TypeError?

[0] http://docs.python.org/dev/library/operator.html#operator.attrgetter
_______________________________________________
Python-ideas mailing list
Python...@python.org
http://mail.python.org/mailman/listinfo/python-ideas

Terry Reedy

Sep 13, 2012, 3:11:22 PM
to python...@python.org
This seems like a plausible idea. The actual C version requires one
argument. The Python equivalent in the doc does not (hence the different
signature), as it would return an empty tuple for empty *items.
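
The difference can be checked with a short sketch. `doc_style_itemgetter` is a hypothetical name for a function written in the style of the documented Python equivalent; it is not part of the operator module:

```python
import operator

# The real itemgetter insists on at least one argument.
error_raised = False
try:
    operator.itemgetter()
except TypeError as exc:
    error_raised = True
    print("TypeError:", exc)


def doc_style_itemgetter(*items):
    """Written like the documented equivalent, which accepts zero items."""
    if len(items) == 1:
        item = items[0]

        def g(obj):
            return obj[item]
    else:
        def g(obj):
            return tuple(obj[item] for item in items)
    return g


# With no items, the doc-style version falls into the tuple branch.
print(doc_style_itemgetter()(['a', 'b']))  # ()
```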

--
Terry Jan Reedy

Steven D'Aprano

Sep 13, 2012, 9:20:38 PM
to python...@python.org
On 13/09/12 23:15, Masklinn wrote:
> attrgetter and itemgetter are both very useful functions, but both have
> a significant pitfall if the arguments passed in are validated but not
> controlled: if receiving the arguments (list of attributes, keys or
> indexes) from an external source and *-applying it, if the external
> source passes a sequence of one element both functions will in turn
> return an element rather than a singleton (1-element tuple).

For those who, like me, had to read this three or four times to work out
what Masklinn is talking about, I think he is referring to the fact that
attrgetter and itemgetter both return a single element if passed a single
index, otherwise they return a tuple of results.

If a call itemgetter(*args)(some_list) returns a tuple, was that tuple
a single element (and args contained a single index) or was the tuple
a collection of individual elements (and args contained multiple
indexes)?

py> itemgetter(*[1])(['a', ('b', 'c'), 'd'])
('b', 'c')
py> itemgetter(*[1, 2])(['a', 'b', 'c', 'd'])
('b', 'c')


> This means such code, for instance code "slicing" a matrix of some sort
> to get only some columns and getting the slicing information from its
> caller (in situation where extracting a single column may be perfectly
> sensible) will have to implement a manual dispatch between a "manual"
> getitem (or getattr) and an itemgetter (resp. attrgetter) call, e.g.
>
>     slicer = (operator.itemgetter(*indices) if len(indices) > 1
>               else lambda ar: (ar[indices[0]],))


Why is this a problem? If you don't like writing this out in place, write
it once in a helper function. Not every short code snippet needs to be in
the standard library.


> This makes for more verbose and less straightforward code, I think it
> would be useful to such situations if attrgetter and itemgetter could be
> forced into always returning a tuple by way of an optional argument:

-1

There is no need to add extra complexity to itemgetter and attrgetter for
something best solved in your code. Write a helper:

    def slicer(*indexes):
        getter = itemgetter(*indexes)
        if len(indexes) == 1:
            return lambda seq: (getter(seq), )  # Wrap in a tuple.
        return getter



--
Steven

Masklinn

Sep 14, 2012, 3:43:38 AM
to python-ideas
On 2012-09-14, at 03:20 , Steven D'Aprano wrote:
>> This means such code, for instance code "slicing" a matrix of some sort
>> to get only some columns and getting the slicing information from its
>> caller (in situation where extracting a single column may be perfectly
>> sensible) will have to implement a manual dispatch between a "manual"
>> getitem (or getattr) and an itemgetter (resp. attrgetter) call, e.g.
>>
>>     slicer = (operator.itemgetter(*indices) if len(indices) > 1
>>               else lambda ar: (ar[indices[0]],))
>
>
> Why is this a problem?

Because it adds significant complexity to the code, and that's for the
trivial version of itemgetter, attrgetter also does keypath resolution
so the code is nowhere near this simple.

It's also anything but obvious what this snippet does on its own.

> If you don't like writing this out in place, write
> it once in a helper function. Not every short code snippet needs to be in
> the standard library.

It's not really "every short code snippet" in this case, it's a way to
avoid a sometimes deleterious special case and irregularity of the stdlib.

>> This makes for more verbose and less straightforward code, I think it
>> would be useful to such situations if attrgetter and itemgetter could be
>> forced into always returning a tuple by way of an optional argument:
>
> -1
>
> There is no need to add extra complexity to itemgetter and attrgetter for
> something best solved in your code.

I don't agree with this statement, the stdlib flag adds very little
extra complexity, way less than the original irregularity/special case
and way less than necessary to do it outside the stdlib. Furthermore, it
makes the solution (to having a regular output behavior for
(attr|item)getter) far more obvious and makes the code itself much simpler
to read.

Steven D'Aprano

Sep 14, 2012, 5:02:54 AM
to python...@python.org
On 14/09/12 17:43, Masklinn wrote:
> On 2012-09-14, at 03:20 , Steven D'Aprano wrote:
>>> This means such code, for instance code "slicing" a matrix of some sort
>>> to get only some columns and getting the slicing information from its
>>> caller (in situation where extracting a single column may be perfectly
>>> sensible) will have to implement a manual dispatch between a "manual"
>>> getitem (or getattr) and an itemgetter (resp. attrgetter) call, e.g.
>>>
>>>     slicer = (operator.itemgetter(*indices) if len(indices) > 1
>>>               else lambda ar: (ar[indices[0]],))
>>
>>
>> Why is this a problem?
>
> Because it adds significant complexity to the code,

I don't consider that to be *significant* complexity.


> and that's for the
> trivial version of itemgetter, attrgetter also does keypath resolution
> so the code is nowhere near this simple.

I don't understand what you mean by "keypath resolution". attrgetter
simply looks up the attribute(s) by name, just like obj.name would do. It
has the same API as itemgetter, except with attribute names instead of
item indexes.


> It's also anything but obvious what this snippet does on its own.

Once you get past the ternary if operator, the complexity is pretty much
entirely in the call to itemgetter. You don't even use itemgetter in the
else clause! Beyond the call to itemgetter, it's trivially simple Python
code.

slicer = operator.itemgetter(*indices, force_tuple=flag)

is equally mysterious to anyone who doesn't know what itemgetter does.


>> If you don't like writing this out in place, write
>> it once in a helper function. Not every short code snippet needs to be in
>> the standard library.
>
> It's not really "every short code snippet" in this case, it's a way to
> avoid a sometimes deleterious special case and irregularity of the stdlib.


I disagree that this is a "sometimes deleterious special case". itemgetter
and attrgetter have two APIs:

itemgetter(index)(L) => element
itemgetter(index, index, ...)(L) => tuple of elements

and likewise for attrgetter:

attrgetter(name)(L) => attribute
attrgetter(name, name, ...)(L) => tuple of attributes

Perhaps it would have been better if there were four functions rather than
two. Or if the second API were:

itemgetter(sequence_of_indexes)(L) => tuple of elements
attrgetter(sequence_of_names)(L) => tuple of attributes

so that the two getters always took a single argument, and dispatched on
whether that argument is an atomic value or a sequence. But either way,
it is not what I consider a "special case" so much as two related non-
special cases.

But let's not argue about definitions. Special case or not, can you
demonstrate that the situation is not only deleterious, but cannot be
reasonably fixed with a helper function?

Whenever you call itemgetter, there is no ambiguity because you always know
whether you are calling it with a single index or multiple indexes.



>>> This makes for more verbose and less straightforward code, I think it
>>> would be useful to such situations if attrgetter and itemgetter could be
>>> forced into always returning a tuple by way of an optional argument:
>>
>> -1
>>
>> There is no need to add extra complexity to itemgetter and attrgetter for
>> something best solved in your code.
>
> I don't agree with this statement, the stdlib flag adds very little
> extra complexity, way less than the original irregularity/special case

Whether or not it is empirically less than the complexity already there in
itemgetter, it would still be adding extra complexity. It simply isn't
possible to end up with *less* complexity by *adding* features.

(Complexity is not always a bad thing. If we wanted to program in something
simple, we would program using a Turing machine.)

The reader now has to consider "what does the force_tuple argument do?"
which is not necessarily trivial nor obvious. I expect a certain number of
beginners who don't read documentation will assume that you have to do this:

slicer = itemgetter(1, 2, 3, force_tuple=False)

if they want to pass something other than a tuple to slicer. Don't imagine
that adding an additional argument will make itemgetter and attrgetter
*simpler* to understand.


To me, a major red-flag for your suggested API can be seen here:

itemgetter(1, 2, 3, 4, force_tuple=False)

What should this do? I consider all the alternatives to be less than
ideal:

- ignore the explicit keyword argument and return a tuple anyway
- raise an exception

To say nothing of more... imaginative... semantics:

- return a list, or a set, anything but a tuple
- return a single element instead of four (but which one?)

The suggested API is not as straight-forward as you seem to think it is.


> and way less than necessary to do it outside the stdlib. Furthermore, it
> makes the solution (to having a regular output behavior for
> (attr|item)getter) far more obvious and makes the code itself much simpler
> to read.

The only thing I will grant is that it aids in discoverability of a
solution: you don't have to think of the (trivial) solution yourself, you
just need to read the documentation. But I don't see either the problem
or the solution to be great enough to justify adding an argument, writing
new documentation, and doubling the number of tests for both itemgetter and
attrgetter.



--
Steven

Masklinn

Sep 14, 2012, 5:29:47 AM
to python-ideas
On 2012-09-14, at 11:02 , Steven D'Aprano wrote
>> and that's for the
>> trivial version of itemgetter, attrgetter also does keypath resolution
>> so the code is nowhere near this simple.
>
> I don't understand what you mean by "keypath resolution". attrgetter
> simply looks up the attribute(s) by name, just like obj.name would do. It
> has the same API as itemgetter, except with attribute names instead of
> item indexes.

It takes dotted paths, not just attribute names
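
A short illustration of the difference, using a throwaway `SimpleNamespace` object invented for the example:

```python
from operator import attrgetter
from types import SimpleNamespace

# Example data made up for illustration.
user = SimpleNamespace(name=SimpleNamespace(first='Ada', last='Lovelace'))

# attrgetter follows each dot in turn.
print(attrgetter('name.first')(user))               # Ada
print(attrgetter('name.first', 'name.last')(user))  # ('Ada', 'Lovelace')

# Plain getattr treats the whole string as a single attribute name.
plain_getattr_failed = False
try:
    getattr(user, 'name.first')
except AttributeError:
    plain_getattr_failed = True
print(plain_getattr_failed)  # True
```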

>> It's also anything but obvious what this snippet does on its own.
>
> Once you get past the ternary if operator, the complexity is pretty much
> entirely in the call to itemgetter. You don't even use itemgetter in the
> else clause! Beyond the call to itemgetter, it's trivially simple Python
> code.
>
> slicer = operator.itemgetter(*indices, force_tuple=flag)
>
> is equally mysterious to anyone who doesn't know what itemgetter does.

I would expect either foreknowledge or reading up on it to be obvious
in the context of its usage.

>>> If you don't like writing this out in place, write
>>> it once in a helper function. Not every short code snippet needs to be in
>>> the standard library.
>>
>> It's not really "every short code snippet" in this case, it's a way to
>> avoid a sometimes deleterious special case and irregularity of the stdlib.
>
>
> I disagree that this is a "sometimes deleterious special case". itemgetter
> and attrgetter have two APIs:
>
> itemgetter(index)(L) => element
> itemgetter(index, index, ...)(L) => tuple of elements
>
> and likewise for attrgetter:
>
> attrgetter(name)(L) => attribute
> attrgetter(name, name, ...)(L) => tuple of attributes
>
> Perhaps it would have been better if there were four functions rather than
> two. Or if the second API were:
>
> itemgetter(sequence_of_indexes)(L) => tuple of elements
> attrgetter(sequence_of_names)(L) => tuple of attributes
>
> so that the two getters always took a single argument, and dispatched on
> whether that argument is an atomic value or a sequence. But either way,
> it is not what I consider a "special case" so much as two related non-
> special cases.

Which conflict for a sequence of length 1, which is the very reason
why I started this thread.

> But let's not argue about definitions. Special case or not, can you
> demonstrate that the situation is not only deleterious, but cannot be
> reasonably fixed with a helper function?

Which as usual hinges on the definition of "reasonably", of course the
situation can be "fixed" (with "reasonably" being a wholly personal
value judgement) with a helper function or a reimplementation of an
(attr|item)getter-like function from scratch. As it can pretty much
always be. I don't see that as a very useful benchmark.

> Whenever you call itemgetter, there is no ambiguity because you always know
> whether you are calling it with a single index or multiple indexes.

That is not quite correct, even ignoring that you have to call `len` to
do so when the indices are provided by a third party, the correct code
gets yet more complex as the third party could provide an iterator which
would have to be reified before being passed to len(), increasing the
complexity of the "helper" yet again.
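
For concreteness, a helper covering the iterator case might look roughly like this. `tuple_getter` is a hypothetical name, and the sketch deliberately leaves the empty case to whatever itemgetter does (a TypeError):

```python
from operator import itemgetter


def tuple_getter(indices):
    """Return a getter that always yields a tuple (hypothetical helper)."""
    # Reify first: an iterator has no len() and can only be consumed once.
    indices = tuple(indices)
    if len(indices) == 1:
        index = indices[0]
        return lambda seq: (seq[index],)
    # Two or more indices: itemgetter already returns a tuple.
    # Zero indices: itemgetter() raises TypeError, left as-is here.
    return itemgetter(*indices)


row = ['a', 'b', 'c', 'd']
print(tuple_getter(iter([1]))(row))     # ('b',)
print(tuple_getter(iter([1, 2]))(row))  # ('b', 'c')
```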

>>>> This makes for more verbose and less straightforward code, I think it
>>>> would be useful to such situations if attrgetter and itemgetter could be
>>>> forced into always returning a tuple by way of an optional argument:
>>>
>>> -1
>>>
>>> There is no need to add extra complexity to itemgetter and attrgetter for
>>> something best solved in your code.
>>
>> I don't agree with this statement, the stdlib flag adds very little
>> extra complexity, way less than the original irregularity/special case
>
> Whether or not it is empirically less than the complexity already there in
> itemgetter, it would still be adding extra complexity. It simply isn't
> possible to end up with *less* complexity by *adding* features.

At no point did I deny that, as far as I know or can see.

> (Complexity is not always a bad thing. If we wanted to program in something
> simple, we would program using a Turing machine.)
>
> The reader now has to consider "what does the force_tuple argument do?"
> which is not necessarily trivial nor obvious. I expect a certain number of
> beginners who don't read documentation will assume that you have to do this:
>
> slicer = itemgetter(1, 2, 3, force_tuple=False)
>
> if they want to pass something other than a tuple to slicer. Don't imagine
> that adding an additional argument will make itemgetter and attrgetter
> *simpler* to understand.
>
>
> To me, a major red-flag for your suggested API can be seen here:
>
> itemgetter(1, 2, 3, 4, force_tuple=False)
>
> What should this do?

The exact same as `itemgetter(1, 2, 3, 4)`, since `force_tuple` defaults
to False.

> I consider all the alternatives to be less than
> ideal:
>
> - ignore the explicit keyword argument and return a tuple anyway
> - raise an exception
>
> To say nothing of more... imaginative... semantics:
>
> - return a list, or a set, anything but a tuple
> - return a single element instead of four (but which one?)

I have trouble seeing how such interpretations can be drawn up from
explicitly providing the default value for the argument. Does anyone
really expect dict.get(key, None) to always return None?

> The suggested API is not as straight-forward as you seem to think it is.

It's simply a proposal to fix what I see as an issue (as befits
python-ideas); you're getting way too hung up on something which can
quite trivially be discussed and changed.

>> and way less than necessary to do it outside the stdlib. Furthermore, it
>> makes the solution (to having a regular output behavior for
>> (attr|item)getter) far more obvious and makes the code itself much simpler
>> to read.
>
> The only thing I will grant is that it aids in discoverability of a
> solution

It also aids in the discoverability of the problem in the first place, and
in limiting the surprise when unexpectedly encountering it for the first
time.

alex23

Sep 14, 2012, 5:41:43 AM
to python...@python.org
On Sep 13, 11:15 pm, Masklinn <maskl...@masklinn.net> wrote:
>     # works the same no matter what len(indices) is
>     slicer = operator.itemgetter(*indices, force_tuple=True)

I'd be inclined to write that as:

slicer = force_tuple(operator.itemgetter(*indices))

With force_tuple then just being another decorator.
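
One possible reading of such a wrapper is sketched below; this is a guess at the intent, not code from the thread. Because the wrapper only sees the getter's result, it has to guess from the result's type, so it cannot tell a single element that happens to be a tuple from a tuple of several elements:

```python
from operator import itemgetter


def force_tuple(getter):
    """Wrap a getter so that non-tuple results come back as 1-tuples."""
    def wrapper(obj):
        result = getter(obj)
        # Type-based guess: anything that is already a tuple is left alone.
        return result if isinstance(result, tuple) else (result,)
    return wrapper


row = ['a', ('b', 'c'), 'd']
print(force_tuple(itemgetter(0))(row))     # ('a',)
print(force_tuple(itemgetter(0, 2))(row))  # ('a', 'd')
# The ambiguous case: the single element is itself a tuple.
print(force_tuple(itemgetter(1))(row))     # ('b', 'c'), not (('b', 'c'),)
```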

Nick Coghlan

Sep 14, 2012, 7:01:04 AM
to Masklinn, python-ideas
On Thu, Sep 13, 2012 at 11:15 PM, Masklinn <mask...@masklinn.net> wrote:
> attrgetter and itemgetter are both very useful functions, but both have
> a significant pitfall if the arguments passed in are validated but not
> controlled: if receiving the arguments (list of attributes, keys or
> indexes) from an external source and *-applying it, if the external
> source passes a sequence of one element both functions will in turn
> return an element rather than a singleton (1-element tuple).

Both attrgetter and itemgetter are really designed to be called with
*literal* arguments, not via *args. In particular, they are designed
to be useful as arguments bound to a "key" parameter, where the object
vs singleton tuple distinction doesn't matter.
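
A small sketch of why the distinction is irrelevant for sorting, with sample rows invented for the example: a bare element and a 1-tuple holding that element order the same way.

```python
from operator import itemgetter

rows = [('pear', 3), ('apple', 1), ('fig', 2)]

# Key is the bare element r[1].
by_element = sorted(rows, key=itemgetter(1))
# Key is the singleton tuple (r[1],).
by_singleton = sorted(rows, key=lambda r: (r[1],))

print(by_element)                  # [('apple', 1), ('fig', 2), ('pear', 3)]
print(by_element == by_singleton)  # True
```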

If that behaviour is not desirable, *write a different function* that
does what you want, and don't use itemgetter or attrgetter at all.
These tools are designed as convenience functions for a particular use
case (specifically sorting, and similar ordering operations). Outside
those use cases, you will need to drop back down to the underlying
building blocks and produce your *own* tool from the same raw
materials.

For example:

    def my_itemgetter(*subscripts):
        def f(obj):
            return tuple(obj[x] for x in subscripts)
        return f

I agree attrgetter is slightly more complex due to the fact that it
*also* handles chained lookups, where getattr does not, but that's a
matter of making the case for providing chained lookup (or even
str.format style field value lookup) as a more readily accessible
building block, not for making the attrgetter API more complicated.
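
As a rough sketch of what such a building block could look like (the names `resolve_attr` and `my_attrgetter` are hypothetical, not existing stdlib functions), chained lookup is a fold of getattr over the dotted path:

```python
from functools import reduce
from types import SimpleNamespace


def resolve_attr(obj, dotted_name):
    """Follow a dotted path such as 'a.b.c' one getattr at a time."""
    return reduce(getattr, dotted_name.split('.'), obj)


def my_attrgetter(*names):
    """Like attrgetter, but the returned callable always yields a tuple."""
    def f(obj):
        return tuple(resolve_attr(obj, name) for name in names)
    return f


# Example data made up for illustration.
user = SimpleNamespace(id=7, name=SimpleNamespace(first='Ada'))
print(my_attrgetter('name.first')(user))        # ('Ada',)
print(my_attrgetter('id', 'name.first')(user))  # (7, 'Ada')
```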

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Masklinn

Sep 14, 2012, 7:36:39 AM
to python-ideas
On 2012-09-14, at 13:01 , Nick Coghlan wrote:
> On Thu, Sep 13, 2012 at 11:15 PM, Masklinn <mask...@masklinn.net> wrote:
>> attrgetter and itemgetter are both very useful functions, but both have
>> a significant pitfall if the arguments passed in are validated but not
>> controlled: if receiving the arguments (list of attributes, keys or
>> indexes) from an external source and *-applying it, if the external
>> source passes a sequence of one element both functions will in turn
>> return an element rather than a singleton (1-element tuple).
>
> Both attrgetter and itemgetter are really designed to be called with
> *literal* arguments, not via *args. In particular, they are designed
> to be useful as arguments bound to a "key" parameter, where the object
> vs singleton tuple distinction doesn't matter.

It was my understanding that they are also designed to be useful for
mapping (such a usage is shown in itemgetter's examples), which is
a superset of the use case outlined here.

> If that behaviour is not desirable, *write a different function* that
> does what you want, and don't use itemgetter or attrgetter at all.
> These tools are designed as convenience functions

And save for one stumbling block, they are utilities I love for their
convenience and their plain clarity of purpose.

Oscar Benjamin

Sep 14, 2012, 9:23:53 AM
to python-ideas
On 14 September 2012 12:36, Masklinn <mask...@masklinn.net> wrote:
> On 2012-09-14, at 13:01 , Nick Coghlan wrote:
>> On Thu, Sep 13, 2012 at 11:15 PM, Masklinn <mask...@masklinn.net> wrote:
>>> attrgetter and itemgetter are both very useful functions, but both have
>>> a significant pitfall if the arguments passed in are validated but not
>>> controlled: if receiving the arguments (list of attributes, keys or
>>> indexes) from an external source and *-applying it, if the external
>>> source passes a sequence of one element both functions will in turn
>>> return an element rather than a singleton (1-element tuple).
>>
>> Both attrgetter and itemgetter are really designed to be called with
>> *literal* arguments, not via *args. In particular, they are designed
>> to be useful as arguments bound to a "key" parameter, where the object
>> vs singleton tuple distinction doesn't matter.
>
> It was my understanding that they are also designed to be useful for
> mapping (such a usage is shown in itemgetter's examples), which is
> a superset of the use case outlined here.
>
>> If that behaviour is not desirable, *write a different function* that
>> does what you want, and don't use itemgetter or attrgetter at all.
>> These tools are designed as convenience functions

I can see why you would expect different behaviour here, though. I tend not to think of the functions in the operator module as convenience functions but as *efficient* nameable functions referring to operations that are normally invoked with a non-function syntax. Which is more convenient out of the following:

1) using operator
import operator
result = sorted(values, key=operator.attrgetter('name'))

2) using lambda
result = sorted(values, key=lambda v: v.name)

I don't think that the operator module is convenient and I think that it damages readability in many cases. My primary reason for choosing it in some cases is that it is more efficient than the lambda expression.
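
A rough way to compare the two on a given machine is sketched below. The sample data is invented for the example, and the timings depend on the Python version and hardware, so no particular result is implied:

```python
import operator
import timeit
from types import SimpleNamespace

# Sample data made up for illustration.
values = [SimpleNamespace(name=str(i)) for i in range(1000)]

# Both keys produce the same ordering.
by_operator = sorted(values, key=operator.attrgetter('name'))
by_lambda = sorted(values, key=lambda v: v.name)
same_order = by_operator == by_lambda

# Time each variant; numbers vary from run to run.
t_operator = timeit.timeit(
    lambda: sorted(values, key=operator.attrgetter('name')), number=200)
t_lambda = timeit.timeit(
    lambda: sorted(values, key=lambda v: v.name), number=200)

print(f"attrgetter: {t_operator:.4f}s  lambda: {t_lambda:.4f}s")
```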

There is no special syntax for 'get several items as a tuple'. I didn't know about this extended use for attrgetter, itemgetter. I can't see any other functions in the operator module (abs, add, and_, ...) that extend the semantics of the operation they are supposed to represent in this way.

In general it is bad to conflate scalar/sequence semantics so that a caller should get a different type of object depending on the length of a sequence. I can see how practicality beats purity in adding this feature for people who want to use these functions for sorting by a couple of elements/attributes. I think it would have been better though to add these as separate functions itemsgetter and attrsgetter that always return tuples.

Oscar

Jim Jewett

Sep 14, 2012, 5:02:31 PM
to Oscar Benjamin, python-ideas
On 9/14/12, Oscar Benjamin <oscar.j....@gmail.com> wrote:

> I can see why you would expect different behaviour here, though. I tend not
> to think of the functions in the operator module as convenience functions
> but as *efficient* nameable functions referring to operations that are
> normally invoked with a non-function syntax. Which is more convenient out
> of the following:

> 1) using operator
> import operator
> result = sorted(values, key=operator.attrgetter('name'))

I would normally write that as

from operator import attrgetter as attr
... # may use it several times

result=sorted(values, key=attr('name'))

which is about the best I could hope for, without being able to use
the dot itself.

> 2) using lambda
> result = sorted(values, key=lambda v: v.name)

And I honestly think that would be worse, even if lambda didn't have a
code smell. It focuses attention on the fact that you're creating a
callable, instead of on the fact that you're grabbing the name
attribute.

> In general it is bad to conflate scalar/sequence semantics so that a caller
> should get a different type of object depending on the length of a
> sequence.

Yeah, but that can't really be solved well in python, except maybe by
never extending an API to handle sequences. I would personally not
consider that an improvement.

Part of the problem is that the cleanest way to take a variable number
of arguments is to turn them into a sequence under the covers (*args),
even if they weren't passed that way.

-jJ

Oscar Benjamin

Sep 15, 2012, 7:09:12 AM
to python-ideas

On Sep 14, 2012 10:02 PM, "Jim Jewett" <jimjj...@gmail.com> wrote:
>
> On 9/14/12, Oscar Benjamin <oscar.j....@gmail.com> wrote:
>
> > I can see why you would expect different behaviour here, though. I tend not
> > to think of the functions in the operator module as convenience functions
> > but as *efficient* nameable functions referring to operations that are
> > normally invoked with a non-function syntax. Which is more convenient out
> > of the following:
>
> > 1) using operator
> > import operator
> > result = sorted(values, key=operator.attrgetter('name'))
>
> I would normally write that as
>
>     from operator import attrgetter as attr
>     ... # may use it several times
>
>     result=sorted(values, key=attr('name'))
>
> which is about the best I could hope for, without being able to use
> the dot itself.

To be clear, I wasn't complaining about the inconvenience of importing and referring to attrgetter. I was saying that if the obvious alternative (lambda functions) is at least as convenient then it's odd to describe itemgetter/attrgetter as convenience functions.

> > 2) using lambda

> > result = sorted(values, key=lambda v: v.name)
>
> And I honestly think that would be worse, even if lambda didn't have a
> code smell.  It focuses attention on the fact that you're creating a
> callable, instead of on the fact that you're grabbing the name
> attribute.

I disagree here. I find the fact that a lambda function shows me the expression I would normally use to get the quantity I'm interested in makes it easier for me to read. When I look at it I don't see it as a callable function but as an expression that I'm passing for use somewhere else.

>
> > In general it is bad to conflate scalar/sequence semantics so that a caller
> > should get a different type of object depending on the length of a
> > sequence.
>
> Yeah, but that can't really be solved well in python, except maybe by
> never extending an API to handle sequences.  I would personally not
> consider that an improvement.
>
> Part of the problem is that the cleanest way to take a variable number
> of arguments is to turn them into a sequence under the covers (*args),
> even if they weren't passed that way.
>
> -jJ

You can extend an API to support sequences by adding a new entry point. This is a common idiom in python: think list.append vs list.extend.

Oscar

Nick Coghlan

Sep 15, 2012, 8:43:59 AM
to Masklinn, python-ideas
On Fri, Sep 14, 2012 at 9:36 PM, Masklinn <mask...@masklinn.net> wrote:
> On 2012-09-14, at 13:01 , Nick Coghlan wrote:
>> Both attrgetter and itemgetter are really designed to be called with
>> *literal* arguments, not via *args. In particular, they are designed
>> to be useful as arguments bound to a "key" parameter, where the object
>> vs singleton tuple distinction doesn't matter.
>
> It was my understanding that they are also designed to be useful for
> mapping (such a usage is shown in itemgetter's examples), which is
> a superset of the use case outlined here.

The "key" style usage was definitely the primary motivator, which is
why the ambiguity in the *args case wasn't noticed. If it *had* been
noticed, the multiple argument support likely never would have been
added.

As it is, the *only* case where the ambiguity causes problems is when
you want to use *args with these functions. Since they weren't built
with that style of usage in mind, they don't handle it well. Making
them even *more* complicated to work around an earlier design mistake
doesn't seem like a good idea.

Cheers,
Nick.

--
Nick Coghlan | ncog...@gmail.com | Brisbane, Australia

Sam Denton

Apr 2, 2014, 1:47:34 PM
to python...@googlegroups.com, python-ideas, mask...@masklinn.net
I stumbled upon this and while it's a bit old, I want to revisit it. I agree that attrgetter and itemgetter are messed up, and I would like to see a version that fixes them. I have a different fix, however. The itertools.chain and itertools.chain.from_iterable functions are identical except for how the arguments are specified. Using that precedent, I would like to suggest that itemgetter.as_tuple be defined exactly like itemgetter except that the return value is always a tuple, and similarly for attrgetter. This way there is no confusion over the meaning of the keyword force_tuple, as Steven D'Aprano feared. I would urge that the implementation have itemgetter call itemgetter.as_tuple, as that would allow both functions to be simpler.
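
The precedent being referred to can be seen in a short sketch, with sample data invented for the example. Both spellings produce the same result and differ only in how the inputs are passed:

```python
from itertools import chain

parts = [[1, 2], [3], [4, 5]]

# chain takes each iterable as a separate positional argument.
via_star = list(chain(*parts))
# chain.from_iterable takes one iterable of iterables.
via_from_iterable = list(chain.from_iterable(parts))

print(via_star)           # [1, 2, 3, 4, 5]
print(via_from_iterable)  # [1, 2, 3, 4, 5]
```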

Sam Denton

Apr 2, 2014, 2:12:01 PM
to python...@googlegroups.com, python-ideas, mask...@masklinn.net
Here's a quick pure Python implementation, based upon the code here:
http://hg.python.org/cpython/file/0aeaea247d7d/Lib/operator.py

class itemgetter:
    """
    Return a callable object that fetches the given item(s) from its operand.
    After f = itemgetter(2), the call f(r) returns r[2].
    After g = itemgetter(2, 5, 3), the call g(r) returns (r[2], r[5], r[3])
    """
    def __init__(self, *items):
        def func(obj):
            """
            A callable object that fetches the given item(s) from its operand as a tuple.
            After f = itemgetter(2), the call f.as_tuple(r) returns (r[2],).
            """
            return tuple(obj[i] for i in items)
        self.as_tuple = func
        if len(items) == 1:
            item = items[0]
            def func(obj):
                return obj[item]
        self._func = func

    def __call__(self, obj):
        # Special methods are looked up on the type, not the instance,
        # so __call__ has to be a real method that delegates.
        return self._func(obj)