Best regards, João Matos
While I don't object to the idea of concatenating dictionaries, I feel
obliged to point out that this last is currently spelled
dict_a.update(dict_b)
--
Rhodri James *-* Kynesim Ltd
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
On 27/02/2019 16:25, João Matos wrote:
> I would like to propose that instead of using this (applies to Py3.5 and upwards)
> dict_a = {**dict_a, **dict_b}
>
> we could use
> dict_a = dict_a + dict_b
>
> or even better
> dict_a += dict_b
The key conundrum that needs to be solved is what to do for `d1 + d2` when there are overlapping keys. I propose to make d2 win in this case, which is what happens in `d1.update(d2)` anyways. If you want it the other way, simply write `d2 + d1`.
This would mean that addition, at least in this particular instance, is not a commutative operation. Are there other places in Python where this is the case?
Sure:
>>> a = "A"
>>> b = "B"
>>> a + b == b + a
False
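A minimal sketch of the same point for other built-in types (the variable names are illustrative only): `+` on sequences is concatenation, so operand order matters, and the existing `{**d1, **d2}` spelling already lets the right-hand dict win.

```python
# Concatenation with + is order-sensitive for the built-in sequence types.
list_ab = [1] + [2]
list_ba = [2] + [1]
tuple_ab = (1,) + (2,)
tuple_ba = (2,) + (1,)

# The existing merge spelling is order-sensitive too: the right operand wins.
d1 = {"k": "from d1"}
d2 = {"k": "from d2"}
right_wins = {**d1, **d2}
left_wins = {**d2, **d1}

print(list_ab == list_ba)    # False
print(tuple_ab == tuple_ba)  # False
print(right_wins, left_wins)
```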
I dislike the asymmetry with sets:
>>> {1} | {2}
{1, 2}
To me it makes sense that if + works for dict then it should for set too.
/ Anders
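To make the asymmetry concrete, here is a small sketch (not from the original message) showing that built-in sets accept `|` but reject `+`:

```python
# Sets spell union with the | operator.
union = {1} | {2}

# The + operator is not defined for sets and raises TypeError.
try:
    {1} + {2}
except TypeError as exc:
    plus_error = exc
else:
    plus_error = None

print(union, type(plus_error).__name__)
```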
> On 27 Feb 2019, at 17:25, João Matos <jcrm...@gmail.com> wrote:
>
> Hello,
>
> I would like to propose that instead of using this (applies to Py3.5 and upwards)
> dict_a = {**dict_a, **dict_b}
>
> we could use
> dict_a = dict_a + dict_b
João Matos
Great.
Because I don't program in any language other than Python, I can't make the PR (with the C code).
Maybe someone who programs in C can help?
I'd help out.
Eric
27.02.19 20:48, Guido van Rossum wrote:
>
> On Wed, Feb 27, 2019 at 10:42 AM Michael Selik <mi...@selik.org> wrote:
>
> The dict subclass collections.Counter overrides the update method
> for adding values instead of overwriting values.
>
> https://docs.python.org/3/library/collections.html#collections.Counter.update
>
> Counter also uses +/__add__ for a similar behavior.
>
> >>> c = Counter(a=3, b=1)
> >>> d = Counter(a=1, b=2)
> >>> c + d # add two counters together: c[x] + d[x]
> Counter({'a': 4, 'b': 3})
>
> At first I worried that changing base dict would cause confusion for
> the subclass, but Counter seems to share the idea that update and +
> are synonyms.
>
>
> Great, this sounds like a good argument for + over |. The other argument
> is that | for sets *is* symmetrical, while + is used for other
> collections where it's not symmetrical. So it sounds like + is a winner
> here.
Counter uses + for a *different* behavior!
>>> Counter(a=2) + Counter(a=3)
Counter({'a': 5})
I do not understand why we discuss a new syntax for dict merging if we
already have a syntax for dict merging: {**d1, **d2} (which works with
*all* mappings). Doesn't this contradict the Zen?
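The difference being pointed out can be sketched as follows (variable names are illustrative): `Counter.__add__` and `Counter.update` sum the counts, whereas the `{**d1, **d2}` spelling overwrites and produces a plain dict.

```python
from collections import Counter

c1 = Counter(a=2)
c2 = Counter(a=3)

# Counter's + adds the counts for matching keys.
added = c1 + c2

# Counter.update also adds counts instead of overwriting.
updated = Counter(c1)
updated.update(c2)

# Unpacking overwrites: the right-hand value wins, and the result is a dict.
merged = {**c1, **c2}

print(added, updated, merged)
```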
The PEP should probably also propose d1-d2.
Considering the potential ambiguity, I suggest `d1.append(d2)`, so that we can have an additional argument: `d1.append(d2, mode="some mode that tells how this function behaves")`.
If we really are to have the new syntax `d1 + d2`, I suggest reserving it for `d1.append(d2, mode="strict")`, which raises an error when there are duplicate keys. The semantics are natural and clear when two dicts have no overlapping keys.
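A rough sketch of what such a "strict" merge could look like. Note that `dict.append` does not exist; `merge_strict` is a hypothetical stand-alone helper, and the choice of `ValueError` is an assumption rather than part of the proposal.

```python
def merge_strict(d1, d2):
    """Hypothetical helper: merge two dicts, refusing overlapping keys."""
    overlap = d1.keys() & d2.keys()
    if overlap:
        # Assumption: the proposal does not say which exception to raise.
        raise ValueError(f"duplicate keys: {overlap!r}")
    return {**d1, **d2}


disjoint = merge_strict({"a": 1}, {"b": 2})

try:
    merge_strict({"a": 1}, {"a": 2})
except ValueError:
    strict_rejected_overlap = True
else:
    strict_rejected_overlap = False
```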
Currently Counter += dict works and Counter + dict is an error. With
this change Counter + dict will return a value, but it will be different
from the result of the += operator.
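The current behaviour described here can be checked with a short sketch (assuming a Python version in which dict does not define `__add__` or `__radd__`):

```python
from collections import Counter

# In-place addition accepts a plain dict and sums the counts.
counts = Counter(a=1)
counts += {"a": 2}

# Binary + with a plain dict is rejected: Counter.__add__ returns
# NotImplemented for non-Counter operands, and dict has no __radd__.
try:
    Counter(a=1) + {"a": 2}
except TypeError as exc:
    add_error = exc
else:
    add_error = None

print(counts, type(add_error).__name__)
```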
Also, if a custom dict subclass implemented the plus operator with
different semantics that support addition with a dict, this change
will break it, because dict + CustomDict will call dict.__add__ instead
of CustomDict.__radd__. Adding support for new operators to built-in
types is dangerous.
> I do not understand why we discuss a new syntax for dict merging if we
> already have a syntax for dict merging: {**d1, **d2} (which works with
> *all* mappings). Is not this contradicts the Zen?
>
>
> But (as someone else pointed out) {**d1, **d2} always returns a dict,
> not the type of d1 and d2.
And this saves us from the hard problem of creating a mapping of the
same type. Note that the reference implementations discussed above make
d1 + d2 always return a dict. dict.copy() returns a dict.
> Also, I'm sorry for PEP 448, but even if you know about **d in simpler
> contexts, if you were to ask a typical Python user how to combine two
> dicts into a new one, I doubt many people would think of {**d1, **d2}. I
> know I myself had forgotten about it when this thread started! If you
> were to ask a newbie who has learned a few things (e.g. sequence
> concatenation) they would much more likely guess d1+d2.
Perhaps the better solution is to update the documentation.
Hello,
I would like to propose that instead of using this (applies to Py3.5 and upwards)
dict_a = {**dict_a, **dict_b}
we could use
dict_a = dict_a + dict_b
or even better
dict_a += dict_b
Best regards, João Matos
I think mappings are more complicated than sequences, and some things
do not seem obvious to me.
What would be OrderedDict1 + OrderedDict2, in which positions would be
the resulting keys, which value would be used if the same key is
present in both?
What would be defaultdict1 + defaultdict2?
It seems to me that subclasses of dict are complex mappings for which
« merging » may be less obvious than for sequences.
>
>
> Is that an invariant you expect to apply to other uses of the +
> operator?
>
> py> x = -1
> py> x <= (x + x)
> False
>
> py> [999] <= ([1, 2, 3] + [999])
> False
>
Please calm down. I meant that each type implements "sum"
in the semantics of that type, in a lossless way.
What "lossless" means depends on the semantics of the type.
-1 + -1 = -2 is a sum in numerical semantics. There is no loss.
That's understandable, clouds of confusion have been raised. As far as
I can tell it's pretty straightforward: d = d1 + d2 is equivalent to:
>>> d = d1.copy()
>>> d.update(d2)
All of your subsequent questions then become "What does
DictSubclassInQuestion.update() do?" which should be well defined.
--
Rhodri James *-* Kynesim Ltd
The only reasonable answer I can come up with is:
1) unique keys from OrderedDict1 are in the same order as before
2) duplicate keys and new keys from OrderedDict2 come after the keys from
d1, in their original order in d2 since they replace keys in d1.
Basically, the expression says: "take a copy of d1 and add the items from
d2 to it". That's exactly what you should get, whether the mappings are
ordered or not (and dict are ordered by insertion in Py3.6+).
> What would be defaultdict1 + defaultdict2?
No surprises here, the result is a copy of defaultdict1 (using the same
missing-key function) with all items from defaultdict2 added.
Remember that the order of the two operands matters. The first always
defines the type of the result, the second is only added to it.
> It seems to me that subclasses of dict are complex mappings for which
> « merging » may be less obvious than for sequences.
It's the same for subclasses of sequences.
Stefan
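As a sketch of the defaultdict case under the "copy, then update" reading (variable names are illustrative): `defaultdict.copy()` keeps the left operand's default factory, and the right operand only contributes items.

```python
from collections import defaultdict

dd1 = defaultdict(list, {"a": [1]})
dd2 = defaultdict(int, {"b": 2})

# "dd1 + dd2" under the proposed equivalence: copy the left, update with the right.
merged = dd1.copy()
merged.update(dd2)

# The result keeps the left operand's type and factory.
missing_value = merged["missing"]  # created by the list factory

print(type(merged).__name__, merged.default_factory, missing_value)
```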
>
> Regards,
>
>
> On Fri, 1 Mar 2019 at 23:19, Ivan Levkivskyi <levki...@gmail.com> wrote:
>
> On Fri, 1 Mar 2019 at 13:48, INADA Naoki <songof...@gmail.com> wrote:
>
> >
> >
> > Is that an invariant you expect to apply to other uses of the +
> > operator?
> >
> > py> x = -1
> > py> x <= (x + x)
> > False
> >
> > py> [999] <= ([1, 2, 3] + [999])
> > False
> >
>
> Please calm down. I meant each type implements "sum"
> in semantics of the type, in lossless way.
> What "lossless" means is changed by the semantics of the type.
>
> -1 + -1 = -2 is sum in numerical semantics. There are no loss.
>
>
> TBH I don't understand what is lossless about numeric addition. What
> is the definition of lossless?
> Clearly some information is lost, since you can't uniquely restore
> two numbers you add from the result.
>
> Unless you define what lossless means, there will be just more
> misunderstandings.
>
> --
> Ivan
>
>
>
No, I was, apparently. In Py3.7:
>>> d1 = {"a": 1, "b": 2, "c": 3}
>>> d1
{'a': 1, 'b': 2, 'c': 3}
>>> d2 = {"d": 4, "b": 5}
>>> d = d1.copy()
>>> d.update(d2)
>>> d
{'a': 1, 'b': 5, 'c': 3, 'd': 4}
I think the behaviour makes sense when you know how it's implemented (keys
are stored separately from values). I would have been less surprised if the
keys had also been reordered, but well, this is how it is now in Py3.6+, so
this is how it's going to work also for the operator.
No *additional* surprises here. ;)
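The same ordering holds for OrderedDict, as this sketch suggests (it mirrors the dict session above with the same keys): a key that already exists keeps its position and only its value is replaced, while new keys are appended.

```python
from collections import OrderedDict

od1 = OrderedDict([("a", 1), ("b", 2), ("c", 3)])
od2 = OrderedDict([("d", 4), ("b", 5)])

# Copy the left operand, then update it with the right one.
merged = od1.copy()
merged.update(od2)

# "b" stays in second position with the new value; "d" is appended.
print(list(merged.items()))
```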
On 3/1/2019 9:38 AM, INADA Naoki wrote:
> Sorry, I'm not good at English enough to explain my mental model.
>
> I meant no skip, no ignorance, no throw away.
>
> In case of 1+2=3, both of 1 and 2 are not skipped, ignored or thrown away.
>
> On the other hand, in case of {a:1, b:2}+{a:2}={a:2, b:2}, I feel {a:1}
> is skipped, ignored, or thrown away. I used "lost" to explain it.
>
> And I used "lossless" for "there is no lost". Not for reversible.
>
> If it isn't understandable to you, please ignore me.
>
> I think Rémi’s comment is very similar to my thought. Merging mapping
> is more complex than concatenate sequence and it seems hard to call it
> "sum".
I understand Inada to be saying that each value on the LHS (as shown
above) affects the result on the RHS. That's the case with addition of
ints and other types, but not so with the proposed dict addition. As he
says, the {a:1} doesn't affect the result. The result would be the same
if this key wasn't present in the first dict, or if the key had a
different value.
This doesn't bother me, personally. I'm just trying to clarify.
No, it just helps _me_ in explaining the behaviour to myself. Feel free to
look it up in the documentation if you prefer.
>> I would have been less surprised if the
>> keys had also been reordered, but well, this is how it is now in Py3.6+, so
>> this is how it's going to work also for the operator.
>>
>> No *additional* surprises here. ;)
>
> There is never any surprises left once all details have been carefully worked
> out but having `+` for mappings make it looks like an easy operation whose
> meaning is non ambiguous and obvious.
>
> I’m still not convinced that it the meaning is obvious, and gave an example
> in my other message where I think it could be ambiguous.
What I meant was that it's obvious in the sense that it is no new behaviour
at all. It just provides an operator for behaviour that is already there.
We are not discussing the current behaviour here. That ship has long sailed
with the release of Python 3.6 beta 1 back in September 2016. The proposal
that is being discussed here is the new operator.
28.02.19 23:19, Greg Ewing wrote:
> Serhiy Storchaka wrote:
>> I do not understand why we discuss a new syntax for dict merging if we
>> already have a syntax for dict merging: {**d1, **d2} (which works with
>> *all* mappings).
>
> But that always returns a dict. A '+' operator could be implemented
> by other mapping types to return a mapping of the same type.
And this opens a non-easy problem: how to create a mapping of the same
type? Not all mappings, and even not all dict subclasses have a copying
constructor.
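A small sketch of the problem with defaultdict (names are illustrative): its first positional argument is the default factory, so passing an existing mapping to the constructor fails, whereas `copy()` works.

```python
from collections import defaultdict

source = defaultdict(list, {"a": [1]})

# defaultdict's first argument must be callable or None, so it cannot be
# used as a copying constructor the way dict(source) can.
try:
    type(source)(source)
except TypeError as exc:
    ctor_error = exc
else:
    ctor_error = None

# The copy() method does preserve both the type and the factory.
via_copy = source.copy()
```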
> On Mar 1, 2019, at 11:31 AM, Guido van Rossum <gu...@python.org> wrote:
>
> There's a compromise solution possible for this. We already do this for Sequence and MutableSequence: Sequence does *not* define __add__, but MutableSequence *does* define __iadd__, and the default implementation just calls self.extend(other). I propose the same for Mapping (do nothing) and MutableMapping: make the default __iadd__ implementation call self.update(other).
Usually, it's easy to add methods to classes without creating disruption, but ABCs are more problematic. If MutableMapping grows an __iadd__() method, what would that mean for existing classes that register as MutableMapping but don't already implement __iadd__? When "isinstance(m, MutableMapping)" returns True, is it a promise that the API is fully implemented? Is this something that mypy could, would, or should complain about?
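The concern about registered classes can be illustrated with a sketch (`RegisteredOnly` is a hypothetical class): registering with an ABC makes `isinstance` succeed but does not give the class any of the ABC's mixin methods, so a newly added default `__iadd__` would not reach it either.

```python
from collections.abc import MutableMapping


class RegisteredOnly:
    """Hypothetical class that claims the interface without inheriting it."""


# Registration creates a "virtual subclass": no methods are inherited.
MutableMapping.register(RegisteredOnly)

obj = RegisteredOnly()
claims_interface = isinstance(obj, MutableMapping)
has_update = hasattr(obj, "update")

print(claims_interface, has_update)
```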
This LGTM for mappings. But the problem with dict subclasses still
exists. If we use the copy() method for creating a copy, d1 + d2 will
always return a dict (unless the plus operator or copy() is redefined
in a subclass). If we use the constructor of the left argument's type,
there will be problems with subclasses that have incompatible
constructors (e.g. defaultdict).
> Anyways, the main reason to prefer d1+d2 over {**d1, **d2} is that the
> latter is highly non-obvious except if you've already encountered that
> pattern before, while d1+d2 is what anybody familiar with other Python
> collection types would guess or propose. And the default semantics for
> subclasses of dict that don't override these are settled with the "d =
> d1.copy(); d.update(d2)" equivalence.
Dicts are not like lists or deques, or even sets. Iterating dicts
produces keys, but not values. The "in" operator tests a key, but not a
value.
It is not that I like to add an operator for dict merging, but dicts are
more like sets than sequences: they can not contain duplicated keys and
the size of the result of merging two dicts can be less than the sum of
their sizes. Using "|" looks more natural to me than using "+". We
should look at discussions for using the "|" operator for sets, if the
alternative of using "+" was considered, I think the same arguments for
preferring "|" for sets are applicable now for dicts.
But is merging two dicts a common enough problem to need a new
operator? I need to merge dicts maybe once or twice a year at most,
and I am fine with using the update() method.
Perhaps {**d1, **d2} can be more appropriate in some cases, but I have
not encountered such cases yet.
Because the plus operator for lists predated any list subclasses.
>> Also, if the custom dict subclass implemented the plus operator with
>> different semantic which supports the addition with a dict, this change
>> will break it, because dict + CustomDict will call dict.__add__ instead
>> of CustomDict.__radd__.
>
> That's not how operators work in Python or at least that's not how they
> worked the last time I looked: if the behaviour has changed without
> discussion, that's a breaking change that should be reverted.
You are right.
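The dispatch rule behind this exchange can be sketched with two hypothetical classes: when the right operand is an instance of a subclass of the left operand's type and provides its own reflected method, Python tries that reflected method first.

```python
class Base:
    def __add__(self, other):
        return "Base.__add__"


class Child(Base):
    # The subclass provides a reflected method that Base does not have.
    def __radd__(self, other):
        return "Child.__radd__"


# Right operand is a subclass instance: its __radd__ takes priority.
result_subclass_on_right = Base() + Child()

# Same types on both sides: the ordinary __add__ is used.
result_same_type = Base() + Base()

print(result_subclass_on_right, result_same_type)
```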
> What's wrong with doing this?
>
> new = type(self)()
>
> Or the equivalent from C code. If that doesn't work, surely that's the
> fault of the subclass, the subclass is broken, and it will raise an
> exception.
Try to do this with defaultdict.
Note that none of builtin sequences or sets do this. For good reasons
they always return an instance of the base type.
That doesn't answer my question. Just because it is older is no
explanation for why this behaviour is not a problem for lists, or why
it is a problem for dicts.
[...]
> >What's wrong with doing this?
> >
> > new = type(self)()
> >
> >Or the equivalent from C code. If that doesn't work, surely that's the
> >fault of the subclass, the subclass is broken, and it will raise an
> >exception.
>
> Try to do this with defaultdict.
I did. It seems to work fine with my testing:
py> defaultdict()
defaultdict(None, {})
is precisely the behaviour I would expect.
If it isn't the right thing to do, then defaultdict can override __add__
and __radd__.
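A sketch of what `type(self)()` gives for defaultdict (names are illustrative): the call succeeds, but the new instance has no default factory, so the merged result no longer creates missing values.

```python
from collections import defaultdict

original = defaultdict(list, {"a": [1]})

# Calling the type with no arguments works, but the factory is not carried over.
fresh = type(original)()
fresh.update(original)

# With default_factory set to None, a missing key raises KeyError.
try:
    fresh["missing"]
except KeyError:
    lookup_failed = True
else:
    lookup_failed = False

print(fresh.default_factory, lookup_failed)
```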
> Note that none of builtin sequences or sets do this. For good reasons
> they always return an instance of the base type.
What are those good reasons?
--
Steven
> Like other folks in the thread, I also want to merge dicts three times per
> year.
I'm impressed that you have counted it with that level of accuracy. Is it on the same three days each year, or do they move about? *wink*
> And every one of those times, collections.ChainMap is the right way to do that non-destructively, and without copying.
Can you elaborate on why ChainMap is the right way to merge multiple dicts into a single, new dict?
ChainMap also seems to implement the opposite behaviour to that usually desired: first value seen wins, instead of last:
If you know ahead of time which order you want, you can simply reverse it:
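The examples that originally followed these two remarks are not preserved here. As a stand-in, this sketch (with made-up dicts) shows both points: ChainMap looks keys up in its first mapping first, and reversing the argument order gives "last wins" semantics.

```python
from collections import ChainMap

d1 = {"a": 1, "b": 2}
d2 = {"b": 20, "c": 3}

# ChainMap searches its mappings left to right, so the first value wins.
first_wins = ChainMap(d1, d2)

# Reversing the arguments makes d2 take priority, matching {**d1, **d2}.
last_wins = dict(ChainMap(d2, d1))

print(first_wins["b"], last_wins)
```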
On Mon, Mar 4, 2019 at 2:26 PM Guido van Rossum <gu...@python.org> wrote:
>
> * Dicts are not like sets because the ordering operators (<, <=, >, >=) are not defined on dicts, but they implement subset comparisons for sets. I think this is another argument pleading against | as the operator to combine two dicts.
>
I feel like dict should be treated like sets with the |, &, and -
operators since in mathematics a mapping is sometimes represented as a
set of pairs with unique first elements. Therefore, I think the set
metaphor is stronger.
> * Regarding how to construct the new set in __add__, I now think this should be done like this:
>
> class dict:
>     <other methods>
>     def __add__(self, other):
>         <checks that other makes sense, else return NotImplemented>
>         new = self.copy()  # A subclass may or may not choose to override
>         new.update(other)
>         return new
I like that, but it would be inefficient to do that for __sub__ since
it would create elements that it might later delete.
    def __sub__(self, other):
        new = self.copy()
        for k in other:
            new.pop(k, None)  # ignore keys of other that are absent from self
        return new

is less efficient than

    def __sub__(self, other):
        return type(self)({k: v for k, v in self.items() if k not in other})
when copying v is expensive. Also, users would probably not expect
values that don't end up being returned to be copied.
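The two approaches can be compared as stand-alone functions (the function names are hypothetical, and keys of `other` that are absent are ignored in both). Note that `dict.copy()` is shallow, so neither variant duplicates the values themselves; the difference is whether entries are first copied and then discarded.

```python
def sub_copy_then_delete(d, other):
    """Copy everything first, then drop the keys found in other."""
    new = d.copy()
    for k in other:
        new.pop(k, None)
    return new


def sub_filter(d, other):
    """Build the result directly from the keys that survive."""
    return type(d)({k: v for k, v in d.items() if k not in other})


left = {"a": 1, "b": 2, "c": 3}
right = {"b": 0, "z": 0}

result_copy = sub_copy_then_delete(left, right)
result_filter = sub_filter(left, right)
print(result_copy, result_filter)
```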
Honestly I would rather withdraw the subtraction operators than reopen the discussion about making dict more like set.
On Mon, Mar 4, 2019 at 3:58 PM Christopher Barker <pyth...@gmail.com> wrote:
>
>
>
> On Mon, Mar 4, 2019 at 12:41 PM Guido van Rossum <gu...@python.org> wrote:
>>
>> Honestly I would rather withdraw the subtraction operators than reopen the discussion about making dict more like set.
I think that's unfortunate.
>
>
> +1
>
> I think the "dicts are like more-featured" sets is a math-geek perspective, and unlikely to make things more clear for the bulk of users. And may make it less clear.
I'd say reddit has some pretty "common users", and they're having a
discussion of this right now
(https://www.reddit.com/r/Python/comments/ax4zzb/pep_584_add_and_operators_to_the_builtin_dict/).
The most popular comment is how it should be |.
Anyway, I think that following the mathematical metaphors tends to
make things more intuitive in the long run.
Python is an adventure.
You learn it for years and then it all makes sense. If dict uses +,
yes, new users might find that sooner than |. However, when they
learn set union, I think they will wonder why it's not consistent with
dict union.
The PEP's main justification for + is that it matches Counter, but
counter is adding the values whereas | doesn't touch the values. I
think it would be good to at least make a list of pros and cons of
each proposed syntax.
> We need to be careful -- there are a lot more math geeks on this list than in the general Python coding population.
>
> Simply adding "+" is a non-critical nice to have, but it seems unlikely to really confuse anyone.
>
> -CHB
>
>
> --
> Christopher Barker, PhD
>
> Python Language Consulting
> - Teaching
> - Scientific Software Development
> - Desktop GUI and Web Development
> - wxPython, numpy, scipy, Cython
Well, I suppose that the next proposition will be to implement the
ordering operators for dicts. Because why not? Lists and numbers support
them. /sarcasm/
Jokes aside, dicts have more in common with sets than with sequences.
Neither can contain duplicated keys/elements. Both have constant
computational complexity for the containment test. For both, the size of
the merge/union can be less than the sum of the sizes of the original
containers. Both have the same restriction on keys/elements (hashability).
> * Regarding how to construct the new set in __add__, I now think this
> should be done like this:
>
> class dict:
>     <other methods>
>     def __add__(self, other):
>         <checks that other makes sense, else return NotImplemented>
>         new = self.copy()  # A subclass may or may not choose to override
>         new.update(other)
>         return new
>
> AFAICT this will give the expected result for defaultdict -- it keeps
> the default factory from the left operand (i.e., self).
No built-in type that implements __add__ uses the copy() method. Dict
would be the only exception to the general rule.
And it would be much less efficient than {**d1, **d2}.
> * Regarding how often this is needed, we know that this is proposed and
> discussed at length every few years, so I think this will fill a real need.
And every time this proposal was rejected. What has changed
since it was rejected the last time? We now have an expression form of
dict merging ({**d1, **d2}), which should decrease the need for a
plus operator for dicts.
Interesting point.
In Japan, we learn sets in high school, not in university. And I think
it's a good idea for people using the `set` type to learn about sets in math.
So I think "union" is not only for math geeks.
But we use "A ∪ B" in math. `|` is borrowed from "bitwise OR" in C.
And "bitwise" operators are for "geeks".
Although I'm not in favor of adding `+` to set, it would be worthwhile to
add `+` to set too, for consistency, if it is added to dict.
FWIW, Scala uses `++` to join all containers.
Kotlin uses `+` to join all containers.
(ref https://discuss.python.org/t/pep-584-survey-of-other-languages-operator-overload/977)
Regards,
--
Inada Naoki <songof...@gmail.com>
See the Python-Dev thread with the subject "Re: Re: PEP 218 (sets);
moving set.py to Lib" starting from
https://mail.python.org/pipermail/python-dev/2002-August/028104.html
I just wanted to mention this since it hasn't been brought up, but none of these work:
a.keys() + b.keys()
a.values() + b.values()
a.items() + b.items()
However, the following do work:
a.keys() | b.keys()
a.items() | b.items()
Perhaps they work by coincidence (being set types), but I think it's worth bringing up, since a naive/natural Python implementation of dict addition/union would possibly involve the |-operator.
Pål
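A quick sketch confirming the observation (the dicts and the `plus_fails` helper are made up for illustration): key and item views behave like sets and support `|`, while `+` raises TypeError for all three view types.

```python
a = {"x": 1, "y": 2}
b = {"y": 2, "z": 3}

# Key and item views are set-like, so | returns an ordinary set.
key_union = a.keys() | b.keys()
item_union = a.items() | b.items()


def plus_fails(left, right):
    """Return True if left + right raises TypeError (illustrative helper)."""
    try:
        left + right
    except TypeError:
        return True
    return False


keys_plus_fails = plus_fails(a.keys(), b.keys())
values_plus_fails = plus_fails(a.values(), b.values())
items_plus_fails = plus_fails(a.items(), b.items())
```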
Every. Single. Time.
I don't use sets a lot (purely by happenstance rather than choice), and
every time I do I have to go and look in the documentation because I
expect the union operator to be '+'.
> Except for math geek the `|` is always something obscure.
Two thirds of my degree is in maths, and '|' is still something I don't
associate with sets. It would be unreasonable to expect '∩' and '∪' as
the operators, but reasoning from '-' for set difference I always expect
'+' and '*' as the union and intersection operators. Alas my hopes are
always cruelly crushed :-)
--
Rhodri James *-* Kynesim Ltd
The language SETL (the language of sets) also uses + and * for set operations.¹
For us though, the decision to use | and & is set in stone. The time for debating the decision was 19 years ago.²
Raymond
¹ https://www.linuxjournal.com/article/6805
² https://www.python.org/dev/peps/pep-0218/
> On Mar 5, 2019, at 2:13 PM, Greg Ewing <greg....@canterbury.ac.nz> wrote:
>
> Rhodri James wrote:
>> I have to go and look in the documentation because I expect the union operator to be '+'.
>
> Anyone raised on Pascal is likely to find + and * more
> natural. Pascal doesn't have bitwise operators, so it
> re-uses + and * for set operations. I like the economy
> of this arrangement -- it's not as if there's any
> other obvious meaning that + and * could have for sets.