unordered_multimap and equivalent element order

Marcel Mueller

unread,

Nov 5, 2014, 5:11:06 PM11/5/14

to

Does unordered_multimap preserve the order of elements /with the same key/?
The documentation does not directly answer this question, at least I did
not find the answer. But when elements with the same key are stored in
the same bucket it should be straight forward to keep the sequence.

Marcel

Luca Risolia

unread,

Nov 5, 2014, 6:26:21 PM11/5/14

to

Il 05/11/2014 23:10, Marcel Mueller ha scritto:
> Does unordered_multimap preserve the order of elements /with the same key/?

Rehashing and other operations that internally change the order of
elements preserve the relative order of elements with equivalent keys.

Marcel Mueller

unread,

Nov 6, 2014, 12:58:59 PM11/6/14

to

Is this experience or guaranteed?

Marcel

Marcel Mueller

unread,

Nov 6, 2014, 1:09:28 PM11/6/14

to

I should add some more details.

I want to build up a list of entries with two keys. The first key is an
identifier name, i.e. string. Lookups for this key are always exact
matches. So a hash table is the first choice.

The second key is a version number (int). This is used if the same
identifier is redefined in a nested scope shadowing the outer
definition. (It is part of a compiler.)
If the unordered_multimap keeps the sequence then the entries are always
ordered by the second key, because they are always added strongly
monotonically. So I can easily use one of the borders of equal_range to
get the most recent version.

Using a nested structure with map inside unordered_map is not my first
choice because it is very likely that most of the identifiers never get
more than one version. And so I get very many maps with only one element
which is not that efficient.

Marcel

Luca Risolia

unread,

Nov 6, 2014, 3:12:44 PM11/6/14

to

Il 06/11/2014 18:58, Marcel Mueller ha scritto:

>> Rehashing and other operations that internally change the order of
>> elements preserve the relative order of elements with equivalent keys.
>
> Is this experience or guaranteed?

Both. Here is the guarantee from the standard:

23.2.5 Unordered associative containers

"In containers that support equiv-
alent keys, elements with equivalent keys are adjacent to each other in
the iteration order of the container.
Thus, although the absolute order of elements in an unordered container
is not specified, its elements are
grouped into
equivalent-key group
s such that all elements of each group have equivalent keys. Mutating
operations on unordered containers shall preserve the relative order of
elements within each equivalent-key
group unless otherwise specified."

Öö Tiib

unread,

Nov 6, 2014, 7:38:16 PM11/6/14

to

On Thursday, 6 November 2014 20:09:28 UTC+2, Marcel Mueller wrote:
> I should add some more details.
>
> I want to build up a list of entries with two keys. The first key is an
> identifier name, i.e. string. Lookups for this key are always exact
> matches. So a hash table is the first choice.
>
> The second key is a version number (int). This is used if the same
> identifier is redefined in a nested scope shadowing the outer
> definition. (It is part of a compiler.)
> If the unordered_multimap keeps the sequence then the entries are always
> ordered by the second key, because they are always added strongly
> monotonically. So I can easily use one of the borders of equal_range to
> get the most recent version.

Compiler? You mean you keep all the names from different scopes in
same container and indicate scope with int? It feels upside
down and it can not be efficient. If it is compiler of something like
C++ then that single int can not be sufficient for name lookup.

Marcel Mueller

unread,

Nov 10, 2014, 1:34:15 PM11/10/14

to

On 07.11.14 01.37, Öö Tiib wrote:
> Compiler? You mean you keep all the names from different scopes in
> same container and indicate scope with int?

Yes. The Scope is only a depth.

> It feels upside
> down and it can not be efficient. If it is compiler of something like
> C++ then that single int can not be sufficient for name lookup.

No, it is by far not that complex than C++. It is a macro assembler for
a Broadcom GPU. So basically it is about labels, constants, user defined
functions and macros. Nothing like namespaces or ADL.

But maybe I use a different approach. If I have a separate dictionary
for each scope I can easily discard deeper dictionaries at the end of a
scope block. The price is N lookups per global symbol in a scope with
depth N. Not optimal, but on nowadays hardware no big deal.

From my point of a data structure that uses the symbol name as first
key better fits my needs. Therefore the question. At least I can learn more.

Marcel

Öö Tiib

unread,

Nov 10, 2014, 4:02:26 PM11/10/14

to

On Monday, 10 November 2014 20:34:15 UTC+2, Marcel Mueller wrote:
> On 07.11.14 01.37, Öö Tiib wrote:
> > Compiler? You mean you keep all the names from different
> > scopes in same container and indicate scope with int?
>
> Yes. The Scope is only a depth.

Ok, but on case the outer scope names are hidden ... why
to keep them also in lookup table within scope?

Lets say scope 7 declares name "a", and there are no names
that it hides. You can add name "a" to lookup table and
memorize to remove it at end of scope 7.

Then scope 7 declares name "b", and there is already name
"b" from scope 5 that it hides. You can modify name "b"
in lookup table and memorize to restore the scope 5 "b"
to power at end of scope 7.

> > It feels upside down and it can not be efficient. If it
> > is compiler of something like C++ then that single
> > int can not be sufficient for name lookup.
>
> No, it is by far not that complex than C++. It is a macro
> assembler for a Broadcom GPU. So basically it is about
> labels, constants, user defined functions and macros.
> Nothing like namespaces or ADL.

If the nesting of scopes is straight (unlike with
classes, namespaces and ADL) then it must be is
even more simple. I see no reason why you should
have so complex, multi-layer lookup table on that case.

For assembler it likely does not matter. Assembler
presumably does not preprocess into million line monsters
like C++. However ... simplicity is also quality on its
own.

> But maybe I use a different approach. If I have a separate > dictionary for each scope I can easily discard deeper
> dictionaries at the end of a scope block. The price is
> N lookups per global symbol in a scope with depth N.
> Not optimal, but on nowadays hardware no big deal.

Why you should do several lookups? Is there some
diagnostic about defects that you want to improve
somehow with it?

> From my point of a data structure that uses the symbol
> name as first key better fits my needs. Therefore the
> question. At least I can learn more.

It may indeed be that I do not know of some details of
your assembler. With what cases such large container is
most convenient and/or most efficient?

Marcel Mueller

unread,

Nov 10, 2014, 5:44:33 PM11/10/14

to

On 10.11.14 22.02, Öö Tiib wrote:
> Ok, but on case the outer scope names are hidden ... why
> to keep them also in lookup table within scope?
>
> Lets say scope 7 declares name "a", and there are no names
> that it hides. You can add name "a" to lookup table and
> memorize to remove it at end of scope 7.
>
> Then scope 7 declares name "b", and there is already name
> "b" from scope 5 that it hides. You can modify name "b"
> in lookup table and memorize to restore the scope 5 "b"
> to power at end of scope 7.

This was my first idea, too. But I dropped it because it makes the code
for modification on the lookup table somewhat complicated.

Symbols could be legally redefined in the same scope. It is only one of
my extensions to protect symbols against redefinition inside the same
(maybe global) scope. In this case I must not put the current symbol
value into the recovery container, of course.

> If the nesting of scopes is straight (unlike with
> classes, namespaces and ADL) then it must be is
> even more simple. I see no reason why you should
> have so complex, multi-layer lookup table on that case.

Hmm, maybe I should rethink about the above approach.

> For assembler it likely does not matter. Assembler
> presumably does not preprocess into million line monsters
> like C++.

No, obviously not. The expansion factor is only in the order of 2 to 4.
I.e. 1 line of source emits 2 to 4 instructions depending on whether you
count comment lines as well or not.

> However ... simplicity is also quality on its
> own.

Indeed.

> Why you should do several lookups?

It is simply easier to implement. Each level start with its own empty
dictionary. Each modifying directive applies to the deepmost dictionary
for local operations or on the outermost dictionary for global changes.
At the end of the scope I simply do a pop_back and the previous state is
restored with exception of the global changes.

From the code's point of view there are several locations where the
dictionaries are modified with different directives. But there is only
one read access: the instruction parser for macros and the expression
parser for anything else.

> Is there some
> diagnostic about defects that you want to improve
> somehow with it?

I think printing meaningful diagnostic messages is easily possible with
any of the discussed data models.

Marcel