Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Number of unique values in map

78 views
Skip to first unread message

Paul

unread,
Aug 14, 2015, 5:57:52 AM8/14/15
to
Please could someone tell me the simplest way to count the number of unique values in a container when the values are not necessarily consecutive?

For example,

std::unordered_map<int, int> example;
example[5] = 3;
example[1] = 7;
example[0] = 3;

If we apply the desired function to example, we would get 2 because there are two distinct values -- 3 and 7.

The way I know how to do this simply is to insert or copy into a std::set and apply size() to the set but that seems inefficient and inelegant.

Thank you,

Paul

c.mila...@gmail.com

unread,
Aug 14, 2015, 9:18:50 AM8/14/15
to
I think it is very elegant, and rather efficient.

If you use an unordered_set, it is probably more efficient.

Which solution is more elegant than this one?

std::unordered_set<int> s;
for (auto& pair: example)
s.insert(pair.second);

--
Carlo Milanesi

Barry Schwarz

unread,
Aug 14, 2015, 2:06:30 PM8/14/15
to
On Fri, 14 Aug 2015 02:57:31 -0700 (PDT), Paul <peps...@gmail.com>
wrote:

>Please could someone tell me the simplest way to count the number of unique values in a container when the values are not necessarily consecutive?
>
>For example,
>
>std::unordered_map<int, int> example;

Would the issue in question be any different if example had type
std::map<int,int>
instead of std::unordered_map?

>example[5] = 3;
>example[1] = 7;
>example[0] = 3;
>
>If we apply the desired function to example, we would get 2 because there are two distinct values -- 3 and 7.
>
>The way I know how to do this simply is to insert or copy into a std::set and apply size() to the set but that seems inefficient and inelegant.
>
>Thank you,
>
>Paul

--
Remove del for email

Nobody

unread,
Aug 14, 2015, 10:34:51 PM8/14/15
to
On Fri, 14 Aug 2015 02:57:31 -0700, Paul wrote:

> Please could someone tell me the simplest way to count the number of
> unique values in a container when the values are not necessarily
> consecutive?

First of all, the objects must support some form of equality comparison,
otherwise any talk of uniqueness is meaningless.

If they additionally support total ordering (i.e. for any elements a and b
then either a==b, a<b or b<a, and for any elements a, b and c then if a<b
and b<c then a<c), you can first sort them; then you only need to check
whether adjacent elements are equal.

Personally, I'd suggest that copying to an unordered_set and measuring its
size is likely to be about as efficient as it gets (anything more
efficient would basically amount to optimising unordered_set) and
(although this is subjective) perfectly elegant.

Vir Campestris

unread,
Aug 16, 2015, 4:44:27 PM8/16/15
to
OK, I'm confused. unordered_map contains values whose keys are unique.

So surelt the answer to the number of unique values is ... size()?

If it was a multimap of some sort the question would be different, and I
suspect making a set of the keys would be the fastest answer.

Andy

Nobody

unread,
Aug 16, 2015, 5:13:28 PM8/16/15
to
On Sun, 16 Aug 2015 21:44:10 +0100, Vir Campestris wrote:

> OK, I'm confused. unordered_map contains values whose keys are unique.
>
> So surelt the answer to the number of unique values is ... size()?

No, that's the number of unique *keys*. The OP is asking for the number of
unique *values*.

Nothing about the question is specific to unordered_map; it could be any
container.

Jorgen Grahn

unread,
Aug 16, 2015, 6:16:06 PM8/16/15
to
To be fair, the std::unordered_map is, the value_type is the (key,
value) pair. What we tend to think of[0] as the value is what the
library calls the data_type.

/Jorgen

[0] Because that's the terminology outside C++.

--
// Jorgen Grahn <grahn@ Oo o. . .
\X/ snipabacken.se> O o .

1971 powerChina

unread,
Aug 17, 2015, 9:45:21 PM8/17/15
to
Title: The core of the big data solutions -- Map.
Author: pengwenwei
address: Xianggan Batang District 17-18 xiangtan City, Hunan China
Language: c++
Platform: Windows, linux
Technology: Perfect hash algorithm
Level: Advanced
Description: A high performance map algorithm
Section MFC c++ map stl
SubSection c++ algorithm
License: (GPLv3)


Map is widely used in c++ programs. Its performance is critical to programs' performance. Especially in big data and the scenarios which can't realize data distribution and parallel processing.

I have been working on big data analysis for many years in telecommunition and information security industry. The data analysis is so complicated that they can't work without map. Especially in information security industry, the data is much more complicated than others. For example, ip table, mac table, telephone numbers table, dns table etc.


Currently, the STL map and Google's hash map are the most popular maps. But they have some disadvantages. The STL map is based on binary chop, which causes a bad performance. Google Hash map has the best performance at present, but it has probability of collision. For big data analysis, the collision probability is unacceptable.

Now I would like to publish pwwMap. It includes three different maps for different scenarios:
1. Memory Map(memMap): It has a good access speed. But its size is limited by memory size.
2. Harddisk Map(diskMap): It utilizes hard disk to store data. So it could accept much more data than memory map.
3. Hashmap(hashMap): It has the best performance and a great lookup speed, but it doesn't have 'insert' and 'delete' functionality.

MemMap and diskMap could be converted to hashMap by function memMap2HashMap and diskMap2HashMap. According to the test result, my algorithms' collision probability is zero. About performance, memMap has a comparable performance with google, and hashMap's performance is 100 times better than Google's hashmap.

In summary, pwwhash are perfect hash algorithms with zero collision probability. You can refer to following artical to find the key index and compress algorithm theory:
http://blog.csdn.net/chixinmuzi/article/details/1727195

Source code and documents:
https://sourceforge.net/projects/pwwhashmap/files/?source=navbar
0 new messages