Multi dimensional dense_hash_map implementation

lord...@gmail.com

unread,

Nov 8, 2013, 10:35:47 AM11/8/13

to google-s...@googlegroups.com

Hi,

I apologize for disturbing you out of the blue. But i am running a time bound experiment and i have a 4 dimensional matrix which i have to search for values. I am unable to implement dense_hash_map since it involves setting the empty key before. I do the following but it does not seem to be working at all,

dense_hash_map<int, dense_hash_map<int, dense_hash_map<int, dense_hash_map<int, int> > > > wIndex;

wIndex.set_empty_key(NULL);

wIndex[0].set_empty_key(NULL);

wIndex[0][0].set_empty_key(NULL);

wIndex[0][0][0].set_empty_key(NULL);

It compiles fine (with warnings) but it throws an error when i try to push values in it.

I would really appreciate your help in this matter.

Thanks,

Piyush

The error is as below,

/usr/local/include/sparsehash/internal/densehashtable.h:985: google::dense_hashtable<Value, Key, HashFcn, ExtractKey, SetKey, EqualKey, Alloc>::value_type& google::dense_hashtable<Value, Key, HashFcn, ExtractKey, SetKey, EqualKey, Alloc>::find_or_insert(const key_type&) [with DefaultValue = google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, int> > > >::DefaultValue; Value = std::pair<const int, google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, int> > > >; Key = int; HashFcn = std::tr1::hash<int>; ExtractKey = google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, int> > > >::SelectKey; SetKey = google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, int> > > >::SetKey; EqualKey = std::equal_to<int>; Alloc = google::libc_allocator_with_realloc<std::pair<const int, google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, int> > > > >; google::dense_hashtable<Value, Key, HashFcn, ExtractKey, SetKey, EqualKey, Alloc>::value_type = std::pair<const int, google::dense_hash_map<int, google::dense_hash_map<int, google::dense_hash_map<int, int> > > >; google::dense_hashtable<Value, Key, HashFcn, ExtractKey, SetKey, EqualKey, Alloc>::key_type = int]: Assertion `(!settings.use_empty() || !equals(key, get_key(val_info.emptyval))) && "Inserting the empty key"' failed.

Geoff Pike

unread,

Nov 8, 2013, 1:59:28 PM11/8/13

to google-s...@googlegroups.com

You told it to use 0 (also known as NULL) as the empty key, and then
tried to use 0 as key. You need to choose an empty key that isn't
going to be used as a "regular" key.

Geoff

lord...@gmail.com

unread,

Nov 8, 2013, 2:09:11 PM11/8/13

to google-s...@googlegroups.com

Hey Geoff, Thanks for the quick reply,

I followed what you said and tried the following,

wIndex.set_empty_key(NULL);

wIndex[10000].set_empty_key(NULL);

wIndex[9999][10000].set_empty_key(NULL);

wIndex[9998][9999][10000].set_empty_key(NULL);

So basically, all of these have keys which i am never going to use. It compiles with a warning.

warning: passing NULL to non-pointer argument 1 of ‘void google::dense_hash_map<Key, T, HashFcn, EqualKey, Alloc>::set_empty_key(const key_type&) [with Key = int; T = google::dense_hash_map<int, int>; HashFcn = std::tr1::hash<int>; EqualKey = std::equal_to<int>; Alloc = google::libc_allocator_with_realloc<std::pair<const int, google::dense_hash_map<int, int> > >; google::dense_hash_map<Key, T, HashFcn, EqualKey, Alloc>::key_type = int]’ [-Wconversion-null]

lp_prob.c:1055:48: warning: passing NULL to non-pointer argument 1 of ‘void google::dense_hash_map<Key, T, HashFcn, EqualKey, Alloc>::set_empty_key(const key_type&) [with Key = int; T = int; HashFcn = std::tr1::hash<int>; EqualKey = std::equal_to<int>; Alloc = google::libc_allocator_with_realloc<std::pair<const int, int> >; google::dense_hash_map<Key, T, HashFcn, EqualKey, Alloc>::key_type = int]’ [-Wconversion-null]

It still errors out with the same error as before.

Geoff Pike

unread,

Nov 8, 2013, 2:14:39 PM11/8/13

to google-s...@googlegroups.com

On general principles you should eliminate the warning by changing
set_empty_key(NULL) to set_empty_key(0) if you want 0 to be the empty
key. I agree with you that fixing that shouldn't affect the runtime
error. You said it said you are using 0 as a key. Are you sure you
aren't?

Geoff

Piyush Kumar

unread,

Nov 8, 2013, 2:29:20 PM11/8/13

to google-s...@googlegroups.com

I am sure i was not. Because all the keys have to do with ranks which never go to 0. But i fixed it anyway. I converted the 4 dimensions to a single dimension by just multiplying them by multiples of 10. Its a pretty hacky fix but i do not have the time yet because my phd defense is 2 weeks away. Thank you so much for your time and attention. Hopefully i will come back and rectify this problem.

Geoff

--
You received this message because you are subscribed to a topic in the Google Groups "google-sparsehash" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/google-sparsehash/AYBaVVOqaP0/unsubscribe.
To unsubscribe from this group and all its topics, send an email to google-sparseh...@googlegroups.com.
To post to this group, send email to google-s...@googlegroups.com.
Visit this group at http://groups.google.com/group/google-sparsehash.
For more options, visit https://groups.google.com/groups/opt_out.

Wenwei Peng

unread,

Apr 22, 2015, 4:16:51 AM4/22/15

to google-s...@googlegroups.com, lord...@gmail.com

Title:       The core of the core of the big data solutions -- Map
Author:      pengwenwei
Email:
Language:    c++
Platform:    Windows, linux
Technology: Perfect hash algorithm
Level:       Advanced
Description: Map algorithm with high performance
Section      MFC c++ map stl
SubSection   c++ algorithm
License:     (GPLv3)

    Download demo project - 1070 Kb
    Download source - 1070 Kb

Introduction:
For the c++ program, map is used everywhere.And bottleneck of program performance is often the performance of map.Especially in the case of large data,and the business association closely and unable to realize the data distribution and parallel processing condition.So the performance of map becomes the key technology.

In the work experience with telecommunications industry and the information security industry, I was dealing with the big bottom data,especially the most complex information security industry data,all can’t do without map.

For example, IP table, MAC table, telephone number list, domain name resolution table, ID number table query, the Trojan horse virus characteristic code of cloud killing etc..

The map of STL library using binary chop, its has the worst performance.Google Hash map has the optimal performance and memory at present, but it has repeated collision probability.Now the big data rarely use a collision probability map,especially relating to fees, can’t be wrong.

Now I put my algorithms out here,there are three kinds of map,after the build is Hash map.We can test the comparison,my algorithm has the zero probability of collision,but its performance is also better than the hash algorithm, even its ordinary performance has no much difference with Google.

My algorithm is perfect hash algorithm,its key index and the principle of compression algorithm is out of the ordinary,the most important is a completely different structure,so the key index compression is fundamentally different.The most direct benefit for program is that for the original map need ten servers for solutions but now I only need one server.
Declare: the code can not be used for commercial purposes, if for commercial applications,you can contact me with QQ 75293192.
Download:
https://sourceforge.net/projects/pwwhashmap/files

Applications:
First,modern warfare can’t be without the mass of information query, if the query of enemy target information slows down a second, it could lead to the delaying fighter, leading to failure of the entire war. Information retrieval is inseparable from the map, if military products use pwwhashMap instead of the traditional map,you must be the winner.

Scond,the performance of the router determines the surfing speed, just replace open source router code map for pwwHashMap, its speed can increase ten times.
There are many tables to query and set in the router DHCP ptotocol,such as IP,Mac ,and all these are completed by map.But until now,all map are using STL liabrary,its performance is very low,and using the Hash map has error probability,so it can only use multi router packet dispersion treatment.If using pwwHashMap, you can save at least ten sets of equipment.

Third,Hadoop is recognized as the big data solutions at present,and its most fundamental thing is super heavy use of the map,instead of SQL and table.Hadoop assumes the huge amounts of data so that the data is completely unable to move, people must carry on the data analysis in the local.But as long as the open source Hadoop code of the map changes into pwwHashMap, the performance will increase hundredfold without any problems.

Background to this article that may be useful such as an introduction to the basic ideas presented:
http://blog.csdn.net/chixinmuzi/article/details/1727195

在 2013年11月8日星期五 UTC+8下午11:35:47，lord...@gmail.com写道：

Reply all

Reply to author

Forward