[LLVMdev] Proposed implementation of N3333 hashing interfaces for LLVM (and possible libc++)

83 views
Skip to first unread message

Chandler Carruth

unread,
Feb 28, 2012, 6:34:26 AM2/28/12
to LLVM Developers Mailing List, Howard Hinnant
Hello folks,

TL;DR: This is my proposed hashing interface based on a proposed standard hashing interface. It also is implemented with a much faster and higher quality algorithm than the current one. This is an *early draft* of the code, looking for initial feedback.


There has been recent interest in improving the quality and consistency of LLVM's approach to hashing. In particular, getting a single API and a high quality implementation in one place that other parts of the codebase can use to hash custom data structures. As it happens, Jeffrey Yasskin and I have been working on a proposal with similar goals to the the C++ standard library[1].

----
API concerns

This interface is a bit heavyweight for LLVM's needs alone. It was designed with the input of Matt Austern, and two hashing experts, Geoff Pike and Austin Appleby, to be as easy to use and simple for user-defined types as possible, while composing cleanly with the STL and other C++ standard library concepts.

That said, we are working actively on getting this (quite possibly modified during the process) into the standard, and so it seems reasonable to just implement what we expect people to have available with their standard library. I'm planning to continue working on this, and hope to contribute it to libc++ to implement the proposed library extension when appropriate (both the code is sufficiently mature, and the committee/Howard is happy with the state of the interface).

The attached implementation is, I must emphasize, an early draft. I'm mostly looking for feedback, thoughts, concerns, etc. In particular, I'm not satisfied with the organization of the header file. There are a *lot* of implementation details. Ideas on the best organization welcome, otherwise I'll try to forward declare and shuffle the user-facing functions to at least be declared toward the top.

Howard, high-level feedback from you would be particularly appreciated as I would love to contribute this to libc++ when the time is right.

I'm hoping to package up tests and convert clients to the new interface tomorrow.

----
Implementation concerns

The current hashing implementation simply isn't high-enough quality. I'm working on extensive quality testing reports using the SMHasher test suite which uses a large number of keysets known to cause problems for real-world hashing algorithms in real-world applications. The current implementation is used to hash potentially long vectors of data, and so it is very likely to be subject to the collision patterns being tested for.

This isn't terribly surprising -- murmur2, the algorithm it is based upon has some known weaknesses here. Also, my testing seems to indicate some aspects of the adaptation made these significantly worse, but I'm not entirely certain. I'm still digging there.

There are now a few hashing algorithms that do significantly better in both performance and hash quality: Murmur3, CityHash, and SpookyHash. Of these, CityHash and SpookyHash are significantly faster than Murmur3, both for large keys and for very small keys.

My implementation is a variation of CityHash (the original isn't suitable for the interface) which is as high quality or higher quality, and happens to be still faster for large keys and no slower for small keys. SpookyHash is still faster for large keys, but is slower for small keys, and those dominate LLVM's (and typical application's) hash tables.

In particular the implementation I propose *scales* very well through the small (8-byte) to medium (64- to 128-byte) key space that seems not unheard of for folding sets and other LLVM data structures. For example, it should be faster than FoldingSet's current solution for 2-pointers worth of key by just a bit, but by the time there are 8-pointers worth of key, it is over 2x faster.


That said, I think there remain two significant implementation problems I want to solve in the near-term:
1) Performance of hashing very small keys should be better than it is. 8-bytes and smaller have room for improvement.
2) Performance on 32-bit hosts, ARM, and Atom hosts needs to be measured. I've not done this yet, and it may necessitate either changes to the implementation, or alternate implementations on those hosts.

Again, as I'm hoping this will be the standard hashing implementation for libc++ and others going forward, I'm planning on doing this legwork anyways.

I'll try to post the rest of my benchmarks and a nice chart, 32-bit benchmarks, some compile-time benchmarks of various pieces of the system, and the quality evaluation of these algorithms tomorrow.

----
bernstein_performance.txt
Hashing.h
superfast_performance.txt
old_llvm_performance.txt
murmur2a_performance.txt
murmur3a_performance.txt
city64_performance.txt
new_llvm_performance.txt
spooky64_performance.txt

Tobias Grosser

unread,
Feb 28, 2012, 8:35:31 AM2/28/12
to Chandler Carruth, LLVM Developers Mailing List
On 02/28/2012 12:34 PM, Chandler Carruth wrote:
> Hello folks,
>
> TL;DR: This is my proposed hashing interface based on a proposed
> standard hashing interface. It also is implemented with a much faster
> and higher quality algorithm than the current one. This is an *early
> draft* of the code, looking for initial feedback.
>
>
> There has been recent interest in improving the quality and consistency
> of LLVM's approach to hashing. In particular, getting a single API and a
> high quality implementation in one place that other parts of the
> codebase can use to hash custom data structures. As it happens, Jeffrey
> Yasskin and I have been working on a proposal with similar goals to the
> the C++ standard library[1].

Hi Chandler,

I just skimmed over the proposal and it seems very interesting to me.
Unfortunately I did not have time to look further into it.

One point that I found surprising is that your hashing is in most cases
notably faster than the old LLVM hashing, except for key sizes which are
a modulo of 4. Here the existing hash function is a lot faster than your
new solution. If keys that are multiple of 4 bytes are common in LLVM,
it might be worth to check if this change does not cause performance
regressions.


> old_llvm_performance.txt
>
>
> -------------------------------------------------------------------------------
> --- Testing llvm (LLVM Hashing)
>
> [[[ Speed Tests ]]]
>
> Small key speed test - 4-byte keys - 16.77 cycles/hash 7.39 nanos/hash
> Small key speed test - 8-byte keys - 25.93 cycles/hash 11.43 nanos/hash
> Small key speed test - 12-byte keys - 30.39 cycles/hash 13.39 nanos/hash
> Small key speed test - 16-byte keys - 30.18 cycles/hash 13.30 nanos/hash
> Small key speed test - 20-byte keys - 39.33 cycles/hash 17.34 nanos/hash
> Small key speed test - 24-byte keys - 44.70 cycles/hash 19.70 nanos/hash
> Small key speed test - 28-byte keys - 49.17 cycles/hash 21.68 nanos/hash

> new_llvm_performance.txt
>
>
> -------------------------------------------------------------------------------
> --- Testing std64 (LLVM Standard-baed Hashing)
>
> [[[ Speed Tests ]]]
>
> Small key speed test - 4-byte keys - 40.22 cycles/hash 17.73 nanos/hash
> Small key speed test - 8-byte keys - 40.24 cycles/hash 17.73 nanos/hash
> Small key speed test - 12-byte keys - 44.69 cycles/hash 19.70 nanos/hash
> Small key speed test - 16-byte keys - 44.69 cycles/hash 19.70 nanos/hash
> Small key speed test - 24-byte keys - 50.95 cycles/hash 22.46 nanos/hash
> Small key speed test - 28-byte keys - 50.95 cycles/hash 22.46 nanos/hash

Cheers
Tobi
_______________________________________________
LLVM Developers mailing list
LLV...@cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Howard Hinnant

unread,
Feb 28, 2012, 10:20:06 AM2/28/12
to Chandler Carruth, LLVM Developers Mailing List
On Feb 28, 2012, at 6:34 AM, Chandler Carruth wrote:

> Howard, high-level feedback from you would be particularly appreciated as I would love to contribute this to libc++ when the time is right.

Does the enclosed implementation implement this part of N3333:

http://www.open-std.org/Jtc1/sc22/wg21/docs/papers/2012/n3333.html#per.process.seed

?

That to me seems like potentially the most controversial part. I scanned Hashing.h but did not immediately see that support (I could've easily missed it).

Howard

Chandler Carruth

unread,
Feb 28, 2012, 1:21:04 PM2/28/12
to Howard Hinnant, LLVM Developers Mailing List
Will post an updated (and cleaned up, thanks Nadav!) patch shortly, but wanted to answer this question:

On Tue, Feb 28, 2012 at 7:20 AM, Howard Hinnant <hhin...@apple.com> wrote:
On Feb 28, 2012, at 6:34 AM, Chandler Carruth wrote:

> Howard, high-level feedback from you would be particularly appreciated as I would love to contribute this to libc++ when the time is right.

Does the enclosed implementation implement this part of N3333:

http://www.open-std.org/Jtc1/sc22/wg21/docs/papers/2012/n3333.html#per.process.seed

?

That to me seems like potentially the most controversial part.  I scanned Hashing.h but did not immediately see that support (I could've easily missed it).

This is definitely the most controversial part, and as such I've separated it out for implementation purposes. The mechanism to implement this is in place, but currently it uses a fixed large prime and has a comment about the plan to potentially switch to a per-execution seed.

(see ::llvm::hashing::detail::get_execution_seed() if you're curious where)

My plan was first to get the interface and implementation worked out, and then to try flipping on the per-execution seed, seeing what breaks, tracking down the places we depend on the hash-table ordering, etc. Hopefully there aren't many places, but that's something we'll learn with this.

Chandler Carruth

unread,
Feb 29, 2012, 4:35:25 AM2/29/12
to LLVM Developers Mailing List, Howard Hinnant
Thanks for the feedback thus far!

I've updated the header file, and enclosed a complete patch against LLVM. This passes all the regtests, and I'll be doing more thorough testing of LLVM itself with the patch. I've included some basic unit tests, but could probably do more here... Not sure it's worth delaying the initial submission though, as the best testing is to use a hash testing suite...

I've addressed the comments from Nadav, but there may be more minor stylistic issues or typos. Please keep pointing them out! I appreciate the help there.

I've also corrected my stub for the execution-seed to be more correct and to include the ability to override it during program startup. It still doesn't go the final step to make it actually change on each execution, but is now otherwise identical. (In particular, it is no longer a compile-time constant that can be folded.) This regressed the performance a tiny bit, however...

There are several improvements to the implementation of the hashing algorithm itself. The way the hashing included the seed hurt performance quite a bit. I've re-worked it to be much lighter weight, and even after the execution seed fix above slowed things down, this speeds everything up more. We shave 4ns off the 4 to 8 byte case, bringing performance above that of essentially every high quality hashing algorithm...

I still think we can do more, but it's already much faster than the existing LLVM one except for the issue Tobias pointed out w/ modulo-4 key sizes. I'm going to investigate this, but it may be a consequence of a weakness in that algorithm.

I've re-run the SMHasher quality testing suite and the results are very good. There are a few problems identified, but these are long-known problems with CityHash in general, and have proven to thus far not be a cause of real-world issues... I'd like to fix them, but it doesn't seem a high priority yet.

Finally, I've run some rough initial numbers for x86 32-bit, and it's surprisingly good. The relative speeds of this algorithm and others don't seem to change much. A bit suspicious of that, so going to do more thorough analysis.
Hashing.h
hashing.diff.gz

Jay Foad

unread,
Feb 29, 2012, 5:52:21 AM2/29/12
to Chandler Carruth, LLVM Developers Mailing List
On 29 February 2012 09:35, Chandler Carruth <chan...@google.com> wrote:
> I still think we can do more, but it's already much faster than the existing LLVM one except for the issue Tobias pointed out w/ modulo-4 key sizes. I'm going to investigate this

OK, but this is a VERY big exception! Almost any non-string data
anyone wants to hash will be a multiple of 4 bytes in length.


> //===-- llvm/ADT/Hashing.h - Utilities for hashing --------------*- C++ -*-===//

Doesn't this belong in ../Support/ ? No-one's ever explained the
difference to me; I just assumed that generic data types go in ADT/
and everything else goes in Support/.

> //
> // The LLVM Compiler Infrastructure
> //
> // This file is distributed under the University of Illinois Open Source
> // License. See LICENSE.TXT for details.
> //
> //===----------------------------------------------------------------------===//
> //
> // This file implements the newly proposed standard C++ interfaces for hashing
> // arbitrary data and building hash functions for user-defined types. This
> // interface was originally proposed in N3333[1] and is currently under review
> // for inclusion in a future TR and/or standard.
> //
> // The primary interfaces provide are comprised of one type and three functions:

"provided"

> //
> // -- 'hash_code' class is an opaque type representing the hash code for some
> // data. It is the intended product of hashing, and can be used to implement
> // hash tables, checksumming, and other common uses of hashes. It is not an
> // integer type (although it can be converted to one) because it is risky
> // to assume much about the internals of a hash_code. In particular, each
> // execution of the program has a high probability of producing a different
> // hash_code for a given input. Thus their values are not stable to save or
> // persist, and should only be used during the execution for the
> // construction of hashing datastructures.
> //
> // -- 'hash_value' is a function designed to be overloaded for each
> // user-defined type which wishes to be used within a hashing context. It
> // should be overloaded within the user-defined type's namespace and found via
> // ADL. Overloads for primitive types are provided by this library.
> //
> // -- 'hash_combine' and 'hash_combine_range' are functions designed to aid
> // programmers in easily and intuitively combining a set of data into a single
> // hash_code for their object. They should only logically be used within the
> // implementation of a 'hash_value' routine or similar context.
> //
> // Note that 'hash_combine_range' contains very special logic for hashing
> // a contiguous array of integers or pointers. This logic is *extremely* fast,
> // on a modern Intel "Gainestown" Xeon (Nehalem uarch) @2.2 GHz, these were
> // benchmarked at over 8.5 GiB/s for large keys, and <20 cycles/

20 cycles per what? Don't keep us in suspense!

> //
> //===----------------------------------------------------------------------===//
>
> #ifndef LLVM_ADT_HASHING_H
> #define LLVM_ADT_HASHING_H
>
> #include "llvm/ADT/STLExtras.h"
> #include "llvm/Support/DataTypes.h"
> #include "llvm/Support/type_traits.h"
> #include <cassert>
> #include <cstring>
> #include <algorithm>
> #include <utility>
> #include <iterator>
>
> namespace llvm {
>
> /// \brief An opaque object representing a hash code.
> ///
> /// This object represents the result of hashing some entity. It is intended to
> /// be used to implement hashtables or other hashing-based data structures.
> /// While it wraps and exposes a numeric value, this value should not be
> /// trusted to be stable or predictable across processes or executions.
> ///
> /// In order to obtain the hash_code for an object 'x':
> /// \code
> /// using llvm::hash_value;
> /// llvm::hash_code code = hash_value(x);
> /// \endcode
> ///
> /// Also note that there are two numerical values which are reserved, and the
> /// implementation ensures will never be produced for real hash_codes. These
> /// can be used as sentinels within hashing data structures.
> class hash_code {
> size_t value;
>
> public:
> /// \brief Default construct a hash_code. Constructs a null code.
> hash_code() : value() {}
>
> /// \brief Form a hash code directly from a numerical value.
> hash_code(size_t value) : value(value) {}
>
> /// \brief Convert the hash code to its numerical value for use.
> /*explicit*/ operator size_t() const { return value; }
>
> /// \brief Get a hash_code object which corresponds to a null code.
> ///
> /// The null code must never be the result of any 'hash_value' calls and can
> /// be used to detect an unset hash_code.
> static hash_code get_null_code() { return 0; }
>
> /// \brief Get a hash_code object which corresponds to an invalid code.
> ///
> /// The invalid code must never be the result of any 'hash_value' calls. This
> /// can be used to flag invalid hash_codes or mark entries in a hash table.
> static hash_code get_invalid_code() { return -1; }

I'm sure there's a compiler out there just waiting to warn you that
you're implicitly converting -1 to an unsigned type. :-)

> friend bool operator==(const hash_code &lhs, const hash_code &rhs) {
> return lhs.value == rhs.value;
> }
> friend bool operator!=(const hash_code &lhs, const hash_code &rhs) {
> return lhs.value != rhs.value;
> }
>
> /// \brief Allow a hash_code to be directly run through hash_value.
> friend size_t hash_value(const hash_code &code) { return code.value; }
> };
>
>
> // All of the implementation details of actually computing the various hash
> // code values are held within this namespace. These routines are included in
> // the header file mainly to allow inlining and constant propagation.
> namespace hashing { namespace detail {

My only comment on the implementation is that, on CPUs that have
different instruction sequences for aligned and unaligned loads, you
want to be careful that the 8-byte memcpys turn into aligned loads
when hashing naturally aligned data. (But if we only care about
running on x86-64, this won't be a problem.)

Jay.

James Molloy

unread,
Feb 29, 2012, 6:47:39 AM2/29/12
to Jay Foad, Chandler Carruth, LLVM Developers Mailing List
> (But if we only care about
> running on x86-64, this won't be a problem.)

Please, no. We have a cortex-a9 native buildbot already in lab.llvm.org and
as manufacturers emit faster ARM chips we (ARM) will want to have LLVM run
native on them.

You've also got the OpenCL use case etc.

Please bear ARM in mind.

Cheers,

James

Jay Foad

unread,
Feb 29, 2012, 7:24:41 AM2/29/12
to James Molloy, LLVM Developers Mailing List
On 29 February 2012 11:47, James Molloy <james....@arm.com> wrote:
>> (But if we only care about
>> running on x86-64, this won't be a problem.)
>
> Please, no. We have a cortex-a9 native buildbot already in lab.llvm.org and
> as manufacturers emit faster ARM chips we (ARM) will want to have LLVM run
> native on them.
>
> You've also got the OpenCL use case etc.
>
> Please bear ARM in mind.

Right. (But ARMv7 uses the same ldr instruction for aligned and
unaligned loads (doesn't it?) so it's not an issue there.)

James Molloy

unread,
Feb 29, 2012, 7:30:00 AM2/29/12
to Jay Foad, LLVM Developers Mailing List
That is true!

-----Original Message-----
From: Jay Foad [mailto:jay....@gmail.com]
Sent: 29 February 2012 12:25
To: James Molloy
Cc: Chandler Carruth; LLVM Developers Mailing List
Subject: Re: [LLVMdev] Proposed implementation of N3333 hashing interfaces for LLVM (and possible libc++)

Jim Grosbach

unread,
Feb 29, 2012, 11:46:32 AM2/29/12
to Jay Foad, LLVM Developers Mailing List

On Feb 29, 2012, at 4:24 AM, Jay Foad wrote:

> On 29 February 2012 11:47, James Molloy <james....@arm.com> wrote:
>>> (But if we only care about
>>> running on x86-64, this won't be a problem.)
>>
>> Please, no. We have a cortex-a9 native buildbot already in lab.llvm.org and
>> as manufacturers emit faster ARM chips we (ARM) will want to have LLVM run
>> native on them.
>>
>> You've also got the OpenCL use case etc.
>>
>> Please bear ARM in mind.
>
> Right. (But ARMv7 uses the same ldr instruction for aligned and
> unaligned loads (doesn't it?) so it's not an issue there.)
>

Sorta. It's the same instruction, but on some implementations, unaligned loads will cause an exception, so the compiler may need to generate LDRB sequences instead. There's also the VLD* instructions to consider. Depending on what's being loaded, there's all sorts of potential tricks that can be played. We should try to be careful and give the compiler as many hints about alignment as we can.

-Jim

Nick Lewycky

unread,
Mar 1, 2012, 12:22:17 AM3/1/12
to Chandler Carruth, LLVM Developers Mailing List
//  -- 'hash_code' class is an opaque type representing the hash code for some
//     data. It is the intended product of hashing, and can be used to implement

vs.

//  -- 'hash_value' is a function designed to be overloaded for each
//  user-defined type which wishes to be used within a hashing context. It

The first paragraph has a hanging indent but the second and third don't.

//  benchmarked at over 8.5 GiB/s for large keys, and <20 cycles/

Trailing punctuation.

#include <cassert>
#include <cstring>
#include <algorithm>
#include <utility>
#include <iterator>

Sort.

  /// \brief Form a hash code directly from a numerical value.
  hash_code(size_t value) : value(value) {}

why not assert that the value is not the null or invalid value? I realize you'd need to have a private constructor for the null and invalid ones then.

// All of the implementation details of actually computing the various hash
// code values are held within this namespace. These routines are included in
// the header file mainly to allow inlining and constant propagation.
namespace hashing { namespace detail {

One namespace per line please, in LLVM.

[I only skimmed the implementation details of the hashing. It all looks plausible enough.]

} } // End hashing detail namespaces.

Again, one per line. You do this multiple times in Hashing.h, please fix them all.

/// reproduce *exactly* a specific behavior. To that end, we provide a function
/// which will forcible set the seed to a fixed value. This must be done at the

forcible -> forcibly

  template <typename T> struct is_hashable_data
    : integral_constant<bool, ((is_integral<T>::value ||
                                is_pointer<T>::value)
                               && 64 % sizeof(T) == 0)> {};

Optional: I object to the leading && and propose this reflowed text:

  template <typename T> struct is_hashable_data
    : integral_constant<bool,
                        ((is_integral<T>::value || is_pointer<T>::value) &&
                         64 % sizeof(T) == 0)> {};
.

#if __has_feature(__cxx_variadic_templates__)

I'm pretty sure non-clang compilers will bark at that unless you #define'd __has_feature(X) to 0?

--- a/unittests/ADT/HashingTest.cpp
+++ b/unittests/ADT/HashingTest.cpp
@@ -13,45 +13,295 @@
 
 #include "gtest/gtest.h"
 #include "llvm/ADT/Hashing.h"
+#include "llvm/Support/DataTypes.h"
+#include <vector>
+#include <list>
+#include <deque>
+#include <map>

Sort.

+  // Helper for test code to print hash codes.
+  void PrintTo(const hash_code &code, ::std::ostream *os) {

What's with the extra leading :: before std::?

+  namespace hashing { namespace detail {
+  template <> struct is_hashable_data<LargeTestInteger> : true_type {};
+  } } // End hashing detail namespace.
+} // End LLVM namespace.

A reminder about 1 namespace-per-line, but also the comments on ending namespaces are odd. They usually look like "} // end namespace llvm" or "}  // namespace llvm", and I prefer the latter.

Nick

Ahmed Charles

unread,
Mar 1, 2012, 1:03:03 AM3/1/12
to Nick Lewycky, LLVM Developers Mailing List
> +  // Helper for test code to print hash codes.
> +  void PrintTo(const hash_code &code, ::std::ostream *os) {
>
> What's with the extra leading :: before std::?

Have you ever tried:

namespace foo {
class std {};
}
using namespace foo;

#include <vector>

Well, I'm not sure that Chandler is guarding against this possibility,
but most library implementations of the standard use ::std::
everywhere to avoid this potential for ambiguity.

Chandler Carruth

unread,
Mar 1, 2012, 8:01:39 AM3/1/12
to Ahmed Charles, LLVM Developers Mailing List
Thanks for all the comments!

I think I've addressed all of them, as wel as Duncan's comments from IRC. Based on your OK Nick, I'm planning to commit this tomorrow. If anyone has objections or serious concerns, please let me know to hold off. Updated patch is attached, as well as the latest version of the header.


For the record, the only concern that has come up thus far is one of performance. The explanation there is that while some algorithms are slightly faster (5-10 cycles max), they are significantly lower quality, and don't currently show up on profiles. I'd like to get the quality up to remove collisions, and then continue working to improve the actual performance.

Clearly, very sensitive and hot routines like the Clang lexer's StringMap aren't about to change without careful benchmarking and numbers on those specific components. =] I think StringMap will be the very last bit of hashing to change considering its requirements.
Hashing.h
hashing2.diff.gz
Reply all
Reply to author
Forward
0 new messages