missing hash_code result in P0029


Sean Middleditch

Oct 3, 2015, 11:25:13 PM
to ISO C++ Standard - Future Proposals, Geoff Romer, Chandler Carruth
Unless I'm missing something (wouldn't surprise me), there's no way specified in the paper for user code to extract the resulting hash from hash_code or the HashCode concept.

The sample implementation code has a member result_type alias and an `operator result_type()` but those interfaces are not present in the paper.

Seems like a small but rather critical missing piece. :)

Geoffrey Romer

Oct 5, 2015, 12:17:49 PM
to Sean Middleditch, ISO C++ Standard - Future Proposals, Chandler Carruth
The API for extracting a hash value, and symmetrically the API for initializing the HashCode, is deliberately left unspecified. In the case of std::hash_code it doesn't matter, because only std::hash needs to be aware of those details. Other HashCode types can provide similar functor wrappers, or explicitly specify those APIs however they choose.

The initialization API is unspecified because it depends on the implementation details of the HashCode (e.g. hashing::farmhash needs a pointer to its state as an input). The value-extraction API is unspecified because it's unsafe: in many cases it will leave the HashCode in an unusable state. I solved that in the prototype by making it rvalue-ref qualified, but then I realized there's no need to specify it at all (particularly when the initialization API is also unspecified), and an unspecified API is simpler, safer, and harder to bikeshed.
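
To illustrate the rvalue-ref qualification mentioned above, here is a minimal sketch. `ToyHashCode`, its `mix` member, `toy_hash`, and the FNV-1a mixing are all hypothetical names and choices made for illustration; they are not taken from P0029 or from the prototype. Only the idea of a `&&`-qualified extraction comes from the message.

```cpp
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <utility>

// Hypothetical hash state; the name and the FNV-1a mixing are illustrative only.
class ToyHashCode {
 public:
  using result_type = std::uint64_t;

  // Feed bytes into the state and hand the updated state back by value.
  ToyHashCode mix(std::string_view bytes) && {
    for (unsigned char c : bytes) {
      state_ ^= c;
      state_ *= 1099511628211ull;  // FNV-1a 64-bit prime
    }
    return std::move(*this);
  }

  // Extraction is callable only on an rvalue, so the caller has to give up
  // the state (for example via std::move) before reading the final value.
  explicit operator result_type() && { return state_; }

 private:
  std::uint64_t state_ = 14695981039346656037ull;  // FNV-1a 64-bit offset basis
};

// Hypothetical convenience wrapper: initialize, mix, extract in one expression.
inline ToyHashCode::result_type toy_hash(std::string_view s) {
  return static_cast<ToyHashCode::result_type>(ToyHashCode{}.mix(s));
}
```

With this shape, `ToyHashCode h; static_cast<std::uint64_t>(h);` is rejected by the compiler, while `static_cast<std::uint64_t>(std::move(h))` is accepted, which makes the "state is consumed" step visible at the call site.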

I had thought I spelled this out in the paper somewhere, but I can't immediately find it, so that may be an oversight on my part.

Thiago Macieira

Oct 5, 2015, 1:07:59 PM
to std-pr...@isocpp.org
On Monday 05 October 2015 09:17:45 'Geoffrey Romer' via ISO C++ Standard -
Future Proposals wrote:
> The API for extracting a hash value, and symmetrically the API for
> initializing the HashCode, is deliberately left unspecified.

So if I want to write a new portable container that relies on hashing, I
can't? What's the rationale for that?

> In the case of
> std::hash_code it doesn't matter, because only std::hash needs to be aware
> of those details.

Most definitely not. I started with an example that disagrees with this
perspective.

> Other HashCode types can provide similar functor
> wrappers, or explicitly specify those APIs however they choose.

So if I write my own element type T, I need to specify its hashing function
for std::unordered_map and SomeOtherHash?

> The initialization API is unspecified because it depends on the
> implementation details of the HashCode (e.g. hashing::farmhash needs a
> pointer to its state as an input). The value-extraction API is unspecified
> because it's unsafe: in many cases it will leave the HashCode in an
> unusable state. I solved that in the prototype by making it rvalue-ref
> qualified, but then I realized there's no need to specify it at all
> (particularly when the initialization API is also unspecified), and an
> unspecified API is simpler, safer, and harder to bikeshed.
>
> I had thought I spelled this out in the paper somewhere, but I can't
> immediately find it, so that may be an oversight on my part.

I find that those reasons invalidate the proposal.

If I can't use them in my own hashing container, they're unusable.

--
Thiago Macieira - thiago (AT) macieira.info - thiago (AT) kde.org
Software Architect - Intel Open Source Technology Center
PGP/GPG: 0x6EF45358; fingerprint:
E067 918B B660 DBD1 105C 966C 33F5 F005 6EF4 5358

Shahms King

Oct 5, 2015, 1:34:17 PM
to std-pr...@isocpp.org
On Mon, Oct 5, 2015 at 10:07 AM Thiago Macieira <thi...@macieira.org> wrote:
> On Monday 05 October 2015 09:17:45 'Geoffrey Romer' via ISO C++ Standard -
> Future Proposals wrote:
> > The API for extracting a hash value, and symmetrically the API for
> > initializing the HashCode, is deliberately left unspecified.
>
> So if I want to write a new portable container that relies on hashing, I
> can't? What's the rationale for that?

The proposal doesn't change the requirements on the hash-function callable itself.  If you want to implement a portable container that relies on hashing, you have a template parameter for the hash function which adheres to the same interface as std::hash<T>.  It is callable with the element type and returns a size_t.
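
As a rough sketch of that point, the container below depends on nothing beyond "callable with the element type and returns a size_t". `TinyHashSet` and its members are hypothetical names invented for this example; the fixed bucket count and linear bucket search are simplifications, not a recommended design.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical fixed-bucket-count set. It needs nothing from Hash beyond
// "callable with a const T& and returns std::size_t", as std::hash<T> is.
template <typename T, typename Hash = std::hash<T>>
class TinyHashSet {
 public:
  explicit TinyHashSet(std::size_t bucket_count = 16, Hash hash = Hash())
      : buckets_(bucket_count == 0 ? 1 : bucket_count), hash_(std::move(hash)) {}

  // Returns true if the value was inserted, false if it was already present.
  bool insert(const T& value) {
    auto& bucket = bucket_for(value);
    if (std::find(bucket.begin(), bucket.end(), value) != bucket.end()) return false;
    bucket.push_back(value);
    return true;
  }

  bool contains(const T& value) const {
    const auto& bucket = bucket_for(value);
    return std::find(bucket.begin(), bucket.end(), value) != bucket.end();
  }

 private:
  // Pick the bucket from the hash value; the container never sees hash state.
  std::vector<T>& bucket_for(const T& value) {
    return buckets_[hash_(value) % buckets_.size()];
  }
  const std::vector<T>& bucket_for(const T& value) const {
    return buckets_[hash_(value) % buckets_.size()];
  }

  std::vector<std::vector<T>> buckets_;
  Hash hash_;
};
```

Any functor with the same call signature can be passed as `Hash`; the container never needs to know how that functor initializes or finalizes its internal state.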
 

> > In the case of
> > std::hash_code it doesn't matter, because only std::hash needs to be aware
> > of those details.
>
> Most definitely not. I started with an example that disagrees with this
> perspective.
>
> > Other HashCode types can provide similar functor
> > wrappers, or explicitly specify those APIs however they choose.
>
> So if I write my own element type T, I need to specify its hashing function
> for std::unordered_map and SomeOtherHash?

If you write your own element type T, you need to implement one of:

std::hash_code hash_value(std::hash_code, T);

Or (preferred, to support a variety of algorithms):

template <typename H>
H hash_value(H, T);
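
A minimal sketch of the templated form follows. The `toy::State` type and this particular `hash_combine` are hypothetical stand-ins written for the example (the byte-wise FNV-1a mixing is an assumption, and the real std::hash_code is not shown); `app::Point` is an invented user type.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>

namespace toy {

// Hypothetical stand-in for a type modelling the HashCode concept.
class State {
 public:
  // Mix one integer into the state (FNV-1a style, one byte at a time).
  friend State hash_value(State h, std::uint64_t v) {
    for (int i = 0; i < 8; ++i) {
      h.bits_ ^= (v >> (8 * i)) & 0xffu;
      h.bits_ *= 1099511628211ull;
    }
    return h;
  }

  std::uint64_t bits() const { return bits_; }

 private:
  std::uint64_t bits_ = 14695981039346656037ull;
};

// Base case: nothing left to combine.
inline State hash_combine(State h) { return h; }

// Fold each value into the state by calling hash_value, found via ADL.
template <typename T, typename... Ts>
State hash_combine(State h, const T& first, const Ts&... rest) {
  return hash_combine(hash_value(std::move(h), first), rest...);
}

}  // namespace toy

namespace app {

struct Point {
  int x;
  int y;

  // Generic form: works with any state type H that has a suitable hash_combine.
  template <typename H>
  friend H hash_value(H h, const Point& p) {
    return hash_combine(std::move(h), p.x, p.y);
  }
};

}  // namespace app
```

Because `Point` only names `H`, the same overload would serve any other state type that supplies its own `hash_combine`, which is the reason the templated form is the preferred one.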


> > The initialization API is unspecified because it depends on the
> > implementation details of the HashCode (e.g. hashing::farmhash needs a
> > pointer to its state as an input). The value-extraction API is unspecified
> > because it's unsafe: in many cases it will leave the HashCode in an
> > unusable state. I solved that in the prototype by making it rvalue-ref
> > qualified, but then I realized there's no need to specify it at all
> > (particularly when the initialization API is also unspecified), and an
> > unspecified API is simpler, safer, and harder to bikeshed.
> >
> > I had thought I spelled this out in the paper somewhere, but I can't
> > immediately find it, so that may be an oversight on my part.
>
> I find that those reasons invalidate the proposal.
>
> If I can't use them in my own hashing container, they're unusable.

You can trivially use them in your own hashing container, as you would now.  The primary thing that is being changed is the mechanism by which user-defined types can supply their data to the hash function, not the interface for retrieving final values from that hash function.  The high-level API for retrieving a hash value from an object is unchanged, e.g.

template <typename T>
struct hash {
  size_t operator() (const T& value) const {
    std::hash_code state = __implementation_defined_init();
    state = hash_value(state, value);
    return __implementation_defined_extract_value(state);
  }
};

Or:

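struct MyHashCode {
  ...
 private:
  ...
  friend class MyHash;
  operator size_t() const;  // value extraction, accessible only to MyHash
};

struct MyHash {
  template <typename T>
  size_t operator()(const T& value) const {
    MyHashCode state;
    // The returned MyHashCode converts to size_t through the private operator.
    return hash_value(state, value);
  }
};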

Essentially, the only part which remains undefined is the internal interface between the top-level hash algorithm and its internal state.
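
For reference, here is one way the second sketch could be filled in so that it compiles. The FNV-style mixing, the `std::string` overload of `hash_value`, and the private default constructor are assumptions added for this example; only the `MyHashCode`/`MyHash` pairing and the private conversion come from the sketch above.

```cpp
#include <cstddef>
#include <string>
#include <utility>

class MyHash;  // forward declaration so MyHashCode can befriend it

// Hypothetical hash state: only MyHash can create the initial state or read
// the final value out of it.
class MyHashCode {
 public:
  // Public mixing step, usable from any hash_value overload (FNV-1a style).
  friend MyHashCode hash_value(MyHashCode h, const std::string& s) {
    for (unsigned char c : s) h.state_ = (h.state_ ^ c) * 16777619u;
    return h;
  }

 private:
  friend class MyHash;
  MyHashCode() = default;                                    // initialization
  explicit operator std::size_t() const { return state_; }  // extraction
  std::size_t state_ = 2166136261u;
};

class MyHash {
 public:
  template <typename T>
  std::size_t operator()(const T& value) const {
    MyHashCode state;  // only this wrapper knows how to start the state
    return static_cast<std::size_t>(hash_value(std::move(state), value));
  }
};
```

Code outside `MyHash` can still pass a `MyHashCode` through `hash_value`, but it can neither default-construct one nor convert one to `size_t`, so the unspecified parts stay private to this pair of types.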

--Shahms
 


Shahms King

Oct 5, 2015, 1:36:16 PM
to std-pr...@isocpp.org
On Mon, Oct 5, 2015 at 10:34 AM Shahms King <shahm...@gmail.com> wrote:
> On Mon, Oct 5, 2015 at 10:07 AM Thiago Macieira <thi...@macieira.org> wrote:
> > On Monday 05 October 2015 09:17:45 'Geoffrey Romer' via ISO C++ Standard -
> > Future Proposals wrote:
> > > The API for extracting a hash value, and symmetrically the API for
> > > initializing the HashCode, is deliberately left unspecified.
> >
> > So if I want to write a new portable container that relies on hashing, I
> > can't? What's the rationale for that?
>
> The proposal doesn't change the requirements on the hash-function callable itself.  If you want to implement a portable container that relies on hashing, you have a template parameter for the hash function which adheres to the same interface as std::hash<T>.  It is callable with the element type and returns a size_t.
> ~/.ssh/id_rsa_github

Ignore that line, there was an accidental ^V in there :-\

--Shahms

Sean Middleditch

Oct 5, 2015, 4:18:19 PM
to std-pr...@isocpp.org
On Mon, Oct 5, 2015 at 10:34 AM, Shahms King <shahm...@gmail.com> wrote:
> You can trivially use them in your own hashing container, as you would now.
> The primary thing that is being changed is the mechanism by which
> user-defined types can supply their data to the hash function, not the
> interface for retrieving final values from that hash function. The
> high-level API for retrieving a hash value from an object is unchanged, e.g.

Gotcha. In that case, this proposal is less complete than N3980 was.
The HashAlgorithm uhash<> interfaces were more widely usable. With
P0029, users have to write extraneous high-level wrappers around their
algorithms (a std::hash replacement), and generic code that needs to
support generic progressive hashes is left without a solution.

It seems to me a near trivial addition to require HashCode to have a
result_type and a conversion operator/method to result_type. It will
still satisfy all the goals of P0029 while also satisfying the requirements
covered by N3980, no?

Is there a significant cost/risk of adding such an interface to the
HashCode concept?

--
Sean Middleditch
http://seanmiddleditch.com

Thiago Macieira

Oct 5, 2015, 4:58:12 PM
to std-pr...@isocpp.org
On Monday 05 October 2015 17:34:04 Shahms King wrote:
> You can trivially use them in your own hashing container, as you would
> now. The primary thing that is being changed is the mechanism by which
> user-defined types can supply their data to the hash function, not the
> interface for retrieving final values from that hash function. The
> high-level API for retrieving a hash value from an object is unchanged, e.g.
>
> template <typename T>
> struct hash {
> size_t operator() (const T& value) const {
> std::hash_code state = __implementation_defined_init();
> state = hash_value(state, value);
> return __implementation_defined_extract_value(state);
> }
> };

If by __implementation_defined_init(), you mean the way I will initialise the
hash from my hash seed, I understand. But what constructors and assignment
operators will std::hash_code have so I can create it from my hash seed?

As for the extract value, I don't get it.

Can you give as examples the two most trivial implementations? That is, the one
where there's no extra state before or after the hashing of the value type (no
seed) and the one where the seed is just a global integer value initialised
from a random source.

Shahms King

Oct 5, 2015, 6:34:03 PM
to std-pr...@isocpp.org
As far as I understand it, you don't.  std::hash_code is the internal state of whatever hash algorithm is being used by std::hash.  Either you're using std::hash, in which case it initializes the std::hash_code through mechanisms unknown and similarly extracts the value or you're using a different algorithm which uses a different type to represent its internal state which conforms to the HashCode concept and you don't traffic in std::hash_code at all.  At least, that's my reading of the proposal.

Either your hash_value() function accepts std::hash_code and only works with std::hash, or it's a template and supports any conforming hashing algorithm.
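
Under that reading, the two trivial cases asked about (no seed, and a single global random seed) would be written against a user-defined state type rather than std::hash_code. The sketch below is an assumption-laden illustration: `FnvState`, `UnseededHash`, `SeededHash`, and `global_seed` are hypothetical names, and the FNV-1a mixing and the way the seed is folded in are arbitrary choices for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <string>

// Hypothetical FNV-1a style state; not std::hash_code and not from P0029.
class FnvState {
 public:
  explicit FnvState(std::uint64_t seed = 0)
      : bits_(14695981039346656037ull ^ seed) {}

  friend FnvState hash_value(FnvState h, const std::string& s) {
    for (unsigned char c : s) h.bits_ = (h.bits_ ^ c) * 1099511628211ull;
    return h;
  }

  std::uint64_t bits() const { return bits_; }

 private:
  std::uint64_t bits_;
};

// Case 1: no seed, the state always starts from the same constant.
struct UnseededHash {
  template <typename T>
  std::size_t operator()(const T& value) const {
    return static_cast<std::size_t>(hash_value(FnvState{}, value).bits());
  }
};

// Case 2: one process-wide seed drawn from a random source on first use.
inline std::uint64_t global_seed() {
  static const std::uint64_t seed = [] {
    std::random_device rd;
    return (static_cast<std::uint64_t>(rd()) << 32) | rd();
  }();
  return seed;
}

struct SeededHash {
  template <typename T>
  std::size_t operator()(const T& value) const {
    return static_cast<std::size_t>(
        hash_value(FnvState{global_seed()}, value).bits());
  }
};
```

Both functors expose the same `size_t operator()(const T&)` interface as std::hash, so a container templated on its hash function can use either one without seeing `FnvState`.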

--Shahms

Thiago Macieira

Oct 6, 2015, 3:07:12 AM
to std-pr...@isocpp.org
On Monday 05 October 2015 22:33:50 Shahms King wrote:
> As far as I understand it, you don't. std::hash_code is the internal state
> of whatever hash algorithm is being used by std::hash. Either you're using
> std::hash, in which case it initializes the std::hash_code through
> mechanisms unknown and similarly extracts the value or you're using a
> different algorithm which uses a different type to represent its internal
> state which conforms to the HashCode concept and you don't traffic in
> std::hash_code at all. At least, that's my reading of the proposal.
>
> Either your hash_value() function accepts std::hash_code and only works
> with std::hash, or it's a template and supports any conforming hashing
> algorithm.

Ok, so the class is not useful or interesting at all for me. I will pay it no
further attention.