[Python-Dev] New hash algorithms: SHA3, SHAKE, BLAKE2, truncated SHA512


Christian Heimes

May 25, 2016, 6:30:51 AM
to pytho...@python.org
Hi everybody,

I have three hashing-related patches for Python 3.6 that are waiting for
review. Altogether the three patches add ten new hash algorithms to the
hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).


SHA-3 / SHAKE: https://bugs.python.org/issue16113
BLAKE2: https://bugs.python.org/issue26798
SHA512/224 / SHA512/256: https://bugs.python.org/issue26834


I would like to push the patches during the sprints at PyCon. Please assist
with reviews.

Regards,
Christian
_______________________________________________
Python-Dev mailing list
Pytho...@python.org
https://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: https://mail.python.org/mailman/options/python-dev/dev-python%2Bgarchive-30976%40googlegroups.com

Raymond Hettinger

May 27, 2016, 12:56:23 AM
to Christian Heimes, Python-Dev@Python. Org

> On May 25, 2016, at 3:29 AM, Christian Heimes <chri...@python.org> wrote:
>
> I have three hashing-related patches for Python 3.6 that are waiting for
> review. Altogether the three patches add ten new hash algorithms to the
> hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
> BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).

Do we really need ten? I don't think the standard library is the place to offer all variants of hashing. And we should avoid getting in a cycle of "this was just released by NIST" and "nobody uses that one anymore". Is any one of them an emergent best practice (i.e. starting to be commonly used in network protocols because it is better, faster, stronger, etc)?

Your last message on https://bugs.python.org/issue16113 suggests that these aren't essential and that there is room for debate about whether some of them are standard-library worthy (i.e. we will have them around forever).


Raymond

Donald Stufft

May 27, 2016, 6:05:33 AM
to Raymond Hettinger, Christian Heimes, Python-Dev@Python. Org

> On May 27, 2016, at 12:54 AM, Raymond Hettinger <raymond....@gmail.com> wrote:
>
>
>> On May 25, 2016, at 3:29 AM, Christian Heimes <chri...@python.org> wrote:
>>
>> I have three hashing-related patches for Python 3.6 that are waiting for
>> review. Altogether the three patches add ten new hash algorithms to the
>> hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
>> BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).
>
> Do we really need ten? I don't think the standard library is the place to offer all variants of hashing. And we should avoid getting in a cycle of "this was just released by NIST" and "nobody uses that one anymore". Is any one of them an emergent best practice (i.e. starting to be commonly used in network protocols because it is better, faster, stronger, etc)?
>
> Your last message on https://bugs.python.org/issue16113 suggests that these aren't essential and that there is room for debate about whether some of them are standard-library worthy (i.e. we will have them around forever).
>


I think that adding sha3 here is a net positive. While not many things use it today, that’s largely because it’s fairly new. It’s a NIST standard, so it won’t be long until things are using it. It would be surprising to me to be able to use sha1 and sha2 from the standard library, but not sha3.

SHAKE is really just SHA3 with some additional tweaks to the parameters. I think if you’re adding SHA3 it’s pretty easy to also add these, though I don’t think that it’s as important as adding SHA3 itself.

BLAKE2 is an interesting one, because while SHA3 is a NIST standard (so it’s going to gain adoption because of that), BLAKE2 is at least as strong as SHA3 but is better in many ways, particularly in speed: it’s actually faster than MD5 while being as secure as SHA3. This one I think is a good one to have in the standard library as well, because it is all around a really great hash and a lot of things are starting to be built on top of it. In particular, I’d like to use this in PyPI and pip, but I can’t unless it’s in the standard library.


Donald Stufft
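
For readers who want to see what the proposed additions look like in use, here is a minimal sketch. It assumes Python 3.6 or later, where hashlib provides sha3_256, shake_128 and blake2b constructors; at the time of this thread those names existed only in the patches under review, so treat this as an illustration rather than the reviewed API. The input bytes are made up.

```python
import hashlib

data = b"python-dev"

# Fixed-length SHA-3: the numeric suffix is the digest size in bits.
sha3 = hashlib.sha3_256(data)
print("sha3_256 :", sha3.hexdigest())

# SHAKE is an extendable-output function (XOF): the caller chooses the
# output length, in bytes, when asking for the digest.
xof = hashlib.shake_128(data)
short = xof.hexdigest(16)   # 16 bytes -> 32 hex characters
longer = xof.hexdigest(32)  # 32 bytes -> 64 hex characters
print("shake_128:", short)

# BLAKE2b lets the caller request a shorter digest at construction time.
blake = hashlib.blake2b(data, digest_size=20)
print("blake2b  :", blake.hexdigest())
```

Asking SHAKE for a longer digest extends the shorter one instead of producing an unrelated value, which is the main practical difference from the fixed-length SHA-3 functions.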

Victor Stinner

May 27, 2016, 6:45:49 AM
to Donald Stufft, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org

On May 27, 2016 at 12:05 PM, "Donald Stufft" <don...@stufft.io> wrote:
> BLAKE2 is an interesting one, because while SHA3 is a NIST standard (so it’s going to gain adoption because of that), BLAKE2 is at least as strong as SHA3 but is better in many ways, particularly in speed— it’s actually faster than MD5 while being as secure as SHA3.

BLAKE2 was part of the SHA3 competition and was among the finalists. The SHA3 competition is interesting because each algorithm is deeply tested and analyzed by many teams all around the world. Obvious vulnerabilities are quickly found.

The advantage of putting SHA3 and BLAKE2 in the stdlib is that they have different designs. I don't expect two designs to have the same vulnerabilities, but I'm not an expert :-)

SHA3 (Keccak) is based on a new sponge construction:
https://en.m.wikipedia.org/wiki/SHA-3

BLAKE is based on ChaCha:
https://en.m.wikipedia.org/wiki/BLAKE_(hash_function)
https://en.m.wikipedia.org/wiki/Salsa20#ChaCha_variant

Victor

M.-A. Lemburg

May 27, 2016, 6:56:39 AM
to Raymond Hettinger, Christian Heimes, Python-Dev@Python. Org
On 27.05.2016 06:54, Raymond Hettinger wrote:
>
>> On May 25, 2016, at 3:29 AM, Christian Heimes <chri...@python.org> wrote:
>>
>> I have three hashing-related patches for Python 3.6 that are waiting for
>> review. Altogether the three patches add ten new hash algorithms to the
>> hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
>> BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).
>
> Do we really need ten? I don't think the standard library is the place to offer all variants of hashing. And we should avoid getting in a cycle of "this was just released by NIST" and "nobody uses that one anymore". Is any one of them an emergent best practice (i.e. starting to be commonly used in network protocols because it is better, faster, stronger, etc)?
>
> Your last message on https://bugs.python.org/issue16113 suggests that these aren't essential and that there is room for debate about whether some of them are standard-library worthy (i.e. we will have them around forever).

I can understand your eagerness to get this landed, since it's
been 4 years since work started, but I think we should wait with
the addition until OpenSSL has them:

https://github.com/openssl/openssl/issues/439

The current patch is 1.2MB for SHA-3 - that's pretty heavy for just
a few hash functions, which aren't in any wide spread use yet and
probably won't be for quite a few years ahead.

IMO, relying on OpenSSL is a better strategy than providing
(and maintaining) our own compatibility versions. Until OpenSSL
has them, people can use Björn's package:

https://github.com/bjornedstrom/python-sha3

Perhaps you could join forces with Björn to create a standard
SHA-3 standalone package on PyPI based on your two variants
which we could recommend to people in the docs ?!

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, May 27 2016)
>>> Python Projects, Coaching and Consulting ... http://www.egenix.com/
>>> Python Database Interfaces ... http://products.egenix.com/
>>> Plone/Zope Database Interfaces ... http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
http://www.malemburg.com/

Donald Stufft

May 27, 2016, 7:05:31 AM
to M.-A. Lemburg, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org

> On May 27, 2016, at 6:54 AM, M.-A. Lemburg <m...@egenix.com> wrote:
>
> IMO, relying on OpenSSL is a better strategy than providing
> (and maintaining) our own compatibility versions. Until OpenSSL
> has them, people can use Björn's package:

Even now, hashlib doesn’t rely on OpenSSL, if I recall correctly. It will
use OpenSSL if it is available, but otherwise it has internal implementations
too.


Donald Stufft
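
The split described above is visible from Python itself. The sketch below assumes a Python 3 interpreter, where hashlib exposes two documented sets, algorithms_guaranteed and algorithms_available; the variable names are only illustrative, and the exact contents of both sets depend on the Python version and on how the interpreter was built.

```python
import hashlib

# Names that every build of this Python version promises to provide,
# whether or not OpenSSL is present.
guaranteed = hashlib.algorithms_guaranteed

# Names usable in this particular interpreter, including any extras
# supplied by the OpenSSL library it was linked against.
available = hashlib.algorithms_available

# Whatever is left over exists only because of how this build was linked.
build_specific = sorted(available - guaranteed)

print("guaranteed:", sorted(guaranteed))
print("build-specific extras:", build_specific)
```

Code that has to run everywhere can only depend on the first set, which is why the question of what goes into it matters in this discussion.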

M.-A. Lemburg

May 27, 2016, 7:14:54 AM
to Donald Stufft, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
On 27.05.2016 13:03, Donald Stufft wrote:
>
>> On May 27, 2016, at 6:54 AM, M.-A. Lemburg <m...@egenix.com> wrote:
>>
>> IMO, relying on OpenSSL is a better strategy than providing
>> (and maintaining) our own compatibility versions. Until OpenSSL
>> has them, people can use Björn's package:
>
> Even now, hashlib doesn’t rely on OpenSSL if I recall, I mean it will
> use it if OpenSSL is available but otherwise it has internal implementations
> too.

I know, but I still don't think that's a good idea. It makes
sense in case you don't want to carry around OpenSSL all the
time, but how often does that happen nowadays ?

BTW: If I recall correctly, those hash implementations predate
the deeper support for OpenSSL we now have in Python.

--
Marc-Andre Lemburg
eGenix.com


Daniel Holth

May 27, 2016, 10:33:39 AM
to M.-A. Lemburg, Donald Stufft, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
OpenSSL sucks. Python would only have to bundle a reference implementation of the new hash algorithm(s), and unlike TLS suites they tend to just work.

BLAKE2 is important, since it removes the last objection to replacing MD5 - speed - that has made it hard for cryptography fans to convince MD5 users to upgrade.

Bernardo Sulzbach

May 27, 2016, 11:40:13 AM
to pytho...@python.org
On 05/27/2016 11:31 AM, Daniel Holth wrote:
>
> BLAKE2 is important, since it removes the last objection to replacing MD5 -
> speed - that has made it hard for cryptography fans to convince MD5 users
> to upgrade.
>

I have had to stick to MD5 for performance reasons (2 seconds in MD5 or
9.6 seconds in SHA256, IIRC) in scenarios that did not require an SHA*.
Having BLAKE2 around wouldn't be a necessity, but if it shipped with
newer versions of Python eventually there would be a commit switching
the underlying hash function.
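
A comparison like the one above is easy to reproduce locally. The following is a rough sketch, assuming Python 3.6 or later (so that hashlib.blake2b exists) and a build in which MD5 is enabled. The helper name time_hash and the 1 MiB payload are arbitrary choices for illustration, and the numbers depend heavily on the CPU and on whether OpenSSL supplies the implementation, so no particular ranking is implied.

```python
import hashlib
import timeit

payload = b"\x00" * (1 << 20)  # 1 MiB of input, an arbitrary test size


def time_hash(name, repeat=5):
    """Return the best-of-`repeat` time in seconds to hash `payload` once."""
    constructor = getattr(hashlib, name)
    runs = timeit.repeat(lambda: constructor(payload).digest(),
                         number=1, repeat=repeat)
    return min(runs)


timings = {name: time_hash(name) for name in ("md5", "sha256", "blake2b")}

# Print the fastest algorithm first.
for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
    print(f"{name:8s} {seconds * 1e3:8.3f} ms per MiB")
```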

Chris Barker - NOAA Federal

May 27, 2016, 11:46:32 AM
to M.-A. Lemburg, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
> ..., which aren't in any wide spread use yet and
> probably won't be for quite a few years ahead.

Anything added to the stdlib now will be in py3.6+, yes?

Which won't be in widespread use for quite a few years yet, either.

So if ( and that's a big if) it's possible to anticipate what will be
in widespread use in a couple years, getting it in now would be a good
thing.

-CHB


M.-A. Lemburg

May 27, 2016, 12:40:50 PM
to Chris Barker - NOAA Federal, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org


On 27.05.2016 17:44, Chris Barker - NOAA Federal wrote:
>>> , which aren't in any wide spread use yet and
>> probably won't be for quite a few years ahead.
>
> Anything added to the stdlib now will be in py3.6+, yes?
>
> Which won't be in widespread use for quite a few years yet, either.
>
> So if ( and that's a big if) it's possible to anticipate what will be
> in widespread use in a couple years, getting it in now would be a good
> thing.

You cut away the important part of what I said:
"The current patch is 1.2MB for SHA-3 - that's pretty heavy for
just a few hash functions, ..."

If people want to use the hashes earlier, this is already possible
via a separate package, so we're not delaying their use.

It is clear that SHA-3 will get more traction in coming years (*),
but I'm pretty sure that OpenSSL will have good implementations by
the time people will actively start using the new hash algorithm
and then hashlib will automatically make that available (hashlib
uses the OpenSSL EVP abstraction, so will be able to use any
new algorithms added to OpenSSL).

However, if we add the reference implementation now, we'd then be
left with 1.2MB unnecessary code in the stdlib.

The question is not so much: is SHA-3 useful or not, it's
whether we want to maintain this forever going forward or
not.

(*) People are just now starting to move from SHA-1 to SHA-2
and SHA-2 was standardized in 2001. Python received SHA-2 support
in 2006. So there's plenty of time to decide :-)

Chris Barker

May 27, 2016, 12:43:55 PM
to M.-A. Lemburg, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
On Fri, May 27, 2016 at 9:35 AM, M.-A. Lemburg <m...@egenix.com> wrote:
>> So if ( and that's a big if) it's possible to anticipate what will be
>> in widespread use in a couple years, getting it in now would be a good
>> thing.
>
> You cut away the important part of what I said:
> "The current patch is 1.2MB for SHA-3 - that's pretty heavy for
> just a few hash functions, ..."
>
> If people want to use the hashes earlier, this is already possible
> via a separate package, so we're not delaying their use.

That's true for ANY addition to the stdlib -- it could always be made available in a third party lib. (unless you want to use it in another part of the stdlib...)

> It is clear that SHA-3 will get more traction in coming years (*),
> but I'm pretty sure that OpenSSL will have good implementations by
> the time people will actively start using the new hash algorithm
> and then hashlib will automatically make that available (hashlib
> uses the OpenSSL EVP abstraction, so will be able to use any
> new algorithms added to OpenSSL).
>
> However, if we add the reference implementation now, we'd then be
> left with 1.2MB unnecessary code in the stdlib.

I'm probably showing my ignorance here, but couldn't we swap in the OpenSSL implementation when that becomes available?

-CHB

> (*) People are just now starting to move from SHA-1 to SHA-2
> and SHA-2 was standardized in 2001. Python received SHA-2 support
> in 2006. So there's plenty of time to decide :-)

can't deny the history, nor the inertia -- but that doesn't make it a good thing...


--

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris....@noaa.gov

M.-A. Lemburg

May 27, 2016, 12:58:08 PM
to Chris Barker, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
On 27.05.2016 18:41, Chris Barker wrote:
> On Fri, May 27, 2016 at 9:35 AM, M.-A. Lemburg <m...@egenix.com> wrote:
>
>>> So if ( and that's a big if) it's possible to anticipate what will be
>>> in widespread use in a couple years, getting it in now would be a good
>>> thing.
>>
>> You cut away the important part of what I said:
>> "The current patch is 1.2MB for SHA-3 - that's pretty heavy for
>> just a few hash functions, ..."
>>
>> If people want to use the hashes earlier, this is already possible
>> via a separate package, so we're not delaying their use.
>>
>
> That's true for ANY addition to the stdlib -- it could always be made
> available in a third party lib. (unless you want to use it in another part
> of the stdlib...)

Well, any addition for which someone already wrote a package,
but yes...

>> It is clear that SHA-3 will get more traction in coming years (*),
>> but I'm pretty sure that OpenSSL will have good implementations by
>> the time people will actively start using the new hash algorithm
>> and then hashlib will automatically make that available (hashlib
>> uses the OpenSSL EVP abstraction, so will be able to use any
>> new algorithms added to OpenSSL).
>>
>> However, if we add the reference implementation now, we'd then be
>> left with 1.2MB unnecessary code in the stdlib.
>>
>
> I'm probably showing my ignorance here, but couldn't we swap in the OpenSSL
> implementation when that becomes available?

We could, but only if we don't expose separate interfaces
for the hashes and don't add them to:

hashlib.algorithms
hashlib.algorithms_guaranteed

> -CHB
>
>
> (*) People are just now starting to move from SHA-1 to SHA-2
>> and SHA-2 was standardized in 2001. Python received SHA-2 support
>> in 2006. So there's plenty of time to decide :-)
>
>
> can't deny the history, nor the inertia -- but that doesn't make it a good
> thing...

Victor Stinner

May 27, 2016, 4:05:24 PM
to M.-A. Lemburg, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
On Friday, May 27, 2016, M.-A. Lemburg <m...@egenix.com> wrote:
> The current patch is 1.2MB for SHA-3 - that's pretty heavy for just
> a few hash functions, which aren't in any wide spread use yet and
> probably won't be for quite a few years ahead.

Oh wow, it's so fat? Why is it so big? Can't we use a lighter version?

Victor

Ryan Gonzalez

May 27, 2016, 5:00:07 PM
to Victor Stinner, Christian Heimes, Python-Dev@Python. Org, Raymond Hettinger, M. -A. Lemburg

The stark majority of the patch is Lib/test/vectors/sha3_224.txt, which seems to be (as the file path implies) just test data. A whopping >1k LOC of really long hashes.

> Victor



--
Ryan
[ERROR]: Your autotools build scripts are 200 lines longer than your program. Something’s wrong.
http://kirbyfan64.github.io/

M.-A. Lemburg

May 27, 2016, 5:43:16 PM
to Ryan Gonzalez, Victor Stinner, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
On 27.05.2016 22:58, Ryan Gonzalez wrote:
> On May 27, 2016 3:04 PM, "Victor Stinner" <victor....@gmail.com> wrote:
>>
>> On Friday, May 27, 2016, M.-A. Lemburg <m...@egenix.com> wrote:
>>>
>>> The current patch is 1.2MB for SHA-3 - that's pretty heavy for just
>>> a few hash functions, which aren't in any wide spread use yet and
>>> probably won't be for quite a few years ahead.
>>
>>
>> Oh wow, it's so fat? Why is it so big? Can't we use a lighter version?
>>
>
> The stark majority of the patch is Lib/test/vectors/sha3_224.txt, which
> seems to be (as the file path implies) just test data. A whopping >1k LOC
> of really long hashes.

Right. There's about 1MB test data in the patch, but even
without that data, the patch adds more than 6400 lines of code.

If we add this now, there should at least be an exit strategy
to remove the code again, when OpenSSL ships with the same
code, IMO.

Aside: BLAKE2 has already landed in OpenSSL 1.1.0:

https://github.com/openssl/openssl/tree/master/crypto/blake2

--
Marc-Andre Lemburg
eGenix.com


Donald Stufft

May 27, 2016, 5:48:29 PM
to M.-A. Lemburg, Christian Heimes, Python-Dev@Python. Org, Raymond Hettinger

> On May 27, 2016, at 5:41 PM, M.-A. Lemburg <m...@egenix.com> wrote:
>
> If we add this now, there should at least be an exit strategy
> to remove the code again, when OpenSSL ships with the same
> code, IMO.


I think it is a clear win to have the fallback implementations in cases where people either don’t have OpenSSL or don’t have a new enough OpenSSL for those implementations. Not having the fallback just makes it more difficult for people to rely on those hash functions.


Donald Stufft
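
To make the cost of a missing fallback concrete, here is a sketch of what calling code ends up writing when an algorithm is not guaranteed to exist. new_hash is a hypothetical helper, not part of hashlib, and falling back to SHA-256 is only an assumption for the example; it relies on hashlib.new raising ValueError for an algorithm name the build does not support.

```python
import hashlib


def new_hash(preferred, data=b"", fallback="sha256"):
    """Return a hash object for `preferred`, or for `fallback` when this
    build does not provide the preferred algorithm.

    Hypothetical helper for illustration; not part of hashlib.
    """
    try:
        return hashlib.new(preferred, data)
    except ValueError:
        # hashlib.new raises ValueError for an unsupported algorithm name.
        return hashlib.new(fallback, data)


h = new_hash("blake2b", b"payload")
print(h.name, h.hexdigest()[:16])
```

The catch is that two machines may now compute different digests for the same input, so the algorithm name has to be stored alongside every digest. An algorithm that every build guarantees avoids that bookkeeping.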

M.-A. Lemburg

May 27, 2016, 6:10:28 PM
to Donald Stufft, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
On 27.05.2016 23:46, Donald Stufft wrote:
>
>> On May 27, 2016, at 5:41 PM, M.-A. Lemburg <m...@egenix.com> wrote:
>>
>> If we add this now, there should at least be an exit strategy
>> to remove the code again, when OpenSSL ships with the same
>> code, IMO.
>
> I think it is a clear win to have the fallback implementations in cases where people either don’t have OpenSSL or don’t have a new enough OpenSSL for those implementations. Not having the fallback just makes it more difficult for people to rely on those hash functions.

This will only be needed once the stdlib itself starts requiring
support for some of these hashes and for that we could add
a pure Python implementation, eg.

https://github.com/coruus/py-keccak

In all other cases, you can simply add the support via a
package such as Björn's or Christian's.

--
Marc-Andre Lemburg
eGenix.com


Nathaniel Smith

May 27, 2016, 6:53:50 PM
to M.-A. Lemburg, Christian Heimes, Raymond Hettinger, Python-Dev@Python. Org
On Fri, May 27, 2016 at 3:08 PM, M.-A. Lemburg <m...@egenix.com> wrote:
> On 27.05.2016 23:46, Donald Stufft wrote:
>>
>>> On May 27, 2016, at 5:41 PM, M.-A. Lemburg <m...@egenix.com> wrote:
>>>
>>> If we add this now, there should at least be an exit strategy
>>> to remove the code again, when OpenSSL ships with the same
>>> code, IMO.
>>
>> I think it is a clear win to have the fallback implementations in cases where people either don’t have OpenSSL or don’t have a new enough OpenSSL for those implementations. Not having the fallback just makes it more difficult for people to rely on those hash functions.
>
> This will only be needed once the stdlib itself starts requiring
> support for some of these hashes and for that we could add
> a pure Python implementation, eg.
>
> https://github.com/coruus/py-keccak
>
> In all other cases, you can simply add the support via a
> package such as Björn's or Christian's.

SHA-3 and BLAKE are extremely widely accepted standards, our users
will expect them, and they're significant improvements over all the
current hashes in the algorithms_guaranteed list. If we demote them to
second-class support (by making them only available in some builds, or
using a slow pure Python implementation), then we'll be encouraging
users to use inferior hashes. We shouldn't do this without a very good
reason, and I don't see anything very convincing here... by all means
drop the megabyte of test data, but why does it matter how many lines
of code the algorithm is? No python developer will ever have to look
at it -- hash code by its nature is *very* low maintenance (it either
computes the right function or it doesn't, and the right answer never
changes). And in the unlikely case where some terrible unexpected bug is
discovered then the only maintenance needed will be to delete the
current impl and drop-in whatever the new fixed one is.

So +1 to adding SHA-3 and BLAKE to algorithms_guaranteed.

-n

--
Nathaniel J. Smith -- https://vorpus.org

Bernardo Sulzbach

May 27, 2016, 7:21:05 PM
to pytho...@python.org
On 05/27/2016 07:52 PM, Nathaniel Smith wrote:
> If we demote them to second-class support (by making them only
> available in some builds, or using a slow pure Python implementation),
> then we'll be encouraging users to use inferior hashes. We shouldn't
> do this without a very good reason.

I agree. And I really think we shouldn't even ship pure Python
implementations of these hashing algorithms. I am fairly confident that
these algorithms would be prohibitively slow if written in pure Python.

Christian Heimes

May 28, 2016, 4:45:49 PM
to Victor Stinner, Donald Stufft, Raymond Hettinger, Python-Dev@Python. Org
On 2016-05-27 03:44, Victor Stinner wrote:
> On May 27, 2016 at 12:05 PM, "Donald Stufft" <don...@stufft.io> wrote:

>> BLAKE2 is an interesting one, because while SHA3 is a NIST standard
> (so it’s going to gain adoption because of that), BLAKE2 is at least as
> strong as SHA3 but is better in many ways, particularly in speed— it’s
> actually faster than MD5 while being as secure as SHA3.
>
> BLAKE2 was part of the SHA3 competition and it was in finalists. The
> SHA3 competition is interesting because each algorithm is deeply tested
> and analyzed by many teams all around the world. Obvious vulnerabilities
> are quickly found.

Thanks Victor,

minor correction, BLAKE was a finalist in the SHA3 competition, not
BLAKE2. BLAKE2 is an improved version of BLAKE with additional features.

Christian
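
Two of those additional features, keyed hashing and personalization, can be sketched as follows. This assumes the digest_size, key and person keyword parameters of hashlib.blake2b as they exist in Python 3.6 and later; the message, key and personalization strings are made up for the example.

```python
import hashlib

message = b"release-1.0.tar.gz"

# Plain BLAKE2b with a 16-byte digest.
plain = hashlib.blake2b(message, digest_size=16).hexdigest()

# Keyed mode: the digest depends on a secret key, so it can serve as a
# message authentication code without a separate HMAC construction.
keyed = hashlib.blake2b(message, digest_size=16,
                        key=b"shared secret").hexdigest()

# Personalization: the same input hashed for two different purposes
# yields unrelated digests.
for_index = hashlib.blake2b(message, digest_size=16,
                            person=b"index").hexdigest()
for_cache = hashlib.blake2b(message, digest_size=16,
                            person=b"cache").hexdigest()

print(plain, keyed, for_index, for_cache, sep="\n")
```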

Christian Heimes

May 28, 2016, 4:59:00 PM
to M.-A. Lemburg, Raymond Hettinger, Python-Dev@Python. Org
On 2016-05-27 03:54, M.-A. Lemburg wrote:
> On 27.05.2016 06:54, Raymond Hettinger wrote:
>>
>>> On May 25, 2016, at 3:29 AM, Christian Heimes <chri...@python.org> wrote:
>>>
>>> I have three hashing-related patches for Python 3.6 that are waiting for
>>> review. Altogether the three patches add ten new hash algorithms to the
>>> hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
>>> BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).
>>
>> Do we really need ten? I don't think the standard library is the place to offer all variants of hashing. And we should avoid getting in a cycle of "this was just released by NIST" and "nobody uses that one anymore". Is any one of them an emergent best practice (i.e. starting to be commonly used in network protocols because it is better, faster, stronger, etc)?
>>
>> Your last message on https://bugs.python.org/issue16113 suggests that these aren't essential and that there is room for debate about whether some of them are standard-library worthy (i.e. we will have them around forever).
>
> I can understand your eagerness to get this landed, since it's
> been 4 years since work started, but I think we should wait with
> the addition until OpenSSL has them:
>
> https://github.com/openssl/openssl/issues/439
>
> The current patch is 1.2MB for SHA-3 - that's pretty heavy for just
> a few hash functions, which aren't in any wide spread use yet and
> probably won't be for quite a few years ahead.

About 1 MB of the 1.2 MB are test vectors for SHA3. Strictly speaking
the test vectors are not required.

> IMO, relying on OpenSSL is a better strategy than providing
> (and maintaining) our own compatibility versions. Until OpenSSL
> has them, people can use Björn's package:
>
> https://github.com/bjornedstrom/python-sha3
>
> Perhaps you could join forces with Björn to create a standard
> SHA-3 standalone package on PyPI based on your two variants
> which we could recommend to people in the docs ?!

I have been maintaining my own SHA3 module for a couple of years. A month
ago I moved my code to GitHub and ported it to the new Keccak Code
Package. The standalone package uses the same code as my patch but also
provides the old Keccak hashes and works on Python 2.7.

https://github.com/tiran/pysha3
https://pypi.python.org/pypi/pysha3

Christian Heimes

May 28, 2016, 5:03:17 PM
to Chris Barker, M.-A. Lemburg, Raymond Hettinger, Python-Dev@Python. Org
On 2016-05-27 09:41, Chris Barker wrote:
> I'm probably showing my ignorance here, but couldn't we swap in the
> OpenSSL implementation when that becomes available?

No, not any time soon. As soon as we guarantee SHA3 support we have to
keep our own implementation for a couple of additional releases. We can
drop our own SHA3 code as soon as all supported OpenSSL versions have SHA3.

For example, if OpenSSL 1.2.0 gains SHA3 support, we must wait until
OpenSSL 1.1 and 1.0.2 are no longer supported by the OpenSSL project.

Christian

Donald Stufft

May 28, 2016, 5:06:38 PM
to Christian Heimes, Python-Dev@Python. Org, Raymond Hettinger, Chris Barker, M.-A. Lemburg

> On May 28, 2016, at 5:01 PM, Christian Heimes <chri...@python.org> wrote:
>
> No, not any time soon. As soon as we guarantee SHA3 support we have to
> keep our own implementation for a couple of additional releases. We can
> drop our own SHA3 code as soon as all supported OpenSSL versions have SHA3.


It still will be needed for as long as it’s possible to build Python without
OpenSSL.


Donald Stufft

Brett Cannon

May 28, 2016, 5:08:44 PM
to Christian Heimes, M.-A. Lemburg, Raymond Hettinger, Python-Dev@Python. Org


On Sat, May 28, 2016, 13:58 Christian Heimes <chri...@python.org> wrote:
> About 1 MB of the 1.2 MB are test vectors for SHA3. Strictly speaking
> the test vectors are not required.

We can always make the test vector file an external download like we do for some of the codec tests.

-brett



> IMO, relying on OpenSSL is a better strategy than providing
> (and maintaining) our own compatibility versions. Until OpenSSL
> has them, people can use Björn's package:
>
> https://github.com/bjornedstrom/python-sha3
>
> Perhaps you could join forces with Björn to create a standard
> SHA-3 standalone package on PyPI based on your two variants
> which we could recommend to people in the docs ?!

I have been maintaining my own SHA3 module for a couple of years. A month
ago I moved my code to github and ported it to the new Keccak Code
Package. The standalone package uses the same code as my patch but also
provides the old Keccak hashes and works on Python 2.7.

https://github.com/tiran/pysha3
https://pypi.python.org/pypi/pysha3

Guido van Rossum

May 28, 2016, 5:10:01 PM5/28/16
to Christian Heimes, Python-Dev@Python. Org, Raymond Hettinger, Chris Barker, M.-A. Lemburg
But you could choose which implementation to use at compile time based
on the autoconf output, right?



--
--Guido van Rossum (python.org/~guido)

Christian Heimes

May 28, 2016, 5:15:11 PM5/28/16
to M.-A. Lemburg, Ryan Gonzalez, Victor Stinner, Raymond Hettinger, Python-Dev@Python. Org
On 2016-05-27 14:41, M.-A. Lemburg wrote:
> On 27.05.2016 22:58, Ryan Gonzalez wrote:
>> On May 27, 2016 3:04 PM, "Victor Stinner" <victor....@gmail.com> wrote:
>>>
>>> Le vendredi 27 mai 2016, M.-A. Lemburg <m...@egenix.com> a écrit :
>>>>
>>>> The current patch is 1.2MB for SHA-3 - that's pretty heavy for just
>>>> a few hash functions, which aren't in any wide spread use yet and
>>>> probably won't be for quite a few years ahead.
>>>
>>>
>>> Oh wow, it's so fat? Why is it so big? Can't we use a lighter version?
>>>
>>
>> The stark majority of the patch is Lib/test/vectors/sha3_224.txt, which
>> seems to be (as the file path implies) just test data. A whopping >1k LOC
>> of really long hashes.
>
> Right. There's about 1MB test data in the patch, but even
> without that data, the patch adds more than 6400 lines of code.

The KeccakCodePackage is rather large. I already removed all unnecessary
files and modified some files so more code is shared between 32 and
64bit optimized variants. Please keep in mind that the KCP contains
multiple implementations with different optimizations for CPU
architectures. I already removed the ARM NEON optimization.
I also don't get your obsession with lines of code. The gzip and expat
are far bigger than the KeccakCodePackage.


> If we add this now, there should at least be an exit strategy
> to remove the code again, when OpenSSL ships with the same
> code, IMO.
>
> Aside: BLAKE2 has already landed in OpenSSL 1.1.0:
>
> https://github.com/openssl/openssl/tree/master/crypto/blake2

Except BLAKE2 in OpenSSL is severely castrated and tailored towards a
very limited use case. The implementation does not support any of the
useful advanced features like keyed hashing (MAC), salt,
personalization, tree hashing and variable hash length.
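
For readers who want to see what these features look like in practice, here is a minimal sketch. It assumes a Python version in which hashlib.blake2b is available (3.6 and later), so it is not a description of the patch under review. The key, salt and personalization values are made-up placeholders.

```python
import hashlib

message = b"message to protect"

# Keyed hashing: the key turns the hash into a MAC.
# digest_size selects a variable output length (16 bytes here, up to 64 for blake2b).
mac = hashlib.blake2b(message, key=b"placeholder secret", digest_size=16)

# Salt and personalization both separate hash domains:
# the same input gives unrelated digests for different 'person' values.
h_a = hashlib.blake2b(message, salt=b"salt-0001", person=b"app-A")
h_b = hashlib.blake2b(message, salt=b"salt-0001", person=b"app-B")

print(mac.hexdigest())
print(h_a.hexdigest() != h_b.hexdigest())
```

For blake2b, salt and person are limited to 16 bytes each and the key to 64 bytes. Tree hashing is configured through further keyword arguments that are not shown here.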

Donald Stufft

May 28, 2016, 5:20:35 PM5/28/16
to gu...@python.org, Christian Heimes, M.-A. Lemburg, Chris Barker, Raymond Hettinger, Python-Dev@Python. Org

> On May 28, 2016, at 5:06 PM, Guido van Rossum <gu...@python.org> wrote:
>
> But you could choose which implementation to use at compile time based
> on the autoconf output, right?

I think we should follow what hashlib already does. If we want to change the way it works that's fine but these hashes shouldn't be special. They should work the way that all the other standard hashes in hashlib work.

Christian Heimes

May 28, 2016, 5:32:20 PM5/28/16
to gu...@python.org, Python-Dev@Python. Org, Raymond Hettinger, Chris Barker, M.-A. Lemburg
On 2016-05-28 14:06, Guido van Rossum wrote:
> But you could choose which implementation to use at compile time based
> on the autoconf output, right?

We compile all modules and then let hashlib decide which implementation
is used. hashlib prefers OpenSSL but falls back to our builtin modules.
For MD5, SHA1 and SHA2 OpenSSL's implementation has better performance
(up to twice the speed).
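
The "prefer OpenSSL, fall back to the builtin module" behaviour can be sketched roughly as follows. This is an illustration only, not hashlib's actual source: new_hash is a hypothetical helper, and _hashlib is CPython's private OpenSSL wrapper, which may be missing on builds without OpenSSL.

```python
import hashlib

try:
    import _hashlib  # private OpenSSL-backed module; absent if Python was built without OpenSSL
except ImportError:
    _hashlib = None


def new_hash(name, data=b""):
    """Hypothetical helper: try the OpenSSL implementation first, then fall back."""
    if _hashlib is not None:
        try:
            return _hashlib.new(name, data)
        except ValueError:
            # OpenSSL does not know (or does not allow) this algorithm.
            pass
    # hashlib.new() performs its own lookup, including the builtin modules.
    return hashlib.new(name, data)


print(new_hash("sha256", b"abc").hexdigest())
```

Whichever backend ends up being used, the digest is the same; only the speed differs.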

Christian Heimes

May 28, 2016, 5:34:22 PM5/28/16
to Brett Cannon, M.-A. Lemburg, Raymond Hettinger, Python-Dev@Python. Org
On 2016-05-28 14:06, Brett Cannon wrote:
> We can always make the test vector file an external download like we do
> for some of the codec tests.

That is actually a great idea! :)

Thanks Brett


Christian Heimes

May 28, 2016, 5:57:44 PM5/28/16
to Nathaniel Smith, M.-A. Lemburg, Raymond Hettinger, Python-Dev@Python. Org
Thanks Nathaniel,

my patches don't add SHA3 and BLAKE2 to algorithms_guaranteed because
Python still supports C89 platforms without a 64 bit integer type.
Theoretically 64bit ints are not required except for BLAKE2b. Since
Trent's snakebite.org is dead I don't have access to these old platforms
any more.

Christian
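
The distinction drawn above between guaranteed and merely available algorithms can be checked at runtime. A small sketch, where support_level is a hypothetical helper name and the result for any given algorithm depends on the Python version and the linked OpenSSL:

```python
import hashlib


def support_level(name):
    """Hypothetical helper: classify how far code can rely on a hash algorithm."""
    if name in hashlib.algorithms_guaranteed:
        return "guaranteed"   # present on every platform for this Python version
    if name in hashlib.algorithms_available:
        return "available"    # present in this interpreter, e.g. via OpenSSL
    return "missing"


for algo in ("sha256", "sha3_256", "blake2b"):
    print(algo, support_level(algo))
```

Code that must run everywhere should only depend on names in algorithms_guaranteed; anything else needs a fallback or a clear error message.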

Victor Stinner

May 29, 2016, 2:53:05 AM5/29/16
to Christian Heimes, Python-Dev@Python. Org, Raymond Hettinger, M.-A. Lemburg
Python 3.5 requires a 64 bit signed integer to build. Search for _PyTime type in pytime.h ;-)

M.-A. Lemburg

May 29, 2016, 10:10:10 AM5/29/16
to Christian Heimes, Ryan Gonzalez, Victor Stinner, Raymond Hettinger, Python-Dev@Python. Org
On 28.05.2016 23:13, Christian Heimes wrote:
> On 2016-05-27 14:41, M.-A. Lemburg wrote:
>> On 27.05.2016 22:58, Ryan Gonzalez wrote:
>>> On May 27, 2016 3:04 PM, "Victor Stinner" <victor....@gmail.com> wrote:
>>>>
>>>> Le vendredi 27 mai 2016, M.-A. Lemburg <m...@egenix.com> a écrit :
>>>>>
>>>>> The current patch is 1.2MB for SHA-3 - that's pretty heavy for just
>>>>> a few hash functions, which aren't in any wide spread use yet and
>>>>> probably won't be for quite a few years ahead.
>>>>
>>>>
>>>> Oh wow, it's so fat? Why is it so big? Can't we use a lighter version?
>>>>
>>>
>>> The stark majority of the patch is Lib/test/vectors/sha3_224.txt, which
>>> seems to be (as the file path implies) just test data. A whopping >1k LOC
>>> of really long hashes.
>>
>> Right. There's about 1MB test data in the patch, but even
>> without that data, the patch adds more than 6400 lines of code.
>
> The KeccakCodePackage is rather large. I already removed all unnecessary
> files and modified some files so more code is shared between 32 and
> 64bit optimized variants. Please keep in mind that the KCP contains
> multiple implementations with different optimizations for CPU
> architectures. I already removed the ARM NEON optimization.
> I also don't get your obsession with lines of code. The gzip and expat
> are far bigger than the KeccakCodePackage.

For a small piece of code, it's fine to have a copy in the
stdlib, but for larger chunks such as this one, I think we
ought to consider alternative options, since I don't think
it's good to have to carry around this baggage forever.

OpenSSL will eventually have good enough support for what
most Python users will need from these new hash functions.
That's why I think it's better to have a discussion of whether
we need the full package in the stdlib or should only provide
limited support built into the stdlib and refer people to
PyPI packages for things that you don't need every day.

Regarding the stories for zlib and expat, I only remember
that expat was essentially unmaintained when we added it
and the existing version at the time had known bugs (but
I could be wrong).

For zlib, I have no clue as to why we have a copy in the stdlib.
That lib is available on all systems nowadays. Perhaps it wasn't
when we added it; I don't remember. If so, it's a good example of
why adding copies to the stdlib is not such a good idea :-)

>> If we add this now, there should at least be an exit strategy
>> to remove the code again, when OpenSSL ships with the same
>> code, IMO.
>>
>> Aside: BLAKE2 has already landed in OpenSSL 1.1.0:
>>
>> https://github.com/openssl/openssl/tree/master/crypto/blake2
>
> Except BLAKE2 in OpenSSL is severely castrated and tailored towards a
> very limited use case. The implementation does not support any of the
> useful advanced features like keyed hashing (MAC), salt,
> personalization, tree hashing and variable hash length.

I bet that the use cases they put into OpenSSL are what
most people will eventually use, so essentially the same
reasoning we use for putting stuff into the stdlib.

Besides, the code just landed in OpenSSL. It's likely they'll
continue to optimize it and possibly also add the variants
they left out initially.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Experts (#1, May 29 2016)
>>> Python Projects, Coaching and Consulting ... http://www.egenix.com/
>>> Python Database Interfaces ... http://products.egenix.com/
>>> Plone/Zope Database Interfaces ... http://zope.egenix.com/
________________________________________________________________________

::: We implement business ideas - efficiently in both time and costs :::

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/
http://www.malemburg.com/

Christian Heimes

May 29, 2016, 4:22:44 PM5/29/16
to Victor Stinner, Python-Dev@Python. Org, Raymond Hettinger, M.-A. Lemburg
On 2016-05-28 23:51, Victor Stinner wrote:
> Python 3.5 requires a 64 bit signed integer to build. Search for _PyTime
> type in pytime.h ;-)

Awesome! Thanks :)

Christian Heimes

Jun 12, 2016, 10:38:50 AM6/12/16
to pytho...@python.org
On 2016-05-25 12:29, Christian Heimes wrote:
> Hi everybody,
>
> I have three hashing-related patches for Python 3.6 that are waiting for
> review. Altogether the three patches add ten new hash algorithms to the
> hashlib module: SHA3 (224, 256, 384, 512), SHAKE (SHA3 XOF 128, 256),
> BLAKE2 (blake2b, blake2s) and truncated SHA512 (224, 256).
>
>
> SHA-3 / SHAKE: https://bugs.python.org/issue16113
> BLAKE2: https://bugs.python.org/issue26798
> SHA512/224 / SHA512/256: https://bugs.python.org/issue26834
>
>
> I like to push the patches during the sprints at PyCon. Please assist
> with reviews.

Hi,

I have unassigned myself from the tickets and will no longer pursue the
addition of new crypto hash algorithms. I might try again when blake2
and sha3 are more widely adopted and the opposition from other core
contributors has diminished. Acceptance is simply not high enough to be
worth the trouble.

Kind regards,
Christian