Strategy for Accommodating Unstable Platforms and Tools

Jeffrey Walton

Oct 31, 2015, 11:42:03 PM
to Crypto++ Users
Hi Everyone,

Some distributions test the library using an unstable platform/distribution that provides unstable tools. This is sometimes referred to as "bleeding edge". Though it's called "unstable" or "bleeding edge", it should not be received in a negative way. It's simply cutting edge, and it's paving the way for the next release cycle.

We have taken a few bug reports from testing on unstable platforms. The reports are usually an unexplained failure, like a hang or crash. Based on our Release Testing process (http://cryptopp.com/wiki/Release_Testing), I'm fairly confident in the code. I don't claim it's bug-free, but the assurance levels are fairly high.

We can't get access to the testing environment where the finding is produced. Sometimes the test rig (bleeding-edge platform and tools) is so delicate that we can't set it up to duplicate the issue. For example, the issue may depend on a particular version/check-in of the tools, and when we perform an "apt-get update/upgrade" cycle, we skip over the problem package. As an example, there was a hang in SHA on bleeding edge when ASM was in effect. We were not able to reproduce it or access the test environment, and eventually a VM was TAR'd and made available to us. By disabling ASM or moving ResizeBuffers() out-of-line, we cleared a hang and a crash (see http://github.com/weidai11/cryptopp/pull/46/files). And as soon as we updated the toolchain in the VM, the issue resolved itself.

Wei suggested we disable ASM, but I [mildly] disagree. My opinion differs because I still see benefit to the code. First, the code is genuinely faster than what the compiler produces. I doubt the compiler will ever be able to produce faster AES or Whirlpool code than a human when CPU acceleration is available. Second, we have to use ASM for down-level toolchains, like Visual Studio 2005, when using instructions like RDRAND and RDSEED. Though the instructions are fairly recent (circa 2012/2013), we can still produce usable code in Visual Studio 2005 that utilizes the instructions.

I'd like to solicit feedback regarding a reasonable strategy when working with unstable platforms or tools. That is, what should our policy be?

Below are some examples to help get you pointed in the right direction. I'm just tossing them out there as examples. I'm not showing my hand yet because I don't want to influence the process.

Jeff

Potential Strategies

* Don't support unstable; only support stable
* Don't support unstable+ASM; require CRYPTOPP_DISABLE_ASM
* Fully support unstable
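
For readers weighing the second option, here is a minimal sketch of how a build-time switch such as CRYPTOPP_DISABLE_ASM can gate a hand-written assembly path. Only the macro name comes from this thread. The functions Rotl and RotlPortable are hypothetical illustrations, not the library's code, and the inline assembly assumes a GCC-compatible compiler targeting x86 or x86-64.

```cpp
#include <cstdint>

// Portable fallback: plain C++ that any conforming compiler can build.
// (Hypothetical helper for illustration; not taken from Crypto++.)
inline std::uint32_t RotlPortable(std::uint32_t x, unsigned r) {
    r &= 31;
    return r == 0 ? x : (x << r) | (x >> (32 - r));
}

// Rotate left, using inline assembly only when it has not been disabled
// and the toolchain/architecture is one the asm was written for.
inline std::uint32_t Rotl(std::uint32_t x, unsigned r) {
#if !defined(CRYPTOPP_DISABLE_ASM) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
    r &= 31;
    // AT&T syntax: rotate the 32-bit operand left by the count in CL.
    __asm__("roll %%cl, %0" : "+r"(x) : "c"(r) : "cc");
    return x;
#else
    // ASM disabled (or unsupported target): use the portable path.
    return RotlPortable(x, r);
#endif
}
```

Building with the macro defined (for example, passing -DCRYPTOPP_DISABLE_ASM to the compiler) selects the portable branch without touching any call sites, which is what makes this option cheap to apply on an unstable toolchain.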


Jean-Pierre Münch

Nov 1, 2015, 9:28:45 AM
to cryptop...@googlegroups.com
On 01.11.2015 at 04:42, Jeffrey Walton wrote:
> Wei suggested we disable ASM, but I [mildly] disagree. My opinion differs because I still see benefit to the code. First, the code is genuinely faster than what the compiler produces. I doubt the compiler will ever be able to produce faster AES or Whirlpool code than a human when CPU acceleration is available. Second, we have to use ASM for down-level toolchains, like Visual Studio 2005, when using instructions like RDRAND and RDSEED. Though the instructions are fairly recent (circa 2012/2013), we can still produce usable code in Visual Studio 2005 that utilizes the instructions.
I'm also against disabling ASM for the same reasons.


> I'd like to solicit feedback regarding a reasonable strategy when working with unstable platforms or tools. That is, what should our policy be?
My suggestion:

Full stable support in each case.
Unstable support as far as possible.
Definition of "as far as possible":
a) If the change that presumably caused the error will make it into "stable" -> we need to address this beforehand (assuming we're talking about something like Ubuntu pre-RCs or similar)
b) If the bug is "simple" to find and fix -> fix it.
c) If the configuration is "easy" to acquire and likely for (some) people -> fix it.

I think that should capture it.
Of course, in the optimal case the above would result in full unstable support. But this may not be desirable if we end up hunting down one bug for days or weeks for one odd configuration that is very unlikely to occur in the wild.
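
One way to read criteria a) through c) above is as a small triage rule. The sketch below is only an illustration of that reading: the type and field names are hypothetical, and treating the three criteria as alternatives (any one is enough to justify a fix) is an assumption, not something stated in the post.

```cpp
// Hypothetical summary of a bug report filed against an unstable platform.
struct UnstableReport {
    bool changeHeadedForStable;  // a) the triggering change will reach "stable"
    bool simpleToFindAndFix;     // b) the bug is "simple" to find and fix
    bool configEasyAndLikely;    // c) the setup is "easy" to acquire and likely
};

enum class Action {
    Fix,    // spend the time now
    Defer   // do not chase it for days or weeks
};

// Assumption: meeting any single criterion is enough to justify a fix.
inline Action Triage(const UnstableReport& r) {
    if (r.changeHeadedForStable || r.simpleToFindAndFix || r.configEasyAndLikely)
        return Action::Fix;
    return Action::Defer;
}
```

A report that meets none of the three criteria falls through to Defer, which corresponds to the odd, hard-to-reproduce configuration described above.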

BR

JPM

Mobile Mouse

Nov 1, 2015, 10:57:51 AM
to Jean-Pierre Münch, cryptop...@googlegroups.com
My $0.02.

Every problem report should be looked at - we have no choice.

If the issue is with our code (some deviation from the standard that managed to hide until now), we try to fix it.

If the cause is a change outside of our code, but a fix (a) is easy to make, (b) seems to future-proof our code, and (c) does not break the existing platforms (except maybe those being sunset) - we try to fix it.

Otherwise, in general we apologize for our lack of resources to address every possible platform, and leave it as is. We might also encourage the user to file a bug report with their platform.

Zooko Wilcox-O'Hearn

Nov 1, 2015, 12:39:36 PM
to Jeffrey Walton, Crypto++ Users
> * Don't support unstable; only support stable
> * Don't support unstable+ASM; require CRYPTOPP_DISABLE_ASM
> * Fully support unstable

Three inter-related issues here:

Issue 1: What does "support" mean? Empirically, it seems to mean that
Jeffrey will spend his time reproducing and debugging crashes on
unstable platforms. I would tend to vote against that, but I guess it
is up to Jeffrey?

Issue 2: On the other hand, if we set up a Buildbot then we could
provide information to people about whether Crypto++ builds and passes
tests on a variety of platforms, including unstable platforms, without
a major ongoing expenditure of anyone's (Jeffrey's) time. That is: it
is an expenditure of someone's time to set up a Buildbot for a
specific platform, but it is a pretty low ongoing cost to keep it
running after that.

This would also let us know whenever a patch that we committed caused
tests to go from passing to failing on any platform, which would be
potentially valuable information.

Issue 3: I am in favor of disabling ASM by default! I maintain a
library built on top of Crypto++ — pycryptopp — which is used by
“Tahoe-LAFS”, the cryptographic distributed storage system. We decided
after collecting several years of data about bugs that the added
bugginess of asm implementations was too much to justify the added
performance: https://tahoe-lafs.org/trac/pycryptopp/ticket/85

So in any case, I am *certainly* in favor of the smaller step of
disabling asm on unstable platforms. You could imagine that enabling
asm is a sort of opt-in step that only happens for specific platforms
where it has been proven (perhaps with the aid of a Buildbot) to be
safe.
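
The opt-in idea can be sketched as a default-off allowlist: ASM is enabled only for platform/toolchain pairs that have already been shown to pass the tests (for example, by a Buildbot). Everything named in this sketch is hypothetical, including BuildTarget, AsmAllowed, and the placeholder strings used in the usage note; it illustrates the policy and is not an existing Crypto++ or Buildbot interface.

```cpp
#include <string>
#include <vector>

// Hypothetical description of a build configuration.
struct BuildTarget {
    std::string platform;   // placeholder label, e.g. "platform-a"
    std::string toolchain;  // placeholder label, e.g. "toolchain-1"
};

// Opt-in policy: ASM is allowed only when the exact platform/toolchain
// pair appears in the list of configurations already proven to pass.
inline bool AsmAllowed(const std::vector<BuildTarget>& verified,
                       const BuildTarget& target) {
    for (const BuildTarget& v : verified) {
        if (v.platform == target.platform && v.toolchain == target.toolchain)
            return true;
    }
    return false;  // default: ASM stays off for anything unproven
}
```

With a verified list containing only {"platform-a", "toolchain-1"}, a query for {"platform-a", "toolchain-2"} returns false, so a new or unstable toolchain gets the portable code until someone records a passing test run for it.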

Regards,

Zooko