Publishing text of 2017-era proposed Bitcoin opcode OP_FFS (Fold Function Stream)

98 views
Skip to first unread message

Ethan Heilman

unread,
Sep 9, 2024, 8:59:35 AMSep 9
to Bitcoin Development Mailing List
OP_FFS was an idea written up by Jeremy Rubin in 2017, during an email
conversation with me about a streaming hash function bitcoin opcode. I
am sharing it as it is sometimes referenced in public discussions but
was not previously public and it felt like it should be public. For
instance there was some discussion referring to OP_FFS on [the
bitcoin-dev mailinglist in 2019]
(https://lists.linuxfoundation.org/pipermail/bitcoin-dev/2019-October/017355.html)
and more recently on [twitter in
2024](https://x.com/bitgould/status/1804535410904764666).

This should not be read as a BIP proposal, BIP draft or endorsement,
there are just notes written up during private correspondence. This is
being published with Jeremy's express permission.

Can also be read as a gist:
https://gist.github.com/EthanHeilman/b12b9c834c61109aaccb6cb9ffc675e8

From Jeremy's email to me:

--------BEGIN BIP----------------
<pre>
BIP: XXX
Layer: Consensus (soft fork)
Title: FOLDFUNCTIONSTREAM
Author: Jeremy Rubin <j...@rubin.io>
Ethan Heilman <eth...@gmail.com>
Comments-Summary: No comments yet.
Status: DRAFT
Type: Standards Track
Created: 2017-26-05
License: PD
</pre>

==Abstract==

This BIP describes a new opcode (FOLDFUNCTIONSTREAM) for the Bitcoin
scripting system that adds a number of functional folds across data that
are efficient to compute.


==Summary==

FOLDFUNCTIONSTREAM redefines the existing NOP4 opcode.
When executed, if any of the following conditions are true, the script
interpreter will terminate with an error:

* the stack is not at least 2 elements
* The top item is not a number (N) from -128 to 127. Let `M = 0 if N
== -1 else abs(N) - (1 if N < 0 else 0)`
* the 2nd to the top item on the stack is not a valid {fold type ||
fold midstate}
* the stack is not at least M+2 elements
* any items in between 2nd to top and `M +2` to the top could not be
applied under the fold function number given the midstate

Otherwise, script execution will proceed by popping the top number, stream
processing and popping the top M elements in reverse order, and updating the
midstate on the stack. If the top item on the stack is negative, then the
stream is terminated and midstate is finalized.

==Motivation==

There are many computationally expensive constructs that enable useful types of
scripts, but are difficult to implement in Bitcoin safely.

For instance, CAT would enable concatenation of input arguments
followed by hashing.
However CAT and DUP can be used to cause exponential memory growth.

FOLDFUNCTIONSTREAM with a SHA256 argument could be used to hash an arbitrary
length concatenated string efficiently without worry of unbounded memory
growth.


<a> <b> <c> <FOLDFUNCTION_SHA256> 3 FOLDFUNCTIONSTREAM -1
FOLDFUNCTIONSTREAM

Which can be shortened to (because of the negative number optimization
I made...)

<a> <b> <c> <FOLDFUNCTION_SHA256> -4 FOLDFUNCTIONSTREAM

is equivalent to

<a> <b> CAT <c> CAT SHA256



==Fold Functions==

<pre>
{| class="wikitable sortable" style="width: auto; text-align: center;
font-size: smaller; table-layout: fixed;"
!Name
!Tag
!Purpose
|Midstate
|Argument
|- SHA256
| 1
| Computes SHA256(s_0 || s_1 || ... || s_n)
| Serialized SHA256 midstate
| Arbitrary Byte Vector
|- FACTORS
| 2
| Computes i_1 | i_0 /\ i_2 | i_0 /\ ... /\ i_n | i_0
| Arbitrary Byte Vector interpreted as unsigned integer
|- MAST
| 3
| Computes SHA256(SORT_CAT(SHA256(SORT_CAT(SHA256(s_0), s_1)), ...), s_n)
| s_0 is an arbitrary length vector, s_1...s_n are 32 byte vectors
|- SORT
| 4
| Computes SORT([i_0, i_1, ..., i_n])
| Arbitrary precision integers
|- MODMULT
| 5
| Computes prod(i_1...i_n) mod i_0
| i_* is an arbitrary integer
|- MODADD
| 6
| Computes sum(i_1...i_n) mod i_0
| i_* is an arbitrary integer
|- SETCONTAINS
| 7
| Computes i_2...i_n in i_0 given i_1
| i_0 is an accumulator, i_1 is a witness, i_2...i_n are candidates
|- UNDEFINED_FAIL
| 0
| Immediately returns FALSE and ends script
| n/a
|- UNDEFINED_SUCCEED
| 8-255
| Immediately returns FALSE and ends script
| n/a
</pre>
==Examples==

==Specification==

Refer to the reference implementation, reproduced below, for the precise
semantics and detailed rationale for those semantics.

<pre>
int main() {
// whatever c++ code
}
</pre>

==Reference Implementation==

A reference implementation will be provided by the following pull request:

https://github.com/bitcoin/bitcoin/pull/XXX


==Deployment==

This BIP is to be deployed by "versionbits" BIP9 using bit 6 and incrementing
the SegWit script version. Signalling for bit 6 can happen safely before SegWit
activates because FOLDFUNCTIONSTREAM is only usable from SegWit scripts.

For Bitcoin '''mainnet''', the BIP9 '''starttime''' will be midnight
xxxxxxxxxxxx UTC (Epoch timestamp XXXXXXXXXX) and BIP9 '''timeout'''
will be midnight xxxxxxxxxxxx UTC (Epoch timestamp XXXXXXXXXX).

For Bitcoin '''testnet''', the BIP9 '''starttime''' will be midnight
xxxxxxxxxxxxxxx UTC (Epoch timestamp XXXXXXXXXX) and BIP9
'''timeout''' will be midnight xxxxxxxxxxxx UTC (Epoch timestamp
XXXXXXXXXX).


==Credits==

The orginal idea of stream hash opcodes to solve concatenation exponential
memory came from Ethan Heilman.

The generalization of this idea to a folding function with unbounded stream
operations came from Jeremy Rubin.


==References==

[https://github.com/bitcoin/bips/blob/master/bip-0009.mediawiki BIP 9]
Versionbits


==Copyright==

This document is placed in the public domain.

------------END BIP------------------

One thing I'm debating is if another order of the arguments makes more
sense. I initially had the midstate/type go first, but I realized that
made problems with evaluating with data passed in from a
scriptsig/witness, so I switched it. Unclear if this way has some
other problem I missed.

Also the UNDEFINED behavior is to make it easy to extend, but it also
makes it less safe to pass in untrusted stream types... tradeoffs?
(OP_WITHIN helps with bound checking the arg efficiently... arg *
OP_WITHIN(min_expect, max_expect, arg) outputs either arg or 0).
Reply all
Reply to author
Forward
0 new messages