On Wed, Oct 01, 2025 at 07:24:50AM -0700, waxwing/ AdamISZ wrote:
> I'm curious about the case of P, R, s published in utxos to prevent usage
> of utxos as data. I think this answers in the half-affirmative: you can
> only embed data by leaking the privkey so that it (can) immediately fall
> out of the utxo set.
I think you can attack the setup here.
If you allow scriptPubKeys in the utxo set whose spending conditions
are HTLC/atomic-swap-like:
(pubkey A and preimage reveal of X)
OR (pubkey B and block height > H)
then you either set H to be arbitrarily far in the future and reveal
B's privkey, or choose a NUMS X with no known preimage and reveal
A's privkey. Either way the revealed key can't be used to sweep the
output, so the data stays in the utxo set.
If you don't allow those things (eg, by requiring such constructions
also have a (pubkey musig(A,B)) path) then I think you rule out NUMS-IPK
constructions, and end up making things like vaults ("hotkey with delay,
coldkey anytime") difficult to send to ("I have to sign with my cold
key to request funds?"), or, depending on what the utxo R,s is signing,
encourage key reuse.
> (To emphasize, this is different to the earlier observations (including by
> me!) that just say it is *possible* to leak data by leaking the private
> key; here I'm trying to prove that there is *no other way*).
That seems right to me.
I think if the signature scheme supported pubkey recovery (ie, s*G = R +
H(R,m)*P, and our "m" didn't commit to P as well), you could get around
this by just having P be the data, with no one, including the "signer"
able to recover the private key.
> However I still am probably in the large majority that thinks it's
> appalling to imagine a sig attached to every pubkey onchain.
I think the only thing achieved by embedding data in the utxo set (vs
an OP_RETURN output or witness data) is to bloat the utxo set; and if
that's the goal, it can equally easily be done with spendable outputs
that the attacker simply chooses not to ever spend. So that doesn't seem
like a terribly interesting solution to anything.
As far as embedding data in signatures goes, I think the following
scheme would allow you to publish data in a cryptographically-secure way,
with minimal lost funds:
0) Set up secret keys p and q, and a 32-byte secret k. H(a,b,..) is sha256
of a,b,.. concatenated.
1) Split your data into N 31-byte blocks, a1, a2, .., aN.
2) Calculate r0 as H(k*G). Calculate r1, .., rN as:
     r(i+1) = H(p, r(i)) + a(i+1)
3) Sign N+1 transactions in a chain spending pubkey p*G, using rN, r(N-1),
.., r1, r0 as nonces. All but the final tx should pay to a p*G output to
continue the chain; the final output should pay to q*G instead.
4) Once all transactions are sufficiently confirmed, spend the final
output with k as the secret nonce (and hence R=k*G as the public
nonce).
Recover the data using the following process:
1) From the final transaction, recover R=k*G, and calculate r0 as H(R).
Recover p from the previous transaction (the one signed with nonce r0),
p = (s0-r0)/H(r0*G, P, m0).
2) Working back along the chain for i = 1, .., N, recover ri from each
signature; ri = si - H(Ri, P, mi)*p. Recover the data ai as
ai = ri - H(p,r(i-1)).
Dealing with the points being 32 bytes (x-only, so the signer may end up
using the negated nonce) might require carrying over a sign-bit; but that
should be possible in the spare ~7 bits since each block was only 31
bytes not 32 bytes. Left as an exercise for the reader, etc.
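To make the encode/decode steps above concrete, here is a rough Python sketch.
It is illustrative only and makes several assumptions that are not part of the
scheme as described: it uses a simplified Schnorr signature s = r + H(R,P,m)*p
with full (x,y) points rather than BIP340's x-only encoding (which sidesteps
the sign-bit question entirely), the "transactions" are just placeholder
message strings, and the names sign/encode/decode are made up for the example.
The curve arithmetic is naive and not constant-time.

```python
import hashlib

# secp256k1 field prime, group order and generator
FP = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None: return b
    if b is None: return a
    if a[0] == b[0] and (a[1] + b[1]) % FP == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FP)
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % FP, -1, FP)
    x = (lam * lam - a[0] - b[0]) % FP
    return (x, (lam * (a[0] - x) - a[1]) % FP)

def mul(k, pt=G):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def b32(x): return x.to_bytes(32, "big")
def ser(pt): return b32(pt[0]) + b32(pt[1])

def H(*parts):
    """sha256 of the concatenated parts, reduced to a scalar mod N."""
    return int.from_bytes(hashlib.sha256(b"".join(parts)).digest(), "big") % N

def sign(secret, nonce, msg):
    """Simplified Schnorr (assumption: not BIP340): s = r + H(R,P,m)*p."""
    R = mul(nonce)
    return R, (nonce + H(ser(R), ser(mul(secret)), msg) * secret) % N

def encode(p, q, k, data, msgs):
    """Return sigs where sigs[i] uses nonce r_i, plus the final spend by q."""
    blocks = [data[i:i + 31].ljust(31, b"\0") for i in range(0, len(data), 31)]
    r = [H(ser(mul(k)))]                      # r0 = H(k*G)
    for blk in blocks:                        # r(i+1) = H(p, r(i)) + a(i+1)
        r.append((H(b32(p), b32(r[-1])) + int.from_bytes(blk, "big")) % N)
    # On chain these would appear in reverse order: rN first, r0 last.
    sigs = [sign(p, r[i], msgs[i]) for i in range(len(r))]
    return sigs, sign(q, k, b"final spend")   # final spend reveals R = k*G

def decode(sigs, final, P, msgs, length):
    """Recover the data from public information only."""
    r_prev = H(ser(final[0]))                 # r0 = H(k*G)
    R0, s0 = sigs[0]
    e0 = H(ser(R0), ser(P), msgs[0])
    p = (s0 - r_prev) * pow(e0, -1, N) % N    # p = (s0 - r0) / e0
    out = b""
    for (R, s), m in zip(sigs[1:], msgs[1:]):
        r = (s - H(ser(R), ser(P), m) * p) % N
        out += ((r - H(b32(p), b32(r_prev))) % N).to_bytes(31, "big")
        r_prev = r
    return out[:length]

p, q, k = H(b"demo p"), H(b"demo q"), H(b"demo k")
data = b"31 bytes per signature, hidden in the nonces of a change chain"
msgs = [b"tx %d" % i for i in range(-(-len(data) // 31) + 1)]
sigs, final = encode(p, q, k, data, msgs)
print(decode(sigs, final, mul(p), msgs, len(data)) == data)
```

Note that decode() takes only the signatures, the messages and the public key
P = p*G; the secret p falls out as soon as the final spend reveals k*G.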
I believe that the privkey p is secure prior to k*G being revealed,
since all the nonces are distinct hashes seeded by that privkey; and q
remains secure because k is never revealed.
If you wanted to not reuse the pubkey p*G repeatedly, you could tweak it
to be p0 = p, p(i+1) = p + H(k*G, p(i)), or similar. That would allow you
to use an n-of-n multisig to get multiple blocks in a single transaction
without seeming weird, eg.
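A minimal sketch of that tweak chain in Python, under stated assumptions: the
function name tweaked_keys is made up, and K_bytes stands in for whatever
serialization of k*G you settle on (any fixed encoding would do).

```python
import hashlib

# secp256k1 group order
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

def tweaked_keys(p, K_bytes, count):
    """p0 = p, p(i+1) = p + H(k*G, p(i)) mod N; returns [p0, .., p(count-1)]."""
    keys = [p]
    while len(keys) < count:
        h = hashlib.sha256(K_bytes + keys[-1].to_bytes(32, "big")).digest()
        keys.append((p + int.from_bytes(h, "big")) % N)
    return keys

# Hypothetical inputs, for illustration only.
demo_keys = tweaked_keys(12345, b"\x02" + b"\x11" * 32, 4)
print(len(set(demo_keys)))
```

Anyone who later learns both p and k*G can regenerate the same sequence, which
is what the decoder needs in order to walk the chain.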
I believe the only way to distinguish this from a normal transaction
pattern where a wallet has a change output, is via the final transaction
that reveals k*G, and detecting the relationship between k*G and the
spending conditions of the transaction that created the coin being spent.
That's already somewhat expensive to check for every spend, but could
be made more so by publishing k*G on some other medium (ie the data is
in the blockchain, but you obtain the txid and key to find the data
from elsewhere), or by revealing (k+x)*G where x is a random 20-bit
(?) number, and a significant but tractable amount of grinding is needed
to recover the desired k*G and decode the data -- the idea being that
that is tractable for someone who knows there is data at that txid,
but not tractable when performed on every signature in the blockchain
in order to filter data publication.
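The grinding step might look something like the Python sketch below. The
assumptions here are mine: a candidate is accepted when H(candidate)*G matches
the nonce point R0 of the signature made with r0, points are serialized as
full (x,y) pairs, and the function name grind is hypothetical. Each candidate
costs a scalar multiplication, which is where the expense comes from.

```python
import hashlib

# secp256k1 field prime, group order and generator
FP = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None: return b
    if b is None: return a
    if a[0] == b[0] and (a[1] + b[1]) % FP == 0:
        return None
    if a == b:
        lam = 3 * a[0] * a[0] * pow(2 * a[1], -1, FP)
    else:
        lam = (b[1] - a[1]) * pow((b[0] - a[0]) % FP, -1, FP)
    x = (lam * lam - a[0] - b[0]) % FP
    return (x, (lam * (a[0] - x) - a[1]) % FP)

def mul(k, pt=G):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        k >>= 1
    return acc

def H(pt):
    """sha256 of a serialized point, reduced to a scalar mod N."""
    raw = pt[0].to_bytes(32, "big") + pt[1].to_bytes(32, "big")
    return int.from_bytes(hashlib.sha256(raw).digest(), "big") % N

def grind(Q, R0, bits):
    """Given Q = (k+x)*G with x < 2**bits, find k*G such that H(k*G)*G == R0."""
    neg_G = (G[0], FP - G[1])
    cand = Q
    for x in range(1 << bits):
        if cand is not None and mul(H(cand)) == R0:
            return cand, x
        cand = add(cand, neg_G)        # step from (k+x-j)*G to (k+x-j-1)*G
    return None

# Small demo with a 6-bit offset so it finishes quickly in pure Python.
k = 0xC0FFEE1234567890ABCDEF
Q = mul(k + 37)
R0 = mul(H(mul(k)))
print(grind(Q, R0, 6) == (mul(k), 37))
```

With a 20-bit x that is up to about a million scalar multiplications per
candidate spend, which is fine for one txid but adds up quickly if you have to
do it for every signature.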
I think if you did 20 such transactions per block, each spending a single
20-of-20 tapscript multisig, you'd get 12400 bytes of data per block
(without violating standardness constraints), at a cost of ~11800vB, so
much less efficient than inscriptions, but slightly more efficient than
OP_RETURN, and significantly less detectable than either. I think Knots'
default policy currently allows up to 50-of-50 multisig in tapscript,
which would give you 31kB of data in ~26.6kvB of tx weight in a block.
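The payload figures are just transactions x signatures x 31 bytes. A quick
check in Python, assuming the 50-of-50 case also uses 20 transactions per
block (which is what the 31kB figure implies) and taking the vsize estimates
above as given:

```python
BLOCK_BYTES = 31  # data bytes carried per signature nonce

def data_per_block(txs, sigs_per_tx):
    """Bytes of embedded data if every signature carries one 31-byte block."""
    return txs * sigs_per_tx * BLOCK_BYTES

standard = data_per_block(20, 20)
knots = data_per_block(20, 50)      # assumption: same 20 txs per block
print(standard, knots)              # 12400 31000
# Data bytes per vbyte, using the ~11800vB and ~26.6kvB estimates.
print(round(standard / 11800, 2), round(knots / 26600, 2))  # 1.05 1.17
```

This counts 31 bytes for every signature; the one signature per chain that
uses r0 carries no data, so the real figure would be slightly lower.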
If you're regularly making payments from a particular wallet, I think
that procedure would allow you to encode data in your change outputs at
the rate of 31B/tx for no additional cost. Though the data would only be
recoverable once complete, and it's probably worth noting that I haven't
provided any security proofs...
Cheers,
aj