Some comments/questions on the general structure of the scheme:
When I started thinking about ways to change the algorithm, I started to appreciate it more :) Although this algo is not specific to Bitcoin I'm viewing it 100% through that lens here. Some thoughts:
We want this CISA algorithm to have the property that it doesn't require the blockchain (and its verifiers) to incur linear cost in the number of signers/signatures. For a 100 input transaction, we get big gains from the owner or owners of the inputs choosing to use this algo, but that would mostly be lost if either the verifying was linear in the number, or if the size of the signature was linear in the number. So to avoid that we want a (R, s) structure to be actually published, not an (R1..Rn, s) or a (R, s1..sn). That pretty much forces us to make a sum R for all the individual component's R-values, and the same for s.
However it doesn't quite force us to add literally everything. The pubkeys *can* be kept separate, because they are retrieved implicitly from the existing blockchain record, they are not published with the signature (taproot). (Technically the same comment applies to the message being signed). This allows us to use the more "pedestrian", "safe" idea; we are not aggregating keys (as in MuSig) so we can actually add each with its own challenge hash: sum( challenge_hash_i * X_i). This may worry you that there is a performance issue because the verifier has to iterate through that whole list ( the verification equation being: sG =?= R + c_1 X_1 + c_2X_2 + .. ), but the paper specifically claims that comparing this with just batch verifying the individual signatures (i.e. without CISA), this is twice as fast.
So one could simplistically say "OK that's the pubkey side, they're treated individually so we don't have to worry about that, but what about adding the R values?" ("worry" here means: trivial key subtraction attacks or sophisticated Wagner/ROS grinding). And here what is done is basically the same as in MuSig2, which is to say, by breaking the nonce into two components and including an additional challenge hash, you prevent the counterparty/adversary from grinding R values successfully. Note that the "b" coefficient used here is more explicit about hashing the full context, than it was in MuSig2: it's hashing each individual pubkey and message as well as the R2 subcomponents for each party. This is vaguely similar to "client side validation" ideas: it's not really "validation" as in state updates, but it's having the more complex/expensive part of the calculation being done in the coordination before anything goes on-chain, and allowing us to just use a single "R" value onchain that we know is safe.
(Side note: it's worth remembering that a lot (maybe a huge majority?) of the usage of CISA will be a single signer of multiple inputs; for these cases there is not the same security arguments required, only that the final signature is not leaking the private key!).
That side note reminds me of my first question: would it not be appropriate to include a proof of the zero knowledgeness property of the scheme, and not only the soundness? I can kind of accept the answer "it's trivial" based on the structure of the partial sig components (s_k = r_k1 + br_k2 + c_k x_k) being "identical" to baseline Schnorr?
The side note also raises this point: would it be a good idea to explicitly write down ways in which the usage of the scheme/structure can, and cannot, be optimised for the single-party case? Intuitively it's "obvious" that you may be able to streamline it for the case where all operations happen on the same device, with a single owner of all the private keys. I realize that this is a thorny point, because we explicitly want to account for the corruption of parties that are "supposed" to be the same as the honest signer, but aren't.
And my last question is about this multi-component-nonce technique:
Did you consider the idea of e.g. sending proofs of knowledge of R along with R in the coordination step? This would keep the same number of rounds, and I'm assuming (though not sure exactly) that it makes the security proof significantly simpler, but my guess is you mostly dismiss such approaches as being too expensive for, say, constrained devices? (I imagine something like: 2 parties say, X1 sends (R1, pi_R1) and same for X2, to coordinator, then sum directly for overall R; here pi_R1 is ofc just a schnorr sig on r). If we're talking about bandwidth the current "ctx" object is already pretty large, right, because it contains all the pubkeys and all the messages (though in bitcoin they could be implicit perhaps).
(I won't mention the other idea, which is going back to MuSig1 style and just committing to R, because that's what both MuSig2 and FROST went away from, preferring fewer rounds.)
By the way after writing this overly long post I realised I didn't even get in to the really tricky part of the algorithm, the "check our key and message appears once" part because of the multisig-to-aggregated-sig transformation and the hole previously identified in it, which to be fair is the most interesting bit. Oh well, another time!
Cheers,
AdamISZ/waxwing
On Thursday, April 17, 2025 at 10:38:46 AM UTC-6 Jonas Nick wrote: