[BIP Proposal] Add sp() output descriptor format for BIP352

Craig Raw

Dec 4, 2025, 3:40:23 AM
to Bitcoin Development Mailing List
Hi all,

There is a practical need for a silent payments output descriptor format in order to enable wallet interoperability and backup/recovery. There has been some prior discussion on this topic [1][2] which this BIP proposal builds on:


In summary, a new top-level script expression sp() is defined, which takes as its first argument one of two new key expressions:
  • spscan1q... which encodes the scan private key and the spend public key
  • spspend1q... which encodes the scan private key and the spend private key
The outputs may then be generated by combining this key material with the sender input public keys. 

In order to reduce the scanning burden, a block height may be optionally specified in the sp() expression as a second argument for a wallet birthday. Finally, zero or more positive integers may be specified as further arguments to scan for additional BIP352 labels. The change label (m = 0) is implicitly included.

Examples:
sp(spscan1q...)
sp([deadbeef/352'/0'/0']spscan1q...,900000)
sp(spspend1q...,842579,1,2,3)
sp([deadbeef/352'/0'/0']spscan1q...,900000,1,5,10)
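To make the argument layout concrete, here is a minimal Python sketch that splits the example strings above into their parts. It is only an illustration of the syntax as summarised in this email: the function name and the returned field names are hypothetical, the key expression is not decoded or validated, descriptor checksums are not handled, and the first integer is assumed to always be the birthday.

```python
import re

def parse_sp_descriptor(desc):
    """Split an sp() descriptor string into its parts (illustrative sketch)."""
    match = re.fullmatch(r"sp\((.*)\)", desc.strip())
    if match is None:
        raise ValueError("not an sp() expression")
    args = match.group(1).split(",")

    # Optional key origin, e.g. [deadbeef/352'/0'/0'], precedes the key.
    origin, key = None, args[0]
    origin_match = re.fullmatch(r"\[([^\]]*)\](.*)", key)
    if origin_match:
        origin, key = origin_match.groups()
    if not key.startswith(("spscan1q", "spspend1q")):
        raise ValueError("expected an spscan or spspend key expression")

    # Assumption: the first integer is the birthday height, the rest are labels.
    numbers = [int(a) for a in args[1:]]
    birthday = numbers[0] if numbers else None
    extra_labels = numbers[1:]
    if birthday is not None and birthday < 0:
        raise ValueError("birthday height must not be negative")
    if any(m <= 0 for m in extra_labels):
        raise ValueError("labels must be positive integers")

    return {
        "origin": origin,
        "key": key,
        # spscan carries no spend private key, so it can only watch.
        "watch_only": key.startswith("spscan1q"),
        "birthday": birthday,
        # The change label (m = 0) is implicitly included.
        "labels": sorted({0, *extra_labels}),
    }
```

For example, `parse_sp_descriptor("sp(spspend1q...,842579,1,2,3)")` yields a birthday of 842579 and the label list `[0, 1, 2, 3]`.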

--Craig


Oghenovo Usiwoma

Dec 4, 2025, 5:11:55 AM
to Craig Raw, Bitcoin Development Mailing List
Hi Craig, thank you for taking this up. I have the following comments, based on a light inspection of your original email.

> In order to reduce the scanning burden, a block height may be optionally specified in the sp() expression as a second argument for a wallet birthday.

I'm not sure adding a block height does much to reduce scanning burden. We can already scan from the taproot activation height and it won't matter much anyway, because the chain will get longer and this only helps temporarily.

Users can also specify a "wallet birthday" in their wallets which can be used for scanning. Is there any reason to add the birthday to the descriptor? Other descriptors do not do this.

> Finally, zero or more positive integers may be specified as further arguments to scan for additional BIP352 labels. The change label (m = 0) is implicitly included.

In https://github.com/bitcoin/bips/blob/master/bip-0352.mediawiki#backup-and-recovery , a strategy to recover funds from labels is specified. We can attempt to make this stronger and avoid the need to also include an integer for labels. For example, we can set the maximum number of labels in the bip; wallets will only have to scan for this max number of labels during recovery and if a wallet goes beyond this maximum number, they have gone beyond the bip and are now responsible for ensuring full recovery of funds. 

> In summary, a new top-level script expression sp() is defined, which takes as its first argument one of two new key expressions:
> - spscan1q... which encodes the scan private key and the spend public key
> - spspend1q... which encodes the scan private key and the spend private key

Given the above points, I argue that we don't need to introduce new scan and spend key formats, and we can use "sp(scankey,spendkey)".

I'm happy to hear any counter arguments you have.

Novo

Craig Raw

Dec 4, 2025, 6:07:49 AM
to Oghenovo Usiwoma, Bitcoin Development Mailing List
Hi Novo,

Responses inline:

> I'm not sure adding a block height does much to reduce scanning burden. We can already scan from the taproot activation height and it won't matter much anyway, because the chain will get longer and this only helps temporarily.

I'm not sure I follow here. Since we need to retrieve and compute possible matching outputs for all eligible public keys in the chain, having a block height later than the Taproot activation date can make a significant difference, and will make a greater difference in future as the chain grows.

> Is there any reason to add the birthday to the descriptor? Other descriptors do not do this.

The difference between this and other descriptors is that it cannot describe outputs without reference to the blockchain. This, combined with the significant computational burden which other descriptors do not have to bear, is reason enough I think to include it here as an optional argument.

> For example, we can set the maximum number of labels in the bip; wallets will only have to scan for this max number of labels during recovery and if a wallet goes beyond this maximum number, they have gone beyond the bip and are now responsible for ensuring full recovery of funds. 

The problem with this approach is that scanning for each additional label adds incrementally and non-trivially to the computational burden. For each label, there is an EC point addition and comparison against all taproot outputs for an eligible transaction. Some benchmark numbers indicating the relative cost of each additional label are in [1], demonstrating that scanning for 100k labels is cost-prohibitive. As an aside, I will add that labels have a limited use case, and in most cases a new BIP44 account is a better choice for additional silent payment addresses based on the same seed.

> Given the above points, I argue that we don't need to introduce new scan and spend key formats, and we can use "sp(scankey,spendkey)".

While not strictly necessary, using spscan and spspend key expressions makes for a much better user experience and reduces the chance of user error. With this encoding we get:
  1. A self-describing format which makes the use and sensitivity of the key material immediately obvious
  2. The advantages of Bech32m encoding, including strong error detection and unambiguous characters
  3. Safety from accidentally mixing different unrelated scan and spend keys
  4. Versioning to indicate silent payments version 0
  5. A similar format to an xpub, the display of which is a common user interface element in many wallets which makes things simpler for wallet developers and users alike
--Craig

Oghenovo Usiwoma

Dec 12, 2025, 2:28:55 AM
to Craig Raw, Bitcoin Development Mailing List
Hi Craig,

I see how adding the birthday to the descriptor string could be beneficial for anyone trying to use a third-party scanning server; they will only have to submit the descriptor string, and the server will automatically determine what height to start scanning from. However, without the birthday, the descriptor will still be able to describe its outputs. The birthday can be collected through some other means, as we do with other descriptors today.

I'm opposed to using new key formats because I do not think we have enough justification to do so. With existing key formats, users and wallets will be able to use their existing master key to generate silent payment outputs using a descriptor like: sp(<master_key>/352h/1h/0h/1h/0,<master_key>/352h/1h/0h/0h/0).


> A self-describing format which makes the use and sensitivity of the key material immediately obvious
> The advantages of Bech32m encoding, including strong error detection and unambiguous characters
I'm not sure these are strong enough to warrant new key formats.


> Safety from accidentally mixing different unrelated scan and spend keys
I'm not sure what the chance of this happening is. We already have descriptors with multiple keys having complex relationships. Compared to those, the "sp()" is simple. Users are supposed to back up the entire descriptor string. If this is done properly, then the keys should not be mixed up.


> Versioning to indicate silent payments version 0
We should use the descriptor prefix to indicate the silent payments version instead of the keys.

> A similar format to an xpub, the display of which is a common user interface element in many wallets, which makes things simpler for wallet developers and users alike
Things can be even simpler without the new key format.

From your original email:

> Finally, zero or more positive integers may be specified as further arguments to scan for additional BIP352 labels.
IIUC, this creates a descriptor with a variable length. What if we encoded multiple labels in one number? For example, labels 1, 5, 10 are encoded into a 64-bit number by setting the corresponding bit positions to '1' so that the final number is '1058'. Using one number to encode the labels is very appealing to me.
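A quick Python sketch of the bit-position encoding described above, for illustration only. The function names are hypothetical and not part of any proposal, and the sketch assumes bit 0 is left unused because the change label (m = 0) is implicit, which leaves positions 1 to 63 in a 64-bit value.

```python
def encode_labels(labels):
    """Pack label numbers into one integer by setting bit m for each label m."""
    mask = 0
    for m in labels:
        if not 1 <= m <= 63:
            raise ValueError("only labels 1..63 fit in a 64-bit mask")
        mask |= 1 << m
    return mask

def decode_labels(mask):
    """Recover the label numbers whose bits are set in the mask."""
    return [m for m in range(1, 64) if (mask >> m) & 1]

print(encode_labels([1, 5, 10]))  # 2 + 32 + 1024 = 1058
print(decode_labels(1058))        # [1, 5, 10]
```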

Kind regards,
Novo

pyth

Dec 12, 2025, 5:16:09 AM
to Oghenovo Usiwoma, Craig Raw, Bitcoin Development Mailing List
Hi Novo,


> The birthday can be collected through some other means, as we do with other descriptors today.

Speaking from the wallet side, I'd like other descriptor types to have an **optional** field for the birthday; being forced to back it up separately is not the best design imho.

Best,
Pyth

Craig Raw

Dec 12, 2025, 5:16:32 AM
to Oghenovo Usiwoma, Bitcoin Development Mailing List
Hi Novo,

Responses inline:

> However, without the birthday, the descriptor will still be able to describe its outputs. The birthday can be collected through some other means, as we do with other descriptors today.

Indeed, that is why it is an optional argument. Again however, other descriptors do not have to bear the very significant computational overhead that sp() descriptors do. For many deployment contexts, this will effectively make the birthday a requirement to retrieve all silent payment outputs in a wallet within a usable time frame. Other descriptors do not share such a stark usability challenge.

> With existing key formats, users and wallets will be able use their existing master key to generate silent payment outputs using a descriptor like: sp(<master_key>/352h/1h/0h/1h/0,<master_key>/352h/1h/0h/0h/0).

You are requiring the user to specify their master xprv in an output descriptor, even for a watch-only wallet. This is a non-starter. Today, output descriptors are often stored in clear text alongside a hardware wallet or similar as privacy-sensitive but not directly security-sensitive information.

> > The advantages of Bech32m encoding, including strong error detection and unambiguous characters
> I'm not sure these are strong enough to warrant new key formats.

To the contrary, the use of output descriptors today means they are often written down (sometimes into durable media). In this context, the advantages of strong error detection and unambiguous characters are significant.

> > Safety from accidentally mixing different unrelated scan and spend keys
> I'm not sure what the chance of this happening is.... If this is done properly, then the keys should not be mixed up.

It might also be done intentionally, with unexpected results. Regardless of the cause, having one key expression discourages this.

> We should use the descriptor prefix to indicate the silent payments version instead of the keys.

Indeed, silent payments version 1 may use a different script expression. The versioning here is still useful though.

> Things can be even simpler without the new key format.

This is an unsubstantiated statement, suggesting that two key expressions are somehow simpler than one. Again, it is simpler for wallet developers and users to adapt to formats that are similar to those they are already familiar with - and they are very familiar with xpubs.

> IIUC, this creates a descriptor with a variable length. What if we encoded multiple labels in one number? For example, labels 1, 5, 10 are encoded into a 64-bit number by setting the corresponding bit positions to '1' so that the final number is '1058'. Using one number to encode the labels is very appealing to me.

Descriptors are generally of variable length, so I'm not sure why this is so appealing. Not only does this limit the range from 1 to 63, it has the added disadvantage of making this part of the descriptor unreadable to most humans.

Best regards,
Craig

Oghenovo Usiwoma

Dec 14, 2025, 3:33:03 PM
to Craig Raw, Bitcoin Development Mailing List
Hi Craig,

> > IIUC, this creates a descriptor with a variable length. What if we encoded multiple labels in one number? For example, labels 1, 5, 10 are encoded into a 64-bit number by setting the corresponding bit positions to '1' so that the final number is '1058'. Using one number to encode the labels is very appealing to me.

> Descriptors are generally of variable length, so I'm not sure why this is so appealing. Not only does this limit the range from 1 to 63, it has the added disadvantage of making this part of the descriptor unreadable to most humans.

I was thinking about a descriptor with thousands of labels, and I thought I could encode a large list of numbers into a single 64-bit number, but that’s impossible.

Consider this idea: if we strictly increment the label by one, then just writing the max label, '10' in this example, should be sufficient to tell the wallet to check for labels 1..10. The previous example used "1, 5, 10" as labels, but I'm not sure why we would use label '1' and then skip to '5', so writing just the max label should work?


I'm still not in support of using new key formats for the descriptor, but I'll see if they receive popular support.

Novo.

Sebastian Falbesoner

Dec 22, 2025, 4:14:15 PM
to Bitcoin Development Mailing List
Hi Craig,

a few comments about the cost of label scanning, as this topic comes up
repeatedly:


> The problem with this approach is that scanning for each additional label
> adds incrementally and non-trivially to the computational burden.

I think this statement is a misconception or at best only half of the truth,
likely based on the assumption that all existing and future SP wallets would
implement scanning using the same method.


> For each label, there is an EC point addition and comparison against all
> taproot outputs for an eligible transaction.

Iterating through labels and calculating taproot outputs to match against is only
one way to implement scanning. Another one, as laid out in BIP-352 [1], is to
do it backwards: for each taproot output, subtract the output candidate and
check if the result is in the label cache. One advantage of this approach is
that its scanning cost is independent of the number of labels to scan for, as
the lookup time in the cache is constant and efficient if a proper data
structure is used.
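
The difference between the two approaches can be sketched in a few lines of
Python. This is a toy model and not real cryptography: integers modulo a prime
under addition stand in for EC points, and Q, G, label_point() and both
function names are made up for the illustration. It also leaves out details of
the real procedure, such as the additional check against the negated output
(taproot output keys are x-only) and the loop over the output counter k.

```python
Q = 2**61 - 1  # toy group order; integers mod Q stand in for EC points
G = 5          # arbitrary stand-in for the generator point

def label_point(m):
    """Toy stand-in for the point a wallet derives for label m."""
    return m * G % Q

def scan_by_labels(p_k, outputs, label_points):
    """Iterate over labels: one group addition per label."""
    out_set = set(outputs)
    found = [p_k] if p_k in out_set else []
    for lp in label_points:
        candidate = (p_k + lp) % Q
        if candidate in out_set:
            found.append(candidate)
    return found

def scan_by_cache(p_k, outputs, label_cache):
    """Work backwards: one subtraction and one cache lookup per output."""
    found = []
    for out in outputs:
        diff = (out - p_k) % Q
        # diff == 0 is the unlabelled output; otherwise look up the cache.
        if diff == 0 or diff in label_cache:
            found.append(out)
    return found

labels = [label_point(m) for m in range(1, 1001)]
cache = set(labels)  # hash-based set, so lookups take constant time on average
p_k = 123456789
outputs = [42, (p_k + label_point(7)) % Q, p_k, 99]
print(sorted(scan_by_labels(p_k, outputs, labels)))
print(sorted(scan_by_cache(p_k, outputs, cache)))
```

Both calls find the same two outputs, but the first loop runs once per label
while the second runs once per output, no matter how large the cache is.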

To prove that point, I've extended the scanning benchmark of the libsecp SP
module PR #1765 [2] with an _actual_ label cache using a third-party hash
table implementation [3]:
https://github.com/theStack/secp256k1/commit/f9f41adcedaca98aa4f3f65e2782e25b2124bf85

The scanning time is measured with three different label cache sizes: no
entries (i.e. empty lookup function stub, no hash-map involved), one entry
("tiny"), and one million entries ("huge"). Running the benchmark for scanning
a transaction with 10 taproot outputs leads to the following output:

$ ./build/bin/bench sp_scan_with_labels
Benchmark                               ,    Min(us)    ,    Avg(us)    ,    Max(us)

sp_scan_with_label_cache_empty_L=0      ,    61.3       ,    61.4       ,    61.4
sp_scan_with_label_cache_tiny_L=1       ,    61.6       ,    61.6       ,    61.7
sp_scan_with_label_cache_huge_L=1000000 ,    61.6       ,    61.7       ,    61.7

This shows that the cache lookup cost is negligible even for hundreds of
thousands of labels, compared to other much more costly EC operations in the
scanning function.

With only a handful of labels to scan for, the "iterate over labels" approach
is faster (certainly if only scanning for the change label), but at some
cross-over point the BIP-style scanning with label-cache lookup performs
better [4], and at that point the scanning performance doesn't get worse
anymore, i.e. each additional label is for free.

Given that this alternative scanning approach exists, I don't think it is
appropriate to say that label scanning doesn't scale in general. It is my
understanding that this BIP-style scanning might not work well for light
clients, as they usually don't have the (full) transaction output data to
start with, so it's not applicable for all types of wallets. Still, there
might be full-node wallets in the future with special use-cases that can deal
with hundreds of thousands of labels without a noticeable decline in scanning
performance; if they choose the scanning approach as described in the BIP, and
implement the label cache using an efficient data structure, there shouldn't
be any problems.

Hence, I'd be in favor of allowing label ranges in the SP descriptor format.
Even for use-cases with a smaller amount of labels, it seems to make sense to
shorten the descriptor string in most cases. I can't think of any good reasons
for introducing gaps in the m values for label creation.

I'm still curious to hear some more concrete use-cases for labels in general
and especially for large numbers of labels. (Admittedly, if we were absolutely
certain that there will never be any use for that, there would indeed not be much
point in allowing lots of labels, but we have no way of knowing, and
restricting potential use-cases doesn't seem the right approach to me).

Cheers,
Sebastian

[1] https://github.com/bitcoin/bips/blob/master/bip-0352.mediawiki#user-content-Scanning, see "check for labels" ff.
[2] https://github.com/bitcoin-core/secp256k1/pull/1765
[3] using the header-only libraries khash.h (for the hash-map) and rapidhash.h
(as hash function), see
https://github.com/attractivechaos/klib/blob/master/khash.h and
https://github.com/Nicoshev/rapidhash/blob/master/rapidhash.h
[4] in theory, whenever the number of labels L is larger than twice the number
of taproot outputs N (i.e. L > 2*N), BIP-style scanning should be faster; see
https://gist.github.com/theStack/25c77747838610931e8bbeb9d76faf78?permalink_comment_id=5897811#gistcomment-5897811
for benchmarks over a L/N matrix