Re-activatable archived ledger entries

David Mazieres

Jul 7, 2022, 10:55:29 PM

to Stellar Developers, Nicolas Barry

This is a high-level proposal to allow old, inactive ledger entries to
be purged from the Stellar ledger in such a way that they can later be
resurrected by motivated users. It is motivated by smart contracts, but
should apply equally to classic stellar ledger entries.

In this model, there are two kinds of ledger entries, active and
archived. The specific criteria for archiving active ledger entries
don't really matter for the proposal. We could archive ledger entries
that haven't been used in two years, or archive ledger entries for which
no one has paid rent, or whatever other policy seems reasonable. The
important point is that we want most nodes not to have to store archived
ledger entries, while avoiding situations in which users lose a lot of
money because a ledger entry disappears. The idea is to give users the
ability to move entries from archived back to active (albeit with higher
cost and latency than a typical stellar transaction).

We have two kinds of nodes in our model, validator nodes, and archive
nodes:

* Validator nodes hold all active ledger entries plus a small
(logarithmic) amount of state that can be used to authenticate older
ledger entries. Validator nodes participate in SCP and implement a
replicated state machine.

* Archive nodes store ledger entries that have been archived from the
Validator nodes. The system does not require the existence of archive
nodes for day-to-day function, but without archive nodes, expired
ledger entries will be unrecoverable. Archive nodes do not
participate in consensus and need not all contain the same state.
E.g., some nodes might store only recently archived ledger entries.

According to policy, validator nodes periodically demote active ledger
entries to archived, which are then stored by archive nodes. Users who
care about an archived ledger entry can obtain a proof of existence from an
archive node and submit a transaction to the validator nodes that
reactivates the ledger entry if the archived state is valid and has not
been previously reactivated.

The idea is pretty simple and not especially novel. The validator nodes
are going to store the root of a Merkle tree whose leaves are all of the
archived ledger entries. To reactivate an archived ledger entry, users
supply a proof of inclusion in the Merkle tree. To ensure that archived
ledger entries are not reactivated multiple times, reactivated leaves
are "clipped" from the Merkle tree, as described below.

In more detail, archived ledger entries are stored in a sequentially
numbered log, A[0], A[1], ..., A[N-1].

Define the Merkle tree T as:

T[0][n] = A[n] if n < N
          nil  otherwise

T[i+1][n] = SHA(T[i][2*n] || T[i][(2*n)+1])
            or nil if T[i][2*n] and T[i][(2*n)+1] are both nil

An archive hash h is T[s][0], where s = ceil(log_2 N), or nil if N == 0.

A proof of inclusion for A[n] consists of the path P[n] consisting of
the siblings of A[n] and all its ancestors, namely P[n] = p[n][0], ...,
p[n][s] where:

p[n][i] = T[i][(n>>i)^1] // where ^ is XOR
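
The definitions above can be sketched in Python as follows. This is only an illustration, not a proposed implementation. Two things are assumptions of the sketch rather than part of the proposal: leaves are hashed before entering the tree (the text uses A[n] directly), and a nil child is encoded as 32 zero bytes when hashed next to a non-nil sibling. The names leaf, parent, height, build_tree, archive_hash, and path are made up for the example, and SHA-256 stands in for SHA.

```python
import hashlib
from typing import List, Optional

Node = Optional[bytes]   # None stands in for nil
NIL = b"\x00" * 32       # assumption: encoding of a nil child inside a hash input


def leaf(entry: Optional[bytes]) -> Node:
    # Assumption: leaves are hashed so that every node is 32 bytes long.
    return None if entry is None else hashlib.sha256(b"leaf" + entry).digest()


def parent(left: Node, right: Node) -> Node:
    # T[i+1][n] = SHA(T[i][2n] || T[i][2n+1]), or nil if both children are nil.
    if left is None and right is None:
        return None
    return hashlib.sha256((NIL if left is None else left) +
                          (NIL if right is None else right)).digest()


def height(N: int) -> int:
    # s = ceil(log_2 N), taken to be 0 when N <= 1.
    return (N - 1).bit_length() if N > 1 else 0


def build_tree(A: List[Optional[bytes]]) -> List[List[Node]]:
    # levels[i][n] is T[i][n]; indices past the end of a level are nil.
    levels = [[leaf(a) for a in A]]
    for _ in range(height(len(A))):
        prev = levels[-1]
        nodes = []
        for k in range((len(prev) + 1) // 2):
            right = prev[2 * k + 1] if 2 * k + 1 < len(prev) else None
            nodes.append(parent(prev[2 * k], right))
        levels.append(nodes)
    return levels


def archive_hash(levels: List[List[Node]]) -> Node:
    # h = T[s][0], or nil for an empty archive.
    return levels[-1][0] if levels[0] else None


def path(levels: List[List[Node]], n: int) -> List[Node]:
    # P[n] = p[n][0], ..., p[n][s] with p[n][i] = T[i][(n >> i) ^ 1].
    out = []
    for i, level in enumerate(levels):
        k = (n >> i) ^ 1
        out.append(level[k] if k < len(level) else None)
    return out
```

For n < N the last element p[n][s] is always nil, since the root has no sibling. It only becomes interesting for P[N] when N is a power of two, where it holds the root of the complete tree.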

Validator nodes recompute h whenever they archive a ledger entry by
appending it as A[N] to the list A[0], ..., A[N-1]. They can do so maintaining
only the archive size, N, and the path P[N], because appending never
requires re-hashing a previously complete subtree. While updating h,
nodes also compute P[N+1], then discard the previously stored nodes from
P[N].
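
To make the "only N and P[N]" claim concrete, here is a rough Python sketch of an append that keeps just the archive size and the stored path, and compares the result against a naive recomputation over the whole log. The class name Frontier and the helpers are hypothetical. As assumptions of the sketch, leaves are hashed first and a nil child is encoded as 32 zero bytes.

```python
import hashlib
from typing import List, Optional

Node = Optional[bytes]   # None stands in for nil
NIL = b"\x00" * 32       # assumption: encoding of a nil child inside a hash input


def leaf(entry: bytes) -> Node:
    return hashlib.sha256(b"leaf" + entry).digest()   # assumption: hashed leaves


def parent(left: Node, right: Node) -> Node:
    if left is None and right is None:
        return None
    return hashlib.sha256((NIL if left is None else left) +
                          (NIL if right is None else right)).digest()


class Frontier:
    """Validator-side state: the size N and P[N], where P[i] = T[i][(N >> i) ^ 1]."""

    def __init__(self) -> None:
        self.N = 0
        self.P: List[Node] = []

    def append(self, entry: bytes) -> None:
        # Behaves like incrementing a binary counter: every set low bit of N
        # folds a stored left sibling into the node being carried upward.
        cur, i = leaf(entry), 0
        while (self.N >> i) & 1:
            cur = parent(self.P[i], cur)
            self.P[i] = None          # the sibling of N+1 at this level is now nil
            i += 1
        if i == len(self.P):
            self.P.append(None)
        self.P[i] = cur               # completed subtree to the left of N+1
        self.N += 1

    def archive_hash(self) -> Node:
        if self.N == 0:
            return None
        s = (self.N - 1).bit_length()  # ceil(log_2 N)
        if self.N >> s:                # N is a power of two: the root is p[N][s]
            return self.P[s]
        cur: Node = None               # T[0][N] is nil
        for i in range(s):
            cur = parent(self.P[i], cur) if (self.N >> i) & 1 else parent(cur, None)
        return cur


def naive_hash(entries: List[bytes]) -> Node:
    # Reference: rebuild the whole tree from the full log.
    level: List[Node] = [leaf(e) for e in entries]
    if not level:
        return None
    while len(level) > 1:
        if len(level) % 2:
            level.append(None)
        level = [parent(level[k], level[k + 1]) for k in range(0, len(level), 2)]
    return level[0]
```

In this sketch the stored path never holds more than one node per bit of N, which is where the logarithmic state comes from.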

Given an archive hash h, an archive size N, an index n, an alleged
archived ledger entry A[n], and a proof of inclusion P[n], it is easy to
verify that A[n] indeed matches (h,N) assuming the collision
resistance of SHA. This is just straight-up Merkle tree verification.
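
A minimal sketch of that check in Python, for illustration only. The function name verify is made up. The sketch assumes SHA-256, hashed leaves, and 32 zero bytes as the encoding of a nil child, none of which the proposal pins down.

```python
import hashlib
from typing import List, Optional

Node = Optional[bytes]   # None stands in for nil
NIL = b"\x00" * 32       # assumption: encoding of a nil child inside a hash input


def leaf(entry: bytes) -> Node:
    return hashlib.sha256(b"leaf" + entry).digest()   # assumption: hashed leaves


def parent(left: Node, right: Node) -> Node:
    if left is None and right is None:
        return None
    return hashlib.sha256((NIL if left is None else left) +
                          (NIL if right is None else right)).digest()


def verify(h: Node, N: int, n: int, entry: bytes, P: List[Node]) -> bool:
    """Check that entry is A[n] in the archive described by (h, N)."""
    if not 0 <= n < N:
        return False
    s = (N - 1).bit_length()          # ceil(log_2 N) for N >= 1
    if len(P) < s:
        return False
    cur = leaf(entry)
    for i in range(s):
        # Bit i of n says whether the node on the path is a right or left child.
        cur = parent(P[i], cur) if (n >> i) & 1 else parent(cur, P[i])
    return cur == h


# Usage: a three-entry archive built by hand.
la, lb, lc = leaf(b"a"), leaf(b"b"), leaf(b"c")
h = parent(parent(la, lb), parent(lc, None))
print(verify(h, 3, 2, b"c", [None, parent(la, lb)]))
```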

The important point is that given an entry A[n], a proof P[n], and a
path P[N], it is possible to "clip" A[n] by setting A[n]=nil and then
recomputing P[N] (including the new root of the tree). You just need to
use P[n] to compute new internal nodes of the Merkle tree until the path
P[n] intersects P[N], at which point you update the stored P[N]. You
can even optimize this by clipping a bunch of A[n] in parallel for
different subtrees.
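
One way the clipping step might look, sketched in Python under the same assumptions as before (hashed leaves, nil encoded as 32 zero bytes, SHA-256). The function names are hypothetical. The sketch checks the proof against the stored P[N] node where the two paths meet rather than against h itself. That is a shortcut chosen for the example, which relies on the stored path being trusted validator state.

```python
import hashlib
from typing import List, Optional

Node = Optional[bytes]   # None stands in for nil
NIL = b"\x00" * 32       # assumption: encoding of a nil child inside a hash input


def leaf(entry: bytes) -> Node:
    return hashlib.sha256(b"leaf" + entry).digest()   # assumption: hashed leaves


def parent(left: Node, right: Node) -> Node:
    if left is None and right is None:
        return None
    return hashlib.sha256((NIL if left is None else left) +
                          (NIL if right is None else right)).digest()


def clip(N: int, PN: List[Node], n: int, entry: bytes, Pn: List[Node]) -> bool:
    """Set A[n] = nil and update the stored path PN in place; False on a bad proof."""
    if not 0 <= n < N:
        return False
    # The paths of n and N meet at the highest bit where n and N differ:
    # there, the ancestor of A[n] is exactly the stored sibling PN[j].
    j = (n ^ N).bit_length() - 1
    if len(Pn) < j or len(PN) <= j:
        return False
    old, new = leaf(entry), None      # subtree hash before and after clipping
    for i in range(j):
        if (n >> i) & 1:
            old, new = parent(Pn[i], old), parent(Pn[i], new)
        else:
            old, new = parent(old, Pn[i]), parent(new, Pn[i])
    if old != PN[j]:
        return False                  # wrong entry, stale proof, or already clipped
    PN[j] = new
    return True


def archive_hash(N: int, PN: List[Node]) -> Node:
    # Recompute h from the archive size and the stored path alone.
    if N == 0:
        return None
    s = (N - 1).bit_length()
    if N >> s:                        # N is a power of two
        return PN[s]
    cur: Node = None
    for i in range(s):
        cur = parent(PN[i], cur) if (N >> i) & 1 else parent(cur, None)
    return cur
```

A second attempt to clip the same entry fails in this sketch because the recomputed subtree hash no longer matches the stored node. Note also that a successful clip changes sibling hashes that appear in other users' proofs, so those proofs have to be refreshed against the new state.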

In practice, we need re-activation to happen at a much slower timescale
than most transactions. While the ledger close time is 5 seconds,
archiving and reactivation could happen every 30 minutes. This would
give people who want to reactivate archived ledger entries ample time to
construct and submit the proof with respect to the current archive hash.

It also may be useful to know how much of a given asset has been
archived, so in stellar classic, we can keep a 128-bit per-asset counter
of the sum of all trust lines that have been archived. For smart
contracts, perhaps we can have some pre-paid logic that somehow tracks
and aggregates archived ledger entries.

One important design decision is what happens when a reactivated ledger
entry already exists. For example, suppose a trustline is archived,
then recreated, then the archived version is reactivated. One
possibility is that you are not allowed to reactivate the trustline if it
exists. Another would be that we have per-ledger-entry-type merge
semantics (e.g., for trust lines, sum the archived balance with the
active balance, assuming it's under the limit).
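
The two options could be sketched like this in Python. The TrustLine type and the reactivate function are hypothetical and much simpler than real Stellar trust lines. Checking the merged balance against the active entry's limit, rather than the archived one, is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TrustLine:
    balance: int
    limit: int


def reactivate(active: Optional[TrustLine], archived: TrustLine,
               merge: bool) -> TrustLine:
    """Return the trust line that should be active after reactivation."""
    if active is None:
        return archived                       # no collision: restore as-is
    if not merge:
        # Option 1: refuse to reactivate while an entry with this key exists.
        raise ValueError("ledger entry already exists")
    # Option 2: per-type merge semantics, here summing the two balances.
    total = active.balance + archived.balance
    if total > active.limit:                  # assumption: active limit applies
        raise ValueError("merged balance exceeds limit")
    return TrustLine(balance=total, limit=active.limit)
```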

There are a bunch of smaller design decisions we'd need to make, but I
think this general approach is a good compromise between never expiring
ledger entries and making people's ledger entries disappear forever.

David

Jonathan Jove

Jul 8, 2022, 10:28:52 AM

to David Mazieres, Stellar Developers, Nicolas Barry

I think the question "what happens when a reactivated ledger entry already exists" is the single most important detail here, and warrants significant further discussion.

In general, per-ledger-entry-type merge semantics will not be possible in the context of smart contracts. How could the system possibly know what the smart contract wants to happen, unless the smart contract itself specifies the semantics it wants? Doing this seems like a lot of extra work for contract developers and is unlikely to be popular.

I suspect there are significant vulnerabilities associated with forbidding reactivation if a collision would occur. For example, consider the case where a long-term escrow contract is archived. If an attacker can create a contract at that address, then they could create a do-nothing contract preventing the real contract from being reactivated. This would permanently lock the funds in the escrow.

--
You received this message because you are subscribed to the Google Groups "Stellar Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to stellar-dev...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/stellar-dev/874jzs1ffl.fsf%40ta.scs.stanford.edu.

David Mazieres

Jul 9, 2022, 4:15:45 PM

to Jonathan Jove, Stellar Developers, Nicolas Barry

Jonathan Jove <j...@stellar.org> writes:

> I think the question "what happens when a reactivated ledger entry already
> exists" is the single most important detail here, and warrants significant
> further discussion.
>
> In general, per-ledger-entry-type merge semantics will not be possible in
> the context of smart contracts. How could the system possibly know what the
> smart contract wants to happen, unless the smart contract itself specifies
> the semantics it wants? Doing this seems like a lot of extra work for
> contract developers and is unlikely to be popular.

Mostly this comes down to how we identify ledger entries. If, for
instance, ledger entries have some identifier that is unique over all
time, then there is no issue.

That said, if we solve the identity problem and ledger entry identities
are not unique over all time, I still don't see why it needs to be a
huge amount of work for smart contract authors. We could have the
default be that you can't resurrect something if the ID is in use. And
we could have an option where if the ledger entry is associated with a
smart contract and the smart contract wants, it can do something simple
like sum a field in the two versions. I think "disallow resurrection"
and "merge by summing balances" would probably cover 95+% of cases.

Also note that disallowing resurrection doesn't mean the funds are
lost. It just means you need to withdraw, merge, or otherwise destroy
the current entry so as to resurrect a previous one.

> I suspect there are significant vulnerabilities associated with forbidding
> reactivation if a collision would occur. For example, consider the case
> where a long-term escrow contract is archived. If an attacker can create a
> contract at that address, then they could create a do-nothing contract
> preventing the real contract from being reactivated. This would permanently
> lock the funds in the escrow.

I think abstractly this sounds like a problem. But in a concrete
setting, it may be that people sign half of escrow transactions that
never get submitted, so we need a mechanism for making sure transactions
on old escrow accounts can't be applied out of context to new escrow
accounts. So big picture, you might want to give escrow accounts a
unique identity over all time anyway, and that would solve the
resurrection problem.

Of course, it may be possible to construct contrived smart contracts
that are vulnerable to this kind of issue and to nothing else. So I
think the question has to be answered with a concrete example. We have
to look at the vulnerable contract and the correct one, and if the
correct contract is more complicated to write, then we have a problem.
But if the vulnerable contract is complex and contrived compared to the
correct contract which is the obvious way of doing things, then this
doesn't worry me too much.

David

Geoff Ramseyer

Aug 11, 2022, 7:26:08 PM

to David Mazieres, Jonathan Jove, Stellar Developers, Nicolas Barry

How does this list feel about delegating all of the recovery logic to the smart contracts? The mechanism David describes (or some similar mechanism) ensures that any ledger entry, once archived, can only be recovered once (until it's archived again, etc.).

Contracts could be required to provide an "on_recover_ledger_entry: LedgerEntry -> ()" method. A transaction that recovers an archived entry could first check the proof (as per David's mechanism) and then call on_recover_ledger_entry on the contract that owns the entry. The transaction fails if the proof is wrong or if on_recover_ledger_entry throws an error.

Contracts that want to merge balances can implement that logic themselves. Contracts with more complex logic (e.g. collateralized debt positions, escrow accounts that expire at some future timestamp) can implement their own custom merging logic. Or contracts could take steps to ensure LedgerKeys are unique, and then make the on_recover_ledger_entry do nothing.

Because recovery transactions would be script invocations, they should be subject to the same fee metering and resource limits as regular smart contract invocations.

This requires every ledger entry to be owned by a contract and that this ownership information be stored within the archive log. It also makes transferring ownership of ledger entries difficult -- what if, for example, I archive an entry, then recreate it, and then transfer it somewhere else, and then try to recover from the archive? We'll have two copies of the entry, owned by different accounts. This point is irrelevant if, as per current Soroban plans, ledger entries are not transferable between contracts.
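
To illustrate the shape of this, here is a rough Python sketch. Everything except the method name on_recover_ledger_entry is invented for the example: the BalanceContract class, the recover function, and the proof_is_valid callback that stands in for the Merkle proof check. Real contracts would not be Python objects, and a real host would also have to roll back partial state changes.

```python
from typing import Callable, Dict


class RecoveryError(Exception):
    """The recovery transaction failed as a whole."""


class BalanceContract:
    """Hypothetical contract that merges a recovered balance into the live one."""

    def __init__(self) -> None:
        self.balances: Dict[str, int] = {}

    def on_recover_ledger_entry(self, key: str, value: int) -> None:
        if value < 0:
            raise ValueError("negative balance in archived entry")
        self.balances[key] = self.balances.get(key, 0) + value


def recover(contract: BalanceContract, key: str, value: int,
            proof_is_valid: Callable[[], bool]) -> None:
    # Step 1: check the proof (placeholder for the archive proof check).
    if not proof_is_valid():
        raise RecoveryError("invalid proof of inclusion")
    # Step 2: hand the entry to the owning contract; any error it raises
    # fails the transaction.
    try:
        contract.on_recover_ledger_entry(key, value)
    except Exception as exc:
        raise RecoveryError("contract rejected the recovered entry") from exc


# Usage: an archived balance of 4 is merged into a live balance of 3.
c = BalanceContract()
c.balances["alice"] = 3
recover(c, "alice", 4, proof_is_valid=lambda: True)
print(c.balances["alice"])
```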

--Geoff


Nicolas Barry

Aug 12, 2022, 7:23:55 PM

to Geoff Ramseyer, David Mazieres, Jonathan Jove, Stellar Developers

Hey Geoff

That sounds reasonable to me. We already have a strict "namespacing" when it comes to contracts: data that originates from a smart contract is already associated with the parent contract by virtue of the structure of its ledger key.

The one ledger entry that is problematic in that context is the contract itself, and for those I *think* that we don't really care (i.e., default restore is fine).

I think that we could add this functionality in a later CAP (we just need to make sure that it is possible to add it later, and from what I can see we don't need to do anything special?).

I would like to see how far we can take the minimal proposal before adding complexity. In particular, as you also noted, it might just be easier to make keys unique. For this, maybe all we need is a counter that gets incremented every time we archive, which people can use as an "epoch" of sorts when they create a key. Contracts that prefer to do this could then potentially allow merging entries that differ only by epoch.
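
As a toy illustration of the epoch idea (all names here are hypothetical, and the dictionaries merely stand in for validator and archive state):

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class LedgerKey:
    contract_id: str
    name: str
    epoch: int            # value of the archive counter when the key was created


class Ledger:
    def __init__(self) -> None:
        self.epoch = 0                          # incremented on every archive run
        self.active: Dict[LedgerKey, int] = {}
        self.archived: Dict[LedgerKey, int] = {}   # stands in for archive nodes

    def create(self, contract_id: str, name: str, value: int) -> LedgerKey:
        key = LedgerKey(contract_id, name, self.epoch)
        if key in self.active:
            raise KeyError("entry already exists in this epoch")
        self.active[key] = value
        return key

    def archive(self, keys: Iterable[LedgerKey]) -> None:
        for key in keys:
            self.archived[key] = self.active.pop(key)
        self.epoch += 1

    def reactivate(self, key: LedgerKey) -> None:
        # Keys created after the archive run carry a later epoch, so a
        # reactivated entry cannot collide with a recreated one.
        if key in self.active:
            raise KeyError("entry is already active")
        self.active[key] = self.archived.pop(key)
```

A contract that wants merge behaviour could then look up all active keys that match on contract_id and name and combine the ones that differ only in epoch.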

Nicolas
