Correct that this is the current behavior today [1].
After taking a brief look at things, the gap here (as you've noted) is a
sort of lazy accept/reject operating mode. From the PoV of the main
signing-related RPC calls, this seems pretty straightforward (just return
an error to the client, similar to the errors we already return if one
tries to make certain RPC calls before the server has started up).
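
As a rough sketch of that "lazy reject" mode: the wrapper below fails fast
with an error whenever the signing client isn't connected. All names here
(Signer, GatedSigner, ErrSignerOffline) are hypothetical and simplified for
illustration; they are not lnd's actual interfaces.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// ErrSignerOffline is a hypothetical error returned while no signing
// client is connected.
var ErrSignerOffline = errors.New("remote signer not connected")

// Signer is a simplified, hypothetical signing interface.
type Signer interface {
	SignDigest(digest []byte) ([]byte, error)
}

// GatedSigner wraps a Signer and rejects requests while the client is
// offline instead of blocking.
type GatedSigner struct {
	inner  Signer
	online atomic.Bool
}

func NewGatedSigner(inner Signer) *GatedSigner {
	return &GatedSigner{inner: inner}
}

// SetOnline records whether the signing client is currently connected.
func (g *GatedSigner) SetOnline(v bool) { g.online.Store(v) }

// SignDigest forwards to the wrapped signer only when the client is online.
func (g *GatedSigner) SignDigest(digest []byte) ([]byte, error) {
	if !g.online.Load() {
		return nil, ErrSignerOffline
	}
	return g.inner.SignDigest(digest)
}

// echoSigner is a stand-in for a real signer; it does no cryptography.
type echoSigner struct{}

func (echoSigner) SignDigest(d []byte) ([]byte, error) {
	return append([]byte("sig:"), d...), nil
}

func main() {
	g := NewGatedSigner(echoSigner{})

	_, err := g.SignDigest([]byte("abc"))
	fmt.Println("offline:", err)

	g.SetOnline(true)
	sig, err := g.SignDigest([]byte("abc"))
	fmt.Println("online:", string(sig), err)
}
```
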
One area that needs more investigation is how the system would handle things
like failing to do a forward or even rejecting a state machine transition.
Consider a scenario where an HTLC comes across that is actually destined for
the end user: the remote party sends a sig, we validate+revoke (gg so far),
but what happens when we need to generate our signature? If the user is
connected at that point, then things work as normal. However if they aren't
yet connected, what's the correct course of action? Should we wait for a
period of time (for how long?), or should we "reject" that attempt so they
can try again?
Attempting to "reject" the commitment update attempt isn't really explicitly
implemented in the protocol (just send an unadd or w/e). So the only way we
can explicitly "cancel" that attempt would be to send a sig, which requires
the user's mobile phone to be active (assuming background stuff isn't super
reliable) for that instance.
As a result, I think in order to implement a robust system, the lifetime of
the "remote node" needs to be closely tied to the liveness of the mobile
client. In other words, if the user's mobile phone doesn't have an active
connection to the system, then the node shouldn't have any p2p connections.
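
A minimal sketch of that coupling, assuming a hypothetical PeerManager
interface that can bring all p2p connections up or down (none of these names
exist in lnd, and the sketch isn't safe for concurrent use):

```go
package main

import "fmt"

// PeerManager is a hypothetical handle on the node's p2p connections.
type PeerManager interface {
	ConnectAll()
	DisconnectAll()
}

// NodeLifetime ties p2p connectivity to the liveness of the client.
type NodeLifetime struct {
	peers  PeerManager
	active bool
}

func NewNodeLifetime(p PeerManager) *NodeLifetime {
	return &NodeLifetime{peers: p}
}

// OnClientEvent is called whenever the client connects or disconnects.
// Repeated events with the same state are ignored.
func (n *NodeLifetime) OnClientEvent(connected bool) {
	if connected == n.active {
		return
	}
	if connected {
		n.peers.ConnectAll()
	} else {
		n.peers.DisconnectAll()
	}
	n.active = connected
}

// fakePeers is a stand-in that only records whether p2p is up.
type fakePeers struct{ up bool }

func (f *fakePeers) ConnectAll()    { f.up = true }
func (f *fakePeers) DisconnectAll() { f.up = false }

func main() {
	peers := &fakePeers{}
	node := NewNodeLifetime(peers)

	node.OnClientEvent(true)
	fmt.Println("p2p up:", peers.up)

	node.OnClientEvent(false)
	fmt.Println("p2p up:", peers.up)
}
```
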
In terms of properly scaling such a system (minimizing duplicate
data/connections, and also optimizing for start up time), I think this issue
is very relevant [2]. By adding another layer of indirection, we can create
a node instance that doesn't actually speak to the p2p network for gossip
and graph download. Instead, it relies on a read-only copy of the graph
(assumption is that we'd store our _own_ channels outside the graph and feed
them in as hop hints), and uses the new peerrpc sub-server to push out
gossip updates. You'd then have a main node that handles all the gossip
ingestion and channel graph updates/maintenance [3]. This would serve to
make any nodes in a clustered setting (or running in the same process) much
more lightweight, as gossip handling always shows up in the memory/cpu
profile of larger nodes. This operating mode would also allow for things
like "instant" graph sync within a shared trust domain.
Correct again. Re Tor, I've been looking into this [4] embedded Go Tor client
as a way to more easily design/deploy systems that are built around onion
services. Packaging wise, this makes such systems easier to deploy as the
tor client is bundled into the same binary as everything else.
Examining the other option here (client connecting out to the "remote
node"), I think we'd need to modify the gRPC service a bit to possibly be a
bidirectional streaming endpoint? Ignoring that possibility, the aperture "hashmail"
box that Terminal Web and LNC use today can also address this issue: the
server would connect out via the hashmail box (using our brontide encryption
handshake instead of TLS) and block on writes until the client was online to
process and reply to the requests. With a small bit of plumbing, I think the
entire existing LNC client code can be used here [5].
Related to this discussion is a somewhat under-appreciated fact that today,
it's possible to instantiate a new lnd node within an _existing_ Go process.
If we take a look at the `cmd/lnd` binary today [6], all it does is parse
the config, sub in the default implementations of the main interfaces, and
then start the main lnd parent goroutine. This subtle architectural change
means that just about everything I described above can live in a new Voltage
project that implements the custom indirection, shared graph db, etc, etc,
given the required hooks on the lnd/config level.
One trivial example that better demonstrates what's already possible today
would be a user creating a custom database implementation (on the KV level)
that uses Redis (or some other DB). A user could make a new Go package,
import their DB implementation, then specify a DatabaseBuilder [7]
implementation to pass that into the main config.
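
The shape of that pattern, as a sketch: the interfaces below are deliberately
simplified and hypothetical (they do not mirror the real DatabaseBuilder
signature in lnd [7]), and an in-memory map stands in for Redis.

```go
package main

import (
	"errors"
	"fmt"
)

// KVStore is a hypothetical, minimal key-value interface.
type KVStore interface {
	Put(key, value string) error
	Get(key string) (string, error)
}

// DatabaseBuilder is a hypothetical hook the node calls at startup to
// obtain its database, whatever the backing store is.
type DatabaseBuilder interface {
	BuildDatabase() (KVStore, error)
}

// ErrNotFound is returned when a key is missing.
var ErrNotFound = errors.New("key not found")

// memStore is an in-memory stand-in for a custom backend such as Redis.
type memStore struct{ data map[string]string }

func (m *memStore) Put(k, v string) error { m.data[k] = v; return nil }

func (m *memStore) Get(k string) (string, error) {
	v, ok := m.data[k]
	if !ok {
		return "", ErrNotFound
	}
	return v, nil
}

// memBuilder is the user-supplied builder passed into the config.
type memBuilder struct{}

func (memBuilder) BuildDatabase() (KVStore, error) {
	return &memStore{data: make(map[string]string)}, nil
}

// NodeConfig is a hypothetical top-level config carrying the hook.
type NodeConfig struct {
	DBBuilder DatabaseBuilder
}

// StartNode only depends on the interfaces, never on the concrete
// database type behind them.
func StartNode(cfg NodeConfig) (KVStore, error) {
	if cfg.DBBuilder == nil {
		return nil, errors.New("no database builder configured")
	}
	return cfg.DBBuilder.BuildDatabase()
}

func main() {
	db, err := StartNode(NodeConfig{DBBuilder: memBuilder{}})
	if err != nil {
		panic(err)
	}
	_ = db.Put("alias", "my-node")
	v, err := db.Get("alias")
	fmt.Println(v, err)
}
```
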
The remote signer implementation works in a similar fashion: we parse the
config then see if the remote signer is needed and use that instead of the
default wallet implementation [8]. From the PoV of lnd, nothing has changed,
since it still has all the main interfaces it needs to function, which
allows lnd to not have to know about the type of wallet it's even using
(someone could feasibly swap out btcwallet for bitcoind if they really
wanted to).
Admittedly this can be documented a bit better, but we have most of the
tools we need in order to start experimenting and deploying non-traditional
operating modes for lnd. The set of APIs I linked above also aren't in any
way finalized, so we can make changes to them, updating the main top-level
lnd module each time.
Let me know if anything above isn't clear, I'd be happy to clarify!
-- Laolu
[1]: https://github.com/lightningnetwork/lnd/blob/9c97d26cfb505081732cb457b513a356879ad57e/lnwallet/rpcwallet/rpcwallet.go#L82
[2]: https://github.com/lightningnetwork/lnd/issues/6294
[3]: https://github.com/lightningnetwork/lnd/pull/6262#issuecomment-1050358986
[4]: https://github.com/ipsn/go-libtor
[5]: https://github.com/lightninglabs/lightning-node-connect/blob/master/mailbox/grpc_noise_conn.go#L39
[6]: https://github.com/lightningnetwork/lnd/blob/master/cmd/lnd/main.go#L20
[7]: https://github.com/lightningnetwork/lnd/blob/9c97d26cfb505081732cb457b513a356879ad57e/config_builder.go#L83
[8]: https://github.com/lightningnetwork/lnd/blob/9c97d26cfb505081732cb457b513a356879ad57e/config.go#L1667