Supporting addr messages

26 views
Skip to first unread message

Kevin Greene

unread,
Sep 30, 2014, 5:43:35 PM9/30/14
to bitc...@googlegroups.com
Is anybody planning on/currently working on adding support for addr messages and persisting a set of peers that have been seen on the network?


I'm playing around with building a peer crawler for the lulz, and I'm wondering if it is desirable to add support for addr messages directly in bitcoinj. All my peer crawler needs to do is connect to a few peers and listen for addr messages. Then it stores the peers that it learns about in a database. Super simple.

However, the interface for doing this is a bit ugly. The only way to be notified about addr messages is to subclass PeerSocketHandler directly, which means re-implementing the version exchange logic that is normally handled by the Peer class, and other protocol logic such as responding to pings. That also means you can't use the PeerGroup to manage multiple connections so I would have to reimplement that logic as well.

What do you guys think of adding a PeerStore interface? It will have methods like "addPeers", "removePeers", and "getAnyPeer". A first implementation would be called MemoryPeerStore, and would simply store an array of peers in memory. Other implementations could save the peers to a file, postgres, etc.

A PeerStore will be passed to the PeerGroup's constructor. When it wants to connect to a new peer, the PeerGroup first checks if its PeerStore has enough non-stale peers. If the so, then the PeerGroup tries to connect to one of those. If not, then it does PeerDiscovery like normal.

The PeerGroup will store the peers that it learned about from PeerDiscovery in the PeerStore.

To add peers to the PeerStore from addr messages, one option would be to pass the PeerStore to the Peer class's constructor. Then whenever the Peer receives an addr message it simply calls addPeers to update the PeerStore.

Alternatively, the Peer could notify its listeners with a list of peers when an addr message is received, and the listener (i.e. the PeerGroup) could update the PeerStore object.

I think I like the first option better, although then we have to be careful to make the PeerStore thread-safe.

Another question is how to verify that the peers in the PeerStore are actually reachable so they don't get stale. The simplest naive approach would be to just have the PeerGroup remove a peer from the PeerStore if it tries to connect and fails. The PeerStore implementation could also automatically remove peers older than a certain age.

The most heavy-weight approach would be to have something like a PeerTracker that actually opens a connection to each peer in the PeerStore periodically to check if it is still up, but that seems like a lot to ask of a smart phone, especially if there are >1000 peers.

What do you guys think? Then for my peer crawler, I could simply create a PeerGroup and pass my PeerStore to its constructor. That is much nicer than writing my own PeerSocketHandler implementation and managing multiple peer connections manually (since I couldn't use a PeerGroup).

This also has the benefit of reducing traffic to the seed peers, because clients could often connect to cached peers when they start up instead of doing dns discovery every time.

Mike Hearn

unread,
Oct 1, 2014, 5:43:30 AM10/1/14
to bitc...@googlegroups.com
Matt did some work on this:


You'd probably want to build on that. If not then the first step is to add methods on Peer that do a getaddr and return a ListenableFuture<List<PeerAddress>> or something like that, so you can explicitly ask for some. And then yeah some kind of PeerAddressStore or PeerDBDiscovery class would be appropriate. Please do review Matt's work first though, if you'd like to work on this. Thanks!


--
You received this message because you are subscribed to the Google Groups "bitcoinj" group.
To unsubscribe from this group and stop receiving emails from it, send an email to bitcoinj+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Kevin Greene

unread,
Oct 1, 2014, 5:27:34 PM10/1/14
to bitc...@googlegroups.com
Thanks!

Looked over Matt's code. It won't actually work very well for me as-is because the PeerDBDiscovery class is hardcoded to write to a file, but I want to write to a database. There needs to be some PeerStore abstraction so that anyone can implement their own store handling, similar to BlockStore. I also think it's useful to store the information in the version message along with each peer. Or at least the user-agent.

The PeerDBDiscovery class does nicely outline how to handle throwing away stale peers though. So that question is answered :)

Oh derp, I just realized I can use onPreMessageReceived as a PeerGroup listener to be notified of AddressMessages and handle them however I want. So I don't have to go through all the trouble of subclassing the PeerSocketHandler. It's probably still worth adding this to bitcoinj though so it can be used by default in the PeerGroup for peer discovery instead of doing DNS discovery every time.

How about making another layer of abstraction based on the PeerDBDiscovery class, called a PeerTracker. The PeerTracker tracks peers that have been seen on the network (obviously). However, instead of passing a File, you pass the PeerTracker a PeerStore implementation of your choosing, e.g. FilePeerStore, MemoryPeerStore or PostgresPeerStore. I would pull Matt's file-handling code into the FilePeerStore, and make that the default.

In fact, one idea is to have the PeerTracker entirely manage the process of discovering peers. If it has enough peers stored in its PeerStore, then it uses those, and if not, then it has a list of PeerDiscoverers and it manages that process instead of doing that in the PeerGroup like bitcoinj does now. WDYT? The interface to the PeerGroup wouldn't change, but internally it would just use the PeerTracker instead of keeping an array of PeerDiscoverers itself. The only interface change would be adding a nullable PeerStore to the PeerGroup constructor. Could be a nice refactor.

Another option would be to have the PeerGroup ask the PeerTracker for peers before asking the PeerDiscoverers, as I described before. That is less of a change to PeerGroup, but it makes the PeerGroup even more complicated than it is already.

Yet another option would be to have the PeerTracker implement the PeerDiscovery protocol and then don't change the PeerGroup at all. That is what the PeerDBDiscovery class did. I think it makes sense for the PeerGroup to explicitly know about a PeerStore somehow though so it can track peers as the default behavior without having to configure a bunch of extra objects and hook them up. I also don't like how tightly coupled the PeerDBDiscovery class is to the PeerGroup.

Some initial PeerStore code for reference. It looks a lot like BlockStore:

public interface PeerStore {
    public void put(StoredPeer peer) throws PeerStoreException;
    public void put(Collection<StoredPeer> peers) throws PeerStoreException;

    public void remove(StoredPeer peer) throws PeerStoreException;
    public void remove(Collection<StoredPeer> peers) throws PeerStoreException;

    public @Nullable StoredPeer get(InetAddress address, int port);
    public @Nullable StoredPeer getAny() throws PeerStoreException;
}

public class StoredPeer {
    private final PeerAddress address;
    private @Nullable final VersionMessage versionMessage;

    StoredPeer(PeerAddress address) {...}
    StoredPeer(PeerAddress address, @Nullable VersionMessage versionMessage) {...}

    public PeerAddress getAddress() {...}
    public @Nullable VersionMessage getVersionMessage() {...}
}


Mike Hearn

unread,
Oct 2, 2014, 6:40:08 AM10/2/14
to bitc...@googlegroups.com
In fact, one idea is to have the PeerTracker entirely manage the process of discovering peers. If it has enough peers stored in its PeerStore, then it uses those, and if not, then it has a list of PeerDiscoverers and it manages that process instead of doing that in the PeerGroup like bitcoinj does now. WDYT?

Yes, that sounds like a good design. Refactoring and finishing off Matt's code would be an excellent improvement, assuming the stored peer list can be kept fresh enough to maintain the high performance of using the DNS seeds. getaddr has a habit of returning bad addresses currently, but we need to reduce our reliance on the seeds at some point so having the code will be useful.

Go ahead and implement something like this, I will look forward to reviewing it!
Reply all
Reply to author
Forward
0 new messages