Recommendations for Auto-discovering Nodes


Jason Rogena

Nov 1, 2020, 1:10:45 PM
to raft...@googlegroups.com
Hi there,

I'm building a wrapper on top of SQLite that implements Raft in an
attempt to create a distributed database. Nothing serious really, just
my attempt to understand Raft by doing :). The project is a throw-away
side project.

One thing that I have to do right now is have the IP addresses and
ports for all the nodes in a configuration file available to all the
nodes. I can totally get away with this. However, purely out of curiosity,
I was wondering what mechanisms people here recommend for node
discovery.

I was initially thinking node discovery would be trivial using DNS SRV
records. A bit of reading led me to believe that this mechanism of
node discovery is susceptible to some sort of "split-brain" where the
DNS record containing nodes in a cluster has been updated but not all
nodes have the updated record (maybe because they are using a cached
value of the record).

Do people here have any recommendations for node auto-discovery, or
material I can go through on this topic?

Thanks and cheers,
Jason

Archie Cobbs

Nov 1, 2020, 1:33:35 PM
to raft-dev
Raft itself has no provision for automated configuration changes. IOW, from Raft's perspective, all config changes are performed "manually" by some superior being.

Of course that superior being could be a human, or it could be some other external computer algorithm.

So your question lies entirely within the domain of whatever machinery that you construct around Raft.

That means, the good news is you can do whatever you want... as long as you "follow the rules", e.g.:
  • Don't consider a configuration change complete until it's committed in Raft
  • Only process one config change at a time
  • Prevent any Raft node from receiving a Raft message and mistakenly assigning it to the wrong peer
  • Prevent a peer that had its state suddenly erased from participating in further communication with its former cluster
  • Etc.
-Archie
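
The first two rules above can be sketched roughly as follows. This is only a minimal illustration, not part of Raft or of any particular library: the MembershipManager class, its method names, and the use of a log index to identify the change are all assumptions made for the example.

```python
class MembershipManager:
    """Hypothetical helper: allows one config change in flight at a time and
    treats it as complete only once Raft reports it committed."""

    def __init__(self, members):
        self.committed = frozenset(members)  # last committed configuration
        self.pending = None                  # (log_index, members) or None

    def propose(self, log_index, new_members):
        # Rule: only process one config change at a time.
        if self.pending is not None:
            raise RuntimeError("a configuration change is already in flight")
        self.pending = (log_index, frozenset(new_members))

    def on_commit(self, commit_index):
        # Rule: the change is not complete until its log entry is committed.
        if self.pending is not None and commit_index >= self.pending[0]:
            self.committed = self.pending[1]
            self.pending = None
```

Whatever drives discovery would call propose() and then wait for on_commit() to clear the pending change before proposing the next one.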

dig...@googlemail.com

Nov 1, 2020, 2:19:51 PM
to raft-dev
I was planning on doing exactly what you're suggesting for my service: have the leader poll DNS records periodically to figure out which nodes should be part of the cluster, and propose membership changes accordingly. A split can never happen, since membership changes require a majority in order to commit.

As long as the cluster is only bootstrapped once, it should not be possible to go awry later on.
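
A rough sketch of the reconciliation step this describes, assuming the leader already has the set of node names from its latest DNS poll. The function name and the plain-set inputs are hypothetical, and the DNS lookup itself is left out. It returns at most one change per call, so each change can be committed before the next one is proposed.

```python
def next_membership_change(current, discovered):
    """Return one ('add' | 'remove', node) step that moves the current
    membership toward the discovered set, or None if they already match."""
    to_add = sorted(set(discovered) - set(current))
    if to_add:
        return ("add", to_add[0])
    to_remove = sorted(set(current) - set(discovered))
    if to_remove:
        return ("remove", to_remove[0])
    return None


# Example: DNS now lists b, c, d while the cluster is a, b, c.
step = next_membership_change({"a", "b", "c"}, {"b", "c", "d"})
# step == ("add", "d"); once that commits, the next call yields ("remove", "a").
```

Adding before removing is just one possible ordering, chosen here so the cluster does not shrink before the replacement node has joined.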

Philip O'Toole

Nov 2, 2020, 7:45:58 AM
to raft...@googlegroups.com
Inline.

On Sun, Nov 1, 2020 at 1:10 PM Jason Rogena <ja...@rogena.me> wrote:
Hi there,

I'm building a wrapper on top of SQLite that implements Raft in an
attempt to create a distributed database.

In case you're interested in examining one way to do this, you can check out rqlite:


 
Nothing serious really, just
my attempt to understand Raft by doing :). The project is a throw-away
side project.

One thing that I have to do right now is have the IP addresses and
ports for all the nodes in a configuration file available to all the
nodes. I can totally get away with this. However, purely out of curiosity,
I was wondering what mechanisms people here recommend for node
discovery.

In rqlite, initial nodes are passed at the command line, though nodes can be added and removed from the cluster at will. I also implemented my own Discovery service. I have yet to get around to DNS SRV records.


Jason Rogena

Nov 2, 2020, 8:12:30 AM
to raft...@googlegroups.com
Thanks,

What y'all say makes a lot of sense. I've been typing this response
email with more questions but as I type the questions, it clicks. It
makes a lot of sense for the leader to be the only node that does the
DNS queries.

I think, in any case, the DNS spec caters for situations where you don't
want DNS resolvers or non-authoritative servers caching a record by
specifying the TTL of the record. I hope I can set the TTL for the SRV
record to something extremely low to force the resolver to always query
for the record from the authoritative DNS server.
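
To illustrate the caching behaviour in question, here is a toy model of a resolver cache that honours TTLs exactly. Everything in it is an assumption made for the example (the TtlCache class, the lookup callable returning a value and a TTL, the injected clock); real resolvers may behave differently, for instance by enforcing their own minimum TTL.

```python
class TtlCache:
    """Toy resolver cache: reuses an answer until its TTL has elapsed."""

    def __init__(self, lookup, clock):
        self._lookup = lookup   # callable: name -> (value, ttl_seconds)
        self._clock = clock     # callable returning the current time in seconds
        self._entries = {}      # name -> (value, expires_at)

    def resolve(self, name):
        now = self._clock()
        entry = self._entries.get(name)
        if entry is not None and now < entry[1]:
            return entry[0]     # still fresh: served from cache
        value, ttl = self._lookup(name)  # expired or missing: ask upstream
        self._entries[name] = (value, now + ttl)
        return value
```

In this model a TTL of 0 means every resolve() call goes upstream, which is the effect an extremely low TTL on the SRV record is meant to approximate.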

What I'm still not sure about is whether people still use SRV records.
The only reference to a service that uses them I've found online is SIP :).

Thanks again,
Jason


Jason Rogena

Nov 2, 2020, 8:23:36 AM
to raft...@googlegroups.com
Holy smokes, rqlite looks like an awesome project! Thanks for sharing
the design links. Since mine is just a throw-away project I intend to
use to learn, I was extremely shy to share the repo (no tests, not much
documentation). It seems, though, that I can get quite a lot of constructive
criticism here (and from you Philip). The repo is
https://github.com/jasonrogena/lightraft/ .

Using an external service like discovery.rqlite.com seems reasonable.
I'll definitely give it a try. Hope you don't mind non-rqlite clusters
using your discovery service.

Cheers,
Jason