Re: Iris Questions


Péter Szilágyi

Apr 28, 2014, 8:21:01 AM
to Brian Jones, projec...@googlegroups.com
Hi Brian,

On Sat, Apr 26, 2014 at 12:14 PM, Brian Jones <mojo...@gmail.com> wrote:
Hello Peter,

I spent this afternoon reading your white papers, slides, and watching
videos about Iris.  Extremely exciting project!

Glad you like it :) Sorry for not replying earlier, I was preoccupied during the weekend.
 
I am still left with a few questions though.  If you have time maybe you
could answer them for me?

In your browser/webserver example, I understood that there was no need
for a load balancer.  However, given a real world scenario where people
connect through a FQDN, usually those requests go through a LB.  How
would you handle this?  It seems you still have to point the LB to each
physical network address in a cluster to handle internet facing
requests.

If you have front-facing servers/balancers that aren't running Iris, then yes, you'll need to take care of getting those requests into the Iris service/network. I've implemented a small service myself on top of Iris (http://www.regionrank.com/, which I hope to have time to work on again soon). It has one gateway node (the receptionist from this slide: http://iris.karalabe.com/talks/fosdem.slide#23) balancing requests among the "librarians". All nodes are running Iris, so the architecture is pretty straightforward. Of course this is just a small project, so it may need more sophisticated mechanisms as it grows, but for now it works nicely.
 
In regards to broadcasting and channels.  Is there a limit to the number
of channels which can be created?  For example, given a multi-million
user service like LINE or WhatsApp, would it make sense to broadcast to
small discrete channels which correspond to small groups of users?  Is
the overhead cost of broadcasting to potentially millions of channels
high?

Yes, there is a limit to the number of application clusters and/or topics, although it is not an explicit one. Iris internally uses hashes of cluster/topic names for routing to the correct locations. Originally Pastry and Scribe, the overlay networks underneath Iris, used 128-bit hashes, but that is gigantic even at Internet scale, so I reduced it (currently to 40 bits). This means that you can easily have tens of thousands of topics without hash collisions, but not many millions.

Increasing the hash size would allow plenty more topics, but larger hashes put extra routing strain on the system. I can imagine a solution that separates the topic/cluster hashes from the routing hashes so the two can scale differently. It shouldn't be *too* hard to do, but there are a few more pressing issues.
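
As a rough illustration of why 40 bits is comfortable for tens of thousands of names but not for millions, the birthday bound can be evaluated directly. This is a sketch that is not taken from the Iris code base: the function name collisionProbability is hypothetical, and it assumes names hash uniformly into the space.

```go
package main

import (
	"fmt"
	"math"
)

// collisionProbability approximates the chance that at least two of n
// uniformly hashed names share a value in a hash space of the given width,
// using the birthday bound p ≈ 1 - exp(-n(n-1) / 2^(bits+1)).
// Assumption: hashes are uniformly distributed (not verified for Iris).
func collisionProbability(n float64, bits uint) float64 {
	space := math.Pow(2, float64(bits))
	// -Expm1(-x) computes 1 - exp(-x) without losing precision for tiny x.
	return -math.Expm1(-n * (n - 1) / (2 * space))
}

func main() {
	for _, n := range []float64{1e4, 5e4, 1e6, 5e6} {
		fmt.Printf("%8.0f names, 40-bit hash: p(collision) ≈ %.6f\n",
			n, collisionProbability(n, 40))
	}
}
```

Under that assumption the chance of any collision stays far below one percent at ten thousand names and climbs past a third at one million, which is consistent with the limits described above.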
 
For example, for each chat channel a user is in I would make multiple
connections for a user like this?

conn1, _ := iris.Connect(55555, "unique-id-1", new(echo))
conn2, _ := iris.Connect(55555, "unique-id-2", new(echo))
conn3, _ := iris.Connect(55555, "unique-id-3", new(echo))
...
connN, _ := iris.Connect(55555, "unique-id-N", new(echo))

Would pub/sub be a better solution here?

I'd put Iris a bit below this level. I meant the clusters/topics to support the architecture of a distributed application on top of which you could add your own custom logic. Having millions of clusters would be way too expensive, since there's an active load balancer behind them which is constantly measuring system load and propagating it through the system. The more clusters you have, the more "noise" this results in.

Topics would be a much better option since they are not load balanced (note, the current implementation uses the clusters internally, so it actually *is* load balanced, but that will be removed eventually when I find the time).
 
Finally, you addressed this in your FOSDEM talk, but I would like to
double check.  The burden of guaranteeing a message delivery is at the
application level?  I get the feeling a broadcast structure would
probably not work very well for this if I wanted guaranteed chat message
delivery.

You are correct. If you need guarantees, then it's pretty much up to you. You can find a short rationale behind not guaranteeing anything on page six in my Iris design paper. In short, at the messaging layer I cannot decide what to do in case of a lost message since I do not know whether it is idempotent or not. This is a topic I'd like to expand on a bit, but again, time time time :)
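
One common way to layer delivery guarantees on top of a best-effort transport is for the sender to retransmit until acknowledged and for the receiver to drop duplicates by message ID, which makes redelivery idempotent. The sketch below shows only the receiver side in plain Go. It does not use any Iris API, and the Message and Deduper types are hypothetical names introduced for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// Message is a hypothetical application-level envelope. The sender picks
// a unique ID and reuses it on every retransmission of the same message.
type Message struct {
	ID   string
	Body string
}

// Deduper remembers which message IDs were already processed, so that a
// retransmitted message is acknowledged again but not applied twice.
type Deduper struct {
	mu   sync.Mutex
	seen map[string]bool
}

// NewDeduper returns an empty, ready-to-use Deduper.
func NewDeduper() *Deduper {
	return &Deduper{seen: make(map[string]bool)}
}

// Accept reports whether msg is being seen for the first time.
// It is safe for concurrent use.
func (d *Deduper) Accept(msg Message) bool {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.seen[msg.ID] {
		return false
	}
	d.seen[msg.ID] = true
	return true
}

func main() {
	d := NewDeduper()
	incoming := []Message{
		{ID: "1", Body: "hi"},
		{ID: "2", Body: "there"},
		{ID: "1", Body: "hi"}, // retransmission of the first message
	}
	for _, m := range incoming {
		fmt.Printf("message %s accepted: %v\n", m.ID, d.Accept(m))
	}
}
```

The seen set grows without bound in this sketch; a real service would expire old IDs or track per-sender sequence numbers instead.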


Thanks for your time!

Regards,

Brian Jones

So in short, the current cluster/topic limit is around tens of thousands (extendable, but that requires a bit of work). Clusters should not be proliferated since they generate background traffic, but many topics should be fine (once the load balancer is removed from the topics). Guarantees are, for now, up to the user of the system (maybe some guarantees will be introduced eventually, but I'd like to keep the system as simple as possible and only include things that are truly needed and properly covered).

I'm happy to answer all questions, so fire away if in doubt :D

Cheers,
  Peter

PS: I've CC'd the Iris mailing list to keep this conversation public so it can act as a small knowledge base.
