Most efficient way to implement PubSub pattern using channels

1,278 views
Skip to first unread message

zoha...@gmail.com

unread,
Dec 25, 2016, 4:13:06 PM12/25/16
to golang-nuts
Hey guys,

 So for a while I have been working on a chat system that is written in Go and it targets itself to run efficiently on low resources systems like Raspberry Pi (under 512MB) and handle hundreds of thousands of connections easily (for really curious people URL is http://beta.raspchat.com and yes it's open-source so contributions are welcome). I have been deeply vested in PubSub part of the problem, where a user can subscribe to multiple chat rooms and when anyone sends a message to user its broadcasted to everyone in room. Over time with multiple iterations this is my solution so far. 

  • There is a global map of chat rooms (channels) containing map of [name] -> RoomInfo. 
  • RoomInfo contains an array of channels (for users who have joined the room) to send message back to users (call it outbounds), and a channel (call it inbound) to receive messages from users.
  • Upon joining a room if it didn't existed the structure is initialized properly and kept in dictionary against name of the channel. Everytime a new room is created a new go routine is started to receive messages from inbound channel to the room and then iterate sending copy of message all outbounds.
  • Each new user creates a channel that can receive messages directed for user (call it user_outbound), and when he/she joins a room, he/she keeps reference to inbound and adds own channel to outbounds array.
  • Each user also spins of a goroutine listening to any message coming from user_outbound, and processing it appropriately (Sends JSON over websocket) and sending messages to given channel using the channel's inbound reference we saved earlier.
 So far everything is like direct pipe and dictionaries (I used race condition free datatypes, but let's not get into that details) moving messages around using channels. But I was put off with my solution when I benchmarked it against nginx's nchan. I was consuming way more memory than what it was consuming and way more CPU than what I had implemented. Which made me go away from problem for a while and come back with a fresh mind. Now that I am looking at problem it made me wonder what is the most efficient way to implement just a pub/sub pattern among goroutines. I can see some libraries out there but reading code none seem to be far different than what I am thinking (or I might be looking at limited implementations). Any clues are greatly appreciated. 

Thanks,
Zohaib

Val

unread,
Dec 25, 2016, 5:15:15 PM12/25/16
to golang-nuts
Hi Zohaib
Obviously you've already put much more effort than myself in this pub/sub problem, so instead of authoritative answers I'll give :

- first, it looks like a single-machine server. A distributed server would look very different (goroutines work only on single machine, sharding, routing and communication between machines must be handled, etc.).

- second, did you benchmark under low load, normal load or high load? What quantities are tested : number of char room, number of users, number of messages per second?  Did this load reflect the expected load of your chat system project, or did you just want to know "who can handle the bigger load, between ngninx and the go program?".
I would suspect that 2 or 3 goroutines per new user, which looks very sensible to me, still incurs a cost in memory (a few KB stack) and CPU (scheduler stress), and that cost may become the bottleneck when the benchmark generates a lot of new incoming users. I also suspect that the same behavior can be achieved in go with only 1 (or a small number of) worker goroutines, in which all the "dispatch" code is map-based.
On the other hand, if you have a fixed number of users, and then a big salvo of message requests generates unexpectedly high resource consumption (much more than nginx), then I have no clue. I'm subscribing to the "answer" channel, as a reader.

Cheers
Val

Zeeshan Arif

unread,
Dec 26, 2016, 12:25:00 AM12/26/16
to golang-nuts
Maybe it's because you have 1outbound channel per user, is there anyway to have a partitioned channel?
-zeesh49

Zeeshan Arif

unread,
Dec 26, 2016, 12:25:00 AM12/26/16
to golang-nuts
Hi Zohaib,

Having a 1 outbound / user might be the issue, you should look into channel partitioning.

thanks
-Zeeshan

John Souvestre

unread,
Dec 26, 2016, 5:35:46 AM12/26/16
to golang-nuts
What is a partitioned channel?

John

John Souvestre - New Orleans LA
--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Egon

unread,
Dec 26, 2016, 6:16:11 AM12/26/16
to golang-nuts
As the proper answer requires proper analysis, general suggestions:

1. Figure out what is the bare minimum resource usage (memory and cpu) without using any goroutines -- directly using sockets. This gives you the baseline for understanding what measurements are important in the grander schema. Also it (somewhat) shows you how good can you get in Go.
2. Figure out how much each part costs you in terms of resource usage. i.e. channels, bolt, tls, json encoding/decoding etc.
3. Optimize the things that matter relative to the baseline.
4. Consider batching messages sent to the client.
5. Consider using less allocations.

I suspect that using 

Room: two inbound channels for messages and registration events
User: single outbound message-queue (https://play.golang.org/p/u_nMJPHaU_), this makes batching easier

1 or 2 go-routines per User
1 go-routine per Room

PS: lock-free data-structures are good where you have high contention... but I suspect this is not the case for you -- it will be sufficient to use a mutex or rwmutex or channels.

+ Egon

zoha...@gmail.com

unread,
Dec 26, 2016, 2:19:11 PM12/26/16
to golang-nuts
Hey everyone thanks for suggestions I will look into suggestions by Zeeshan and Egon and come back with results. Is there any library that already does that? I imagine it's such a common pattern one should have a library with interface doing send receive in an efficient manner. May be I am spoiled by Erlang having builtin stuff like GenServer etc.

zoha...@gmail.com

unread,
Dec 26, 2016, 2:21:42 PM12/26/16
to golang-nuts
Hey Zeeshan, 

 What are channel partitions? I am hearing term for first time. Any resources or links?

Justin Israel

unread,
Dec 26, 2016, 5:35:13 PM12/26/16
to zoha...@gmail.com, golang-nuts


On Tue, Dec 27, 2016, 8:21 AM <zoha...@gmail.com> wrote:
Hey Zeeshan, 

 What are channel partitions? I am hearing term for first time. Any resources or links?

I get the sense that he meant multiplexing outgoing user messages across one channel. Not like it's some special official construct called a "partitioned channel" 

--

Tamás Gulácsi

unread,
Dec 27, 2016, 12:29:41 AM12/27/16
to golang-nuts
You can use reflect to select from dynamically changing slice of channels, to lessen the number of goroutines.
You can use a fixed number of channels, and multiplex on them (one incoming and one outgoing channel, encapsulate the messages with header for destination), to lessen the number of channels.

But these may not help - maybe you have overseen sth, overcomplicated, or just have an error somewhere.
So if your code seems correct and you like it, then first measure: profile it first when idle, then under pressure.
Go has nice profiling built-in, see net/http/profiling

Reply all
Reply to author
Forward
0 new messages