Scale and performance expectation of NATS server clustering


Xintong Zhou

Feb 19, 2018, 11:54:38 AM
to nats
Hi,

We are evaluating NATS for our dynamic message switch and would like to better understand how well NATS performs with millions of subjects and frequent SUB and UNSUB operations.

A simplified version of our use case:
1. Clients connect and disconnect frequently and exchange messages with each other.
2. Each client is mapped to a NATS subject.
3. Clients connect directly to API servers (gateways), and the API servers maintain connections to the NATS servers.
4. When a client connects to an API gateway, the gateway subscribes to the subject that belongs to that client, and the client starts receiving messages.
5. When a client disconnects from the API gateway, the gateway removes the subscription for that client.

We would like to know:

1. Is this a reasonable design with NATS?
2. How many subjects and subscriptions can a single NATS server typically support? Do they scale easily with NATS server clustering?
3. Will frequent SUB and UNSUB operations cause performance issues for a single NATS server and/or a NATS server cluster? After carefully reading the NATS server protocol, it seems to me that frequent SUB and UNSUB operations will generate broadcast traffic between servers. Is that correct?
4. What is the normal message delivery latency with a single NATS server and with a NATS server cluster?

We really appreciate your insights.

Thank you!

co...@synadia.com

Feb 19, 2018, 2:06:54 PM
to nats
Hi, 

Thank you for your interest in NATS!

Generally, scalability questions are very difficult to answer in the abstract: the answer depends on the data throughput you want to sustain, the hardware resources available, the number of connections, and, in your case, the number and frequency of subscriptions. Depending on these factors, this design could work for you. I'd suggest running some tests in your environment to see where the performance boundaries lie. If you are talking about thousands of subscriptions, this will likely work. If the subscription count would be in the millions, you'll likely want to take a different approach.

Can you describe your subject namespace in more detail? One approach would be tokenizing your subject namespace and using wildcard subscriptions (more info here) to cover a group of clients in each API gateway. This would reduce the subject namespace NATS maintains and reduce the amount of subscription-related chatter (SUB/UNSUB messages) in a NATS cluster. Instead of many short-lived, highly granular subscriptions, you'll have fewer, longer-lived wildcard subscriptions.

For example, if the NATS subject to address a client was "region1.subregion2.client-1234567", your API gateway could subscribe to "region1.subregion2.*". It would receive all messages for clients in "region1.subregion2". The API gateway could then inspect the subject of each message it receives, take the last token ("client-1234567" in this example), and route the message to the correct client.
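To make the routing step concrete, here is a minimal Python sketch of the gateway side. It is only an illustration: the Gateway class, its method names, and the callback shape are hypothetical and not part of any NATS client API. With a real client library, on_message would be the handler you register for the wildcard subscription.

```python
from typing import Callable, Dict


class Gateway:
    """Hypothetical API gateway that fans one wildcard subscription out to local clients."""

    def __init__(self, prefix: str) -> None:
        # Subject prefix shared by all clients of this gateway, e.g. "region1.subregion2".
        self.prefix = prefix
        # Last subject token (e.g. "client-1234567") -> delivery callback for that connection.
        self.clients: Dict[str, Callable[[bytes], None]] = {}

    def subscription_subject(self) -> str:
        # "*" matches exactly one token, so one subscription covers every client under the prefix.
        return self.prefix + ".*"

    def connect(self, client_id: str, deliver: Callable[[bytes], None]) -> None:
        # Only the local table changes; no new NATS subscription is created per client.
        self.clients[client_id] = deliver

    def disconnect(self, client_id: str) -> None:
        self.clients.pop(client_id, None)

    def on_message(self, subject: str, payload: bytes) -> bool:
        """Route one message by its last subject token. Returns True if it was delivered."""
        head, _, client_id = subject.rpartition(".")
        if head != self.prefix:
            return False  # not under this gateway's prefix
        deliver = self.clients.get(client_id)
        if deliver is None:
            return False  # client is not connected to this gateway right now
        deliver(payload)
        return True
```

Client connects and disconnects then only touch the gateway's in-memory table, while the single wildcard subscription stays in place for the lifetime of the gateway. The gateway does receive messages for clients that are not currently connected, so it has to decide what to do with those (this sketch just drops them).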

We have plans for a NATS feature to accommodate use cases like this, but we don't have a target date yet.

Here are some answers inline.

1. Is this a reasonable design with NATS?

This was covered above. It may be, but it depends on the number of unique subjects you'll have.
 
2. How many subjects and subscriptions can a single NATS server typically support? Do they scale easily with NATS server clustering?

It depends on your throughput needs and CPU/memory resources. I've had 30M idle subscriptions take up ~16GB of memory, although I wouldn't expect high throughput from that server.
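As a back-of-the-envelope check on that figure, the short Python sketch below turns "30M idle subscriptions in ~16GB" into a rough per-subscription cost and extrapolates from it. The assumptions are loud ones: GB is taken as 10^9 bytes, memory is assumed to scale linearly with subscription count, and the number covers idle subscriptions only, so treat the result as an order-of-magnitude guide and not a sizing rule.

```python
GB = 10 ** 9  # assumption: decimal gigabytes; binary GiB would give a slightly higher figure


def bytes_per_subscription(total_bytes: float, subscriptions: int) -> float:
    """Average memory cost of one subscription from an observed total."""
    return total_bytes / subscriptions


def estimate_memory_gb(subscriptions: int, per_sub_bytes: float) -> float:
    """Linear extrapolation (an assumption) to a different subscription count."""
    return subscriptions * per_sub_bytes / GB


# Observed data point from the thread: ~16GB for 30M idle subscriptions.
per_sub = bytes_per_subscription(16 * GB, 30_000_000)

# Rough estimate for 5M idle subscriptions on similar hardware.
estimate_5m = estimate_memory_gb(5_000_000, per_sub)
```

This works out to a little over 500 bytes per idle subscription. Subject length, the client connections themselves, and message buffers all move the real number, so measure in your own environment.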
 
3. Will frequent SUB and UNSUB operations cause performance issues for a single NATS server and/or a NATS server cluster? After carefully reading the NATS server protocol, it seems to me that frequent SUB and UNSUB operations will generate broadcast traffic between servers. Is that correct?

This is correct. However, if you can design your subject namespace to use wildcards, as described above, you'll drastically reduce the CPU, memory, and network resources you'll need.
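To get a feel for the difference in subscription chatter, the Python sketch below counts subscription operations at one gateway under both designs. It is a simplified model built on assumptions: in the per-client design every connect becomes one SUB and every disconnect one UNSUB, and in the wildcard design the gateway issues one long-lived SUB per subject prefix and never unsubscribes. The event format and function names are made up for this illustration.

```python
from typing import Iterable, List, Tuple

# (action, subject), where action is "connect" or "disconnect" (hypothetical event format).
Event = Tuple[str, str]


def per_client_ops(events: Iterable[Event]) -> int:
    """One SUB per connect and one UNSUB per disconnect."""
    return sum(1 for action, _ in events if action in ("connect", "disconnect"))


def wildcard_ops(events: Iterable[Event]) -> int:
    """One long-lived SUB per distinct prefix (everything before the last token)."""
    prefixes = {subject.rpartition(".")[0] for _, subject in events}
    return len(prefixes)


def sample_events(prefix: str, clients: int) -> List[Event]:
    """Each client connects once and disconnects once."""
    events: List[Event] = []
    for i in range(clients):
        subject = f"{prefix}.client-{i}"
        events.append(("connect", subject))
        events.append(("disconnect", subject))
    return events


events = sample_events("region1.subregion2", 500) + sample_events("region1.subregion3", 500)
granular = per_client_ops(events)   # grows with every connect and disconnect
coarse = wildcard_ops(events)       # depends only on the number of prefixes
```

In this model the per-client count keeps growing with client churn, while the wildcard count depends only on how many prefixes the gateway serves. In a cluster, each of those operations is also propagated between servers, which is the broadcast traffic asked about above.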
 
4. What is the normal message delivery latency with a single NATS server and with a NATS server cluster?

It really depends on where the servers are located, the current load on the servers, and network characteristics. I'd suggest testing in your environment.

I hope these answers help.  Would you mind explaining your use case in more detail?

Thanks,
Colin