Nats streaming cluster


Henrik Johansson

unread,
Sep 29, 2016, 3:03:20 PM9/29/16
to nats
Hi,

I am curious how the clustering support for NATS Streaming is coming along.

I mean not just connecting to a clustered NATS, but also replicating messages.
I may just be assuming that capability is part of the roadmap, but I have a vague memory of it being mentioned at GopherCon.
We could probably run part of our payloads unclustered because they are easy to regenerate in case of loss, but some probably not.

Thx,
Henrik

Ivan Kozlovic

unread,
Sep 29, 2016, 3:08:15 PM9/29/16
to nats
Hello Henrik,

Yes, clustering as you describe (with log replication) is on the roadmap, but there is still a lot of work to do.
I don't have a date for availability of that feature, but it is what we are working on at the moment in NATS Streaming, so it has top priority.

Ivan.

Henrik Johansson

unread,
Sep 29, 2016, 3:23:07 PM9/29/16
to nats
Excellent Ivan, thank you!


Ravi Krishna

unread,
Sep 29, 2016, 10:08:00 PM9/29/16
to nats
Where can I find more roadmap information? I found a roadmap on GitHub, but it seems out of date.

Bryan Crosby

unread,
Oct 6, 2016, 1:45:28 PM10/6/16
to nats
+1, this would be of interest to me as well. To get high availability out of the box, it would be nice to have this supported without having to do it manually. A date, as soon as you can give one, would be awesome!

Bryan

akha...@fubo.tv

unread,
Jun 14, 2017, 5:35:53 PM6/14/17
to nats
Any updates on this matter? We are looking to set up NATS Streaming and would like to have a cluster option.



Rogerio Ferreira

unread,
Jun 15, 2017, 11:05:11 AM6/15/17
to nats
Is there any way we can get a feel for whether clustering support is 1-3 months away or 4-6+ months away, without making a hard and fast commitment? It would help us decide whether we should make alternate interim plans.

Colin Sullivan

unread,
Jun 15, 2017, 5:42:38 PM6/15/17
to nats
Rogerio,

Thank you for your interest in NATS!  We have done quite a bit of design work, and are starting dev - we're in the 4-6 months range you mentioned (without making a hard and fast commitment).

Note, we do now have Fault Tolerance (high availability) and Partitioning (scalability), which address some of the problems clustering solves.  Would you mind describing your use case?

Thanks,
Colin
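[For reference, a fault-tolerant pair as Colin describes can be started roughly like this. This is only a sketch: the flags match those used later in this thread, hostnames and ports are illustrative, and both processes must point at the same shared data directory.]

```
# Illustrative FT pair: both nodes MUST share the same file store directory.
nats-streaming-server -store file -dir /shared/nats_streaming_data -ft_group ft \
  -p 4222 -m 8222 -cluster nats://0.0.0.0:6222 -routes nats://node2:6222 &
nats-streaming-server -store file -dir /shared/nats_streaming_data -ft_group ft \
  -p 4223 -m 8223 -cluster nats://0.0.0.0:6223 -routes nats://node1:6222
```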

Anton Khabaiev

unread,
Jun 20, 2017, 1:42:25 PM6/20/17
to nats
Hi,

We are using NATS Streaming, and it looks like something is wrong with monitoring. Per the docs https://github.com/nats-io/nats-streaming-server#enabling I started the server with the -m 8444 arg, and I can see it logs:

[1] 2017/06/20 17:18:48.882315 [INF] Starting http monitor on 0.0.0.0:8444

But when I go to http://localhost:8444 I see the regular NATS monitoring, not streaming, and /streaming and /streaming/serverz return 404.

Please advise.

peter...@apcera.com

unread,
Jun 21, 2017, 8:36:47 AM6/21/17
to nats
Hi Anton,

I've seen this issue at times of high load on my laptop, as the routes for /streaming get added after the monitoring webserver is started. Are you seeing this consistently, well after initial startup?

If you are, can you send me your exact start command? If this continues to be a problem, I'll add a log notice when the streaming monitoring routes are added to the http monitoring server.
Pete

Anton Khabaiev

unread,
Jun 21, 2017, 10:56:00 AM6/21/17
to nats
Yes, I'm running two NATS Streaming servers in k8s on gcloud, and neither shows monitoring for streaming, only regular NATS. I'm starting a server with this exact command:
command: ["/nats-streaming-server"]
         args: 
          - "-p"
          - "4222"
          - "-m"
          - "8222"
          - "-store"
          - "file"
          - "-dir"
          - "/nats_streaming_data"
          - "-ft_group"
          - "ft"
          - "-cluster"
          - "nats://0.0.0.0:6222"
          - "-routes"
          - "nats://nats-streaming-2:6222"
          - "-DV"

peter...@apcera.com

unread,
Jun 21, 2017, 11:15:30 AM6/21/17
to nats
What version of nats-streaming-server is running? (eg.)

[88767] 2017/06/21 08:30:32.085092 [INF] STREAM: Starting nats-streaming-server[test-cluster] version 0.5.0

Prior versions of the server would still start the monitoring server, but only the latest adds the /streaming route with streaming metrics. We will be releasing the latest version next week and will update our nats-streaming docker image shortly afterwards.

Anton Khabaiev

unread,
Jun 21, 2017, 12:06:55 PM6/21/17
to nats
That would be this:
[1] 2017/06/21 16:05:39.343663 [INF] STREAM: Starting nats-streaming-server[test-cluster] version 0.4.0

Note I'm using the latest image from the Docker Hub repo, so it's whatever version is there.

peter...@apcera.com

unread,
Jun 21, 2017, 12:30:14 PM6/21/17
to nats
That version doesn't have the /streaming http endpoint. We'll notify back on this thread once we've released the new version and docker image.

Anton Khabaiev

unread,
Jun 21, 2017, 3:13:02 PM6/21/17
to nats
So I just built a Docker container from the current GitHub repo, and that resolved the monitoring issue for me. Now we have another big issue.
We've set up two nats-streaming-servers in an FT group, with one active and one standby. We also have a subscriber and a publisher connecting to the server. At first everything works: the publisher publishes messages and the client receives them.
But when we kill the active node, clients stop receiving messages. And when we bring the active node back online, the system is still stalled, meaning no messages get delivered.
Is there some additional client code we need to write to resubscribe to another server?
Example code we are using:
type NStreamingQueue struct {
    clientId string
    nc       *stan.Conn
}

func (n *NStreamingQueue) Publish(topic, message string) error {
    if len(message) == 0 {
        return nil
    }

    con := *n.nc

    if err := con.Publish(topic, []byte(message)); err != nil {
        fmt.Printf("Error pushing to nats: %s", err.Error())
        return err
    }

    return nil
}

func (n *NStreamingQueue) Consume(topic string, handler func(msg *stan.Msg)) error {
    con := *n.nc
    if sub, err := con.Subscribe(topic, handler, stan.DurableName(n.clientId)); err != nil {
        sub.Unsubscribe()
        fmt.Println("Error receiving a message: ", err.Error())
        return err
    }

    return nil
}

func (n *NStreamingQueue) Healthcheck() bool {
    return n.Healthcheck()
}

func NewNStreamingQueue(urls, clientId string) (*NStreamingQueue, error) {
    nc, err := stan.Connect("test-cluster", clientId, stan.NatsURL(urls))
    if err != nil {
        fmt.Println("Error connecting to nats: ", err.Error())
        return nil, err
    }

    return &NStreamingQueue{nc: &nc, clientId: clientId}, nil
}

Colin Sullivan

unread,
Jun 21, 2017, 4:31:39 PM6/21/17
to nats
Hi Anton,

Thank you for providing the code; I don't see any issues off-hand that would cause this.  Could you provide the output from your NATS streaming server with debug and trace enabled, both -DV -SDV (it will be a lot of output)?  Also, do you see the same behavior outside of docker (locally)?

Thanks,
Colin

Colin Sullivan

unread,
Jun 21, 2017, 4:55:07 PM6/21/17
to nats
Hi Anton,

One more thing...  You'll want to be sure your storage for both NATS streaming servers is shared (your "/nats_streaming_data" mount).  Otherwise, you will effectively have two separate stores and client state won't be shared.  This would likely create the behavior you describe.

Regards,
Colin

Anton Khabaiev

unread,
Jun 22, 2017, 11:22:49 AM6/22/17
to nats
Looks like that is what's happening. We deploy using k8s, and pods don't share storage by default. We could deploy two nats-streaming instances in one container so they share storage, but that kind of defeats the purpose of having a cluster if they are all in one container. Is there a recommended cluster (FT) setup for nats-streaming?

Colin Sullivan

unread,
Jun 22, 2017, 12:10:56 PM6/22/17
to nats
Glad you found the cause!  So long as your storage under NATS streaming is backed up, a pod restart should be roughly equivalent to the FT warm backup taking over.
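[For a k8s deployment like Anton's, the shared store could be a single volume mounted at /nats_streaming_data by both server pods. A hedged sketch only: the claim name is illustrative, and it assumes a storage class that supports the ReadWriteMany access mode.]

```yaml
# Illustrative only: one claim that both nats-streaming FT pods mount.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nats-streaming-data   # hypothetical name
spec:
  accessModes:
    - ReadWriteMany           # both pods need read/write access
  resources:
    requests:
      storage: 10Gi
```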

Ivan Kozlovic

unread,
Jun 28, 2017, 5:44:18 PM6/28/17
to nats
Anton,

Release v0.5.0 with monitoring is now available on docker hub: https://hub.docker.com/_/nats-streaming/

Ivan.