Forward client won't reconnect to riemann main server "no channels available"


Cellok

Jul 1, 2015, 4:45:53 AM
to rieman...@googlegroups.com
Hi Guys,

I've deployed a distributed Riemann cluster with DRBD, Corosync and Pacemaker, consisting of a Riemann "server" cluster (2 nodes, active/backup, with a shared virtual IP) and a Riemann "forward" cluster (2 nodes, active/backup, with a shared virtual IP). Now when I switch/migrate the backend Riemann "servers", the forward node gets a channel error and won't send any further events.

Thread-6 - riemann.core - instrumentation service caught
java.io.IOException: no channels available
        at com.aphyr.riemann.client.TcpTransport.sendMessage(TcpTransport.java:293)
        at com.aphyr.riemann.client.TcpTransport.sendMessage(TcpTransport.java:264)
        at com.aphyr.riemann.client.RiemannClient.sendMessage(RiemannClient.java:114)
        at com.aphyr.riemann.client.RiemannClient.sendEvent(RiemannClient.java:119)
        at riemann.client$send_event.invoke(client.clj:72)
        at riemann.streams$forward$stream__4775.invoke(streams.clj:1213)
        at riemann.core$stream_BANG_$fn__5678.invoke(core.clj:19)
        at riemann.core$stream_BANG_.invoke(core.clj:18)
        at riemann.core$instrumentation_service$measure__5687.invoke(core.clj:57)
        at riemann.service.ThreadService$thread_service_runner__3235$fn__3236.invoke(service.clj:71)
        at riemann.service.ThreadService$thread_service_runner__3235.invoke(service.clj:70)
        at clojure.lang.AFn.run(AFn.java:22)
        at java.lang.Thread.run(Thread.java:745)

Is there a way to tell the forwarder to try to reconnect? Or maybe I just have to wait a little while and it reconnects automatically?

Any help would be appreciated.

Thanks,
Marcel

Aphyr

Jul 1, 2015, 10:59:22 AM
to rieman...@googlegroups.com

It should reconnect within five seconds, but you'll see send errors for a brief time. If it doesn't, that's a riemann-java-client bug.
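For a custom Java sender, the brief window of send errors described above can be ridden out by retrying the send a few times with a pause in between. The following is only a minimal sketch under stated assumptions: `RetrySend` and `sendWithRetry` are hypothetical names, and the `Callable` stands in for whatever actually sends the event. No riemann-java-client calls are used here.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of riemann-java-client.
final class RetrySend {

    /**
     * Calls send, retrying when it throws IOException, up to maxAttempts
     * times in total, sleeping pauseMillis between attempts. The last
     * IOException is rethrown if every attempt fails.
     */
    static <T> T sendWithRetry(Callable<T> send, int maxAttempts, long pauseMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return send.call();
            } catch (IOException e) {
                last = e;
                if (attempt < maxAttempts) {
                    Thread.sleep(pauseMillis); // give the transport time to reconnect
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated sender: fails twice with the error from the thread, then succeeds.
        String result = sendWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("no channels available");
            }
            return "sent";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With a one-second pause and six or more attempts, the retries would span the five-second reconnect window mentioned above. If sends still fail after that, the client has probably not reconnected at all.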


Cellok

Jul 2, 2015, 2:30:47 AM
to rieman...@googlegroups.com
Hi Aphyr,

Unfortunately, it does not reconnect, and the exception above appears only once. The forward client also stops accepting events from collectd clients. This is my forward config:

; -*- mode: clojure; -*-
; vim: filetype=clojure

(logging/init {:file "/var/log/riemann/riemann.log"})

; Listen on the local interface over TCP (5555), UDP (5555), and websockets
; (5556)
(let [host "10.70.10.33"]
  (tcp-server {:host host})
  (udp-server {:host host})
  (ws-server  {:host host}))

(load-plugins)
(let [index (default :ttl 60 (update-index (index)))]
  (streams
    #(info %)
    (with {:metric 1 :host nil :state "ok" :service "events/sec"} (rate 1 index))
    (let [client (tcp-client :host "10.70.10.32")]
      (forward client))))

Thanks,

Cellok

Jul 3, 2015, 2:51:34 AM
to rieman...@googlegroups.com
Well, this is pretty strange... my first setup was on Ubuntu Server; now I've built a cluster using Fedora Server and everything works out of the box... the client reconnects.

Peter Neubauer

Jul 28, 2015, 11:09:22 AM
to Riemann Users, mkemp...@googlemail.com
Hi there,
I'm trying to send reports from within Apache Storm and get something similar. With

ArrayList<Proto.Event> events = new ArrayList<>();
if (rClient == null) {
    rClient = RiemannClient.tcp(riemann_host, riemann_port);
    rClient.connect();
}
if (!rClient.isConnected()) {
    rClient.reconnect();
}
long now = new Date().getTime();
while (!rClient.isConnected() && (new Date().getTime() - now) < TIMEOUT) {
    // wait to get connected
}
....

rClient.sendEvents(events).deref(5000, TimeUnit.MILLISECONDS);
 

I get that same IOException, using riemann-java-client 4.0. Is this anything that others are experiencing, too?

/peter
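
One thing worth noting about the snippet above: the wait loop spins without sleeping, and when TIMEOUT elapses the code goes on to send anyway, which may be one way to end up with the IOException. A minimal sketch of a sleeping wait that reports whether the condition was met follows. `ConnectionWait` and `waitUntil` are hypothetical names, not part of riemann-java-client, and the condition is a plain `BooleanSupplier` so the example runs without a Riemann server.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of riemann-java-client.
final class ConnectionWait {

    /**
     * Polls condition every pollMillis until it returns true or
     * timeoutMillis has elapsed. Returns true if the condition was met,
     * false on timeout, so the caller can decide whether to send at all.
     */
    static boolean waitUntil(BooleanSupplier condition, long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out while still not connected
            }
            Thread.sleep(pollMillis); // yield the CPU instead of busy-waiting
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Simulated connection that comes up after about 50 ms.
        boolean up = waitUntil(() -> System.currentTimeMillis() - start >= 50, 1000, 10);
        System.out.println("connected: " + up);

        // Simulated connection that never comes up.
        boolean never = waitUntil(() -> false, 100, 10);
        System.out.println("connected: " + never);
    }
}
```

With the client from the snippet above, the condition would be `rClient::isConnected`, and the call to `sendEvents` would be skipped (or the batch kept for later) whenever `waitUntil` returns false.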