Collectd - carbon relay - skyline Pipeline


johnu george

Jan 25, 2015, 5:39:32 PM
to skyli...@googlegroups.com
Hi All,

I am using collectd to collect system metrics. These are sent to carbon-relay, which forwards them to the Skyline listener. However, Skyline is not getting any data from carbon. collectd, carbon, Graphite and Skyline all run on the same host (172.29.86.27).

carbon.conf

[cache]

STORAGE_DIR    = /var/lib/graphite/
CONF_DIR       = /etc/carbon/
LOG_DIR        = /var/log/carbon/
PID_DIR        = /var/run/

LOCAL_DATA_DIR = /var/lib/graphite/whisper/
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2013

ENABLE_UDP_LISTENER = True
UDP_RECEIVER_INTERFACE = 0.0.0.0
UDP_RECEIVER_PORT = 2013

PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2014

[relay]
LINE_RECEIVER_INTERFACE = 0.0.0.0
LINE_RECEIVER_PORT = 2003
PICKLE_RECEIVER_INTERFACE = 0.0.0.0
PICKLE_RECEIVER_PORT = 2004
ENABLE_UDP_LISTENER = True
UDP_RECEIVER_INTERFACE = 0.0.0.0
UDP_RECEIVER_PORT = 2003


RELAY_METHOD = rules
USE_FLOW_CONTROL = False

DESTINATIONS = 172.29.86.27:2024, 127.0.0.1:2014



relay-rules.conf

[all]
pattern = .*
destinations = 172.29.86.27:2024, 127.0.0.1:2014

[default]
default = true
destinations = 172.29.86.27:2024, 127.0.0.1:2014


collectd.conf

<Plugin write_graphite>
        <Node "graphing">
                Host "172.29.86.27"
                Port "2003"
                Protocol "udp"
                LogSendErrors true
                Prefix "collectd"
                Postfix "collectd"
                StoreRates true
                AlwaysAppendDS false
                EscapeCharacter "_"
        </Node>
</Plugin>
 
src/settings.py

GRAPHITE_HOST = '172.29.86.27'
CARBON_PORT = 2013

OCULUS_HOST = ' '

PICKLE_PORT = 2024

# This is the port that listens for Messagepack-encoded UDP packets.
UDP_PORT = 2025

WEBAPP_IP = '172.29.86.27'

# The port for the webapp
WEBAPP_PORT = 1500




In my default collectd.conf, the Graphite port was set to 2013 (the carbon-cache UDP port). Later, to make carbon-relay the primary listener feeding Skyline, I changed the port in collectd.conf to 2003, but still nothing was forwarded to Skyline. In both cases I can see the whisper files being updated in /var/lib/graphite, but I could not see any pickle files in that folder. Is that the issue?
When I run tcpdump -i lo port 2003, I can see UDP packets, so collectd is working fine, but carbon-relay or the Skyline listener has some problem receiving or sending data. That might be the reason the Webapp UI looks empty.

How can I debug?
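
One way to narrow this down is to check which TCP ports in the pipeline are actually accepting connections, since carbon-relay forwards to its destinations over TCP. Below is a minimal Python sketch, assuming the addresses and ports from the configs above. The function names and the ENDPOINTS table are made up for illustration and are not part of carbon or Skyline.

```python
import socket

# Assumed from the configs in this post; adjust to your own setup.
ENDPOINTS = [
    ("carbon-relay line receiver", "172.29.86.27", 2003),
    ("carbon-cache pickle receiver", "127.0.0.1", 2014),
    ("skyline pickle listener", "172.29.86.27", 2024),
]


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def report(endpoints, timeout=1.0):
    """Return one human-readable status line per (name, host, port) entry."""
    lines = []
    for name, host, port in endpoints:
        state = "open" if port_is_open(host, port, timeout) else "closed or unreachable"
        lines.append("%s (%s:%d): %s" % (name, host, port, state))
    return lines
```

Running print("\n".join(report(ENDPOINTS))) on the host shows which hop is not listening. This only covers TCP; the UDP listeners on 2003 and 2013 cannot be tested this way.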

I ran seed_data.py, and it reports that the pipeline is working.

python seed_data.py
Loading data over UDP via Horizon...
Connecting to Redis...
Congratulations! The data made it in. The Horizon pipeline seems to be working.


Here are the analyzer, horizon and webapp logs:

==> /var/log/skyline/analyzer.log <==
2015-01-25 13:25:34 :: 12633 :: seconds to run    :: 0.12
2015-01-25 13:25:34 :: 12633 :: total metrics     :: 12
2015-01-25 13:25:34 :: 12633 :: total analyzed    :: -2
2015-01-25 13:25:34 :: 12633 :: total anomalies   :: 0
2015-01-25 13:25:34 :: 12633 :: exception stats   :: {'Boring': 3, 'Stale': 11}
2015-01-25 13:25:34 :: 12633 :: anomaly breakdown :: {}
2015-01-25 13:25:34 :: 12633 :: sleeping due to low run time...

==> /var/log/skyline/horizon.log <==
2015-01-25 13:25:36 :: 15330 :: operated on metrics. in 0.036183 seconds
2015-01-25 13:25:36 :: 15330 :: metrics. keyspace is 12
2015-01-25 13:25:36 :: 15330 :: blocked 0 times
2015-01-25 13:25:36 :: 15330 :: euthanized 0 geriatric keys
2015-01-25 13:25:36 :: 15330 :: sleeping due to low run time...
2015-01-25 13:25:42 :: 12581 :: worker queue is empty and timed out
2015-01-25 13:25:42 :: 12580 :: worker queue is empty and timed out

==> /var/log/skyline/webapp.log <==
10.89.6.232 - - [25/Jan/2015 13:34:32] "GET / HTTP/1.1" 200 -
10.89.6.232 - - [25/Jan/2015 13:34:32] "GET /static/dump/anomalies.json?callback=jQuery1830010850393446162343_1422093220203&_=1422224689300 HTTP/1.1" 200 -
10.89.6.232 - - [25/Jan/2015 13:34:32] "GET /api?metric=metrics.horizon.test.udp HTTP/1.1" 200 -

Additional observations

I have also seen new whisper files created under the carbon directory. This might help in debugging.

/var/lib/graphite/whisper/carbon/relays/bd-1-2-a/destinations/127_0_0_1\:2014\:None/
    attemptedRelays.wsp
    queuedUntilReady.wsp
    sent.wsp

/var/lib/graphite/whisper/carbon/relays/bd-1-2-a/destinations/172_29_86_27\:2024\:None/
    attemptedRelays.wsp
    fullQueueDrops.wsp
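
Since the 172_29_86_27:2024 destination shows fullQueueDrops.wsp but no sent.wsp, it may be worth testing the Skyline pickle listener directly, bypassing the relay. Below is a rough Python sketch, assuming the listener on PICKLE_PORT accepts the same length-prefixed pickle framing that carbon uses between relay and cache. The helper names and the metric name in the usage note are made up.

```python
import pickle
import socket
import struct
import time


def build_pickle_message(datapoints):
    """Frame [(path, (timestamp, value)), ...] as a 4-byte length header plus pickle."""
    payload = pickle.dumps(datapoints, protocol=2)
    header = struct.pack("!L", len(payload))  # unsigned 32-bit, network byte order
    return header + payload


def send_test_metric(host, port, path, value, timeout=2.0):
    """Send a single datapoint over TCP; raises OSError if the port is not reachable."""
    message = build_pickle_message([(path, (int(time.time()), value))])
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(message)
```

For example, send_test_metric("172.29.86.27", 2024, "test.relay.bypass", 1.0) should either raise a connection error, which would point at the listener, or make the metric appear in the horizon log and keyspace count.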



Thanks for your help.







earthgecko

Jan 26, 2015, 2:45:50 PM
to skyli...@googlegroups.com
Hi Johnu

Have you won > 24hrs later?

The only thing I could suggest from the above is that:

1) collectd is not sending data to graphite; and/or

2) your relay-rules.conf should read:

relay-rules.conf

###########################


[all]
pattern = .*
destinations = 172.29.86.27:2024, 127.0.0.1:2014

[default]
default = true
destinations = 172.29.86.27:2024:a, 127.0.0.1:2014:a

############################

The additional :a is the name of the default carbon-cache instance, in Graphite terminology. The relay is designed to relay to multiple carbon-cache instances, which is why your metric namespaces are "bd-1-2-a": the a is the cache instance (the writers of Graphite could not necessarily have foreseen Skyline when they envisaged multiple carbon caches). I am not certain that the cache instance name is required, but I added it to ours for a reason that I no longer recall.
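
To illustrate the host:port:instance form, here is a small Python sketch that splits a destinations string into its parts. It is not carbon's own parser and the function names are made up; it only shows how a missing instance ends up as None, which matches the "127_0_0_1:2014:None" directory names in the whisper listing above.

```python
def parse_destinations(spec):
    """Split 'host:port[:instance], ...' into (host, port, instance) tuples."""
    destinations = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        parts = item.split(":")
        if len(parts) == 2:
            host, port = parts
            instance = None  # no instance name given
        elif len(parts) == 3:
            host, port, instance = parts
        else:
            raise ValueError("expected host:port[:instance], got %r" % item)
        destinations.append((host, int(port), instance))
    return destinations


def stats_name(destination):
    """Build a name shaped like the per-destination whisper directories above."""
    host, port, instance = destination
    return "%s:%d:%s" % (host.replace(".", "_"), port, instance)
```

With the original config, parse_destinations("172.29.86.27:2024, 127.0.0.1:2014") gives an instance of None for both entries, and adding :a replaces that None with "a".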

That, and the fact that users expect to see things happen immediately, whereas the Skyline analyzer will only kick in once there is FULL_DURATION seconds' worth of data in Redis.
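
As a rough way to check that condition for one metric, here is a Python sketch. It assumes the time series is a list of (timestamp, value) pairs, oldest first, and that you pass in the FULL_DURATION value from your own settings.py. It is not Skyline's code and the function name is made up.

```python
import time


def covers_full_duration(timeseries, full_duration, now=None):
    """Return True if the oldest datapoint is at least full_duration seconds old."""
    if not timeseries:
        return False
    if now is None:
        now = time.time()
    oldest_timestamp = timeseries[0][0]
    return now - oldest_timestamp >= full_duration
```

If this returns False for the collectd metrics, the analyzer may simply not have enough history yet.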

I hope you have already won though :)