Getting seed_data into the horizon pipeline


Kristján Oddsson

Jun 27, 2013, 7:50:37 AM
to skyli...@googlegroups.com
Hey guise!

I've wanted to try out Skyline and see whether I could use it in my corporate setting, since we've been working hard on getting a better overview of how our software behaves when no one is around to watch it.

I'm doing a test run in a Vagrant virtual machine. The web app is up and running, and I can run seed_data.py with the following output:

vagrant@precise64:/var/log/skyline$ ~/skyline.sh seed_data
Connecting to Redis...
Loading data over UDP via Horizon...
Congratulations! The data made it in. The Horizon pipeline seems to be working.

But to no avail: the website doesn't seem to register any of the data :(

Thanks in advance, Kristján.

Abe Stanway

Jun 27, 2013, 8:17:45 AM
to Kristján Oddsson, skyli...@googlegroups.com
Hi Kristjan!

Not to worry - the data seeder is half baked and poorly thought out. It's not meant to function as anything more than a test of Horizon, so nothing will show up on the webapp yet. This is because only anomalous metrics go on the webapp. If a metric is not anomalous, it stays off the app until it becomes anomalous.
--
You received this message because you are subscribed to the Google Groups "skyline-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to skyline-dev...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Kristján Oddsson

Jun 27, 2013, 8:38:36 AM
to skyli...@googlegroups.com, Kristján Oddsson
Ahh ok, that makes sense when you say it :)

Well, I'm working on a logstash output plugin for skyline and I'm wondering how I can test skyline to see if it is receiving the data I'm sending.

Abe Stanway

Jun 27, 2013, 8:43:43 AM
to Kristján Oddsson, skyli...@googlegroups.com, Kristján Oddsson
That's easy - you can write in a line somewhere in the Horizon worker to log all input from the queue. Or, you can hang out directly in Redis and watch the keys get appended. You can use `redis-cli monitor` to get a livestream of all input and output of Redis. `redis-cli info` is also a very helpful tool - you can try writing a `watch` wrapper around it so it refreshes in real time.
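
A minimal sketch of the `watch`-style wrapper around `redis-cli info` suggested above, written in Python. It assumes `redis-cli` is on the PATH and talks to the local Redis that Skyline uses; the function names and the choice of fields are illustrative, not part of Skyline.

```python
import subprocess
import time


def parse_redis_info(text):
    """Turn the `key:value` lines printed by `redis-cli info` into a dict."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Stats".
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            info[key] = value
    return info


def watch_redis_info(fields, interval=1.0):
    """Print the chosen INFO fields every `interval` seconds until Ctrl-C."""
    while True:
        raw = subprocess.check_output(["redis-cli", "info"]).decode("utf-8")
        info = parse_redis_info(raw)
        print("  ".join("%s=%s" % (name, info.get(name, "?")) for name in fields))
        time.sleep(interval)
```

Calling `watch_redis_info(["total_commands_processed", "db0"])` while a sender is running should show the command counter and the key count in `db0` climbing if data is reaching Redis.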

Kristján Oddsson

Jun 28, 2013, 8:46:32 AM
to skyli...@googlegroups.com, Kristján Oddsson
Thanks for all the help Abe :)

I can confirm that I can get data into Skyline from my plugin. Now I want to test anomalies and actually see something in the webapp :P

What would be the best course of action to do that?

Abe Stanway

Jun 28, 2013, 8:52:59 AM
to Kristján Oddsson, skyli...@googlegroups.com, Kristján Oddsson
Take your site down!

Or write a custom data seeder that spits out an anomaly at the end. And then submit a pull request so other folks can use it :)

Kristján Oddsson

Jun 28, 2013, 8:56:37 AM
to skyli...@googlegroups.com, Kristján Oddsson
I think I'll do the latter, thanks ;)

Kristján Oddsson

Jun 28, 2013, 10:24:59 AM
to skyli...@googlegroups.com, Kristján Oddsson
Ok, so I threw together a really bad Ruby script that connects to localhost on port 2025 and sends "1" for two minutes, then "10000" for 20 seconds, and finally "1" for two minutes again. I messagepack each message in the format listed here: https://github.com/etsy/skyline/wiki/Getting-Data-Into-Skyline#udp-messagepack

My thought was that the twenty seconds when I send "10000" should be registered as an anomaly, but it doesn't show up in the webapp :(
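
For readers who want to try the same experiment, here is a rough Python sketch of a seeder that ends on an anomaly. It is not the Ruby script from this thread. It assumes the wire format from the wiki page linked above is `[metric_name, [timestamp, value]]`, that Horizon listens on UDP port 2025 as described, and that the third-party `msgpack` package is installed. The metric name, function names, and value ranges are made up for illustration, and backdating the timestamps so the whole series can be sent at once is also an assumption.

```python
import random
import socket


def build_series(end_time, baseline_seconds=120, spike_seconds=20, seed=None):
    """Return (timestamp, value) pairs: a low baseline, then a high spike at the end."""
    rng = random.Random(seed)
    total = baseline_seconds + spike_seconds
    # One datapoint per second, with the last one landing exactly on end_time.
    start = int(end_time) - total + 1
    series = []
    for i in range(total):
        if i < baseline_seconds:
            value = rng.randint(1, 100)       # "normal" traffic
        else:
            value = rng.randint(1000, 2000)   # the anomaly
        series.append((start + i, value))
    return series


def send_series(series, metric="test.seeded_anomaly", host="localhost", port=2025):
    """Send each datapoint to Horizon as a messagepack-encoded UDP packet."""
    import msgpack  # third-party dependency: pip install msgpack

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        for timestamp, value in series:
            payload = msgpack.packb((metric, (timestamp, value)))
            sock.sendto(payload, (host, port))
    finally:
        sock.close()
```

Something like `send_series(build_series(time.time()))` (after `import time`) would push the whole series in one go. Whether the spike is flagged still depends on the analyzer settings.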

Abe Stanway

Jun 28, 2013, 1:27:21 PM
to Kristján Oddsson, skyli...@googlegroups.com
+ skyline-dev


On Fri, Jun 28, 2013 at 1:26 PM, Abe Stanway <abest...@gmail.com> wrote:
Okay. Clear redis, do that, and give me the entire output of both /var/log/horizon.log and /var/log/analyzer.log


On Fri, Jun 28, 2013 at 1:21 PM, Kristján Oddsson <kodd...@gmail.com> wrote:
Ok this is where I'm at now. I start up all the components in skyline. With the FULL_DURATION set to 30 I start my script that outputs a random low number (1-100) to skyline via the udp socket on port 2025 each second for two minutes. I then output a random high number (1000-2000) for 20 seconds before outputting a random low number for two minutes again.



Kristján Oddsson.


On Fri, Jun 28, 2013 at 5:08 PM, Abe Stanway <abest...@gmail.com> wrote:
It's not in milliseconds, it's in seconds. You will then need to populate your data for 30 seconds...


On Fri, Jun 28, 2013 at 1:02 PM, Kristján Oddsson <kodd...@gmail.com> wrote:
Again, thanks for answering all my questions Abe :)

I set FULL_DURATION to 30000 (assuming that it's in milliseconds) to get 30 seconds and ran my script again, but the metric did not show up in the webapp.

Kristján Oddsson.


On Fri, Jun 28, 2013 at 3:26 PM, Abe Stanway <abest...@gmail.com> wrote:
Setting FULL_DURATION to 0 will make the Roomba effectively delete the entire metric. You need to set it to how long the timeseries needs to be to establish a stable moving average.

On Jun 28, 2013, at 11:01 AM, Kristján Oddsson <kodd...@gmail.com> wrote:

Hmmm.. I set the FULL_DURATION to 0 and re-ran the script but it's still not registering the anomaly :(

Also I don't see anything out of the ordinary in the logs.

Kristján Oddsson.


On Fri, Jun 28, 2013 at 2:34 PM, Abe Stanway <abest...@gmail.com> wrote:
You'll need to check the log files for the cause. Most likely your FULL_DURATION setting is too long - Skyline will ignore metrics that are less than FULL_DURATION long.
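
To make the FULL_DURATION points in this exchange concrete, here is a small Python illustration. It is a sketch of the behaviour as described in the replies above, not Skyline's actual code: the function names are hypothetical, and the only facts taken from the thread are that FULL_DURATION is in seconds, that a value of 0 makes the Roomba throw everything away, and that a series shorter than FULL_DURATION is ignored.

```python
FULL_DURATION = 30  # seconds (not milliseconds), the value tried in this thread


def trim_to_full_duration(timeseries, now, full_duration=FULL_DURATION):
    """Roomba-style trim: keep only datapoints newer than now - full_duration."""
    cutoff = now - full_duration
    return [(ts, value) for ts, value in timeseries if ts > cutoff]


def covers_full_duration(timeseries, full_duration=FULL_DURATION):
    """True when the series spans at least full_duration seconds."""
    if len(timeseries) < 2:
        return False
    return timeseries[-1][0] - timeseries[0][0] >= full_duration
```

Under these assumptions, two minutes of one-per-second data passes the length check with FULL_DURATION set to 30, fails it with 30000, and is trimmed to nothing with 0.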





--
Abe Stanway





Naveen Voruganti

Sep 11, 2014, 8:19:47 PM
to skyli...@googlegroups.com, kodd...@gmail.com
Hi, I am looking for a script to do the same. Could you please send me the script you use to send messagepack data over UDP?

Naveen Voruganti

Sep 22, 2014, 8:20:50 PM
to skyli...@googlegroups.com
I came up with a script that sends metrics data for 1 hour and then some abnormal data for 1 minute. I do not see this data on the localhost webapp. Is there anything else I need to do, like restarting services? How can I check that my data is getting fed into Skyline? Any pointers or help would be greatly appreciated. I really love what this tool does and want to explore it in our organization.

Abe Stanway

Sep 22, 2014, 10:26:13 PM
to Naveen Voruganti, skyli...@googlegroups.com
Naveen - sounds like a nice script, you should send that upstream! A better seed_data.py has been on the cards for a while...

Anyway, there are a number of reasons why it might not work, and most of them have to do with the analyzer settings. First, can you verify manually that the data is making it into Redis? Have you looked at the Horizon logs?




--
Abe Stanway

Naveen Voruganti

Sep 24, 2014, 2:59:27 PM
to Abe Stanway, skyli...@googlegroups.com
Hi Abe,
After playing some more with the settings and data.json, I now see the feed on the Skyline webapp. I am unblocked at this step and am moving on to the next step, Oculus. I will let you know if I come across any other blockers!

Thanks
Naveen

Naveen Voruganti

Oct 10, 2014, 3:27:00 PM
to Abe Stanway, skyli...@googlegroups.com
Once Skyline detects the anomalous metric, I try to search for it in Oculus and see the error below. Any help with this, please?


ElasticSearch::RequestError at /search

(400) {"error":"SearchPhaseExecutionException[Failed to execute phase [query], all shards failed; shardFailures {[HuqiO2iJRLOMjWOrY_5bxQ][collections][4]: SearchParseException[[collections][4]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\"query\":{\"custom_score\":{\"query\":{\"bool\":{\"must\":[{\"match\":{\"fingerprint\":{\"query\":\"\",\"type\":\"phrase\",\"slop\":null}}}],\"must_not\":[]}},\"script\":\"oculus_euclidian\",\"params\":{\"query_value\":\"0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0\",\"query_field\":\"values.untouched\",\"scale_points\":25},\"lang\":\"native\"}},\"sort\":[{\"_score\":\"asc\"}]}]]]; nested: QueryParsingException[[collections] [custom_score] the script could not be loaded]; nested: ElasticSearchIllegalArgumentException[Native script [oculus_euclidian] not found]; }{[HuqiO2iJRLOMjWOrY_5bxQ][collections][3]: SearchParseException[[collections][3]: from[-1],size[-1]: Parse Failure [Failed to parse source [{\"query\":{\"custom_score\":{\"query\":{\"bool\":{\"must\":[{\"match\":{\"fingerprint\":{\"query\":\"\",\"type\":\"phrase\",\"slop\":null}}}],\"must_not\":[]}},\"script\":\"oculus_euclidian\",\"params\":{\"query_value\":\"0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0\",\"query_field\":\"values.untouched\",\"scale_points\":25},\"lang\":\"native\"}},\"sort\":[{\"_score\":\"asc\"}]}]]]; nested: QueryParsingException[[collections] [custom_score] the script could not be loaded]; nested: ElasticSearchIllegalArgumentException[Native script [oculus_euclidian] not found]; }]","status":400}

