Splunk for dev ops


dave.stauffer

Aug 14, 2011, 4:41:42 PM
to devops-toolchain
Anyone have experience with using Splunk in a dev ops environment? I'm
looking for some feedback. We are looking at it as a tool that would
give developers access to both historical as well as "real time" (e.g.
app or web server startup) logs without granting them direct access to
the servers. I'm also interested in its log correlation features that
would help ops do problem determination across web, app and db
servers simultaneously.

If you have used it before, how good of a job does it do at near real
time log viewing and correlation? Have you used or would you recommend
other tools keeping in mind that the tools should work across multiple
OS/hardware platforms and be something both dev and ops can be
comfortable with (I've looked at a number of Eclipse-based tools but
they really don't meet non-developer requirements).

Thanks in advance

Dave

Will Thames

Aug 14, 2011, 4:49:31 PM
to devops-t...@googlegroups.com
We use it at Betfair for exactly that purpose - allowing our developers to see logs without granting access to production servers. It's great at real time log viewing and correlation, but if you're working with huge log volumes you'll need to think a bit more carefully about how you do it (e.g. set up indexes per application, use summary indexing, distributed indexers etc.)

Whether or not you'd need other tools really depends on what you want out of it.

Have a chat with Splunk sales - they can provide you with technical contacts - we got them in for a full PoC before even deciding whether Splunk was the best choice (it's a great product, but it's pretty expensive!)

Other solutions are available too - I think logstash is a free equivalent for certain use cases - I'd imagine Splunk's UI is likely superior.

Will

Anthony Goddard

Aug 14, 2011, 5:30:33 PM
to devops-t...@googlegroups.com
Graylog2 is worth a mention here too, we use it for exactly this purpose - giving devs fast access to the log data they need. http://graylog2.org
I've also used it to trace and identify problems across multiple hosts simultaneously, which is pretty neat. Logs are metrics too™ ftw.

The always crafty @portertech has been developing 'tails' recently which is also worth a look for realtime access to server logs. There's a demo @ http://portertech.no.de/

Cheers,
Anthony

http://crankstations.com
@anthonygoddard

Vladimir Vuksan

Aug 14, 2011, 8:11:53 PM
to devops-t...@googlegroups.com
Log.io also looks promising as far as providing an easy way to tail remote
files. I have looked into adapting it so that I can simply use logstash
instead of installing log.io's harvesters. Unfortunately I haven't had the
time to get it going :(

http://logio.org/

Vladimir

On Sun, 14 Aug 2011, Anthony Goddard wrote:

> Graylog2 is worth a mention here too, we use it for exactly this purpose - giving devs fast access to the log data they need. http://graylog2.org

> I've also used it to trace and identify problems across multiple hosts simultaneously, which is pretty neat. Logs are metrics too™ ftw.

Ernest Mueller

Aug 15, 2011, 12:57:57 PM
to devops-t...@googlegroups.com

> Anyone have experience with using Splunk in a dev ops environment? I'm
> looking for some feedback. We are looking at it as a tool that would
> give developers access to both historical as well as "real time" (e.g.
> app or web server startup) logs without granting them direct access to
> the servers. I'm also interested in its log correlation features that
> would help ops do problem determination across web, app and db
> servers simultaneously.


We use splunk extensively here at NI. We love it. Up front, I will say it's expensive as hell and is a significant chunk of our systems management budget when stacked up against open source and SaaS stuff. However, we've decided it's totally worth it. Why?

Because there's a difference between "ops tools" and "devops tools." We talk about a lot of tools that pretty much only a sysadmin could love (Nagios, I'm looking at you). There is indeed an interesting explosion of log aggregation and management options especially given the NoSQL space. But I judge DevOps tools based on how much developers enjoy using them. The entire point is to enable collaboration via exposing information to them and empowering them to self-service.

This is best illustrated via a timely anecdote. We had a Splunk implementation for our internal Web systems which I used to manage. When we started up a new group in R&D for creation of SaaS products, that team started from scratch tooling-wise. We implemented other more critical stuff first (monitoring, for example), but log management was in the second tier and we looked at the field and re-selected splunk. Our ops team started implementing it (setting up forwarders across UNIX and Windows Amazon instances and Microsoft Azure as well).

During this process I got an email from our FPGA Compile Cloud developers, saying "Hey can we get the ssh keys for the boxes, or can you write us a script to get us the logs or something..." I pointed them at the dev splunk server login. Suddenly I get this IM:

    Aug 8, 2011
    Charles K...
    do we have a splunk monitor on the production server?
    11:05:02 AM
    Ernest Mueller
    the splunk implementation is in progress. The first stab at it by the guys in Penang is under review by our team right now and will be moving from dev to test and opened up to y'all, refined, and moved to prod
    11:06:16 AM
    Charles K...
    i love this app!!!! <wipes a tear of joy>
    11:06:51 AM
    Ernest Mueller
    I thought you would
    11:07:39 AM
    It is very powerful
    11:07:54 AM
    gives realtime insight into system state
    11:08:04 AM
    Charles K...
    it's not just awesome...it's fucking awesome!
    11:08:05 AM
    Ernest Mueller
    go check out some of the other "apps" in there like the *NIX and Windows apps - they pull in a bunch of system logs and metrics
    11:08:23 AM
    Charles K...
    I'm poking around and I'm really liking what I am seeing. It's pretty intuitive too!
    11:08:48 AM

And this is why it's a superior tool and a truly DevOps tool. When I give it to a developer, they tell me "it's fucking awesome." Q.E.D.

We have devs using its reports and dashboards; we use it to monitor configs as well as logs, and use the apps to pull system stats. It handles analysis after the fact and realtime troubleshooting, and it allows collaborative review by ops and dev folks. I don't like how expensive it is, but I have no compunction about paying for it because it gives me that much more than the freebie options.

Ernest
______________________
UN-altered REPRODUCTION and DISSEMINATION of
this IMPORTANT information is ENCOURAGED.

Nathaniel Eliot

Aug 15, 2011, 1:21:08 PM
to devops-t...@googlegroups.com
Can you elaborate on what other tools you tried, and what makes Splunk better? I'm admittedly coming at this from an open-source perspective; I'd like to know where the best-of-breed FOSS options still fall down, in the hopes of bridging that gap.

--
Nathaniel Eliot
T9 Productions

John Vincent

Aug 15, 2011, 1:25:49 PM
to devops-t...@googlegroups.com
Pretty much everyone will tell you Splunk is totally worth it IF you
can afford it. Some folks will even argue that it's worth a whole
engineer's salary.

Splunk can do the job but depending on volume, it's going to cost you.
You can get around it with various tricks in aggregation but only to a
degree (and I would argue at the expense of using it to its fullest).

There are some opensource alternatives depending on the functionality
of splunk you're after:

- Graylog2 has a great web interface and supports syslog and GELF
(Graylog Extended Log Format)
- Logstash has agent and centralized logging modes.

We're using a combination of the two. Graylog2 is hampered by its
usage of MongoDB (imho) and thus we can only really use it for
near-time log data - about the last 4-5 hours. We use a combination of
logstash agents (with gelf output) and the GELF log4j appender to ship
the logs over to it.
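
For anyone who has not seen GELF on the wire, here is a minimal Python sketch of what a shipper does: build a small JSON record, zlib-compress it, and send it as a UDP datagram. This is not the setup described above (that uses logstash and the log4j appender). The hostname `graylog.example.com` and the `app` field are placeholders, the field names follow the published GELF spec as I understand it, and older Graylog2 servers from this era may expect version "1.0" instead of "1.1".

```python
import json
import socket
import time
import zlib


def build_gelf_datagram(host, short_message, level=6, **extra):
    """Return a zlib-compressed GELF JSON payload suitable for one UDP datagram."""
    record = {
        "version": "1.1",            # assumption: older servers may want "1.0"
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,              # syslog severity; 6 = informational
    }
    # User-defined fields must be prefixed with an underscore in GELF.
    for key, value in extra.items():
        record["_" + key] = value
    return zlib.compress(json.dumps(record).encode("utf-8"))


def send_gelf(datagram, server="graylog.example.com", port=12201):
    """Fire-and-forget UDP send. The server name is a placeholder; 12201 is the usual GELF port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram, (server, port))


if __name__ == "__main__":
    payload = build_gelf_datagram("web01", "app server started", app="checkout")
    print(len(payload), "compressed bytes ready to send")
```

The sketch skips GELF chunking, so it only suits messages that fit in a single datagram.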

I'm looking at a long term strategy using the logstash server
implementation. In server mode, a centralized logstash instance
accepts logs and shovels them into ElasticSearch. It provides a basic
web application for searching the logs. The nice part is that
ElasticSearch is "infinitely scalable" (I know, I know).

The nice thing about logstash as an agent is you can easily multiplex
log destinations so we can continue to ship stuff to Graylog2 using
the GELF output but also ship it to the logstash server instance for
long-term archival.

The only "downside" to logstash is that it only runs under JRuby. That
might turn you off but the upshot is that there is a single jarfile
that you can download and run with an embedded ES instance. The agent
mode is pretty much awesome in a bag because it can input, filter, and
output in so many different ways.

I personally always default to an opensource project of some kind if
it exists until I know what I need. Logstash is a pretty safe way to
do that.

--
John E. Vincent
http://about.me/lusis

Nathaniel Eliot

Aug 15, 2011, 2:05:40 PM
to devops-t...@googlegroups.com
Thanks, Vincent. We already use all the underlying technologies for
Graylog2 and Logstash, so they're both worth further investigation.

What's the pain point on Mongo?

--
Nathaniel Eliot
T9 Productions

Adam Jacob

Aug 15, 2011, 2:09:40 PM
to devops-t...@googlegroups.com
+1 for Splunk being worth it. If you are a startup, when you first call in to get a quote, let them know - they'll put you into a different program, which will significantly reduce your up-front costs.

--
Opscode, Inc.
Adam Jacob, Chief Product Officer
T: (206) 619-7151 E: ad...@opscode.com

John Vincent

Aug 15, 2011, 2:31:34 PM
to devops-t...@googlegroups.com
I'm not the most unbiased person to talk to about Mongo but in general
it's about memory. Graylog2 doesn't support sharding so you're pretty
much bound to not having more log data than you have memory in the box
minus indexes.

We have an m1.xlarge for mongo and our capped collection is 10GB.
That's the most we could fit (with the additional indexes needed for
our search patterns) to be useful. I couldn't justify bumping up
another instance size for this.
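
A back-of-the-envelope way to relate capped collection size to retention, sketched in Python. It assumes the 10GB collection is what yields the "last 4-5 hours" mentioned earlier in the thread, and it ignores BSON and index overhead, so treat the numbers as rough. The function names are made up for illustration.

```python
GIB = 1024 ** 3  # bytes in one GiB


def ingest_rate_bytes_per_sec(capped_bytes, retention_hours):
    """Average ingest rate that fills a capped collection in the given time."""
    return capped_bytes / (retention_hours * 3600)


def retention_hours(capped_bytes, bytes_per_sec):
    """How many hours of logs fit before the oldest entries are overwritten."""
    return capped_bytes / bytes_per_sec / 3600


if __name__ == "__main__":
    for hours in (4, 5):
        rate = ingest_rate_bytes_per_sec(10 * GIB, hours)
        print(f"{hours} h of retention implies roughly {rate / 1024:.0f} KiB/s of log data")
```

Run the same functions the other way to see how far a bigger instance would stretch the window at your own log volume.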

Kevin Foster

Aug 15, 2011, 2:36:37 PM
to devops-t...@googlegroups.com
+1 for Splunk as well. We use it to monitor our VMware virtual infrastructure and several home-grown apps. Expensive compared to open source? Hell ya, but it has a huge ROI. You can spend the same amount of money on cobbling together your own logging environment. Some things are just worth paying for.

                  www.kevinfoster.co
-------------------------------------------------------------------
"you're going to fail eventually, why fail at
  something mediocre." David duChemin

“Never compare your beginning to 
someone else’s middle.”  Michael Hyatt
--------------------------------------------------------------------

Ernest Mueller

Aug 15, 2011, 4:36:40 PM
to devops-t...@googlegroups.com

devops-t...@googlegroups.com wrote on 08/15/2011 01:36:37 PM:

> +1 for Splunk as well. We use it to monitor our VMware virtual
> infrastructure and several home-grown apps. Expensive compared to
> open source? Hell ya, but it has a huge ROI. You can spend the same
> amount of money on cobbling together your own logging environment.
> Some things are just worth paying for.


I totally agree - I think the hidden costs of open source are often not sufficiently calculated.

At the most fundamental level, if I have to spend an extra man-month cobbling stuff together, that's a lot of money right there and can easily justify a five figure buy. But even that is low - that's thinking from the old world of "IT/ops is just a cost center, they would just be screwing around if they weren't implementing some open source stuff." In the new world we are helping drive innovation and product features and there's an opportunity cost to our time - that man-month is a man-month I'm not working on getting the next greatest thing to market. Companies pay people's salaries because they expect to make 10x+ that amount on their backs in revenue... Time is money.

A lot of the time, there's not something good enough to buy. Before Splunk, the best efforts were the LogLogics of the world, which I regard as "write only" log stores useful for compliance and audit but not for ever looking at the logs for troubleshooting etc. I don't buy in every niche; where I don't, I FOSS it up. But I don't mind buying in this case.

Ernest Mueller

Aug 15, 2011, 5:06:38 PM
to devops-t...@googlegroups.com

devops-t...@googlegroups.com wrote on 08/15/2011 12:21:08 PM:

> Can you elaborate on what other tools you tried, and what makes
> Splunk better? I'm admittedly coming at this from an open-source
> perspective; I'd like to know where the best-of-breed FOSS options
> still fall down, in the hopes of bridging that gap.


Sure. With Splunk, it automatically pulls in and understands many different log types by default; it makes them searchable in a Google-like interface. It automatically does field extraction and generates faceted navigation even for unfamiliar types; you can drill down/exclude data with a click on the results or on the timeline. The cool thing here is that you don't have to be sending syslog or anything, it understands arbitrary logs in native formats.

It also has built-in apps for UNIX and Windows metrics.

It has a rich forwarder/indexer/etc. architecture that you can scale up as much as you want. We have light forwarders on each node in Amazon and Azure; these push to an intermediate caching forwarder specific to the environment (our test env for UI Builder, for example) and that forwards to our central server.

People can easily create saved reports, alerts, and dashboards; this isn't programming, it's simple business user accessible configuration. I'll be honest, I consider most open source graphing to be a bit of a joke. "Two lines on one graph? Inconceivable!"

When it comes down to it, it's that
1. It requires very little configuration to do the job
2. The UI and usability are extremely good

I'm sure there's good FOSS out there too, but I've never seen anything that comes close on those two fronts.

E

Scott McCarty

Aug 15, 2011, 5:12:36 PM
to devops-t...@googlegroups.com
Hey! I use cacti and RRD to do all kinds of lines on one graph, especially for sockets, pipes, files and netflow data :-)

Scott M

Scott Smith

Aug 15, 2011, 5:19:56 PM
to devops-t...@googlegroups.com
Some people, when confronted with a problem, think  “I know, I'll use cacti.”   Now they have two problems.

Ernest Mueller

Aug 15, 2011, 5:38:05 PM
to devops-t...@googlegroups.com

> Some people, when confronted with a problem, think  “I know, I'll
> use cacti.”   Now they have two problems.


LOL, yeah, I have yet to be shown a RRD/cacti graph that doesn't make me want to hit someone in the face. God forbid that real engineering data visualization was stuck in that era.

Ernest

Scott McCarty

Aug 15, 2011, 6:47:10 PM
to devops-t...@googlegroups.com

So I assume, since you are doing REAL engineering, you don't do that x86 scatter computing crap right ;-) Strictly s390 based z196 mainframes with Unified Resource Manager right ;-) Then you can get some real purdy graphs.

Scott M


Vladimir Vuksan

Aug 15, 2011, 7:48:02 PM
to devops-t...@googlegroups.com
I think this is a matter of opinion :-). I personally dislike Cacti
but I have seen some sophisticated Cacti setups and people who were happy
with it. Same with Splunk. I know tons of people who are really happy with
it, but I have trialed it on a few occasions and every time came away
thinking "that's it?". Obviously I expect different things from the tool and
use it differently than other people.

Regarding alternatives to Splunk, I have implemented logstash; I have not
rolled it out to all my nodes yet but intend to. I particularly like
the different outputs that it supports. I'm actually looking into
replacing ganglia-logtailer (predecessor to logster) with the logstash statsd
output plugin.

https://gist.github.com/1124364

Vladimir

Scott McCarty

Aug 15, 2011, 8:06:34 PM
to devops-t...@googlegroups.com

While not open source (anymore), I want to point out that LogZilla has a pretty slick web interface and it scales as well as Splunk. Also LZ is way easier to use and WAY less expensive.

    http://www.logzilla.pro

Clayton Dukes has a whitepaper on Cisco.com:

     http://www.cisco.com/en/US/technologies/collateral/tk869/tk769/white_paper_c11-557812.html

Best Regards
Scott M


Mark Goldfinch

Aug 15, 2011, 11:46:00 PM
to devops-t...@googlegroups.com

> > Some people, when confronted with a problem, think “I know, I'll
> > use cacti.” Now they have two problems.
>
> LOL, yeah, I have yet to be shown a RRD/cacti graph that doesn't make
> me want to hit someone in the face. God forbid that real engineering
> data visualization was stuck in that era.

On this topic, what graphing/visualisation packages is everyone using these days?

I would love for Cacti to have an API to reconfigure it; automating its configuration would solve a bunch of problems for us. Colleagues have made similar comments about the look and feel of the graphs produced by RRDTool.

Thanks,
Mark.

Jason Dixon

Aug 15, 2011, 11:57:31 PM
to devops-t...@googlegroups.com
On Tue, Aug 16, 2011 at 03:46:00PM +1200, Mark Goldfinch wrote:
>
> > > Some people, when confronted with a problem, think “I know, I'll
> > > use cacti.” Now they have two problems.

> >
> > LOL, yeah, I have yet to be shown a RRD/cacti graph that doesn't make
> > me want to hit someone in the face. God forbid that real engineering
> > data visualization was stuck in that era.
>
> On this topic, what graphing/visualisation packages is everyone using these days?

Graphite, Reconnoiter, Ganglia, Cacti, in-house stuff.

--
Jason Dixon
DixonGroup Consulting
http://www.dixongroup.net/

Nicholas Tang

Aug 16, 2011, 4:30:51 AM
to devops-t...@googlegroups.com
We've started playing w/ statsd/graphite, which seems like a pretty
nice combo. Historically we've used cacti, but it annoys the heck out
of me so the goal is to replace it. We're also looking at using
pnp4nagios to graph data that Nagios is monitoring.

More on statsd:
http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/
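
For a feel of how little an app needs to do to feed statsd, here is a small Python sketch of the plain-text line format it accepts over UDP (`bucket:value|type`, with an optional `|@rate` for sampled counters). The bucket names are invented, and the host and port (8125 is the usual statsd default) are assumptions about your setup.

```python
import socket


def format_metric(bucket, value, metric_type, sample_rate=None):
    """Build one statsd line, e.g. 'logins:1|c' or 'page.render:320|ms'."""
    line = f"{bucket}:{value}|{metric_type}"
    if sample_rate is not None and sample_rate < 1:
        # Tells statsd the counter was sampled so it can scale the value back up.
        line += f"|@{sample_rate}"
    return line


def send_metric(line, host="127.0.0.1", port=8125):
    """Fire-and-forget UDP send; the app never blocks on the metrics server."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    print(format_metric("logins", 1, "c"))            # counter
    print(format_metric("page.render", 320, "ms"))    # timer in milliseconds
    print(format_metric("logins", 1, "c", 0.1))       # sampled counter
```

Because it is UDP, a missing or slow statsd never hurts the application, which is much of the appeal of the push model.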

Nicholas

Scott McCarty

Aug 16, 2011, 4:57:15 AM
to devops-t...@googlegroups.com

I suspect you already know about the cli in Cacti? Here is a really short tutorial for one of the graph projects I maintain.

http://crunchtools.com/software/crunchtools/cacti/graph-mysql-stats/#ScriptedAutomated

It is annoying that Cacti can't be re-configured easily, but the recursive trees of the templates are awesome for certain types of data acquisition such as enumerating all BGP connections or domain names on a server.

As for "nice" looking, when did RRD fall so out of vogue? Maybe I am just becoming an old man. Someone please attach a screenshot or two of a library that replaces rrd that is so much better, please.

Scott M

Lindsay Holmwood

Aug 16, 2011, 5:27:25 AM
to devops-t...@googlegroups.com
On 16 August 2011 18:57, Scott McCarty <scott....@gmail.com> wrote:
>
> As for "nice" looking, when did RRD fall so out of vogue? Maybe I am just
> becoming an old man. Someone please attach a screenshot or two of a
> library that replaces rrd that is so much better, please.
>

Check out the video at 3:33 over here:

http://visage-app.com/

(disclaimer: I wrote Visage)

I'm happy to use a tool that produces butt-ugly graphs, but RRDtool
generates graphs that obscure and obfuscate valuable data.

RRDtool takes a similar approach to downsampling the data on graphs as
it does to downsampling the data it stores over time, and that's
tremendously dangerous when you need to understand exactly how a
system was functioning at a particular point in time.

I wrote Visage specifically to deal with this problem. Each data point
should be inspectable, and composition of graphs should be dynamic and
user driven.

On the data storage front, I'm starting to use OpenTSDB, and am in the
process of writing a Visage backend for it as well. It natively uses
gnuplot to render graphs, and suffers from the same problem as
RRDtool.

Cheers,
Lindsay

--
w: http://holmwood.id.au/~lindsay/
t: @auxesis

John Vincent

Aug 16, 2011, 5:50:35 AM
to devops-t...@googlegroups.com

I'm as bitchy about RRD precision as the next guy but it's fair to
admit that you CAN get decent precision storage. IIRC 3 years of 1
minute precision of a single metric is something like 10MB per RRD?
The problem is that everything up until now has operated on 5 minute
increments. I think that's the default step for rrdcreate.

I think there are two reasons for this:

- The Nagios/Cacti/Cricket/et al. kind of tools suck(ed) at being
asked to poll any more often.
- "Things" being polled could be negatively impacted by polling more often than that

Possibly starting with netflow and now statsd/graphite types of apps,
push models are becoming much more prevalent. Mind you there are still
OTHER problems with RRDs but getting the precision you want is
possible without the smoothing.
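
The storage estimate above is easy to sanity-check. A rough Python sketch, assuming each stored value costs 8 bytes (RRDtool keeps values as doubles) and ignoring the file header and any extra RRAs; the function names are just for illustration:

```python
BYTES_PER_VALUE = 8  # assumption: one double per data source per row


def rra_rows(years, step_seconds):
    """Number of rows needed to keep `years` of data at one row per step."""
    return int(years * 365 * 24 * 3600 / step_seconds)


def rra_size_mb(years, step_seconds, data_sources=1):
    """Approximate archive size in MB (10**6 bytes), header overhead ignored."""
    return rra_rows(years, step_seconds) * data_sources * BYTES_PER_VALUE / 1e6


if __name__ == "__main__":
    for step in (60, 300):
        print(f"3 years at {step}s step: {rra_rows(3, step)} rows, "
              f"about {rra_size_mb(3, step):.1f} MB per metric")
```

Under those assumptions, three years at a 1 minute step works out to about 12.6 MB per metric, so the "something like 10MB" recollection is in the right ballpark.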

Gildas

Aug 16, 2011, 6:32:56 AM
to devops-t...@googlegroups.com
On Tue, Aug 16, 2011 at 11:50 AM, John Vincent <lusi...@gmail.com> wrote:
> I'm as bitchy about RRD precision as the next guy but it's fair to
> admit that you CAN get decent precision storage. IIRC 3 years of 1
> minute precision of a single metric is something like 10MB per RRD?
> The problem is that everything up until now has operated on 5 minute
> increments. I think that's the default step for rrdcreate.
>
> I think there are two reasons for this:
>
> - The Nagios/Cacti/Cricket/et al. kind of tools suck(ed) at being
> asked to poll any more often.
> - "Things" being polled could be negatively impacted by polling more often than that
>
> Possibly starting with netflow and now statsd/graphite types of apps,
> push models are becoming much more prevalent. Mind you there are still
> OTHER problems with RRDs but getting the precision you want is
> possible without the smoothing.

Since software such as Cacti stores an average value in the RRD, it is
hard to compare values between two time periods: it is difficult for
instance to compare values from a 1 min average RRD with values from
30 min or 2 hour average RRDs, because averages are "squashed" on the
graphs with the bigger time increments.

One of my ex-colleagues mitigated this problem by creating an RRA to
store the highest value as well and not just the average. This allowed
you for instance to compare max values for bandwidth usage between now
and 12 months ago.
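
To make the "squashing" concrete, here is a toy Python sketch (plain lists, not RRDtool itself) that consolidates one hour of invented per-minute samples into 30 minute buckets, once by average and once by max. It mimics the idea of keeping both an AVERAGE and a MAX archive; the sample values are made up.

```python
def consolidate(samples, bucket_size, func):
    """Collapse consecutive groups of `bucket_size` samples into one value each."""
    return [func(samples[i:i + bucket_size])
            for i in range(0, len(samples), bucket_size)]


def average(values):
    return sum(values) / len(values)


# One hour of per-minute readings with a single one-minute spike (invented data).
per_minute = [10] * 29 + [900] + [10] * 30

avg_30 = consolidate(per_minute, 30, average)
max_30 = consolidate(per_minute, 30, max)

if __name__ == "__main__":
    print("30 min averages:", [round(v, 1) for v in avg_30])
    print("30 min maxima:  ", max_30)
```

The averaged series turns the spike into a small bump, while the max series still shows the peak, which is why the extra archive makes year-over-year peak comparisons possible.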

Cheers,
Gildas

Vladimir Vuksan

Aug 16, 2011, 8:21:27 AM
to devops-t...@googlegroups.com
I wrote a post about RRD storage misconceptions where ironically some of my
own misconceptions were cleared up as well by Tobi Oetiker :-)

http://vuksan.com/blog/2010/12/14/misconceptions-about-rrd-storage/

As John has pointed out there is no need for averaging.

Vladimir

Dan Sully

Aug 16, 2011, 9:13:31 AM
to devops-t...@googlegroups.com
* Mark Goldfinch shaped the electrons to say...

> On this topic, what graphing/visualisation packages is everyone using these days?

I wrote a time series storage engine because I despise RRD so much:

https://github.com/dsully/circulardb

(The original version actually was written right around the same time as RRD
was released in 1999).

It doesn't include any graphing, but I can recommend client side graphing
using the Flot JavaScript library.

http://code.google.com/p/flot/

--dan

--------------------------------------------------------------
<dsully> please describe web 2.0 to me in 2 sentences or less.
<jwb> you make all the content. they keep all the revenue.

Howard Jones

Aug 16, 2011, 10:58:04 AM
to devops-t...@googlegroups.com
On 16 August 2011 04:46, Mark Goldfinch <mark.go...@modicagroup.com> wrote:
> I would love for Cacti to have an API to reconfigure it, automating its configuration would solve a bunch of problems for us.

It does. Look in the api/ folder on recent (0.8.7*) versions. It's a
little bit rough in places (not exactly idempotent), but it works OK
for bulk adding of hosts and graphs. Also check out the Autom8 plugin
for adding graphs automatically as configuration changes (e.g. as
interfaces are used).

Howie

Scott McCarty

Aug 16, 2011, 11:03:58 AM
to devops-t...@googlegroups.com

Yes, the api is how I do build testing for mysql_stats. It automatically adds the test host and graphs. Also, I use it for new server builds.

Pete Fritchman

Aug 16, 2011, 12:07:37 PM
to devops-t...@googlegroups.com
On Tue, Aug 16, 2011 at 1:30 AM, Nicholas Tang <nichol...@gmail.com> wrote:
> We've started playing w/ statsd/ graphite, which seems like a pretty
> nice combo.

We're also going the statsd+graphite route with a lot of success. For
a graphing dashboard system, we wrote "pencil":
https://github.com/fetep/pencil (uses graphite to render).

It lets you define graphs, have dashboards, global/cluster/host view,
navigation (drill in/out), etc.

--
petef

Noah Campbell

Aug 16, 2011, 12:10:13 PM
to devops-t...@googlegroups.com
Need a screenshot somewhere. Love to see what the dashboard looks like.

-Noah

Noah Campbell
415-513-3545
noahca...@gmail.com

dave.stauffer

Aug 16, 2011, 1:07:21 PM
to devops-toolchain
Wow. Thanks everyone for the great replies (please keep them coming
if you have them though). The main reason I asked the question was
that I reviewed Splunk a couple years ago and while I thought it did a
bang up job of processing data I didn't find the interface very
intuitive and their documentation was severely lacking. I think those
issues have been resolved for me so the last hurdle was the cost. I
think what you confirmed for me is that it is worth the cost. It
looked like it was worth it but being an Ops type tool I wanted to
make sure others were using it successfully in a DevOps environment.
I also appreciate the other open source suggestions. I am looking
into a few of them as alternatives if I get shot down on cost.

The thread seems to have taken a slight diversion into what I would
consider monitoring and graphing. I don't mind that and will add that
we use the GroundWork Open Source product to do both Nagios and Cacti
monitoring. Personally I think Cacti performs a little better than
Nagios and the GroundWork interface helps solve many of the
configuration management issues that are oft cited with these
products. While I 'could' use Cacti for other things I think it is
best left as a system monitoring tool for Ops only. It doesn't really
meet my criteria for a DevOps tool for Log Management. And yes, I too
just try to ignore the 1970's graphs from rrd :>). On a side, side
note, we have recently been using Hyperic monitoring that comes with
SpringSource tcServer. We like it and the graphs are much more 'this
century'. They have an opensource (free) version.

Thanks,
Dave


Vladimir Vuksan

Aug 16, 2011, 1:36:25 PM
to devops-toolchain
Regarding better looking graphs we actually added flot rendering of graphs
in Ganglia from RRD data. Here is an example from the latest
dev version

http://fjrkr5ab.joyent.us/ganglia-2.0/?r=hour&cs=&ce=&m=&c=bx+workstations&h=dragon-isle.bx.psu.edu&mc=2&z=medium&metric_group=cpu

Click on Enlarge next to the metric graphs. There is also a version of the
UI where all graphs are rendered using flot; however, that needs
polishing. That looks something like this

http://fjrkr5ab.joyent.us/ganglia-2.0-flotgraphs/graph_all_periods.php?h=dragon-isle.bx.psu.edu&r=hour&hc=4&mc=2&st=&g=cpu_report&z=large&c=bx%20workstations

Actually for all you Ganglia users if you are using Ganglia Web 2.0+ you
can set this override and you will see it :-)

$conf['graph_engine'] = "flot";

As I said it needs work.

Vladimir

Joe Miller

Aug 16, 2011, 1:59:09 PM
to devops-t...@googlegroups.com
+1.  I have also been interested in seeing what pencil looks like but have been too lazy to stand up an instance to see.  A screenshot is just what I need to make the leap =)

Pete Fritchman

Aug 16, 2011, 2:19:14 PM
to devops-t...@googlegroups.com
On Tue, Aug 16, 2011 at 10:59 AM, Joe Miller <jo...@joeym.net> wrote:
> +1.  I have also been interested in seeing what pencil looks like but have
> been too lazy to stand up an instance to see.  A screenshot is just what I
> need to make the leap =)

Ok, screenshots coming this afternoon. :) It's not super pretty, but
very functional (for our needs, at least).

--
petef

Thomas Vincent

Aug 16, 2011, 2:31:31 PM
to devops-t...@googlegroups.com
I am a huge fan of ZenOSS. ZenOSS can take its native ZenPacks, as well as Nagios and Cacti plugins. The fact that I can treat the Nagios plugin as a data source and start graphing it within a few minutes is a big win.

--
Cheers,
Tom

Noah Campbell

Aug 16, 2011, 3:22:07 PM
to devops-t...@googlegroups.com
The one thing that irks me a bit about flot is that it tries to embellish its chart data with chart junk (drop shadows, anti-aliasing of lines, etc.). I like the interactivity, but those embellishments take away from what the graph is supposed to do: communicate clearly with the user.

-Noah

Noah Campbell
415-513-3545
noahca...@gmail.com

Adam Jacob

Aug 16, 2011, 3:49:08 PM
to devops-t...@googlegroups.com
You can turn all that stuff off in Flot, which I like. I've used Flot to go from giant scrollable graphs to tiny sparklines. It's the bee's knees.

--
Opscode, Inc.
Adam Jacob, Chief Product Officer
T: (206) 619-7151 E: ad...@opscode.com



Scott Smith

Aug 16, 2011, 4:14:09 PM
to devops-t...@googlegroups.com
Hey Pete,

Pencil looks interesting. It looks like the color function is not supported in 0.9.8 fwiw. Is it possible to disable that and have it use colorList?

-scott

Pete Fritchman

Aug 17, 2011, 1:57:03 AM
to devops-t...@googlegroups.com
On Tue, Aug 16, 2011 at 11:19 AM, Pete Fritchman <pe...@databits.net> wrote:

pencil screenshots: http://fetep.github.com/pencil/

--
petef

Guillaume FORTAINE

Sep 11, 2011, 12:14:56 PM
to devops-t...@googlegroups.com
Hello,

For your information :

> On this topic, what graphing/visualisation packages is everyone using these days?

http://sourceforge.net/projects/gbrrdgraphix/

"gbRRDGraphix is a graphical user interface built in Gambas language
to use simply RRDTool commands and 'flow-tools' Netflow utilities.
Added to the project, a Scheduler to update RRDTool database, a
complet Web Site to display all RRDtool graphics"


http://observium.org/wiki/Main_Page

"Observium is an autodiscovering PHP/MySQL/SNMP based network
monitoring which includes support for a wide range of network hardware
and operating systems including Cisco, Linux, FreeBSD, Juniper,
Foundry, HP and many more.

Observium has grown out of a lack of easy to configure and easy use
NMSes. It is intended to provide a more navigable interface to the
health and performance of your network. Its design goals include
collecting as much historical data about devices as possible, being
completely autodiscovered with little or no manual intervention, and
having a very intuitive interface.

Observium is not intended to replace a Nagios-type up/down monitoring
system, but rather to complement it with an easy to manage, intuitive
representation of historical and current performance statistics,
configuration visualisation and syslog capture."

Best Regards,

Guillaume FORTAINE

Ray McCaffity

May 23, 2013, 6:22:17 PM
to devops-t...@googlegroups.com
We run zSeries as well.  logstash works great on x86_64, but not so much s390x.  lib-jffi seems to be a problem.
SuSE SLES 11 sp2 on s390x requires newer versions of almost everything than we have available (ruby, ffi, etc..)
I've tried d/l'ing the source code and compiling but it wants newer versions of ant, etc...  we're married to this OS right now so upgrading isn't an option.
 
Ray
 

On Monday, August 15, 2011 3:47:10 PM UTC-7, fatherlinux wrote:

So I assume, since you are doing REAL engineering, you don't do that x86 scatter computing crap right ;-) Strictly s390 based z196 mainframes with Unified Resource Manager right ;-) Then you can get some real purdy graphs.

Scott M
