Hi,
Carl Sandland <carl.s...@instaclustr.com> writes:
> Played around with replacing reinject with direct child streams and the
> payoff was noticeable more in cpu than jvm-heap usage. I guess the payoff
> depends on how carefully 'routed' or 'partitioned' your head stream is ?
Yes. You probably have a reason to use 'reinject', but if the streams
are properly ordered you won't need it, unless you have some complex
use case.
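To make the difference concrete, here is a minimal sketch, not a
definitive config. The service names "cpu-raw" and "cpu-derived" are
made up for illustration, and 'send-alert' is assumed to be defined
elsewhere in the config.

```clojure
; Variant 1: with reinject. The renamed event re-enters at the head of
; (streams ...), so every top-level stream gets to look at it again.
(streams
  (where (service "cpu-raw")
    (with :service "cpu-derived"
      reinject))
  (where (service "cpu-derived")
    send-alert))

; Variant 2: direct child stream. The renamed event goes straight to
; send-alert and never passes through the head a second time.
(streams
  (where (service "cpu-raw")
    (with :service "cpu-derived"
      send-alert)))
```

The two variants are alternatives, not meant to be loaded together. In
the second one the route of the event can be read directly from the
nesting.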
> Also have some views that replacing one reinject would mean following the
> same pattern in the other 99! It's true I guess that you should have to
> think about whether this event goes through the head or not each time...
It depends on how you order the flow. I like the approach from [1],
where streams are divided by some logic but follow a linear
route. Example:
(def storage
  "Multiple storage locations."
  (sdo
    influx
    cassandra))

(def cpu-checks
  "Checks CPU usage."
  (where (and (service #"^cpu-check")
              (> metric 90))
    (by [:host :service]
      send-alert)))

(streams
  cpu-checks
  memory-checks
  (precalculate-something storage)
  storage)
"precalculate-something" will do some calculations on a group of metrics
and store the results in the databases. The "*-checks" streams don't
need to store anything; they just alert when something doesn't look
right. If I need a check/alert on the precalculated values, I'd add:

(precalculate-something a-check storage)

and there is a clear visual distinction of how an event flows and where
it ends up.
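For illustration only, one way such a function could look. The body is
an assumption: the service pattern "^requests" and the 5-second rate
are invented here, and the real "precalculate-something" can do
whatever calculation you need. The point is only that it takes child
streams as arguments and passes the calculated event on to them.

```clojure
(defn precalculate-something
  "Hypothetical example: computes a rate over 5-second intervals for
  each host/service pair and forwards the result to every child stream."
  [& children]
  (where (service #"^requests")
    (by [:host :service]
      ; 'by' builds a separate rate stream for each distinct
      ; host/service pair; all of them share the same children.
      (apply rate 5 children))))
```

With this definition, (precalculate-something a-check storage) sends
each calculated event first to 'a-check' and then to 'storage'.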
I find this easier to debug, to disable (just comment out the offending
line) and to explain to coworkers who are not familiar with Riemann.
> I also sometimes think of moving more complex derived metrics out of
> riemann and into the nodes:
This is also a good approach.
> One thing I've tried is just simply; (the end goal here is ratio = p1 /
> p2 for each a/b/c)
> (by :a)(by :b)(by :c)(project [p1 p2])(reinject) => (by a: b: c:)(project
> [p1 p2])(reinject)
>
> pretty sure that's logically the same and it's had a small reduction to
> heap usage.
Every 'by' will create a new table with events, which stays in memory
while Riemann is running, so I'd go with '(by [:a :b :c] ...)'. I
think this is why you see a small reduction in heap usage.
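Roughly, the two forms compare as in the sketch below. It assumes that
"p1" and "p2" are service names and that 'folds' is the usual alias for
riemann.folds in the config; adjust both to your setup.

```clojure
; Nested: three levels of 'by', each level keeping its own tables.
(by :a
  (by :b
    (by :c
      (project [(service "p1") (service "p2")]
        ; quotient divides the metric of the first event by the second
        (smap folds/quotient
          reinject)))))

; Combined: a single 'by' keyed on the [:a :b :c] triple.
(by [:a :b :c]
  (project [(service "p1") (service "p2")]
    (smap folds/quotient
      reinject)))
```

Both forms compute p1 / p2 per a/b/c combination; the second one just
does the split in one step.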
> I love riemann (in case that didn't come through); but I find the step
> from noob => intermediate quite difficult to navigate (I guess it's easy to
> start a helicopter, but harder to fly it into combat?)
I think the major challenge with Riemann comes from trying to see it as
a standard monitoring tool. If you think of it as a data router, where
everything else (saving, alerting) is a side effect of that, and always
keep in mind how events flow, things will be easier to follow. At least
that worked for me :)
> Cheers,
> Carl
Best,
Sanel
[1] https://github.com/mcorbin/riemann-configuration-example