Percentile calculation by key field with Riemann-0.3.6

Shakthi Kannan

Nov 3, 2021, 8:47:03 AM
to rieman...@googlegroups.com
Hi,

Using Riemann-0.3.6 on CentOS 7, I have the following configuration in
riemann.config:

=== BEGIN ===
...
(let [index (index)]
  (streams
    (by :method
      (smap #(assoc % :service (str "Alert Lag " (:method %)))
        #(info %)
        ; Sample output of the info call above:
        ; INFO [2021-11-02 11:52:32,489]
        ; defaultEventExecutorGroup-2-1 - riemann.config -
        ; #riemann.codec.Event{:host foo.com, :service Alert Lag Web, :metric
        ; 3.0, :time 1, :ttl 0.0, :method Web}
        (percentiles 5 [0.5 0.95 0.99]
          (smap (fn [event]
                  {:service (:service event)
                   :metric (:metric event)
                   :tags "Notify"})
            #(info %)))))))
=== END ===

The percentiles output that I get is as follows:

=== LOG ===
...
INFO [2021-11-03 07:07:11,181] riemann task 2 - riemann.config -
{:service Alert Lag 0.5, :metric 17.07741843738219, :time
1635355829979, :tags Notify}
INFO [2021-11-03 07:07:11,182] riemann task 2 - riemann.config -
{:service Alert Lag 0.95, :metric 187.528125, :time 1635355829979,
:tags Notify}
INFO [2021-11-03 07:07:11,182] riemann task 2 - riemann.config -
{:service Alert Lag 0.99, :metric 187.528125, :time 1635355829979,
:tags Notify}
...
=== END ===

1. Why does "Alert Lag Web" in :service not render as "Alert Lag Web
0.5", "Alert Lag Web 0.95", and "Alert Lag Web 0.99"?

2. How do I correctly get the percentiles for each method (say "Web",
"Store", "Mobile", etc.)?

Please advise.

Thanks!

SK

Toby McLaughlin

Nov 4, 2021, 12:26:29 AM
to Riemann Users
Hi,

Your config appears to work for me, unless I misunderstood the issue.

Here's me injecting events with the Ruby client:

--------------------------------------------
[1] pry(main)> require 'riemann/client'
=> true

[2] pry(main)> c = Riemann::Client.new
=> #<Riemann::Client:0x000055612979ce00
[...]

[3] pry(main)> ["Web", "Store", "Mobile"].each { |m| c.tcp << {method: m, metric: 10}}
=> ["Web", "Store", "Mobile"]
--------------------------------------------

And here's the output from Riemann:

--------------------------------------------
INFO [2021-11-04 14:53:43,343] defaultEventExecutorGroup-2-2 - riemann.config - #riemann.codec.Event{:host colt, :service Alert Lag Web, :state nil, :description nil, :metric 10, :tags nil, :time 1635999823, :ttl nil, :method Web}
INFO [2021-11-04 14:53:43,358] defaultEventExecutorGroup-2-2 - riemann.config - #riemann.codec.Event{:host colt, :service Alert Lag Store, :state nil, :description nil, :metric 10, :tags nil, :time 1635999823, :ttl nil, :method Store}
INFO [2021-11-04 14:53:43,359] defaultEventExecutorGroup-2-2 - riemann.config - #riemann.codec.Event{:host colt, :service Alert Lag Mobile, :state nil, :description nil, :metric 10, :tags nil, :time 1635999823, :ttl nil, :method Mobile}
INFO [2021-11-04 14:53:48,384] riemann task 2 - riemann.config - {:service Alert Lag Mobile 0.5, :metric 10, :tags Notify}
INFO [2021-11-04 14:53:48,384] riemann task 1 - riemann.config - {:service Alert Lag Web 0.5, :metric 10, :tags Notify}
INFO [2021-11-04 14:53:48,384] riemann task 3 - riemann.config - {:service Alert Lag Store 0.5, :metric 10, :tags Notify}
INFO [2021-11-04 14:53:48,384] riemann task 2 - riemann.config - {:service Alert Lag Mobile 0.95, :metric 10, :tags Notify}
INFO [2021-11-04 14:53:48,384] riemann task 3 - riemann.config - {:service Alert Lag Store 0.95, :metric 10, :tags Notify}
INFO [2021-11-04 14:53:48,384] riemann task 1 - riemann.config - {:service Alert Lag Web 0.95, :metric 10, :tags Notify}
INFO [2021-11-04 14:53:48,385] riemann task 2 - riemann.config - {:service Alert Lag Mobile 0.99, :metric 10, :tags Notify}
INFO [2021-11-04 14:53:48,385] riemann task 3 - riemann.config - {:service Alert Lag Store 0.99, :metric 10, :tags Notify}
INFO [2021-11-04 14:53:48,385] riemann task 1 - riemann.config - {:service Alert Lag Web 0.99, :metric 10, :tags Notify}
--------------------------------------------

Do you maybe have events coming in without a :method?
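
One way to see why a missing :method matters: in Clojure, looking up an absent key returns nil, and `str` renders nil as an empty string, so such events all end up with the same bare service name. A minimal sketch for a plain Clojure REPL (the event maps and the `label` name are made up for illustration):

```clojure
;; Hypothetical events: one with :method, one without (for example an
;; internal instrumentation event).
(def with-method    {:host "foo.com" :metric 3.0 :method "Web"})
(def without-method {:host "foo.com" :metric 17.0})

;; The same transformation the config passes to smap.
(defn label [event]
  (assoc event :service (str "Alert Lag " (:method event))))

(:service (label with-method))    ; => "Alert Lag Web"
(:service (label without-method)) ; => "Alert Lag " (nil renders as "")
```

Since `(by :method ...)` groups on the value of :method, events lacking the field would presumably all land in one shared group, which would be consistent with the "Alert Lag 0.5" lines in the log.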

Cheers,
Jarpy.

Shakthi Kannan

Nov 4, 2021, 1:25:33 PM
to rieman...@googlegroups.com
Hi Toby,

--- On Thu, Nov 4, 2021 at 9:56 AM Toby McLaughlin <to...@jarpy.net> wrote:
| Here's me injecting events with the Ruby client:
| ...
\--

Thanks! I am able to see the :service name displayed correctly with
your Ruby client test code.

---
| Do you maybe have events coming in without a :method ?
\--

Yes.

1. Do I need to filter out and send only the events that contain a :method?

2. Is there any documentation on how the percentile is calculated
(maybe with some examples and logs)?

Thanks for your prompt response. Appreciate it!

SK

Toby McLaughlin

Nov 5, 2021, 4:38:08 AM
to Riemann Users
> Do I need to filter out and send only the events that contain a :method?

I would absolutely recommend that, yes! :)

While I think the basic logic of your config is correct, it currently processes every event in Riemann, including ones it doesn't understand. Riemann usually generates its own instrumentation metrics, so your stream picks those up too.

You could try something like:

  (match :method #{"Web" "Store" "Mobile"}
    (by :method [...]
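
Spelled out against the config from the first message, the filter might look like the sketch below. It assumes "Web", "Store" and "Mobile" are the only methods of interest, and it drops the intermediate logging and the final smap for brevity, so treat it as a starting point rather than a drop-in replacement:

```clojure
; Sketch of a riemann.config fragment: only events whose :method is one
; of the listed values reach the per-method percentile streams.
(streams
  (match :method #{"Web" "Store" "Mobile"}
    (by :method
      (smap #(assoc % :service (str "Alert Lag " (:method %)))
        (percentiles 5 [0.5 0.95 0.99]
          #(info %))))))
```

Events without a :method, including Riemann's own instrumentation events, do not match the set and never reach `by`.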

> Is there any documentation on how the percentile is calculated [?]

I'm only aware of the API docs, which are generated from the Clojure source. Looking at `streams.percentiles`:


...most of the work is done by `folds.sorted-sample`:
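
As a rough illustration of the sort-and-pick idea (this is not Riemann's actual code, and the exact indexing in the source may differ), the sketch below sorts a window of metrics and, for each point p between 0 and 1, takes the element at index floor(n * p), clamped to the last element. `sample-at` is a hypothetical name and the input values are invented:

```clojure
(defn sample-at
  "Hypothetical helper: for each point p in [0, 1], returns the metric at
  index floor(n * p) of the sorted metrics, clamped to the last element.
  Returns nil for an empty collection."
  [metrics points]
  (let [sorted (vec (sort metrics))
        n      (count sorted)]
    (when (pos? n)
      (mapv (fn [p]
              (nth sorted (min (dec n) (int (Math/floor (* n p))))))
            points))))

(sample-at [3.0 17.0 42.0 187.5] [0.5 0.95 0.99])
; => [42.0 187.5 187.5]
```

With only a handful of events in a window, the upper points land on the same largest value, which would be consistent with the 0.95 and 0.99 lines reporting an identical metric in the log from the first message.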


> Thanks for your prompt response. Appreciate it!

It's a pleasure. I've been using Riemann for a while now, and like to help out where my (limited) knowledge can be useful.

Shakthi Kannan

Nov 8, 2021, 1:40:55 AM
to rieman...@googlegroups.com
Hi,

--- On Fri, Nov 5, 2021 at 2:08 PM Toby McLaughlin <to...@jarpy.net> wrote:
| I would absolutely recommend that, yes! :)
\--

I am now filtering the relevant events only, and the percentiles are
showing up correctly.

Thanks!

SK

--
Shakthi Kannan
http://www.shakthimaan.com