I'm new to RRDtool. I'm actually still evaluating it as a tool for storing monitoring metrics.
After reading the documentation and several discussions on this list and in other forums,
I'm still confused about a couple of things.
1. How are subsecond timestamps reported? Several places mention that these timestamps exist, but I can't find any example of how to specify one in the update command, except when using "N" as the timestamp.
2. Supposing subsecond timestamps are supported by RRDtool, are they also supported by rrdcached?
3. Finally, is there any patch that adds a SUM function for totalizing a metric over a period? This is a very important requirement, as we need to count the total number of certain events per minute, hour and day. I'm aware of the approach of taking the average and multiplying by the reporting period, but this leads to an approximate value. Even though I understand the concerns about it not working properly in all situations, I think it is a very common requirement. I simply can't believe nobody has ever done this, unless I'm completely wrong about the need for such a function.
Many thanks in advance
-- Pablo Chacin
R&D Engineer
SenseFields SL
Tlf (+34) 93 418 05 85
Baixada de Gomis 1,
08023 Barcelona (Spain)
http://www.sensefields.com/
> I'm new to RRDtool. I'm actually still evaluating it as a tool for
> storing monitoring metrics.
> After reading the documentation and several discussions on this list and
> in other forums,
> I'm still confused about a couple of things.
> 1. How are subsecond timestamps reported? Several places mention that
> these timestamps exist, but I can't find any example of how to specify
> one in the update command, except when using "N" as the timestamp.
rrdtool update 1351545246.3897:394.33
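A minimal sketch of assembling such an update argument from Python. The file name `metrics.rrd` and the helper `update_arg` are hypothetical, and the sketch assumes the installed rrdtool accepts fractional timestamps in the form shown above. The command is only built here, not executed.

```python
import time


def update_arg(value, timestamp=None, digits=4):
    """Format one 'timestamp:value' argument with a fractional timestamp."""
    if timestamp is None:
        timestamp = time.time()  # float seconds since the epoch
    return f"{timestamp:.{digits}f}:{value}"


# Hypothetical file name; the argument list is assembled but never run.
command = ["rrdtool", "update", "metrics.rrd",
           update_arg(394.33, 1351545246.3897)]
print(" ".join(command))
```

Passing an explicit timestamp keeps the sub-second part under the caller's control instead of relying on "N".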
> 2. Supposing subsecond timestamps are supported by RRDtool, are they
> also supported by rrdcached?
I think the version on GitHub does support it ... the released
version does not.
> 3. Finally, is there any patch that adds a SUM function for totalizing a
> metric over a period? This is a very important requirement, as we need to
> count the total number of certain events per minute, hour and day. I'm
> aware of the approach of taking the average and multiplying by the
> reporting period, but this leads to an approximate value.
everything in rrdtool is a 'rate', so even if there were such a
function the math would still be the same ... can you elaborate in
what way you expect the result of the calculation to be an
approximation?
> Even though
> I understand the concerns about it not working properly in all
> situations, I think it is a very common requirement. I simply can't believe
> nobody has ever done this, unless I'm completely wrong about the need for
> such a function.
>> 3. Finally, is there any patch that adds a SUM function for totalizing a
>> metric over a period? This is a very important requirement, as we need to
>> count the total number of certain events per minute, hour and day. I'm
>> aware of the approach of taking the average and multiplying by the
>> reporting period, but this leads to an approximate value.
> everything in rrdtool is a 'rate', so even if there were such a
> function the math would still be the same ... can you elaborate in
> what way you expect the result of the calculation to be an
> approximation?
For example:
0:UNKN 300:10 600:10 900:10
consolidated and then multiplied by the period will produce the same value
as if all entries were 10, and not with one UNKN (given the cf is 0.5). A
SUM function would nicely return 30.
-- Pablo Chacin
> For example:
> 0:UNKN 300:10 600:10 900:10
> consolidated and then multiplied by the period will produce the same value
> as if all entries were 10, and not with one UNKN (given the cf is 0.5). A
> SUM function would nicely return 30.
you can track the amount of unknown time too ... but you can use
the TOTAL function, which will ignore unknown values ...
otoh, why would you want to count UNKNOWN as ZERO? isn't it more
correct to assume that the values during the time where you have no
information were the same as over the rest of the interval?
> On 10/29/2012 10:16 PM, Tobias Oetiker wrote:
> > Today Pablo Chacin wrote:
> you can track the amount of unknown time too ... but you can use
> the TOTAL function, which will ignore unknown values.
Didn't know this. Could you please point me to
> otoh, why would you want to count UNKNOWN as ZERO? isn't it more
> correct to assume that the values during the time where you have no
> information were the same as over the rest of the interval?
In my case, no. I may have gaps in the collection which, if I understand correctly,
will be translated into unknown.
-- Pablo Chacin
>>> consolidated and then multiplied by the period will produce the
>>> same value as if all entries were 10, and not with one UNKN (given
>>> the cf is 0.5). A SUM function would nicely return 30.
> otoh, why would you want to count UNKNOWN as ZERO? isn't it more
> correct to assume that the values during the time where you have no
> information were the same as over the rest of the interval?
Hey Tobi,
I realize RRD is not an ANSI SQL compliant database (*grin*) ... some
confusion might arise that in a database, NULL (unknown) is not
sum()'d:
--8-<---8-<---8-<---8-<---8-<---8-<---8-<--
19:05:23 plugh > insert into tobi values (1);
1 row created.
Elapsed: 00:00:00.00
19:05:31 plugh > insert into tobi values (4);
1 row created.
Elapsed: 00:00:00.01
19:05:36 plugh > insert into tobi values (8);
1 row created.
Elapsed: 00:00:00.00
19:05:38 plugh > insert into tobi values (null);
1 row created.
Elapsed: 00:00:00.00
19:05:44 plugh > commit;
Commit complete.
Elapsed: 00:00:00.10
19:05:46 plugh > select * from tobi;
SOME_COLUMN
-----------
1
4
8
Elapsed: 00:00:00.01
19:05:49 plugh > select sum(some_column) from tobi;
Today Pablo Chacin wrote:
> On 10/29/2012 11:57 PM, Tobias Oetiker wrote:
> > you can track the amount of unknown time too ... but you can use
> > the TOTAL function, which will ignore unknown values.
> Didn't know this. Could you please point me to
you can use a CDEF to turn UNKNOWN into 0 and all the rest of the
entries into 1, then calculate the AVERAGE. This will tell you how much
'UNKNOWN' is contained in your data stream.
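The arithmetic behind that suggestion can be sketched outside RRDtool. The Python below is only an illustration, not RRDtool code: unknown samples are modelled as NaN, the function name is made up, and the sample data is the four-slot example from earlier in the thread. In RPN the flag itself would presumably look something like `x,UN,0,1,IF`.

```python
import math


def known_fraction(samples):
    """Average of a 0/1 'is known' flag, i.e. the share of known samples."""
    flags = [0.0 if math.isnan(s) else 1.0 for s in samples]
    return sum(flags) / len(flags)


# UNKN is modelled as NaN; values follow the 0/300/600/900 example.
samples = [float("nan"), 10.0, 10.0, 10.0]
known = known_fraction(samples)
print("known share:", known, "unknown share:", 1.0 - known)
```

The average of the flag is the known share, so the unknown share is one minus that value.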
> > otoh, why would you want to count UNKNOWN as ZERO? isn't it more
> > correct to assume that the values during the time where you have no
> > information were the same as over the rest of the interval?
> In my case, no. I may have gaps in the collection which, if I understand
> correctly, will be translated into unknown.
yes ... and thus, if you build a sum over all the entries, the most
correct solution is that the SUM is also unknown. RRDtool has the
option of ignoring the unknown entries using the TOTAL function
(which will cause them to essentially be treated as 0).
or if you take the average and multiply it by the total
interval, you assume the unknown samples to be equal to the
average.
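The two interpretations give different numbers on the four-slot example from this thread. A rough Python sketch, assuming each value is an event count per slot (as in Pablo's example) and modelling UNKN as NaN; the function names are illustrative and are not RRDtool functions:

```python
import math


def total_ignoring_unknown(counts):
    """Sum only the known slots, so unknown slots contribute nothing."""
    return sum(c for c in counts if not math.isnan(c))


def average_times_slots(counts):
    """Average the known slots, then scale to the full number of slots."""
    known = [c for c in counts if not math.isnan(c)]
    if not known:
        return math.nan  # nothing known, so the result stays unknown
    return sum(known) / len(known) * len(counts)


counts = [float("nan"), 10.0, 10.0, 10.0]  # 0:UNKN 300:10 600:10 900:10
print(total_ignoring_unknown(counts))  # unknown slot treated as 0
print(average_times_slots(counts))     # unknown slot assumed equal to the average
```

The first result is 30 and the second is 40, which is exactly the discrepancy discussed above.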
I think that "it is not a database" is not the point.
The point is: is the way unknown values are handled
coherent with the calculation of totals?
As I see it, there's no fundamental reason not to
be able to calculate totals in an RRD. Other implementations
of the concept, such as StatsD and Graphite, allow it.
On 10/30/2012 03:51 AM, j.halif...@seznam.cz wrote:
> On 10/29/2012 06:57 PM, Tobias Oetiker wrote:
> > Today Pablo Chacin wrote:
> >> For example:
> >>> 0:UNKN 300:10 600:10 900:10
> >>> consolidated and then multiplied by the period will produce the
> >>> same value as if all entries were 10, and not with one UNKN (given
> >>> the cf is 0.5). A SUM function would nicely return 30.
> > otoh, why would you want to count UNKNOWN as ZERO? isn't it more
> > correct to assume that the values during the time where you have no
> > information were the same as over the rest of the interval?
> Hey Tobi,
> I realize RRD is not an ANSI SQL compliant database (*grin*) ... some
> confusion might arise that in a database, NULL (unknown) is not
> sum()'d:
> yes ... and thus, if you build a sum over all the entries, the most
> correct solution is that the SUM is also unknown. RRDtool has the
> option of ignoring the unknown entries using the TOTAL function
> (which will cause them to essentially be treated as 0).
Unfortunately, I can do this only when retrieving data, in a VDEF,
not in a CDEF so that it is actually stored that way, right?
By the way, can I use a VDEF in an export? From the documentation it is
not evident when VDEFs are allowed.
Finally, as this requirement comes up frequently, will you eventually
consider incorporating a TOTAL (or SUM) CDEF as a feature of RRDtool?
Thanks
-- Pablo Chacin
Pablo Chacin wrote:
>The point is: is the way unknown values handle
>coherent with the calculation of totals?
>As I see it, there's no fundamental reason not to
>be able to calculate totals in a RRD.
You can compute totals in RRD - the problem is that the way it does it isn't the way you want to do it *in this application*. Whichever way it's done, there will be a lot of people for whom that is not the right way. Even in one setup the requirements may change - for example ...
At work I use RRD to log traffic on our network. We sell on some of our bandwidth to other tenants in the building, but we don't charge by volume of data or impose monthly caps.
Now consider these scenarios :
1) We have a failure in data collection, but the network is still up.
In this case, traffic is still flowing, but we aren't recording it. The most sensible option would be to assume that the traffic in the gap is roughly the average of the rates either side of the gap.
2) We have an actual outage (extended power failure so we lose our monitoring as well as the line).
In this case the sensible option is to assume the data was zero during the gap.
3) As in 1 - we have a collection failure but the network is still running. If we were imposing caps or charging by the gigabyte then we'd probably have to assume zero for charging/capping purposes as we wouldn't have the data to back up any other assumption. But for our own traffic management/planning we'd want to assume data was still flowing.
So there you are - which way should RRD do it ? Either way is both right and wrong. As it is, it's possible to work around it either way if you use the right calculations when extracting the data.
If you want to assume zero, then this RPN will do the conversion for you, then work with the result :
aa=a,UN,0,a,IF
which means: if a is unknown then substitute 0, else use a.
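To make the two gap policies from the scenarios above concrete, here is a small Python sketch. It is an illustration under assumptions, not RRDtool code: unknown samples are modelled as NaN, the function names are hypothetical, and "average either side of the gap" is read as the mean of the nearest known neighbours.

```python
import math


def fill_zero(series):
    """Scenario 2: assume nothing flowed during the gap."""
    return [0.0 if math.isnan(v) else v for v in series]


def fill_neighbour_mean(series):
    """Scenario 1: assume the gap looked like the known values around it."""
    out = list(series)
    for i, v in enumerate(series):
        if math.isnan(v):
            # Nearest known value before and after the gap, if any.
            left = next((x for x in reversed(series[:i]) if not math.isnan(x)), None)
            right = next((x for x in series[i + 1:] if not math.isnan(x)), None)
            near = [x for x in (left, right) if x is not None]
            out[i] = sum(near) / len(near) if near else math.nan
    return out


series = [8.0, float("nan"), 12.0, 10.0]
print(fill_zero(series), sum(fill_zero(series)))
print(fill_neighbour_mean(series), sum(fill_neighbour_mean(series)))
```

The same raw data yields a total of 30 under the zero policy and 40 under the neighbour-mean policy, which is why the choice has to be made by the person extracting the data.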
-- Simon Hobson
Visit http://www.magpiesnestpublishing.co.uk/ for books by acclaimed
author Gladys Hobson. Novels - poetry - short stories - ideal as
Christmas stocking fillers. Some available as e-books.
Thanks Simon, that's a very compelling explanation of how to
handle this situation. It is probably worth putting on the RRDtool
wiki to prevent others from regularly coming and asking the same
question.
Regards
P.S. Just one final question: are these RPNs available when fetching
and exporting data? I don't want to actually graph data, but to
export it to other applications. The documentation is not clear.
On 10/30/2012 08:51 AM, Simon Hobson wrote:
> You can compute totals in RRD - the problem is that the way it does
> it isn't the way you want to do it *in this application*. Whichever
> way it's done, there will be a lot of people for whom that is not the
> right way. Even in one setup the requirements may change
> So there you are - which way should RRD do it ? Either way is both
> right and wrong. As it is, it's possible to work around it either way
> if you use the right calculations when extracting the data.
> If you want to assume zero, then this RPN will do the conversion for
> you, then work with the result :
> aa=a,UN,0,a,IF
> which means: if a is unknown then substitute 0, else use a.
-- Pablo Chacin