I would like to calculate percentiles (very interested in 99.9). In
order to do that I'm thinking to add a "int[] latencies" inside
"Accumulator" class (only for PUT operations). Where index of this
array "latencies" is the "totalTimeNS" value that represents the time
in NS that a put operation took. (from StatTrackingStore ) The "value"
will be a counter that keeps track of the numbers of put's that took
the same amout of time , for example:
if 2 puts took the same amount of time (75 NS) each then the array
will have
latencies[75] = 2
I'll have to keep the latencies otherwise when the timeout (5 minutes)
arrays the Accumulator will be recreated.
I'll have to passed in the same way "total" does as in
<code>
private static class Accumulator {
...
...
public Accumulator newWithTotal() {
return new Accumulator(System.currentTimeMillis(), 0, 0,
total, latencies);
// return new Accumulator(System.currentTimeMillis(), 0,
0, total);
}
</code>
Any ideas? Is this the right approach?
Thanks you all,
John
This would be a very welcome feature. Your approach of storing the
histogram is good but storing an entry for each possible number of
nanoseconds up to a second would mean an array of 1,000,000,000 ints
which would require 3.8 GB of heap space per histogram! But that is
easily fixed, you just need to make the bin size configurable (default
of 1ms should be fine) and make the maximum configurable so that all
requests outside the range just go to a single bucket (say the last
bucket). Then to calculate the correct bin, rather than doing
bins[time] you would do bins[Math.round(time/binsize_in_ns)]. I doubt
anyone cares to be more accurate than 1ms, so that would be a
reasonable default bin size and would lead to a reasonably sized
array. The arrays could be reused to avoid continuously reallocating
them. We should definitely do it for all the request types GET, PUT,
GET_ALL, DELETE.
A slick implementation would correctly zero out any insignificant
digits (i.e. less than the precision of 1ms) and return positive
infinity if the quantile is found to be in the last bucket (the
overflow bucket).
-Jay
> --
> You received this message because you are subscribed to the Google Groups "project-voldemort" group.
> To post to this group, send email to project-...@googlegroups.com.
> To unsubscribe from this group, send email to project-voldem...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/project-voldemort?hl=en.
>
>
-Jay
index - [Math.log(timeNS) (i have put only the starting range
number for simplicity)
1 - [2.7..
2 - [7.29
3 - [19.68
4 -[53,14..
5 - [143.48
6 - [387.42
7 - [1046.03
8 - [2824
9 - [7625
10 - [20589
11 - [55590
12 - [150094
13 - [405255
14 - [1094189
15 - [2954312
16 - [7976644
17 - [21536939
18 - [58149737
19 - [157004289
20 - [423911582
21 - [1144561273
When calculating the percentile I'll lose accuracy. For example,
index 13 and 14 they have a big gap I won't know if I'm looking at
transaction that took 0.4 or 1.0 ms.
I think that we can have 2 dimensional array. I'll code this and
expose to you all for review.
Thanks,
John Cohen
This is the easiest approach:
<pre>
public class StatTrackingStore<K, V> extends DelegatingStore<K, V> {
...
...
@JmxGetter(name = "writeLatencyInMs", description = "Write
Latencies for: 50, 90, 99, 99.5 and 99.9 percentile")
public String getWriteLatencyInMs() {
return stats.getLatencyStats(Tracked.PUT);
}
}
public class RequestCounter {
/**
* Used to divide the TimeNS that a transaction took. The result
will be
* used as index into an array that keeps track of the total
number of tx.
* For example a value of:
* 34000 ns represents index=3 (int)(34000/10^4)
* 38000 ns represents index=3 (int)(38000/10^4)
* 375000 ns represents index=37 (int)(375000/10^4)
*
* Basically this will knock down 5 digits from time NS.
* I doubt anyone cares to be more accurate than 1ms, so that
would be a
* reasonable default.
*
*/
private final static int DISTRIBUTION_FACTOR = 100000;
/**
* Normalize the index back to millis.
*/
private final static float NORMILIZE_TO_MILLIS =
DISTRIBUTION_FACTOR / 1000000.0F;
public void addRequest(long timeNS) {
int index = (int) (timeNS / Accumulator.MAX_ARRAY);
for(int i = 0; i < 3; i++) {
Accumulator oldv = getValidAccumulator();
long startTimeMS = oldv.startTimeMS;
long count = oldv.count + 1;
long totalTimeNS = oldv.totalTimeNS + timeNS;
long total = oldv.total + 1;
int[] latencies = oldv.latencies;
// Updates the counter that keeps track of the number of
// transactions that took the same amount of time.
//
//
// If the index is bigger than Accumulator.MAX_ARRAY it
means that the
// transaction took such a big time that can be considered
a rare case,
// For example, let's consider:
//
// timeNS = 10.000.000.000 (10 seconds)
// index = timeNS / Accumulator.MAX_ARRAY
// index = 10.000.000.000 / 100.000
// index = 100.000
if(index < Accumulator.MAX_ARRAY) {
latencies[index] = getSafeNextCounter(latencies
[index]);
}else{
// Put this value in the last bucket.
latencies[Accumulator.MAX_ARRAY] = getSafeNextCounter
(latencies[Accumulator.MAX_ARRAY]);
}
//System.out.println("timeNS:" + timeNS + ", latencies["
+index + "]:" + latencies[index]);
if(values.compareAndSet(oldv, new Accumulator(startTimeMS,
count, totalTimeNS, total, latencies))) {
return;
}
}
}
private int getSafeNextCounter(int value) {
// reset the counter if maximum value is reached, unlikely to
// happen...
int nextValue = value+1;
if(nextValue < 2147483640)
return nextValue;
else
return 0;
}
private int findIndexByPercentage(final float percentage, final
long totalTx, int[] latencies) {
float countTx = (totalTx * percentage);
long found = 0;
for(int index = 0; index < Accumulator.MAX_ARRAY; index++) {
found = latencies[index] + found;
if(found > countTx) {
return index;
}
}
return 0;
}
public String getLatency() {
int[] latencies = getValidAccumulator().latencies;
long totalTx = 0;
for(int val: latencies) {
totalTx = val + totalTx;
}
int index50 = findIndexByPercentage(0.50F, totalTx,
latencies);
int index90 = findIndexByPercentage(0.90F, totalTx,
latencies);
int index99 = findIndexByPercentage(0.99F, totalTx,
latencies);
int index995 = findIndexByPercentage(0.995F, totalTx,
latencies);
int index999 = findIndexByPercentage(0.999F, totalTx,
latencies);
// Calculate the percentiles in Millis.
float p50Ms = index50 * NORMILIZE_TO_MILLIS;
float p90MS = index90 * NORMILIZE_TO_MILLIS;
float p99MS = index99 * NORMILIZE_TO_MILLIS;
float p995MS = index995 * NORMILIZE_TO_MILLIS;
float p999MS = index999 * NORMILIZE_TO_MILLIS;
return "Latency: 50=" + String.format("%.2f", p50Ms) +
", 90=" + String.format("%.2f", p90MS) +
", 99=" + String.format("%.2f", p99MS) +
", 99.5=" + String.format("%.2f", p995MS) +
", 99.9=" + String.format("%.2f", p999MS);
}
...
...
...
}
private static class Accumulator {
public final static int MAX_ARRAY = 100000;
private final static int MAX_ARRAY_PLUS_ONE = MAX_ARRAY + 1;
final int[] latencies;
public Accumulator() {
this(System.currentTimeMillis(), 0, 0, 0, new int
[MAX_ARRAY_PLUS_ONE]);
}
...
...
</pre>
Welcome your comments,
Thanks,
John Cohen
Looks great. Could you send a patch, so we could apply it and play
with it a bit.
-Jay
I have done modification in code so RequestCounter so it can send a
JMX notification when this instance is about the reset its stats.
counters.
This is a different approach but complementary of the current source
code. I like the notification based approach because an external
program simply will be listening for notifications event, instead of
the standalone program poll in a fix intervals. Also the notification
is accurate in this way because if the standalone program polls every
1 minutes I could be that the counters got reset in between this
minute so all the stats counters won't reflect what happened during
this minute:
"|" start and end of the 5 min. interval
"^" when the standalone program poll
|-----+-----+-----+-----+-----|------+
...............................^....^
On Jan 28, 11:07 pm, Jay Kreps <jay.kr...@gmail.com> wrote:
> HeyJohn,
>
> Looks great. Could you send a patch, so we could apply it and play
> with it a bit.
>
> -Jay
>
Bruce
On Jan 28, 11:07 pm, Jay Kreps <jay.kr...@gmail.com> wrote:
It really depends on your use case. If your client/monitoring solution
is always connected to the voldemort server via jmx then the
notification may be sufficient. However, if your client only connects
occasionally then having stats that actually reflect a given period of
time (i.e. a sliding window) is actually very useful. In our case our
application only connects to show jmx stats to administrators on
demand. Given that usage scenario the current mechanism would actually
be harmful for us to display to administrators simply because it's
sure to elicit bug reports.
There are downsides to the sliding window approach though ... the main
one being that you have a choice between memory usage vs how many
operations you track.
Regards,
Bruce Ritchie
On Mar 11, 1:54 pm, John Cohen <john.java.w...@gmail.com> wrote:
> Hi,
>
> I wonder if the windows approach is really need it. If you have a
> notification based mechanism you don't have to keep a window at all.
> You are getting the information out of Voldemort anyway using a jmx client
> application, the idea of the window is to "smooth" out the data. why to go
> throw the calculation of keeping a sliding window?
> With a notification based approach you receive the information as soon as
> its about the be reset it.
>
> On Thu, Mar 11, 2010 at 1:37 PM, Bruce Ritchie <bruce.ritc...@gmail.com>wrote:
>
>
>
> > Note that this patch might conflict with the change for making stats
> > be sliding window based (http://code.google.com/p/project-voldemort/
> > issues/detail?id=221<http://code.google.com/p/project-voldemort/%0Aissues/detail?id=221>).
> > project-voldem...@googlegroups.com<project-voldemort%2Bunsubscr i...@googlegroups.com>
> > .
> > > > For more options, visit this group athttp://
> > groups.google.com/group/project-voldemort?hl=en.
>
> > --
> > You received this message because you are subscribed to the Google Groups
> > "project-voldemort" group.
> > To post to this group, send email to project-...@googlegroups.com.
> > To unsubscribe from this group, send email to
> > project-voldem...@googlegroups.com<project-voldemort%2Bunsubscr i...@googlegroups.com>
To unsubscribe from this group, send email to project-voldem...@googlegroups.com.