calculate "artificial" servers from a set of R and lambda?
From: Martin Berger <martin.a.ber...@gmail.com>
Date: Thu, 15 Mar 2012 07:08:56 -0700 (PDT)
Local: Thurs, Mar 15 2012 10:08 am
Subject: calculate "artificial" servers from a set of R and lambda?

Maybe my idea sounds somewhat strange, so please tell me if I have made some big
mistake anywhere:
Many here might know the nice graph with "Arrival rate" (*R*) on the x-axis
and "Response time" (*λ*) on the y-axis.
It's also some fun to change the number of "servers" in the underlying
Excel/R/whatever to show slightly different graphs.
My question now is whether anyone has ever tried to calculate an artificial value
for the servers (*M*) for a given set of measurements (R,λ) on a totally
unknown system?
Is this value of any use for describing the system? (The service time would
also be an outcome of this calculation, I guess.)
Any comments on this approach?

thank you,
 Martin


 
From: M Edward Borasky <zn...@borasky-research.net>
Date: Thu, 15 Mar 2012 09:00:56 -0700
Local: Thurs, Mar 15 2012 12:00 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?
I did a lot of multidimensional exploratory plots around Little's Law
a while back (2008 CMG conference, to be exact) from "iostat" data on
Linux during a disk benchmark. Something like a scatterplot matrix
would work for what you're trying to do, I think.


--
Twitter: http://twitter.com/znmeb Data Journalism Developer Studio
2012LX http://j.mp/DJDS2012LX

"A mathematician is a device for turning coffee into theorems." -- Paul Erdős


 
From: DrQ <redr...@yahoo.com>
Date: Thu, 15 Mar 2012 10:04:42 -0700 (PDT)
Local: Thurs, Mar 15 2012 1:04 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Hi Martin,

First, let me see if I can more accurately rephrase your question (with
corrected notation) to check my understanding.

Suppose you have measurements for the following metrics:

   - Response time or, more accurately, residence time: R
   - Arrival rate of requests: λ
   - Service times at each service process or facility: S

If you plot those data as follows:

   - (R/S) on the y-axis
   - (λ/S) on the x-axis

we would expect them to fall in a more or less *convex* arrangement, i.e.,
tending to bend *upward* with increasing traffic. See Figure 1 of this blog
post <http://perfdynamics.blogspot.com/2009/07/remembering-mr-erlang-as-uni...>.
With these assumptions, and denoting the number of *active service processes*
by *m*, your question then becomes: Can we determine which m value best
matches those data?

In principle, you can, but *not* if you don't know the mean service time S.

Assuming you do know S, it's easiest to apply the power law *approximation*
to the exact Erlang residence-time formula:

(R/S) = 1 / [ 1 - (λ/S)^m ] .... See eqn. (4.68) in my Perl::PDQ book <http://www.perfdynamics.com/iBook/ppa_new.html>.

You can solve for (or fit) m, but not both m and S simultaneously.

From: DrQ <redr...@yahoo.com>
Date: Thu, 15 Mar 2012 10:22:02 -0700 (PDT)
Local: Thurs, Mar 15 2012 1:22 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Now let me correct myself. :/

You cannot analytically *solve* for m without knowing S, but you could do a
regression fit for both m and S simultaneously.

To clarify what I mean by solving for m, that corresponds to determining
which of the possible m-curves (in Figure 1) your data lies on.

Another way to look at it is, the "flatter" the data under heavy traffic,
the greater the number of servers m.
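
A rough sketch of what "a regression fit for both m and S simultaneously" could look like, using a brute-force least-squares search in plain Python. Everything here is an illustration rather than anyone's actual method: the model form R = S / (1 - ρ^m) with ρ = λS/m is assumed, the data are synthetic (generated from the model itself), and the function names and grids are hypothetical.

```python
# Sketch: fit both m and S to (arrival rate, residence time) pairs.
# Assumed model: R = S / (1 - rho**m), with rho = lam * S / m.

def residence_time(lam, S, m):
    """Approximate M/M/m residence time; infinite at or beyond saturation."""
    rho = lam * S / m
    if rho >= 1.0:
        return float("inf")
    return S / (1.0 - rho ** m)

def sse(data, S, m):
    """Sum of squared errors between the model and the measurements."""
    return sum((R - residence_time(lam, S, m)) ** 2 for lam, R in data)

def fit_m_and_S(data, m_values, S_values):
    """Brute-force least squares over a grid of candidate (m, S) pairs."""
    candidates = ((sse(data, S, m), m, S) for m in m_values for S in S_values)
    _, m, S = min(candidates)
    return m, S

# Hypothetical data, generated from the model with m=4 and S=0.010 s.
true_m, true_S = 4, 0.010
data = [(lam, residence_time(lam, true_S, true_m)) for lam in range(50, 400, 50)]

S_grid = [k / 10000 for k in range(50, 151)]   # candidate S from 0.0050 to 0.0150 s
m_fit, S_fit = fit_m_and_S(data, range(1, 9), S_grid)
print(m_fit, S_fit)
```

With real measurements the error surface is noisy and m and S are strongly coupled, so a grid search like this is best treated as a starting point for a proper nonlinear regression, not a replacement for one.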


 
From: Andrew Sliwkowski <asliwkow...@gmail.com>
Date: Thu, 15 Mar 2012 13:47:51 -0400
Local: Thurs, Mar 15 2012 1:47 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Hi DrQ,
Outstanding reply ... Socrates would be very proud.
cheers/drew ... (hope you don't mind the thread interrupt)


 
From: DrQ <redr...@yahoo.com>
Date: Thu, 15 Mar 2012 11:05:51 -0700 (PDT)
Local: Thurs, Mar 15 2012 2:05 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Bloody hell, I just realized I made another notational error. I should have
just used the correct notation in the first place, instead of trying to be
cleverly accommodating.

The per-server utilization is: ρ = λS / m, where you have to divide by
the number of servers (m) to ensure that ρ < 1. You can't have any server
busier than 100% of the time.

Then, the normalized residence time is given by:

(R/S) = 1 / [ 1 - ρ^m ]

Once again, you see how much m and S get mixed together.
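
To see how close the power-law form 1 / (1 - ρ^m) is to the exact Erlang result, here is a small sketch that computes both. It assumes the standard Erlang C formula for an M/M/m queue and takes ρ as the per-server utilization λS/m; the function names are made up for this example.

```python
from math import factorial

def erlang_c(m, rho):
    """Probability that an arrival has to wait in an M/M/m queue."""
    a = m * rho                                  # offered load in Erlangs
    tail = a ** m / factorial(m) / (1.0 - rho)
    head = sum(a ** k / factorial(k) for k in range(m))
    return tail / (head + tail)

def exact_normalized_R(m, rho):
    """Exact M/M/m residence time divided by the service time S."""
    return 1.0 + erlang_c(m, rho) / (m * (1.0 - rho))

def approx_normalized_R(m, rho):
    """Power-law approximation: R/S = 1 / (1 - rho**m)."""
    return 1.0 / (1.0 - rho ** m)

for m in (1, 2, 4, 8):
    for rho in (0.5, 0.9):
        print(m, rho,
              round(exact_normalized_R(m, rho), 4),
              round(approx_normalized_R(m, rho), 4))
```

For m = 1 and m = 2 the two expressions agree algebraically; for larger m they differ, which is why the power-law form is called an approximation.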

Sorry about that. Please consider that I have now been slapped by a
recursive Socrates. :)


 
From: Martin Berger <martin.a.ber...@gmail.com>
Date: Thu, 15 Mar 2012 20:39:33 +0100
Local: Thurs, Mar 15 2012 3:39 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?
Neil,

thank you for your 2 replies.

I did the formal transformations before and my equation was

        R = S / [ 1 - ( λS/m )^m ]

that was the point where I decided to ask the group.

Your assumption regarding my question is correct. I am still trying my
best to make it somehow usable for me:

Can I say "If I have a measurement with a very small λ (at most one
request is in the whole system at any time), then S ≈ R"?
This would make the whole problem slightly easier :-)

I will have to force R (the program, not the residence time) to solve
my problem.

--
Martin Berger           martin.a.ber...@gmail.com
Lederergasse 27/2/14           +43 660 660 83306
1080 Wien                                   http://berx.at/

 
From: DrQ <redr...@yahoo.com>
Date: Thu, 15 Mar 2012 13:07:32 -0700 (PDT)
Local: Thurs, Mar 15 2012 4:07 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Yes, quite so.

min(R) = S for all standard queueing systems, because the shortest
residence time is just the service time (i.e., no waiting time) and occurs
only when ρ ≈ 0 or λ ≈ 0.

Equivalently for the normalized residence time, R/S → 1 as ρ → 0.
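
A quick numerical check of that limit, plus the obvious estimator it suggests. This is only a sketch: it assumes the power-law form R/S = 1 / (1 - ρ^m), and the measurement pairs are invented for illustration.

```python
def normalized_R(rho, m):
    """Normalized residence time R/S under the power-law approximation."""
    return 1.0 / (1.0 - rho ** m)

# R/S approaches 1 as the utilization shrinks, for any number of servers.
for rho in (0.5, 0.1, 0.01, 0.001):
    print(rho, normalized_R(rho, 1), normalized_R(rho, 4))

# Hypothetical measurements: (arrival rate in req/s, residence time in s).
measurements = [(5, 0.0101), (50, 0.0112), (200, 0.0150), (350, 0.0260)]

# If the lightest-load sample is essentially queue-free, min(R) estimates S.
S_estimate = min(R for _, R in measurements)
print(S_estimate)
```

The estimate is only as good as the lightest-load sample: if even that sample includes some waiting time, min(R) overstates S.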


 
From: DrQ <redr...@yahoo.com>
Date: Thu, 15 Mar 2012 13:38:51 -0700 (PDT)
Local: Thurs, Mar 15 2012 4:38 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

A word of caution here, as I think more about it.

Everything we've discussed so far is based on the assumption that a single
M/M/m queueing model is appropriate. At some point you will need to
validate that assumption independently. Why do I say that?

Your starting point (as I understand it) is *measured* performance metrics.
And suppose further that you are able to determine the minimum residence
time or minimum response time from those data. That minimum time, however,
might *not* correspond to the service time, as previously discussed.

Consider the case where there is another queue (say, M/M/1 for the sake of
argument)  preceding the M/M/m queue or succeeding it or both. In other
words, there was a tandem arrangement of queued service stages (buffers) in
the production system or the test rig. Remember, all you have is your data.

If, for example,  there were 3 such stages in tandem and only a single
request in the system (as you mentioned earlier), the minimum time to get
through that chain of queues would be the sum of the 3 service times at
each buffer. In other words, we would now have min(R) = S1 + S2 + S, in
which case the minimum *response* time min(R) != S, the minimum *residence*
time for the M/M/m stage.
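
The arithmetic behind this caution, as a tiny sketch. The three service times are hypothetical values chosen only to show how far off the estimate can be when the tandem stages are invisible in the data.

```python
# Hypothetical service times (seconds) for three stages in tandem.
S1, S2 = 0.002, 0.003      # stages before and after the M/M/m stage
S = 0.010                  # service time of the M/M/m stage itself

def min_response_time(service_times):
    """With a single request in the system there is no waiting,
    so the response time is just the sum of the service times."""
    return sum(service_times)

min_R = min_response_time([S1, S2, S])
overestimate = min_R / S   # factor by which min(R) overstates S
print(min_R, overestimate)
```

Taking min(R) as the service time here would inflate S by about half, and any m fitted on top of that S would inherit the error.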

This is one reason why I stress the distinction b/w *response* time and
*residence* time in my classes <http://www.perfdynamics.com/Classes/schedule.html>.
It's one thing to read it in my books; quite another to have me ramming it
down your throat in real time. ;-)


 
From: DrQ <redr...@yahoo.com>
Date: Thu, 15 Mar 2012 18:48:27 -0700 (PDT)
Local: Thurs, Mar 15 2012 9:48 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

A good example of an M/M/m model that adheres to the aforementioned
validation assumptions can be found in Example 1.2 (p.16) of my Perl::PDQ
book <http://www.perfdynamics.com/iBook/ppa_new.html>. It's a performance
model of an email spam-scan farm taken from a very well-known, large-scale
website.

The farm comprises hundreds of 4-way servers. These were modeled as M/M/4
queueing nodes in PDQ. The reason this model works so well is because the
workload is essentially a cpu-intensive batch processing type with no
significant inter-processor communication (i.e., no SMP overhead). In
addition, the website collected many other performance metrics to provide
full validation of this simple model.

In all honesty, the person who was most surprised that this simple model
came together so quickly and worked so well, was me.


 
From: Martin Berger <martin.a.ber...@gmail.com>
Date: Fri, 16 Mar 2012 11:42:01 +0100
Local: Fri, Mar 16 2012 6:42 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Neil,

again you are correct in your assumption, as well as your warnings.

Personally I am aware it is risky to simplify any system down to the level
where I am right now.
In fact the whole system is quite complex: an Oracle cluster with a shared
storage network and disk arrays below. It is nothing I would ever
volunteer to model in any detail in PDQ. But in this exercise my audience
(an internal application team which uses the DB) sees 'their' database
as a black box.
Currently they are focused only on 'response time', so they claim that if they
change their code to get a better 'response time', they scale better. I
have observed that they sometimes use many more resources *per request* to get
this better response time. So based on my simple math, they will saturate
earlier and scale worse.
That's where I started to dream of an easy way to calculate my "artificial"
servers for any of their test cases.
If I can compare implementations using not only S but a tuple (S,m), I hope
the comparison is more correct.
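
A sketch of why a tuple (S,m) can rank two implementations differently than S alone. The two parameter sets are purely hypothetical: implementation A is slower per request but spreads load over four servers, while implementation B answers faster but ties up the whole system per request. The model form R = S / (1 - (λS/m)^m) is assumed.

```python
def saturation_rate(S, m):
    """Arrival rate at which the per-server utilization lam*S/m reaches 1."""
    return m / S

def residence_time(lam, S, m):
    """Approximate M/M/m residence time; infinite at or beyond saturation."""
    rho = lam * S / m
    if rho >= 1.0:
        return float("inf")
    return S / (1.0 - rho ** m)

impl_a = {"S": 0.010, "m": 4}   # hypothetical: slower, but scales out
impl_b = {"S": 0.004, "m": 1}   # hypothetical: faster, but more resources per request

print(saturation_rate(**impl_a), saturation_rate(**impl_b))
for lam in (10, 100, 200, 300):
    print(lam, residence_time(lam, **impl_a), residence_time(lam, **impl_b))
```

At low load B wins on response time, but it saturates at a lower arrival rate, so beyond some traffic level A is the better choice. That crossover is invisible if only S is compared.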

Martin


 
From: laks <lnarayanan.sesha...@gmail.com>
Date: Thu, 5 Apr 2012 10:47:06 -0700 (PDT)
Local: Thurs, Apr 5 2012 1:47 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?
Hi Martin
  Two queries here out of curiosity: For the storage subsystem, do you
use Oracle orion or any other kit to test the storage scalability
metrics separately and then fit them into a separate model for your
Oracle workload (OLTP or any other, as you need)?

Dr Q/Neil, for cases like a typical Oracle Cluster/RAC workload
with shared storage (NAS/SAN or any other), do you suggest modelling/
testing the storage separately and then doing the CPU subsystem
separately (SMP, NUMA)?

Pls excuse me if this goes off topic or unrelated.

thx
Laks

From: DrQ <redr...@yahoo.com>
Date: Thu, 5 Apr 2012 11:04:41 -0700 (PDT)
Local: Thurs, Apr 5 2012 2:04 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

For my part, I would say so for 2 reasons:

1) The sequence of PDQ queueing nodes in your CaP model does not have to
match every possible component that exists in the real system. It just has
to be sufficient to resolve where the major bottlenecks are occurring or
will occur in the future.

2) If at all possible it is good to measure the latency of the network
storage subsystem separately. The point being that the response time part
of the PDQ model should be a lot simpler than you might imagine or that you
would read about in a book like Simitci's "Storage Network Performance
Analysis," (2003); a good book, btw and I know of no other equivalent book.


 
From: laks <lnarayanan.sesha...@gmail.com>
Date: Thu, 5 Apr 2012 11:54:22 -0700 (PDT)
Local: Thurs, Apr 5 2012 2:54 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?
Thanks Neil for your replies/clarification here.

Regards
Laks

From: Martin Berger <martin.a.ber...@gmail.com>
Date: Fri, 6 Apr 2012 07:04:11 +0200
Local: Fri, Apr 6 2012 1:04 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?
Laks,

in general I try at least to get a 'good knowledge' of the storage
subsystem.
That means a study of the documentation as well as a test with orion
or some other tools. (If you want to test physical and logical IOs of
an Oracle instance, maybe you are interested in [1].)
As we have in general full storage networks, there are sometimes
effects our storage admins do not want to accept at first sight. As
an example: we tested one big virtual 'disk' attached to a Linux host
versus 10 smaller 'disks' (which were based on the same underlying
structures, so they were 'the same thing'), but we got much better
results with these 10 disks. I was never allowed to investigate this
to find the real reason, but my current working theory is that the multiple
IO queues on the OS are the key to the better results.

In this particular question I do NOT want to model any sub-system - in
fact I try to do it the other way: step 'back' until I can see the
whole system as a single black box, where all the details cannot be
seen anymore.

let's see if it does me any good :)

Martin

[1] http://kevinclosson.wordpress.com/2012/02/06/introducing-slob-the-sil...


 
From: laks <lnarayanan.sesha...@gmail.com>
Date: Fri, 6 Apr 2012 08:18:59 -0700 (PDT)
Local: Fri, Apr 6 2012 11:18 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?
Thanks Martin for clarification. I am getting what you wanted as i am
also aware of the situation you explain.

From: DrQ <redr...@yahoo.com>
Date: Fri, 6 Apr 2012 08:45:24 -0700 (PDT)
Local: Fri, Apr 6 2012 11:45 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Just to clarify on point (2):

From everything I've seen (which isn't necessarily that much), I believe
correctly tuned and appropriately allocated network storage should respond
like a memory reference (i.e., remote cache) such that the response time is
possibly even dominated by the network latency, on average. This effect
probably degrades under heavy traffic conditions. But even if this is just
an approximate truth, it should serve as your *performance goal*, which you
may or may not be able to attain, depending on the app and all that. If you
can't attain it, you must be able to explain why not. It might not be your
fault. :)

The other upshot of this view is that it simplifies any CaP models you may
decide to make, e.g., in PDQ. They should be much much simpler than
anything you see in Simitci's book.

ASIDE:

   1. I think I could almost make the above observation of mine into a
   theorem.
   2. The problem is, I need more data at different workload intensities
   (request rates). I have only seen data that lies in some specific bands and
   it seems to be valid there.
   3. Many people (including Guerrillas) have promised to provide such, but
   none have ever delivered.

So, if you wanna get famous, talk to me.


 
From: wouterdb <w.debor...@gmail.com>
Date: Wed, 30 May 2012 09:05:48 -0700 (PDT)
Local: Wed, May 30 2012 12:05 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Hi all,

I've been working on this problem for a while. I've built a least squares
fitter for queueing stations. (https://github.com/wouterdb/fitlib)

I found out that a plain queueing station doesn't really fit my data all
that well, so I played around with some of the suggestions from the PDQ
book and added a parameter that scales all lambdas (request rates).

This gives me a formula of the form R(L) = S/(1-(b*L*S/n)^n).

I take n as a given, R and L are measured, and b and S are fitted.
The extra b parameter usually yields a pretty good fit (e.g.,
https://github.com/wouterdb/fitlib/raw/master/example.png).
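
For readers who want to poke at this formula, here is a minimal Python sketch of it. It is not taken from fitlib; the function names and the parameter values are hypothetical. It shows two features that pin the fitted parameters down: the y-intercept equals S, and the curve goes asymptotic where b*L*S/n reaches 1.

```python
def scaled_model(L, S, b, n):
    """R(L) = S / (1 - (b*L*S/n)**n), as written in the post above."""
    u = b * L * S / n
    if u >= 1.0:
        return float("inf")        # at or beyond the asymptote
    return S / (1.0 - u ** n)

def pole(S, b, n):
    """Request rate at which the model's denominator reaches zero."""
    return n / (b * S)

# Hypothetical parameter values, for illustration only.
S, b, n = 0.001, 1.5, 2

print(scaled_model(0.0, S, b, n))              # the y-intercept, equal to S
print(pole(S, b, n))                           # where the curve goes asymptotic
print(scaled_model(0.99 * pole(S, b, n), S, b, n))
```

So for a given n, the low-load data mostly determine S and the location of the knee mostly determines the product b*S, which is a handy sanity check on whatever a least-squares fitter returns.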

What do you think of this approach?

Wouter

PS: feel free to play around with the code, patches and datasets are
welcome.


 
From: DrQ <redr...@yahoo.com>
Date: Fri, 22 Jun 2012 00:16:27 -0700 (PDT)
Local: Fri, Jun 22 2012 3:16 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Not many responses, so far.

I, for one, can't say I follow your math or the overall objective based on
a single M/M/m queue (if that's what your formula is supposed to
represent), but I can say this. It doesn't do your cause any favors to
produce a plot with no labeled axes or legend of any kind. If you are doing
science, rather than art, the reader shouldn't be left guessing what you
are trying to convey. Otherwise, the response is likely to be a deafening
silence.


 
From: Wouter De Borger <w.debor...@gmail.com>
Date: Tue, 26 Jun 2012 14:49:11 +0200
Local: Tues, Jun 26 2012 8:49 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?

I stand corrected, new graph attached

What I am trying to do is automatically determine the parameters of
an M/M/m queue, given a set of measurements.
In the example graph, the data is from a mysql database, executing a set of
select queries.

However, it turns out that an M/M/m queue doesn't fit the data all too
well, so I added overdriven throughput (as in 'Analyzing Computer System
Performance with Perl::PDQ', 2nd ed., p. 381).

I have two concrete questions about this:

   1. I was wondering if anyone else has ever tried to do automated
   parameter estimation before?
   2. How can I improve the closeness of fit of the model?

Wouter

...

[Attachment: example.png, 56K]

 
From: DrQ <redr...@yahoo.com>
Date: Tue, 26 Jun 2012 08:57:17 -0700 (PDT)
Local: Tues, Jun 26 2012 11:57 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?

re: Plot. Vast improvement. Nice job. One tweak you might add is to note
which model you are fitting against.
Anyway, now we can get down to some brass tacks.

Before launching off into various performance modeling exotica (like
nonlinear regression and overdriven demand), you need to convince yourself
that the target model (i.e., M/M/m) even makes sense for an RDBMS. For
example, I could take the position that any RDBMS can only have a finite
number (N) of processes handling DB requests. In which case, the queue
length is bounded by N. An open queueing model like M/M/m is not compatible
with that assumption b/c there, the number of requests can be unbounded. So
which is it?

On the other hand, looking at your nice new plot, it seems to indicate that
the RDBMS you're measuring does behave like an unbounded queue, b/c it goes
asymptotic at about 2500 RPS. If I assume that indication (fitted curve)
is correct and apply Little's law at saturation, I calculate your mean
service time to be about 0.0004 s. And that, indeed, seems to jibe with
your y-intercept. Of course, we could've seen this immediately if you'd
plotted against the utilization rather than the request rate on the x-axis.
(cf. Fig. 4.17 on p. 121 of PPQ2)
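
The saturation arithmetic can be reproduced in a couple of lines. This sketch assumes a single server (m = 1), so that utilization ρ = λS reaches 1 at the asymptote; the 2500 RPS figure is the value read off the plot above, and the variable names are made up.

```python
lam_sat = 2500.0        # requests per second where the curve goes asymptotic
m = 1                   # assumption: one server, so rho = lam * S

# At saturation rho = lam_sat * S / m = 1, hence S = m / lam_sat.
S = m / lam_sat
print(S)                # mean service time in seconds
```

With m = 2 the same asymptote would imply a service time twice as long, which is one more way that m and S get mixed together.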

But you say, "an M/M/m queue doesn't fit the data all too well" which I now
find puzzling. Your new plot has pretty much convinced me of the thing you
were trying to unconvince me of.

...

From: Wouter De Borger <w.debor...@gmail.com>
Date: Wed, 27 Jun 2012 11:55:22 +0200
Local: Wed, Jun 27 2012 5:55 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?

On Tue, Jun 26, 2012 at 5:57 PM, DrQ <redr...@yahoo.com> wrote:
> re: Plot. Vast improvement. Nice job. One tweak you might add is to note
> which model you are fitting against.
> Anyway, now we can get down to some brass tacks.

> Before launching off into various performance modeling exotica (like
> nonlinear regression and overdriven demand), you need to convince yourself
> that the target model (i.e., M/M/m) even makes sense for an RDBMS. For
> example, I could take the position that any RDBMS can only have a finite
> number (N) of processes handling DB requests. In which case, the queue
> length is bounded by N. An open queueing model like M/M/m is not compatible
> with that assumption b/c there, the number of requests can be unbounded. So
> which is it?

The number of concurrent requests is bounded, but the server is configured
so that we never hit that limit.
How can I best model a limited queue size?

> On the other hand, looking at your nice new plot, it seems to indicate
> that the RDBMS you're measuring does behave like an unbounded queue, b/c it
> goes asymptotic at about 2500 RPS. If I assume that indication (fitted
> curve) is correct and apply Little's law at saturation, I calculate your
> mean service time to be about 0.0004 s. And that, indeed, seems to jibe
> with your y-intercept. Of course, we could've seen this immediately if
> you'd plotted against the utilization rather than the request rate on the
> x-axis. (cf. Fig. 4.17 on p. 121 of PPQ2)

New plot attached

> But you say, "an M/M/m queue doesn't fit the data all too well" which I
> now find puzzling. Your new plot has pretty much convinced me of the thing
> you were trying to unconvince me of.

The plot fits quite well, but only because the request rate has been
multiplied by 2.3. I know it works, but I don't see its real world meaning.
I think a multiplier smaller than one would mean there is a delay station
somewhere inside the server, which adds to the response time, but not
to utilization.
But what does a multiplier larger than one mean?

...

[Attachment: UtilExample.png, 74K]

 
From: DrQ <redr...@yahoo.com>
Date: Wed, 27 Jun 2012 10:40:40 -0700 (PDT)
Local: Wed, Jun 27 2012 1:40 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

Well, you can't with an M/M/m model. But let's come back to that later.

>> On the other hand, looking at your nice new plot, it seems to indicate
>> that the RDBMS you're measuring does behave like an unbounded queue, b/c it
>> goes asymptotic at about 2500 RPS. If I assume that indication (fitted
>> curve) is correct and apply Little's law at saturation, I calculate your
>> mean service time to be about 0.0004 s. And that, indeed, seems to jibe
>> with your y-intercept. Of course, we could've seen this immediately if
>> you'd plotted against the utilization rather than the request rate on the
>> x-axis. (cf. Fig. 4.17 on p. 121 of PPQ2)

> New plot attached

Me too. See attached PNG (somewhere. I'm really learning to hate GGs).

BTW, have you considered getting on the R train?

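# Data file is expected to have columns Arate (arrival rate) and Rtime (response time)
df <- read.csv("~/Desktop/wouterdb/wdata")
m <- 2   # number of servers, held fixed
# Nonlinear least-squares fit of S and b in the parametric M/M/m model
mmm.fit <- nls(Rtime ~ S/(1-(b*Arate*S/m)^m), data=df, start=list(S=1e-4, b=1.0))
summary(mmm.fit)
plot(df, xlab="Arrival rate (RPS)", ylab="Response time (s)")
lines(df$Arate, predict(mmm.fit), col="blue")
title(main=paste0("Parametric M/M/", m, " Model of RDBMS Data"))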

Done.

<https://lh3.googleusercontent.com/-apAsG6BCIGw/T-tD7vDNOHI/AAAAAAAABI...>

>> But you say, "an M/M/m queue doesn't fit the data all too well" which I
>> now find puzzling. Your new plot has pretty much convinced me of the thing
>> you were trying to unconvince me of.

> The plot fits quite well, but only because the request rate has been
> multiplied by 2.3. I know it works, but I don't see its real world meaning.
> I think a multiplier smaller than one would mean there is a delay station
> somewhere inside the server, which adds to the response time, but not
> to utilization.
> But what does a multiplier larger than one mean?

Dunno. I have to think more about this, now that I can reproduce your plot.

...

From: Wouter De Borger <w.debor...@gmail.com>
Date: Thu, 28 Jun 2012 09:54:56 +0200
Local: Thurs, Jun 28 2012 3:54 am
Subject: Re: calculate "artificial" servers from a set of R and lambda?

I've never considered R before, but I'll look into it ;-)

...

From: DrQ <redr...@yahoo.com>
Date: Thu, 28 Jun 2012 17:20:21 -0700 (PDT)
Local: Thurs, Jun 28 2012 8:20 pm
Subject: Re: calculate "artificial" servers from a set of R and lambda?

A. You claim to have measured MySQL query response times.
==========================================================

* What measurement tools were used to obtain your data?

* What is the accuracy of those measurements?

* Most of your data has up to 16 or 20 significant digits

> max(nchar(as.character(df$Arate)))
[1] 16
> max(nchar(df$Rtime))
[1] 20

That puts you in the same league as the best quantum measurements. :)

B. Taking the measurements as read, we can determine the following
==========================================================

* S=0.00039 estimated.

We see that the data asymptote near max(Arate).
By Little's law:

> 1/max(df$Arate)
[1] 0.0003914660

If we round down and use S=0.00039 to compute the theoretical M/M/1 and M/M/2
residence times, we get the uppermost dashed curve and the lowest dashed curve,
respectively. Your data lies within that envelope. See plot.

<https://lh4.googleusercontent.com/-dP3Bda_PNjQ/T-zyPpndZII/AAAAAAAABI...>

The M/M/2 curve is much flatter b/c there is twice as much capacity
to handle DB queries, so it takes a higher request rate for a waiting
line/queue to form. This is the motivation for you introducing the fitting
parameter-b.

The results of your fit in python for m=2 are:
S=0.000336
b=2.34

The blue solid curve is your M/M/m model S/(1-(b*Arate*S/m)^m) with these
parameter values.

My regression fit in R produces:

> mmm.S
0.0004673506
> mmm.b
1.597472

shown as the red solid curve. Examining the residuals, etc., would tell us
which is the better fit, but that's not important here.

The red dotted curve is your parametric model with my b=1.597472 from
regression analysis and my estimated S=0.00039 value.
That curve is closer to M/M/2 theoretical.

The red dashed curve (near the blue curve) corresponds to your parametric
model but with the parameters: S=0.00039 and b=2. As you can see, they are
very close.
NOTE: If I use your value of b=2.34 instead of 2, the model blows up.
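
One plausible reading of that blow-up, sketched in Python: with S=0.00039 and b=2.34, the utilization term b*λ*S/m exceeds 1 before the largest measured arrival rate, so the denominator changes sign. The numbers are the ones quoted in this post; max(Arate) is recovered from the printed 1/max(df$Arate), and the function name is hypothetical.

```python
def utilization_term(lam, S, b, m):
    """The term b*lam*S/m that is raised to the m-th power in the model."""
    return b * lam * S / m

S = 0.00039                      # rounded service-time estimate (s)
m = 2
lam_max = 1 / 0.0003914660       # max(Arate), from 1/max(df$Arate) above

for b in (2.0, 2.34):
    u = utilization_term(lam_max, S, b, m)
    pole = m / (b * S)           # arrival rate where the denominator hits zero
    status = "ok" if u < 1 else "past the pole"
    print(b, round(u, 3), round(pole, 1), status)
```

With b=2 the pole sits just above the largest measured rate, so the curve stays finite over the data; with b=2.34 the pole falls inside the measured range.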

C. Interpretation of your parametric M/M/m model
==========================================================

This suggests to me that the role of your b-parameter is to counterbalance
the value of the m-servers in the denominator of the utilization term in the
M/M/m model.
What does this interpretation mean?

Looking at this diagram for an M/M/m queue http://is.gd/3Q1R7w if
we have m=2 servers, each server cannot exceed 100% busy in the M/M/m
residence time formula. With a request rate of 2*lambda coming into the
queue, each server gets half of that traffic, viz., 2*lambda/2 per server.
But this expression has the same form as b*lambda/m in your parametric
model.

So, the effect of the b-parameter is to create a *fractional server* model.
Instead of there being m=2 servers, it's acting as though there are 2/1.597
≈ 1.25 (non-integral) servers, or whatever, and hence the parametric curve
lies somewhere b/w the m=1 and m=2 theoretical curves.
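
A short numerical check of the fractional-server reading, using S=0.00039 and b=1.597472 from this post. It is only a sketch: the residence-time form S/(1-(b*λ*S/m)^m) is the parametric model under discussion, and the function names and sample rates are made up.

```python
def parametric_R(lam, S, b, m):
    """Parametric model: S / (1 - (b*lam*S/m)**m)."""
    u = b * lam * S / m
    return S / (1.0 - u ** m)

def mmm_R(lam, S, m):
    """Same form without the scaling parameter (b = 1)."""
    return parametric_R(lam, S, 1.0, m)

S, m, b = 0.00039, 2, 1.597472
effective_servers = m / b
print(round(effective_servers, 3))

# The parametric curve should sit between the m=2 and m=1 curves.
for lam in (500, 1000, 2000):
    lower = mmm_R(lam, S, 2)
    middle = parametric_R(lam, S, b, m)
    upper = mmm_R(lam, S, 1)
    print(lam, lower < middle < upper)
```

The sample rates are all below the m=1 saturation point of 1/S, which is the only region where all three curves are defined.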

The fly in the ointment, however, is that the *exponent* is still integral.
Why?

...
