Average session length and workload modelling


Alex Bagehot

Jun 27, 2014, 6:49:07 PM
to guerrilla-cap...@googlegroups.com
Hello,

I have put some thoughts down on paper around the implications of average session length and would be interested if anyone had review comments. These ideas are not new but they may not be widely accepted.


I only touch on it at the end, but I feel there is a bit of an elephant in the room around (not) workload modelling open/partly-open systems in performance tests. Perhaps this is because there is not a lot of literature on it. Where it is mentioned it is typically very brief, so people may not be aware of it. Contrast this with an endless body of documentation on concurrent users.
I mentioned it on a couple of other forums but didn't get much constructive discussion so far - I am hoping there may be some people interested in this here.

Thanks,
Alex

James Newsom

Jun 28, 2014, 1:51:52 AM
to guerrilla-cap...@googlegroups.com
Something that I have observed first hand, looking not at average page views per customer but rather at average page views per order for a large e-commerce system, is that the ratio of page views per order drops dramatically at peak. In the middle of the year customers would average about 1,500 page views per order, but during Black Friday/Cyber Monday the ratio would drop to 286 page views per order.

That was something I had to take into consideration when building my PyDQ models for planning peak capacity and performance. Once upon a time I had a decent scatter plot showing the relationship between page views per order and increasing order volume, comparing light days with the heavier days of decent middle-of-the-year sales, which I used to prepare for the peak sales days of Black Friday/Cyber Monday.
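
A minimal sketch of the kind of conversion that feeds such a model (the per-order ratios are the ones above; the page-view rates are made-up illustrative figures, and the PyDQ model itself is not shown):

  # Turn an overall page-view rate plus a pages-per-order ratio into an order
  # rate, so both workload streams can be sized in a capacity model.
  def order_rate(page_views_per_sec, page_views_per_order):
      return page_views_per_sec / page_views_per_order

  mid_year = order_rate(500.0, 1500.0)   # ~0.33 orders/sec
  peak_day = order_rate(5000.0, 286.0)   # ~17.5 orders/sec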

James


Alex Bagehot

Jun 28, 2014, 10:46:05 AM
to guerrilla-cap...@googlegroups.com

Very good point - as users become more focused on buying presents the workload changes significantly. The long term averages I was looking at (mainly as they are the only ones I can find across sites) will not show that detail. [Averages aren't great either as a stat, but again it's what is available.]

Out of interest, Alexa does provide some insight into whether a site might be affected by significant changes in workload:

Staples clearly has a spiky/bursty workload (possibly Cyber Monday):

http://www.alexa.com/siteinfo/staples.com

The Guardian newspaper does not (it changed domains last year):

http://www.alexa.com/siteinfo/www.theguardian.com


Thanks!

Alex

Greg Hunt

Jun 28, 2014, 11:05:02 PM
to guerrilla-cap...@googlegroups.com
James
Isn't that because it's actually a different workload?  During major sales periods people are at the site for different reasons than in non-sale periods. For a general retailer it's the difference between "I think I want a handbag" and "going to buy the kids' Christmas presents NOW". So lots of things change: the different page-views-per-order number is the product of both a different session length and a different conversion rate (a different propensity to buy), but there are more behavioural differences. In a major sale the conversion rate rises, average session length may go up or down (there are a number of different groups of users and buying behaviours), the number of unique visitors rises, average transaction value changes, and search patterns are quite different. For example, Lego might displace women's fashion as the top search term, which can do funny things to caching, if you have that problem, and the number of search result pages that users look at changes (the shape of the search-result tail changes).

Greg

harry van der horst

Jul 2, 2014, 4:28:35 PM
to guerrilla-cap...@googlegroups.com
Alex, I have some experience with ebanking systems.
There we observe that at peak hours both the transfers and the enquiries go up a lot, but the enquiries more than the transfers.
However, we also observe that any delays at peak hours cause major aggravation for the customers.
So the commercial cost of failure at peak hours is a lot worse than at normal hours.
In other words, if you put a price on a message, at peak time the value must be higher.
Have you ever considered pricing your pages?
It is not easy to get good data for that, but it gives you excellent access to senior management and the business boys and girls.
And after the first time, the exercise gets easier.
harry







--
met hartelijke groeten/kind regards
harry van der Horst
M 0031643016999

James Newsom

Jul 3, 2014, 1:03:35 AM
to guerrilla-cap...@googlegroups.com
Yes, the workload does change with the difference in the ratios of pages hit (e.g., 1,200 product pages per order in the middle of the year versus 250 product pages per order at peak), but they are still hitting the same pages (e.g., product pages, checkout pages, et cetera), albeit not at the same rate in the middle of the year versus peak season. It's something that has to be accounted for in e-commerce systems at least. I don't know if banking systems experience the same shifts during heavy traffic like e-commerce sites do.

One of the things I would do was perform production stress tests where I would isolate live customer traffic to a small set of servers to drive the utilization of those boxes up and collect metrics to determine the service demand of the hits to the system while we were having a decent sale.
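
The arithmetic behind that kind of isolated-server measurement is just the utilization law, D = U/X; a minimal sketch with made-up numbers (none of these figures come from the system described above):

  # Service demand per hit over a measurement interval on one isolated box:
  # busy fraction divided by completed hits per second.
  def service_demand(cpu_utilization, hits_per_sec):
      return cpu_utilization / hits_per_sec

  D = service_demand(0.60, 120.0)   # 0.005 sec of CPU per hit (hypothetical values)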

James

James Newsom

Jul 3, 2014, 1:06:04 AM
to guerrilla-cap...@googlegroups.com
When I was doing capacity planning for a large e-commerce site we would always build the system up to handle the traffic of the peak hour of the peak day for Black Friday and Cyber Monday. It meant that we had a lot of extra hardware during the middle of the year, but it was worth it for those few oh-so-important days. One of our goals, besides sales, was to make sure that we stayed off the front page of the Wall Street Journal for having a site outage during the most important week of the year for our business.

James

DrQ

Jul 4, 2014, 2:59:31 PM
to guerrilla-cap...@googlegroups.com
Alex, I had a lot of trouble following the points in your blog post and ultimately what was the key idea. That seems to be consistent with the lack of response on other fora that you noted. Too many words without any formalization.

At one point, you do ask about how users can be modeled, but I didn't see such a model or how it might best be constructed for a test rig. Inevitably, I can't help but look at such descriptions from a more formal queue-theoretic standpoint. Presumably you also have something like that in mind too, because you do use some QT terms, like open and closed (although without definition). But then I see strange terms like "arrival throughput". The QT expression for incoming requests is arrival rate. On the output side you could use the term departure rate to differentiate. If the arrival rate is non-zero but the departure rate is effectively zero, then the system throughput will also be zero, with queues building internally in the SUT.

Another strange phrase that threw me: "we expect 10 new independent users to arrive at the site every second."  In QT we normally think of users as submitting requests into the SUT. The (finite number of) users are represented by the load-test scripts. So, it's the requests that arrive, not the users. You may have a different definition, but I'd have to see formally what that was.

I also had a lot of trouble getting an unambiguous handle on the difference between pages, page views, sessions, etc. I gave up. If you attended one of my classes, I would have you up in front of the whiteboard (in private, if preferred) attempting to define clearly, explicitly and symbolically, each of your terms, e.g., the system throughput starts out as X = 1000 pages/second or whatever.

Skipping ahead to your plot, you state "the arrival throughput is related to the average session length by a power function..." I don't see a power-law (for which you would require some serious justification). I see a simple inverse relationship of some kind.  If I assume you actually mean the system throughput, not the arrival rate (in the title), and the page views are associated with some composite service time S, then Little's law tells us that the system throughput X is related to S by 

X = ρ / S

where ρ is the server utilization. For a constant ρ, X and S are inversely related. If, as you say "At the time of the outage the users already interacting with the site will block completely" then,  ρ = 0 and therefore X = 0: independent of S. No news there, but the system is no longer in steady state. Once all the possible requests are submitted, the internal queues will grow and the system will come to a screeching halt. I guess that's the definition of an outage. But, at that point we're talking about a functional issue, not a performance issue.
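
Purely as a numerical illustration of that inverse relationship (the values below are assumed, not taken from the blog post):

  # X = rho / S: at a fixed utilization, halving the composite service time
  # doubles the throughput that the same utilization can support.
  rho = 0.7
  for S in (0.1, 0.2, 0.4, 0.8):   # composite service time, seconds
      print(S, rho / S)            # 7.0, 3.5, 1.75, 0.875 pages/sec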

Even if we finally understood what you are actually trying to model, we would still need to determine how that should be mapped onto the test rig configuration. But at least we would have a well understood and consistent QT framework to refer to.

Alex Bagehot

Jul 6, 2014, 11:36:43 AM
to guerrilla-cap...@googlegroups.com
Thanks for your time reviewing, and for the comments.
Apologies for the lack of formality. Hopefully we can start to remove the words and become more formal.
The question/idea I have is:


What model type should most public websites be modelled as: open, partly-open [ http://users.cms.caltech.edu/~adamw/papers/openvsclosed.pdf ] or closed [ section 3.2, Types of Circuits, Perl::PDQ ]?
I believe open or partly open, from my own experience and from the literature, including your Perl::PDQ book and blog posts.

This seems to be at odds with what I see: 99% of questions on various fora are along the lines of "how should I test my site with X concurrent users?". Fixed numbers of users are associated with closed models. In addition, some tools may have an incremental cost per simulated user, provoking the driver to minimise the number of those users, which could cause test problems.

Interestingly, even if the question is "how should I test my site with X requests per second/minute/hour/day/etc.?", the answer that invariably comes back is "first you need to calculate the number of users... by this method...". And so a problem stated in open-model arrival-rate terms is immediately translated into a closed-type model. I can only assume this is because of tool limitations: an open model is not supported directly, so it has to be translated into a fixed number of users, where the arrival rate (up to a point) is also fixed via pacing/variable think time.

** The number of concurrent users is still important. The question is whether it should be an input to the model or validated as an output.

My interest in this comes from developing volume test tools and a general curiosity to resolve the current situation of extreme focus on fixed concurrent users, which does not seem to be correct. I don't think I have anything to add to QT itself. It's more about how volume test tools can best be implemented to support accurate tests, consistent with QT where possible. I currently contribute to a load simulator called Gatling.


I have put some comments inline below also.

Thanks,
Alex

On Fri, Jul 4, 2014 at 7:59 PM, 'DrQ' via Guerrilla Capacity Planning <guerrilla-cap...@googlegroups.com> wrote:
Alex, I had a lot of trouble following the points in your blog post and ultimately what was the key idea. That seems to be consistent with the lack of response on other fora that you noted. Too many words without any formalization.

At one point, you do ask about how users can be modeled, but I didn't see such a model or how it might best be constructed for a test rig. Inevitably, I can't help but look at such descriptions from a more formal queue-theoretic standpoint. Presumably you also have something like that in mind too, because you do use some QT terms, like open and closed (although without definition).

I am interested in public websites (rather than admin type sites like call centers) and what model type they should be as this would likely influence how a volume test rig is set up.

... as part of reviewing your posts in this area I have seen this one:

It seems to be very close to what I am interested in. The message is that we should model internet traffic / public websites with the open type.


In the above posts you say "The sticking point is that conventional load-test simulators like LoadRunner, JMeter, and httperf, represent the load in terms of a finite number of virtual user (or vuser) scripts, whereas the Internet has an indeterminately large number of real users creating load"

It is now possible to write scripts in new load simulation tools (especially, for example, where the session length is short) in which an indeterminately large number of users is created over time; they are applied to the SUT at a specified mean arrival rate, and over time large numbers can pass through the system. I think this assumption, that we can only load test with finite users, is now invalidated for certain tools. There will be some limits - and given that few people script in this manner, there are likely some bugs and tuning to apply to the tools to allow for scaling up to ever larger numbers. Iago is one such tool; see the tool writer's thoughts in this area from last year:
https://www.youtube.com/watch?v=99RABfKNfcY#t=935
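
To make that scripting style concrete, here is a minimal Python sketch of the idea (this is not the Iago or Gatling API; the page names and timings are invented): arrivals are driven by an exponential inter-arrival clock, and each arrival starts an independent session that runs to completion and then departs.

  import asyncio, random

  async def session(uid):
      # One user's short, ordered sequence of requests (placeholders only -
      # no real HTTP is issued here).
      for page in ("home", "browse", "product", "basket"):
          await asyncio.sleep(0.05)          # stand-in for request + response
      # The user departs and does not re-enter the system.

  async def open_injector(arrival_rate, duration):
      # New, independent users arrive as a Poisson process: exponentially
      # distributed gaps, and each session is started without waiting for
      # earlier ones to finish.
      loop = asyncio.get_running_loop()
      end, uid, running = loop.time() + duration, 0, []
      while loop.time() < end:
          await asyncio.sleep(random.expovariate(arrival_rate))
          uid += 1
          running.append(asyncio.create_task(session(uid)))
      await asyncio.gather(*running)         # let in-flight sessions drain

  asyncio.run(open_injector(arrival_rate=10.0, duration=30.0))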


The challenge with purely open models, where each customer only makes one request to the system, is that no real sites (or very few) seem to fit it (at least, among the top 50 websites I reviewed from Alexa in my blog post, none did). My understanding of the QT definition of open is that each user only makes one request, or in other words each request is completely independent. However, I suspect that some of the sites with a low average number of pages per session may have significant parts of the workload where only one page is requested in a session.

Furthermore, can we use "HTTP request" and "page view" interchangeably in QT models?
Current internet statistics [ http://httparchive.org/trends.php ] suggest that the average number of HTTP requests, taking account of caching, is around 45 per page view (or per "interaction" [ http://www.w3.org/TR/html5/browsers.html#browsing-the-web ] if no "page" is loaded, e.g. for single-page applications).
So in the context of website simulation, when we say "request" in QT, is that the request the user makes (e.g. a click in a browser), or each of the 45 HTTP requests that the browser makes for each page/user click?
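
Whichever term the model uses, the two rates differ by roughly that factor; a trivial sketch of the conversion (the page-view rate here is an assumed figure):

  page_views_per_sec = 20.0            # assumed workload figure
  http_requests_per_page_view = 45     # httparchive.org average cited above
  http_requests_per_sec = page_views_per_sec * http_requests_per_page_view   # 900 req/s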


You set out three types in section 3.2, "Types of Circuits", in Perl::PDQ.
This paper [ http://users.cms.caltech.edu/~adamw/papers/openvsclosed.pdf ] however describes a different type, partly-open, that seems to me to be closer to how users interact with real sites and how volume tests are scripted.
Is there some reconciliation of that with QT?


 
But then I see strange terms like "arrival throughput". The QT expression for incoming requests is arrival rate.
Agreed - my mistake.
 
On the output side you could use the term departure rate to differentiate. If the arrival rate is non-zero but the departure rate is effectively zero, then the system throughput will also be zero: with queues building internally in the SUT.

Another strange phrase that threw me: "we expect 10 new independent users to arrive at the site every second."  In QT we normally think of users as submitting requests into the SUT. The (finite number of) users are represented by the load-test scripts.

As mentioned above regarding finite users, that is an assumption that doesn't always hold - tools like Iago can now generate an arbitrary arrival rate of customers in an open-model fashion.

 
So, it's the requests that arrive, not the users. You may have a different definition, but I'd have to see formally what that was.

I am probably not being very formal - but in an open model users/customers arrive and depart as mentioned in Perl::PDQ section 2.4.2 "arrival rate".
There are various figures showing this eg. 2.14, 2.15.
I understand though that queues and servers only deal with requests.
 

I also had a lot of trouble getting an unambiguous handle on the difference between pages, page views, sessions, etc. I gave up. If you attended one of my classes, I would have you up in front of the whiteboard (in private, if preferred) attempting to define clearly, explicitly and symbolically, each of your terms, e.g., the system throughput starts out as X = 1000 pages/second or whatever.

Have you ever considered doing a class in the UK? I am sure there would be demand for it.
I have provided some more information above though on the detail around page/request etc.

A session is fairly simple - one ordered sequence of requests by a user.
For example, the customer arrives, then makes several ordered requests in a closed loop; once that is complete, the customer departs.
Those requests could be, for example, home page, then browse page, then product page, then add to basket, then order confirmation.
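
As a sketch of how that looks when scripted (the names are invented and no real tool API is implied):

  # One session: an ordered, closed sequence of requests made by a single
  # arriving user, who then departs.
  PAGES = ["home", "browse", "product", "add_to_basket", "order_confirmation"]

  def run_session(send_request, think):
      for page in PAGES:
          send_request(page)   # blocks until the response arrives (closed within the session)
          think()              # think time before the next request
      # no loop back to the start: the user leaves the system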
 

Skipping ahead to your plot, you state "the arrival throughput is related to the average session length by a power function..." I don't see a power-law (for which you would require some serious justification). I see a simple inverse relationship of some kind.

Yes my mistake.
 
 If I assume you actually mean the system throughput, not the arrival rate (in the title), and the page views are associated with some composite service time S, then Little's law tells us that the system throughput X is related to S by 

X = ρ / S

where ρ is the server utilization. For a constant ρ, X and S are inversely related. If, as you say "At the time of the outage the users already interacting with the site will block completely" then,  ρ = 0 and therefore X = 0: independent of S. No news there, but the system is no longer in steady state. Once all the possible requests are submitted, the internal queues will grow and the system will come to a screeching halt. I guess that's the definition of an outage. But, at that point we're talking about a functional issue, not a performance issue.

Even if we finally understood what you are actually trying to model, we would still need to determine how that should be mapped onto the test rig configuration. But at least we would have a well understood and consistent QT framework to refer to.

I hope I have made it clearer with some more detail.
 


DrQ

Jul 6, 2014, 1:22:38 PM
to guerrilla-cap...@googlegroups.com
Thanks for expanding on your original GG post.

>> should most public websites be modelled as: open, partly-open or closed

A couple of points to consider:
  1. The terms open/closed refer to analytic queueing models. 
  2. A load-test/perf-test rig is a workload simulation model
  3. There are many serious questions about how (1) gets implemented in (2). 
  4. By default all test rigs are closed models in that there can only be a finite number of workload generators. Any deviations from this operation need to be clearly identified and explained.
  5. Item (3) above should be considered in steady state without muddying the waters by introducing non-equilibrium conditions like in-flight outages, etc. Don't let a perfectly good model be destroyed by so-called reality.
The idea is to start your blog post, for example, with a familiar system by clearly defining the associated conditions and assumptions and then building up to the deviant system you actually want to consider, indicating where various assumptions need to be relaxed or modified.

In that vein, take a look at the Guerrilla Class supplementary materials: in particular, the 2012 paper by Jim Brady, and compare and contrast it with the 2006 CMU paper. Any additional commentary on these papers is also welcomed here.

I'll come back to your other comments as I find time.

--njg



Alex Bagehot

Jul 8, 2014, 12:13:29 PM
to guerrilla-cap...@googlegroups.com
Thanks for the feedback. I read the Brady 2012 paper.

Certainly, reporting in terms of RPS is what I have been doing to date. I learned this lesson many years back: when you work with production environments, you find that they mostly measure rates.
It doesn't reference the open-vs-closed paper, though, I would guess because its focus is on how to make the best of JMeter. My current focus is not to limit which tool is used, which would imply comparing/contrasting real load generators as a reference point. Also, it notes the difference between real and simulated users but does not identify what kind of loop the real user population forms (or whether it is a loop at all).

It identifies that "Traffic can be offered to the Figure 2 [real user] target system in an unbridled way that will cause it to overload"
and then states "Mimicking the independent behavior of real users can be difficult to accomplish in the limited resource closed loop environment of Figure 1"

So that is one of the main points of this thread: more recent tools employ new techniques such as non-blocking asynchronous I/O, lock-free algorithms, message passing rather than a thread per user, and zero copy, to allow for a massively more scalable/efficient load generator. This should give us enough resource to simulate, at high rates, a new unique user arriving and, once done with its closed request loop, departing never to return to the system.

User independence is discussed. It doesn't cover how independent users can offer load to the system in an unbridled or uncoordinated way. I agree that for a test to pass we need to achieve steady state. We also need to replicate or predict how systems will fail, and the lack of unbridled offered load is a risk there. Again, a difference in focus which will be useful to cover off at this time.


I think I will need to write my own paper at the end of this discussion to help others avoid the questions that are still outstanding.

Thanks,
Alex


DrQ

Jul 8, 2014, 1:00:09 PM
to guerrilla-cap...@googlegroups.com
One quick comment that may help to further clarify what Jim Brady is trying to say in his paper.

A somewhat latent assumption in the definition of a closed queueing network is that each of the N users/generators/scripts cannot issue more than one outstanding request. Jim does not state this assumption. This one-2-one correspondence accounts for the total number of possible requests being finite and equal to N. At the risk of mixing metaphors a little, these requests are either enqueued (in the SUT) or waiting for a mean service time Z while the user "thinks" about what to do next. Moreover, the constraint creates a negative feedback or self-throttling queueing system. So technically, the arrival pattern is correlated and therefore not truly exp distributed.

For an open queueing network this constraint is relaxed such that the total number of requests in the system can be unbounded (but not infinite). Since there is no feedback, the arrivals are exp distributed. These assumptions are more clearly stated in the CMU-Usenix paper.
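
A toy simulation (mine, not from either paper) of the feedback effect: with a fixed population the offered rate throttles itself as the SUT slows down, whereas an open source would keep arriving at the same rate and simply build queue inside the SUT.

  import random

  def closed_offered_rate(N, Z, S, horizon=200.0):
      # N users; each thinks ~exp(Z), submits one request, waits for the
      # response from a single FIFO server with service ~exp(S), then thinks again.
      next_submit = [random.expovariate(1.0 / Z) for _ in range(N)]
      server_free, completions, t = 0.0, 0, 0.0
      while t < horizon:
          i = min(range(N), key=lambda k: next_submit[k])
          t = next_submit[i]                                # this user's request arrives
          server_free = max(t, server_free) + random.expovariate(1.0 / S)
          next_submit[i] = server_free + random.expovariate(1.0 / Z)
          completions += 1
      return completions / horizon

  for S in (0.001, 0.01, 0.1, 1.0):      # the SUT getting slower and slower
      print(S, round(closed_offered_rate(N=100, Z=1.0, S=S), 1))
  # The offered rate falls from ~100/s towards ~10/s and then ~1/s as the
  # server saturates; an open arrival source would not slow down at all.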

What Jim describes in the first part of his paper is how he uses the closed environment (the default for load-test rigs like LR and JM) to mimic an open environment. He does this in 2 steps: 
  1. Unhooking the no-more-than-one-outstanding-request constraint, i.e., removing feedback.
  2. Using the former think time (Z) as an exponential interval generator to create a Poisson source.
He then checks that the resulting arrival pattern into the SUT has CoV = 1.

Later, he shows that even though he has a finite number of generators (~300 scripts/threads), the effective number of open-sourced (if I can use that term) requests in the system can be adjusted by changing the Z value (no longer a think time). That's why I find Figure 2 confusing, because he's smooshed these two concepts together when they are normally kept quite distinct.

Overall, his approach appears to require less hackery than what is described in the CMU-Usenix paper and should be applicable to any conventional load-test harness. I think that's his main point.



Alex Bagehot

Jul 9, 2014, 10:55:41 AM
to guerrilla-cap...@googlegroups.com
Thanks, question inline below.


On Tue, Jul 8, 2014 at 6:00 PM, 'DrQ' via Guerrilla Capacity Planning <guerrilla-cap...@googlegroups.com> wrote:
One quick comment that may help to further clarify what Jim Brady is trying to say in his paper.

A somewhat latent assumption in the definition of a closed queueing network is that each of the N users/generators/scripts cannot issue more than one outstanding request. Jim does not state this assumption. This one-2-one correspondence accounts for the total number of possible requests being finite and equal to N. At the risk of mixing metaphors a little, these requests are either enqueued (in the SUT) or waiting for a mean service time Z while the user "thinks" about what to do next. Moreover, the constraint creates a negative feedback or self-throttling queueing system. So technically, the arrival pattern is correlated and therefore not truly exp distributed.

For an open queueing network this constraint is relaxed such that the total number of requests in the system can be unbounded (but not infinite). Since there is no feedback, the arrivals are exp distributed. These assumptions are more clearly stated in the CMU-Usenix paper.

I get everything to this point.
 

What Jim describes in the first part of his paper is how he uses the closed environment (the default for load-test rigs like LR and JM) to mimic an open environment. He does this in 2 steps: 
  1. Unhooking the no more than one outstanding request constraint, i.e., removes feedback.
I can't see this in the paper having gone through it a few times - how does it achieve a user/generator/thread having more than 1 inflight request?

DrQ

Jul 9, 2014, 1:10:07 PM
to guerrilla-cap...@googlegroups.com
I agree that the open generation mechanism is not very clearly presented (too many words and trying to say too much at once). And Fig. 2 doesn't help either, IMHO. What I believe he means is the following. 

In the open system case, I would drop all reference to "users" and simply talk about request generators that can issue Gets and Posts. Each generator is a script that runs on a thread (one of ~300 in his case). A generator is capable of issuing more than one request without idling until the corresponding response is received, e.g., from a Get. Eventually, the requested data (e.g., HTML pages) will return to that generator. During that period, however, that same generator can issue other requests. That's prohibited in the definition of a closed system. The delay between issuing multiple requests is exponentially distributed by virtue of the Z setting. The shorter the Z value, the greater the number of total requests that will have been generated and issued into the system. That number can be in the thousands, according to Figure/Table 12, even though there are only ~300 actual generators.

Jim shows how he calculates that number of requests in the text under Fig. 12. In my notation it would look like this:

N_gen = X * (R + Z) = 67.64 * (4.78 + 10) = 323.3192 + 676.4 = 999.7192 (rounded to 1000)

This, of course, is just Little's law in yet another guise. Since the system is in steady state,  the mean system throughput X = λ (the arrival rate from the load generator side). He gets the X and R values from his JM measurements. This is my interpretation so, caveat lector. :)
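
In code form, the same back-of-the-envelope check (the numbers are the ones quoted above):

  X, R, Z = 67.64, 4.78, 10.0   # measured throughput, response time, and the delay setting
  N_gen = X * (R + Z)           # ~999.7 requests in flight or pending, from only ~300 generators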



Alex Bagehot

Jul 9, 2014, 10:44:44 PM
to guerrilla-cap...@googlegroups.com
OK, thanks. As far as I am aware, JMeter cannot issue async HTTP requests as you describe (there is no HTTP sampler that does this).
I had a look to see whether I could find anything, also with reference to the JMeter test plan screenshot.

I found this:
followed by

But it's not the same. The only async there is decoupling two sync HTTP requests from each other.

If I wrote an equivalently structured script in Gatling, the same limitation would apply: the (fixed number of looping) generators (users) would be blocked on each HTTP request, causing feedback throttling of the arrival rate. As there is a finite number of generators in this example, for Gatling at least, the model couldn't be open.


If Jim is available to comment on his paper, that would, I think, be beneficial to confirm this.


Where I am currently in terms of understanding is this:

There are benefits to avoiding modelling and executing load tests as closed models. 

Load test source data is typically in throughput/arrival rates (from production logs, business forecasts, etc.) and the output is best reported by charts (scatter plots) of <metric> vs. throughput rather than concurrent users. Therefore it follows that, once you have eliminated the practical tool limitation of a thread-per-generator/user implementation, there is no need for concurrent users at all from start to finish, except for validation where needed (that would be the essence of it at least). This would eliminate translation calculations to and from arrival rate and concurrency, and reduce risk.

If that were correct, there are some loose ends:

1) From what I understand you have said, there are only open or closed systems in QT (the mixed model in Perl::PDQ has both open and closed, but they do not interact). You have identified issues with the partly-open model in the CMU paper which mean that you don't recommend it being used. So how can we ground this in QT or another mathematical model?
2) There aren't any papers that I am aware of that work through a realistic load test example with the above details present. A worked example may help.
3) The level of detail of the simulation. It has been argued that, as the TCP packets in an HTTP request are synchronous, this makes the whole simulation synchronous and therefore closed. I am not convinced, but I am interested to check that we can ignore that level of detail (else we could get bogged down in lower levels, e.g. whether each CPU instruction causes a closed loop because it is synchronous) and allow for a non-closed(!) model without causing test issues.


thanks,
Alex




DrQ

Jul 10, 2014, 12:13:21 PM
to guerrilla-cap...@googlegroups.com
> If Jim is available to comment on his paper that would be beneficial I think to confirm.

Good idea. I sent him an invite. Let's see if he bites and get this part sorted out before going much further.

DrQ

Jul 11, 2014, 11:34:00 AM
to guerrilla-cap...@googlegroups.com
Jim is now a member. Welcome, Jim!

It may take some time to digest this thread. Here's my synopsis.

Alex is a performance engineer in the UK who has also read the CMU-Usenix paper. He and I have been reviewing your CMG paper. I've been trying to explain your paper in terms of basic queueing theory, as I understand it. We seem to have reached a point where Alex doesn't believe JMeter can be converted to mimic an open system. Your demonstrating that the measured arrival traffic has CoV = 1 apparently isn't convincing him. Hopefully you can provide more detail about what you did in JM. I have some questions about both papers, but those are of a more subtle queueing theory nature, so I haven't brought them up here yet.

Given the vast number of technical dimensions in this topic, I'm not sure this forum is the best place to sort it all out, but let's see how it goes.


Stephen O'Connell

Jul 11, 2014, 11:58:21 AM
to guerrilla-cap...@googlegroups.com
I have been following this thread with great interest.  If it moves off this forum, please keep me copied on the follow-on discussion.

Thanks,
Stephen...




DrQ

Jul 11, 2014, 12:06:57 PM
to guerrilla-cap...@googlegroups.com, s...@saoconnell.com
Good to see, Stephen. I thought we'd lost you intellectually to that ML monkey business.  ;)

Alex Bagehot

Jul 15, 2014, 1:53:27 PM
to guerrilla-cap...@googlegroups.com
Thanks. I can also ask on the JMeter users list if needed.



James Brady

Jul 15, 2014, 7:59:13 PM
to guerrilla-cap...@googlegroups.com

Hello - this is Jim, ready to answer questions regarding my CMG 2012 paper titled "When Load Testing Large User Population Web Applications the Devil Is In the (Virtual) User Details".

As the title suggests, my perception is that too much emphasis in most load testing efforts is on mimicking the behavior of individual users and not enough focus is placed on the quality of the workload these "Virtual" users offer to the target environment. My objective is to be reasonably certain that the request mix, rate, and timing created by the load tool and its crowded computing environment is consistent with that of a large population of application users in steady state, making requests on separate computers with no coercion between them when pressing the enter key.

Large populations that reach steady state and operate in this manner conform to a Poisson process, where the times between requests are Negative-Exponentially distributed and the numbers of requests in constant-length intervals are Poisson distributed. I determine whether my request timing is consistent with this environment by checking the statistical equality of the inter-request time mean and standard deviation, since such equality holds for the Negative-Exponential distribution.
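
A minimal sketch of that check applied to measured inter-request times (the pass/fail tolerance to use is a judgment call, not something specified in the paper):

  import statistics

  def cov(interarrival_times):
      # For a Negative-Exponential distribution the mean equals the standard
      # deviation, so this ratio should be close to 1 for a Poisson stream.
      return statistics.stdev(interarrival_times) / statistics.mean(interarrival_times)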

I realize that I am using a closed model tool (JMeter) to represent what is largely an open model environment. I attempt to mitigate the negative effects of this circumstance by implementing a large number of virtual user threads while maintaining a high mean think time / mean response time ratio. The goal is to minimize the request pattern distortions associated with a virtual user thread in queue or service being unavailable as a requestor. As a practical matter, the impact of an individual traffic source’s state on the distribution of times between arrivals becomes blurred as the number of sources increase toward the open source environment.

The JMeter HTTP/HTTPS Sampler is described in the JMeter User Documentation:

http://jmeter.apache.org/index.html

In the end, users, virtual or real, are traffic sources, not traffic, and traffic is what drives resource consumption on target systems.  That is why I put so much emphasis on the mix, rate, and timing of requests not on the traffic sources making the requests.


davecb

Jul 16, 2014, 7:59:36 AM
to guerrilla-cap...@googlegroups.com


On Tuesday, July 15, 2014 7:59:13 PM UTC-4, James Brady wrote:

I realize that I am using a closed model tool (JMeter) to represent what is largely an open model environment. I attempt to mitigate the negative effects of this circumstance by implementing a large number of virtual user threads while maintaining a high mean think time / mean response time ratio. The goal is to minimize the request pattern distortions associated with a virtual user thread in queue or service being unavailable as a requestor. As a practical matter, the impact of an individual traffic source’s state on the distribution of times between arrivals becomes blurred as the number of sources increase toward the open source environment.

I rather suspect that the real-world population is a bunch of individuals who enter the system, and in a closed loop do quite a number of operations before leaving it again. In a given observed period most are in a loop, while a few are entering or exiting, and some other number are entering, doing a few operations and leaving.

I think a JMeter simulation/replay of such a community would be a valid test, but may or may not be a proper fit to the open model.  Someone with more mathematical intuition will be needed to say whether an open or a closed system is a good (in my limited sense, meaning "predictive") model of the system.

I think emphasis on having representative traffic can only be a good thing (;-)) 

Alex Bagehot

Jul 20, 2014, 10:53:31 AM
to guerrilla-cap...@googlegroups.com
Hello Jim,

Thanks for the reply. I think you have answered my question on whether there was any custom JMeter sampler being used to approximate an open system.

In terms of the mix, rate, and timing of requests I think your work is absolutely right and should be worked into load generators to validate that the request stream has good quality.

Similarly, in terms of traffic sources, I agree. At that level there is no difference between JMeter and Gatling/Iago - they both have multiple threads that make requests to the SUT. How those threads are controlled/driven is completely different, though, meaning that in Gatling we don't need any of the complications/workarounds (i.e. pacing) around what request rate will come from a Z, N combination.

Given that you can model a steady-state open system reasonably accurately in a closed-model tool like JMeter, why bother moving to a new toolset with the drawbacks that may have (fewer experienced engineers, less stable, unfamiliar approach, etc.)? I think it is around questions like co-ordinated omission / percentile reporting, unbridled load, and simpler repeatable workloads that I believe the open-model tools will provide benefit going forward. We should then also get more examples of how to model open systems documented.
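
On the co-ordinated omission point, for readers unfamiliar with it: a blocked closed-loop generator silently skips the sends it would otherwise have made, which biases the reported percentiles. A rough sketch of one common back-fill correction (along the lines popularised by HdrHistogram-style tools; the function and names here are mine):

  def backfill(measured_latencies, intended_interval):
      # If a response took longer than the intended gap between sends, the
      # sends that were skipped would have waited progressively less long;
      # add those synthetic samples back before computing percentiles.
      corrected = []
      for r in measured_latencies:
          corrected.append(r)
          missed = r - intended_interval
          while missed > 0:
              corrected.append(missed)
              missed -= intended_interval
      return corrected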

Thanks,
Alex

