Gatling scaling


Alvin Lin
Dec 13, 2013, 1:58:00 PM
to gat...@googlegroups.com
Is a scaling feature on the Gatling development roadmap?

I love Gatling; it has been my load test tool of choice for quite a while. I have also introduced it to many organizations, and they were all impressed by it.

Yes, we can follow the "scaling out" strategy in the documentation and write scripts, but that requires more work than necessary. It would be nicer if Gatling provided a scaling feature out of the box. What I am envisioning is a master node sending commands to slave nodes that do the actual load generation; at the end of the load test, the master automatically aggregates the report. Another nice feature would be for the master to report the live status of each slave during the test (so that I don't have to open 10 SSH windows to watch each slave). While a load test is running, I'd only have to interact with the master node.

I understand Gatling is still at a very young stage, and there are probably more important things to do than a scaling feature (like nailing the DSL), but I feel we really need to put scaling on the development roadmap so Gatling can pull even further ahead of its competitors. I'd really like to see Gatling crush competitors like LoadUI or JMeter :)

Agreed?

Floris Kraak
Dec 13, 2013, 4:10:43 PM
to gat...@googlegroups.com

Yep. You really need to implement what LoadRunner has had for nigh on 15 years now :)

Not to say it's a small job or anything, just that the open source tools really have a lot of catching up to do... and I for one would welcome the existence of actual competition. :)

--
You received this message because you are subscribed to the Google Groups "Gatling User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gatling+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Stéphane Landelle
Dec 14, 2013, 5:49:17 AM
to gat...@googlegroups.com
Yep, absolutely agree.
So much to do, so little time, but that will be for 2014.



Ioannis Mavroukakis
Dec 17, 2013, 4:57:20 AM
to gat...@googlegroups.com
Agreed, but then again it's always easy when you've had the HP behemoth behind you since 2006. We need to see more involvement from the community; I assume the main devs have day jobs to contend with, so their bandwidth can narrow considerably. It's a shame my Scala skills wouldn't fill the back of a postage stamp; I'd love to get stuck in on something like this.

Alvin Lin
Dec 17, 2013, 5:12:26 AM
to gat...@googlegroups.com
Yes, we definitely need more community involvement. I'd really like to contribute to the scaling feature, so I have been reading up on Scala and Gatling's source :-)

I actually find that contributing to Gatling a few times a week after work takes the stress of my day job away ;)

Stéphane Landelle
Dec 17, 2013, 6:40:45 AM
to gat...@googlegroups.com
> Agreed, but then again it's always easy when you've had the HP behemoth behind you since 2006. We need to see more involvement from the community; I assume the main devs have day jobs to contend with, so their bandwidth can narrow considerably. It's a shame my Scala skills wouldn't fill the back of a postage stamp; I'd love to get stuck in on something like this.

Yep, that's basically it: so much to do, so little time. But there are many ways to contribute: lend a hand on the mailing list, report bugs, give the snapshots a try, help improve the documentation (we're working on the Sphinx template for the new documentation; then we'll "just" have to write it down), front-end skills would be really appreciated, etc. Any help is welcome!

> Yes, we definitely need more community involvement. I'd really like to contribute to the scaling feature, so I have been reading up on Scala and Gatling's source :-)

That's really appreciated!


> I actually find that contributing to Gatling a few times a week after work takes the stress of my day job away ;)

You mean that our stress tool is also an anti-stress tool?

Stefan Magnus Landrø
Dec 17, 2013, 5:53:23 PM
to gat...@googlegroups.com
I tell you, LoadRunner is a totally inefficient tool - it requires way more resources than Gatling - that's why they had to implement scaling 15 years ago...
A few months ago, I was faced with a situation where a LoadRunner installation with 4 load generators couldn't generate as much load as Gatling. Performance-wise, LoadRunner just sucks.

Floris Kraak
Dec 18, 2013, 4:01:37 AM
to gat...@googlegroups.com
Hmm. That doesn't match my experience. Basic web vusers are fairly lightweight. I can have a single generator put out somewhere around ~1000 TPS without too much effort before the CPU caps out, and I really haven't tried tuning that at all. In fact, my scripts tend to be quite heavy in terms of features ;-)
Don't forget that this stuff was built in the late '90s, when CPU and memory resources were quite a bit scarcer. The later additions invariably suck performance-wise, but the core web protocol really doesn't.
So don't try to use the fancy Ajax stuff - that protocol is fairly horrible. ;-)
But maybe I just lack comparison material here. ;-)



--
REALITY.SYS corrupted. Reboot Universe? (Y/N/Q)

Nicolas Rémond
Dec 18, 2013, 4:07:39 AM
to gat...@googlegroups.com
With Gatling, we already reached 80k requests/sec on an i7 MacBook Pro ...

Ioannis Mavroukakis
Dec 18, 2013, 4:17:13 AM
to gat...@googlegroups.com
Hey Nicolas. Floris mentioned TPS, so requests/sec is not an equivalent metric.



Nicolas Rémond
Dec 18, 2013, 4:19:11 AM
to gat...@googlegroups.com
What does the T stand for?


Ioannis Mavroukakis
Dec 18, 2013, 4:22:25 AM
to gat...@googlegroups.com
Transactions. 

Nicolas Rémond
Dec 18, 2013, 4:25:49 AM
to gat...@googlegroups.com
Alright, I guessed that. But what is the "transaction" in this context?

And oops, I meant 8k requests/sec ;-) During the simulation, the number of concurrent active sessions was also about 8k.



Ioannis Mavroukakis
Dec 18, 2013, 4:33:33 AM
to gat...@googlegroups.com
Exactly :) Since we don't know what a transaction does in each system (it could be a login procedure, placing an order, etc.), it's hard to compare.

Floris Kraak
Dec 18, 2013, 5:09:14 AM
to gat...@googlegroups.com
In this case each transaction was a single HTTP request with possibly a redirect response followed by a second request to the redirected resource.
But yes, it's quite possible for a "transaction" to actually be a collection of HTTP requests. So it doesn't necessarily equate to RPS, though in practice it often will.
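The relationship between the two metrics is simple arithmetic; a tiny illustration (the function name is invented):

```python
def rps_from_tps(tps, requests_per_transaction):
    """A 'transaction' may bundle several HTTP requests, so RPS = TPS x requests per transaction."""
    return tps * requests_per_transaction

# A transaction that is one request plus one redirect follow-up counts as 2 requests:
assert rps_from_tps(1000, 2) == 2000
# Only when each transaction is a single request do the two metrics coincide:
assert rps_from_tps(1000, 1) == 1000
```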

Anyway, that was a fairly untuned file upload script (sending files in the 1-2 MB range on average), and when I look back at that test it wasn't CPU, it was actually memory that we were capping out on. (And that definitely wasn't normal - I had some problems with that script which I won't bore you with.)

I don't quite know what hardware those generators have, since they're actually virtual machines that share physical hardware with other generators. Hardware that isn't really all that new, I might add ;-)
If you really insist on comparing them I can probably dig up something from the archives.

The basic point, though, is that I rarely if ever have problems with generator performance, and we have a fairly decent-sized site to simulate, with 1.5 million unique customers logging in every day. We can simulate the load on the dual-server LST environment with one generator sitting there mostly idle, and even though that's just 1/12th of prod, it serves our needs quite easily.

So no. I don't believe "performance" is really a LR problem. That tool has many, many issues, but web protocol script performance isn't one of them.


Stefan Magnus Landrø
Dec 18, 2013, 5:45:21 AM
to gat...@googlegroups.com
Transactions

Sent from my iPhone

Stefan Magnus Landrø
Dec 18, 2013, 5:48:52 AM
to gat...@googlegroups.com
So why bother with scaling Gatling, then?

Sent from my iPhone

Stéphane Landelle
Dec 18, 2013, 5:51:04 AM
to gat...@googlegroups.com
IMHO, the main situation that requires scaling out is when you saturate the NIC.
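A back-of-envelope sketch of when the NIC becomes the bottleneck (the function name, the traffic figures, and the 70% utilization headroom are all illustrative assumptions):

```python
import math

def generators_needed(target_rps, avg_bytes_per_exchange, nic_gbps, utilization=0.7):
    """Estimate how many generators are needed before the NIC, not the CPU, is the limit."""
    offered_bits_per_sec = target_rps * avg_bytes_per_exchange * 8
    usable_capacity = nic_gbps * 1e9 * utilization  # don't plan to run a NIC flat out
    return math.ceil(offered_bits_per_sec / usable_capacity)

# 8k req/s of ~100 KB request+response pairs saturates several 1 Gbps NICs:
print(generators_needed(8000, 100_000, nic_gbps=1))   # 10 generators
# A modest test over a 10 Gbps link fits comfortably on one machine:
print(generators_needed(100, 20_000, nic_gbps=10))    # 1 generator
```

With large payloads (think file uploads or streaming), the NIC ceiling arrives long before a modern CPU caps out.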



Ioannis Mavroukakis
Dec 18, 2013, 5:54:47 AM
to gat...@googlegroups.com
+10

Stefan Magnus Landrø
Dec 18, 2013, 6:05:00 AM
to gat...@googlegroups.com
As in input/output queue length? 

Sent from my iPhone

Alvin Lin
Dec 18, 2013, 1:50:52 PM
to gat...@googlegroups.com
I've had a case where we couldn't get really good hardware for the generator node, in which case I needed multiple less powerful machines to generate the load I needed.

Stefan Magnus Landrø
Dec 19, 2013, 4:22:59 AM
to gat...@googlegroups.com
Or as in consuming lots of CPU because you have a crappy NIC?
Or even bandwidth? (Most datacenters use 1 Gbps or 10 Gbps for internal links these days, I believe.)

Floris Kraak
Dec 19, 2013, 4:28:52 AM
to gat...@googlegroups.com
There are several possible reasons to scale out beyond "generators have limited resources".

1) Load balancers.
If you have a load balancer set to be IP-sticky, then you'll need at minimum more than one IP just to make sure the load gets spread across the backend machines properly. This can be done with IP spoofing, but often it's simpler (and less likely to upset network security people) to just have more than one generator.

2) More complicated protocols require quite a lot more resources / CPU power.
Ajax/TrueClient, Citrix, "web services", ...
This may not *yet* apply to Gatling, but I guarantee you, once you start making the client simulation closer to what modern browsers do, you're going to need more horsepower.
The difference between hitting URLs with raw data and simulating clicks on buttons can actually be quite large in terms of resource consumption, especially if that means running the half a million lines of JavaScript certain sites dump into your browser nowadays.

3) Network bandwidth.
As Stéphane points out, if you are going to send really large amounts of data, you can exceed the total capacity of the NIC. This is really a variant of the "generators have limited resources" theme.
It happens less and less though. ;-)

4) Separation of responsibilities.
Separating the generator from the controller, the controller from the scheduler, the scheduler from the user interface, etc. is quite simply good engineering.
If you want to offer your tester live views of what is going on inside the test, you don't want the GUI code to be running in the same JVM as the generator itself.
Nor do you want multiple users to be able to stomp all over each other's tests or analyses by accident - while you *do* want to offer a view of who ran what tests when. With a single monolithic app you simply can't do that.
"Scaling out" really is just a tiny subset of the things you can start doing once you have a properly architected set of components that talk to each other, rather than a monolithic application.
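To make the load balancer point concrete, here's a toy model of IP-sticky routing (the octet-based "hash" is invented for illustration; real load balancers use their own stickiness algorithms):

```python
# Toy model of an IP-sticky load balancer: the backend is chosen from the
# client IP alone, so every request from one generator IP hits the same
# backend. The octet-based "hash" below is invented for illustration only.

def sticky_backend(client_ip, backends):
    """Deterministically map a client IP to one backend."""
    last_octet = int(client_ip.rsplit(".", 1)[1])
    return backends[last_octet % len(backends)]

backends = ["app-1", "app-2", "app-3"]

# A single generator IP exercises exactly one backend, no matter the request count:
hits_single = {sticky_backend("10.0.0.1", backends) for _ in range(1000)}
assert len(hits_single) == 1

# Several generator IPs (or spoofed source IPs) are needed to hit all the backends:
hits_multi = {sticky_backend(ip, backends) for ip in ("10.0.0.1", "10.0.0.2", "10.0.0.3")}
assert len(hits_multi) == 3
```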


LR has 5 components:
- Generator (LR agents)
- Controller (controls a single test for one user at a time; can be used standalone to get a live view on whatever test you're running)
- Scripting interface (VuGen, a desktop app)
- Analysis tool (also a desktop app)
- ALM - aka "Performance Center" - a web interface usable by multiple people to run multiple tests simultaneously using any number of controllers and generators. It also offers live test views and tracks all the test assets (scripts, scenarios, and whatnot)

The latter is pretty complicated stuff, but it's really just a management interface around the first three items.


As for "why gatling"..
Well. LR is pretty good stuff... but it's proprietary, expensive, stagnant, buggy, and inflexible. It doesn't support very well the agile model our customers want to move to, and adding the things we need means waiting for the supplier to build the required functionality (and paying through the nose for it).
Never mind the bugs ;-)

Floris Kraak
Dec 19, 2013, 4:37:31 AM
to gat...@googlegroups.com
.. and I completely forgot item #5 ..

5) Measurement quality.
Quite simply put: if you use just one generator, you can't be 100% sure whether a bad response time is the result of the system under test being slow or of *the generator* being slow. Especially when generators are virtualized, it isn't necessarily guaranteed that a generator has all of the resources of the physical hardware to itself. Nor can it be guaranteed that all of the components in that system are problem-free. A faulty NIC can cause very large problems, and it would be very easy to blame those on the system under test.
I've had some pretty interesting discussions with certain senior load testers over on the LR boards some time back. James Pulley, for instance, believes that every test farm should have at least 5 generators:
- 2 regular load generators
- 1 generator that runs only a fraction of the load, to measure the impact of the test load on response times for the first two machines
- 1 generator out of order or in maintenance
- 1 generator spare

That's a bit on the extreme side, though. ;-)

Stéphane Landelle
Dec 19, 2013, 4:48:02 AM
to gat...@googlegroups.com
+ 1 on all these points

Quote of the day: "Floris Kraak knows his performance test stuff!"



Alvin Lin
Dec 19, 2013, 5:04:04 AM
to gat...@googlegroups.com
Agreed, I learned something. I can use these points to reinforce my request for more load generators :)

Ioannis Mavroukakis
Dec 19, 2013, 5:27:07 AM
to gat...@googlegroups.com

> 2) More complicated protocols require quite a lot more resources / CPU power.
> Ajax/TrueClient, Citrix, "web services", ...
> This may not *yet* apply to Gatling, but I guarantee you, once you start making the client simulation closer to what modern browsers do, you're going to need more horsepower.
> The difference between hitting URLs with raw data and simulating clicks on buttons can actually be quite large in terms of resource consumption, especially if that means running the half a million lines of JavaScript certain sites dump into your browser nowadays.

Agreed on all points except this. If I start making the client simulation closer to what browsers do, I might as well use Selenium (and then I'm *really* going to need more horsepower). If I need a load testing tool to figure out that my JS is slow, I've used a cannon to crack a nut; this is a job for profiling tools in the browser.

Stefan Magnus Landrø
Dec 19, 2013, 6:02:54 AM
to gat...@googlegroups.com
I agree on most of your points, but I have some comments.


On Thursday, 19 December 2013 10:28:52 UTC+1, Floris Kraak wrote:
> There are several possible reasons to scale out beyond "generators have limited resources".
>
> 1) Load balancers.
> If you have a load balancer set to be IP-sticky, then you'll need at minimum more than one IP just to make sure the load gets spread across the backend machines properly. This can be done with IP spoofing, but often it's simpler (and less likely to upset network security people) to just have more than one generator.

Makes sense. However, most of the time, a faulty load balancer configuration will end up routing all traffic from a single IP to one node. At the same time, it can easily be verified that all servers are being hit. I don't know about you guys, but we tend to use BigIP LTM in our projects and seldom have to modify the config. In addition, I believe one should be able to trust commercial-grade enterprise hw/sw, considering that they perform lots of testing on their side too (using IXIA and the like).

> 2) More complicated protocols require quite a lot more resources / CPU power.
> Ajax/TrueClient, Citrix, "web services", ...
> This may not *yet* apply to Gatling, but I guarantee you, once you start making the client simulation closer to what modern browsers do, you're going to need more horsepower.
> The difference between hitting URLs with raw data and simulating clicks on buttons can actually be quite large in terms of resource consumption, especially if that means running the half a million lines of JavaScript certain sites dump into your browser nowadays.

Well, in my opinion, going down this pathway will give you trouble. Until now, people have been focusing on writing applications that are unit testable. I believe it's about time people start writing apps that are performance testable - and drop application frameworks that mask the HTTP protocol. If you're using JSF or Wicket or all kinds of portal systems, you'll get in trouble. IMHO, REST-based (single page) applications are the way forward from a performance testing perspective.

> 3) Network bandwidth.
> As Stéphane points out, if you are going to send really large amounts of data you can exceed the total capacity of the NIC. This is really a variant of the "generators have limited resources" theme.
> It happens less and less though. ;-)

I've never had any issues during performance testing of regular web applications. Streaming and the like might be a different story, though.

> 4) Separation of responsibilities.
> Separating the generator from the controller, the controller from the scheduler, the scheduler from the user interface, etc. is quite simply good engineering.
> If you want to offer your tester live views of what is going on inside the test, you don't want the GUI code to be running in the same JVM as the generator itself.

Well, I believe it is a trade-off. If your GUI uses, let's say, reactive programming techniques, then you should be OK.

> Nor do you want multiple users to be able to stomp all over each other's tests or analyses by accident - while you *do* want to offer a view of who ran what tests when. With a single monolithic app you simply can't do that.

Agreed; however, from my experience, most shops only have a few performance testers, and those guys should be able to talk to each other.

> "Scaling out" really is just a tiny subset of the things you can start doing once you have a properly architected set of components that talk to each other, rather than a monolithic application.

Agreed.

> 5) Measurement quality.
> Quite simply put: if you use just one generator, you can't be 100% sure whether a bad response time is the result of the system under test being slow or of *the generator* being slow. Especially when generators are virtualized, it isn't necessarily guaranteed that a generator has all of the resources of the physical hardware to itself. Nor can it be guaranteed that all of the components in that system are problem-free. A faulty NIC can cause very large problems, and it would be very easy to blame those on the system under test.
> I've had some pretty interesting discussions with certain senior load testers over on the LR boards some time back. James Pulley, for instance, believes that every test farm should have at least 5 generators:
> - 2 regular load generators
> - 1 generator that runs only a fraction of the load, to measure the impact of the test load on response times for the first two machines
> - 1 generator out of order or in maintenance
> - 1 generator spare
>
> That's a bit on the extreme side, though. ;-)

Agreed. Faulty servers are troublesome.

> LR has 5 components:
> - Generator (LR agents)
> - Controller (controls a single test for one user at a time; can be used standalone to get a live view on whatever test you're running)
> - Scripting interface (VuGen, a desktop app)
> - Analysis tool (also a desktop app)
> - ALM - aka "Performance Center" - a web interface usable by multiple people to run multiple tests simultaneously using any number of controllers and generators. It also offers live test views and tracks all the test assets (scripts, scenarios, and whatnot)
>
> The latter is pretty complicated stuff, but it's really just a management interface around the first three items.

I had to use a Performance Center setup recently. It's a nasty beast, and at least my installation was full of bugs and caused me lots of trouble. It was mega hard to debug what was going on, too.
I'd prefer a simpler architecture with one controller and a bunch of generators. Fewer components, less trouble.

Of course, it would be really nice to have all the features you mention. It comes at a price, though, and that price is fewer other features.

Floris Kraak
Dec 19, 2013, 2:49:03 PM
to gat...@googlegroups.com

I also said "Citrix", remember? Things like that (or RDP, as another example) tend to be quite heavy.

Anyway, to get back to your point: HP has been trying to do that for years, with varying degrees of success. Remember that "Ajax protocol" I was talking about as an example of something with horrible performance? That was one attempt.
The latest attempt has been TrueClient - and that's pretty much an embedded Firefox instance used as a load generation tool.

For some reason there is this idea floating around in upper management layers that performance testing has to be cheap, done by the cheapest people, using smart tools that do all the thinking for them. HP has been taking steps towards unifying their functional test tool with LoadRunner over the years, precisely for this reason.

Personally, I don't really agree. Performance is, and always has been, a complicated game that requires knowledgeable people. But that opinion isn't shared by everyone.




Floris Kraak
Dec 19, 2013, 3:15:27 PM
to gat...@googlegroups.com


On Thu, Dec 19, 2013 at 12:02 PM, Stefan Magnus Landrø <stefan...@gmail.com> wrote:
I agree on most of your points, but I have some comments


On Thursday, 19 December 2013 10:28:52 UTC+1, Floris Kraak wrote:
There are several possible reasons to scale out beyond "generators have limited resources".

1) Load balancers.
If you have a loadbalancer set to be IP-sticky then you'll need at a minimum more than one IP just to make sure the load gets spread across the backend machines properly. This can be done with IP spoofing, but often it's simpler (and less likely to upset network security people) to just have more than one generator.

Makes sense. However, most of the time, faulty load balancer configuration will end up routing all traffic from a single ip to one node. At the same time, it can easily be verified that all servers are being hit. I don't know about you guys, but we tend to use Big Ip LTM in our projects, and seldom have to modify the config. In addition, I believe one should be able to trust commercial grade enterprise hw/sw, considering that they perform lots of testing too on their side (using IXIA and the like). 

It's been years since I last had to do this type of testing, but it does happen that when a big institution buys half a million worth of hardware, they want someone to verify with certainty that that hardware performs its job correctly.

You're correct, though. Aside from the odd configuration error, problems with network hardware are very rare. I've only really seen it happen once, with a slightly cheaper Cisco router that had the "quality of service" feature enabled without being configured, and traffic coming in at 100 Mbit on one side with a 1 Gb interface on the other end. File transfers larger than about 60 KB would somehow end up getting queued for ages without dropping packets, causing very odd behaviour (and breaking TCP flow control in the process).

 

2) More complicated protocols require quite a lot more resources / CPU power.
Ajax/Trueclient, Citrix, "web services", ...
This may not *yet* apply to gatling, but I guarantee you, once you start making the client simulation closer to what modern browsers do you're going to need more horsepower.
The difference between hitting URL's with raw data and simulating clicks on buttons can actually be quite large in terms of resource consumption, especially if that means running the half-a-million lines of javascript certain sites dump into your browser nowadays.
Well, in my opinion, going down this pathway, will give you trouble. Until now, people have been focusing on writing applications that are unit testable. I believe it's about time people start writing apps that are performance testable - and drop using application frameworks that masks the HTTP protocol. If you're using JSF or Wicket or all kinds of portal-systems, you'll get in trouble. IMHO, REST-based (single page) applications are the way forward from a performance testing perspective.  


Honestly, I have this standing policy that I will not allow myself to get involved in tests using Citrix, RDP, RMI, or anything else that doesn't look at least vaguely like text-based HTTP traffic. That has served me well over the years, as practically every LoadRunner performance test I've seen so far using any of the above protocols has been a spectacular failure.

In other words: I completely agree, but that doesn't stop dumb management folks from purchasing software from outside sources that breaks that rule (and invariably performs poorly...).

(I might make an exception for something modern, extremely well documented, and open, like Google Protocol Buffers - but even then I would seriously consider switching away from LoadRunner for such a test; the language choice just doesn't agree with such a project...)


 

3) Network bandwidth.
As Stéphane points out, if you are going to send really large amounts of data you can exceed the total capacity of the nic. This is really a variant of the "generators have limited resources" theme.
It happens less and less though. ;-)
I've never had any issues during performance testing of regular web applications. Streaming and the likes might be a different story though.

It gets rarer as network bandwidth increases.
We do have to test things with limited bandwidth and varying degrees of latency though, due to the prevalence of wifi and mobile networks nowadays (not to mention the existence of offices in Australia).
But that doesn't really affect how many generators you use...

 
 

4) Seperation of responsibilities.
Seperating the generator from the controller, the controller from the scheduler, the scheduler from the user interface etc is quite simply just good engineering. 
If you want to offer your tester live views of what is going on inside the test you don't want the gui code to be running in the same JVM as the generator itself.
Well, I believe it is a trade-off. If you gui uses lets say reactive programming techniques, then you should be ok. 

I would be extremely careful with that. Since Gatling runs inside a JVM, garbage collection pauses will occur to deal with the extra data required by that fancy GUI you want to use.
Garbage collection pauses, in my view, can already have too much influence on your test results - the system under test often runs software in a JVM as well, and we really do have an interest in what the impact of GC collections is on that end.
 
Nor do you want multiple users to be able to stomp all over each other's tests or analysis by accident - while you *do* want to offer a view of who ran what tests when. With a single monolithic app you simply can't do that.
Agreed, however from my experience, most shops only have a few performance testers, and those guys should be able to talk to each other. 

True. The organisation I work for is a bit of an exception.
For most people a simple generator/controller split should be good enough.

 
"Scaling out" really is just a tiny subset of the things you can start doing once you have a properly architected set of components that talk to each other, rather than a monolithic application.
Agreed.
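To make the generator/controller split concrete, here is a minimal in-process sketch: generators push raw samples to a controller, which only aggregates. In a real deployment the queue would be a network channel; all names and the sample format here are illustrative, not Gatling's (or LoadRunner's) actual design.

```python
# Minimal controller/generator split, sketched in-process with threads.
import queue
import random
import threading

def generator(gen_id, out, n_requests):
    """A load generator: runs requests and reports raw samples upstream."""
    for _ in range(n_requests):
        latency_ms = random.uniform(5, 50)  # stand-in for a real request
        out.put((gen_id, latency_ms))
    out.put((gen_id, None))                 # "I'm done" marker

def controller(inbox, n_generators):
    """The controller only aggregates; it never generates load itself."""
    samples, done = {}, 0
    while done < n_generators:
        gen_id, latency = inbox.get()
        if latency is None:
            done += 1
        else:
            samples.setdefault(gen_id, []).append(latency)
    # A live view or final report would be rendered from this aggregate.
    return {g: sum(v) / len(v) for g, v in samples.items()}

inbox = queue.Queue()
threads = [threading.Thread(target=generator, args=(g, inbox, 100))
           for g in range(3)]
for t in threads:
    t.start()
report = controller(inbox, n_generators=3)
for t in threads:
    t.join()
print({g: round(m, 1) for g, m in sorted(report.items())})
```

The point of the split is visible even in this toy: the controller never touches request code, so it can sit on a different machine, serve a live view, or feed a report writer without stealing resources from the generators.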

 
5) Measurement quality.
Quite simply put: if you are going to use just one generator, you can't be 100% sure whether a bad response time is the result of the system under test being slow or of *the generator* being slow. Especially when generators are virtualized, it isn't necessarily guaranteed that the generator has all of the resources on the physical hardware to itself. Nor can it be guaranteed that all of the components in that system are problem-free. A faulty NIC can cause very large problems, and it would be very easy to blame those on the system under test.
I've had some pretty interesting discussions with certain senior loadtesters over on the LR boards some time back. James Pulley for instance believes that all test farms should have at least 5 generators:
- 2 regular load generators
- 1 generator that runs only a fraction of the load, to measure the impact of the test load on response times for the first two machines.
- 1 generator out of order or in maintenance.
- 1 generator 'spare'

That's a bit on the extreme side, though. ;-) 

Agreed. Faulty servers are troublesome. 


It doesn't have to be the server's fault. A flaw in the test script could overload the generator, too. Or another VM, intruding on the generator's CPU/IO/memory/... resources. Or a backup that starts running. Or .. any of a million things that can happen that isn't necessarily the fault of faulty hardware. And it doesn't even have to be very blatantly obvious, either.
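The low-load "control generator" idea above can be sketched as a simple comparison: if a heavily loaded generator reports much worse times than the control generator for the *same* requests, suspect the generator rather than the system under test. (The threshold and data here are illustrative.)

```python
# Sketch of the "control generator" check: compare latencies seen by a
# lightly loaded generator against a heavily loaded one.

def median(xs):
    s = sorted(xs)
    mid = len(s) // 2
    return s[mid] if len(s) % 2 else (s[mid - 1] + s[mid]) / 2

def generator_suspect(loaded_ms, control_ms, tolerance=1.5):
    """Flag the loaded generator if its median latency exceeds the
    control generator's median by more than `tolerance` times."""
    return median(loaded_ms) > tolerance * median(control_ms)

control = [40, 42, 41, 43, 39]         # low-rate control generator
healthy = [44, 46, 45, 43, 47]         # loaded, but close to control
overloaded = [40, 120, 180, 150, 160]  # loaded generator struggling

print(generator_suspect(healthy, control))     # False
print(generator_suspect(overloaded, control))  # True
```

If the control generator degrades too, the slowdown is probably real; if only the loaded generators degrade, look at the generators first.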
 

LR has 5 components:
- Generator (LR agents)
- Controller (controls a single test/one user simultaneously, can be used standalone to get a live view on whatever test you're running)
- Scripting interface (vugen, desktop app)
- Analysis tool (also a desktop app)
- ALM - aka "Performance Center" - web interface usable by multiple people to run multiple tests simultaneously using any number of controllers and generators. Also offers live test views and tracks all the test assets (scripts, scenarios and what not)

The latter thing is pretty complicated stuff but it's really just a management interface around the first three items.


I had to use a performance center setup recently. It's a nasty beast, and at least my installation was full of bugs and caused me lots of trouble. Mega hard to debug what was going on, too.
I'd prefer a simpler architecture with one controller and a bunch of generators. Fewer components, less trouble.

 
Yeah. Quite nasty. Poorly architected too, in some ways - that applet-based GUI they're using, for instance, uses only a single thread for both GUI processing and getting updates back from the server, so sometimes keystrokes or clicks simply drop into a black hole while it's waiting for the server to reply. And that's just *one* of the half a million nasty little issues waiting to bite you in the rear.
I don't envy the people who have to maintain that thing. I really don't.
 
Of course, it would be really nice to have all the features you mention. It comes at a price though, and that price is fewer other features. 

That's just a matter of investing more time ..
But I really wouldn't start trying to rebuild ALM entirely either. Just the generator/controller split would be a very good first step.

Ioannis Mavroukakis

unread,
Dec 20, 2013, 8:19:41 AM12/20/13
to gat...@googlegroups.com
On 19 December 2013 19:49, Floris Kraak <rand...@gmail.com> wrote:

I also said "Citrix", remember? Things like that (or RDP, as another example) tend to be quite heavy.

Anyway, to get back to your point: HP has been trying to do that for years, with various degrees of success. Remember that "Ajax protocol" I was talking about as an example of something with horrible performance? That was one attempt.
The latest attempt has been TrueClient - and that's pretty much an embedded firefox instance used as a load generation tool.

For some reason there is this idea floating around in upper management layers that performance testing has to be cheap, done by the cheapest people, using smart tools that do all the thinking for them. HP has been making steps towards unifying their functional test tool with loadrunner over the years, precisely for this reason.

Cheap: Yes, why not? The "More expensive == better" mantra is long dead. 
Cheapest people: Cheapest is not necessarily correlated with less bright; even if that were the case, where's the knowledge transfer from the not-so-cheap people? This is an exercise for HR and those ultimately responsible for making the hiring decisions, after interviewing.
Smart tools: A tool is only as good as the person that interprets it, smart or otherwise.
 

Personally, I don't really agree. Performance is, and always has been, a complicated game that requires knowledgeable people. But that opinion isn't shared by everyone.

A cynic might take this as a statement to satisfy one's salary requirements - but that opinion isn't shared by everyone ;-)  

Floris Kraak

unread,
Dec 20, 2013, 8:51:01 AM12/20/13
to gat...@googlegroups.com
I think the current discrepancy between salary levels in different countries is pretty destructive in the long run.

One of the effects is that work drifts to cheaper countries even when the more expensive people back home are better qualified and/or more experienced. Sure, there are lots of bright people over in India; but there was a time when all that management over here looked at was the difference in compensation (about a factor of 5, I believe) and ignored everything else, including not just experience, but things like cultural differences and simple time zone lag as well.

Indian culture has always been very much a "yes sir!" culture - while a good performance engineer is a critical thinker, capable of grasping the entire architectural picture and calling the architects a bunch of idiots if necessary.
That may be changing - time and experience will give these people the necessary education at some point - but especially in the heyday of outsourcing that just wasn't a requirement.

I understand your cynicism. There are plenty of not-so-qualified people over here, too.
But my point is that cost should not be the *only* factor. And the thought that you can 'dumb down' performance testers by making the tools smarter is imho really misguided.



--
You received this message because you are subscribed to the Google Groups "Gatling User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to gatling+u...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Ioannis Mavroukakis

unread,
Dec 20, 2013, 9:06:23 AM12/20/13
to gat...@googlegroups.com
That has long come back to haunt a lot of companies, in places as simple as call centres. I totally agree with cost not being the only factor. However, it is squarely "our" fault for not tailoring the message to the right audience, i.e. for failing to get across that the initial savings will be eroded at an exponential pace once the complexity of a particular project widens (one can argue that letting the complexity widen is a bad thing, but that's a topic for another discussion). My approach to this is to demo a "smart" tool (of any sort) to someone tech semi-literate and ask them to draw conclusions. The results often speak for themselves.

 



Floris Kraak

unread,
Dec 20, 2013, 9:11:32 AM12/20/13
to gat...@googlegroups.com

I disagree on one particular point:

It's "our" fault that we let people with zero knowledge of IT run IT shops.
It's "our" fault that we allow management level salaries to balloon to ridiculous proportions, causing the profession to attract the greediest people in our society, rather than the best.

We've communicated the message often enough, but the fact is that many of the people making the decisions simply have no clue about what this job entails in the first place. Letting them get into that position in the first place is the problem. Not the content of the message.

