Issue launching multiple concurrent Firefox instances using Selenium Grid


Undeleterious

Aug 4, 2009, 11:06:24 AM
to selenium-developers
Hi all,

I have been having a few issues with my Selenium Grid setup when
running multiple instances of Firefox concurrently.

My initial problem was the "Firefox is already running" dialog box as
described here: http://support.mozilla.com/en-US/kb/Firefox+is+already+running+but+is+not+responding.
This has occurred before, apparently when the session ID wasn't unique
enough: http://clearspace.openqa.org/message/65733.

I started adding debug output to Selenium RC and suspect that, after
creating a new profile, the wait for the profile to be copied is not
long enough. It waits for extensions.ini to exist and then assumes
that the profile has been created, but perhaps this file can sometimes
be present before the profile is fully copied. Adding an arbitrary
wait seems to have made this problem go away. This isn't a nice fix, so
perhaps someone can think of a better way to be sure the profile
exists.
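
One possible alternative to a fixed sleep, sketched here under assumptions rather than taken from Selenium RC: besides checking that the marker file (extensions.ini, as above) exists, wait until the file count and total size of the profile directory have stopped changing for a short settle period. The names ProfileSettleWait and waitUntilSettled are hypothetical, and the sketch relies on java.nio.file, so it needs Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.stream.Stream;

/** Hypothetical helper; not part of Selenium RC. */
final class ProfileSettleWait {

    /** Cheap fingerprint of a directory tree: {regular file count, total bytes}. */
    static long[] snapshot(Path dir) throws IOException {
        long count = 0;
        long bytes = 0;
        try (Stream<Path> paths = Files.walk(dir)) {
            for (Path p : (Iterable<Path>) paths::iterator) {
                if (Files.isRegularFile(p)) {
                    count++;
                    bytes += Files.size(p);
                }
            }
        }
        return new long[] {count, bytes};
    }

    /**
     * Returns true once the marker file exists and the fingerprint has been
     * unchanged for settleMillis; returns false if timeoutMillis elapses first.
     */
    static boolean waitUntilSettled(Path dir, String markerFile,
                                    long settleMillis, long timeoutMillis)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long stableSince = System.currentTimeMillis();
        long[] previous = null;
        while (System.currentTimeMillis() < deadline) {
            long[] current = null;
            try {
                if (Files.exists(dir.resolve(markerFile))) {
                    current = snapshot(dir);
                }
            } catch (IOException | UncheckedIOException e) {
                current = null; // the tree changed while it was being read
            }
            long now = System.currentTimeMillis();
            if (current == null || previous == null || !Arrays.equals(current, previous)) {
                stableSince = now; // something changed, restart the settle clock
            } else if (now - stableSince >= settleMillis) {
                return true;
            }
            previous = current;
            Thread.sleep(50);
        }
        return false;
    }
}
```

This is still a heuristic: a slow disk or a virus scanner could pause the copy for longer than the settle period, so the settle value would need tuning per machine.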

The second issue that I am now encountering is that Selenium RC is not
returning from executing the command to create the Firefox profile. By
adding debug output to Selenium RC I can see that the shell.execute() line
in the FirefoxChromeLauncher.populateCustomProfileDirectory(String
profilePath) method is not completing, but I have no idea why.

From my limited knowledge of Java, could the execute be modified to
implement the Future interface so that it could time out a failed
attempt and either retry or throw an exception?
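
A minimal sketch of the Future idea, assuming the blocking call can be wrapped in a Callable. TimedCall and callWithTimeout are hypothetical names, not existing Selenium RC code; only java.util.concurrent is used.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper; not part of Selenium RC. */
final class TimedCall {

    /** Runs the task on a worker thread and gives up after timeoutMillis. */
    static <T> T callWithTimeout(Callable<T> task, long timeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            try {
                return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupts the worker thread
                throw e;
            }
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Note that cancelling the Future only interrupts the waiting Java thread. A Firefox process that had already been spawned would still have to be killed separately.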

This is a real pain as we have hundreds of tests and I really want to
get things running in an acceptable amount of time using Grid. I know
I could avoid launching multiple instances, and have more virtual
machines but this is much less scalable than I was hoping.

I am running the RCs on fresh installations of Windows XP virtual
machines (running on VirtualBox). Any advice would be much
appreciated. Let me know if you need me to expand or clarify anything.

Dave.

Patrick Lightbody

Aug 5, 2009, 12:49:25 AM
to selenium-...@googlegroups.com
Dave,
This definitely sounds like a real issue. Could you do two things?

1) Provide a patch that addresses the first issue you identified

2) Give us a pointer (ideally with a URL to the code in
http://svn.openqa.org and a line number) to where you think the
additional logic could be added to detect the timeout, possibly using
a Future as you suggested.

Patrick

Undeleterious

Aug 5, 2009, 5:50:18 AM
to selenium-developers
I'm happy to provide a patch, but I'm certain that there's a better
way to resolve the first issue.

All I have done is add the line:
AsyncExecute.sleepTight(2500);
after line 304 of
http://svn.openqa.org/svn/selenium-rc/trunk/server-coreless/src/main/java/org/openqa/selenium/server/browserlaunchers/FirefoxChromeLauncher.java

Can anyone think of a better and more robust way of actually making
sure that the profile has been fully created before continuing?

As for the execute that might need to time out, that's on line 105 of
http://svn.openqa.org/svn/selenium-rc/trunk/server-coreless/src/main/java/org/openqa/selenium/server/browserlaunchers/FirefoxChromeLauncher.java

Incidentally, these issues are related to creating a new profile for
each session. If there was a way to reuse a profile between sessions -
but not between RCs - then although it doesn't address this issue it
may provide a workaround that has the additional benefits of faster
launch times for Firefox and ability to test cookies between sessions.
See issues http://jira.openqa.org/browse/SRC-168 and
http://jira.openqa.org/browse/SEL-661.

I also just found http://jira.openqa.org/browse/SEL-668 that might be
related to my first issue.

Thanks,
Dave


Patrick Lightbody

Aug 5, 2009, 9:39:06 AM
to selenium-...@googlegroups.com
I've done some work for BrowserMob to re-use sessions that I intend to
port back to SRC eventually. Like you said, it won't address the root
issue, but it will at least reduce the frequency of it happening (as
well as make things faster!)

The sleepTight() definitely isn't the right way to go. The root issue
is that we're launching Firefox with a special chrome extension that
will cause it to immediately quit. That allows us to get a fully
populated profile directory. The block of code is here:

private void populateCustomProfileDirectory(String profilePath)
        throws IOException {
    /*
     * The first time we launch Firefox with an empty profile directory,
     * Firefox will launch itself, populate the profile directory, then
     * kill/relaunch itself, so our process handle goes out of date.
     * So, the first time we launch Firefox, we'll start it up at an URL
     * that will immediately shut itself down.
     */
    cmdarray = new String[]{browserInstallation.launcherFilePath(),
            "-profile", profilePath, "-chrome", CHROME_URL};
    LOGGER.info("Preparing Firefox profile...");
    shell.setCommandline(cmdarray);
    shell.execute();
    waitForFullProfileToBeCreated(20 * 1000);
}

Your issue is with waitForFullProfileToBeCreated - it appears that
perhaps this method returns prematurely. Adding a fixed sleep time of
course is still brittle. But I really have an even more basic
question: why is it that we need to wait at all? Clearly someone
thought it was a good idea, but shell.execute() is supposed to be
synchronous (it returns an exit code after all).

Can anyone shed some light on this? Dan perhaps?

Patrick

Dave Hunt

Aug 5, 2009, 10:42:10 AM
to selenium-developers
I don't want to confuse the two issues:

Issue 1

The problem:
1. Profile is created
2. Firefox is launched (fails because 'firefox is already running')

The workaround:
1. Profile is created
2. Sleep for 2.5 seconds
3. Firefox is launched

I suspect that the profile execution returns before the profile is
fully created, therefore explaining a wait in the code for
'extensions.ini' to exist. I guess sometimes just waiting for one
nominated file in the profile to exist isn't enough.


Issue 2

1. Profile is created (this appears to never return an exit code)

I'm currently trying out some different configurations. Currently I
have successfully run with two RCs launching Firefox simultaneously.
Perhaps this issue only occurs if two or more executes happen to
run at the same time..?

Let me know if this needs any more clarification. I appreciate your
time.

Dave.

jcmeyrignac

Aug 6, 2009, 6:06:07 AM
to selenium-developers
We fixed this bug in our Selenium RC branch:

In HTMLLauncher.java, replace:

    String sessionId = Long.toString(System.currentTimeMillis() % 1000000);

with:

    String sessionId = UUID.randomUUID().toString().replace("-", "");

and add:

    import java.util.UUID;

The problem was that when you run multiple Selenium RC sessions, they
might fail because their session IDs are equal.

BTW, we have been running 3 simultaneous Selenium RCs without problems
for a few months.
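
To illustrate why the timestamp scheme can collide, here is a small sketch that puts the two ID schemes side by side. The class and method names (SessionIds, timestampId, uuidId) are hypothetical; the two expressions are the ones quoted above, with the clock value passed in as a parameter so the collision is reproducible.

```java
import java.util.UUID;

/** Hypothetical illustration; not the actual HTMLLauncher code. */
final class SessionIds {

    /** Old scheme: the millisecond clock modulo 1000000. */
    static String timestampId(long nowMillis) {
        return Long.toString(nowMillis % 1000000);
    }

    /** Replacement scheme: a random UUID with the dashes removed. */
    static String uuidId() {
        return UUID.randomUUID().toString().replace("-", "");
    }
}
```

Two sessions created in the same millisecond get the same timestampId, and the value also repeats every 1,000,000 ms (roughly every 17 minutes), whereas two random UUIDs are practically never equal.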

JC

Dave Hunt

Aug 6, 2009, 6:21:02 AM
to selenium-developers
Thanks JC. The HTMLLauncher looks like it is specific to running HTML
suite tests, which we are not doing. I am using the Java client
driver, and the browser session IDs are already UUIDs. From my
investigation the IDs of the sessions that are failing in my case are
unique.

I suspect your fix will help others though, have you submitted a
patch?

Thanks,
Dave

jcmeyrignac

Aug 6, 2009, 8:11:26 AM
to selenium-developers
Sorry, we don't use Selenium Grid, but the bug sounded similar.

> I suspect your fix will help others though, have you submitted a
> patch?
>
It was not my fix (it was from one of my co-workers), so I didn't
submit a bug.
This is done now:
http://jira.openqa.org/browse/SRC-729
(sorry for the typo in the title, I typed conflit instead of conflict,
since it's the French term).

JC

Dave Hunt

Aug 6, 2009, 9:26:52 AM
to selenium-developers
I have encountered this issue on two RCs running on two virtual
machines 10 minutes apart. The following is from the console, and
includes some additional debug that I compiled into RC:

[java] 13:52:16.920 INFO - creating new remote session
[java] 13:52:16.930 DEBUG - Requested browser string '*firefox' matches *firefox
[java] 13:52:16.930 INFO - Allocated session d1f3b9125c8c4c61a60951c3efeb3d62 for http://www.example.com, launching...
[java] 13:52:21.311 DEBUG - Extracting /customProfileDirCUSTFFCHROME to C:\DOCUME~1\selenium\LOCALS~1\Temp\customProfileDird1f3b9125c8c4c61a60951c3efeb3d62
[java] 13:52:21.532 INFO - Preparing Firefox profile...
[java] 13:52:21.532 DEBUG - Executing: C:\Program Files\Mozilla Firefox\firefox.exe -profile C:\DOCUME~1\selenium\LOCALS~1\Temp\customProfileDird1f3b9125c8c4c61a60951c3efeb3d62 -chrome chrome://killff/content/kill.html

It doesn't really give any more information than I had already
provided. I also added a line outputting the exit code of the execute as
debug, which as you can see is not returned. This time I have noticed
some SocketExceptions with the other running RCs after attempting to
kill Firefox. I'll do some more investigation...

Dave

Dave Hunt

Aug 6, 2009, 10:17:34 AM
to selenium-developers
Sorry, the socketException looks to be normal. Please ignore.


Dave Hunt

Aug 6, 2009, 11:36:30 AM
to selenium-developers
After looking into this a bit more, it looks like we might be able to
use an ExecuteWatchdog
(http://api.dpml.net/ant/1.6.5/org/apache/tools/ant/taskdefs/ExecuteWatchdog.html)
to time out the execute, and perhaps retry a few times before throwing an
exception. I'm prepared to try this out for myself - and it'll be really
handy to force myself to learn some new skills here - but this issue is
really blocking us at the moment, so if anyone else wants to jump in and
save the day that would be great!
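
The "retry a few times before throwing an exception" part could look roughly like the sketch below. It deliberately avoids the Ant classes so that it stands alone: Retry and withRetries are hypothetical names, and the attempt is assumed to be a Callable that throws when the execute fails or times out.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper; not part of Selenium RC or Ant. */
final class Retry {

    /** Calls the attempt up to maxAttempts times and returns the first success. */
    static <T> T withRetries(Callable<T> attempt, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                return attempt.call();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // do not retry after an interrupt
                throw e;
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
        }
        throw new RuntimeException("Gave up after " + maxAttempts + " attempts", last);
    }
}
```

Each attempt would need to clean up after itself (for example by killing a hung Firefox and deleting a half-written profile), otherwise a retry may run into the leftovers of the previous attempt.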

Cheers,
Dave

Philippe Hanrigou

Aug 6, 2009, 12:16:47 PM
to selenium-developers
Hi Dave,

Thanks for investigating all these problems.

2009/8/5 Undeleterious <dave...@gmail.com>:

> Incidentally, these issues are related to creating a new profile for
> each session. If there was a way to reuse a profile between sessions -
> but not between RCs

I believe you can achieve this by launching the RCs with the -browserSessionReuse option. This is how I usually use the *firefox browser mode in Selenium Grid.

Cheers,
- Philippe

Dave Hunt

Aug 6, 2009, 12:47:17 PM
to selenium-developers
I'm trying the -browserSessionReuse now to see if it is a viable
workaround.

I have also been able to properly set up the projects in Eclipse IDE,
and have compiled a version using a watchdog. I'll try to get that
running tomorrow and report on my progress. I suspect that I might be
able to get something working but it won't be neat, so I'll work on a
patch and submit it so it can be refined before it's merged.

Dave.


Dave Hunt

Aug 7, 2009, 4:09:34 AM
to selenium-developers
Okay, the -browserSessionReuse option makes the issue less likely to occur,
but I was able to reproduce it again when trying to scale up the
number of concurrent instances. I have some other stuff to do today,
but it'll make my week if I get something working by the end of the
day.

Cheers,
Dave

Dave Hunt

Aug 7, 2009, 5:50:55 AM
to selenium-developers
I think I'm out of my depth here. I modified the
populateCustomProfileDirectory method to try timing out the execute,
which seemed to work, but I've somehow introduced an issue with killing
Firefox... I can't see how my modifications caused this - perhaps
there's something wrong with the way I've built Selenium RC.

Here is the method after my modifications:

private void populateCustomProfileDirectory(String profilePath)
        throws IOException {
    /*
     * The first time we launch Firefox with an empty profile directory,
     * Firefox will launch itself, populate the profile directory, then
     * kill/relaunch itself, so our process handle goes out of date.
     * So, the first time we launch Firefox, we'll start it up at an URL
     * that will immediately shut itself down.
     */
    cmdarray = new String[]{browserInstallation.launcherFilePath(),
            "-profile", profilePath, "-chrome", CHROME_URL};
    LOGGER.info("Preparing Firefox profile...");

    ExecuteWatchdog watchdog = new ExecuteWatchdog(30000L);
    ExecuteStreamHandler pumpStreamHandler = new PumpStreamHandler();
    Execute execute = new Execute(pumpStreamHandler, watchdog);
    execute.setCommandline(cmdarray);
    LOGGER.info("Executing " + browserInstallation.launcherFilePath()
            + " -profile " + profilePath + " -chrome " + CHROME_URL);
    int exitValue = execute.execute();
    LOGGER.info("Exit value " + exitValue);
    if (Execute.isFailure(exitValue)) {
        LOGGER.info("Execute failed!");
        if (watchdog.killedProcess()) {
            LOGGER.info("Execute timed out!");
            throw new RuntimeException("Timed out waiting for profile to be created!");
        }
    }
    waitForFullProfileToBeCreated(20 * 1000);
}

Now I'm frequently getting 'Firefox seems to have ended on its own
(did we kill the real browser???)' when Firefox is still running.

If anyone can help me out here I'd really appreciate it. Patrick, do
you have a patch I could use temporarily for your profile reuse
option?

Cheers,
Dave

Patrick Lightbody

Aug 11, 2009, 10:07:06 PM
to selenium-...@googlegroups.com
Dave,
Can you highlight the changes you made to the method? It might also be
worth trying to create a simple, contained test case we can use to
reproduce the problem on our machines. I'd be happy to fix this bug if
you can make a test case for me.

As for my patch (which we use for BrowserMob), unfortunately it's not
something we can easily share. The reason is we have a very unique
need for launching browsers (a fixed number of browsers running
concurrently, with each one getting a different set of environment
variables). As such, the key that we use for re-using profiles is
unique for our use (DISPLAY=:X) and isn't something that would
necessarily apply for your usage :(

Patrick

Dave Hunt

Aug 12, 2009, 12:12:34 PM
to selenium-developers
I've attempted to highlight the changes below using '+' to indicate
new lines of code and 'm' to indicate a modified line. I rolled this
back and the other issue went away, so whatever I've done in this code
is bad.

Writing a test to reproduce this is a great idea, and I will work on
that as soon as I can get my head around it. At the moment my Selenium
Grid setup is offline but I should be able to at least reproduce the
'Firefox is already running' issue.

Cheers,
Dave.

  private void populateCustomProfileDirectory(String profilePath)
          throws IOException {
      /*
       * The first time we launch Firefox with an empty profile directory,
       * Firefox will launch itself, populate the profile directory, then
       * kill/relaunch itself, so our process handle goes out of date.
       * So, the first time we launch Firefox, we'll start it up at an URL
       * that will immediately shut itself down.
       */
      cmdarray = new String[]{browserInstallation.launcherFilePath(),
              "-profile", profilePath, "-chrome", CHROME_URL};
      LOGGER.info("Preparing Firefox profile...");
+     ExecuteWatchdog watchdog = new ExecuteWatchdog(30000L);
+     ExecuteStreamHandler pumpStreamHandler = new PumpStreamHandler();
+     Execute execute = new Execute(pumpStreamHandler, watchdog);
      execute.setCommandline(cmdarray);
+     LOGGER.info("Executing " + browserInstallation.launcherFilePath()
+             + " -profile " + profilePath + " -chrome " + CHROME_URL);
m     int exitValue = execute.execute();
+     LOGGER.info("Exit value " + exitValue);
+     if (Execute.isFailure(exitValue)) {
+         LOGGER.info("Execute failed!");
+         if (watchdog.killedProcess()) {
+             LOGGER.info("Execute timed out!");
+             throw new RuntimeException("Timed out waiting for profile to be created!");
+         }
+     }
      waitForFullProfileToBeCreated(20 * 1000);
  }

Dave Hunt

Aug 14, 2009, 12:51:33 PM
to selenium-developers
I've written a rather crude test for this with some success.
Unfortunately I couldn't get the FirefoxChromeLauncherIntegrationTest
to pass due to a NullPointerException at
org.openqa.selenium.server.browserlaunchers.LauncherUtils.extractHTAFile
(LauncherUtils.java:227) - full stack after sig.

Instead I added a test called
testLaunchMultipleBrowsersConcurrentlyWithDefaultConfiguration in
FirefoxCustomProfileLauncherIntegrationTest. This seems to have
problems launching 20 concurrent browsers. When we run our tests we
often get issues running just 5 concurrent browsers. This would either
imply a difference between FirefoxChrome and FirefoxCustomProfile
launchers, or perhaps there are issues with browsers closing while
others are starting..?
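
For anyone wanting to reproduce this outside the Selenium test suite, the shape of such a concurrency test can be sketched as follows. This is not the test I added: ConcurrentLaunch and countFailures are hypothetical names, and the launch step is assumed to be any Callable that throws when a browser fails to start. A latch holds all workers back so that the launches begin at the same moment.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical harness; not the test that exists in the Selenium RC sources. */
final class ConcurrentLaunch {

    /** Runs count copies of launch, released together; returns how many threw. */
    static int countFailures(final Callable<Void> launch, int count)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(count);
        final CountDownLatch startGate = new CountDownLatch(1);
        List<Future<Void>> results = new ArrayList<Future<Void>>();
        try {
            for (int i = 0; i < count; i++) {
                Callable<Void> gated = () -> {
                    startGate.await(); // hold every worker until all are queued
                    return launch.call();
                };
                results.add(pool.submit(gated));
            }
            startGate.countDown(); // release all workers at once
            int failures = 0;
            for (Future<Void> result : results) {
                try {
                    result.get();
                } catch (ExecutionException e) {
                    failures++; // this launch threw an exception
                }
            }
            return failures;
        } finally {
            pool.shutdownNow();
        }
    }
}
```

A test would then assert that countFailures returns 0 for, say, 5 or 20 launches.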

Let me know if you want me to provide a patch with this test. My
preference would be to get the FirefoxChrome launcher tests working.
The issue appears to be "/core/TestPrompt.html" not being on my
classpath. Any ideas? Are the tests passing for you? I'm running them in
Eclipse.

Feel like we're finally getting somewhere. I like reproducible
failures. :)

Cheers,
Dave.

java.lang.NullPointerException
    at java.io.Reader.<init>(Unknown Source)
    at java.io.InputStreamReader.<init>(Unknown Source)
    at org.openqa.selenium.server.browserlaunchers.LauncherUtils.extractHTAFile(LauncherUtils.java:227)
    at org.openqa.selenium.server.browserlaunchers.FirefoxChromeLauncher.copyRunnerHtmlFiles(FirefoxChromeLauncher.java:202)
    at org.openqa.selenium.server.browserlaunchers.FirefoxChromeLauncher.makeCustomProfile(FirefoxChromeLauncher.java:185)
    at org.openqa.selenium.server.browserlaunchers.FirefoxChromeLauncher.launch(FirefoxChromeLauncher.java:82)
    at org.openqa.selenium.server.browserlaunchers.LauncherFunctionalTestCase.launchBrowser(LauncherFunctionalTestCase.java:17)
    at org.openqa.selenium.server.browserlaunchers.FirefoxChromeLauncherIntegrationTest.testLauncherWithDefaultConfiguration(FirefoxChromeLauncherIntegrationTest.java:12)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at junit.framework.TestCase.runTest(TestCase.java:168)
    at junit.framework.TestCase.runBare(TestCase.java:134)
    at junit.framework.TestResult$1.protect(TestResult.java:110)
    at junit.framework.TestResult.runProtected(TestResult.java:128)
    at junit.framework.TestResult.run(TestResult.java:113)
    at junit.framework.TestCase.run(TestCase.java:124)
    at junit.framework.TestSuite.runTest(TestSuite.java:232)
    at junit.framework.TestSuite.run(TestSuite.java:227)
    at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:81)
    at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:45)
    at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:460)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:673)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:386)
    at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)



Dave Hunt

Aug 20, 2009, 4:46:57 AM
to selenium-developers
I've now replicated this launching just two Firefox profiles on a
virtual machine... I suspect the performance of the machine (and the
fact that virus scanning is running) increases the chance of this
happening.

Patrick Lightbody

Aug 21, 2009, 11:52:24 AM
to selenium-...@googlegroups.com
Dave - this is great news. Can you share the patch that includes the
test? Or, if you have SVN access on the new GCode location (you should
since you're part of the doc team) you can just check the test in
directly.

Dave Hunt

Aug 24, 2009, 5:31:31 AM
to selenium-developers
I've checked it in. Like I said, it's a rather crude test... See if you
can replicate my problem. Although I am not able to do so 100% of the
time, I do often get failures to launch the browser, which may show up
differently with the FirefoxChrome launcher. I was unable to get the
existing FirefoxChrome integration tests to run, so I wrote my test
against the FirefoxCustomProfile launcher.

Thanks,
Dave

Dave Hunt

Oct 12, 2009, 12:17:13 PM
to selenium-developers
Has anyone been able to replicate this? I'm having difficulty scaling
our Selenium solution as a result of this issue. I am able to run with
Grid with multiple instances of Firefox as long as I only run one
instance per VM. Obviously there's a lot more overhead if I'm running
an entire additional OS just for an additional instance of Firefox.
This is so easy for me to replicate - just starting 2 or 3 Firefox
instances at the same time on one VM...

Does anyone have some suggestions on the hardware/software I should be
using to scale up with Selenium Grid?

I've so far tried the following:
SunOS host with VirtualBox WinXP guests
WinXP host with VirtualBox WinXP guests
Ubuntu host with VirtualBox WinXP guests

The SunOS solution was by far the best. I still had an issue with
multiple concurrent Firefox instances, but it was a shared server and
ultimately VirtualBox was blamed (probably accurately) for causing
reliability issues with the server. We then had dedicated boxes (not
server spec though) for the other configurations, which I haven't been
particularly impressed with.

What have other people used? Unfortunately I haven't been able to
convince the CTO to try a cloud solution like Sauce Labs, so I think at
least for the time being I'm limited to an internal solution. In my
last job we had ~8 physical machines, which was better than my current
solutions but seems crazy to suggest. Is there a nice off-the-shelf
virtualisation solution, or some sort of rackable unit consisting of
multiple physical machines? I still believe my biggest bottleneck is
disk IO.

Any thoughts would be really appreciated.

Cheers,
Dave

Mark Collin

Oct 12, 2009, 1:35:20 PM
to selenium-...@googlegroups.com
We are currently running a grid over VMWare ESXi that seems to be performing reasonably well.

Each VM is running 2 browser instances and 2 Firefox RC instances that are purely for checking webmail (our system sends out e-mails after various functions are performed, which requires an e-mail check). In total we have 8 Firefox RC instances running purely to check e-mail as and when required.

On our system we manage two browsers testing in anger and two e-mail check RCs before we run out of memory on the VMs. Admittedly our VMs are not amazingly powerful, but they get the job done (eventually...). So our configuration looks like this:

VM1: Safari, IE6, 2 Firefox Mail Check
VM2: Opera, IE7, 2 Firefox Mail Check
VM3: Firefox, IE8, 2 Firefox Mail Check

etc.

Maybe it's related to the fact that you are not using a bare-metal hypervisor? I haven't tried it on VirtualBox myself, so I'm afraid I can't really offer any insights into common potential pitfalls.

Dave Hunt

Oct 13, 2009, 4:51:07 AM
to selenium-developers
Hi Mark,

Thanks for the information regarding your setup, although I'm not sure
I understand how your e-mail check works. What configurations do you
have for your VMs? Is memory your only bottleneck?

I have been using the following VM configurations:
VM1: Google Chrome, IE6, FF2
VM2: Safari, IE7, FF3
VM3: Opera, IE8, FF3.5

My intention was to have multiple FF3 RCs on the one VM, but this
comes up with issues as mentioned earlier in this discussion. As an
alternate solution I added VM4 and VM5 that were clones of VM3. Most
of the time I'm only using one browser, so three VMs are active but
appear to be rather sluggish. I'd like to get more performance out of
them and suspect that disk queue is my main bottleneck.

I currently have around 190 tests for one application, which take
about an hour to complete. For this to be acceptable I really want to
halve this time, so either need to double the number of RCs or
increase the throughput. I will also be adding tests from more
applications in the near future so really want something I can scale
comfortably.

I will look into VMWare ESXi today, and will look into suitable
hardware. Part of me feels that just buying ~6 small form factor PCs
would solve the problem nicely. I can definitely see the advantages of
virtualisation but the hardware specifications appear to be somewhat
of a dark art.

Thanks again,
Dave.

Sascha

Oct 15, 2009, 6:07:56 PM
to selenium-developers
Some months ago I also had a problem with
populateCustomProfileDirectory. I had a corrupt Firefox profile, and
Firefox just stopped executing but the browser was not killed, so my
tests stopped. At that time I gave this browser execution a timeout.
See http://jira.openqa.org/browse/SRC-664.

Some weeks later I also had the problem with parallel execution
and the "Firefox is already running" message. To work around this I added
a retry count of three to the code. I attached my current patch for
that to SRC-664. I am currently not getting this problem anymore, but I
am only using up to four browser instances on one machine. As it
now takes more than an hour to run my complete test suite, I am planning
to raise this again.

Sascha

Dave Hunt

Oct 16, 2009, 4:23:31 AM
to selenium-developers
Hey Sascha,

Thanks for the patch - I'm looking forward to trying it out as it's a
similar workaround to one I made myself but probably much cleaner...
That said, it is a workaround and I'm still keen on finding a way to
prevent launching Firefox before the profile is ready, otherwise the
issue may return depending on scale and resources.

Thanks again,
Dave.
