A general way to expose browser-vendor-specific commands in WebDriver client Java bindings


Shuotao Gao

Oct 2, 2014, 7:44:18 PM
to selenium-...@googlegroups.com, st...@chromium.org, busta...@chromium.org, vlotos...@gmail.com, Simon Stewart, sam...@chromium.org
Hi selenium developers,

I would like to propose a general way to expose browser-vendor-specific commands in WebDriver client Java bindings as below.
In ChromeDriver, we have a number of Chrome-specific commands, such as launching a Chrome App, taking a JavaScript heap snapshot, and starting/stopping JavaScript CPU profiling.
Because these commands are not directly exposed in the WebDriver client, these features are neither obvious nor easily accessible to ChromeDriver users.

If the approach below looks OK overall, I would be happy to send a pull request and discuss implementation details through code review.

1. Refactor java/client/src/org/openqa/selenium/remote/HttpCommandExecutor.java and add the following method:


    void registerCommand(String commandName, String endpointPath, String httpMethod) {
      // Keep the first registration; later calls for the same name are no-ops.
      if (this.customizedNameToUrl.containsKey(commandName)) return;
      if ("get".equalsIgnoreCase(httpMethod)) {
        this.customizedNameToUrl.put(commandName, get(endpointPath));
      } else if ("post".equalsIgnoreCase(httpMethod)) {
        this.customizedNameToUrl.put(commandName, post(endpointPath));
      } else if ("delete".equalsIgnoreCase(httpMethod)) {
        this.customizedNameToUrl.put(commandName, delete(endpointPath));
      } else {
        throw new RuntimeException("Unsupported http method: " + httpMethod);
      }
    }


2. As a browser vendor, we could add a BROWSERUtil.java (e.g., java/client/src/org/openqa/selenium/chrome/ChromeUtil.java):


   static boolean LaunchApp(RemoteWebDriver driver, String appId) {
       // getCommandExecutor() returns a CommandExecutor, so a cast is needed.
       HttpCommandExecutor executor = (HttpCommandExecutor) driver.getCommandExecutor();
       // Register the vendor-specific endpoint (a no-op if it is already registered).
       executor.registerCommand("chromium-launch-app", "/session/:sessionId/chromium/launch_app", "post");
       Map<String, String> params = new HashMap<String, String>();
       params.put("id", appId);
       Command cmd = new Command(driver.getSessionId(), "chromium-launch-app", params);
       Response response = executor.execute(cmd);
       // Parse response and return the result...
       return true;
   }


3. Usage by end users:
DesiredCapabilities capabilities = DesiredCapabilities.chrome();
RemoteWebDriver driver = new RemoteWebDriver(new URL("http://127.0.0.1:9515"), capabilities);
boolean result = ChromeUtil.LaunchApp(driver, "chrome-app-id");


Pros:
1. As most Driver classes (all except HtmlUnitDriver?) extend RemoteWebDriver, this solution applies to most driver instances, no matter how they are created.
    In some frameworks, the driver instance is injected into the test code, or it is not easy for the user to control its creation and set additional commands through HttpCommandExecutor.
2. Compared to setting additional commands through HttpCommandExecutor, this solution makes it possible to parse the response from the remote end and return a decoded result to the test. E.g., for a heap snapshot, return an instance of a HeapSnapshot class instead of a map in JSON format.
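
To illustrate the second point, here is a minimal sketch of decoding a raw response map into a typed object. The class name HeapSnapshotSummary and the field names node_count and edge_count are made up for illustration; they are not ChromeDriver's actual payload.

```java
import java.util.Map;

// Hypothetical typed result; field names are illustrative only.
final class HeapSnapshotSummary {
    final long nodeCount;
    final long edgeCount;

    HeapSnapshotSummary(long nodeCount, long edgeCount) {
        this.nodeCount = nodeCount;
        this.edgeCount = edgeCount;
    }

    // Converts the loosely typed map a JSON decoder would produce
    // into a typed object the test code can use directly.
    static HeapSnapshotSummary fromMap(Map<String, Object> raw) {
        return new HeapSnapshotSummary(asLong(raw, "node_count"), asLong(raw, "edge_count"));
    }

    // Fails loudly if a field is absent or not a number.
    private static long asLong(Map<String, Object> raw, String key) {
        Object value = raw.get(key);
        if (!(value instanceof Number)) {
            throw new IllegalArgumentException("Missing or non-numeric field: " + key);
        }
        return ((Number) value).longValue();
    }
}
```

A BROWSERUtil method could then return such an object instead of handing the raw map back to the test.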

Cons:
1. In BROWSERUtil.java, there might be duplicate code for different commands. But this could be resolved through some engineering work.

Thanks,
Shuotao Gao

Andreas Tolfsen

Oct 3, 2014, 6:27:33 AM
to selenium-...@googlegroups.com, st...@chromium.org, busta...@chromium.org, vlotos...@gmail.com, Simon Stewart, sam...@chromium.org
On Fri, Oct 3, 2014 at 12:44 AM, Shuotao Gao <gaosh...@gmail.com> wrote:
> I would like to propose a general way to expose browser-vendor-specific
> commands in WebDriver client Java bindings as below.
>
> In ChromeDriver, we have a bunch of chrome-specific commands, like launching
> a Chrome App, getting Javascript Heap Snapshot, starting/stopping Javascript
> CPU profiling, etc.
>
> Because these commands are not directly exposed in the WebDriver client,
> these features are neither obvious nor easily accessible to ChromeDriver
> users.

I haven't worked on the Java bindings for some time, but have you
considered approaching this using role based interfaces and an
augmenter?
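
For readers unfamiliar with the idea, a rough sketch of a role-based interface added by an augmenter-style proxy might look like the following. BasicDriver, LaunchesApps, and RoleAugmenter are hypothetical stand-ins written for this example; they are not Selenium's classes.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

// Hypothetical minimal driver type standing in for a remote driver.
interface BasicDriver {
    String execute(String commandName);
}

// Hypothetical role interface: only drivers that can launch apps expose it.
interface LaunchesApps {
    String launchApp(String appId);
}

final class RoleAugmenter {
    // Returns a proxy implementing BasicDriver plus LaunchesApps.
    // The role method is translated into a command on the wrapped driver;
    // every other call is forwarded unchanged.
    static Object augment(final BasicDriver driver) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("launchApp")) {
                return driver.execute("launch-app:" + args[0]);
            }
            return method.invoke(driver, args);
        };
        return Proxy.newProxyInstance(
            RoleAugmenter.class.getClassLoader(),
            new Class<?>[] {BasicDriver.class, LaunchesApps.class},
            handler);
    }
}
```

Test code would check `augmented instanceof LaunchesApps` and cast before calling the role method.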

Shuotao Gao

Oct 3, 2014, 5:17:41 PM
to selenium-...@googlegroups.com, st...@chromium.org, busta...@chromium.org, Seva Lo, Simon Stewart, sam...@chromium.org, David Burns
Yes, I tried that in an early pull request: https://github.com/SeleniumHQ/selenium/pull/131
But judging by the comments on that pull request, it does not seem to be the preferred solution.


--
You received this message because you are subscribed to the Google Groups "Selenium Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to selenium-develo...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/selenium-developers/CAL_dnaVADk17A4KYF1-%2BejC%3DQTA5W1pTKm95%2ByQN7bGjrE%2BcBg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.

Jason Leyba

Oct 3, 2014, 6:45:21 PM
to selenium-...@googlegroups.com, st...@chromium.org, busta...@chromium.org, vlotos...@gmail.com, Simon Stewart, sam...@chromium.org
Everything is already there to define new commands:

new HttpCommandExecutor(
    ImmutableMap.of(
        "chromium-launch-app",
        new CommandInfo("/session/:sessionId/chromium/launch_app", HttpMethod.POST)),
    serverUrl);

// or...

new HttpCommandExecutor(serverUrl) {{
  defineCommand("chromium-launch-app", new CommandInfo("/session/:sessionId/chromium/launch_app", HttpMethod.POST));
}};

This is already used by Selendroid:


David Burns

Oct 3, 2014, 8:44:51 PM
to selenium-...@googlegroups.com
When I previously broached the subject of vendors contributing vendor-specific code to the project, I was told that vendor code is not welcome and that vendors should host modules like this themselves or on package management sites. My example was handing over the transport code for speaking directly to Firefox. This is why I made the comment I did in that PR.

Has the project changed their stance on this?

David


Shuotao Gao

Oct 6, 2014, 9:05:04 PM
to selenium-...@googlegroups.com
I also looked at constructing HttpCommandExecutor with new commands, but I see the following drawbacks, which is why I wanted to try the new solution:
- HttpCommandExecutor won't work well if the user cannot easily control the creation of the driver instance. Some infrastructure wraps the API for creating driver instances, or injects the driver instance directly into the test code at runtime.
- If we follow the style in SelendroidCommandExecutor.java, it seems impossible for users to make it work with RemoteWebDriver, which is used more often than the ChromeDriver class.

The main difference in the new solution is that it allows injecting the new command while the test is running, rather than when the driver instance is constructed.
That way, it can work around the limitations above.
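
A minimal sketch of that difference, using a hypothetical CommandTable class as a stand-in for the executor's command map (it is not Selenium's HttpCommandExecutor): commands can be supplied at construction time, or registered later on an instance that someone else created.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an executor's command table.
final class CommandTable {
    static final class Route {
        final String method;
        final String path;

        Route(String method, String path) {
            this.method = method;
            this.path = path;
        }
    }

    private final Map<String, Route> routes = new ConcurrentHashMap<String, Route>();

    // Construction-time definition: requires control over instance creation.
    CommandTable(Map<String, Route> initial) {
        routes.putAll(initial);
    }

    // Run-time registration: works on an injected instance.
    // Idempotent; returns false if the name was already registered.
    boolean register(String name, String method, String path) {
        Route route = new Route(method.toUpperCase(Locale.ROOT), path);
        return routes.putIfAbsent(name, route) == null;
    }

    // Resolves a command name to "METHOD /path" for a given session.
    String resolve(String name, String sessionId) {
        Route route = routes.get(name);
        if (route == null) {
            throw new IllegalArgumentException("Unknown command: " + name);
        }
        return route.method + " " + route.path.replace(":sessionId", sessionId);
    }
}
```

With run-time registration, a utility method can call register() every time it runs and rely on the second and later calls being no-ops.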

If we don't want to support using RemoteWebDriver with vendor-specific commands, we could go with constructing HttpCommandExecutor with new commands, but that forces infrastructure and end users to adapt to it.

Shuotao Gao

Oct 6, 2014, 9:22:27 PM
to selenium-...@googlegroups.com
It does not seem ideal to ask end users to download additional packages/libraries besides the Selenium ones.
What is the main reason for wanting vendor code to be hosted by the vendors themselves?
As java/client/src/org/openqa/selenium/[chrome|firefox|ie|safari|android|etc]/ are already browser-specific, could we just add the code for vendor-specific commands to those directories?

It seems better to maintain both parts together:
- No additional download step for end users.
- If a change on the Selenium side breaks vendor-specific code, we would discover it right away by running all tests together in Selenium.

David Burns

Oct 9, 2014, 9:14:05 AM
to selenium-...@googlegroups.com
On Tue, Oct 7, 2014 at 2:22 AM, Shuotao Gao <gaosh...@gmail.com> wrote:
> It does not seem ideal to ask end users to download additional packages/libraries besides the Selenium ones.
> What is the main reason for wanting vendor code to be hosted by the vendors themselves?
> As java/client/src/org/openqa/selenium/[chrome|firefox|ie|safari|android|etc]/ are already browser-specific, could we just add the code for vendor-specific commands to those directories?
>
> It seems better to maintain both parts together:
> - No additional download step for end users.
> - If a change on the Selenium side breaks vendor-specific code, we would discover it right away by running all tests together in Selenium.

This was the argument I put back to the project. Perhaps the other committers would care to comment on this.

David

Simon Stewart

Jan 1, 2015, 4:43:03 PM
to selenium-developers
Jumping in a little late to the conversation (yay holiday email catchups).

The main reason for wanting browser vendors to be responsible for their own code is that it should be possible to release the browser-specific code independently, without forcing everyone to do an update. Put another way, the Selenium project offers the APIs that are shared between all browsers, and the glue that binds everything together (notably an implementation of the local end of the spec, and two intermediate nodes (the server and grid), as well as IDE, which uses the APIs offered by the local end).

The only reason we release as often as we do right now is that we've got binary dependencies in the Firefox driver that we can't work around yet. The IEDriver and the existing ChromeDriver are already on their own release schedules.

The spec minimizes the risk of a change in Selenium breaking browser vendors' code. In the Java world, for better or worse, everyone grabs deps from Maven Central. If we define our own Maven pom.xml that depends on the latest released versions of each of the vendors' projects, we'd solve the "download" problem too. A similar approach for other languages would also work well.

Simon

Seva Lo

Apr 1, 2015, 11:35:27 PM
to selenium-...@googlegroups.com
Good discussion!

I think we have two distinct questions/decisions to make here. I propose we don't block one on the other.

Question 1. Whether/when to ask/force/make browser vendors host vendor-specific code outside of Selenium.
That is a nice and big one. I started a separate thread to focus the decision making and the work on that.

Question 2. Whether to allow browser vendors to define browser-specific commands in the vendor-specific parts of the client bindings.

Regardless of where the vendor-specific code lives, vendor-specific client bindings need a way to expose the custom commands to users. Hopefully that can happen in a consistent way across browsers/vendors, but it has to happen in some way, because Firefox, Chrome, PhantomJS and probably most others already have, or will eventually have, custom commands to expose.

(If I am going too far and never allowing custom commands in any form in vendor-specific client bindings *is* an option, I am happy to be corrected here.)

Now assuming that we have to have *some* way to expose custom commands in the bindings:

The Java, Python, JavaScript and .Net bindings currently provide a standard API for a driver to define custom commands (not sure about Ruby; I don't speak Ruby). There are some usages of that API, inside and outside the project: Selenium/JavaScript/PhantomJS, Selenium/.Net/PhantomJS, Selendroid (Java).

TL;DR I propose we allow all browser vendors to do what the above examples already do.

Can we please do a vote?

Agree, disagree, proposal is unclear, have different proposal, need more time to think, don't care (default)?

Here are more details about what I propose.

Simply using that already available API gets us as far as having those commands supported by the 'RemoteWebDriver'. That is enough to make things usable. Then FirefoxDriver, ChromeDriver, etc. would add public methods for the custom commands.

At this point, users who use a FirefoxDriver-like class, or the RemoteWebDriver class with the Selenium Java Server as the remote end, are covered.

Users who want to use the RemoteWebDriver client and connect directly to the driver can use a custom CommandExecutor (it seems that this approach will work at least with Java, Python, JavaScript and .Net). I think this is good enough.
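
As a rough illustration of the shape I have in mind, here is a self-contained sketch. GenericDriver, VendorDriver, the command name, and the route string are all hypothetical and stand in for RemoteWebDriver, a ChromeDriver-like class, and a real endpoint; the transport is a plain function so that the example has no dependencies.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

// Hypothetical stand-in for a generic remote driver.
class GenericDriver {
    private final Map<String, String> routes = new HashMap<String, String>();
    private final BiFunction<String, Map<String, Object>, Object> transport;

    GenericDriver(BiFunction<String, Map<String, Object>, Object> transport) {
        this.transport = transport;
    }

    // Lets subclasses map a command name to a route.
    protected void defineCommand(String name, String route) {
        routes.put(name, route);
    }

    // Looks up the route and hands it to the transport with the parameters.
    protected Object execute(String name, Map<String, Object> params) {
        String route = routes.get(name);
        if (route == null) {
            throw new IllegalArgumentException("Unknown command: " + name);
        }
        return transport.apply(route, params);
    }
}

// Hypothetical vendor driver: registers its custom command once
// and exposes it as an ordinary typed public method.
class VendorDriver extends GenericDriver {
    VendorDriver(BiFunction<String, Map<String, Object>, Object> transport) {
        super(transport);
        defineCommand("vendor-launch-app", "POST /session/:sessionId/vendor/launch_app");
    }

    public boolean launchApp(String appId) {
        Map<String, Object> params = new HashMap<String, Object>();
        params.put("id", appId);
        return Boolean.TRUE.equals(execute("vendor-launch-app", params));
    }
}
```

Users of the vendor class only see launchApp(); the command name and route stay an implementation detail of the vendor-specific bindings.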

This is basically the approach taken by https://github.com/SeleniumHQ/selenium/pull/168 (minus the unnecessary changes to the common webdriver/remote/webdriver.py and selenium/remote/HttpCommandExecutor.java).

This is NOT the approach taken in https://github.com/SeleniumHQ/selenium/pull/131, which may be good for adding a new cross-browser command to the WebDriver protocol (that should go through w3c/webdriver), but not for adding a vendor-specific one.

Thank you,
Seva
