Selenium and XHTML

Alexei Barantsev

Feb 21, 2013, 10:47:55 AM

to selenium-...@googlegroups.com

Hi, all,

Currently, we can't use Selenium to deal with "true" XHTML pages that have the application/xhtml+xml content type.

The root cause is the inability to resolve XML namespaces (unless they are all declared on the top-level html element, which is usually not the case).

To make this possible, we need to add a new command to the API (and the protocol) to register a namespace resolver, for example:

driver.manage().namespaces().add("xhtml", "http://www.w3.org/1999/xhtml");

A new protocol command would be:

POST /session/:sessionId/namespace

URL Parameters:

:sessionId - ID of the session to route the command to.

JSON Parameters:

prefix - {string} The namespace prefix to set.

value - {string} The namespace value to set.
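
As a rough illustration of the proposed command, the request could be assembled as shown below. This is only a sketch: the endpoint is the one proposed in this message and is not part of any released protocol, and `build_namespace_command` and the session id "abc123" are hypothetical names used for the example.

```python
import json


def build_namespace_command(session_id, prefix, value):
    """Return (method, path, body) for the *proposed* namespace command.

    Hypothetical helper: the endpoint exists only as a proposal in this thread.
    """
    path = "/session/{}/namespace".format(session_id)
    # The two JSON parameters described in the proposal.
    body = json.dumps({"prefix": prefix, "value": value}, sort_keys=True)
    return "POST", path, body


method, path, body = build_namespace_command(
    "abc123", "xhtml", "http://www.w3.org/1999/xhtml")
print(method, path)
print(body)
```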

I'm asking for approval to implement this functionality, and for help implementing it in all supported browsers and languages.

Regards,

--

Alexei Barantsev

Software-Testing.Ru

Selenium2.Ru

Oscar Rieken

Feb 21, 2013, 10:50:59 AM

to selenium-...@googlegroups.com

Do you have an example page you can point to that shows what you mean by "true" XHTML?

Also, why would you need to add namespaces from your tests?

Is it a problem with locating nodes in the document?

--
You received this message because you are subscribed to the Google Groups "Selenium Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to selenium-develo...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Alexei Barantsev

Feb 21, 2013, 11:01:24 AM

to selenium-...@googlegroups.com

For example: http://www.orcca.on.ca/~elena/useful/AboutMathML/MathML.xhtml

Try to find the "math" element on this page with an XPath locator.

By.xpath("//math") -- returns empty list

By.xpath("//MathML:math") -- throws exception because MathML prefix can't be resolved
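
The same two failure modes can be reproduced outside a browser with Python's standard xml.etree.ElementTree module, which supports a small XPath subset. This is a sketch only: the document below is a made-up stand-in for the page (its actual markup is not shown in this thread), and ElementTree is not what Selenium uses internally.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Made-up stand-in for the page: the MathML namespace is declared on a
# nested element rather than on the root <html> element.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <MathML:math xmlns:MathML="http://www.w3.org/1998/Math/MathML">
      <MathML:mi>x</MathML:mi>
    </MathML:math>
  </body>
</html>"""

root = ET.fromstring(DOC)

# Unprefixed name: matches only elements in no namespace, so nothing is found.
unprefixed = root.findall(".//math")

# Prefixed name without a prefix-to-URI mapping: the prefix cannot be resolved.
try:
    root.findall(".//MathML:math")
    unresolved_error = None
except SyntaxError as exc:
    unresolved_error = exc

# Prefixed name with an explicit mapping: the element is found.
resolved = root.findall(".//MathML:math", namespaces={"MathML": MATHML_NS})

print(len(unprefixed), unresolved_error, len(resolved))
```

The explicit `namespaces` mapping plays the role that a namespace resolver would play for the browser's XPath evaluation.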

Regards,

--

Alexei Barantsev

Software-Testing.Ru

Selenium2.Ru


Oscar Rieken

Feb 21, 2013, 11:36:37 AM

to selenium-...@googlegroups.com

So I played around with this in irb using Firefox (which supports XHTML according to the text in the source of the page):

http://pastie.org/6299911

I used find_element(tag_name: 'math') and that worked just fine.

find_elements(tag_name: 'mo') returned a collection of 'mo' elements as well.

There were some issues with getting text.

But when I tried it with Chrome

http://pastie.org/6300121

it worked just fine. It might be an issue with Selenium 2.30 and Firefox 18.0.2, but you can locate and interact with those elements.

My recommendation: never use XPath.

Reading up on how to use Selenium may help as well.

Thanks

Oscar


David Burns

Feb 21, 2013, 11:37:25 AM

to selenium-...@googlegroups.com

My gut feel is that this shouldn't be our issue. If it's not valid XHTML then it should be treated as HTML; if it works, it works, and if it doesn't, the page developer should make it valid and we work from there.

David


--
David Burns
Email: david...@theautomatedtester.co.uk
URL: http://www.theautomatedtester.co.uk/

Alexei Barantsev

Feb 21, 2013, 12:36:13 PM

to selenium-...@googlegroups.com

It's a totally valid document. You're allowed to declare a namespace on any node, and it defines the context for that node and its descendant nodes.
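
To illustrate the scoping rule, here is a small sketch using Python's standard xml.etree.ElementTree parser. The document is a made-up example, not the page discussed above. A default namespace declared on the nested math element applies to that element and its descendants, while its siblings stay in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

# Made-up example: the second default namespace is declared on <math>,
# not on the root element.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>before</p>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mrow><mi>x</mi></mrow>
    </math>
    <p>after</p>
  </body>
</html>"""

root = ET.fromstring(DOC)

# ElementTree stores each tag as "{namespace-uri}local-name".
namespace_by_local_name = {}
for element in root.iter():
    uri, _, local = element.tag[1:].partition("}")
    namespace_by_local_name[local] = uri

for local, uri in namespace_by_local_name.items():
    print(local, "->", uri)
```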

Alexei.


Alexei Barantsev

Feb 21, 2013, 12:40:32 PM

to selenium-...@googlegroups.com

Yes, I know one can use the tag name or even CSS selector location strategies.

But we claim support for XPath too, and the recommendation "never use xpath" is too strong a message.

It sounds especially absurd when we talk about XHTML, which is essentially XML and should work smoothly with other XML-related technologies, of which XPath is one.

Alexei.


Oscar Rieken

Feb 21, 2013, 12:54:47 PM

to selenium-...@googlegroups.com

I say never use XPath because in my experience it's too inconsistent across all supported browsers. Yes, it is supported, and yes, you can use it, but it's not always the best solution.


David Burns

Feb 21, 2013, 2:41:00 PM

to selenium-...@googlegroups.com

OK, can you wait a little bit so we can define behaviours? I am waiting for Andreas to reply to a month-old email about WebDriver and XML. (Calling Andreas out on it to prompt a reply.)

If it's valid then I think we should handle this situation implicitly rather than add methods to support searching in edge cases.


Alexei Barantsev

Mar 8, 2013, 9:51:07 AM

to selenium-...@googlegroups.com

Any news on the subject?


David Burns

Mar 8, 2013, 10:32:45 AM

to selenium-...@googlegroups.com

No. I am assuming that this mailing list is low on Andreas' list. I pinged him on IRC and am waiting for an answer.

David Burns
URL: http://www.theautomatedtester.co.uk/


Andreas Tolf Tolfsen

Mar 11, 2013, 6:03:14 AM

to selenium-...@googlegroups.com

On Thu, Feb 21, 2013 at 09:36:13AM -0800, Alexei Barantsev wrote:
> On Thursday, February 21, 2013 8:37:25 PM UTC+4, David Burns wrote:
> >
> > My gut feel is that this shouldn't be our issue. If its not valid
> > XHTML then it should be treated as HTML and if it works it does and
> > if it doesn't then the page developer should make it valid and we
> > work from there.
>

> It's a totally valid document. You're allowed to specify namespace in
> any node, and it will define context to this node and its descendant
> nodes.

Not according to the W3C Validator service:

http://validator.w3.org/check?uri=http%3A%2F%2Fwww.orcca.on.ca%2F~elena%2Fuseful%2FAboutMathML%2FMathML.xhtml&charset=%28detect+automatically%29&doctype=Inline&group=0

If the validator is to be believed, this isn't parsed in strict XHTML
parser mode by any web browser, but rather in quirks mode.

Andreas Tolf Tolfsen

Mar 11, 2013, 8:13:23 AM

to selenium-...@googlegroups.com

On Thu, Feb 21, 2013 at 07:47:55AM -0800, Alexei Barantsev wrote:
> Currently, we can't use Selenium to deal with "true" XHTML pages that
> have application/xhtml+xml content type. The root cause is inability
> to resolve XML namespaces (unless they all are specified in the
> top-level html element that is usually not the case).

So basically we must provide a custom namespace resolver to
document.evaluate(…), similar to the one we have for SVG currently.

> To allow this ability we need to add a new command to the API (and the
> protocol) to add a namespace resolver, like
>
> driver.manage().namespaces().add("xhtml", "http://www.w3.org/1999/xhtml");

If we can avoid it, my preference would be to do this transparently by
either providing a static list of known XML namespaces (extending our
current list to also include a few more), or to parse the DOM to find
out what namespaces are used and include all of them in the XPath
evaluation call.

The latter option of autodetecting used namespaces may have performance
implications depending on how and where we check, and on the size and
complexity of the DOM.

Secondly, we'd need an algorithm for parsing DOM nodes with the xmlns
attribute since the namespace key from the attribute's value (typically
the last part of the path in the URL) can be overridden like this:

xmlns:CUSTOMNS="http://example.org/foons"

The autodetection of namespaces could either be done on document load or
on the first (or potentially all) calls to find_element[s]_by_xpath().
Because most documents these days load content asynchronously, always
checking would be the safest, but might also have the biggest
performance implication.
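
A minimal sketch of the autodetection idea, written with Python's standard library rather than in the browser: it walks a serialized document once and collects every namespace declaration, including custom prefixes such as CUSTOMNS. The function name `collect_namespaces` and the sample document are assumptions made for the example, and an in-browser implementation would have to inspect the live DOM instead.

```python
import io
import xml.etree.ElementTree as ET


def collect_namespaces(xml_text):
    """Return {prefix: uri} for every namespace declared in the document.

    The default namespace is reported under the empty prefix "". If a
    prefix is declared more than once, the first declaration wins here;
    a real implementation would have to decide how to handle conflicts.
    """
    found = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    # "start-ns" events carry a (prefix, uri) pair for each declaration.
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        found.setdefault(prefix, uri)
    return found


DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <CUSTOMNS:foo xmlns:CUSTOMNS="http://example.org/foons"/>
  </body>
</html>"""

namespaces = collect_namespaces(DOC)
print(namespaces)
```

The resulting mapping is the kind of table a custom resolver could consult. The cost of building it grows with document size, which is the performance concern raised above.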

If none of this is feasible, I'd consider supporting Alexei's proposal
for a new API call.

Benjamin Hawkes-Lewis

Mar 11, 2013, 8:35:29 AM

to selenium-...@googlegroups.com

On 21 February 2013 15:47, Alexei Barantsev <bara...@gmail.com> wrote:
> Currently, we can't use Selenium to deal with "true" XHTML pages that have
> application/xhtml+xml content type.
> The root cause is inability to resolve XML namespaces (unless they all are
> specified in the top-level html element that is usually not the case).

Do you have a test case that demonstrates this?

Do XPath expressions of this form:

".//*[namespace-uri(.) = 'http://www.w3.org/1998/Math/MathML' and
local-name(.) = 'mi']"

not work?
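
For comparison, here is a sketch of the same prefix-free idea in Python's standard xml.etree.ElementTree. Its XPath subset does not implement namespace-uri() or local-name(), so the example matches on the fully qualified "{uri}local" name instead. The sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Made-up document with the MathML namespace declared on a nested element.
DOC = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <math xmlns="http://www.w3.org/1998/Math/MathML">
      <mrow><mi>x</mi><mo>+</mo><mi>y</mi></mrow>
    </math>
  </body>
</html>"""

root = ET.fromstring(DOC)

# "{uri}local" in the path needs no prefix mapping at all.
by_qualified_name = root.findall(".//{%s}mi" % MATHML_NS)

# Equivalent explicit filter on namespace URI plus local name.
by_filter = [el for el in root.iter() if el.tag == "{%s}mi" % MATHML_NS]

print([el.text for el in by_qualified_name])
```

Either form avoids the need for a resolver, at the cost of more verbose locators.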

--
Benjamin Hawkes-Lewis
