Selenium and XHTML


Alexei Barantsev

Feb 21, 2013, 10:47:55 AM
to selenium-...@googlegroups.com
Hi, all,

Currently, we can't use Selenium with "true" XHTML pages, that is, pages served with the application/xhtml+xml content type.
The root cause is an inability to resolve XML namespaces (unless they are all declared on the top-level html element, which is usually not the case).

To make this possible, we need to add a new command to the API (and the protocol) that registers a namespace resolver, for example:

    driver.manage().namespaces().add("xhtml", "http://www.w3.org/1999/xhtml");

A new protocol command would be:

    POST /session/:sessionId/namespace
    URL Parameters:
      :sessionId - ID of the session to route the command to.
    JSON Parameters:
      prefix - {string} The namespace prefix to set.
      value - {string} The namespace value to set.
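
As a rough illustration of what a client would send for the proposed command, here is a minimal Python sketch. The endpoint and the `prefix`/`value` parameter names come from the proposal above; the helper name `build_namespace_command` and the example session ID are hypothetical, and nothing here exists in any released Selenium client.

```python
import json


def build_namespace_command(session_id, prefix, value):
    """Build the HTTP method, path, and JSON body for the proposed command.

    Hypothetical helper: the /namespace endpoint is only a proposal in this
    thread and is not part of any released WebDriver protocol.
    """
    path = "/session/{}/namespace".format(session_id)
    body = json.dumps({"prefix": prefix, "value": value}, sort_keys=True)
    return "POST", path, body


method, path, body = build_namespace_command(
    "example-session-id", "xhtml", "http://www.w3.org/1999/xhtml"
)
print(method, path)
print(body)
```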

I'm asking for approval to implement this functionality, and for help implementing it in all supported browsers and languages.

Regards,
-- 
Alexei Barantsev
Software-Testing.Ru
Selenium2.Ru

Oscar Rieken

Feb 21, 2013, 10:50:59 AM
to selenium-...@googlegroups.com
Do you have an example page you can point to for what you mean by "true" XHTML?
Also, why would you need to add namespaces from your tests?
Is it a problem with locating nodes in the document?


--
You received this message because you are subscribed to the Google Groups "Selenium Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to selenium-develo...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

Alexei Barantsev

Feb 21, 2013, 11:01:24 AM
to selenium-...@googlegroups.com
For example: http://www.orcca.on.ca/~elena/useful/AboutMathML/MathML.xhtml
Try to find the "math" element on this page with an XPath locator.
By.xpath("//math") -- returns empty list
By.xpath("//MathML:math") -- throws exception because MathML prefix can't be resolved
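
The same two failure modes can be reproduced outside the browser. The sketch below uses Python's standard `xml.etree.ElementTree` on a small inline document (a stand-in for the MathML page, not a copy of it). ElementTree's path language is only a subset of XPath and is not what the drivers run, but it shows the same namespace behaviour: an unprefixed name does not match a namespaced element, an unbound prefix is an error, and a prefix bound by the caller works.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Stand-in document: the MathML namespace is declared on the <math> element
# itself rather than on the top-level <html> element.
DOC = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi></math>
  </body>
</html>
"""

root = ET.fromstring(DOC)

# 1. Unprefixed name: only matches elements in no namespace, so nothing is found.
unprefixed = root.findall(".//math")

# 2. Prefix with no binding supplied: ElementTree raises SyntaxError.
try:
    root.findall(".//MathML:math")
    unbound_error = None
except SyntaxError as exc:
    unbound_error = exc

# 3. Prefix bound by the caller: the element is found.
bound = root.findall(".//MathML:math", {"MathML": MATHML_NS})

print(len(unprefixed), type(unbound_error).__name__, len(bound))
```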

Regards,
-- 
Alexei Barantsev
Software-Testing.Ru
Selenium2.Ru


Oscar Rieken

Feb 21, 2013, 11:36:37 AM
to selenium-...@googlegroups.com
So I played around with this in irb using Firefox (which supports XHTML, according to the text in the source of the page).

I used find_element(tag_name: 'math') and that worked just fine.
find_elements(tag_name: 'mo') returned a collection of 'mo' elements as well.
There were some issues with getting text.

But when I tried it with Chrome, it worked just fine. It might be an issue with Selenium 2.30 and Firefox 18.0.2, but you can locate and interact with those elements.

My recommendation: never use XPath.
Reading up on how to use Selenium may help as well.

Thanks
Oscar



David Burns

Feb 21, 2013, 11:37:25 AM
to selenium-...@googlegroups.com
My gut feeling is that this shouldn't be our issue. If it's not valid XHTML, then it should be treated as HTML: if it works, it works, and if it doesn't, the page developer should make it valid and we work from there.

David
 
 


--
David Burns
Email: david...@theautomatedtester.co.uk
URL: http://www.theautomatedtester.co.uk/

Alexei Barantsev

Feb 21, 2013, 12:36:13 PM
to selenium-...@googlegroups.com
It's a totally valid document. You're allowed to declare a namespace on any node, and the declaration applies to that node and its descendants.

Alexei.

Alexei Barantsev

Feb 21, 2013, 12:40:32 PM
to selenium-...@googlegroups.com
Yes, I know one can use the tag name or even the CSS selector location strategies.

But we claim support for XPath too, and the recommendation "never use xpath" is too strong a message.
It sounds especially absurd when we're talking about XHTML, which is essentially XML and should work smoothly with other XML-related technologies, XPath being one of them.

Alexei.
 
 

Oscar Rieken

Feb 21, 2013, 12:54:47 PM
to selenium-...@googlegroups.com
I say never use XPath because, in my experience, it's too inconsistent across all supported browsers. Yes, it is supported, and yes, you can use it, but it's not always the best solution.



David Burns

Feb 21, 2013, 2:41:00 PM
to selenium-...@googlegroups.com
OK, can you wait a little bit so we can define behaviours? I am waiting for Andreas to reply to a month-old email about WebDriver and XML. (Calling Andreas out on it to prompt a reply.)

If it's valid, then I think we should handle this situation implicitly and not add methods to allow searching on edge cases.

 
 

Alexei Barantsev

Mar 8, 2013, 9:51:07 AM
to selenium-...@googlegroups.com
Any news on the subject?

 
 

David Burns

Mar 8, 2013, 10:32:45 AM
to selenium-...@googlegroups.com
No. I am assuming that this mailing list is low on Andreas' list. I pinged him on IRC and am waiting for an answer.





Andreas Tolf Tolfsen

Mar 11, 2013, 6:03:14 AM
to selenium-...@googlegroups.com
On Thu, Feb 21, 2013 at 09:36:13AM -0800, Alexei Barantsev wrote:
> On Thursday, February 21, 2013 8:37:25 PM UTC+4, David Burns wrote:
> >
> > My gut feel is that this shouldn't be our issue. If its not valid
> > XHTML then it should be treated as HTML and if it works it does and
> > if it doesn't then the page developer should make it valid and we
> > work from there.
>
> It's a totally valid document. You're allowed to specify namespace in
> any node, and it will define context to this node and its descendant
> nodes.

Not according to the W3C Validator service:

http://validator.w3.org/check?uri=http%3A%2F%2Fwww.orcca.on.ca%2F~elena%2Fuseful%2FAboutMathML%2FMathML.xhtml&charset=%28detect+automatically%29&doctype=Inline&group=0

If the validator is to be believed, this isn't parsed in strict XHTML
parser mode by any web browser, but rather in quirks mode.

Andreas Tolf Tolfsen

Mar 11, 2013, 8:13:23 AM
to selenium-...@googlegroups.com
On Thu, Feb 21, 2013 at 07:47:55AM -0800, Alexei Barantsev wrote:
> Currently, we can't use Selenium to deal with "true" XHTML pages that
> have application/xhtml+xml content type. The root cause is inability
> to resolve XML namespaces (unless they all are specified in the
> top-level html element that is usually not the case).

So basically we must provide a custom namespace resolver to
document.evaluate(…), similar to the one we have for SVG currently.

> To allow this ability we need to add a new command to the API (and the
> protocol) to add a namespace resolver, like
>
> driver.manage().namespaces().add("xhtml", "http://www.w3.org/1999/xhtml");

If we can avoid it, my preference would be to do this transparently by
either providing a static list of known XML namespaces (extending our
current list to also include a few more), or to parse the DOM to find
out what namespaces are used and include all of them in the XPath
evaluation call.

The latter option of autodetecting used namespaces may have performance
implications depending on how and where we check, and on the size and
complexity of the DOM.

Secondly, we'd need an algorithm for parsing DOM nodes with the xmlns
attribute since the namespace key from the attribute's value (typically
the last part of the path in the URL) can be overridden like this:

xmlns:CUSTOMNS="http://example.org/foons"

The autodetection of namespaces could either be done on document load or
on the first (or potentially all) calls to find_element[s]_by_xpath().
Because most documents these days change asynchronously after load,
always checking would be the safest, but might also have the biggest
performance implication.

If none of this is feasible, I'd consider supporting Alexei's proposal
for a new API call.
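
A minimal sketch of the two transparent options, in Python with only the standard library and run against XML source text rather than a live browser DOM (an assumption made to keep it self-contained). `KNOWN_NAMESPACES` plays the role of the static list, `detect_namespaces` collects every prefixed `xmlns` declaration in the document, and all names here are hypothetical rather than existing Selenium code.

```python
import io
import xml.etree.ElementTree as ET

# Static list of well-known namespaces (illustrative, not Selenium's actual list).
KNOWN_NAMESPACES = {
    "xhtml": "http://www.w3.org/1999/xhtml",
    "svg": "http://www.w3.org/2000/svg",
    "mathml": "http://www.w3.org/1998/Math/MathML",
}


def detect_namespaces(xml_text):
    """Return {prefix: uri} for every prefixed xmlns declaration in xml_text.

    Default (unprefixed) declarations are skipped because XPath 1.0 has no
    default namespace. If a prefix is bound to different URIs in different
    subtrees, the first binding in document order wins in this sketch.
    """
    found = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        if prefix:
            found.setdefault(prefix, uri)
    return found


def build_resolver(xml_text):
    """Merge the static list with detected prefixes; detected ones take priority."""
    merged = dict(KNOWN_NAMESPACES)
    merged.update(detect_namespaces(xml_text))
    return merged


DOC = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <CUSTOMNS:foo xmlns:CUSTOMNS="http://example.org/foons">bar</CUSTOMNS:foo>
  </body>
</html>
"""

resolver = build_resolver(DOC)
root = ET.fromstring(DOC)
print(resolver["CUSTOMNS"])
print(len(root.findall(".//CUSTOMNS:foo", resolver)))
```

Rebuilding the resolver on every lookup would pick up declarations added after load, at the performance cost discussed above; building it once on document load is cheaper but can go stale.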

Benjamin Hawkes-Lewis

Mar 11, 2013, 8:35:29 AM
to selenium-...@googlegroups.com
On 21 February 2013 15:47, Alexei Barantsev <bara...@gmail.com> wrote:
> Currently, we can't use Selenium to deal with "true" XHTML pages that have
> application/xhtml+xml content type.
> The root cause is inability to resolve XML namespaces (unless they all are
> specified in the top-level html element that is usually not the case).

Do you have a test case that demonstrates this?

Do XPath expressions of this form:

".//*[namespace-uri(.) = 'http://www.w3.org/1998/Math/MathML' and
local-name(.) = 'mi']"

not work?
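
For reference, a small Python sketch of how such prefix-free expressions could be generated. The helper name `ns_free_xpath` is hypothetical. Python's built-in `xml.etree.ElementTree` implements only a subset of XPath and does not support `namespace-uri()`, so the check at the end uses its `{uri}local` notation, which matches on the same two properties, instead of evaluating the generated string.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def ns_free_xpath(namespace_uri, local_name):
    """Build an XPath 1.0 expression that needs no prefix-to-URI resolver.

    Hypothetical helper. Assumes neither argument contains a single quote.
    """
    return ".//*[namespace-uri(.) = '{}' and local-name(.) = '{}']".format(
        namespace_uri, local_name
    )


expression = ns_free_xpath(MATHML_NS, "mi")
print(expression)

# Stand-in document with the MathML namespace declared below the root element.
DOC = """\
<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <math xmlns="http://www.w3.org/1998/Math/MathML"><mi>x</mi><mo>+</mo><mi>y</mi></math>
  </body>
</html>
"""

root = ET.fromstring(DOC)
# ElementTree equivalent of the same match: namespace URI plus local name.
matches = root.findall(".//{%s}mi" % MATHML_NS)
print([element.text for element in matches])
```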

--
Benjamin Hawkes-Lewis