How to get link text which is having nbsp within it?

mohit khatri

Jul 6, 2012, 12:03:08 AM
to webd...@googlegroups.com
Hi,

I would like to fetch the link text but am not able to get it. I have multiple links on the page, and the code I used is:

WebElement popup = driver.findElement(By.xpath("//*[@class='pop-main']"));
List<WebElement> pageSubElement = popup.findElements(By.tagName("a"));

// Print the visible text of every link inside the popup
for (WebElement link : pageSubElement) {
    String tagText = link.getText();
    System.out.println("text " + tagText);
}

But the above code is not working. My HTML source is as below.

<a href="/xyz---html" >In&nbsp;this&nbsp;Price&nbsp;Range&nbsp;(around&nbsp;$17)</a>

Thanks,

darrell

Jul 6, 2012, 10:22:16 AM
to webdriver
Mohit,

Please be a little more clear on what you expect and what you actually
get. The statement "above code is not working." is rather vague. I
created a web page with your HTML snippet in it. I used your code and
it output "text In this Price Range (around $17)". Did you get an
error? Did you get no output? Did you get the same output but expected
"text In&nbsp;this&nbsp;Price&nbsp;Range&nbsp;(around&nbsp;$17)"?

If you expected to see the &nbsp; in the output string then your
expectations were wrong. The getText() method never returns the
&nbsp;. It will always return a space.

If you want to see the source for the page, you'll have to use
driver.getPageSource(). When I was using Selenium 0.88 there were a
lot of things still missing. To get around this I would use
driver.getPageSource() then read the string into an HTML parser. I
could then query the tree for all <a> tags and write my own
getElementSource() to see what the source text was.

I can see how you might want to be sure the source has certain special
characters (like &nbsp;, &amp; or &quot;). This might be a good feature
request: create an alternative to getText() and call it getSrcText().

Darrell

David

Jul 8, 2012, 12:50:35 AM
to webdriver
If you want partial source, say for the given element only, you could
retrieve it via the innerHtml DOM attribute of the element and not
have to deal with or parse the rest of the HTML source from the page.

I believe you can retrieve it via theWebElement.getAttribute("innerHtml")
or so; I forget whether it is case sensitive as innerHtml or innerHTML.
There's also an innerText attribute that returns the text as well, but
that might be preformatted before it is returned.

mohit khatri

Jul 8, 2012, 12:31:53 PM
to webd...@googlegroups.com
Thanks Darrell & David for your replies.

Actually, while using driver.findElement().getText(), I was not getting any result, i.e. I was getting an empty string. That's why I thought it might be due to the &nbsp;. But after analyzing the DOM, I found that the innerText property of the WebElement is empty, which is why getText() returns an empty string. I guess getText() returns the innerText value!

In this scenario, the WebElement's text is present under the 'text' and 'textContent' properties. After using WebElement.getAttribute("text") or WebElement.getAttribute("textContent") I got the correct text of the element, which solved my problem.
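
One thing to watch for with the textContent route: the browser decodes each &nbsp; entity to the non-breaking space character (U+00A0), so the string you get back may not compare equal to the same text typed with ordinary spaces. Below is a minimal sketch of a helper for that case. The class and method names (NbspText, normalize, containsNbsp) are made up for illustration and are not part of WebDriver; in a real test the input would come from something like link.getAttribute("textContent").

```java
public final class NbspText {
    // The character that &nbsp; decodes to in the DOM
    private static final char NBSP = '\u00A0';

    // Replace non-breaking spaces with ordinary spaces and trim the ends
    public static String normalize(String raw) {
        if (raw == null) {
            return "";
        }
        return raw.replace(NBSP, ' ').trim();
    }

    // True if the raw string still contains a non-breaking space
    public static boolean containsNbsp(String raw) {
        return raw != null && raw.indexOf(NBSP) >= 0;
    }

    public static void main(String[] args) {
        // Stand-in for the value returned by getAttribute("textContent")
        String raw = "In\u00A0this\u00A0Price\u00A0Range\u00A0(around\u00A0$17)";
        System.out.println(containsNbsp(raw)); // true
        System.out.println(normalize(raw));    // In this Price Range (around $17)
    }
}
```

After normalizing, the text can be compared against an expected string written with plain spaces.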

Again thanks for your replies and sorry for the confusion. 

~ Mohit

darrell

Jul 9, 2012, 11:14:28 AM
to webdriver
Thank you David.

I've never needed to get the HTML for just one element. It is case
sensitive and you need to get the "innerHTML" attribute. This will get
everything inside the web element you have a reference to. It does not
give you the source for the web element itself. I checked the DOM
specification at w3.org and found all elements should have the following:

innerHTML = w.getAttribute("innerHTML");
tagName = w.getAttribute("tagName");
accessKey = w.getAttribute("accessKey");
className = w.getAttribute("className");
direction = w.getAttribute("dir");
id = w.getAttribute("id");
lang = w.getAttribute("lang");
style = w.getAttribute("style");
tabIndex = w.getAttribute("tabIndex");
title = w.getAttribute("title");

In many cases the string returned is "". These are good to know.

Darrell

David

Jul 10, 2012, 2:23:59 PM
to webdriver
Thanks Darrell, the info you posted is good to know. And thanks for
clarifying the case sensitivity of innerHTML.

David

Jul 10, 2012, 9:45:51 PM
to webdriver
Not sure if these are formally supported across browsers, but thought
I'd mention some additional DOM attributes that are worth checking
out:

clientHeight, scrollHeight, clientWidth, scrollWidth

Using these DOM attributes/properties, you can calculate whether
scrollbars exist or not:

http://selenium-automation.blogspot.com/2010/10/selenium-automation-problems.html

The example in that post is for Selenium RC. With WebDriver, I believe
you can get these directly via getAttribute of WebElement. If not, you
can get them from the DOM via JavaScript, which you can still use with a
WebElement as an argument (see http://code.google.com/p/selenium/issues/detail?id=2067
comment # 60 for an example).
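
A rough sketch of the comparison in Java, assuming the values arrive as the strings that getAttribute returns. ScrollbarCheck and overflows are hypothetical names, not WebDriver API. Content larger than the visible box only suggests a scrollbar; whether one is actually drawn also depends on the element's CSS overflow setting.

```java
public final class ScrollbarCheck {
    // True when the scrollable content is larger than the visible area.
    // Hypothetical usage with WebDriver (values arrive as strings):
    //   overflows(el.getAttribute("scrollHeight"), el.getAttribute("clientHeight"))
    public static boolean overflows(String scrollSize, String clientSize) {
        int scroll = Integer.parseInt(scrollSize.trim());
        int client = Integer.parseInt(clientSize.trim());
        return scroll > client;
    }

    public static void main(String[] args) {
        // Stand-in values; a real test would read them from the element
        System.out.println(overflows("1200", "600")); // true
        System.out.println(overflows("600", "600"));  // false
    }
}
```

The same method works for the horizontal case by passing scrollWidth and clientWidth.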
