How to get href and linktext of <a> using javascript Executor?


testingzeal

Jan 6, 2016, 10:29:04 PM
to webdriver
I am fetching the links on a page and adding them to a Java List, then later reading from the List to get their response codes. The code below works fine, but it takes a lot of time to add the links to the list.

For example, I have 500 links on a page and it takes about 15 minutes to add them to the list. I spoke to my Java developers, and they say an ArrayList usually processes data very quickly, so they think it is a WebDriver issue, since I am fetching the href using link.getAttribute("href"). That is why I think using JavascriptExecutor may be faster, but I am not sure how to do that.

Any ideas please?




List<String> hrefList = new ArrayList<String>();
List<WebElement> links = driver.findElements(By.tagName("a"));

System.out.println(links.size());

for (WebElement link : links) {
    hrefList.add(link.getAttribute("href")); // Performance issue while adding to the list.
}

Krishnan Mahadevan

Jan 7, 2016, 12:25:04 AM
to webdriver
You could do something like the sample code below:

public static void main(String[] args) {
    FirefoxDriver driver = null;
    try {
        driver = new FirefoxDriver();
        driver.get(URL);
        String script = "var array=[]; " +
                "var links=document.links; var max=links.length;" +
                "for(var i=0;i<max;i++){ array.push(links[i].href); }" +
                "return array;";
        long start = System.currentTimeMillis();
        List<String> hrefs = (List<String>) driver.executeScript(script);
        long end = System.currentTimeMillis();
        System.err.println("It took " + (end - start) + " ms to fetch " + hrefs.size() + " links ");
        for (String href : hrefs) {
            System.err.println("Link --> " + href);
        }
    } finally {
        if (driver != null) {
            driver.quit();
        }
    }
}
I ran this sample code against one of the websites I picked from this recommendation by Nick Bilton: http://www.nickbilton.com/98/

It took 105 ms to fetch 252 links 


Thanks & Regards
Krishnan Mahadevan

"All the desirable things in life are either illegal, expensive, fattening or in love with someone else!"
My Scribblings @ http://wakened-cognition.blogspot.com/
My Technical Scribblings @ http://rationaleemotions.wordpress.com/

--
You received this message because you are subscribed to the Google Groups "webdriver" group.
To unsubscribe from this group and stop receiving emails from it, send an email to webdriver+...@googlegroups.com.
To post to this group, send email to webd...@googlegroups.com.
Visit this group at https://groups.google.com/group/webdriver.
For more options, visit https://groups.google.com/d/optout.

testingzeal

Jan 7, 2016, 6:50:21 PM
to webdriver
Thank you Krishnan! 


I have 3 questions.

Below is the code I have for my version.

1. List<WebElement> hrefs = (List<WebElement>) js.executeScript(script); // This line has a warning: "Type safety: Unchecked cast from Object to List<WebElement>"

2. I changed the list type to List<WebElement> instead of List<String> just to get the link text, and now it doesn't get into the loop.

3. FYI, when I change it to List<String> it works as expected. Is there a way to get the link text along with the href?


driver.get(homeUrl);
String script = "var array=[]; " +
        "var links=document.getElementById('truncated-header').getElementsByTagName('a'); var max=links.length;" +
        "for(var i=0;i<max;i++){ array.push(links[i].href); }" +
        "return array;";
long start = System.currentTimeMillis();

JavascriptExecutor js = (JavascriptExecutor) driver;
List<WebElement> hrefs = (List<WebElement>) js.executeScript(script);
for (WebElement href : hrefs) {
    //responseCode = GetResponseCode.getResponseCode(href.toString());
    System.out.println(href.getText() + " -" + href);
}

Bill Ross

Jan 7, 2016, 8:34:33 PM
to webd...@googlegroups.com
You may be able to do it by changing this line so that each entry holds the strings you want, separated by some delimiter:

  array.push(links[i].href)

Note that JavaScript is working with what it has available in the browser, and will likely only return things as strings.
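
A minimal sketch of that delimiter idea, in case it helps. The class name DelimitedLinks, the tab delimiter, and the parse helper are my own assumptions for illustration, not anything from WebDriver. The script collapses whitespace in the link text so that a tab can only ever appear as the separator, and the Java side splits each entry back into href and text. The parsing part does not need a browser, so it can be tried on its own.

```java
import java.util.ArrayList;
import java.util.List;

final class DelimitedLinks {
    // Assumed delimiter: a single tab. The script below removes tabs from the link text.
    static final String DELIMITER = "\t";

    // JavaScript that returns one "href<TAB>text" string per link on the page.
    static final String SCRIPT =
        "var out = [];" +
        "var links = document.links;" +
        "for (var i = 0; i < links.length; i++) {" +
        "  var text = (links[i].textContent || '').replace(/\\s+/g, ' ').trim();" +
        "  out.push(links[i].href + '\\t' + text);" +
        "}" +
        "return out;";

    // Splits each "href<TAB>text" entry into a two-element array: {href, text}.
    static List<String[]> parse(Object scriptResult) {
        List<String[]> pairs = new ArrayList<>();
        if (!(scriptResult instanceof List)) {
            return pairs; // nothing usable came back
        }
        for (Object item : (List<?>) scriptResult) {
            // Limit of 2 keeps an empty text part instead of dropping it.
            String[] parts = String.valueOf(item).split(DELIMITER, 2);
            String href = parts[0];
            String text = parts.length > 1 ? parts[1] : "";
            pairs.add(new String[] { href, text });
        }
        return pairs;
    }
}
```

With a real driver it would be used roughly as DelimitedLinks.parse(((JavascriptExecutor) driver).executeScript(DelimitedLinks.SCRIPT)).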

Bill

darrell

Jan 8, 2016, 9:38:04 AM
to webdriver
Your code is slightly different from Krishnan's. You don't need the line which defines the start variable. Krishnan had a start and an end variable so he could measure the time his implementation took. Here is the complete implementation without measuring the time it takes to execute:

WebDriver driver = new FirefoxDriver();
String script = "var array=[]; " +
        "var links=document.links; var max=links.length;" +
        "for(var i=0;i<max;i++){ array.push(links[i].href); }" +
        "return array;";
List<String> hrefs = (List<String>) ((JavascriptExecutor) driver).executeScript(script);

Everything else is just for demonstrating that the code works. The line with executeScript packs in a few things. The variable driver is of type WebDriver, but the method executeScript belongs to JavascriptExecutor, so you have to cast driver to JavascriptExecutor. Also, executeScript is declared to return Object, but we know that for this script it is really returning a List<String>, so we cast the result of executeScript to List<String>.

Your first question points at the real problem. The compiler warns because it cannot verify the element type when you cast from Object to a generic List (the same warning appears for List<String>), and in your case the type you claimed is wrong. executeScript is returning a List<String>. Telling the compiler it is returning a List<WebElement> is like buying a car and putting a label on it saying it is a Boeing 747 aircraft. When you cast, you are telling the compiler what the method is ACTUALLY returning. Essentially, by saying Krishnan's code returns a List<WebElement>, you are lying to the compiler.

Your second question amounts to: I TELL the compiler I have a List<WebElement>, but it is REALLY a List<String>, so it does not work. Bottom line, if you lie to the compiler it will believe you. Telling the compiler you have a List<WebElement> when it is really a List<String> does not make it a List<WebElement>. Another analogy: I walk into a store with a $5 bill, give it to the store owner and tell them it is a $100 bill. That doesn't make it true. Now if I go into a store in India, put down $5 USD and tell the store owner it is the same as 300 rupees, then that is true. In the case of Krishnan's code, executeScript returns an Object. I cannot convert it to a List<WebElement>, but I can convert it to a List<String>.
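
To make that concrete: because of type erasure, the cast to List<WebElement> compiles with only a warning, and the mistake surfaces later, as a ClassCastException when the loop pulls the first String out of the list and treats it as a WebElement. One way to avoid both the warning and the late failure is to check the elements yourself. The sketch below is only an illustration; the class name ScriptResults and the method toStringList are my own names, not part of Selenium.

```java
import java.util.ArrayList;
import java.util.List;

final class ScriptResults {
    // Copies the Object returned by executeScript into a List<String>,
    // failing immediately if it is not a list of strings.
    static List<String> toStringList(Object scriptResult) {
        if (!(scriptResult instanceof List)) {
            throw new IllegalArgumentException("Expected a List but got: " + scriptResult);
        }
        List<String> strings = new ArrayList<>();
        for (Object item : (List<?>) scriptResult) {
            if (!(item instanceof String)) {
                throw new IllegalArgumentException("Expected a String element but got: " + item);
            }
            strings.add((String) item);
        }
        return strings;
    }
}
```

It would be called as List<String> hrefs = ScriptResults.toStringList(js.executeScript(script)); and needs no unchecked cast.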

To get the href and the link text, you need to pick a JavaScript data type that can hold the data you need, write some JavaScript that returns that structure, and then use executeScript to store it in an equivalent Java data type. This really requires a much stronger knowledge of Java and JavaScript. It might be easier to make two calls: one that returns the hrefs and one that returns the link text. Then you can assume item 1 in the hrefs list matches item 1 in the link text list.
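
For the single-call route, here is a rough sketch, under the assumption that the Java bindings hand a JavaScript array of objects back as a List of Map<String, Object> (which is how the JavascriptExecutor documentation describes it; please check against your Selenium version). The class name LinkInfo and its fields are hypothetical. Also note that textContent is not identical to WebElement.getText(), since it can include text that is not visible on the page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class LinkInfo {
    final String href;
    final String text;

    LinkInfo(String href, String text) {
        this.href = href;
        this.text = text;
    }

    // JavaScript returning one { href, text } object per link on the page.
    static final String SCRIPT =
        "var out = [];" +
        "var links = document.links;" +
        "for (var i = 0; i < links.length; i++) {" +
        "  out.push({ href: links[i].href, text: links[i].textContent });" +
        "}" +
        "return out;";

    // Converts the Object returned by executeScript(SCRIPT) into typed pairs.
    // Entries that are not maps are skipped.
    static List<LinkInfo> fromScriptResult(Object scriptResult) {
        List<LinkInfo> result = new ArrayList<>();
        if (!(scriptResult instanceof List)) {
            return result;
        }
        for (Object item : (List<?>) scriptResult) {
            if (item instanceof Map) {
                Map<?, ?> entry = (Map<?, ?>) item;
                result.add(new LinkInfo(
                    String.valueOf(entry.get("href")),
                    String.valueOf(entry.get("text"))));
            }
        }
        return result;
    }
}
```

Usage would be along the lines of LinkInfo.fromScriptResult(((JavascriptExecutor) driver).executeScript(LinkInfo.SCRIPT)), after which each item carries its own href and text, so there is no need to match up two lists by position.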