Newbie: How to overcome JavaScript "onclick" button to scrape web page?


Terence Ng

May 7, 2013, 2:28:13 AM
to scrapy...@googlegroups.com
This is the link I want to scrape:

To see the English version of the page, click the "English Version" tab in the upper right-hand corner.


There is a button I have to press before I can read the fund information on the page.

<div onclick="AgreeClick()" style="width:200px; padding:8px; border:1px black solid; background-color:#cccccc; cursor:pointer;">Confirmed</div>

The AgreeClick function is:

function AgreeClick() {
    var cookieKey = "ListFundShowDisclaimer";
    SetCookie(cookieKey, "true", null);
    Get("disclaimerDiv").style.display = "none";
    Get("blankDiv").style.display = "none";
    Get("screenDiv").style.display = "none";
    //Get("contentTable").style.display = "block";
    ShowDropDown();
}

How do I get past this onclick="AgreeClick()" handler so that I can scrape the page?

Anderson Caco

May 7, 2013, 10:39:25 AM
to scrapy...@googlegroups.com
Open the Google Chrome developer tools (Ctrl+Shift+J), switch to the Network tab, and inspect the requests the page makes.
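
One thing the network view may confirm: judging from the AgreeClick function quoted above, pressing the button appears to do nothing beyond hiding some divs and setting a ListFundShowDisclaimer cookie. If that holds, a request that already carries the cookie might skip the disclaimer. A minimal sketch using only the standard library follows. The URL is a placeholder (the real link is not shown in this thread), and it is an assumption that the page's SetCookie helper writes a plain name=value cookie.

```python
import urllib.request

# Hypothetical URL: the original link is not shown in the thread.
FUNDS_URL = "http://example.com/funds"


def build_request(url=FUNDS_URL):
    """Build a GET request that already carries the disclaimer cookie."""
    # Assumption: SetCookie(cookieKey, "true", null) stores a plain
    # name=value cookie called ListFundShowDisclaimer.
    request = urllib.request.Request(url)
    request.add_header("Cookie", "ListFundShowDisclaimer=true")
    return request


# Nothing is sent here; the request object is only constructed.
request = build_request()
print(request.get_header("Cookie"))
```

In a Scrapy spider the same idea would be to pass cookies={'ListFundShowDisclaimer': 'true'} when constructing the Request.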


--

Anderson Ferraz
Intern, bassist, and sniper of the CT team.
T (71) 3494-3514

Travis Briggs

May 7, 2013, 11:52:21 AM
to scrapy...@googlegroups.com
Actually, I think it's easier than that.

If you look at the page, the div you want (id="contentTable") is already on the page when it loads. They are simply covering it with a dismissible div using JavaScript.

Your spider will have the full source of the page, so you don't need to worry about the pop-up div.

Try this in the scrapy shell:

In [2]: hxs.select('//table//table//table//td[@class="fundPriceCell1"]//text()').extract()[:5]
Out[2]: 
[u'06/05/2013',
 u'0.1102%\n                  ',
 u'02/05/2013',
 u'0.1102%\n                  ',
 u'29/04/2013']


Is that the information you want?
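
To illustrate why the overlay does not matter, here is a small sketch that runs without a crawl. It uses the standard library's ElementTree instead of Scrapy's selectors, and the markup is a made-up fragment (the real page's structure is assumed to be similar, not copied from it). It also strips the trailing whitespace seen in the shell output above.

```python
import xml.etree.ElementTree as ET

# Made-up fragment for illustration only; the real page is not shown
# in the thread. The disclaimer div sits next to the data, it does not
# replace it.
SAMPLE = """
<html><body>
  <div id="disclaimerDiv">Confirmed</div>
  <table id="contentTable">
    <tr>
      <td class="fundPriceCell1">06/05/2013</td>
      <td class="fundPriceCell1">0.1102%
      </td>
    </tr>
  </table>
</body></html>
"""


def fund_cells(markup):
    """Return the stripped text of every td with class fundPriceCell1."""
    root = ET.fromstring(markup.strip())
    cells = root.findall(".//td[@class='fundPriceCell1']")
    return [(cell.text or "").strip() for cell in cells]


print(fund_cells(SAMPLE))  # ['06/05/2013', '0.1102%']
```

The disclaimer div is simply another element in the source, so the query for the price cells returns the same values whether or not the overlay is ever dismissed.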

-Travis