Scrape HTML data in a list


Alexandre Boulmé

May 13, 2015, 3:01:46 AM
to casp...@googlegroups.com
Hi all, 

I just discovered CasperJS and I'd like to use it to scrape data from a web page. The page contains a list of items and I just need to get their labels. Here is the HTML structure of the page:

<div class="xT X8c" data-id="1">
     <div class="l0d">
        <div class="n0d">
          <a class="d-s ob yTc q0d" oid="114095242435722376870" tabindex="0" target="_top" href="114095242435722376870">
              <span class="VCc">Text to retrieve</span>
          </a>
       </div>
    </div>
    <div class="Z8c"></div>
    <div class="XEd" data-tooltip-align="b,c"></div>
</div>

I need to retrieve the "Text to retrieve" in each instance of the "xTc X8c" class.

I tried to select all the "span.VCc" elements directly with this code:

var casper = require('casper').create();
var name = [];


function getSpanTexts() {
    var texts = document.querySelectorAll('span.VCc');
    return Array.prototype.map.call(texts, function(e) {
        return e.textContent;
    });
}


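// The casper.start(...) call that opened the page did not survive in this post;
// only its closing "});" remained. The URL below is a placeholder, not the real one.
casper.start('http://example.com/');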


casper.then(function() {
  name = this.evaluate(getSpanTexts);
  this.echo('function running');
});

casper.run(function() {
    // echo results in some pretty fashion
    this.echo(name.length + ' names found:');
    this.echo(' - ' + name.join('\n - ')).exit();
});

But it doesn't work:
- It reports that the name array has 129 entries (it should contain 10 results)
- And it doesn't display anything :/

Can someone help me?

Thanks a lot, 
Alex
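
One possible explanation for both symptoms, which nobody in this thread confirms: in a browser, a top-level variable called `name` aliases `window.name`, and that property only stores strings. If the script's global scope behaves the same way (an assumption), the array returned by `evaluate` would be turned into one comma-joined string, so `length` would count characters rather than items and `join` would not exist. The sketch below only simulates that coercion in plain JavaScript; the labels are made-up stand-ins for the real page data.

```javascript
// Hypothetical labels standing in for the 10 results the page should yield.
var labels = [];
for (var i = 1; i <= 10; i++) {
    labels.push('Label ' + i);
}

// Simulate a global slot that stores strings only, as window.name does
// in a browser: assigning an array to it keeps a comma-joined string.
var coerced = String(labels);

console.log(labels.length);        // 10: the number of items
console.log(coerced.length);       // character count of the joined string
console.log(typeof coerced.join);  // "undefined": strings have no join()
```

If that is what happens here, renaming the variable (for example to `labels`) would be a cheap way to check.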

Tim Scott

May 13, 2015, 7:54:54 AM
to casp...@googlegroups.com
It looks like it should work. But I would try replacing:

casper.run

with

casper.then
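
A minimal sketch of what that change could look like, under a few assumptions: the URL is a placeholder (the thread never shows the real one), the result variable is renamed to `labels`, and `formatReport` is a hypothetical helper added only to keep the printing logic in one place. The `phantom` guard is there so the helper can be tried outside CasperJS; it is not part of the suggestion itself.

```javascript
// Hypothetical helper: turns a list of labels into the report text.
function formatReport(labels) {
    return labels.length + ' names found:\n - ' + labels.join('\n - ');
}

// Runs in the page context; same selector as in the original post.
function getSpanTexts() {
    var spans = document.querySelectorAll('span.VCc');
    return Array.prototype.map.call(spans, function (e) {
        return e.textContent;
    });
}

// Only wire up CasperJS when running under PhantomJS.
if (typeof phantom !== 'undefined') {
    var casper = require('casper').create();
    var labels = [];

    // Placeholder URL: replace with the page to scrape.
    casper.start('http://example.com/');

    casper.then(function () {
        // evaluate() can return null, so fall back to an empty list.
        labels = this.evaluate(getSpanTexts) || [];
    });

    // The report now lives in a step instead of the run() callback.
    casper.then(function () {
        this.echo(formatReport(labels));
    });

    // With no callback, run() executes the queued steps and then exits.
    casper.run();
}
```

Note that `casper.run()` is still needed to start the queue; only the reporting code moves into `casper.then`.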
--
CasperJS homepage & documentation: http://casperjs.org/
CasperJS @github: https://github.com/n1k0/casperjs
 
You received this message because you are subscribed to the Google Groups "casperjs" group.
Visit this group at http://groups.google.com/group/casperjs?hl=en.
---
You received this message because you are subscribed to the Google Groups "CasperJS" group.
To unsubscribe from this group and stop receiving emails from it, send an email to casperjs+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Andrea D'Errico

Nov 25, 2015, 1:36:46 PM
to CasperJS
Hi. I think the problem is that Google+ has some protection against CasperJS. Has anyone managed to scrape Google+?

samiah mushtaq

Nov 30, 2015, 9:15:06 AM
to CasperJS
Hello guys,

I want to install CasperJS on my desktop. Please let me know where I am going wrong. What I did:
1. Downloaded CasperJS and extracted it to the C drive in a folder named casperjs
2. Downloaded PhantomJS and extracted it to the C drive in a folder named phantomjs
3. Downloaded Python
4. Appended ;C:\phantomjs;C:\casperjs\batchbin to the PATH environment variable

Versions installed:
PhantomJS 1.8.2, Python 2.6 and CasperJS 1.1 beta 3 on Windows 7

Error:


Regards,
Samiah

Alex peguero-cruz

Nov 30, 2015, 10:36:56 AM
to casp...@googlegroups.com
Please unsubscribe me from these emails.

Thanks,

Alex

Nicolas Perriault

Nov 30, 2015, 10:40:52 AM
to casp...@googlegroups.com
Are you kidding us? There is plenty of unsubscribing information at the bottom of each email, including the one you've just quoted.

Come on.
Nicolas Perriault
https://nicolas.perriault.net/
Phone: +33 (0) 660 92 08 67

Alex peguero-cruz

Nov 30, 2015, 10:44:44 AM
to casp...@googlegroups.com
Relax, honest mistake. Don't get your panties in a bunch.

Cheers