using click() to download pdfs


Robert Poor

Jul 14, 2015, 1:36:50 AM
to casp...@googlegroups.com
(This is cross-posted from http://stackoverflow.com/questions/31395566/casperjs-using-click-to-download-pdfs -- I hope I'm not violating some implicit or explicit protocol by doing so, but I don't see that much casperjs traffic there.)

I would like to use casperjs to download a series of pdf files.

In this case, I can't use the technique described in http://stackoverflow.com/questions/30436533/how-to-download-multiple-pdf-files-in-casperjs because the underlying onclick="showMyBill(this)" POSTs a huge amount of state which is impractical to replicate.

So as far as I can tell, I'm limited to using casper.click() and then -- somehow -- capturing the .pdf sent in response. The HTML that triggers the download looks like this:

  <div class="slick-row">
    <span class="isilk in_line" onclick="showMyBill(this)" title="Bill 1506160070">
    </span>
    2015-06-16
  </div>

the short question:

After I call casper.click(css), how can I grab the resulting .pdf code and write it to the local filesystem?

more info:

Schematically, I'm doing something like this:

function downloadOnePDF(index) {
  var css = getSelectorCSS(index);
  casper.click(css);

  // Some code needs to go here:

  if (index > 0) { 
    downloadOnePDF(index - 1);  // download next
  }
}

I see two problems here:
  • My web proxy shows me that the first PDF file is getting downloaded, but not subsequent ones.
  • I'm not saving any PDFs to disk yet.

I've tried adding a downloaded.file event handler, but it's not getting called:

casper.on('downloaded.file', function(targetPath) {
    console.log('downloaded.file triggered with ' + targetPath);
})

So: any suggestions welcome.

Sateesh Kavuri

Nov 24, 2015, 1:28:05 AM
to CasperJS
Robert, were you able to find a solution to this?

I am also looking for a solution for the case where a file gets downloaded when I click a button. The button invokes some JavaScript, which then sends the file in response to the click. Please help, I am stuck.

Robert Poor

Nov 24, 2015, 6:01:42 AM
to CasperJS
On Monday, 23 November 2015 22:28:05 UTC-8, Sateesh Kavuri wrote:
> Robert, were you able to find a solution to this?
>
> I am also looking for a solution for the case where a file gets downloaded when I click a button. The button invokes some JavaScript, which then sends the file in response to the click.

Ultimately, as far as I can tell, you MUST call casper.download( ... ) to fetch the data. (You can dig down into the sources to see what it's doing. Having looked, I decided to stick with the published interface.) So if the button has JavaScript associated with it, use a web proxy or browser debugger (like Charles or Firebug) to see what gets sent to the server. Then write just a bit of code to assemble the same form that would be POSTed when you click the button, and call download() with it.

So for example, the web proxy told me that clicking on this HTML:

<form id="frminvoice" class="pdfer" target="" method="post" action="/Document" novalidate="novalidate">
  <input id="id" type="hidden" value="0028816967-0001" name="id">
  <input id="type" type="hidden" value="Invoice" name="type">
  <input id="acctID" type="hidden" value="245765" name="acctID">
  <a id="btnSubmit" class="pdf" href="#">View Invoice</a>
</form>

... would POST the following form:

id 0028816967-0001
type Invoice
acctID 245765
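
For reference, a form like this normally travels as an application/x-www-form-urlencoded request body. The sketch below makes that wire format visible in plain JavaScript. encodeForm is a hypothetical helper name, not part of CasperJS, and it only approximates a browser's encoding (spaces come out as %20 rather than +).

```javascript
// Hypothetical helper (not a CasperJS API): turn a flat object of form
// fields into an urlencoded body, roughly as a browser would for a POST.
function encodeForm(fields) {
    return Object.keys(fields).map(function (name) {
        return encodeURIComponent(name) + '=' + encodeURIComponent(fields[name]);
    }).join('&');
}

var body = encodeForm({ id: '0028816967-0001', type: 'Invoice', acctID: '245765' });
console.log(body);  // id=0028816967-0001&type=Invoice&acctID=245765
```

Comparing a string like this with what the proxy shows is a quick way to check that no field has been missed.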

So I gathered together all of the ids, types and acctIDs using simple casper constructs:

    var bill_ids = casper.getElementsAttribute('div#myAjaxDiv table tr td form.pdfer input[name="id"]', 'value');
    var bill_types = casper.getElementsAttribute('div#myAjaxDiv table tr td form.pdfer input[name="type"]', 'value');
    var acct_ids = casper.getElementsAttribute('div#myAjaxDiv table tr td form.pdfer input[name="acctID"]', 'value');

and assembled them into a list of "bill_reference" objects, which not coincidentally happen to be the exact form required for a POST message:

    var bill_references = bill_ids.map(function(_, i) {
        return { id: bill_ids[i], type: bill_types[i], acctID: acct_ids[i] };
    });
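
The zip step can be tried outside of casper with plain arrays. This is only a sketch: zipBillReferences is a hypothetical name, the second sample row is invented, and the length check is an added assumption (it guards against the three selectors matching different numbers of inputs).

```javascript
// Combine three parallel arrays into one object per bill.
// Throws if the arrays disagree in length (an added sanity check that is
// not in the original code).
function zipBillReferences(ids, types, acctIDs) {
    if (ids.length !== types.length || ids.length !== acctIDs.length) {
        throw new Error('selector results have mismatched lengths');
    }
    return ids.map(function (id, i) {
        return { id: id, type: types[i], acctID: acctIDs[i] };
    });
}

var refs = zipBillReferences(
    ['0028816967-0001', '0028816967-0002'],  // second row is invented sample data
    ['Invoice', 'Invoice'],
    ['245765', '245765']
);
console.log(JSON.stringify(refs[0]));
```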

Then I just iterated over the bill_references to download each PDF:

    for (var index = 0; index < bill_references.length; index++) {
        downloader.downloadOneBill(bill_references[index], index);
    }

MyDownloader.prototype.downloadOneBill = function(bill_reference, index) {
    var downloader = this;
    var casper = this.casper();

    casper.then(function b01() {
        var url = downloader.BILL_POST_URL;
        var form = downloader.createBillDownloadForm(bill_reference);
        var target = downloader.createBillFilename(bill_reference);

        // POST the form to url and save the response body to target.
        casper.download(url, target, 'POST', form);
    });
};

Robert Poor

Nov 24, 2015, 6:06:14 AM
to CasperJS
Whoops! Google Groups posted the note before I finished editing, but I think you'll get the idea. The last bit of code should read:

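MyDownloader.prototype.downloadOneBill = function(bill_reference, index) {
    var downloader = this;
    var casper = this.casper();

    casper.then(function b01() {
        // bill_reference already has the shape of the POSTed form.
        var form = bill_reference;
        // Name the file after the bill id.
        var target = bill_reference.id + '.pdf';

        casper.download(downloader.BILL_POST_URL,
                        target,
                        'POST',
                        form);
    });
};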
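
The same idea can be sketched as a standalone function, assuming (as described in this thread) that casper.download(url, target, method, data) accepts an object of form fields for data. queueBillDownloads is a hypothetical name, and the casper instance is passed in as a parameter so the function can also be exercised with a stub.

```javascript
// Hypothetical helper: schedule one casper.then() step per bill so the
// downloads run in order inside casper's step queue.
// `casper` is a CasperJS instance; `url` is the form's POST target.
function queueBillDownloads(casper, url, billReferences) {
    billReferences.forEach(function (ref) {
        casper.then(function () {
            // ref doubles as the POST form: { id, type, acctID }
            casper.download(url, ref.id + '.pdf', 'POST', ref);
        });
    });
}

// Usage inside a CasperJS script (not runnable under plain Node):
//   queueBillDownloads(casper, downloader.BILL_POST_URL, bill_references);
//   casper.run();
```

Using forEach gives every queued step its own ref. A bare for loop with var would need an extra function call per iteration, as downloadOneBill provides, so that each step captures the right element.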

Hope this helps!
