Load 100% JS-based webpages


Marco Donizelli

Oct 25, 2012, 4:55:17 AM
to phan...@googlegroups.com
Hi all,

I am trying to use PhantomJS to load a webpage that is entirely based on JavaScript (that is, it goes blank if you disable JavaScript in a browser):

var page = require('webpage').create();
page.open('http://www.thelightroomdesigns.com/', function(status)
{
    page.render('test.png');
    phantom.exit();
});

If I execute the script above, 'test.png' is blank, as if JavaScript were disabled within PhantomJS itself.

Is there a specific setting/API I need to use to make PhantomJS execute the JavaScript within the webpage I am loading?

Thanks in advance

marco

Ivan De Marino

Oct 25, 2012, 1:54:23 PM
to phan...@googlegroups.com
The callback of "page.open" is fired after the DOM content has been loaded.
That means the whole HTML plus all the resources in the HEAD.

Whatever comes after that (scripts included in the body) is loaded asynchronously with respect to the DOM content load.

This means you are getting a blank page because PhantomJS has just finished loading the content and is now fetching the rest of the scripts to render the page.
You are not waiting for that to happen.

A simple test to make it clearer: try putting your "render and exit" in a timeout of, say, 5 seconds.

You will surely get something in your screenshot.
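
A minimal sketch of that test, assuming 5 seconds is long enough for this particular page. `renderLater` and `RENDER_DELAY_MS` are illustrative names, not part of the PhantomJS API; the PhantomJS-specific part is guarded so the helper can also be tried outside PhantomJS.

```javascript
// Delay (in ms) before taking the screenshot; 5000 is the value suggested above.
var RENDER_DELAY_MS = 5000;

// Hypothetical helper: render `file` from `page` after `delayMs`, then call `done`.
function renderLater(page, file, delayMs, done) {
    return setTimeout(function () {
        page.render(file);
        done();
    }, delayMs);
}

// Only runs inside PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://www.thelightroomdesigns.com/', function (status) {
        if (status !== 'success') {
            console.log('Failed to load the page');
            phantom.exit(1);
            return;
        }
        // Give the body scripts time to run before rendering.
        renderLater(page, 'test.png', RENDER_DELAY_MS, function () {
            phantom.exit();
        });
    });
}
```

A fixed delay is only a diagnostic: it confirms that the blank screenshot is a timing problem, but it is either too long or too short for any other page.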

An image explains it better than anything else (attached).

Ivan

--
Ivan De Marino
Coder, Technologist, Cook, Italian

blog.ivandemarino.me | www.linkedin.com/in/ivandemarino | twitter.com/detronizator
[Attachment: Loading a URL in a Browser.png]

Marco Donizelli

Oct 25, 2012, 5:17:03 PM
to phan...@googlegroups.com
Hi Ivan, thanks a lot.

OK, maybe I can inject jQuery and use document.ready to avoid arbitrary timeouts, or use window.onload without jQuery.
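
Another way to avoid a fixed timeout is to poll until some condition holds in the page. A sketch, similar in spirit to the waitfor.js example shipped with PhantomJS: `waitFor` is a hypothetical helper, and the "page is ready" condition (document complete and a non-empty body) and the 10 s / 250 ms values are assumptions to adjust per site.

```javascript
// Hypothetical helper: call test() every intervalMs until it returns true
// (then onReady) or until timeoutMs has elapsed (then onTimeout).
function waitFor(test, onReady, onTimeout, timeoutMs, intervalMs) {
    var start = Date.now();
    var timer = setInterval(function () {
        if (test()) {
            clearInterval(timer);
            onReady();
        } else if (Date.now() - start >= timeoutMs) {
            clearInterval(timer);
            onTimeout();
        }
    }, intervalMs);
    return timer;
}

// Only runs inside PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.open('http://www.thelightroomdesigns.com/', function (status) {
        if (status !== 'success') {
            phantom.exit(1);
            return;
        }
        waitFor(function () {
            // Assumed readiness condition, evaluated inside the page.
            return page.evaluate(function () {
                return document.readyState === 'complete' &&
                       document.body !== null &&
                       document.body.children.length > 0;
            });
        }, function () {
            page.render('test.png');
            phantom.exit();
        }, function () {
            console.log('Timed out waiting for the page');
            phantom.exit(1);
        }, 10000, 250);
    });
}
```

The polling runs in the PhantomJS script, not in the page, so the render call stays outside the sandboxed page.evaluate function.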

One more thing: is this the same reason why I do not seem to be able to get PhantomJS to follow redirects implemented via JavaScript?

That is, redirects like:

<script>
location.href = '...';
</script>

If I load a webpage with such a redirect, PhantomJS does not seem to follow it, while of course it follows all the HTTP redirects. (I haven't tested the ugly meta redirect yet.)
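
One way to observe what happens with such a redirect is to log URL changes and render only once loading has been quiet for a while. A sketch, assuming the redirect shows up as a second load cycle on the same page object; `renderWhenSettled` and the 2-second quiet period are illustrative, and nothing in this thread confirms that this resolves the redirect problem.

```javascript
// Hypothetical helper: call onSettled(status, url) once a load has finished
// and no new load has started for quietMs milliseconds.
function renderWhenSettled(page, quietMs, onSettled) {
    var timer = null;
    page.onLoadStarted = function () {
        // A new load (for example a redirect) began: cancel any pending callback.
        if (timer !== null) {
            clearTimeout(timer);
            timer = null;
        }
    };
    page.onLoadFinished = function (status) {
        if (timer !== null) {
            clearTimeout(timer);
        }
        timer = setTimeout(function () {
            onSettled(status, page.url);
        }, quietMs);
    };
}

// Only runs inside PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    page.onUrlChanged = function (targetUrl) {
        console.log('URL changed to ' + targetUrl);
    };
    renderWhenSettled(page, 2000, function (status, url) {
        console.log('Settled on ' + url + ', status = ' + status);
        page.render('redirected.png');
        phantom.exit();
    });
    page.open('http://cat.com/');
}
```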

Thanks in advance for all the help.

m

Ariya Hidayat

Oct 25, 2012, 9:09:07 PM
to phan...@googlegroups.com
> <script>
> location.href = '...';
> </script>

I'm sure that is a separate issue and someone has already reported
that. Search the issue tracker please.


--
Ariya Hidayat, http://ariya.ofilabs.com
http://twitter.com/ariyahidayat

James Greene

Oct 26, 2012, 12:45:43 AM
to phan...@googlegroups.com
Separate issue related to relative URL navigation:

~~James

Marco Donizelli

Oct 26, 2012, 2:33:37 PM
to phan...@googlegroups.com
Thanks guys!

But actually everything worked flawlessly as soon as I plugged in jQuery's document.ready:

----


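var page = require('webpage').create();
page.onLoadStarted = function()
{
    console.log('Now loading ' + page.url + '...');
};
page.open('http://cat.com/', function(status)
{
    console.log(page.url + ' loaded, status = ' + status);
    if (status === 'success')
    {
        // includeJs injects jQuery into the page and calls back once it has loaded.
        page.includeJs('http://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js', function()
        {
            console.log('jQuery in');
            // The function passed to page.evaluate runs inside the page's own context;
            // its console.log output only reaches PhantomJS via page.onConsoleMessage.
            var response = page.evaluate(function()
            {
                $(document).ready(function()
                {
                    console.log('Document ready...');
                    return;
                });
            });
            // This runs as soon as evaluate returns; it is not triggered by the ready callback.
            page.render('cat.png');
            phantom.exit();
        });
    }
    else
    {
        // Exit on failure too, otherwise the script never terminates.
        phantom.exit(1);
    }
});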

-----

cat.com uses a JS-based redirect, and the combination of PhantomJS + jQuery document.ready digests that perfectly!

I am amazed at this product! Hats off Ariya!!!

-----

Now, if I may use this thread to go back to the original post, which was about this 100% JS-based site:

http://www.thelightroomdesigns.com/

With jQuery document.ready, I get PhantomJS to start processing the site, but it bails out with this error:

TypeError: 'undefined' is not a function (evaluating '$(document).getSize()')
  http://static.wix.com/services/html-wysiwyg/1.227.4/web-viewer-components.js:1922
  ...

I assume there is something wrong in the JS, but the site nevertheless loads fine in Safari / Chrome, and the JS console does not report that error, not even as a warning.

Is there any way I can keep PhantomJS from quitting when it hits incorrect JS?

ty in advance.

m

James Greene

Oct 27, 2012, 9:26:17 AM
to phan...@googlegroups.com
Marco --
Just set up a `WebPage#onError` handler/callback:
https://github.com/ariya/phantomjs/wiki/API-Reference#wiki-webpage-onError
~~James
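
A minimal sketch of such a handler. `page.onError` receives the error message and a stack trace array whose entries carry `file`, `line` and, when available, `function`; `formatPageError` is an illustrative helper name, and the output format is just one possible choice.

```javascript
// Hypothetical helper: turn an onError message and trace into printable text.
function formatPageError(msg, trace) {
    var lines = ['ERROR: ' + msg];
    (trace || []).forEach(function (t) {
        lines.push('  -> ' + t.file + ': ' + t.line +
                   (t.function ? ' (in function "' + t.function + '")' : ''));
    });
    return lines.join('\n');
}

// Only runs inside PhantomJS, where the global `phantom` object exists.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    // Log JavaScript errors raised by the page instead of leaving them unhandled.
    page.onError = function (msg, trace) {
        console.log(formatPageError(msg, trace));
    };
    page.open('http://www.thelightroomdesigns.com/', function (status) {
        console.log('Loaded, status = ' + status);
        phantom.exit(status === 'success' ? 0 : 1);
    });
}
```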

Marco Donizelli

Oct 29, 2012, 9:07:28 AM
to phan...@googlegroups.com
Hi James, thanks a million, works perfectly.

m

FITstrophysicist Dr. Brian Hart

Oct 20, 2017, 12:38:24 PM
to phantomjs
Hi Marco-

I am also interested in this topic, as I have a website that I've developed on WiX, but as a web developer I want to pull down the HTML/CSS/JS/AJAX content I've crafted so I can turn it into a web application with my own programming/scripting.

Any chance you could post the final code that you had come up with?

Much obliged.  Also, just out of curiosity, how are you actually executing this code?  I see that it's in JavaScript, but what IDE/compiler are you using to put in the code, and then click a "Run" button to actually make it do its thing?

Again, much appreciated.