javascriptEnabled = false


Hatta

Jul 29, 2012, 7:48:43 AM
to phan...@googlegroups.com
Hello,

Assuming "page.settings.javascriptEnabled = false", should I expect
page.evaluate() to "fail" since JavaScript is disabled?

In that case, how can I retrieve the webpage source?

If it is supposed to work, though, I'm afraid something is broken, because
it simply ignores page.evaluate() completely.

The code: http://paste.dprogramming.com/dpxd7r8x

document.documentElement.innerHTML gives me an empty page, which is not
the page I fetched:

<head></head><body></body>

This happens even when reading it from page.onLoadFinished.

I'd appreciate any light on this matter. Thanks in advance.

Regards.

--
Hatta

Ariya Hidayat

Jul 30, 2012, 1:35:07 AM
to phan...@googlegroups.com
> document.documentElement.innerHTML gives me an empty page which is not
> the page I fetched

Obviously, because that document belongs to your script's own context; it
is not the content of the page you're fetching.

The solution, if JavaScript is enabled, is to evaluate a script which
returns document.documentElement.innerHTML, so that it grabs the content
of the actual page. See the included examples/loadspeed.js, which shows
how to get the page title (the principle is the same).

Now, since in your case JavaScript is disabled, there is no way you can
run evaluate. Fortunately, as a careful scan of the API documentation
shows, you can grab the HTML content of any page like this:
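
A minimal sketch of that approach, assuming JavaScript is left enabled (the default). The URL is a placeholder and the helper name getPageHtml is hypothetical; neither comes from the thread. The PhantomJS-specific part is guarded so the file also loads harmlessly outside PhantomJS.

```javascript
// Hypothetical helper: PhantomJS runs it inside the fetched page, so
// `document` here is the page's document, not the script's own.
function getPageHtml() {
    return document.documentElement.innerHTML;
}

// Only run the PhantomJS part when the `phantom` global is present.
if (typeof phantom !== 'undefined') {
    var page = require('webpage').create();
    // Placeholder URL, used for illustration only.
    page.open('http://example.com/', function (status) {
        if (status === 'success') {
            // page.evaluate() executes the function in the page context
            // and returns its result to the controlling script.
            console.log(page.evaluate(getPageHtml));
        } else {
            console.log('Failed to load the page');
        }
        phantom.exit();
    });
}
```

Reading document.documentElement.innerHTML directly in the controlling script, outside page.evaluate(), would instead inspect the script's own empty document.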

console.log(page.content);


Regards,

--
Ariya Hidayat, http://ariya.ofilabs.com
http://twitter.com/ariyahidayat

Ariya Hidayat

Sep 19, 2012, 8:55:15 PM
to phan...@googlegroups.com
> I am unable to get this to work. If I set "page.settings.javascriptEnabled
> = false", then do console.log(page.content) inside my page.open callback,
> the logged value is undefined. Any thoughts on how to get this to work?

What do you get from this?

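var page = require('webpage').create();
// Settings only take effect if set before page.open().
page.settings.javascriptEnabled = false;
page.open('http://google.com', function () {
    // page.content holds the HTML source of the loaded page.
    console.log(page.content);
    phantom.exit();
});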

James Greene

Sep 20, 2012, 12:09:42 AM
to phan...@googlegroups.com
For the record, Ariya's example works for me (James Greene).  Not sure what issues my colorful counterpart (James Brown) was experiencing....
~~James

Marcos Zanona

Nov 4, 2012, 7:30:58 AM
to phan...@googlegroups.com
But isn't PhantomJS supposed to work as a browser would? With JavaScript disabled in Chrome, for example, there is still access to the console through the developer tools and, consequently, to the `window` object.
I believe it should mimic the same behaviour: disabling JavaScript should only stop `script` tags from running within the page, not prevent the developer from evaluating new snippets at will through the console, and in PhantomJS `evaluate` is the only way to achieve such a thing. Access to the `window` object shouldn't be blocked when the `javascriptEnabled` option is set to false. As far as I can understand, this leaves the developer with no choice.

As mentioned, it is still possible to strip out all the `script` tags and then set `page.content` to the desired markup, although the generated document would lose many characteristics that it would inherit if loaded from its original URL, such as `document.location`; any `<a href='/test.html'></a>` links pointing to the root domain would also break, along with many other intricacies I can think of.

So I guess preventing `evaluate` while `javascriptEnabled=false` brings no real benefit to the developer who is interested in dissecting and debugging the page, which I believe is the main purpose of PhantomJS.
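
A rough sketch of that workaround, under stated assumptions: the helper stripScriptTags and the URL are hypothetical, and the regex is a naive illustration rather than a real HTML parser. The page is fetched with JavaScript disabled, the `script` elements are removed, and the cleaned markup is assigned to a second page where `evaluate` can run.

```javascript
// Hypothetical helper: naive, regex-based removal of <script> elements.
// Illustration only; a regex cannot handle every HTML edge case.
function stripScriptTags(html) {
    return html.replace(/<script\b[^>]*>[\s\S]*?<\/script\s*>/gi, '');
}

// Only run the PhantomJS part when the `phantom` global is present.
if (typeof phantom !== 'undefined') {
    var source = require('webpage').create();
    source.settings.javascriptEnabled = false; // set before open()

    // Placeholder URL, used for illustration only.
    source.open('http://example.com/', function (status) {
        if (status !== 'success') {
            console.log('Failed to load the page');
            phantom.exit(1);
            return;
        }

        // Second page with default settings (JavaScript enabled).
        var inert = require('webpage').create();
        var done = false;
        inert.onLoadFinished = function () {
            if (done) { return; }
            done = true;
            // evaluate() works here, but the document no longer has
            // the original URL, so relative links will not resolve.
            console.log(inert.evaluate(function () {
                return document.title;
            }));
            phantom.exit();
        };
        inert.content = stripScriptTags(source.content);
    });
}
```

As the post notes, the document rebuilt this way loses `document.location` and the base for relative links, so this is a partial workaround at best.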

I really hope you can re-consider that.

Thank you

Ariya Hidayat

Nov 4, 2012, 11:38:04 AM
to phan...@googlegroups.com
> But isn't PhantomJS supposed to work as a browser would? With JavaScript
> disabled in Chrome, for example, there is still access to the console
> through the developer tools and, consequently, to the `window` object.
> I believe it should mimic the same behaviour: disabling JavaScript should
> only stop `script` tags from running within the page, not prevent the
> developer from evaluating new snippets at will through the console, and in
> PhantomJS `evaluate` is the only way to achieve such a thing. Access to the
> `window` object shouldn't be blocked when the `javascriptEnabled` option is
> set to false. As far as I can understand, this leaves the developer with no
> choice.
>
> As mentioned, it is still possible to strip out all the `script` tags and
> then set `page.content` to the desired markup, although the generated
> document would lose many characteristics that it would inherit if loaded
> from its original URL, such as `document.location`; any
> `<a href='/test.html'></a>` links pointing to the root domain would also
> break, along with many other intricacies I can think of.
>
> So I guess preventing `evaluate` while `javascriptEnabled=false` brings no
> real benefit to the developer who is interested in dissecting and debugging
> the page, which I believe is the main purpose of PhantomJS.
>
> I really hope you can re-consider that.

Your theory is good on paper. But just like every other aspect of
PhantomJS, when you face something like this, the reason is usually
either:

* a philosophical design choice, or
* a technical limitation

In this case, it is the latter. If someone steps up and volunteers to
do the necessary research to overcome the technical problems so that
the beautiful behavior can be achieved, then the discussion can be
more fruitful. Otherwise, it ain't never gonna happen.

Any "why can't PhantomJS do this" situation should (at first) be
treated the same way as a "why can't my car fly" question. A car
designer would love to have it, but the technology might not be there yet.