Auto-converting or interop with Node.js modules?

James Greene

Dec 31, 2012, 9:45:03 AM
to phan...@googlegroups.com
It seems like a lot of the work we've been doing lately is to mimic existing Node.js modules. As both projects work cross-platform, this got me thinking: is there any way for us to simply interop/bridge with Node.js? I'm sure we could set up an HTTP communication line, but that's not ideal.

Or, if not interop/bridging, what about an automatic conversion process to compile a Node.js module (and its upstream dependencies) as a PhantomJS module?

Opinions? Knowledge? Implementation ideas?

~~James

James Greene

Dec 31, 2012, 9:55:03 AM
to phan...@googlegroups.com
One option would be a bi-directional RMI/RPC setup like dnode:
http://substack.net/posts/85e1bd/DNode-Asynchronous-Remote-Method-Invocation-for-Node-js-and-the-Browser
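
To make the idea concrete, here is a minimal sketch of what a bi-directional RPC bridge implies. It is not dnode's actual API: the `createPeer` and `connect` helpers, the message shape, and the in-memory transport are all hypothetical, and a real bridge would carry the JSON strings over a socket or HTTP between the two processes.

```javascript
// Hypothetical bi-directional RPC peer: NOT dnode's API, just an illustration.
// Each peer exposes local methods and can call methods on the other side.
function createPeer(name, localMethods) {
  let nextId = 1;
  const pending = new Map(); // call id -> callback waiting for a reply
  const peer = {
    name,
    send: null, // transport hook, assigned by connect()
    // Invoke a method exposed by the remote peer.
    call(method, params, callback) {
      const id = nextId++;
      pending.set(id, callback);
      peer.send(JSON.stringify({ type: 'call', id, method, params }));
    },
    // Handle one serialized message arriving from the remote peer.
    receive(raw) {
      const msg = JSON.parse(raw);
      if (msg.type === 'call') {
        const fn = localMethods[msg.method];
        const reply = fn
          ? { type: 'reply', id: msg.id, result: fn(...msg.params) }
          : { type: 'reply', id: msg.id, error: 'no such method: ' + msg.method };
        peer.send(JSON.stringify(reply));
      } else if (msg.type === 'reply') {
        const callback = pending.get(msg.id);
        pending.delete(msg.id);
        if (callback) callback(msg.error || null, msg.result);
      }
    },
  };
  return peer;
}

// In-memory transport standing in for a socket between the two processes.
function connect(a, b) {
  a.send = (raw) => b.receive(raw);
  b.send = (raw) => a.receive(raw);
}

const nodeSide = createPeer('node', {
  readConfig: (key) => ({ key, value: 'from-node' }),
});
const phantomSide = createPeer('phantom', {
  pageTitle: (url) => 'Title of ' + url,
});
connect(nodeSide, phantomSide);

// Each side calls a method that lives on the other side.
const results = [];
nodeSide.call('pageTitle', ['http://example.com/'], (err, title) => results.push(title));
phantomSide.call('readConfig', ['timeout'], (err, cfg) => results.push(cfg.value));
console.log(results);
```

Only JSON crosses the boundary here, so neither side can hand the other a live object or a closure; that restriction is the price of any bridge of this kind.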

~~James

James Greene

Dec 31, 2012, 10:13:33 AM
to phan...@googlegroups.com

Also, Qt 5.0 is built atop V8 instead of JSC. Would getting PhantomJS onto Qt 5.0 help with some manner of integration via the common V8 underpinnings of Qt and Node.js?
~~James

--
You received this message because you are subscribed to the Google Groups "phantomjs" group.
Visit this group at http://groups.google.com/group/phantomjs?hl=en.


Ariya Hidayat

Jan 3, 2013, 3:17:27 AM
to phan...@googlegroups.com
The expansion of PhantomJS API is always a catch-22. If we don't add
it, people will ask for something because they need extra API to a
certain thing. If we add more and more stuff, I'll encounter an
increasing amount of "why are you duplicating Node.js". The latter is
just a symptom: most people compare the two based on what they see on
the surface, failing to recognize the history behind the project and
why it came to be that way.

A little flashback. Why was it a stand-alone executable and not a
Node.js module? Because I couldn't find a way to (1) unify the two
different JavaScript runtimes, V8 and JavaScriptCore (2) merge the
event loops. These days we've seen attempts to do both of them (see
node-webkit or node-chimera), but I still don't see a future-proof
solution which is more than just a workaround (including the fact that
you still need to explain why `evaluate` can't use the usual Node.js
idioms). The same goes for File API, also the embedded HTTP server, it
was again based on the demands of information transfer (not just one
way). We now see how beneficial it is to have the HTTP server support
(for Ghost Driver). The CommonJS-esque 'require' style of modules was
another demand, so that we can write sensible modules using the familiar
construct. The child process API follows the same pattern.
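
One way to see why `evaluate` resists the usual Node.js idioms: the function passed to it runs in the page's context rather than in the calling script, so variables captured by closure do not come along, and only JSON-serializable values cross over. The sketch below merely simulates that boundary in plain JavaScript. `simulateEvaluate` and `demo` are hypothetical names, and this is not how PhantomJS implements it.

```javascript
// Simulation only: mimics a function being shipped to the page as source
// text, so it loses access to the caller's closure.
function simulateEvaluate(fn, ...args) {
  const source = fn.toString();                      // only the text crosses over
  const wireArgs = JSON.parse(JSON.stringify(args)); // arguments cross as JSON
  // Re-create the function from its source, outside the caller's scope.
  const rebuilt = new Function('return (' + source + ');')();
  const result = rebuilt(...wireArgs);
  // The return value crosses back as JSON as well.
  return result === undefined ? undefined : JSON.parse(JSON.stringify(result));
}

function demo() {
  const selector = 'h1'; // local to demo(), invisible to the rebuilt function
  return {
    // Relies on the closure: the rebuilt function cannot see `selector`.
    viaClosure: simulateEvaluate(function () { return typeof selector; }),
    // Passes the value explicitly: this survives the trip.
    viaArgument: simulateEvaluate(function (sel) { return typeof sel; }, selector),
  };
}

const outcome = demo();
console.log(outcome);
```

Passing values explicitly as arguments is the workaround, which is exactly the kind of explaining that a Node.js user would not expect to need.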

In other words, the technical side of the interop needs to be
solved first. Otherwise, the final outcome is still open-ended in
nature.

Unfortunately, I have no definitive answer on this matter. Other
thoughts and feedback?


Regards,


--
Ariya Hidayat, http://ariya.ofilabs.com
http://twitter.com/ariyahidayat
http://gplus.to/ariyahidayat

Ariya Hidayat

Jan 3, 2013, 3:18:19 AM
to phan...@googlegroups.com
AFAICS the WebKit module of Qt 5 is still using JavaScriptCore.

James Greene

Jan 5, 2013, 12:51:26 AM
to phan...@googlegroups.com
I believe you are correct about QtWebKit continuing to use JSC.

From searching around, it appears that QtScript is remaining on JSC but QtScript itself is now a separate addon in Qt 5.0.  Its successor is QJSEngine (a thin V8 wrapper), which is used by the QQmlEngine class.


Sincerely,
    James Greene



Ariya Hidayat

Jan 26, 2013, 5:17:21 PM
to phan...@googlegroups.com
Another idea that came to my mind, in a move to reduce the
comparison/confusion with Node.js, is to push the idea of PhantomJS as
a pure "web automator". While scripting PhantomJS is possible using
JavaScript, it's mainly because JavaScript is the zeroth-class
citizen. If we show more examples in other languages, probably even in
the wiki and other documentation, then the language-agnostic image
will grow organically.

Now, of course there are still implementation details which need to be
tackled. For a start, the "JS" ending in "PhantomJS" does not really
help. In hindsight, I should have called it something more neutral.

Comments? Feedback?

James Greene

Jan 26, 2013, 8:36:06 PM
to phan...@googlegroups.com

In real life conversations, I usually refer to PhantomJS as "Phantom", just as I refer to Node.js as "Node".

I'd be fine with just calling it "Phantom" but that might totally kill its searchability results. Of course, there are always SEO workarounds and exceptions made by search engines for popular phrases, e.g. searching for "Chrome", "Opera", "Office", etc.

Perhaps "WebPhantom" or "PhantomBrowser"? Not sure I'm thrilled with either....

For the record, I'm confident in saying that the "JS" suffix did enable me to stumble upon it months earlier than I would have otherwise. I'd imagine the same is true for others. There is definitely a value there that shouldn't be overlooked.

~~James

James Greene

Apr 13, 2013, 2:23:53 PM
to phan...@googlegroups.com
Here's a crazy thought:
What about wrapping the Node.js core into a separate PhantomJS executable (like what we do with QtWebKit already today)?

If we can figure it out architecturally, this could potentially have a lot of benefits:
  • Still works cross-platform
  • Still won't require the users to have Node.js installed
  • Gives us complete parity with Node.js for the JS execution environment
  • Enables use of any Node.js userland modules (e.g. consumers could use "connect" for their web server instead of a custom Qt-wrapped version of Mongoose)
  • We could probably eliminate all of the PhantomJS-specific core/bundled modules that we have created (or will be creating) and rely on (or create) normal Node.js modules instead (e.g. "fs", "child_process", "system", "webserver", etc.)
  • Would enable us to utilize NPM for a package management system

Potential cons:
  • Our users will have to get used to a more asynchronous style of coding (or rely even more heavily on a Promise-like wrapper like CasperJS).

Thoughts?


Sincerely,
    James Greene

Bryan Bishop

Apr 13, 2013, 2:38:48 PM
to phan...@googlegroups.com, Bryan Bishop
On Sat, Apr 13, 2013 at 1:23 PM, James Greene <james.m...@gmail.com> wrote:
> What about wrapping the Node.js core into a separate PhantomJS executable
> (like what we do with QtWebKit already today)?

This sounds like node-chimera? Are there differences?

https://github.com/deanmao/node-chimera

- Bryan
http://heybryan.org/
1 512 203 0507

James Greene

Apr 13, 2013, 2:44:43 PM
to phan...@googlegroups.com
Bryan —
Yes: the difference being that node-chimera is a Node.js userland module (i.e. consumers must have Node.js core installed on their system) whereas what I'm proposing would consume Node.js core as part of a standalone PhantomJS executable (and use it as the PhantomJS outer context).

Sincerely,
    James Greene




Ariya Hidayat

Apr 13, 2013, 2:48:20 PM
to phan...@googlegroups.com
> Here's a crazy thought:
> What about wrapping the Node.js core into a separate PhantomJS executable
> (like what we do with QtWebKit already today)?

Still the same issue: until someone demonstrates a stable, non-hackist
solution to integrate both Qt and Node.js event loops, then we can
investigate going that route.

You also need to take into account the possible future of moving to
another multi-process backend (e.g. Blink). That prohibits creating a
JavaScript binding to an important object (one of those Node.js core
modules) in the _same_ process. Delegation is the name of the game
(just like V8 binding in Blink, and formerly WebKit) but this is
getting too far in the distant future.

James Greene

Apr 13, 2013, 2:59:31 PM
to phan...@googlegroups.com
Hmm... this looks somewhat immature but very interesting: https://github.com/arturadib/node-qt

Sincerely,
    James Greene




Bryan Bishop

Apr 13, 2013, 3:13:27 PM
to phan...@googlegroups.com, Bryan Bishop
On Sat, Apr 13, 2013 at 1:44 PM, James Greene wrote:
> Yes: the difference being that node-chimera is a Node.js userland module
> (i.e. consumers must have Node.js core installed on their system) whereas
> what I'm proposing would consume Node.js core as part of a standalone
> PhantomJS executable (and use it as the PhantomJS outer context).

Ariya, do you see any particular Qt issues with userland bindings to
webkit, like demonstrated in node-chimera? To me, it seems like that
approach would go a long way to satisfy nodejs users that seem to
evidently hate JavaScriptCore.

Ariya Hidayat

Apr 13, 2013, 4:03:10 PM
to phan...@googlegroups.com, Bryan Bishop
> Ariya, do you see any particular Qt issues with userland bindings to
> webkit, like demonstrated in node-chimera?

I already outlined this many times, didn't I? Someone needs to solve
the problem of (1) event loop integration (2) what happens if we use
multiprocess. And no, a node-chimera-style workaround of "let's yield
to QApplication from time to time" is not a proper, future-proof
solution in my standard.

Forgive me for being pragmatic, I also don't see the real value of
unnecessary pressure to bend PhantomJS internals to match some other
project's objectives. As I proposed many times, use PhantomJS to do
"web automation" and nothing else, nothing more. You can still do all
the glorious things in Node.js if you want; just delegate the web
automation thingie (the last mile) to PhantomJS. Heck, use Python or
Ruby or Java if you like.
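
As a rough sketch of that delegation from the Node.js side: the wrapper below launches PhantomJS as a child process and collects its standard output. It assumes a `phantomjs` binary on the PATH and a PhantomJS script that prints its result; `buildPhantomCommand`, `runPhantom`, and the `title.js` script name are hypothetical, while `child_process.spawn` is the standard Node.js API.

```javascript
const { spawn } = require('child_process');

// Build the command line for a PhantomJS run. The script path and URL are
// placeholders; PhantomJS is assumed to be installed and on the PATH.
function buildPhantomCommand(scriptPath, url) {
  return { command: 'phantomjs', args: [scriptPath, url] };
}

// Run the PhantomJS script and collect whatever it prints to stdout.
// `spawnFn` is injectable so the wrapper can be exercised without PhantomJS.
function runPhantom(scriptPath, url, spawnFn = spawn) {
  const { command, args } = buildPhantomCommand(scriptPath, url);
  return new Promise((resolve, reject) => {
    const child = spawnFn(command, args);
    let output = '';
    child.stdout.on('data', (chunk) => { output += chunk; });
    child.on('error', reject); // e.g. the binary could not be started
    child.on('close', (code) => {
      if (code === 0) resolve(output.trim());
      else reject(new Error('phantomjs exited with code ' + code));
    });
  });
}

// Usage (needs a real PhantomJS install and a script that prints its result):
// runPhantom('title.js', 'http://example.com/').then(console.log, console.error);
```

The same shape works from Python, Ruby, or Java: start the process, pass it arguments, read what it prints.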

I see Poltergeist as a successful model: nobody ever demanded Ruby FFI
to PhantomJS and yet plenty of Capybara users are already happy with
Poltergeist. Could the bridge be improved? Certainly. Does it matter
much? Probably not (as long as the users are happy).

I believe limiting the project scope as just "headless web automator"
is still sensible, without limiting its broad usefulness. It's not a
generic, full-blown JavaScript environment. Do your heavy lifting in
your favorite scripting language first.

At the end of the day, it's a matter of time and effort. We have
barely enough resources to sustain the current development, with the
glaring potholes here and there (outdated WebKit, flaky testing,
insufficient documentation). If someone wants to dedicate his time to
research the next-generation awesome integration with any project,
knock yourself out (it's a FOSS project after all). I may bet on the
wrong horse but I'm confident that fixing the above mentioned potholes
would give a much bigger impact to the users.


Thanks!

Regards,

Bryan Bishop

Apr 13, 2013, 4:22:37 PM
to Ariya Hidayat, Bryan Bishop, phan...@googlegroups.com
On Sat, Apr 13, 2013 at 3:03 PM, Ariya Hidayat <ariya....@gmail.com> wrote:
> I already outlined this many times, didn't I? Someone needs to solve

I have seen you outline concerns with integrating Qt and Nodejs into
the same binary, but not with Nodejs as a parent. I also misunderstood
whether or not these problems would persist in both scenarios.
However, I appreciate you taking the time to clarify that this
architectural concern pervades both scenarios. Thank you.

> the problem of (1) event loop integration (2) what happens if we use
> multiprocess. And no, a node-chimera-style workaround of "let's yield
> to QApplication from time to time" is not a proper, future-proof
> solution in my standard.

Hmm, so the only code that I see that looks like an occasional yield
is commented out in this file:

https://github.com/deanmao/node-chimera/blob/master/src/chimera.cc

// void Chimera::sleep(int ms)
// {
// QTime startTime = QTime::currentTime();
// while (true) {
// QApplication::processEvents(QEventLoop::AllEvents, 25);
// if (startTime.msecsTo(QTime::currentTime()) > ms)
// break;
// }
// }

Maybe I am missing where this is happening?

Also, if there is occasional yielding somewhere that I can't see, is
part of the problem that timeouts would no longer be scheduled
correctly?

> Forgive me for being pragmatic, I also don't see the real value of
> unnecessary pressure to bend PhantomJS internals to match some other
> project's objectives. As I proposed many times, use PhantomJS to do

I agree with you. Really. I am also content with requirejs/browserify
with PhantomJS for managing my outer context source code. But, I
propose that the main theoretical value with investigating alternative
architectures is to stop the constant complaints from users.

> I believe limiting the project scope as just "headless web automator"
> is still sensible, without limiting its broad usefulness. It's not a
> generic, full-blown JavaScript environment. Do your heavy lifting in
> your favorite scripting language first.

I agree. The way that I use PhantomJS, I basically ignore the majority
of the modules other than WebPage, and just XHR data around to other
tiny web services that are running inside my network. As it happens,
PhantomJS is very very good at XHR and WebPage, so that's what I rely
on most. Everything else I squirrel out of PhantomJS' process as
quickly as possible.

> At the end of the day, it's a matter of time and effort. We have
> barely enough resources to sustain the current development, with the

Please understand that I never claimed otherwise. :-(

> knock yourself out (it's a FOSS project after all). I may bet on the
> wrong horse but I'm confident that fixing the above mentioned potholes
> would give a much bigger impact to the users.

I also never claimed that you were "betting on the wrong horse". I was
merely saying that I would appreciate a clarification of why a
particular architecture would be broken. As you can see, I also quoted
some source code demonstrating my confusion. This has nothing to do
with questioning your technical foresight.

Ariya Hidayat

Apr 13, 2013, 6:40:17 PM
to Bryan Bishop, phan...@googlegroups.com
> Hmm, so the only code that I see that looks like an occasional yield
> is commented out in this file:

Not that one, but the one in main.js:

setInterval(webkit.processEvents, 50);

> Also, if there is occasional yielding somewhere that I can't see, is
> part of the problem that timeouts would no longer be scheduled
> correctly?

It's been a while since I investigated the internals of QtWebKit with
respect to the running event loop. But basically the above approach is
akin to co-operative multitasking. Some work is necessary to verify
that neither the Qt(WebKit) side nor any foreign environment will be
impacted. For example, if I create SuperAwesomeTool which is
integrated by someone else with some other project, I may not know
about this ahead of time and I may put some assumption inside
SuperAwesomeTool which will break under a different situation.
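
To illustrate what the co-operative approach means for timing, here is a rough simulation with a manual clock. A host loop pumps a foreign event queue at a fixed interval, the way a periodic `processEvents` call would, so an event is only serviced on the next tick after it arrives. `simulatePolling` and the arrival times are hypothetical, and nothing here is modeled on node-chimera's internals beyond that one `setInterval` line.

```javascript
// Simulation with a manual clock: a host loop pumps a foreign event queue
// every `intervalMs`, the way a periodic processEvents() call would.
function simulatePolling(intervalMs, arrivals, untilMs) {
  const queue = [];   // events the foreign side has produced but not handled
  const handled = []; // one record per event: arrivedAt, handledAt, latency
  const sorted = arrivals.slice().sort((a, b) => a - b);
  let next = 0;       // index of the next arrival to enqueue

  for (let now = 0; now <= untilMs; now += intervalMs) {
    // Everything that arrived since the previous tick has been sitting idle.
    while (next < sorted.length && sorted[next] <= now) {
      queue.push(sorted[next++]);
    }
    // The "processEvents" step: drain the queue in one go.
    while (queue.length > 0) {
      const arrivedAt = queue.shift();
      handled.push({ arrivedAt, handledAt: now, latency: now - arrivedAt });
    }
  }
  return handled;
}

// Events arriving at 10, 49, 50 and 120 ms, pumped every 50 ms.
const report = simulatePolling(50, [10, 49, 50, 120], 200);
console.log(report);
```

Each event waits for the next tick, so its delay depends on when it happens to arrive relative to the pump. Code on either side that assumes prompt delivery can break in ways its author never anticipated.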

> I agree with you. Really. I am also content with requirejs/browserify
> with PhantomJS for managing my outer context source code. But, I
> propose that the main theoretical value with investigating alternative
> architectures is to stop the constant complaints from users.

I agree with that. It's just that I've also learned that no matter what
you do, there will always be complaints. I don't think we should take
every complaint at face value.

> I agree. The way that I use PhantomJS, I basically ignore the majority
> of the modules other than WebPage, and just XHR data around to other
> tiny web services that are running inside my network. As it happens,
> PhantomJS is very very good at XHR and WebPage, so that's what I rely
> on most. Everything else I squirrel out of PhantomJS' process as
> quickly as possible.

And this is the use case we need to promote more. Maybe for 2.0 we
need to revise the landing page and its related documentation to show
something like http://docs.seleniumhq.org/docs/. You can immediately
choose the programming language you want to use and the example code
is adjusted right away. I believe this is what the user really wants.
I'm fairly confident that typical Joe Sixpack won't care much whether
we solve the problem via an elegant event loop integration or not. If
the instructions (tweaked to his/her language of choice) are clear and
easy to follow, it is convincing.

> I also never claimed that you were "betting on the wrong horse". I was
> merely saying that I would appreciate a clarification of why a
> particular architecture would be broken. As you can see, I also quoted
> some source code demonstrating my confusion. This has nothing to do
> with questioning your technical foresight.

Au contraire, I definitely encourage people to experiment with
different approaches so that my limited (and likely outdated)
understanding does not become a showstopper. What we need to avoid is
getting trapped in a local maximum, e.g. where I play the Armchair
Software Architect all day long and therefore nothing gets anywhere.



Thank you!

Bryan Bishop

Apr 13, 2013, 7:42:45 PM
to Ariya Hidayat, Bryan Bishop, phan...@googlegroups.com
On Sat, Apr 13, 2013 at 5:40 PM, Ariya Hidayat wrote:
>> Hmm, so the only code that I see that looks like an occasional yield
>> is commented out in this file:
>
> Not that one, but the one in main.js:
>
> setInterval(webkit.processEvents, 50);

Wow, yeah I definitely missed that. That makes the situation with
node-chimera much more understandable. Now, at the risk of beating a
dead horse, I am curious what architectural issues you see with
node-webkit?

You mentioned node-webkit earlier in this thread. They combined the
two event loops (although their documentation makes it sound like it
was only for events related to rendering, or something) from Chromium
and Nodejs.

Your January complaint about node-webkit seems to be about the
presence of multiple javascript engines that might confuse a
programmer? You mentioned evaluate, but I don't know if that was
PhantomJS evaluate, or the potential evaluate hacks that would have to
be written for node-webkit, or both, etc..

Am I understanding the issues right?

Here are some words they wrote about their event loop work:

https://github.com/rogerwang/node-webkit/wiki/How-node.js-is-integrated-with-chromium

"Both node.js and Chromium have their main loops. So it would take
some efforts to make it run in Chromium. One of the founding feature
of node-webkit is to call Node functions directly from DOM, so we
integrate them into the same thread. That requires the integration of
the main loop of Node and the one from Chromium Render process."

"In order to make the objects from Node and DOM to refer to each
other, Node is made to use the same V8 engine instance as the one in
Chromium. The objects from the 2 worlds are in 2 contexts
respectively, to keep their namespace clean."

"Chromium internally use class MessageLoop and MessagePump to support
its events loop, since node uses libuv to support its events loop, we
need to implement a new MessagePump which uses libuv as its underlying
events library. The new type of Message Pump is used only in the
render process, where the WebKit engine resides."

(Honestly, I thought one of these used libev, but I guess not. Huh.)

P.S. Sorry if this sounds like beating a dead horse.

Ariya Hidayat

Apr 13, 2013, 11:12:26 PM
to Bryan Bishop, phan...@googlegroups.com
> Wow, yeah I definitely missed that. That makes the situation with
> node-chimera much more understandable. Now, at the risk of beating a
> dead horse, I am curious what architectural issues you see with
> node-webkit?
> You mentioned node-webkit earlier in this thread. They combined the
> two event loops (although their documentation makes it sound like it
> was only for events related to rendering, or something) from Chromium
> and Nodejs.

I haven't spent quality time with node-webkit to form a definitive
conclusion on such an integration approach. But for sure, whatever is
being done there can't just be applied to PhantomJS.

> Your January complaint about node-webkit seems to be about the
> presence of multiple javascript engines that might confuse a
> programmer? You mentioned evaluate, but I don't know if that was
> PhantomJS evaluate, or the potential evaluate hacks that would have to
> be written for node-webkit, or both, etc..

I don't think having two JavaScript engines helps anyone at all. Even
if there is only a subtle difference, someone will get tripped up and
people will start to make a storm in a teacup. We definitely want to
avoid the "where is my Function.prototype.bind" drama again.
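
For context, the kind of subtle difference at stake can be sketched as follows, assuming a script that lands on an engine without a native `Function.prototype.bind`. `simpleBind` is a hypothetical, simplified stand-in (it ignores use with `new`), written in ES5 style so it would also run on an older engine.

```javascript
// Simplified stand-in for Function.prototype.bind (ignores use with `new`).
function simpleBind(fn, thisArg) {
  // Arguments supplied at bind time, after `fn` and `thisArg`.
  var preset = Array.prototype.slice.call(arguments, 2);
  return function () {
    // Arguments supplied at call time are appended to the preset ones.
    var later = Array.prototype.slice.call(arguments);
    return fn.apply(thisArg, preset.concat(later));
  };
}

// Feature-detect instead of assuming every engine ships bind.
var hasNativeBind = typeof Function.prototype.bind === 'function';

var counter = {
  count: 41,
  increment: function (step) { this.count += step; return this.count; }
};

// Without binding, `increment` would lose its `this` when passed around.
var bump = simpleBind(counter.increment, counter, 1);
var bumped = bump();
console.log(hasNativeBind, bumped);
```

Code that works unmodified under one engine and fails under the other is exactly what trips people up, however small the gap.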

> P.S. Sorry if this sounds like beating a dead horse.

This is far from being a dead horse. It is more like "we know very
little about this and can't make a good judgement". If we keep
reviving the same subject every 3 months and still nobody has done an
intensive, CSI-grade analysis of the subject, we will inch slowly
toward the well-known Maserati problem
(http://www.quora.com/Startups/Whats-a-Maserati-Problem).