Performing a series of page interactions

Brian Theado

Jun 26, 2011, 5:58:02 PM

to phan...@googlegroups.com

I have written some code to make it easier to perform multi-page
interactions. For instance, you might want to visit a login page, fill
out a form, click on a link on the resulting page, and then
perform some other action on the third page, etc.

It still seems a little clumsy, but I'm pretty new to javascript and
maybe there are some easy improvements. Maybe using the coffeescript
syntax would be an improvement (once I've learned coffeescript).

My code includes an 'interact' function which takes a WebPage object
and two function callbacks as input. The first callback contains the
code which will cause the page to reload (clicking a link, submitting
a form, etc.). The second callback contains the code which will be
called when the page load completes. So in order to navigate to a
second page you nest another 'interact' call inside the second
callback of the first 'interact' call. To navigate to a third page,
nest another interact call inside the second interact call, etc.

interact(page,
    function() { /* open, click, or form submit here */ },
    function() { /* Nested call to interact here */ })

Here is an example interaction. It visits the phantomjs google-code
issue list page, clicks on the details link for the last issue on the
page, then displays the date/time of the last comment on the page (or
issue creation if there are no comments):

<code>
phantom.injectJs("interact.js");
var page = new WebPage();
interact(page, function() {
    // Open the issues list page
    this.open("http://code.google.com/p/phantomjs/issues/list");
}, function() {
    interact(this, function() {
        // Click on the details link for the last issue on the page
        // Would be nice to be able to write:
        //   click (this, '#resultstable > tbody > tr:nth-last-of-type(1) > td.id > a')
        // but I didn't figure out how
        this.evaluate(function() {
            simulateMouseClick('#resultstable > tbody > tr:nth-last-of-type(1) > td.id > a');
        });
    }, function() {
        var last_update = this.evaluate(function() {
            // Creation date or date of most recent comment
            var n = document.querySelectorAll('.author > .date, .issuecomment > .date');
            return n.item(n.length - 1).getAttribute('title');
        });
        console.log("Last updated " + last_update);
        phantom.exit();
    });
});
</code>

<code name='interact.js'>
var pageNum = 1;
function interact(page, causes_load_callback, onload_callback) {
    console.log("Setting onLoadFinished");
    page.onLoadFinished = function(status) {
        // While developing a script, it is difficult to predict which
        // interactions QtWebkit will consider a page reload, so set a
        // temporary callback to notify the script writer that an
        // unexpected page reload took place
        page.onLoadFinished = function(status) {
            console.log(pageNum + " Unexpected page load: " + status + " - " +
                page.evaluate(function() {
                    return document.title + " - " + document.location.href;
                })
            );
            page.render("/tmp/unexpected-" + pageNum + ".png");
            phantom.exit();
        };

        // For debugging, log the results and render the page to a temp directory
        console.log(status + ":" + pageNum + " Page load complete - " +
            page.evaluate(function() {
                return document.title + " - " + document.location.href;
            })
        );
        page.render("/tmp/" + pageNum + ".png");
        pageNum++;

        // Allow elements to be clicked and call the page load callback
        page.injectJs("simclick.js");
        console.log("about to apply callback");
        onload_callback.apply(page);
    };

    // Call the callback which will run code that will result in page reload
    causes_load_callback.apply(page);
    console.log("Return from interact");
    return;
}
</code>

<code name='simclick.js'>
// code from http://code.google.com/p/phantomjs/issues/detail?id=47
function simulateMouseClick(selector) {
    var targets = document.querySelectorAll(selector),
        evt = document.createEvent('MouseEvents'),
        i, len;
    evt.initMouseEvent("click", true, true, window, 0, 0, 0, 0, 0,
                       false, false, false, false, 0, null);

    for (i = 0, len = targets.length; i < len; ++i) {
        targets[i].dispatchEvent(evt);
    }
}
</code>
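
The 'click (this, selector)' helper wished for in the example comments could be sketched roughly as follows. This is only a sketch, assuming page.evaluate forwards extra arguments to the function it runs inside the page (later PhantomJS releases do this; the version discussed in this thread may not). The 'click' name and the stub page object are hypothetical and exist only so the snippet can run outside PhantomJS.

```javascript
// Hypothetical helper: click every element matching `selector` on `page`.
// Assumes simclick.js has already been injected into the page and that
// page.evaluate(fn, arg) passes `arg` through to `fn`.
function click(page, selector) {
    page.evaluate(function (sel) {
        simulateMouseClick(sel);
    }, selector);
}

// Stand-ins so the sketch runs outside PhantomJS (not part of any real API).
var clicked = [];
function simulateMouseClick(sel) { clicked.push(sel); } // stub for simclick.js
var stubPage = {
    evaluate: function (fn) {
        var args = Array.prototype.slice.call(arguments, 1);
        return fn.apply(null, args);
    }
};

click(stubPage, "#resultstable td.id > a");
console.log(clicked);
```

With a real WebPage object, the call inside a step would read click(this, '...') in place of the longer this.evaluate(...) wrapper.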

If I could figure out how to add code to phantomjs to download a file
(http://code.google.com/p/phantomjs/issues/detail?id=52), then most of
my automation needs would be resolved.

Brian

Ariya Hidayat

Jun 29, 2011, 11:08:55 AM

to phan...@googlegroups.com

Thank you Brian for the nice use cases of PhantomJS!

Looking at your code, I believe we should do
something to help you (and others with similar goals) produce
much more readable and debuggable scripts.

I was reluctant to go the synchronous way. I even went as far as
removing the semi-blocking sleep() from 1.2. I was under the
impression that using a JavaScript microlibrary which allows writing
sequential statements would be good enough.

However, based on feedback from other experts in web testing
tools, I realize this may have been a mistake on my side. Maybe we need
a synchronous mode after all, considering that interactive testing is
usually procedural and step-by-step. Consider the following
(pseudocode):

open a login page
set the user name
set the wrong password intentionally
click the login button
wait till the page reloads
verify that login is not possible

Having the sync mode built-in, with specified time-out, will allow the
test script to run linearly. Once we have remote debugging capability,
it's supereasy to trace and debug.

Being able to do certain operations both async and sync can lead to
confusion. However, my understanding is that people will use only one mode
most of the time. For running unit tests headlessly, you don't really
care about sync mode. For interactive/user-emulation tests, you don't
really care about async.

What do you guys think?

--
Ariya Hidayat
http://www.linkedin.com/in/ariyahidayat

Ivan De Marino

Jun 29, 2011, 12:16:12 PM

to phan...@googlegroups.com

Javascript doesn't have a sleep() method.

Why should Phantom?

There are way better ways in Javascript to do this, and it's event-driven.

And in case you can't do it event-driven, something like the "waitFor" function (the one in "examples/") works well too.

Actually, I'd go to the extent of saying that "waitFor" is good also because it's the closest "emulation" of user behaviour.

If you are doing "User-simulated testing", you want to "monitor for stuff to happen, then do". It's what the user does after all.
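
A polling helper in the spirit of "waitFor" might look like the sketch below. It is not the code from "examples/"; the parameter list here (testFx, onReady, onTimeout, timeoutMs, intervalMs) is an assumption made for illustration, and the condition being polled is a stand-in counter.

```javascript
// Poll `testFx` until it returns true, then call `onReady`.
// Calls `onTimeout` if the condition is still false after `timeoutMs`.
function waitFor(testFx, onReady, onTimeout, timeoutMs, intervalMs) {
    var start = Date.now();
    var timer = setInterval(function () {
        if (testFx()) {
            clearInterval(timer);
            onReady();
        } else if (Date.now() - start >= timeoutMs) {
            clearInterval(timer);
            onTimeout();
        }
    }, intervalMs);
    return timer;
}

// Usage with a stand-in condition that becomes true on the third poll.
var polls = 0;
var ready = false;
waitFor(
    function () { polls += 1; return polls >= 3; },
    function () { ready = true; console.log("condition met after " + polls + " polls"); },
    function () { console.log("timed out"); },
    1000,  // give up after one second
    10     // poll every 10 ms
);
```

In a real script the test function would typically call page.evaluate to check for an element or a piece of text.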

I say "no synch".

IMHO.

:)

--

Ivan De Marino

Front-End Developer @ Betfair

email: ivan.de...@gmail.com | detron...@gmail.com | ivan.d...@betfair.com

web: blog.ivandemarino.me | www.linkedin.com/in/ivandemarino | twitter.com/detronizator

mobile: +44 (0)7515 955 861

James Roe

Jun 29, 2011, 6:04:52 PM

to phantomjs

There are ways to do synchronous steps in an asynchronous environment
without losing the asynchrony.

Things like:
Steps(function(err, next) {
    do_something()
    next(err)
}
, function () {}
, ...)

When we get our module system up, we could just make it an optional
thing to use, and it could be require'd. :) This is what I think is
optimal.
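
A runner of that shape can be sketched in a few lines. The 'runSteps' name and its behaviour (stop at the first error, then call a final callback) are assumptions for illustration, not the API of Step or any other existing library.

```javascript
// Hypothetical runner: call each step in order, handing it a `next`
// function. A step calls next(err) when it is done; a truthy err stops
// the chain and is reported to `done`.
function runSteps(steps, done) {
    var index = 0;
    function next(err) {
        if (err || index >= steps.length) {
            done(err || null);
            return;
        }
        var step = steps[index];
        index += 1;
        step(null, next);
    }
    next(null);
}

// Usage: three stand-in steps that only record that they ran.
var log = [];
runSteps([
    function (err, next) { log.push("open login page"); next(); },
    function (err, next) { log.push("submit form"); next(); },
    function (err, next) { log.push("check result"); next(); }
], function (err) {
    log.push(err ? "failed: " + err : "finished");
});
console.log(log.join(" -> "));
```

A step that does asynchronous work would simply hold on to next and call it from its own callback, for example from a page load handler.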

Ariya Hidayat

Jun 30, 2011, 10:20:08 AM

to phan...@googlegroups.com

I'm not particularly worried about which structure is better or
whether there is an alternative wrapper for that.

The open question is how to make such procedural
tests easy to write and easy to debug, especially when the test is
big and complicated.

For running unit-tests headless, what we have is already more than
enough. For creating smoke tests and user simulations, I just believe
it could be significantly improved.

Regards,

Ariya

Brian Theado

Jul 1, 2011, 10:27:53 PM

to phan...@googlegroups.com

On Wed, Jun 29, 2011 at 11:08 AM, Ariya Hidayat <ariya....@gmail.com> wrote:
[...]

> Looking at your code, I believe we should do
> something to help you (and others with similar goals) produce
> much more readable and debuggable scripts.

[...]

Sure, I wouldn't object to that. Long-term, that sounds useful.

Brian

Brian Theado

Jul 1, 2011, 10:36:33 PM

to phan...@googlegroups.com

On Wed, Jun 29, 2011 at 6:04 PM, James Roe <roeja...@hotmail.com> wrote:
> There are ways to do synchronous steps in an asynchronous environment
> without losing the asynchrony.
>
> Things like:
> Steps(function(err, next) {
>     do_something()
>     next(err)
> }
> , function () {}
> , ...)

Yeah, I think the recursive nesting of my original approach is much
more clumsy looking than it need be. I have made a change to allow
it to be linear. Here is the revised code:

<code>
phantom.injectJs("interact2.js");
var page = new WebPage();

var steps = [
    function() {
        // Open the issues list page
        this.open("http://code.google.com/p/phantomjs/issues/list");
    },
    function() {
        // Click on the details link for the last issue on the page
        // Would be nice to be able to write:
        //   click (this, '#resultstable > tbody > tr:nth-last-of-type(1) > td.id > a')
        // but I didn't figure out how
        this.evaluate(function() {
            simulateMouseClick('#resultstable > tbody > tr:nth-last-of-type(1) > td.id > a');
        });
    },
    function() {
        var last_update = this.evaluate(function() {
            // Creation date or date of most recent comment
            var n = document.querySelectorAll('.author > .date, .issuecomment > .date');
            return n.item(n.length - 1).getAttribute('title');
        });
        console.log("Last updated " + last_update);
        phantom.exit();
    }
];
interact(page, steps);
</code>

<code name='interact2.js'>
var pageNum = 1;
function interact(page, callback_list) {
    console.log("Setting onLoadFinished");
    page.onLoadFinished = function(status) {
        // For debugging, log the results and render the page to a temp directory
        console.log(status + ":" + pageNum + " Page load complete - " +
            page.evaluate(function() {
                return document.title + " - " + document.location.href;
            })
        );
        page.render("/tmp/" + pageNum + ".png");
        pageNum++;

        // Allow elements to be clicked
        page.injectJs("simclick.js");
        console.log("about to apply callback");

        // Call interact again--pass only the functions that
        // haven't been invoked yet
        interact(page, callback_list.slice(1));
    };

    // Call the first callback in the list. It should result in page
    // reload so the above onLoadFinished gets called
    callback_list[0].apply(page);

    console.log("Return from interact");
    return;
}
</code>
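
One edge case in the linear version: if a page load finishes after the last step has run, callback_list[0] is undefined and the apply call throws. A guarded variation could look like the sketch below. The 'interactGuarded' name and the 'fakePage' stand-in are hypothetical, added only so the flow can be run and inspected outside PhantomJS; logging and rendering are left out for brevity.

```javascript
// Variation on interact(): same idea, plus a guard for an exhausted list.
function interactGuarded(page, callback_list) {
    if (callback_list.length === 0) {
        // A load finished after the last step; nothing left to run.
        return;
    }
    page.onLoadFinished = function (status) {
        // Pass only the steps that haven't been invoked yet
        interactGuarded(page, callback_list.slice(1));
    };
    callback_list[0].apply(page);
}

// Stand-in page: "loading" just invokes onLoadFinished right away.
var fakePage = {
    onLoadFinished: null,
    visited: [],
    open: function (url) {
        this.visited.push(url);
        if (this.onLoadFinished) { this.onLoadFinished("success"); }
    }
};

interactGuarded(fakePage, [
    function () { this.open("page-1"); },
    function () { this.open("page-2"); },
    function () { this.open("page-3"); }  // triggers a load with no step left
]);
console.log(fakePage.visited);
```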

Brian

Peter Lyons

Jul 2, 2011, 2:39:20 PM

to phantomjs

I made my own variation of Brian's interact.js (in coffeescript). It
supports 2 styles of control flow: 1. Just a simple linear list using
an array of callbacks and 2. a dynamic linked list type approach where
the next action can be determined on the fly, which should allow
conditionals, loops, skips, etc.

Here's my <interact.coffee>
---------------------------------
pageNum = 1
window.interact = (page, actions, verbose=true) ->
  header = (status) ->
    return if not verbose
    console.log "Page #{pageNum}: #{status}. " + \
      page.evaluate ->
        document.location.href + " - " + document.title
    pageNum++

  if Array.isArray actions
    #Simple linear sequence of actions
    page.onLoadFinished = (status) ->
      header status
      #Call interact again--pass only the functions that
      #haven't been invoked yet (and keep the verbose setting)
      interact page, actions.slice(1), verbose
    #Call the first callback in the list. It should result in page
    #reload so the above onLoadFinished gets called
    actions[0](page)
  else
    #Object style interaction. Allows loops, dynamic logic, etc
    page.onLoadFinished = (status) ->
      header status
      actions.next(page, actions)
    actions.next(page, actions)
----------------------------------

Here's an example of a multi-page interaction (signing in and backing
up my workflowy.com data) using the linear style array approach:

workflowy_backup_array.coffee
----------------------------
out = (message) ->
  console.log message
phantom.injectJs 'interact.coffee'

page = new WebPage()
page.settings.loadImages = false
page.settings.loadPlugins = false
page.onConsoleMessage = (message) ->
  if message.indexOf("Unsafe JavaScript") == 0
    return
  out message

actions = [
  (page) -> page.open 'https://workflowy.com'
  (page) ->
    page.evaluate ->
      $('#id_username').val 'pe...@peterlyons.com'
      $('#id_password').val 'My super secret password'
      $('#login').submit()
  (page) -> page.evaluate -> $('#workflowy').exportIt()
  (page) ->
    page.evaluate ->
      console.log $('#exportPopup .textContainer pre').html()
    phantom.exit()
]

window.interact page, actions
-------------------------------

And here's a version using the object style.

workflowy_backup_obj.coffee
-----------------------------
phantom.injectJs 'interact.coffee'
out = (message) ->
  console.log message
page = new WebPage()
page.settings.loadImages = false
page.settings.loadPlugins = false
page.onConsoleMessage = (message) ->
  if message.indexOf("Unsafe JavaScript") == 0
    return
  out message

actions =
  open: (page, actions) ->
    actions.next = actions.signIn
    page.open 'https://workflowy.com'
  signIn: (page, actions) ->
    actions.next = actions.export
    page.evaluate ->
      $('#id_username').val 'pe...@peterlyons.com'
      $('#id_password').val 'My super secret password'
      $('#login').submit()
  export: (page, actions) ->
    actions.next = actions.print
    page.evaluate -> $('#workflowy').exportIt()
  print: (page) ->
    page.evaluate ->
      console.log $('#exportPopup .textContainer pre').html()
    phantom.exit()

actions.next = actions.open
interact page, actions
----------------------------
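
The object style is what makes loops possible, since each action decides what 'next' points to. Here is a rough JavaScript sketch of a loop over several pages, using the same contract as the object branch of interact.coffee (actions.next is called after every load). The 'interactObj' name, the URL list, and the 'fakePage' stand-in are assumptions made so the flow can be run outside PhantomJS.

```javascript
// Object-style interaction: after every page load, call actions.next.
function interactObj(page, actions) {
    page.onLoadFinished = function (status) {
        actions.next(page, actions);
    };
    actions.next(page, actions);
}

var pagesToVisit = ["list?start=0", "list?start=100", "list?start=200"];
var seen = [];

var actions = {
    visit: function (page, actions) {
        // Loop: keep pointing `next` at this action while URLs remain,
        // then hand over to `finish`.
        var url = pagesToVisit.shift();
        actions.next = pagesToVisit.length > 0 ? actions.visit : actions.finish;
        page.open(url);
    },
    finish: function (page) {
        seen.push("done");
    }
};

// Stand-in page: opening a URL records it and fires the load handler.
var fakePage = {
    onLoadFinished: null,
    open: function (url) {
        seen.push(url);
        this.onLoadFinished("success");
    }
};

actions.next = actions.visit;
interactObj(fakePage, actions);
console.log(seen);
```

A conditional works the same way: an action inspects the page (for example with page.evaluate) and sets actions.next to one of several follow-up actions.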

Please let me know if you find that useful or can offer other
approaches to this type of use case.

Brian Theado

Jul 3, 2011, 3:01:50 PM

to phan...@googlegroups.com

Peter,

On Sat, Jul 2, 2011 at 2:39 PM, Peter Lyons <pe...@peterlyons.com> wrote:
> I made my own variation of Brian's interact.js (in coffeescript).

[...]

> Please let me know if you find that useful or can offer other
> approaches to this type of use case.

I like it and have switched the code I was writing to use it. Thanks
for sharing.

I'm new to javascript/dom coding...could you explain why you attached
the function to the window object?

> window.interact = (page, actions, verbose=true) ->

Brian

Peter Lyons

Jul 3, 2011, 11:59:10 PM

to phantomjs

> I like it and have switched the code I was writing to use it. Thanks
> for sharing.
>
> I'm new to javascript/dom coding...could you explain why you attached
> the function to the window object?
>
> > window.interact = (page, actions, verbose=true) ->

That's just a coffeescript thing. Coffeescript automatically declares
all variables with "var" so they are local. If you want to make a
global variable, you must do it explicitly like above.
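
For readers who don't use CoffeeScript, the effect can be pictured with roughly equivalent JavaScript. This is a simplified sketch of the compiler's default output, not its exact text; 'root' is a stand-in for 'window' so the snippet also runs outside a browser or PhantomJS, and the function body is elided.

```javascript
// Stand-in for `window` so the sketch also runs outside a browser.
var root = typeof window !== "undefined" ? window : globalThis;

// Simplified shape of the default compiled output for interact.coffee:
// the whole file sits inside a function, and every assigned name is
// declared with `var` at the top of that function.
(function () {
    var pageNum;
    pageNum = 1;                       // local to the wrapper, not global

    root.interact = function (page, actions, verbose) {
        if (verbose == null) { verbose = true; }   // default argument
        return pageNum;                // real body elided for brevity
    };
}).call(this);

console.log(typeof pageNum);        // the local did not leak out
console.log(typeof root.interact);  // the explicit global is visible
```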

Mark Riggins

Jul 15, 2011, 5:36:07 AM

to phan...@googlegroups.com

After looking at a few more examples of the convoluted methods necessary to write synchronous code using asynchronous APIs, it has become crystal clear that a synchronous interface is essential for clarity.

We already have a programming language here (JavaScript) with elegant flow control. Why resort to arrays of functions when we are already in a language that supports for loops, while loops, if-then-else, etc.?

Unit tests need to be READABLE above all else.

Mark Riggins

Jul 15, 2011, 5:37:47 AM

to phan...@googlegroups.com

Any idea when that might happen? I'd sure love to use it.

Ariya Hidayat

Jul 15, 2011, 11:48:03 AM

to phan...@googlegroups.com

Please star issue 157 to follow the updates.

Bartosz Nitka

Aug 4, 2011, 9:06:25 PM

to phan...@googlegroups.com

Here's how I do it:
https://github.com/niteria/phantomjs/commit/a64d0bcd535b7b54e15b1eb40f7d5d57bb6d14cf
I've found that asynchronous style forces me to write too much boilerplate and is hard to debug.
Also, calling asynchronous open multiple times on the same page leads to unpredictable results, sometimes segfaults.
It would be nice to add a timeout to the synchronous version, but so far I haven't needed it, and I think WebKit might have its own timeout.

Another solution might be to return a token from an async call that you can wait on, but if it's too general it might be too easy to create deadlocks.

Bartosz Nitka

Aug 4, 2011, 9:25:39 PM

to phan...@googlegroups.com

...And since WebKit doesn't implement tail call optimization, if you force async code to be sequential (with a framework like Step) you'll eventually run out of stack frames.

Ariya Hidayat

Aug 5, 2011, 3:50:03 PM

to phan...@googlegroups.com

I think I might go the route where a certain mode is set, async or
sync. IMO this is better than an alternate function name for each
operation. In the context of testing using PhantomJS, I can't imagine
a situation where someone would heavily mix sync and async
operations anyway.

Thoughts?

--
Ariya Hidayat
http://www.google.com/search?q=ariya+hidayat

Bartosz Nitka

Aug 6, 2011, 12:57:19 AM

to phantomjs

Personally I have no use for async. I left it in my fork just to be
compatible with phantomjs.
Async gives the promise of parallelism (or at least non-blocking IO),
but the way things are now you have to use frameworks like Step or
flow-js to use it (because if you want to call phantom.exit at the end
you have to wait for the async calls), and it gets messy pretty quickly
(maybe with CoffeeScript it doesn't, I haven't tried it yet; the main
problem is a lot of anonymous function noise).

On an unrelated note, you can do a poor man's tail call optimization
with setTimeout and patch your favorite framework with it.
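
The setTimeout trick can be illustrated with a small sketch. The 'runAll' runner below is hypothetical (it is not taken from Step or flow-js); it counts how deeply step calls are nested when each hand-off is a direct call versus when it goes through setTimeout.

```javascript
// Run steps one after another. When `defer` is true, each hand-off goes
// through setTimeout, so the previous step's stack frames are released
// before the next step starts. `done` receives the deepest nesting seen.
function runAll(steps, defer, done) {
    var depth = 0, maxDepth = 0;
    function run(i) {
        if (i >= steps.length) { done(maxDepth); return; }
        depth += 1;
        if (depth > maxDepth) { maxDepth = depth; }
        steps[i](function next() {
            if (defer) {
                setTimeout(function () { run(i + 1); }, 0);
            } else {
                run(i + 1);
            }
        });
        depth -= 1;
    }
    run(0);
}

// Fifty trivial steps that immediately hand over to the next one.
var steps = [];
for (var k = 0; k < 50; k += 1) {
    steps.push(function (next) { next(); });
}

var results = {};
runAll(steps, false, function (d) { results.direct = d; });
runAll(steps, true, function (d) {
    results.deferred = d;
    console.log(results);
});
```

With direct calls the nesting grows with the number of steps; with deferred calls it stays at one, at the cost of a timer tick per step.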

There's also another problem with parallelism and phantomjs. In my
use case I wanted a different set of cookies/cache for each WebPage
object, and currently all WebPage objects share the same NetworkManager,
which remembers cookies/cache. So if you wanted to take a screenshot of
each logged-in user's page in parallel, there's no way to do it now
anyway. I run everything sequentially, so I just implemented clear(),
which replaces the NetworkManager with a new one.

The only use case for async in phantomjs I can think of is doing
something in parallel to a set of completely unrelated pages, but if
they're unrelated you can just run multiple phantomjs processes.

I'd love to hear about other async + phantomjs use cases.

Béla Juhász

Aug 7, 2011, 5:07:04 PM

to phan...@googlegroups.com

Hi!

This weekend I started something: https://github.com/bacey/phasmine

It's Jasmine (the BDD-thing) integrated with PhantomJS, so I can test page interactions.

It's heavily based on Peter's and Brian's interact.coffee script (huge thanks to you both, and I hope it's OK that I use your code).

I haven't tested it thoroughly (you know: It Works For Me), nor used it in a real project, but I thought I'd share it with you.