Batching Advanced User Interactions


Jim Evans

Dec 26, 2012, 11:44:54 AM
to selenium-...@googlegroups.com
Over the long weekend, I started looking at stabilizing user interactions in the IE driver using native events. Even today, this is still problematic when automating things like hovering over elements. The biggest problem in using native events on Windows is that to do them properly, you need to use the Win32 SendInput API[1][2][3], which is not the method we use. Instead, we use SendMessage, which can cause a race condition, even with persistent hover turned on. The downside to using the SendInput API is that it hooks into the input event queue of the system at a very low level, which means the window you want to send the input to must be the window with the system focus, and requiring focus is something that's disapproved of in WebDriver[4].

People have repeatedly said to me, "I don't care if I have to let the browser window have the system focus, I *need* my user interactions to be accurate, and I'll sacrifice screen focus for that accuracy." With that in mind, I thought I had a design for solving the problem once and for all in the IE driver, letting the user specify that they wanted WebDriver to manage system window focus for them (toggled with a capability, of course). The basic design was to use a mutex[5] for each user interaction (keystroke, mouse operation, touch event), and force the IE window on which we're operating to the foreground before executing the user interaction. Here's where I ran into the problem. The JSON wire protocol[6] sends each discrete event in a separate HTTP request. Even though most language bindings use an "action sequence builder" class (Actions in Java and .NET, ActionBuilder in Ruby, and action_chains in Python) to give the illusion of atomicity, in reality there's a chance between any two actions in the sequence for a second instance to attempt to execute a command, stealing focus from the IE window we really want. And there's no sane way to stretch the mutex across RPC calls.
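To make the interleaving concrete, here is a toy model in Python. It is only a sketch and does not reflect how the IE driver is implemented: the two "sessions" A and B, the worst-case alternating schedule, and the function names are all assumptions made for illustration. It counts how often the foreground window would have to change when the lock is held per action versus per whole sequence.

```python
from itertools import zip_longest


def per_action_dispatch(seq_a, seq_b):
    """Model one request per action: the lock is released after every
    action, so the server is free to alternate between the two sessions.
    (Assumed worst-case schedule, not measured behavior.)"""
    log = []
    for a, b in zip_longest(seq_a, seq_b):
        if a is not None:
            log.append(("A", a))
        if b is not None:
            log.append(("B", b))
    return log


def per_sequence_dispatch(seq_a, seq_b):
    """Model one request per sequence: the lock is held until the whole
    sequence has run, so the sessions cannot interleave."""
    return [("A", a) for a in seq_a] + [("B", b) for b in seq_b]


def focus_switches(log):
    """Count how many times the foreground window changes hands."""
    return sum(1 for prev, cur in zip(log, log[1:]) if prev[0] != cur[0])


session_a = ["moveto", "click", "keys"]
session_b = ["moveto", "click"]

interleaved = focus_switches(per_action_dispatch(session_a, session_b))
atomic = focus_switches(per_sequence_dispatch(session_a, session_b))
print(interleaved, atomic)
```

In this model every focus switch in the middle of a sequence is a point where a hover or drag could be broken by the other session.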

To me, it makes more sense to change the protocol. We should consider replacing the myriad HTTP endpoints for the Advanced User Interactions (e.g., /session/:sessionid/moveto, /session/:sessionid/click, /session/:sessionid/doubleclick, /session/:sessionid/keys, etc.) with a single endpoint (maybe /session/:sessionid/action), with the JSON body of the request containing an array of objects representing actions, in order. The object for each individual action would contain the same parameters as the current individual protocol commands, plus an additional property ("actionType") containing the name of the action to perform ("moveto", "click", "doubleclick", "keys", etc.). This would preserve all of the functionality of the existing wire protocol, but would let the user put granularity where he or she wants it, by sending only a single JSON wire protocol request per call to the perform() method (or its equivalent).
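As a rough illustration of what such a request body might look like, here is a short Python sketch. Nothing here is an agreed format: the helper name build_action_batch and the element id "0" are placeholders invented for the example, and the parameter names (element, xoffset, yoffset, button, value) are simply the ones the existing moveto, click, and keys commands take.

```python
import json


def build_action_batch(actions):
    """Turn (action_type, params) pairs into the ordered array of action
    objects described above. Hypothetical helper, for illustration only."""
    batch = []
    for action_type, params in actions:
        entry = dict(params)               # same parameters as the existing command
        entry["actionType"] = action_type  # the one additional property
        batch.append(entry)
    return batch


# One call to perform() would collect a sequence like this...
sequence = [
    ("moveto", {"element": "0", "xoffset": 5, "yoffset": 5}),  # "0" is a placeholder id
    ("click", {"button": 0}),
    ("keys", {"value": ["h", "i"]}),
]

# ...and send it as the body of a single POST to /session/:sessionid/action.
batch = build_action_batch(sequence)
body = json.dumps(batch, indent=2)
print(body)
```

Because the whole sequence arrives in one request, a server implementation could acquire its focus mutex once, run every action in order, and release it afterwards.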

I know there have been discussions about batching JSON wire protocol commands to send several to a server implementation at once. That's a different discussion than the one I want to have here. In this case, I only want to modify the Advanced User Interactions portions of the protocol to match what the language bindings already attempt to enforce.

--Jim

[1] http://msdn.microsoft.com/en-us/library/windows/desktop/ms646310%28v=vs.85%29.aspx
[2] http://blogs.msdn.com/b/oldnewthing/archive/2005/05/30/423202.aspx
[3] http://blogs.msdn.com/b/oldnewthing/archive/2010/12/21/10107494.aspx
[4] http://www.w3.org/TR/webdriver/#running-without-window-focus
[5] http://msdn.microsoft.com/en-us/library/windows/desktop/ms684266%28v=vs.85%29.aspx
[6] http://code.google.com/p/selenium/wiki/JsonWireProtocol

Jason Leyba

Dec 26, 2012, 12:03:38 PM
to selenium-...@googlegroups.com
When we first started brainstorming the interactions API (after GTAC 2009?), we talked about sending them over in batches - we just never followed through on it. So, I wholeheartedly support this idea. +1

-- Jason




Daniel Wagner-Hall

Dec 26, 2012, 12:17:27 PM
to selenium-developers
+1



Leah Klearman

Dec 26, 2012, 1:21:32 PM
to selenium-...@googlegroups.com
+1



Simon Stewart

Jan 1, 2013, 4:49:02 PM
to selenium-...@googlegroups.com
I've slept on this a bit. I say a reserved +1, with the caveat that if we decide to implement batching, we revisit this endpoint.

Simon

