.Net - any plans for async/await?


Chris Moschini

Dec 11, 2019, 9:26:01 AM
to Selenium Developers
The dotnet portion of the Selenium src is synchronous, even though .Net provides very simple structures for asynchrony - async/await with Tasks - and even though nearly any use of Selenium involves a lot of waiting. This means the threads calling Selenium spend most of their time blocked, even though they could be freed up during both the short waits on an HTTP request to a Dev API and the long waits while a large page takes 3 seconds to load.

Moving the .Net part of the code to async/await should be reasonably straightforward. Before I go forking it to eliminate this problem in my own usage of Selenium, are there any plans here to move the .Net side to async/await? Are there reasons it's been set aside or postponed? Is it waiting for a major version change? If I do fork it, is there a way to do my work so that I can contribute it back when Selenium .Net does move to async/await?

I did search in advance, and I mostly found an unrelated discussion about branding.

Thanks!

Paul Hammant

Dec 11, 2019, 9:33:50 AM
to selenium-developers
I've controlled hundreds of WebDriver instances (cloud) from one JVM in a multi-threaded situation, and it whizzed along. This was 10 years ago (Se 2.0). These were functional tests towards an overall pass/fail for a CI job. Many hours of tests, if done serially, were brought to a conclusion in a few minutes. Of course, that moved my bottleneck: the web server being stood up would not handle the load, and the hops to and from browsers in the cloud were slower than same-machine or otherwise "closer" ones. These days I'd spawn separate jobs into cloudland to reach hundreds of parallel browser tests. Say Jenkins/Mesos as the runner, and a target web server per node.

What sort of Selenium-requiring workload are you doing, that needs to push into async/await?




Jim Evans

Dec 11, 2019, 6:18:09 PM
to Selenium Developers
Until the currently-under-development 4.0 version, the .NET bindings had to support as far back as .NET 3.5, which did not support the Task Parallel Library constructs at all. Rather than split the API in a way that would confuse users about who could and couldn’t use which parts of it, we did not add async methods. Even today, I’m hesitant to do so, since it would amount to an entirely duplicated API. I’m also not convinced that the benefits are there for this particular use case.

Chris Moschini

Dec 13, 2019, 3:11:58 PM
to selenium-...@googlegroups.com
On Wed, Dec 11, 2019 at 6:18 PM Jim Evans <james.h....@gmail.com> wrote:
Until the currently-under-development 4.0 version, the .NET bindings had to support as far back as .NET 3.5, which did not support the Task Parallel Library constructs at all. Rather than split the API in a way that would confuse users about who could and couldn’t use which parts of it, we did not add async methods. Even today, I’m hesitant to do so, since it would amount to an entirely duplicated API. I’m also not convinced that the benefits are there for this particular use case.

OK great - so the 4.0 version is moving to a version of .Net that supports async/await, so if I did fork this, it sounds like it would be helpful if I approached it in a way that was useful to the 4.0 version.

Moving to async doesn't require duplicating anything; instead, it should all just be async, since what it's doing is fundamentally asynchronous. In performance testing, the worst case is adding async to something that might return as fast as it would when run synchronously, and that does essentially no harm to performance because of how well .Net optimizes under the hood. In the best case, async calls that might be gone for seconds at a time avoid thread exhaustion by giving the thread back to the threadpool. So, by moving every call that makes a network call to async - which is nearly everything in Selenium - you avoid API duplication, avoid harming performance, and for many common operations (like loading a webpage), eliminate a ton of thread stranding.
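
As a rough illustration of the thread-stranding point, here is a minimal sketch. It does not use Selenium at all: `WaitDemo`, `LoadPageBlocking`, and `LoadPageAsync` are hypothetical names, and `Thread.Sleep` / `Task.Delay` merely stand in for a slow page load.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class WaitDemo
{
    // Blocking wait: the calling thread is held for the whole delay.
    public static string LoadPageBlocking(int delayMs)
    {
        Thread.Sleep(delayMs); // stand-in for a synchronous page load
        return "loaded";
    }

    // Awaited wait: no thread is held while the delay is pending.
    public static async Task<string> LoadPageAsync(int delayMs)
    {
        await Task.Delay(delayMs); // stand-in for awaiting an HTTP response
        return "loaded";
    }

    public static async Task Main()
    {
        // Start 50 simulated page loads; none of them occupies a thread
        // while it is waiting.
        var loads = new Task<string>[50];
        for (int i = 0; i < loads.Length; i++)
        {
            loads[i] = LoadPageAsync(200);
        }

        string[] results = await Task.WhenAll(loads);
        Console.WriteLine(results.Length);
    }
}
```

Running the same 50 loads through `LoadPageBlocking` on threadpool threads would hold one thread per load for the full delay, which is the cost described above.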

The argument that you could just buy more servers is not a very good argument - it's basically just saying the code can be inefficient if you spend more money. Yes, that's true, but I have to say I'm opposed to making something slow just because you can afford it. Switching to async is pretty simple because of how much plumbing is already done for you in .Net. So, if it can be dramatically more efficient... it just should be. The upshot is that users of Selenium could run thousands of tests instead of hundreds on a small server, without having to buy more hardware to accommodate more threads.

As one example, one of the code paths we're using Selenium for is PDF creation. Chrome is a great place to test a webpage and see exactly how it will render before emitting as PDF. So we develop to our exact specifications in HTML for Chrome, then we spin up Selenium to emit relatively long PDF documents from it. These PDFs have a lot of data, so Selenium can be waiting a long time while other code fetches that data and renders the page. In fact, during that wait, the code that's fetching the data and rendering it is unable to use as many threads to do its work because Selenium is locking up a thread. It's a server, so many of these can happen at a time. Each instance of Selenium running synchronously is one more thread in the threadpool lost, doing nothing.

Jim Evans

Dec 14, 2019, 1:24:39 AM
to Selenium Developers
“Moving to async doesn't require duplicating anything; instead, it should all just be async, since what it's doing is fundamentally asynchronous.” Changing the existing methods to async would cause compile errors for every current user who updated to that version of the language bindings. Adding async methods amounts to duplicating the API surface, once with the existing sync methods, the other with the async ones. I’m not saying one would not delegate to the other; I’m complaining that doubling the API surface is not ideal.
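
To make the API-surface trade-off concrete, here is a minimal sketch. `SketchDriver`, `GetTitle`, and `GetTitleAsync` are hypothetical names for illustration, not the actual Selenium .NET API.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical driver type, used only to illustrate the two options.
public class SketchDriver
{
    // New async member: a second entry point for the same operation.
    public async Task<string> GetTitleAsync()
    {
        // Stand-in for a round trip to the browser.
        await Task.Delay(10).ConfigureAwait(false);
        return "Example title";
    }

    // Existing sync member, kept so current callers still compile.
    // It delegates to the async one and blocks until it completes.
    public string GetTitle()
    {
        return GetTitleAsync().GetAwaiter().GetResult();
    }
}
```

If `GetTitle()` were instead changed in place to return `Task<string>`, an existing call such as `string title = driver.GetTitle();` would no longer compile. Keeping both members avoids that, at the cost of two entry points per operation.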

Please understand I’m not opposed to the idea on principle, and I think it would likely be a nicer approach. However, it’s hard to see a clear way through that balances backward compatibility and forward maintenance that isn’t in some way suboptimal.