Using Selenium to retrieve all 'data:' urls

neilb...@gmail.com

Jul 5, 2016, 10:48:43 AM
to Selenium Users
Wondering if there's a way to retrieve all the 'data:' URLs (https://tools.ietf.org/html/rfc2397) rendered by a web page. This is very easy to do in PhantomJS (just use page.onResourceReceived, and response.url holds the 'data:' URL), but I couldn't find a way to do this with Selenium. Thanks.

David

Jul 5, 2016, 9:22:01 PM
to Selenium Users
I would think the general way to do that with Selenium would be to find all elements that contain data URIs after page load. For example, all images that use data URIs instead of URLs to load the image. You could do a find elements using

XPath: //img[starts-with(@src,'data:')]
CSS: img[src^='data:']

that would give you a list of WebElements. From each WebElement, you can then extract the src attribute and strip off the data URI header as needed (e.g. "data:image/gif;base64,").
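
As a rough illustration of the approach above, here is a minimal Python sketch, not a definitive implementation. The helper names split_data_uri and collect_image_data_uris are made up for this example, and driver is assumed to be an already-created Selenium WebDriver. The sketch uses the prefix selector img[src^='data:'] so that only src values beginning with 'data:' match, and it passes the locator strategy as the plain string "css selector", which is the value behind By.CSS_SELECTOR in the Python bindings.

```python
import base64
from urllib.parse import unquote_to_bytes


def split_data_uri(uri):
    """Split an RFC 2397 data URI into (media_type, decoded_bytes).

    Hypothetical helper for illustration only.
    """
    if not uri.startswith("data:"):
        raise ValueError("not a data URI")
    # Everything before the first comma is the header, the rest is the payload.
    header, sep, payload = uri[len("data:"):].partition(",")
    if not sep:
        raise ValueError("data URI has no comma between header and payload")
    is_base64 = header.endswith(";base64")
    if is_base64:
        header = header[: -len(";base64")]
    # RFC 2397 default when the media type is omitted.
    media_type = header or "text/plain;charset=US-ASCII"
    raw = unquote_to_bytes(payload)  # undo any percent-encoding
    data = base64.b64decode(raw) if is_base64 else raw
    return media_type, data


def collect_image_data_uris(driver):
    """Return the src of every <img> whose src starts with 'data:'.

    Assumes `driver` is a Selenium WebDriver with the page already loaded.
    """
    elements = driver.find_elements("css selector", "img[src^='data:']")
    return [el.get_attribute("src") for el in elements]
```

Usage would look something like `for uri in collect_image_data_uris(driver): media_type, data = split_data_uri(uri)`. Note that this only covers img elements; data URIs used elsewhere (CSS backgrounds, for instance) would need their own lookups.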

The only issue might be determining when page load is complete; that would be site/application specific.
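
One generic starting point, sketched here under assumptions, is to poll document.readyState through the driver's execute_script until it reports "complete". The function name wait_for_ready_state and its timeout and poll parameters are hypothetical, and driver is again assumed to be a Selenium WebDriver. This only tells you the initial document has finished loading; images injected later by scripts would still need a site-specific check.

```python
import time


def wait_for_ready_state(driver, timeout=10.0, poll=0.25):
    """Poll document.readyState until it is 'complete' or the timeout expires.

    Returns True if the page reported 'complete' in time, False otherwise.
    Hypothetical helper; `driver` is assumed to be a Selenium WebDriver.
    """
    deadline = time.monotonic() + timeout
    while True:
        state = driver.execute_script("return document.readyState")
        if state == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

Calling wait_for_ready_state(driver) right after driver.get(...) and before looking up the img elements would be the typical order.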