Using ProxyMesh (https://proxymesh.com) IP in Selenium Chrome driver for web scraping

varsh...@gmail.com

Jul 28, 2015, 8:26:12 AM
to Selenium Users
I have performed web scraping using the Python Scrapy framework with a ProxyMesh IP. If the proxy requires authentication, I use the following code:

    import base64
    
    # Start your middleware class
    class ProxyMiddleware(object):
        # Override process_request
        def process_request(self, request, spider):
            # Set the location of the proxy
            request.meta['proxy'] = "http://....."
    
            # Use the following lines if your proxy requires authentication
            proxy_user_pass = "username:pwd"
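            # Set up basic authentication for the proxy. b64encode is used
            # because encodestring appends a trailing newline to the header
            # value and is no longer available in current Python 3 releases.
            encoded_user_pass = base64.b64encode(proxy_user_pass.encode()).decode()
            request.headers['Proxy-Authorization'] = 'Basic ' + encoded_user_pass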

When I want to do the same while scraping with the Selenium Chrome driver, what is the appropriate technique? I can find examples using Firefox, but have had no luck with the Chrome driver. Please share your ideas.




Mercious

Jul 28, 2015, 9:45:38 AM
to Selenium Users, varsh...@gmail.com
Why do you want to scrape with selenium? 

Selenium drives a full browser, so it has to run JavaScript, render the page, download images, and so on.

It is horribly slow compared to any lightweight scraping framework that simply parses the page DOM.

It is possible, but I would strongly advise against it. What are you trying to scrape off the website?
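
For reference, here is a minimal sketch of how Chrome can be pointed at a proxy from Selenium's Python bindings. It relies on Chrome's `--proxy-server` command-line switch. As far as I know, that switch does not accept a username and password, so the sketch assumes the proxy authorizes requests by your machine's IP address rather than by credentials. The helper names `proxy_argument` and `make_chrome_driver` and the host `proxy.example.com:8080` are made up for illustration, and older Selenium releases take `chrome_options=` instead of `options=`.

```python
def proxy_argument(host, port, scheme="http"):
    """Build Chrome's --proxy-server switch for a proxy without credentials."""
    return "--proxy-server={}://{}:{}".format(scheme, host, port)


def make_chrome_driver(host, port):
    """Start Chrome with all traffic routed through the given proxy.

    Requires the selenium package and a chromedriver binary on the PATH.
    """
    # Imported lazily so proxy_argument can be used without selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument(proxy_argument(host, port))
    # Older Selenium versions expect chrome_options=options here (assumption
    # about the installed version; adjust as needed).
    return webdriver.Chrome(options=options)
```

Usage would look something like `driver = make_chrome_driver("proxy.example.com", 8080)` followed by the usual `driver.get(...)` calls. If the proxy only accepts username and password, this switch alone is not enough and a different approach is needed.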