HELP - ScrapeBox is discarding most of the Goblin's proxies


Ash (Developer)

Apr 28, 2011, 10:58:43 PM
to proxy-gobl...@googlegroups.com
Brief overview of how ScrapeBox validates proxies.

ScrapeBox has one of the strictest proxy validation requirements. It does not allow
you to use your own proxy judges, and if you need proxies for scraping Google,
a proxy only passes if the page google.com does not return a 302 response or
redirect to a non-US Google site.
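To make the rule above concrete, here is a rough sketch of that kind of check. It is only my reading of the behaviour, not ScrapeBox's actual code; the function name `passes_google_check` and the `US_GOOGLE_HOSTS` set are made up for illustration, and it works on a status code and Location header you have already fetched through the proxy.

```python
from urllib.parse import urlparse

# Assumption: only these hosts count as the US Google site.
US_GOOGLE_HOSTS = {"google.com", "www.google.com"}


def passes_google_check(status_code, location=None):
    """Return True if a google.com response fetched via a proxy looks usable.

    Hypothetical helper: rejects a 302 response, and rejects any redirect
    whose Location header points at a non-US Google host.
    """
    if status_code == 302:
        return False
    if location is not None:
        host = (urlparse(location).hostname or "").lower()
        if host not in US_GOOGLE_HOSTS:
            return False
    return True
```

A proxy that answers 200 with no redirect passes; one that gets bounced to a country-specific Google domain does not.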

I have stopped using the proxy checker that's built into SB, but for those of
you who still do, the following tips might help you increase the number of working
proxies.

In the Goblin
  1. Use the Goblin's strict judge => http://molura.com/proxy-judge-strict.php
  2. If you need proxies for scraping SERPs, check the "Perform Google Verification" checkbox.
  3. Choose only Elite proxies
In ScrapeBox
  1. Use a low value for the Max Connections for the Proxy Harvester

    Menu >> Settings >> Adjust Maximum Connections >> Proxy Harvester : Set to max 5

  2. Change the proxy harvester's timeout settings to a high value. This might make SB run 
    slightly slower, but it will make sure all working proxies are used.

    Menu >> Settings >> Adjust Timeout Settings >> Proxy Harvester Timeout : Set to above 50 seconds

  3. And finally, when testing the proxies, skip the Google test if you're only going to use the proxies for
    posting and not scraping.
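If it helps to see why a low connection count and a long timeout go together, here is a small sketch of a checker that tests a few proxies at a time and gives each one plenty of time to answer. This is not how ScrapeBox works internally; `check_all` and the `check_one` callback are hypothetical names, and the values 5 and 60 simply mirror the settings suggested above.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 5    # mirrors "Proxy Harvester : Set to max 5"
TIMEOUT_SECONDS = 60   # mirrors "Set to above 50 seconds"


def check_all(proxies, check_one,
              max_connections=MAX_CONNECTIONS, timeout=TIMEOUT_SECONDS):
    """Return the proxies for which check_one(proxy, timeout) is truthy.

    check_one is a caller-supplied (hypothetical) function that tests one
    proxy. At most max_connections checks run at once; order is preserved.
    """
    proxies = list(proxies)
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        results = list(pool.map(lambda p: check_one(p, timeout), proxies))
    return [p for p, ok in zip(proxies, results) if ok]
```

With fewer simultaneous connections and a generous timeout, a slow but working proxy gets a chance to respond instead of being discarded.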

Using the above settings, you should see a marked improvement in the number of valid proxies. :) If you've got some of your
own tips & tricks, do share them.

Cheers,

Ash 
Molura.com

Ash (Developer)

May 5, 2011, 12:10:01 PM
to proxy-gobl...@googlegroups.com
Update: I contacted ScrapeBox and asked them to verify whether the above post is correct, and this is the response I received.

Hi Ash,

Yes you are 100% correct, one of the checks actually queries a keyword at Google and then checks to see if the proxy was able to harvest a result. This is of course only necessary when gathering proxies for scraping Google.

For posting and other tasks, as long as the proxy is anonymous, is not a Codeen proxy and allows to POST http requests it’s suitable for commenting.

Regards, ScrapeBox Support.
http://www.scrapebox.com
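The posting criteria in the reply above can be written down as a simple filter. This is just a sketch based on the three conditions support listed; the `ProxyCheck` fields and the function name are hypothetical, and how you fill them in (your judge, your own tests) is up to you.

```python
from dataclasses import dataclass


@dataclass
class ProxyCheck:
    """Hypothetical record of what a proxy test found."""
    address: str
    anonymous: bool     # proxy hides the client IP
    is_codeen: bool     # proxy belongs to the CoDeeN network
    allows_post: bool   # proxy lets HTTP POST requests through


def suitable_for_posting(proxy):
    """Apply the three conditions from the support reply."""
    return proxy.anonymous and not proxy.is_codeen and proxy.allows_post
```

Note that the Google query check is left out here on purpose, since the reply says it is only needed when gathering proxies for scraping Google.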


This is the email I sent:
You see, many users have been having problems with ScrapeBox flagging multiple
Goblin proxies as invalid. For some reason my proxies pass with other testers like
EPS & Charon but almost always fail on ScrapeBox.

I did some testing and came up with this response.
https://groups.google.com/d/topic/proxy-goblin-support/tX7mGxyj0_E/discussion

Do let me know if I was completely off and if there are other variables to consider
when preparing proxies for ScrapeBox.

Thanks!

Cheers,
Ash

 

