We, too, are fighting the "Retrying URL: Connection reset..."
monster. We are on GSA version 5.2.xx (not VE), running continuous crawl.
cphdforumsrosslynmain.aspx Retrying URL: Connection reset by peer
during fetch. 15 Mar 4:21 AM
cphdforumsrosslynproposedguidelines.aspx Retrying URL: Network
unreachable during fetch. 15 Mar 4:28 AM
1. What is your host load currently set to? 1.0 for the domain
2. Does this only happen to some hosts or all hosts/URLs you are
crawling? Seeing it on most of our 25 hosts, but most are <500 URLs.
Primarily concerned with main domain of approx 40k URLs
3. Is this with http or smb? http
Over time (a few weeks), you can watch the Crawl Diagnostics report as
the Crawled URLs count drains down while the corresponding Retrieval
Errors column increases proportionally.
Our current kludgy strategy is a weekly refresh: we drill down to the
directories with the bulk of the errors, export them to a file, then
paste that list into the "Recrawl these URL Patterns" field under
Freshness Tuning. It is repetitive and tedious across several folder
paths and several domains, as you can imagine. And I'm aware that our
volume of URLs is comparatively small.
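
For anyone doing the same weekly chore, here is a rough sketch of how the list could be condensed before pasting. It assumes the exported file has one absolute URL per line and that a plain directory prefix is acceptable as a pattern in that field; neither assumption is confirmed here. The function name, the min_count parameter, and the example.com host are made up for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit
import posixpath


def directory_patterns(urls, min_count=1):
    """Collapse failing URLs into one directory prefix per folder.

    Hypothetical helper: returns prefixes ordered by how many failing
    URLs fall under each, dropping folders with fewer than min_count.
    """
    counts = Counter()
    for raw in urls:
        url = raw.strip()
        if not url:
            continue  # ignore blank lines in the export
        parts = urlsplit(url)
        if not parts.scheme or not parts.netloc:
            continue  # ignore lines that are not absolute URLs
        directory = posixpath.dirname(parts.path)
        if not directory.endswith("/"):
            directory += "/"
        counts[f"{parts.scheme}://{parts.netloc}{directory}"] += 1
    return [prefix for prefix, n in counts.most_common() if n >= min_count]


if __name__ == "__main__":
    # Assumed sample input; in practice, read the lines of the exported file.
    sample = [
        "http://www.example.com/forums/a.aspx",
        "http://www.example.com/forums/b.aspx",
        "http://www.example.com/docs/x.html",
    ]
    for prefix in directory_patterns(sample):
        print(prefix)
```

Raising min_count keeps the pasted list short by skipping folders with only a stray error or two.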
We will try your suggestion "4. It might be worth it to do a quick
tcpdump/wireshark to see what is happening."
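
Alongside the packet capture, a quick probe from another machine on the same network might show whether the resets are specific to the appliance. This is only a sketch using the Python standard library; the function names and the timeout value are arbitrary choices, and it does not replace tcpdump/wireshark.

```python
import errno
import socket
import urllib.error
import urllib.request
from collections import Counter


def classify(exc):
    """Map an exception from a fetch attempt to a short label."""
    if isinstance(exc, urllib.error.HTTPError):
        return f"http {exc.code}"
    # urllib wraps connect-time failures in URLError; unwrap the OS error.
    if isinstance(exc, urllib.error.URLError) and isinstance(exc.reason, BaseException):
        exc = exc.reason
    if isinstance(exc, ConnectionResetError):
        return "connection reset"
    if isinstance(exc, socket.timeout):
        return "timeout"
    if isinstance(exc, OSError) and exc.errno == errno.ENETUNREACH:
        return "network unreachable"
    return type(exc).__name__


def probe(urls, timeout=10):
    """Fetch each URL once and tally the outcomes by label."""
    results = Counter()
    for url in urls:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                resp.read()
            results["ok"] += 1
        except OSError as exc:  # URLError and socket errors are OSErrors
            results[classify(exc)] += 1
    return results
```

Calling probe() on a handful of the failing URLs at the hours the appliance reports errors, then comparing the tally with Crawl Diagnostics, would hint at whether the web servers or the network path is at fault.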
Wanted to be on record alongside AsparagusX and others. Thanks.
On May 8, 5:59 am, "justin.brister" <justin.bris...@googlemail.com> wrote: