crawling is not properly done


hojo

Mar 16, 2009, 1:53:28 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini, radhik...@ust-global.com
Hi team

I am pleased to introduce myself: Hojo Clement, senior software engineer
at UST Global, based in India.

We are currently configuring a Google Search Appliance for our portal,
which is deployed on SharePoint. We ran into a lot of issues during the
configuration, but your thread
http://groups.google.com/group/Google-Search-Appliance-Help/browse_thread/thread/a4f1826ca75375a7
has helped us a lot. We have completed the following steps so far, but
we do not get any search results. Could you please help us solve
this issue?

1. Installed the Google Search Appliance on a server.
2. Set the URL for crawling.
3. Added robots.txt as a managed path on our SharePoint server.
4. Installed the JDK, Apache Tomcat, and the connector on a separate
server.
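
For anyone following along with step 3, here is a minimal sketch of how one might check offline whether a robots.txt would let the appliance fetch a given URL. It uses Python's standard `urllib.robotparser`. The robots.txt content, the host `portal.example.com`, and the user agent name `gsa-crawler` are assumptions for illustration only; substitute the values from your own setup.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text allows user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt: blocks one path, allows everything else.
SAMPLE_ROBOTS_TXT = """User-agent: *
Disallow: /_layouts/
"""

# Assumed crawler name and host, not taken from this thread.
CRAWLER = "gsa-crawler"
HOME_URL = "http://portal.example.com/sites/home.aspx"
BLOCKED_URL = "http://portal.example.com/_layouts/settings.aspx"

if __name__ == "__main__":
    for url in (HOME_URL, BLOCKED_URL):
        verdict = "allowed" if is_crawl_allowed(SAMPLE_ROBOTS_TXT, CRAWLER, url) else "blocked"
        print(f"{url}: {verdict}")
```

If the start URL comes back as blocked, the crawler would skip it no matter how the crawl patterns are set.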

The crawl status shows the following:

URLs Found That Match Crawl Patterns: 6,530
Total Documents Being Served: 0
Current Crawling Rate: 0 pages per second
Document Bytes Filtered: 0
Documents Crawled Since Yesterday: 3,281
Document Errors Since Yesterday: 2,211

Please help us get search results.
Thanks
Hojo

brianb

Mar 17, 2009, 4:42:50 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
Hi Hojo,

That is pretty strange. A couple things:

1. Please check crawl diagnostics. What do you see there? I see a lot
of errors in the above so it would be good to see what is in Crawl
Diagnostics.
2. Are you able to get anything in the search at all? Please try a
search.
3. What version are you using? You can log in to the Version Manager on
port 9941 to get this information.
4. Also check under Crawl and Index -> Feeds. The SharePoint connector
will send feeds to the appliance telling it what URLs to crawl, so you
should see feed entries here.

Brian




hojo

Mar 19, 2009, 8:25:26 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
Hi Brianb

Thanks for your reply.

Crawling now works for our application, the feeds have completed,
and search results in the test center are also working fine. Our issue
was caused by an authentication setting in our connector administration
settings.

However, we are currently getting an error under Administration -->
Network Settings --> Network Diagnostics. When we enter our test
URL, we get an error saying "return code 401, should be 200".

If you have any idea about this scenario, please advise us.

Thanks
Hojo

brianb

Mar 23, 2009, 2:16:31 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
Hi Hojo,

Generally that 401 in the Network Diagnostics is ok. The diagnostics
does not use credentials and just does a quick HTTP connectivity test.
The 401 just means that you will need to make sure that proper
credentials are set. If the rest of the site is actually crawling
without error (please check this in Crawl Diagnostics), then you
should be good and can ignore this error.

Brian
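
To illustrate the point about the 401, here is a rough sketch of an unauthenticated connectivity test similar in spirit to what is described above. It is not how the appliance implements Network Diagnostics; the function names and messages are made up for illustration. It sends a plain request with no credentials and treats 401 as "the server is reachable but wants a login".

```python
import urllib.error
import urllib.request


def interpret_status(code: int) -> str:
    """Explain what an unauthenticated probe's HTTP status code suggests."""
    if code == 200:
        return "reachable without credentials"
    if code == 401:
        return "reachable, but credentials are required"
    return f"unexpected status {code}"


def probe(url: str, timeout: float = 10.0) -> int:
    """Fetch url without credentials and return the HTTP status code."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available.
        return err.code


if __name__ == "__main__":
    # No network call here; just show how the two codes from the thread read.
    for code in (200, 401):
        print(code, "->", interpret_status(code))
```

Calling `probe()` on a SharePoint URL that requires a login would be expected to return 401, which only shows that the server answered. Whether the crawl itself succeeds depends on the credentials configured for crawling.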