google-mini file server not crawled


albfran

Oct 2, 2008, 10:56:47 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
Sorry, I'm a new Google Mini user.
We would like to index some specific file servers that reside on our
LAN, so the host name given in the URL list to be crawled
is an SMB URL.
The problem is that, after starting the indexing process, I get a 404
error "no document found" on that URL, even though there are more than 27k
documents spread across different folders below that URL.

What's wrong?

Thanks in advance

Alb

Thiru

Oct 2, 2008, 6:19:57 PM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
Hi Alb,

You do not have to apologize. We all have to learn something at some
point. The error can be misleading at times. A few things to check:

- Make sure you have entered the SMB URL in the right format under
"Start Crawling from the Following URLs" and "Follow and Crawl Only
URLs with the Following Patterns:".
It should be something like: smb://server_name/share_name/ (make sure
that you have the trailing slash)

- Make sure to test the patterns using the "test these patterns"
utility from the same page (Crawl and index > Crawl urls)

- You also have to provide authentication information under
"Crawl and Index > Crawler Access". Make sure to use the same format
when you create the crawler access rule, i.e. smb://server_name/share_name/

- Be aware that the Google Mini does not support Microsoft DFS
(Distributed File System) for smb crawling.
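
To make the format check above concrete, here is a minimal sketch in Python that flags the common mistakes in an SMB start URL before you paste it into the admin console. The function name check_smb_start_url is hypothetical and this is not part of the Google Mini; it only mirrors the smb://server_name/share_name/ shape described above.

```python
from urllib.parse import urlsplit


def check_smb_start_url(url):
    """Return a list of problems found in an SMB start URL (empty if none).

    Hypothetical helper: it only checks the smb://server_name/share_name/
    shape and does not contact the server or the appliance.
    """
    problems = []
    parts = urlsplit(url)
    if parts.scheme != "smb":
        problems.append("scheme must be smb://")
    if not parts.netloc:
        problems.append("missing server name")
    # The first non-empty path segment after the server is the share name.
    segments = [s for s in parts.path.split("/") if s]
    if not segments:
        problems.append("missing share name")
    if not parts.path.endswith("/"):
        problems.append("missing trailing slash")
    return problems


if __name__ == "__main__":
    for candidate in ("smb://server_name/share_name/",
                      "smb://server_name/share_name"):
        print(candidate, check_smb_start_url(candidate))
```

An empty list means the URL has the expected shape; it says nothing about whether the share is reachable or the credentials are right.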

Cheers,
Thiru

Prathap

Oct 3, 2008, 5:14:27 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini

albfran

Oct 6, 2008, 9:14:30 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
Hi Thiru,

Thanks for your suggestions. After a few battles, the system now crawls
the SMB folder and its subfolders. It started to work when I reset the
DNS suffix to blank on the Administration-Network screen.

Now on to the next problem: when I use the appliance as a
normal guest to search for anything among the thousands of
crawled documents, the response is always "sorry the search has not
found your criteria in any of the documents".

Do you have any other good advice for me?

Alb

Thiru

Oct 7, 2008, 2:46:41 PM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
Hi Alb,

Good to hear that you resolved the crawling issues. Now, check the
following for the serving phase.

- If you have set the Crawler Access rule for secure serving, i.e. the
"Make Public" flag is NOT checked, then all the documents
matching that rule will be secure.
You have to use "&access=a" in your search query to get the results.
By default, the search query contains &access=p. All the supported
search parameters are explained here:
http://code.google.com/apis/searchappliance/documentation/50/xml_reference.html#request_parameters

- If you do not want to use secure serving, then select the "Make
Public" flag. It may take a couple of hours for the changes to take
effect.
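
As an illustration of the access parameter, here is a rough Python sketch that builds a search request URL with access=a or access=p. The host mini.example.com and the function name build_search_url are made up for the example, and the site/client values assume the default collection and front end, so adjust them to your own setup.

```python
from urllib.parse import urlencode


def build_search_url(host, query, secure=False):
    """Build a search request URL for the appliance.

    Hypothetical helper. The site and client values assume the default
    collection and front end; replace them with your own names.
    """
    params = {
        "q": query,
        "site": "default_collection",   # assumed collection name
        "client": "default_frontend",   # assumed front end name
        "output": "xml_no_dtd",
        # "a" asks for secure documents as well, "p" for public ones only.
        "access": "a" if secure else "p",
    }
    return "http://%s/search?%s" % (host, urlencode(params))


if __name__ == "__main__":
    # mini.example.com is a placeholder for your appliance's host name.
    print(build_search_url("mini.example.com", "budget report", secure=True))
```

With secure=True the request may prompt for credentials, since the documents matched by a non-public Crawler Access rule are served as secure content.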

Cheers,
Thiru