Google mini crawler issue

saurabh

Dec 30, 2009, 4:25:21 PM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
We recently purchased a Google Mini and will be using it to search
content on our intranet website. Our intranet is a secure website
protected by iChain. When a user opens it in a browser, he is
taken to a login page served by the iChain server. Once he has entered
his user ID and password, he is logged in and lands on the home page.

In the Google Mini, under the starting URL and filterable URL settings,
we have given the initial URL:

https://abc.def.com (this lands the user on the iChain home page)

Under Crawler Access, I have entered a valid user ID and password.

What I have observed is that the Google Mini is not able to log in. It
crawls and indexes the login page but nothing after it. I also tried
putting an internal URL of our intranet, such as
https://abc.def.com/gh/ij.xml, into the crawler. (If a user enters that
URL in a browser, he is taken to the iChain login page and, once he has
entered his credentials, is taken directly to that page.) The Mini is
able to index the link, so when I test for ij.xml it shows me the link,
but the content it shows is still that of the login page.

I know form authentication is not available in the Google Mini. Is there
any other way I can get it to index the pages?

Any guidance will be appreciated.

brianb

Jan 5, 2010, 1:05:20 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini
It sounds like you are using cookie authentication, which the Mini does
not support; it is supported only on the larger GSA. The only
workaround for the Mini would be to make sure that the cookies are
static (they do not expire); you could then add a cookie header under
Crawl and Index -> HTTP Headers, like:

Cookie: Cookiename=Cookievalue

And then recrawl. If the cookies do expire, though, the Mini will no
longer be able to crawl, so you would need to be careful if you use
this workaround. The best way would really be to look into purchasing
a GSA, which fully supports this feature.
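
Before relying on the static-cookie workaround, it may help to check from a
workstation that the cookie actually gets past the login page. The sketch
below is only an illustration: the cookie name and value are the placeholders
from the example above, and the "login" marker used to recognise the login
page is an assumption that would need adjusting to the real page.

```python
import urllib.request


def build_cookie_header(cookies):
    """Format name/value pairs as a single 'Cookie: ...' request header line."""
    pairs = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return "Cookie: " + pairs


def looks_like_login_page(html, marker="login"):
    """Guess whether a response body is the login page.

    The marker text is an assumption; pick a string that appears only on
    the real login page.
    """
    return marker.lower() in html.lower()


def cookie_still_valid(url, cookies, marker="login"):
    """Fetch url with the cookie attached and report whether we got past login."""
    name, value = build_cookie_header(cookies).split(": ", 1)
    request = urllib.request.Request(url, headers={name: value})
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8", errors="replace")
    return not looks_like_login_page(body, marker)


# Placeholder values taken from the example header above.
print(build_cookie_header({"Cookiename": "Cookievalue"}))
```

Running cookie_still_valid() against a protected URL on a schedule would give
early warning if the cookie expires and the Mini starts indexing the login
page again.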

Regards,

Brian

kak1978

Feb 1, 2010, 11:06:05 AM
to Google Search Appliance/Google Mini - Google Search Appliance/Google Mini

The way we handled this situation was to make code changes in our
application to bypass authentication for requests coming from the
Google Mini's IP address.
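
A rough sketch of what such a bypass check could look like is below. It is
not the actual code from our application: the address 10.0.0.25 and the
function names are made up for illustration, and the real check would live
wherever the application decides whether to redirect to the login page.

```python
import ipaddress

# Hypothetical address of the Google Mini on the internal network.
CRAWLER_ADDRESSES = {ipaddress.ip_address("10.0.0.25")}


def is_crawler_request(remote_addr, crawler_addresses=CRAWLER_ADDRESSES):
    """Return True if the connecting address belongs to the crawler."""
    try:
        return ipaddress.ip_address(remote_addr) in crawler_addresses
    except ValueError:
        # Malformed addresses are never treated as the crawler.
        return False


def requires_login(remote_addr, has_session):
    """Decide whether a request should be sent to the login page."""
    return not (has_session or is_crawler_request(remote_addr))


print(requires_login("10.0.0.25", has_session=False))  # crawler: no login needed
print(requires_login("10.0.0.26", has_session=False))  # other client: login needed
```

The address passed in should be the peer address of the connection itself,
not a value read from a client-supplied header such as X-Forwarded-For, since
a client can set that header to anything.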