Unable to index wiki pages


Piyush

Mar 4, 2008, 4:11:29 AM
to Google Search Appliance
Hi,

I am unable to index wiki pages on our Google Mini (we are using TWiki --
http://twiki.org/).

Under Crawl Diagnostics, I can see all the wiki pages listed under "Crawled
URLs", i.e. the Google Mini is able to crawl them but does not index them.

Any idea why the Google Mini is not indexing those pages?

Thanks
Piyush
Mo: 9910904233

Piyush

Mar 4, 2008, 9:45:26 AM
to Google Search Appliance
Any ideas or thoughts?

On Mar 4, 2:11 pm, Piyush <piyku...@gmail.com> wrote:
> Hi,
>
> I am unable to index wiki pages in google-mini (we are using TWIKI --http://twiki.org/)

jdandrea

Mar 4, 2008, 10:41:21 AM
to Google Search Appliance
Greetings!

On Mar 4, 4:11 am, Piyush <piyku...@gmail.com> wrote:

> I am unable to index wiki pages in google-mini (we are using TWIKI --http://twiki.org/)
>
> But under Crawl Diagnostics, I can see all wiki pages under "Crawled
> URLs" -- ie. google-mini is able to crawl -- but donot index.

Reality check: Does the troubleshooting guide[1] get you any further?
I don't like suggesting "easy-outs" like an index reset, but perhaps
that is in order here.

Can you crawl/index other pages outside your Twiki server/install?

--
Joe D'Andrea
Liquid Joe LLC
www.liquidjoe.biz
+1 (908) 781-0323

[1] http://code.google.com/apis/searchappliance/documentation/50/troubleshooting/Troubleshooting.html

Piyush

Mar 4, 2008, 7:50:19 PM
to Google Search Appliance
Hi,

> Can you crawl/index other pages outside your Twiki server/install?

Yes, I can crawl and index other pages apart from TWiki, and they are
crawled and indexed fine.

Also, the GSA is able to crawl and index PDF, DOC, XLS files, etc. that
are attachments to TWiki pages.

But the GSA is not able to index the TWiki pages themselves, even though
it crawls them.

Please suggest a solution. Thanks!

Thanks
Piyush
> [1]http://code.google.com/apis/searchappliance/documentation/50/troubles...

Carl Gherardi

Mar 5, 2008, 12:40:26 AM
to Google-Sear...@googlegroups.com
On Wed, Mar 5, 2008 at 9:50 AM, Piyush <piyk...@gmail.com> wrote:
>
> Hi,
>
>
> > Can you crawl/index other pages outside your Twiki server/install?
>
> Yes, I crawl/index other pages apart from twiki, and they are crawled
> and index well.
>
> Also, GSA is able to crawl & Index pdfs/docs/xls etc --which are
> attachments to twiki pages.
>
> But GSA is not able to Index the TWiki pages, even though it crawls
> them.....
>
> Please suggest for some solution. Thanks!!

I use and index twiki without issues, using the default theme with
some minor customisations.

Are you using a custom theme that uses a lot of javascript?

Have a look at your robots.txt too.
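
One way to sanity-check a robots.txt offline is a minimal Python sketch like the one below. It uses only the standard library. The sample rules, the wiki URLs, and the `gsa-crawler` user-agent string are assumptions for illustration; substitute the real file contents and whatever user agent the appliance reports in your web server logs. A rule that blocks the view script would normally show up as a crawl problem rather than an indexing one, so treat this as a quick check, not a diagnosis.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that blocks the TWiki view and edit scripts
# but leaves attachments under /twiki/pub/ reachable.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /twiki/bin/edit/
Disallow: /twiki/bin/view/
"""

# Hypothetical URLs; replace with pages from your own wiki.
PAGE_URL = "http://wiki.example.com/twiki/bin/view/Main/WebHome"
ATTACHMENT_URL = "http://wiki.example.com/twiki/pub/Main/WebHome/report.pdf"

if __name__ == "__main__":
    for url in (PAGE_URL, ATTACHMENT_URL):
        allowed = is_crawl_allowed(SAMPLE_ROBOTS, "gsa-crawler", url)
        print(f"{url}: {'allowed' if allowed else 'blocked'}")
```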

This works for me (tm) so you may need to reset the index and just
make sure your twiki install is sane.

Carl G

jdandrea

Mar 5, 2008, 9:29:57 AM
to Google Search Appliance
Greetings!

On Mar 4, 7:50 pm, Piyush <piyku...@gmail.com> wrote:

> > Can you crawl/index other pages outside your Twiki server/install?
>
> Yes. ... But GSA is not able to Index the TWiki pages, even though it crawls
> them.....

Hmm. I keep wanting to ask you to check "Follow and Crawl" ... but
you say that they're being crawled already!

By any chance, is there a meta element on the TWiki pages (meaning
your TWiki theme/template) that indicates "noindex, follow"?

<meta name="robots" content="noindex, follow"> (HTML example)
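
To check a saved copy of a page for such an element, a minimal Python sketch along these lines may help. It uses only the standard library and works on an HTML string, so you would first save a rendered TWiki page (view source) and read it in. The function and class names are illustrative, not part of TWiki or the appliance. The check treats both `noindex` and `none` (shorthand for "noindex, nofollow") as blocking indexing.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> element."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute names arrive lowercased; values may be None.
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").strip().lower() != "robots":
            return
        for token in attr_map.get("content", "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html: str) -> set:
    """Return the set of robots meta directives found in the HTML text."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def blocks_indexing(html: str) -> bool:
    """True if a robots meta element asks crawlers not to index the page."""
    return bool({"noindex", "none"} & robots_directives(html))


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(sorted(robots_directives(sample)), blocks_indexing(sample))
```

If this reports a blocking directive on your wiki pages but not on the attachments, the theme or template is the likely place to look.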

If you give the (last resort) index reset a try, let us know if that
helps as well.

--
Joe D'Andrea

GKP

Mar 5, 2008, 10:15:52 AM
to Google Search Appliance
I'm not sure about the Mini, but I have the full-blown GSA and I've
encountered the same problem. I don't know what actually causes it,
but support tells me that some process "gets stuck" and has to be
restarted. Have you tried restarting the Mini?

Seb

Mar 5, 2008, 10:33:29 AM
to Google Search Appliance
Maybe it is a session cookie issue. You can try to add the site to the
"cookie sites", to force the GSA/mini to retain the session cookie
between requests.
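
To see whether the wiki hands out a session cookie at all, you could copy the Set-Cookie headers from a response (for example from your browser's network view) and run them through a minimal Python sketch like this one. It treats a cookie with neither Expires nor Max-Age as a session cookie. The cookie names in the sample, including `TWIKISID`, are assumptions for illustration; use the headers your own server actually sends.

```python
from http.cookies import SimpleCookie


def session_cookie_names(set_cookie_headers):
    """Return names of cookies that carry no Expires or Max-Age attribute."""
    names = []
    for header in set_cookie_headers:
        cookie = SimpleCookie()
        cookie.load(header)
        for name, morsel in cookie.items():
            # Unset attributes are stored as empty strings on the morsel.
            if not morsel["expires"] and not morsel["max-age"]:
                names.append(name)
    return names


if __name__ == "__main__":
    # Hypothetical header values; paste in the ones from your wiki.
    headers = [
        "TWIKISID=abc123; path=/",
        "theme=dark; Max-Age=3600; Path=/",
    ]
    print(session_cookie_names(headers))
```

If a session cookie shows up here, adding the site to the "cookie sites" as suggested above is worth a try.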

On 4 Mar, 09:11, Piyush <piyku...@gmail.com> wrote:
> Hi,
>
> I am unable to index wiki pages in google-mini (we are using TWIKI --http://twiki.org/)