documents links served through SMB malformed?
From:
"miguel gmail" <miguel.lis... @gmail.com>
Date: Mon, 25 Feb 2008 10:24:36 +0400
Local: Mon, Feb 25 2008 1:24 am
Subject: documents links served through SMB malformed?
Hi,
I've just configured the GSA to crawl a number of RTF files on an SMB server (Windows 2k3).
The GSA crawls them correctly, but when I try to follow any link I always get a 404 because of the URL (tested with Firefox, in case that is relevant):
http://my.gsaserver.com/getFile?origUrl=smb://smb.server.com/path/of/...
The directories are fine and doc.rtf is actually there, but I get a 404 for every file on the SMB server.
The GSA is crawling a password-protected server. However, I've made the content public by checking the relevant checkbox.
Is there something I am missing?
Many thanks in advance,
--
Saludos,
miguel
Los agujeros negros son lugares donde dios dividió por cero.
Black holes are places where god divided by zero.
From:
brian <brianj... @gmail.com>
Date: Thu, 28 Feb 2008 04:18:42 -0800 (PST)
Local: Thurs, Feb 28 2008 7:18 am
Subject: Re: documents links served through SMB malformed?
Hi Miguel,
That is very strange. The URL should be something similar to
http://<yourgsa>/smb/share/filename
Can you confirm:
1. What version are you running?
2. Are you using the default stylesheet? If not, can you try it?
3. What are your start URL patterns?
4. What does Crawl Diagnostics say for those particular URLs? Are they
in the proper format?
That is a good start. Let us know what you find.
Brian
From:
Traci <traci_latta... @ssga.com>
Date: Thu, 28 Feb 2008 05:49:20 -0800 (PST)
Local: Thurs, Feb 28 2008 8:49 am
Subject: Re: documents links served through SMB malformed?
I had the same issue. There is a bug logged for this (bug 907861), and
there is also a 'hack' solution in this thread:
http://groups.google.com/group/Google-Search-Appliance/browse_thread/...
It will work if all of your content is marked as 'make public' (which
is the case for us), and it seems to work for me.
In the XSLT, change line 2472:
- select="concat($protocol,'/',$temp_url)"/>
+ select="concat('file://///',$temp_url)"/>
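For readers who do not have the stylesheet open, the effect of that one-line change can be sketched outside XSLT. The Python below is only an illustration, not the appliance's code: it assumes `$temp_url` holds the original URL with its `smb://` prefix stripped (the thread does not show how it is computed), and the function name and sample path are hypothetical.

```python
def smb_to_file_url(orig_url: str) -> str:
    """Mimic concat('file://///', $temp_url) for an smb:// URL.

    Assumption: $temp_url is the URL with the leading 'smb://' removed.
    """
    prefix = "smb://"
    if not orig_url.startswith(prefix):
        raise ValueError(f"not an smb:// URL: {orig_url!r}")
    temp_url = orig_url[len(prefix):]
    return "file://///" + temp_url


# Hypothetical sample path, not taken from the thread.
print(smb_to_file_url("smb://smb.server.com/share/doc.rtf"))
# prints file://///smb.server.com/share/doc.rtf
```

With this form the result link points straight at the share instead of going back through the appliance, so the client machine has to be able to reach the share itself.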
Hope that helps.
Traci
From:
"Carl Gherardi" <carl.ghera... @gmail.com>
Date: Thu, 28 Feb 2008 23:00:22 +0900
Local: Thurs, Feb 28 2008 9:00 am
Subject: Re: [GSA] Re: documents links served through SMB malformed?
I think he's using feeds for these documents, hence the origUrl parameter.
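One way to check this is to pull the `origUrl` query parameter out of a failing link and see which document the appliance is being asked for. A minimal sketch using Python's standard `urllib.parse`; the sample link reuses the host names from the original post, but the path after the host is a hypothetical stand-in for the part that is elided in the thread, and the function name is made up for illustration.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit


def extract_orig_url(link: str) -> Optional[str]:
    """Return the origUrl query parameter of a getFile link, or None."""
    query = parse_qs(urlsplit(link).query)
    values = query.get("origUrl")
    return values[0] if values else None


# Hypothetical link: the path after the host is made up for illustration.
link = "http://my.gsaserver.com/getFile?origUrl=smb://smb.server.com/share/doc.rtf"
print(extract_orig_url(link))
# prints smb://smb.server.com/share/doc.rtf
```

A link in the form Brian describes (`http://<yourgsa>/smb/share/filename`) has no query string, so the function returns `None` for it.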
On Thu, Feb 28, 2008 at 9:18 PM, brian <brianj... @gmail.com> wrote:
> Hi Miguel,
> That is very strange. The URL should be something similar to
> http://<yourgsa>/smb/share/filename
> Can you confirm:
> 1. What version are you running?
> 2. Are you using the default stylesheet? If not, can you try it?
> 3. What are your start URL patterns?
> 4. What does Crawl Diagnostics say for those particular URLs? Are they
> in the proper format?
> That is a good start. Let us know what you find.
> Brian