The search protocol reference document says:
You can specify multiple file types by adding filetype: terms to the
search query, combined with the Boolean OR.
I tried adding the following to my query:
&as_filetype=html&as_filetype=pdf
but it is returning only PDF documents. Am I doing anything wrong
here?
I'd appreciate any help or guidance.
Thanks,
Afzal
--
You received this message because you are subscribed to the Google Groups "Google Search Appliance/Google Mini" group.
To post to this group, send email to Google-Search-...@googlegroups.com.
To unsubscribe from this group, send email to Google-Search-Applia...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/Google-Search-Appliance-Help?hl=en.
Thanks for your help; what you suggested worked. I have another
question, though:
The search results page is displaying
Results 1 - 2 of about 1570 for training filetype:pdf OR filetype:doc
OR filetype:html OR filetype:htm.
Is there any way to hide these filetype terms so that it only
says
Results 1 - 2 of about 1570 for training
Thank you
-Afzal
On Mar 17, 3:29 pm, Marcos Farias <mfarias2...@gmail.com> wrote:
> Hi Afzal,
>
> Take a look at http://code.google.com/apis/searchappliance/documentation/62/xml_refe...
>
> You'll notice that as_filetype is different from filetype. The first is
> a distinct param, while the second (filetype) is a special query term,
> which means you should include it inside your q param value.
>
> For instance, if you want to get all pdf or doc files that include
> "health", you can use "q=health filetype:pdf OR filetype:doc" as in the
> following example:
>
> /search?q=health+filetype:pdf+OR+filetype:doc+&btnG=Google+Search&access=p&client=default_frontend&output=xml_no_dtd&proxystylesheet=default_frontend&sort=date:D:L:d1&entqr=3&oe=UTF-8&ie=UTF-8&ud=1&site=default_collection
>
> Regards and good luck,
> Marcos Farias
> http://www.justdigital.com.br/
>
>
>
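For anyone building this query in code rather than by hand, here is a minimal Python sketch of the approach Marcos describes: the filetype: terms go inside the q value, joined with OR, instead of being sent as repeated as_filetype params. The helper names build_filetype_query and build_search_url are hypothetical, and the parameter values are simply copied from the example URL in this thread, so adjust them for your own front end and collection.

```python
from urllib.parse import urlencode


def build_filetype_query(terms, filetypes):
    """Return a q value such as 'health filetype:pdf OR filetype:doc'."""
    type_clause = " OR ".join(f"filetype:{ft}" for ft in filetypes)
    return f"{terms} {type_clause}".strip()


def build_search_url(terms, filetypes,
                     site="default_collection", client="default_frontend"):
    """Build a relative /search URL (hypothetical helper, not a GSA API)."""
    params = {
        "q": build_filetype_query(terms, filetypes),
        "site": site,
        "client": client,
        "output": "xml_no_dtd",
        "proxystylesheet": client,
    }
    # safe=":" leaves the colon in "filetype:pdf" unescaped; spaces become "+".
    return "/search?" + urlencode(params, safe=":")


print(build_search_url("health", ["pdf", "doc"]))
```

Note that as_filetype does not appear anywhere in the generated URL; the file type restriction lives entirely in q.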
> On Wed, Mar 17, 2010 at 3:17 PM, Afzal <afzal....@gmail.com> wrote:
> > Can we search multiple file types without using the filter option in
> > GSA? I want to use the query parameter to search for files
> > with extensions like .html, .doc, and .pdf.
I tried adding the filetype filters to the collection as you
suggested, but I am still seeing files that are not supposed to be in
the result set. Here is the list of file type patterns I added to the
collection:
.css$
.csv$
.doc$
.dot$
.exe$
.gif$
.htm$
.html$
.jpg$
.pdf$
.ppt$
.prn$
.txt$
.xls$
Am I doing anything wrong here?
Please help.
Thanks,
-Afzal
On Mar 18, 10:39 am, Marcos Farias <mfarias2...@gmail.com> wrote:
> Afzal,
>
> I believe the easiest (and perhaps the best) way to reach that goal is
> to make use of collections. You could create a collection, named
> docs_collection for instance, which would include just pdf, doc, html and
> htm files.
>
> Then, when performing a search, you would just put the string training
> in the q param and set the site param to docs_collection.
>
> This could also give you the extra benefit of getting results more quickly
> than with the filetype special query term.
>
> Go to Help Center > Crawl and Index > Collections in your GSA's Admin
> Console for more information on how to use this feature.
>
> Good luck,
> Marcos Farias
> http://www.justdigital.com.br/
Well, first, why do you have .exe and .gif in there? You don't want
those in your index, do you? They aren't searchable text, really.
Second, are you filtering by collection?
Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
http://training.figleaf.com/
Fig Leaf Software is a Veteran-Owned Small Business (VOSB) on
GSA Schedule, and provides the highest caliber vendor-authorized
instruction at our training centers, online, or onsite.
Thanks,
On Mar 19, 12:26 pm, Dave Watts <dwa...@figleaf.com> wrote:
> Well, first, why do you have .exe and .gif in there? You don't want
> those in your index, do you? They aren't searchable text, really.
>
> Second, are you filtering by collection?
>
> Dave Watts, CTO, Fig Leaf Software
> http://www.figleaf.com/
> http://training.figleaf.com/
I have put the same filter in Crawl and Index too, and added the
patterns in both cases (lower and upper).
Here is the URL that I am currently using:
/search?q=training&site=hcs_hin_dev&client=hcs_frontend&output=xml_no_dtd&proxystylesheet=hcs_frontend&filter=1
I have defined the filters in the collection "hcs_hin_dev".
Regards,
On Mar 19, 1:31 pm, Marcos Farias <mfarias2...@gmail.com> wrote:
> Afzal,
>
> 1 - Are you sure you are not confusing the "Crawl and Index" menu with
> the "Collections" menu?
> 2 - Remember that GSA is case sensitive by default. If your files have
> different cases in their extensions, you could try using regexpIgnoreCase.
> 3 - Could you send us the URL you are using to perform the search? Maybe
> that will give us a hint of what is happening.
>
> Rgds
>
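To illustrate point 2 above, here is a small Python sketch of why a case-sensitive suffix pattern such as .pdf$ would miss a file named REPORT.PDF. This uses Python's re module as a stand-in and is not the GSA's own pattern matcher; the function name matches_suffix and the example.com URLs are made up for the illustration, and the exact syntax for regexpIgnoreCase patterns should be checked in your Admin Console help.

```python
import re

# A subset of the extensions posted earlier in this thread.
EXTENSIONS = ["doc", "htm", "html", "pdf"]


def matches_suffix(url, extensions, ignore_case=False):
    """Return True if url ends with '.<ext>' for one of the extensions.

    Python's re is only a stand-in here; the GSA applies its own rules.
    """
    flags = re.IGNORECASE if ignore_case else 0
    pattern = r"\.(?:" + "|".join(re.escape(ext) for ext in extensions) + r")$"
    return re.search(pattern, url, flags) is not None


for url in (
    "http://example.com/a/report.pdf",
    "http://example.com/a/REPORT.PDF",
    "http://example.com/a/logo.gif",
):
    print(url,
          matches_suffix(url, EXTENSIONS),
          matches_suffix(url, EXTENSIONS, ignore_case=True))
```

With case-sensitive matching only the lower-case .pdf URL matches; with ignore_case=True the upper-case one matches too, while the .gif URL is excluded either way.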