Filter by filetype

Timothy Ruhle

Feb 5, 2012, 7:16:48 PM
to Google-Search-...@googlegroups.com

I am trying to set up a filter for users by file type.

Using the special query terms for File Type Filtering or File Extension Filter appends text to the end of the query term. As a result, the results page displays Searched for "abc ext:pdf" and also adds that text to the suggestions, which is hardly ideal.

Setting up a separate front end for each file type or using as_filetype results in a similar predicament.

I don't really want to set up separate collections for each one, because I would end up with over 70 collections (there are 10 sites I am crawling).

Are there any alternatives that filter results by MIME type or extension without adding anything to the query term? What is the best way to filter by MIME type or extension?

Dave Watts

Feb 5, 2012, 8:41:04 PM
to google-search-...@googlegroups.com
> I am trying to set up a filter for users by file type.
>
> Using special query terms File Type Filtering or File Extension Filter adds
> text to the end of the query term. Which in turn displays Searched for "abc
> ext:pdf" and also adds that to the suggestions which is hardly ideal.
>
> Setting up a separate front end for each filetype or using as_filetype also
> results in a similar predicament.

The only mechanisms for searching by file type involve injecting a
value into the query parameter. I'm not sure why you wouldn't want
that to be displayed to the user if that's part of their search,
though. If you really don't want it displayed as part of the search,
you could rewrite the output however you like.

> I don't really want to have to set up separate collections for each one
> because then I would end up with over 70 collections (there are 10 sites I
> am crawling).

There's nothing really wrong with having seventy collections, if
that's what you want.

> Are there any other alternatives that filter results by mime or extension
> that aren't added to the query term? What is the best way to filter by mime
> or extension?

There are no options to filter by MIME type of which I'm aware. The
best way to filter by extension is to use whichever approach mentioned
previously (filetype query operator, as_filetype URL parameter, front
end) best suits you.
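
For anyone comparing these approaches, here is a minimal Python sketch of building a search request that passes the file type through the as_filetype URL parameter rather than typing an operator into q. The host name, collection name, and front end name are placeholders, and pairing as_filetype with as_ft=i (include) is my reading of the search protocol reference, so verify it against the documentation for your software version. As noted earlier in the thread, the appliance may still echo the filter in the "Searched for" text, so this only keeps the filter separate on the request side.

```python
from urllib.parse import urlencode


def build_search_url(base_url, query, filetype=None,
                     site="default_collection", client="default_frontend"):
    """Build a search URL, optionally restricted to one file type.

    site and client are placeholder names; replace them with your own
    collection and front end.
    """
    params = {
        "q": query,              # the text the user actually typed
        "site": site,            # collection to search
        "client": client,        # front end to use
        "output": "xml_no_dtd",  # raw XML results
    }
    if filetype:
        # Assumption: as_filetype names the extension and as_ft=i means
        # "include only this type". Check the XML reference to confirm.
        params["as_filetype"] = filetype
        params["as_ft"] = "i"
    return base_url + "?" + urlencode(params)


# Hypothetical host name, for illustration only.
url = build_search_url("http://gsa.example.com/search", "abc", filetype="pdf")
print(url)
```

The user's text stays alone in q, so whatever renders the page can show it without first stripping an operator out of it.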

Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
http://training.figleaf.com/

Fig Leaf Software is a Veteran-Owned Small Business (VOSB) on
GSA Schedule, and provides the highest caliber vendor-authorized
instruction at our training centers, online, or onsite.

Michael Cizmar

Feb 6, 2012, 9:13:39 AM
to Google-Search-...@googlegroups.com, google-search-...@googlegroups.com
This is also how Dynamic Navigation works. You'll notice a dnavs parameter that matches the filter. The XSLT subtracts its value from the q parameter before displaying the query. Perhaps you could do something similar in your front end if you want to hide these filters.
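
To illustrate the idea outside XSLT, here is a rough Python sketch that separates filter operators from the text shown to the user. It is not the stock stylesheet logic. The function name split_query and the regular expression are my own, and it only handles the "ext:" and "filetype:" operators mentioned in this thread.

```python
import re

# Matches ext:VALUE or filetype:VALUE when the operator starts a token.
# The lookbehind keeps words such as "next:level" from matching.
FILTER_RE = re.compile(r"(?<!\S)(ext|filetype):(\S+)")


def split_query(q):
    """Return (display_text, filters) for a raw query string.

    display_text is the query with filter operators removed and
    whitespace collapsed; filters is a list of (operator, value) pairs.
    """
    filters = FILTER_RE.findall(q)
    display = " ".join(FILTER_RE.sub(" ", q).split())
    return display, filters


display, filters = split_query("abc ext:pdf")
print(display)   # text to show in the search box
print(filters)   # filters to show elsewhere, e.g. as a selected facet
```

The full q value, filter included, still goes to the appliance. Only the displayed text changes.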

Michael

80none

Feb 8, 2012, 8:52:38 AM
to google-search-...@googlegroups.com
As for hiding the "filetype:" query term from the search box,
look for the place where the front end XSLT sets "qval".

<xsl:variable name="qval">
...


As of 6.14, "ext:" filters based on file extensions while "filetype:" filters
based on actual MIME types.

http://code.google.com/apis/searchappliance/documentation/614/xml_reference.html

Dave Watts

Feb 8, 2012, 9:57:42 AM
to google-search-...@googlegroups.com
> As of 6.14, "ext:" filters based on file extensions while "filetype:" filters
> based on actual MIME types.
>
> http://code.google.com/apis/searchappliance/documentation/614/xml_reference.html

I didn't know that. Thanks for the heads-up!
