Is There a Workaround for Partial Wildcard Search?


rdesarmes

Jun 1, 2007, 10:05:01 AM
to Google Search Appliance
I'm replacing a legacy search application with a Google Search
Appliance (GB 1001). In the legacy search app users can search on a
portion of the word and the search engine would return the document
containing the whole word. For instance a search for mathema* would
return a document containing mathematics. Google can't do that.
Google can only do "whole word" wildcard searches. I know I can do
word stemming, but it's no substitute for wildcard searches.

Has anyone been able to find a workaround for partial wildcard
searches (such as regular expressions or any subword pattern matching
scheme etc)?

7la...@gmail.com

Jun 1, 2007, 11:31:17 AM
to Google Search Appliance
I'm afraid I don't have an answer for you, but rather a question. Do
you have any idea of how many people in your enterprise use
wildcards? According to Google's statistics less than 2% of users
have ever used advanced search features.

Is there a specific application you have in mind for wildcard search?
A database lookup, for example? Perhaps there's a realtime (vs
indexed) workaround.

Skip Knox

Jun 1, 2007, 11:43:59 AM
to Google-Sear...@googlegroups.com
> According to Google's statistics less than 2% of users have ever used advanced search features.

Ah, but *why*?

1. Because they have to click on Advanced Search to get to it, since normal wildcarding doesn't work
2. Once there, the Advanced Search screen is complex and cryptic, in comparison with the dead-simple interface of the main screen.

Solutions?
1. Allow basic wildcards. Asterisk and question mark. This will take care of a great many users.
2. Simplify Advanced Search and call it Detailed Search or Better Search or some such. Offer the Boolean set and maybe scope by domain. Just three or four choices. Then from *there* offer Advanced Search with all the gory details.

My guess is that the same 2% will drive on down to Advanced Search (most of us go there only to learn how to invoke its features from the input line anyway), but that another 5% or so will go to the Detailed Search. This is in addition to the 10% or so who would start using wildcards.

Statistics provided by the Skip Knox Institute of Thin Air Research.

-= Skip =-


rdesarmes

Jun 1, 2007, 12:11:39 PM
to Google Search Appliance
Thanks for the reply. For this application 100% of the users will use
wildcards (it's part of their training) because they'll be searching
for terms that they don't know how to spell.
And I don't want it as an advanced search feature, just an asterisk
and perhaps a question mark, as Skip suggested.

7la...@gmail.com

Jun 1, 2007, 2:11:41 PM
to Google Search Appliance
rdesarmes, I think you're putting the cart before the horse.
Shouldn't the order of priorities be content, search, training? Why
train users on a feature that doesn't exist?

I use DOS and Linux and vi and I love the elegance of wildcarding for
zero, one, and more characters, but this functionality usually goes
beyond ordinary users.

Wouldn't it make the most sense to train your users to spell instead
of how to compensate for misspellings? Can you give more insight as
to the content and how it is formatted?

Also, are you aware that the GSA's spell checker intelligently learns
new jargon from the indexed content? The GSA I administer suggests my
name when I misspell it, for example.

rdesarmes

Jun 1, 2007, 3:24:52 PM
to Google Search Appliance
I agree completely with you. The order of priority is content,
search, and training.
The training I was referring to is from the current legacy app. They
haven't been trained on Google Search yet since we haven't implemented
Google yet. We need to solve our wildcard issue before replacing our
legacy search.

As far as spelling goes, the document repository is healthcare content
so it contains a lot of medical terms (such as leiomyomatosis and von
Hippel-Lindau disease) and medical drug names. It's tough to teach
the users to spell such terms unless you send them to medical
school :)

The DYM feature of Google can help, but is still no substitute for
wildcard search.

Dave Lemen

Jun 4, 2007, 6:31:03 AM
to Google Search Appliance
> ... In the legacy search app users can search on a
> portion of the word and the search engine would return the document
> containing the whole word. For instance a search for mathema* would
> return a document containing mathematics. Google can't do that.

Since you've already selected the GSA, one way to accomplish this
would be to create a separate index containing a list of those
difficult-to-spell words, using a tool that supports wildcards. Then,
create a OneBox module that checks each query for wildcard characters,
and if it finds any, throws the term over the fence to see if there
are any hits. You would then display any hits that come back as links
to a GSA query on the exact spelling.
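
A minimal sketch of the lookup step described above, in Python, under
several assumptions: the term list, the appliance host name
(gsa.example.com), and the function names are all hypothetical, and the
"separate index" is stood in for by an in-memory list matched with the
standard fnmatch module. Formatting the hits as an actual OneBox
response is left out, since that depends on the appliance's OneBox
configuration.

```python
import fnmatch
from urllib.parse import urlencode

# Hypothetical term list; in practice this would come from the separate index
# of difficult-to-spell words.
TERMS = ["leiomyomatosis", "leiomyoma", "mathematics", "von Hippel-Lindau disease"]

# Hypothetical appliance address (an assumption, not a real host).
GSA_SEARCH_URL = "http://gsa.example.com/search"


def has_wildcard(query):
    """Return True if the query contains an asterisk or a question mark."""
    return "*" in query or "?" in query


def expand(query, terms=TERMS):
    """Return the known terms matching a wildcard query, case-insensitively.

    Queries without wildcards return an empty list, so the normal search
    handles them. Note that fnmatch also treats [seq] as a character class.
    """
    if not has_wildcard(query):
        return []
    pattern = query.lower()
    return [t for t in terms if fnmatch.fnmatchcase(t.lower(), pattern)]


def suggestion_links(query, terms=TERMS):
    """Pair each matching term with a search URL for its exact spelling.

    Wrapping the term in double quotes is an assumption about how to ask
    for the exact spelling, which matters for multi-word terms.
    """
    links = []
    for term in expand(query, terms):
        params = urlencode({"q": '"%s"' % term})
        links.append((term, GSA_SEARCH_URL + "?" + params))
    return links
```

With this list, a query such as leiomyo* would yield links for both
leiomyomatosis and leiomyoma, while a query with no wildcard characters
yields nothing and falls through to the ordinary search.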
