Search using collection codes is poor

Rod Page

Mar 2, 2009, 12:07:57 PM
to Biodiversity Collections Index
I'm puzzled by the search results, and almost added a duplicate
collection because the results are not what I expected.

For example, if I search on the string "UTA", why are collections with
the code "UTA" the 14th and 15th results in the list? Why are they not
the first hits?

After searching on the terms "University Texas Arlington" I found
that UTA already exists, but this is more effort than it should be.

It should be trivial to add some code that parses the search string:
if it is all (or mostly) upper-case letters, then the search should be
restricted to the collection code field.
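
A minimal sketch of that heuristic in Python, purely to illustrate the
idea. The function names, the "code"/"fulltext" labels and the 0.8
threshold for "mostly" are assumptions made for this example, not
anything taken from the BCI code base.

```python
def looks_like_code(query: str, threshold: float = 0.8) -> bool:
    """Return True if the letters in the query are all (or mostly) upper case.

    The threshold is an assumed value for "mostly"; tune it to taste.
    """
    letters = [ch for ch in query if ch.isalpha()]
    if not letters:
        # Nothing alphabetic to judge, so do not treat it as a code.
        return False
    upper = sum(1 for ch in letters if ch.isupper())
    return upper / len(letters) >= threshold


def choose_search_field(query: str) -> str:
    """Pick a hypothetical search mode: 'code' or 'fulltext'."""
    return "code" if looks_like_code(query.strip()) else "fulltext"
```

With this, "UTA" would go to the code field, while "Utah" and
"University Texas Arlington" would stay in the general search.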

Regards

Rod

PS I think I now see why the search for UTA is broken. The top hits
are in Utah, whose first three letters match "UTA".

rogerhyam

Mar 4, 2009, 8:32:38 AM
to Biodiversity Collections Index
Hi Rod,

I am afraid the search uses plain MySQL full-text indexing, and
tuning it would probably cause more trouble than it solves.

I do have a fudge where I trigger a code-based search on queries of
two characters or fewer. Try looking for 'K' or 'KW', for example, and
it slips into code mode. But I had complaints when I triggered this on
three chars because people couldn't look up places like 'Kew'. There is
no restriction on case in codes, so it doesn't help to only look for
three caps: someone would complain that it didn't find lowercase codes.

You can always do code based searches just by prepending the search
with @CODE: or using the code search page here:

http://www.biodiversitycollectionsindex.org/search/by-code
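
For anyone trying to picture the routing, here is a rough Python
sketch of the behaviour described above. It is not the actual BCI
implementation: the function name, the returned labels and the
case-insensitive handling of the prefix are assumptions, and the
two-character cutoff is inferred from the 'K' and 'KW' examples.

```python
from typing import Tuple

CODE_PREFIX = "@CODE:"   # prefix mentioned in the post
SHORT_QUERY_MAX = 2      # assumed cutoff: 'K' and 'KW' trigger code mode


def route_query(raw: str) -> Tuple[str, str]:
    """Return (mode, term), where mode is 'code' or 'fulltext'."""
    query = raw.strip()
    # An explicit prefix always forces a code search.
    # Matching the prefix case-insensitively is an assumption.
    if query.upper().startswith(CODE_PREFIX):
        return "code", query[len(CODE_PREFIX):].strip()
    # Very short queries slip into code mode automatically.
    if 0 < len(query) <= SHORT_QUERY_MAX:
        return "code", query
    # Everything else, including three-letter words such as 'Kew',
    # goes to the ordinary full-text search.
    return "fulltext", query
```

So route_query("@CODE:UTA") gives a code search for "UTA", whereas a
bare "UTA" or "Kew" falls through to the full-text search.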

I was once told never to write a search engine where people have the
ability to know what is in the database by other means. The wonderful
thing about Google is you don't know what they have indexed. People
only really complain about Google when they have put a site up and
Google doesn't return their pages. Who knows what it isn't returning
the rest of the time!

Hope this helps,

Roger