ht://Dig is great and kudos to all involved in its development. I'm an amateur ht://Digger, taking over some of Benoit Majeau's work. He has posted to this list before.
I've read much of the documentation on ht://Dig, and I haven't yet found an answer to the situation that I'm facing.
Here's the situation. Users are currently able to successfully search the database for documents containing, say, "polymères" by typing it with the accent on the e. This is normal, and works well for all other French words.
But the majority of our users would really like to be able to find documents containing "polymères" without having to type the accented character(s). Currently, htsearch finds no documents on "polymeres", and that is unacceptable for our users.
Is there anything you can suggest for this situation?
Thanks in advance for your help.
Pat :-)
----------------------------------------------------------------------
To unsubscribe from the htdig mailing list, send a message to
htdig-...@sdsu.edu containing the single word "unsubscribe" in
the body of the message.
> But the majority of our users would really like to be able to find documents containing "polymères", without having to type the accented character(s). Currently, htsearch finds no documents on "polymeres", and that unacceptable for our users.
>
> Is there anything you can suggest for this situation?
Off hand I'd suggest that you use the synonym database, but I wasn't
able to find real clear documentation on how to do that.
bon chance,
Doug
At the moment, this may be the best solution. We have several people
working on internationalization, including an "ASCIIfy" system where you
could translate the accented words to ASCII. Contact Iosif Fettich
<ifet...@netsoft.ro> for more information.
Using synonyms is pretty easy. Edit the synonym file in the common
directory (the file currently in there is really a suggestion) and run
htfuzzy synonyms and you should be set. (My rundig script actually keeps
tabs on the synonym file and runs htfuzzy when necessary.)
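For what it's worth, here is a rough sketch of how the accented/unaccented pairs for the synonym file could be generated rather than typed by hand. It is not part of ht://Dig. It assumes the synonym file takes one group of whitespace-separated words per line, that you already have a list of the accented words you care about (the sample list below is made up), and that the file is then saved in the same character encoding as your indexed documents. Depending on your version you may also need "synonyms" listed in the search_algorithm attribute of your config file, so check the documentation before relying on this.

```python
import unicodedata


def strip_accents(word: str) -> str:
    """Return the word with combining accent marks removed."""
    # NFD splits "è" into "e" plus a combining grave accent,
    # so dropping the combining characters leaves plain "e".
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def synonym_lines(words):
    """Build one 'accented plain' line per word that contains accents."""
    lines = []
    seen = set()
    for word in words:
        plain = strip_accents(word)
        # Skip words without accents and words already handled.
        if plain != word and word not in seen:
            seen.add(word)
            lines.append(f"{word} {plain}")
    return lines


if __name__ == "__main__":
    # Hypothetical word list; in practice it would come from your documents.
    for line in synonym_lines(["polymères", "montréal", "document"]):
        print(line)
```

The printed lines would be appended to the synonym file, after which htfuzzy synonyms is run again as described above.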
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/
> Patrick Dugal wrote:
>
> > But the majority of our users would really like to be able to find documents containing "polymères", without having to type the accented character(s). Currently, htsearch finds no documents on "polymeres", and that is unacceptable for our users.
> >
> > Is there anything you can suggest for this situation?
>
> Off hand I'd suggest that you use the synonym database, but I wasn't
> able to find real clear documentation on how to do that.
>
> bon chance,
>
> Doug
Couldn't there be a change which would do the following:
1 - have a reference file which had "mappings", i.e.
è = e
é = e
ê = e
2 - whenever a single word like "polymères" was encountered,
the word would be remapped into a new word using the
mapping table, and then both words would be indexed,
i.e. "polymères" and "polymeres". Then when a person typed
in "polymeres" they would get a hit for "polymères"...
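To make the idea concrete, here is a minimal sketch of the two steps in Python. It says nothing about how ht://Dig's indexer actually works; the names ACCENT_MAP, remap, index_forms and build_index are invented for illustration, and the ê entry in the table is an assumed extra mapping.

```python
# Step 1: the "mappings" reference table (hypothetical contents).
ACCENT_MAP = {"è": "e", "é": "e", "ê": "e"}


def remap(word, mapping=ACCENT_MAP):
    """Replace each mapped character; leave all other characters alone."""
    return "".join(mapping.get(ch, ch) for ch in word)


def index_forms(word):
    """Step 2: return the forms to index, original first, remapped second."""
    plain = remap(word)
    return [word] if plain == word else [word, plain]


def build_index(docs):
    """Toy inverted index: lowercased word form -> set of document ids."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.lower().split():
            for form in index_forms(token):
                index.setdefault(form, set()).add(doc_id)
    return index


if __name__ == "__main__":
    index = build_index({"doc1": "Les polymères", "doc2": "Montreal"})
    # Both spellings lead to the same document.
    print(index["polymères"], index["polymeres"])
```

Indexing both forms means the accented query keeps working exactly as it does today, at the cost of one extra index entry per accented word.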
It appears that this already happens for case (e.g. when I search on
"montréal" I get hits for Montreal. This is probably not a mapping but
just everything being cast to lower case in the database).
thanks,
Glen Newton
CISTI NRC Canada