Hi EGR,
I think it depends on what the keywords are used for. Is it search,
constructing navigation, grouping like content together? Whether you
need a controlled vocabulary depends on what you are using them for.
As Hilary notes (and what I say below repeats some of her other
points), if you are using them to enhance the findability of articles
then you need to focus on words that are not in the article. If it's
more that the keywords are about aiding navigation - i.e. if I want to
bring up articles on "Istanbul" then they need to be tagged to that -
then a controlled vocabulary makes more sense.
One thing to remember when you build a controlled vocabulary is that
you typically identify synonyms. If you can link your controlled
vocabulary to your CMS and your search engine then a controlled
vocabulary and synonym list means that when someone types in
"Istanbul" for search they will also get documents about
"Constantinople" and "Byzantium". Which is nice.
Back to your questions:
> Anyone out there fans of a controlled vocabulary?
Yes, when it's appropriate to do so. One really important thing to
note is that if it's a purely manual system dependent on content
creators referring to some long list of words somewhere then it's
going to be very, very hard to implement well. Understand the
capabilities of your CMS and your search engine and see what you can
automate.
> How do you build that?
Always focus on the outcome you want rather than creating a nice,
logically neat, abstract structure. If search is the key thing then
perhaps start with 1. search terms entered by users of your content
ordered by frequency of use and 2. keywords added by your content
creators also ordered by frequency of use. How do they compare? Any
obvious mismatches that lead to dud search results? E.g. content
creators tag things using "Byzantium", users search with
"Constantinople" and don't find what they need. Hey presto - obvious
controlled vocabulary issue. You can buy "pre-constructed"
vocabularies for lots of domains but expect that they will need
tailoring in some way and don't skimp on the user-centric approach
(and it sounds like you may have a lot of ground to cover).
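If you can export the search log and the keywords as plain lists, the
comparison above can be roughed out in a few lines of Python. This is only
a sketch: the sample data is invented, and `unmatched` is a hypothetical
helper name, not something your CMS or search engine provides.

```python
from collections import Counter

# Invented sample data: terms users typed into search, and keywords
# that content creators attached to articles.
search_log = [
    "constantinople", "constantinople", "istanbul",
    "hagia sophia", "constantinople",
]
content_tags = ["byzantium", "byzantium", "istanbul", "hagia sophia"]


def unmatched(source, other):
    """Terms in `source`, most frequent first, that never appear in `other`."""
    counts = Counter(term.strip().lower() for term in source)
    known = {term.strip().lower() for term in other}
    return [(term, n) for term, n in counts.most_common() if term not in known]


# Searched for, but never used as a keyword: likely dud searches.
print(unmatched(search_log, content_tags))  # [('constantinople', 3)]

# Used as a keyword, but never searched for: candidate synonyms.
print(unmatched(content_tags, search_log))  # [('byzantium', 2)]
```

The two lists side by side are where you'd look for pairs like
"Constantinople" / "Byzantium" that should be linked in the vocabulary.
Deciding which pairs really are synonyms is still a human job.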
> How often (if at all) is it updated?
All depends on how quickly your content, your creators and your users
change - and how much time you have. Probably somewhere between
annually and quarterly.
> Does it yield better search results--internal and external?
It can. But if poor search results are your problem then I'd try to
diagnose why that's happening. Is it how the search engine has been
configured?
Cheers,
Matt