The nanoc data lives in text files: each has a YAML header (title, date, etc.) followed by Markdown content. While the nanoc compiler runs, the content is represented as Ruby objects (each item has a title, date, etc.), so I think that would be the right time to run the indexer, as a pre- or postprocessing step.
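To make the file layout concrete, here is a minimal sketch in plain Ruby of splitting such a file into its header and body and flattening it into a record an indexer could consume. It does not use the nanoc API; the `---` delimiters, the helper names `parse_item` and `index_record`, and the `tags` key are assumptions for illustration. Inside a real nanoc run the items are already parsed, so only the second step would be needed.

```ruby
require "yaml"
require "date"

# Hypothetical helper: split a source file into its YAML header and Markdown body.
# Assumes the header is delimited by lines containing only "---".
def parse_item(text)
  match = text.match(/\A---\s*\n(.*?)\n---\s*\n(.*)\z/m)
  return { "meta" => {}, "body" => text } unless match

  # Date is permitted explicitly because unquoted YAML dates load as Date objects.
  meta = YAML.safe_load(match[1], permitted_classes: [Date]) || {}
  { "meta" => meta, "body" => match[2] }
end

# Hypothetical helper: flatten one parsed item into a record for the indexer.
def index_record(item)
  meta = item["meta"]
  {
    title: meta["title"].to_s,
    date: meta["date"].to_s,
    tags: Array(meta["tags"]),
    content: item["body"].strip
  }
end

sample = <<~TEXT
  ---
  title: Hello search
  date: 2012-03-04
  tags: [nanoc, search]
  ---
  Some *markdown* content.
TEXT

record = index_record(parse_item(sample))
puts record.inspect
```

Keeping the record flat (title, date, tags, content) means the indexer does not need to know anything about how the site is compiled.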
I haven't really decided on the output question. I like the fine-grained way of showing categories, limiting queries, etc., but it's not really necessary. The website is aimed at the general public, not library research assistants, so many of those options probably won't be used. My goal is more this: the interface should be user friendly, and search should distinguish between words in the main content or tags of a blog article and words in a tag cloud that merely happens to be on the same page.
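One way to get that distinction is to index only the fields that belong to the article and never feed the tag cloud to the indexer at all. The following is a rough sketch of that idea, not a real search library: the class `FieldIndex`, its methods, and the page paths are all made up for illustration.

```ruby
# Hypothetical field-aware index: every word is stored together with the page
# and the field (content or tags) it was found in.
class FieldIndex
  # The tag cloud is deliberately not listed, so it is never indexed.
  FIELDS = %i[content tags].freeze

  def initialize
    @postings = Hash.new { |hash, word| hash[word] = [] }
  end

  # fields is a hash such as { content: "...", tags: [...], tag_cloud: [...] }.
  def add(page_id, fields)
    FIELDS.each do |field|
      tokenize(fields[field]).uniq.each do |word|
        @postings[word] << [page_id, field]
      end
    end
  end

  # Returns the ids of pages containing the word, optionally in one field only.
  def search(word, field: nil)
    hits = @postings.fetch(word.downcase, [])
    hits = hits.select { |(_, f)| f == field } if field
    hits.map(&:first).uniq
  end

  private

  def tokenize(value)
    Array(value).join(" ").downcase.scan(/[[:alnum:]]+/)
  end
end

index = FieldIndex.new
cloud = %w[ruby python search]
index.add("/blog/ruby-tips/", content: "Notes on Ruby blocks", tags: ["ruby"], tag_cloud: cloud)
index.add("/blog/python-intro/", content: "Getting started", tags: ["python"], tag_cloud: cloud)

puts index.search("python").inspect             # ["/blog/python-intro/"]
puts index.search("ruby", field: :tags).inspect # ["/blog/ruby-tips/"]
puts index.search("search").inspect             # []
```

Both pages carry the same tag cloud, yet a search for "python" only finds the article that is actually tagged with it, and "search", which appears only in the cloud, finds nothing.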