Opening up indexing process/Niche interest


Jon

Aug 8, 2008, 5:06:16 AM
to Hansard Prototype
Presumably the audience for this information is potentially pretty
varied, so the current indexing doesn't necessarily reflect what they
might be after.

Have you considered opening up indexing to others? (I suspect you have.)

Would this be a significant leap in functionality, or more of a
natural progression?

My assumption is that this would make the information accessible
to a wider range of 'niche' audiences.

regards, Jon

Robert Brook

Aug 8, 2008, 7:51:17 AM
to Hansard Prototype
Hi Jon -

Yes, but.

It depends what you mean by 'indexing'. If you mean creating a list of
words to search on, we obviously already do that with Lucene
<http://lucene.apache.org> - and that's the current word list the site
search runs off. We're still working on the index configuration:
there's quite a lot we could still tweak.
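
For anyone unsure what "a list of words to search on" amounts to, here is a minimal sketch of an inverted index in plain Python. It is an illustration only, not the site's Lucene configuration: the function names and the sample documents are hypothetical, and it assumes simple lowercase word matching with no stemming or ranking.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical sample documents, not real Hansard text.
docs = {
    "sitting-1": "The Corn Laws were debated at length",
    "sitting-2": "A debate on the railways and the corn trade",
}
index = build_index(docs)
```

Calling search(index, "corn laws") narrows the match to documents containing both words. A real engine such as Lucene adds analysis, ranking and much more on top of this basic structure.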

If you mean strict or not-so-strict 'keywording', we did run an
earlier demo with Delicious <http://delicious.com/> links embedded in
each page, but we decided to take out the javascript that ran that in
order to speed up page loading. Of course, people can still bookmark
and tag us in Delicious, Google Bookmarks, Connotea or whatever
service they want to use.

'Strict' keywords - say, from an official thesaurus - we've not
approached. We could provide the tools, others would have to provide
the content.
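
As a rough idea of the "tools" side of that split, here is a minimal sketch that assigns preferred terms from a controlled vocabulary to a piece of text. Everything in it is an assumption for illustration: the thesaurus entries are made up rather than taken from any official thesaurus, and matching is by whole words or phrases only.

```python
import re


def assign_keywords(text, thesaurus):
    """Return the sorted preferred terms whose name or variants occur in text."""
    lowered = text.lower()
    matched = set()
    for preferred, variants in thesaurus.items():
        for variant in [preferred] + list(variants):
            # Match whole words or phrases only, ignoring case.
            pattern = r"\b" + re.escape(variant.lower()) + r"\b"
            if re.search(pattern, lowered):
                matched.add(preferred)
                break
    return sorted(matched)


# Hypothetical vocabulary: preferred term -> accepted variants.
thesaurus = {
    "Railways": ["railway", "railroad"],
    "Agriculture": ["farming", "corn trade"],
}
```

The vocabulary itself is the "content" part that someone else would have to supply and maintain.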

And we did do a short demo using the Yahoo term extraction API. It was
a little underwhelming. Fine - but not what we were looking for.

Of course, someone could always download the source files
<http://www.hansard-archive.parliament.uk/> and index them any way they chose!

In summary: yes. Ish. Sort of.

Robert

Rob .

Aug 8, 2008, 1:14:08 PM
to Hansard Prototype
2008/8/8 Jon <jpr_h...@hotmail.com>:

> Presumably the audience for this information is potentially pretty
> varied as such current indexing doesn't necessarily reflect what they
> might be after.
>
> Have you considered opening up indexing (I suspect you have) to others

The content is public on the Web at persistent URLs. It has already
been indexed by others including Google and Yahoo.

> Would this be a significant leap of function or would it be more of a
> natural progression ?

Being indexed by others occurs naturally from having content public on
the Web at meaningful, persistent URLs.

> My assumption would be that this would make the information accessible
> to wider range of 'niche' audiences

Currently over 90% of visitors to the site arrive via Web searches.
Most of those visitors followed links to the site that appeared in
Google search results.

In the past month, over 44,000 different search terms were entered in
Web searches that led to people reaching the Hansard site. Seems like
a wide range of niche audiences may already be finding information of
interest to them.
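
For what it's worth, a count like that can be sketched from referrer URLs in a web server log. This is a minimal sketch, not how the figure above was produced: the function names and sample URLs are hypothetical, and it assumes the search phrase is carried in a "q" or "p" query parameter.

```python
from urllib.parse import urlparse, parse_qs


def search_term(referrer, params=("q", "p")):
    """Return the normalized search phrase in a referrer URL, or None."""
    query = parse_qs(urlparse(referrer).query)
    for name in params:
        values = query.get(name)
        if values:
            # Lowercase and collapse whitespace so variants count once.
            return " ".join(values[0].lower().split())
    return None


def distinct_terms(referrers):
    """Collect the set of distinct search phrases across all referrers."""
    terms = set()
    for referrer in referrers:
        term = search_term(referrer)
        if term:
            terms.add(term)
    return terms


# Hypothetical referrer URLs, not taken from real logs.
referrers = [
    "http://www.google.com/search?q=corn+laws+1846",
    "http://www.google.com/search?q=Corn+Laws+1846",
    "http://search.yahoo.com/search?p=railway+bill",
    "http://example.org/some/page",
]
```

Here len(distinct_terms(referrers)) counts the distinct phrases, with differences in letter case and spacing folded together.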

What form of indexing did you feel would make the information more
accessible to a wider range of audiences? How would we define the goal
of making the information more accessible, and how would we measure
whether we achieved that goal?

-Rob

> regards, Jon

Guy Drury

Aug 8, 2008, 4:56:49 PM
to hansard-...@googlegroups.com
Hi. Maybe make it more accessible by linking from parliament.uk? (At
the moment it says there aren't any Hansard versions online save the
main [current] one.) Is there a reason why this version is not
currently linked? I only found out about it from a reply on Twitter!
Guy.


Robert Brook

Aug 8, 2008, 6:01:12 PM
to Hansard Prototype
We were actually on the front page of http://www.parliament.uk for a
couple of days this week, but we've dropped off now...

The site is listed on the "Digitised Historical Parliamentary
Material" page at:
<http://www.parliament.uk/parliamentary_publications_and_archives/parliamentary_archives/archives_electronic.cfm#hansardhc>
- and even that's probably pushing the boat out a bit. We're operating
quite a bit outside the official limits.

We're still an experimental site. That might change.


Rob .

Aug 15, 2008, 9:24:40 PM
to Hansard Prototype, Jon
Here's a video by an Assistant Professor of Cultural Anthropology from
Kansas State University that summarizes the change we are seeing now
that we and information are networked via the Web:

Information R/evolution - http://www.youtube.com/watch?v=-4CV05HyAbM


Jon

Aug 18, 2008, 10:56:04 AM
to Hansard Prototype
I think this probably answers the question I was trying to ask, which
was basically: are you (or the system) going to allow people to tag
information directly in this system, or are you going to 'rely' (I'm
not sure 'rely' is the correct word here) on others to undertake tagging
via external means?

I think that if the link is representative of your view, then you
have answered my question.
