
THEORY: InfoRaptor project

Jorn Barger

Mar 12, 2003, 7:43:43 AM
I've started a new subpage for my DrawBack browser project:
http://www.robotwisdom.com/drawback/inforaptor.html
that will cover the knowledge acquisition bot which I'm
calling "InfoRaptor".

The theory is that as one does research on the Web, current
tools continually *throw away* information, so that when
you want to look up something you ran across some time ago,
your odds of finding it quickly are pretty slim.

So the first strategy of InfoRaptor will be to archive and
word-index all visited pages, and especially all search-
results.
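Something like this, say -- a minimal Python sketch of the
archive-and-index step (every name here is a guess, not a
spec):

    import os, re, hashlib

    ARCHIVE_DIR = "archive"  # local copies of every visited page

    def archive_and_index(url, html, index):
        """Save a local copy of the page, then add every word in
        it to an inverted index: word -> set of archived paths."""
        os.makedirs(ARCHIVE_DIR, exist_ok=True)
        name = hashlib.md5(url.encode("utf-8")).hexdigest() + ".html"
        path = os.path.join(ARCHIVE_DIR, name)
        with open(path, "w", encoding="utf-8") as f:
            f.write(html)
        for word in set(re.findall(r"[a-z0-9]+", html.lower())):
            index.setdefault(word, set()).add(path)
        return path

    def lookup(words, index):
        """All archived pages containing every one of the words."""
        hits = [index.get(w.lower(), set()) for w in words]
        return set.intersection(*hits) if hits else set()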

But a local word-index is only going to be a slight
improvement over Google, so the second level of attack is
topical master-pages, that bring together on a single
local (unpublished) webpage _all_ the resources you've
come across for a given topic (even likely links that you
haven't followed yet).
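In data terms a master-page might be nothing fancier than a
topic plus every resource you've touched, followed or not --
a rough sketch, with the field names all guesses:

    from dataclasses import dataclass, field

    @dataclass
    class Resource:
        url: str
        note: str = ""          # the few words typed when filing it
        followed: bool = False  # False = a likely link not yet visited

    @dataclass
    class MasterPage:
        topic: str
        resources: list = field(default_factory=list)

        def render(self):
            """Emit the local (unpublished) webpage for this topic."""
            lines = ["<h1>%s</h1>" % self.topic]
            for r in self.resources:
                tag = "" if r.followed else " (unvisited)"
                lines.append('<li><a href="%s">%s</a>%s</li>'
                             % (r.url, r.note or r.url, tag))
            return "\n".join(lines)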

Pages for publication will be represented as a subset of
the master page, and InfoRaptor will allow the author to
click on a section of a published page and jump to the
corresponding part of the master page, where all the
related-but-rejected links, and all the related search-
queries, will be cached.
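The click-through could be as simple as the published page
sharing section-ids with the master-page it was distilled
from -- purely illustrative:

    # Hypothetical ids; each published section remembers which
    # master-page section it came from.
    published_to_master = {
        "pub-sec-1": "master-sec-17",
        "pub-sec-2": "master-sec-03",
    }

    def jump_to_master(pub_id, master_url="file:///master.html"):
        """Open the master-page anchor for a published section,
        where the rejected links and cached searches live."""
        return "%s#%s" % (master_url, published_to_master[pub_id])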

So the design challenge is to automate the maintenance
of these master-pages as much as possible, so that
they're useful, despite their gigantic size...

I've posted before to alt.hypertext about what I called
'action hypertext'-- this is the latest evolution of
that. The (new) general idea is that you have a master-
hierarchy of topics, and as you do research InfoRaptor
will display your current place in that hierarchy, plus
a range of nearby topics.
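One possible shape for that display, using a toy slice of a
hierarchy (nothing here is settled):

    hierarchy = {
        "literature": ["joyce", "homer"],
        "joyce": ["ulysses", "finnegans-wake"],
        "ulysses": [],
    }

    def neighborhood(topic):
        """Your current place plus nearby topics: the parent,
        the siblings, and the immediate children."""
        parent = next((p for p, kids in hierarchy.items()
                       if topic in kids), None)
        siblings = [t for t in hierarchy.get(parent, [])
                    if t != topic]
        return {"here": topic, "up": parent,
                "across": siblings,
                "down": hierarchy.get(topic, [])}

    print(neighborhood("joyce"))
    # {'here': 'joyce', 'up': 'literature', 'across': ['homer'],
    #  'down': ['ulysses', 'finnegans-wake']}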

When you want to 'dismiss' a page, you select the best
topic-match, and type a few words describing the page.
(Action-hypertext theory suggests that there will be
various standard categories that may be offered in a
menu as well, like 'image' or 'etext' or 'analysis',
or 'read later' or 'too deep' or 'too shallow' etc.)

The master-page will remember the date, the annotations,
the original link, and the local archived copy, sorted
into a convenient spot under the proper subtopic.
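Putting the last two paragraphs together, the dismiss step
might reduce to something like this (details all guesswork):

    import time

    STANDARD_CATEGORIES = ["image", "etext", "analysis",
                           "read later", "too deep", "too shallow"]

    def dismiss(masterpages, url, archived_copy, topic, note,
                category=None):
        """File a page under the best topic-match, remembering
        the date, the annotation, the original link, and the
        local archived copy."""
        entry = {
            "date": time.strftime("%Y-%m-%d"),
            "note": note,           # the few words you typed
            "category": category,   # optional pick from the menu
            "url": url,             # the original link
            "copy": archived_copy,  # path to the local archive
        }
        masterpages.setdefault(topic, []).append(entry)
        return entry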

You'll have the ability to flag search-strings and
websites for the bot to monitor for new content or
other changes. (One of the main functions is to
semi-automate finding substitute links when one goes
404.)
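The monitor loop might look like this, using nothing but the
standard library (the find-a-substitute step just re-offers
whatever search-string found the page in the first place):

    import urllib.request, urllib.error

    monitored = [
        # (url, search-string that found it) -- user-flagged
        ("http://example.com/essay.html",
         '"finnegans wake" chronology'),
    ]

    def status(url):
        """HTTP status for a monitored page, None if unreachable."""
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:
                return resp.status
        except urllib.error.HTTPError as e:
            return e.code
        except urllib.error.URLError:
            return None

    for url, query in monitored:
        if status(url) == 404:
            # semi-automated: surface the saved query so the user
            # can re-search for a substitute copy of the dead page
            print("404:", url, "-- try re-searching:", query)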
