
GoogleThrashing


Michael Bernstein

Apr 12, 2002, 9:13:17 PM

This is a copy of a posting I made to my weblog
http://www.michaelbernstein.com/weblog/

The original posting is at
http://www.michaelbernstein.com/weblog/archive/2002_04_12a/view

Google's announcement of a Web API to search their two billion page
database, and Dave Winer's coverage and implementation, have me
thinking.

I'll probably end up using Mark Pilgrim's Python Google Interface when
I get around to playing with this, but meanwhile I thought I'd put out
a few ideas I had.

First, it occurred to me that besides regular and phrase searches, it
would be interesting to search for documents that link to the one
you're viewing, and see who considers it to be authoritative (it would
also be nice to get similar results from blogdex and daypop, merging
the three result sets). Then I thought, wouldn't Google then index the
generated links on the resulting page, causing a positive feedback
loop? I promptly decided to name this (as yet unobserved)
epiphenomenon GoogleThrashing.

Pondering a bit more, I can see that various sorts of feedback loops
can occur with practically any search. Consider a search for 'michael
bernstein', which currently lists my site as its number one result.
A Googlebox that included these links would automatically also link to
other Michael Bernsteins on the net, most likely raising their
relevance compared to mine.

While this little ego-surfing example isn't terribly earth-shaking, it
does point out the need for part of the content on a page to be able
to say "don't index me" to Google to prevent at least *unintended*
GoogleThrashing. Wrapping a Googlebox in a flag in the form of <span
class="dontindex"> or something similar would seem to do the trick, if
Google pays attention to it.
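
To make the "don't index me" idea a bit more concrete, here is a rough
sketch of how a crawler could honor such a flag when collecting links.
Everything in it is an assumption for illustration: the class name
"dontindex" is just the one suggested above, LinkExtractor and
indexable_links are made-up names, and nothing here reflects how Google
actually parses pages.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class LinkExtractor(HTMLParser):
    """Collects href values, ignoring anything inside a flagged element."""

    def __init__(self, skip_class="dontindex"):  # hypothetical flag name
        super().__init__()
        self.skip_class = skip_class
        self.skip_depth = 0   # > 0 while inside a flagged element
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.skip_depth or self.skip_class in classes:
            # Entering (or nested inside) a flagged region: track depth only.
            self.skip_depth += 1
            return
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.skip_depth:
            self.skip_depth -= 1


def indexable_links(html):
    """Return the links a flag-aware crawler would count for this page."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

The depth counter assumes reasonably well-formed markup; an unclosed
tag inside the flagged span would throw it off, so a real crawler would
need something sturdier.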

John Whitworth

Apr 18, 2002, 6:21:05 PM

PageRank must obviously deal with link loops. I guess it does some
netting off, so if you, result #1, link in a Googlebox to result #2,
which links back, the two links largely cancel. Since your rank is
slightly higher, you pass on slightly more than you get back, so the
net effect is a slight decrease in your page's rank.

If links weren't netted off, page A could have a million links to page
B, which had a million links to page A, and even though no one else had
any links to A or B, A and B would be top of the rankings with a
million links each.

Thus a directive telling PageRank to ignore links is not really
necessary.
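
The million-links scenario can be tried out on a toy graph. The sketch
below is not the netting-off guessed at above, and it is not Google's
actual implementation; it is a simplified power iteration in the style
of the commonly described PageRank formula. The damping value of 0.85,
the treatment of repeated links as a single link, and the page names
are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to. Repeated
    links to the same target are collapsed (an assumption of this sketch).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = set(links.get(p, ()))
            if targets:
                share = damping * rank[p] / len(targets)
                for q in targets:
                    new[q] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[p] / n
                for q in pages:
                    new[q] += share
        rank = new
    return rank


# A and B link only to each other, many times over; C is linked to by
# three other pages and links back to one of them.
graph = {
    "A": ["B"] * 1000,
    "B": ["A"] * 1000,
    "C": ["D"],
    "D": ["C"],
    "E": ["C"],
    "F": ["C"],
}
ranks = pagerank(graph)
```

In this model the number of links from A to B makes no difference, and
A and B end up ranked below C, since each page's weight depends on the
rank of the pages linking to it rather than on a raw link count.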

John


webm...@lvcm.com (Michael Bernstein) wrote in message news:<a5814370.02041...@posting.google.com>...
