Message from discussion The C-Prize
Newsgroups: comp.compression
From: jim_bow...@hotmail.com
Date: 18 Jun 2005 14:28:35 -0700
Local: Sat, Jun 18 2005 5:28 pm
Subject: Re: The C-Prize

Matt Mahoney writes:
> Nice choice.  Lots of contributors, lots of topics, high quality.

Here are the drawbacks I see to using the "cur" download of Wikipedia:

1) The purported "neutral point of view" is subject to systemic bias.
The process supported by Wikipedia biases the content toward the views
of people who are able and willing to contribute under those
circumstances.  It's not clear how to go about identifying this bias,
let alone measuring its impact.

2) If the snapshot of the "cur" download is delayed, it will be subject
to gaming by people who submit changes to articles whose Kolmogorov
complexity is known to be low only to the gamers.  Wikipedia keeps
prior archives of the "cur" downloads, and it is unlikely anyone has
gamed the system so far, but now that we've broached the subject there
is the potential that future versions of the "cur" downloads will have
been gamed in this way.

3) The entire history of edits is about a factor of 10 larger than the
current version.  This would be a superior corpus for the purpose of
ferreting out how various points of view bias content -- and would be
very relevant to epistemology and the social and political sciences --
and could thereby lead to a superior AI capable of considering the
source of various assertions.  If this really is beyond the capacity of
the computers available to viable contestants, then it might be
necessary to defer use of that larger corpus until more capacity or
more funding for the prize is available.
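
To make the gaming concern in point 2 concrete, here is a toy sketch in
Python.  Everything in it is an assumption for illustration: the names
gamed_text, zlib_size, SEED and N are hypothetical, zlib stands in for
"whatever compressor an honest contestant has", and raw pseudo-random
bytes stand in for planted article text (real planted text would have
to look like prose).  The point is only that data generated from a
secret seed looks incompressible to everyone except the person who
knows the seed.

```python
import random
import zlib


def gamed_text(seed: int, n_bytes: int) -> bytes:
    """Hypothetical planted content: bytes drawn from a seeded PRNG."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(n_bytes))


def zlib_size(data: bytes) -> int:
    """Size in bytes after general-purpose compression (zlib, level 9)."""
    return len(zlib.compress(data, 9))


# Assumed values, chosen only for the demonstration.
SEED = 12345
N = 100_000

payload = gamed_text(SEED, N)

# What an honest contestant pays: close to the full N bytes, since the
# data has no structure a general-purpose compressor can find.
public_cost = zlib_size(payload)

# What the gamer pays: just the parameters needed to regenerate the data
# (plus the fixed cost of the generator code, ignored here).
insider_cost = len(repr((SEED, N)).encode())

print("zlib size:   ", public_cost)
print("insider size:", insider_cost)
```

The gap between the two numbers is the advantage a gamer would hold on
the planted portion of the corpus, which is why a snapshot fixed before
the contest is announced is the safer choice.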

For a discussion of the downloads see:

http://en.wikipedia.org/wiki/Wikipedia:Database_download#Weekly_datab...

> When computers are smarter than humans, how will we know?

Given Hutter's proof, my guess is we'll know when they are capable of
more compression than humans given a comparable base of knowledge.
There is always the question of how we know how much compression
humans are capable of.  I don't have an answer to that.
