Arc Prize for AGI


Jack Park

Jun 12, 2024, 11:36:29 AM
to ontolog-forum

John F Sowa

Jun 12, 2024, 3:30:11 PM
to ontolo...@googlegroups.com
Jack,

I believe that a better term is "phony competition". Nobody has a clue about how to measure and compare intelligence in humans and other animals. The only thing they can measure is test-taking ability on artificially designed tests.

I sent a note yesterday about the difficulty of measuring what LLMs can do:  if they can find some kinds of reasoning methods somewhere on the WWW, they can get an A+ on tests that use those methods.   If not, they get a C- or worse.

During WWI, recruits were tested on the newly developed IQ tests. The published scores showed that whites scored significantly higher than blacks. Several decades later, researchers who went back to the original data found that blacks from the north scored higher than whites from the south. That data was suppressed in the original publications.

The results depended heavily on the backgrounds of the test takers and test developers. The people who designed the IQ tests were city dwellers. City dwellers from the north scored significantly higher than farmers from the south. If they had used IQ tests designed by and for farmers, blacks from the south would have outperformed whites from the north.

The tests discussed in your citations were designed by and for people who take tests on paper or a computer screen because that is the only kind of data that LLMs can process. People and other animals process information from multiple senses, including proprioception, in a moving 3D environment. AI systems are hopelessly bad compared to humans and other animals in such environments.

Examples include the failure of driverless cars in complex environments. Carnegie Mellon tests cars in Pittsburgh, which has more bridges than any other city in the world. It also has hills that create steep winding roads that go in, out, and over tunnels and many kinds of obstructions. No driverless cars can safely drive in Pittsburgh without having a driver who is ready to grab the wheel at any moment.

The tests described in your citations are designed for testing computer software, not for testing humans or other animals.  They were designed by and for researchers who want to get more funding for their pet projects.

Any results they get are worse than worthless (zero value).  They would have the negative value of generating extremely misleading results and extorting funding from more important R & D.

Anybody who doubts these points should forward this note to the people who are designing those tests.  I'd like to see their complaints.

John
 


From: "Jack Park" <jack...@gmail.com>