Wolfram LLM Benchmarking Project

Alex Shkotin

Oct 6, 2024, 5:10:34 AM

to ontolog-forum

Hi All,

At our meeting today, Mike Peters showed on his second slide a page from his blog about the Wolfram team's great LLM benchmarking project. It evaluated 114 LLMs on coding, and the highest score for "semantics" was just 52.2% 🐓

"All LLMs are wrong, but some are useful"

Alex
