Thank heavens for friends.
After yesterday's post, a friend emailed me to ask "why didn't you use the advanced models of the LLMs?"
Fair question, but in my defense, there are about a zillion models out there and I can't test them all. Besides, what I was asking the AIs to do wasn't that hard. (At least to me, a human.) So I assumed they could all do it.
But as we found yesterday, only Claude (Opus 4.5) was able to do it "out of the box."
So I tried using the more advanced models. Guess what? They all worked and did a much better job.
However, they're not perfect.
In my previous post I used Gemini 3 (Thinking) and ChatGPT (5.2 Auto). They both failed in interesting and distinctive ways.
This morning I tried Gemini 3 (Pro) next to ChatGPT (5.2 Thinking/Extended) and got much better answers.
The input data was the same (here's the data: SRS blog posts from 2025), which I uploaded to each model.
I then prompted them with:
[This is a list of text links to SearchResearch blog posts for 2025. For each text link here, please create a spreadsheet with each text link in Column A, then please extract the URL from that link into Column B, and then in Column C, please write a 100 word summary of the content on the URL.]
Gemini did this, then stopped after a few and asked "Would you like me to continue generating this table for the remaining older items in your list (e.g., from September and August 2025)?"
Yeah, I would. But I replied with:
[Please generate all 55 entries]
Gemini and ChatGPT both created new spreadsheets with CORRECT URLs and summaries.
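A side note for the scripting-inclined: the first two columns of that prompt (link text and URL) are mechanical, so they can also be pulled out without an LLM and used to spot-check the URLs the models return. Here's a minimal sketch, assuming the list of links is available as HTML with ordinary anchor tags. That format is my assumption, and the names `LinkCollector` and `links_to_csv` and the sample links are hypothetical. The Summary column is left blank, since that's the part that actually needs a model.

```python
import csv
import io
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects (link text, href) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._href = None   # href of the <a> tag we are currently inside, if any
        self._text = []     # pieces of text seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link with an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.rows.append(("".join(self._text).strip(), self._href))
            self._href = None


def links_to_csv(html):
    """Return CSV text: Column A = link text, Column B = URL, Column C = empty."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["Link text", "URL", "Summary"])
    for text, url in collector.rows:
        writer.writerow([text, url, ""])
    return out.getvalue()


# Hypothetical sample input; the real list would be the uploaded file.
sample = (
    '<ul>'
    '<li><a href="https://example.com/post-1">Hypothetical post one</a></li>'
    '<li><a href="https://example.com/post-2">Hypothetical post two</a></li>'
    '</ul>'
)
print(links_to_csv(sample))
```

The resulting CSV opens directly in a spreadsheet, so Column B can be compared against whatever URLs the models produced.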
Here's the side-by-side sheet with the link to each blog post and the summary written by each of the 3 systems.
And here's what one entry looks like: