My use of LLMs in VS Code


Michael Wimble

Oct 19, 2025, 8:44:39 PM (3 days ago) Oct 19
to hbrob...@googlegroups.com
I posted this to the RSSC list as a response, so this is a repost here. Your mileage may vary.

I have certainly found a huge difference in the capabilities of the models, and I switch between several of them constantly. My experience here reflects only my attempts to:

* Add a completely new capability to my robot, such as predictive LIDAR correction, where I look at cmd_vel over the period of the LIDAR sweep and correct the beam readings so that they all correspond to a single point in time. A project I ended up abandoning for now.
* Analyze a complex project architecturally and refactor it. This was my code base for several Teensy 4.1 custom boards that deal with guaranteed sensor scheduling and high frame rates.
* Reconfigure the Nav2 stack to usefully add 8 time-of-flight sensors, mostly to the local costmap.
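For context, the deskewing idea in that first bullet boils down to something like this sketch. It assumes one constant twist (v, w) for the whole sweep; the function and parameter names are mine, not from any library:

```python
import math

def deskew_scan(ranges, angle_min, angle_increment, scan_time, v, w):
    """Re-project each beam to the pose at the *end* of the sweep,
    assuming constant linear velocity v (m/s) and angular velocity
    w (rad/s) over the whole sweep (a constant-twist model)."""
    n = len(ranges)
    corrected = []
    for i, r in enumerate(ranges):
        theta = angle_min + i * angle_increment
        # Time from this beam's measurement to the reference (sweep end).
        dt = scan_time * (n - 1 - i) / max(n - 1, 1)
        # Robot motion over dt under a constant twist (circular arc).
        phi = w * dt
        if abs(w) > 1e-9:
            dx = (v / w) * math.sin(phi)
            dy = (v / w) * (1.0 - math.cos(phi))
        else:
            dx = v * dt
            dy = 0.0
        # Beam endpoint in the frame where it was measured...
        px = r * math.cos(theta)
        py = r * math.sin(theta)
        # ...transformed into the reference frame at sweep end: the sensor
        # translated by (dx, dy) and rotated by phi after this beam was
        # taken, so apply the inverse of that motion to the point.
        cx = math.cos(-phi) * (px - dx) - math.sin(-phi) * (py - dy)
        cy = math.sin(-phi) * (px - dx) + math.cos(-phi) * (py - dy)
        corrected.append((math.hypot(cx, cy), math.atan2(cy, cx)))
    return corrected
```

A production version would interpolate per-beam twists from odometry instead of assuming one constant twist for the whole sweep.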

There were other projects, but the gist is that my critique here covers only my experience using LLMs on ROS2 projects.

* Grok Code Fast 1 (Preview)
A more useless waste of energy and effort I have not encountered in a long time. I abandoned it after a couple of hours. I can’t overemphasize how stupid it was, and how infuriating it was to attempt to get even the simplest things done.

* GPT-5
Pretty good for research and analysis. I often use this when I want to explore a new idea or analyze my code and make suggestions for areas of improvement.

Pretty bad for actual coding. It’s kind of deceptive and seductive. I’ve asked it several times to create new projects, and the result has the appearance of beauty. It’s all shiny and talks real pretty. But every last time I’ve asked GPT-5 to actually code something, I’ve wasted huge amounts of time trying to twist the code back into something that actually works and is useful. GPT-5 has a tendency to generate huge, complex solutions that would have looked good on your resume 5 years ago, especially if you worked in a dinosaur company. The solutions are always excessively complex. It nearly always ignores my requests for changes. It will start down a path where I suggest removing some feature, but 10 minutes later it has totally forgotten what my requirements or suggestions were. It frequently says, “I will now…” and then doesn’t. It is frequently incapable of playing nicely with VS Code.

So, I often use it to explore a concept. I often use it to analyze my code for potential errors or missing features. I even use it to suggest the architecture of some new code. But I’m not sure I have a single instance where I didn’t waste days warping its code into something actually useful.

This is probably the system where I have actually spent my most time literally shouting at the computer. My wife is somewhat amused by my “YOU F****ING MORON, I JUST TOLD YOU NOT TO DO THAT” shouts coming from the lab.

* Gemini 2.5 Pro

Kind of a mixed bag here. When my favorite systems get into an echo chamber, can’t come up with a reasonable solution, or I just don’t like the uninspired solutions of the other LLMs, I switch to Gemini 2.5 for a while.

I find it’s sometimes good for focused tasks. It is rather meh at analysis, meh at helping me learn something new, and meh at doing deep research to find solutions others have found that I might incorporate for my needs. But if I need a mid-level coder to just grind out some code, like refactoring a module or adding a small feature that needs a new parser, a YAML option, logging, and so forth, it’s pretty good at grinding, provided I’m fairly explicit about limiting the bounds of what I want done and sometimes provide a hint on how to do it.

* Claude Sonnet 4

My go-to workhorse, often a bit smarter than me. It plays well with VS Code and can take a suggestion of what I want done and come up with a good, 80% solution. It builds code that isn’t pointlessly complex; it doesn’t need to show off. It can remember threads of conversation for days at a time, though it can be stupidly dogmatic about ignoring some of my requirements. For instance, I can’t tell you how many times it tries to compile ROS2 code from the wrong directory using the wrong command-line options. It then says, “The compilation failed,” generates a ton of code changes, and changes whole algorithms because it thought the last edits failed. I then have to remind it, sometimes gently, sometimes with a bit of shouting, that it didn’t switch to the correct directory, that it didn’t wait for the compile to finish, that it forgot to use “--symlink-install”, and it’s a half hour lost getting back to where we were.

But overall, it lets me generate code on the order of 10 times faster. Still, I pity any company that is deploying these LLMs and firing programmers because it thinks this stuff is good on its own, or that is relying on junior programmers to review the results. None of the LLMs are even close to being able to generate correct code the first time, or the 10th time. Complex code requires fairly deep knowledge to review for correctness.

Perhaps it’s because of the nature of the things I’m trying to do. I’m not generating simple websites using some well-worn MVC framework that just needs a tweak in graphics and a connection to a commerce engine.

I abandoned the LIDAR correction project because the LLMs just generated crap that mimicked what they thought they had read, without understanding even the basic algorithms. GPT-5 is the very worst at this (and of course I wouldn’t trust Grok to write a birthday card greeting; it’s literally that stupid). I often tell GPT-5, “Do a deep read of the RoboClaw user manual so that you understand each of the commands, especially how the motor movement commands affect acceleration.” Then it generates code and I ask, “So, why did you use the simple move command instead of the command which sets acceleration, speed, and distance, when I told you you must shape acceleration, set the commanded speed, and limit the distance? Did you actually read the manual?” And it says, “No, I didn’t actually read the manual.” “OK, try again, but make sure you actually read the manual first. Do not generate any code until you read the manual. If you do not read the manual, do not generate code. Now, read the manual and change the cmd_vel handler to use the speed, distance, and acceleration form of the command.” And it generates the simple move command. And I ask, “Did you read the manual?” And it says, “No, because…”. And I shout, “YOU F***ING MORON”.
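The command shaping I keep asking for amounts to something like this differential-drive sketch. All the constants and the function name here are illustrative, not from the RoboClaw library; the actual buffered-command call to the controller is a separate step:

```python
def wheel_speed_commands(v, w, track_width_m, counts_per_m,
                         accel_counts, horizon_s):
    """Convert a ROS cmd_vel (v in m/s, w in rad/s) into the
    (accel, speed, distance) triples that RoboClaw's buffered
    motion commands expect, in encoder counts.  All constants
    (track width, counts per meter, accel) are robot-specific."""
    # Differential-drive kinematics: per-wheel linear speed.
    v_left = v - w * track_width_m / 2.0
    v_right = v + w * track_width_m / 2.0
    # Commanded speed in encoder counts per second (signed).
    s_left = int(round(v_left * counts_per_m))
    s_right = int(round(v_right * counts_per_m))
    # Limit travel to one command horizon so a lost cmd_vel stream
    # stops the robot instead of letting it run away.
    d_left = int(round(abs(v_left) * counts_per_m * horizon_s))
    d_right = int(round(abs(v_right) * counts_per_m * horizon_s))
    return (accel_counts, s_left, d_left), (accel_counts, s_right, d_right)
```

Each triple would then be handed to RoboClaw’s buffered speed/accel/distance command, which shapes acceleration and bounds the distance, rather than to the simple move command.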

I use the non-premium AIs a lot for trivial things, like generating simple Python scripts, fixing up the package structure for colcon, making sure package.xml includes all of the dependencies, and things like that. They mostly work a treat.
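The package.xml fix-ups are mostly making sure every package the code actually imports is declared (package names here are just examples):

```xml
<!-- package.xml fragment: declare everything the code actually uses -->
<depend>rclcpp</depend>
<depend>sensor_msgs</depend>
<depend>nav_msgs</depend>
<exec_depend>python3-serial</exec_depend>
<test_depend>ament_lint_auto</test_depend>
```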

I pay for Copilot, and for the last couple of months I’ve had to pay extra for my premium usage. One month I went just over $20 beyond my Copilot fee, and that was with fairly heavy usage, probably 10 hours a day of programming for a couple of weeks. I think last month I went over by a couple of bucks.

There are new models I haven’t tried. Some models aren’t good at reviewing the past conversation to catch up. I’m not good at all at starting new chats for new problems. I’m only moderately good at creating the static prompts that help with long-term knowledge, but I’m pretty good at quickly correcting my prompts to get the model to focus on the existing problem. Well, I’m good when I use Claude.

Thomas Messerschmidt

Oct 20, 2025, 1:40:25 AM (3 days ago) Oct 20
to hbrob...@googlegroups.com, hbrob...@googlegroups.com
MICHAEL, 

Thank you very much for this long insightful evaluation. It will save me a whole lot of time when trying to get AI to help with coding.

So ChatGPT said, "no, I didn’t actually read the manual”. 😆😆😆

So TL;DR! 
😆



Thomas Messerschmidt

-  

Need something prototyped, built or coded? I’ve been building prototypes for companies for 15 years. I am now incorporating generative AI into products.

Contact me directly or through LinkedIn:   


