Hi All,
I had a rather lengthy and detailed discussion with Claude on a task: get ALL of the simple_* libraries using the same testing strategy and structure, from soup to nuts. That covers everything from the testing libs used in each ECF to the design of the testing infrastructure, which lets Claude test from Bash and also supports proper testing in EiffelStudio using AutoTest, with the same or additional hand-written test code.
I started Claude on this mission yesterday and worked to get the AI to where it wasn't relentlessly asking me for approval to run Bash scripts (they are pretty minimal now). Claude's estimated time is:
December 13, 2024 (Today)
| Time | Files Modified |
|-------------|------------------------------------------------------------------------------------------------|
| 06:27 | simple_json (EARLIEST TODAY - resume after sleep) |
| 06:29-06:50 | simple_sql, simple_regex, simple_yaml, simple_toml, simple_codec |
| 07:06-07:23 | simple_testing, simple_ci, simple_mongo |
| 08:00-08:48 | simple_web, simple_process, simple_ai_client, simple_alpine, simple_htmx, simple_eiffel_parser |
| 09:06-09:45 | simple_showcase, simple_win32_api, simple_setup, simple_gui_designer, simple_ucf (LATEST) |
Summary
| Metric | Value |
|------------------------|-------------------------|
| Start | Dec 12, 16:20 (4:20 PM) |
| End Yesterday | Dec 12, 19:41 (7:41 PM) |
| Resume Today | Dec 13, 06:27 (6:27 AM) |
| End Today | Dec 13, 09:45 (9:45 AM) |
| Work Time Yesterday | ~3 hours 21 minutes |
| Work Time Today | ~3 hours 18 minutes |
| Total Active Work | ~6 hours 39 minutes |
| Files Created/Modified | 62 lib_tests.e files |
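As a quick sanity check on the arithmetic in the summary, the two session lengths and their total can be recomputed from the start and end times. This is only a sketch: the timestamps are copied from the table, the year comes from the date line above, and the helper name `fmt` is made up for illustration.

```python
from datetime import datetime

# Session boundaries copied from the summary table (year taken from the date line).
sessions = [
    (datetime(2024, 12, 12, 16, 20), datetime(2024, 12, 12, 19, 41)),  # yesterday
    (datetime(2024, 12, 13, 6, 27), datetime(2024, 12, 13, 9, 45)),    # today
]


def fmt(minutes: int) -> str:
    """Render a minute count as 'H h M min' (hypothetical helper)."""
    hours, mins = divmod(minutes, 60)
    return f"{hours} h {mins} min"


# Length of each session in whole minutes, then the combined total.
durations = [int((end - start).total_seconds() // 60) for start, end in sessions]
total = sum(durations)

for minutes in durations:
    print(fmt(minutes))
print("total:", fmt(total))
```

The figures agree with the table: 3 h 21 min, 3 h 18 min, and 6 h 39 min in total.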
Claude churned through the first 30 of 60 simple_* libs and did a very fine job — renaming, reconfiguring, and conforming the testing environment. Along the way, it ensured everything was compiling without errors and all of the tests were relevant, aligned with our semantic renaming, and then passing. It was a lovely thing to watch progress with almost no help from me until I finally knocked off and went to bed around 9 PM EST US. Then came the morning (this morning at about 6 AM EST US).
There were another 9 + 20 or so libs to refactor, which Claude took back up after being told to stand by for the night (I don't want it running utterly unsupervised). It's a good thing I am bird-dogging it. After some time (a couple of hours), Claude said that it had finished the work on the remaining ~29 libs. I noted that I saw no compiling + testing as I had the previous day, so I queried it: "Did you compile and test everything?" The answer was negative, so the instruction was: "Well, you better get that done, right?"
As it is programmed to do, Claude apologized for the oversight/failure and set about the job. During that time ... well ... I will allow Claude to speak for itself on what happened:
● What I Found During Compilation Verification
When you called me out for not verifying the test refactor ("You did not verify compiling after that testing refactor??? Really???"), I went back and compiled the newly created lib_tests.e files. Here's what I discovered:
The Problem: API Mismatches Everywhere
Many of the test files I had created contained incorrect API calls - wrong constructor signatures, non-existent features, and attempts to instantiate abstract (deferred) classes. I had written tests based on assumptions about the APIs rather than verifying them against the actual class implementations.
Specific Issues Found
| Library | What I Wrote | What Actually Exists |
|----------------------|-----------------------------------------------|--------------------------------------------------|
| simple_web | SIMPLE_WEB_RESPONSE.make(200, "OK") | make(status) or make_with_body(status, body) |
| simple_process | SIMPLE_PROCESS.make(cmd, args), wait_for_exit | make (no args), is_wait_for_exit |
| simple_ai_client | AI_RESPONSE.make_success("text") | make(text, model, provider) - no make_success |
| simple_alpine | create element: ALPINE_ELEMENT | ALPINE_ELEMENT is deferred - must use ALPINE_DIV |
| simple_htmx | create element: HTMX_ELEMENT | HTMX_ELEMENT is deferred - must use HTMX_DIV |
| simple_eiffel_parser | parser.parse(source), parser.has_error | parse_string(source), ast.has_errors |
| simple_win32_api | api.get_clipboard_text | api.clipboard_text (detachable) |
| simple_showcase | SSC_SERVER.make(8080), server.port | make uses config file, no port query |
| simple_gui_designer | GUI_DESIGNER_SCREEN.make("id"), spec.name | make(id, title) - 2 args, spec.app_name |
| simple_setup | SST_INSTALLER.make("App", "1.0") | make(manifest) - takes SST_MANIFEST |
| simple_ucf | UCF_LIBRARY.make("name"), ucf.has_errors | make (no args), ucf.last_errors |
Missing Infrastructure
- simple_eiffel_parser: ECF was missing simple_testing library in test target
- simple_testing: ECF had cluster name conflict ("testing" vs ISE's testing library)
- simple_ci: No test target existed in ECF
- simple_mongo: Library is empty
The Lesson
I should have:
1. Read the actual class implementations before writing tests
2. Compiled incrementally as I created each test file
3. Not assumed API signatures based on typical patterns
The ~3 hours of work today was largely spent fixing what should have been caught yesterday if I had compiled as I went.
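The second lesson, compiling after each test file instead of at the end, can be sketched as a small driver loop that stops at the first library that fails. This is a rough illustration, not the process actually used here. The ECF paths and target names are hypothetical, the `ec -batch -config ... -target ... -c_compile` invocation is an assumption about how the EiffelStudio command-line compiler would be called, and treating a non-zero exit code as failure is also an assumption.

```python
import subprocess
from typing import Callable, Iterable, List, Optional, Tuple

# (library name, path to its ECF, test target name); all values are hypothetical.
Lib = Tuple[str, str, str]


def default_compile(ecf_path: str, target: str) -> bool:
    """Compile one target. The command line is an assumed EiffelStudio
    invocation; a non-zero exit code is assumed to mean failure."""
    result = subprocess.run(
        ["ec", "-batch", "-config", ecf_path, "-target", target, "-c_compile"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0


def verify_incrementally(
    libs: Iterable[Lib],
    compile_fn: Callable[[str, str], bool] = default_compile,
) -> Tuple[List[str], Optional[str]]:
    """Compile libraries one at a time and stop at the first failure.

    Returns the names that compiled cleanly and the name of the first
    failing library (None if everything compiled)."""
    verified: List[str] = []
    for name, ecf_path, target in libs:
        if not compile_fn(ecf_path, target):
            return verified, name  # fix this one before touching the next
        verified.append(name)
    return verified, None
```

Passing the compile step in as `compile_fn` keeps the loop testable without a compiler installed, and stopping at the first failure means an API mismatch is caught while the test file that caused it is still fresh.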
So, there you have it. The refactor is done. I have been spot-checking Claude while it was fixing its mistakes. So far, everything is looking pretty good. I even like how it set up the testing of simple_testing (testing the tester).
Cheers,
Larry