Hi Larry,
Is it the same AI "person" who is generating the code, with its
assertions, and the tests? Because this AI "person" could generate
wrong assertions, wrong code based on these wrong assertions,
and then wrong tests which pass.
Ideally, we should have:
1) an AI "person" who generates abstract (or empty shell) classes
with the assertions based on some specifications (e.g. some
standard documents). But these assertions should be more
exhaustive than just "arg /= Void" or "not arg.is_empty".
2) a human validating these assertions (until we have
a reliable way to translate specifications to assertions,
or until specifications are directly expressed in the form of
assertions).
Now that we have a validated set of assertions:
3) have another AI "person" generate the code.
4) Ideally, have a code verifier like AutoProof to validate that
the implementation is correct (with respect to the specification
expressed through the validated assertions).
or: Have yet another AI "person", or AutoTest, generate and run
as many tests as possible.
The goal here is to make sure that the AI is not generating
erroneous code and making us believe that it is correct because
it compiles and passes the tests that this same AI generated.
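
To make the risk concrete, here is a minimal sketch in Python (chosen only
for compactness; in Eiffel the two predicates would be postconditions).
Everything in it is hypothetical: `ai_sort`, `ai_postcondition` and `ai_test`
stand for what a single generator might produce, `validated_postcondition`
stands for the human-validated assertion from step 2, and
`find_counterexample` is a crude stand-in for what AutoTest-style random
testing would do in step 4.

```python
import random
from collections import Counter


def ai_sort(items):
    # Hypothetical AI-generated implementation with a subtle bug:
    # set() silently drops duplicate values.
    return sorted(set(items))


def ai_postcondition(items, result):
    # Weak assertion from the same generator: it only checks ordering.
    return all(a <= b for a, b in zip(result, result[1:]))


def ai_test():
    # Test from the same generator: no duplicates in the data, so it passes.
    data = [3, 1, 2]
    result = ai_sort(data)
    return result == [1, 2, 3] and ai_postcondition(data, result)


def validated_postcondition(items, result):
    # Human-validated assertion: ordered AND a permutation of the input.
    ordered = all(a <= b for a, b in zip(result, result[1:]))
    return ordered and Counter(result) == Counter(items)


def find_counterexample(impl, postcondition, trials=200, seed=0):
    # Independent random testing: return the first input that violates
    # the postcondition, or None if no violation was found.
    rng = random.Random(seed)
    for _ in range(trials):
        data = [rng.randint(0, 3) for _ in range(rng.randint(0, 5))]
        if not postcondition(data, impl(data)):
            return data
    return None
```

The generator's own test and its own assertion both accept the buggy code,
while the validated assertion, exercised by independently generated inputs,
rejects it as soon as an input contains a duplicate.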
--
Eric Bezault
mailto:er...@gobosoft.com
http://www.gobosoft.com
On 21/11/2025 17:05, Liberty Lover wrote:
> I took the eat-my-own-dogfood a step further. The original EM_CLI was an
> empty stub. Almost all of the code you see below (and in the Github
> repo) is generated using the software itself (e.g. I added a $AI-TODO$
> with a description of "what-to-do/is-needed"). The REQUEST built was:
>
> EifMate Request Manifest
>
> Request ID: req_202511210881139
> Timestamp: 202511210881139
>
> Files in Package
>
> * D:\prod\eifmate\src\cli\core\em_cli.e
> * D:\prod\eifmate\testing\cli\mocks\mock_01.e
>
> AI-TODO Items
> D:\prod\eifmate\src\cli\core\em_cli.e:1
>
> Scope: Line
>
> TODO:
>
> 1:-- $AI-TODO$ Create tests for EM_CLI.
>
> I then gave the entire package to Claude-AI Sonnet 4.5 and it dutifully
> created both the new EM_CLI and TEST_EM_CLI classes. There were problems.
>
> * The AI tried to put the JSON text on multiple lines and failed to
> escape the double-quotes at the start of each new line.
> * I then ran "ec -flat TEST_EM_CLI -config eifmate.ecf -target
> eifmate_tests -batch -project_path d:\prod\eifmate" from CLI.
> * The results of that spat out the errors, which I then gave to Claude.
> * That resulted in a little back and forth, where it would correct one
> problem location, leaving other instances untouched.
> * After about 3 instances, it finally decided to correct the entire
> TEST_EM_CLI class and plopped that in Obsidian for me.
> * There was one more issue, and another call to "ec -flat TEST_EM_CLI
> -config eifmate.ecf -target eifmate_tests -batch -project_path
> d:\prod\eifmate" ensued.
> * That was the final pass in the cycle, which ended with code that
> compiled.
> * I then ran the tests and they all passed, which I am always
> suspicious of when I see it (of course).
> * That led to investigation of the run, whereupon I found silent fails
> swallowed by poorly formed PROCESS calls to ec.exe (e.g. it had
> "ec.exe\ec.exe" instead of "ec_path": "C:\\Program Files\\Eiffel
> Software\\EiffelStudio 25.02 Standard\\studio\\spec\\win64\\bin\\ec.exe")
> * That took a little while to track down, which is what led to the
> "make_with_default_ec_path" on EM_CLI.
> * The TEST_EM_CLI also had an issue with the UTF-8 structure of the path,
> which I fixed manually.
> * Now, the test no longer silently failed. NOTE that I still need to
> go through the backend code and put in contracts that will detect
> silent fails. I may or may not use Claude to assist with that. It's
> pretty straightforward.
> * With the pathing properly formatted for UTF-8 and getting the silent
> fails handled, the code is pretty solid now. I can depend on EM_CLI
> at the moment.
> * So, what are the next steps?
> * There is a choice: I am at a place to start thinking about direct
> HTTPS client-server interaction between Eifmate and Claude.ai. I
> could also implement a plan to use SQLite3 as a repo DB for various
> things like:
> o temp store of classes under construction (for quick restores to
> a known-passing-compile point - or perhaps use Github through
> CLI to restore from the repo)
> o store "lessons learned" from the interactions and pull from
> those bits-and-parts as a part of forming prompts for Claude.ai
> to use and shape the interaction.
> o Log the interactions, requests, responses, packages, and so on.
> This looks forward to the day that I don't use Obsidian notes,
> which Claude.ai can be a little funky about using deterministically.
> o Other uses? I don't know yet. Those are evolving.
> * I like the idea of using SQLite3 locally for recording/logging/etc.
> because while it involves creation of some complex code, I now have
> the Claude.ai-helper/pair-programmer to assist in that function, so
> it isn't really all that daunting to think about creating such a thing.
> o I may (in fact) do with SQLite3 what I did with the SIMPLE_JSON
> library (which is working very well so far): create a high-level
> API project that will spare me from using low-level boilerplate
> code (like Eiffel's JSON library, which is very low-level from a
> consumer's point of view).
> o So, a high-level SQLite3 library might be the next step, and
> then using that library in the Eifmate context to build-out the
> logging/recording bits instead of using Obsidian.
>
> That's the very long-winded view of where I am in this process.
> Suggestions are welcome!!!
>
> On Thursday, November 20, 2025 at 3:13:40 PM UTC-5 Liberty Lover wrote:
>
> In case you're wondering: Here is one of the actual request file
> contents sent over to Claude AI (below). Notice that it contains all
> of the $AI-TODO$ items. The code made a pair of these. I may have
> run the test twice and confused the pair of requests with each TODO
> item. However, you will notice that both of the AI-TODO items are in
> this one request. So, all I really needed to do was send just the
> one Request + Class-file(s) "package" ... Claude was obviously
> "smart enough" to see my redundancy and handle it. AI can be very
> forgiving (until it is not). :-) The "9:" and "10:" values are the
> line numbers in the MOCK_01 class where each $AI-TODO$ item can be
> found. Of course, the "instructions for Claude" tell it where to put
> the output in my local Obsidian folder system and ensures that I can
> see what request number it is linked to. This will ultimately
> function as a unique key ID in a SQLite3 DB for tracking all of this
> and keeping a history. That DB might also be where code is
> temporarily stored while it's being worked on so it can be "rolled
> back" if needed through Eifmate.
>
> That's all I know to share. Feedback welcomed!
>
>
> EifMate Request Manifest
>
> Request ID: req_2025112002141147
> Timestamp: 2025112002141147
>
> Files in Package
>
> * D:\prod\eifmate\testing\cli\mocks\mock_01.e
>
> AI-TODO Items
> D:\prod\eifmate\testing\cli\mocks\mock_01.e:9
>
> Scope: Line
>
> TODO:
>
> 9: -- $AI-TODO$ create a `name' feature as STRING_32 and create a
> setter for it.
> D:\prod\eifmate\testing\cli\mocks\mock_01.e:10
>
> Scope: Line
>
> TODO:
>
> 10: -- $AI-TODO$ ensure the class has all class-level, feature-level
> design-by-contract assertions.
> Instructions for Claude
>
> 1. Review the AI-TODO items above
> 2. Implement the requested changes
> 3. Write results to in-from-AI/req_2025112002141147/ folder
> 4. Include header comment: -- RESPONSE_TO: req_2025112002141147
>
>
> On Thursday, November 20, 2025 at 2:54:52 PM UTC-5 Liberty Lover wrote:
>
> Hey all,
>
> So, the Eiffel code in Eifmate was able to:
>
> * Find all classes in the project with $AI-TODO$ markers (e.g.
> 2 each in MOCK_01)
> * It was then able to create a pair of "request" documents
> with specifications and a copy of the MOCK_01 class file.
> * I then manually pushed over the files created by Eifmate.
> * Claude then made the resulting new MOCK_01 class and put it
> properly in my Obsidian notes, where ...
> * I copied out the file, renamed the class to MOCK_01_AI
> * The class compiled without issue. Testing will confirm.
>
> The idea is to get Eifmate into a "watch" mode with HTTPS
> linkage to Claude AI (userID plus keys, et al) and then Eifmate
> itself (when told to GO) will handle the entire round-trip.
>
> The ultimate goal is to have it:
>
> * Use the ec.exe to do a -flat against the incoming class(es)
> and detect problems/errors (if any).
> * If it finds problems through the compiler, it gathers the
> ec.exe output, creates a prompt with the output and sends it
> to Claude to think/try-again.
> * That cycle will continue for 3 to 5 attempts. If it can
> solve/resolve on its own, then great.
> * If it cannot solve/resolve, then it stops, restores the code
> to where it was before it started the cycle, and reports the
> entire matter to me as the programmer for me to sort it out.
> Claude will produce a detailed report for me to read about
> what it tried and what keeps failing. The idea is for me as
> a programmer to then put my head to it and see what's wrong.
> * In this cycle, there is a document that Claude has serving
> as "lessons learned" and other guidance files. Those can
> then be updated. A local SQLite3 DB will serve as a local
> means to record such data and the Eiffel code of Eifmate
> will eventually be able to recognize problems as they are
> happening in real-time, adjusting the "prompts" for Claude
> to help it over the humps.
> * There might be aux prompts in between code-writes/compiles/
> testing that will help further enhance the process. The hope
> is to get to where Eifmate with Claude will handle 90-95% of
> the process from need --> design --> architecture --> coding
> --> testing and so on.
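
The compile, re-prompt and restore cycle in the list above can be sketched
as follows. This is a minimal illustration in Python, not Eifmate's actual
code: `repair_loop`, `compile_fn` and `revise_fn` are hypothetical names.
In the real tool, `compile_fn` would run ec.exe and return its output, and
`revise_fn` would send that output to the AI backend and return the
revised class text.

```python
def repair_loop(source, compile_fn, revise_fn, max_attempts=3):
    """Try to get `source` to compile, asking for revisions on failure.

    compile_fn(text) -> (ok, compiler_output)      # hypothetical callback
    revise_fn(text, compiler_output) -> new_text   # hypothetical callback

    Returns (final_text, succeeded, history). On failure, final_text is the
    untouched original, so the caller is back where the cycle started.
    """
    original = source
    candidate = source
    history = []  # (attempt number, ok, compiler output) for the report
    for attempt in range(1, max_attempts + 1):
        ok, output = compile_fn(candidate)
        history.append((attempt, ok, output))
        if ok:
            return candidate, True, history
        if attempt < max_attempts:
            # Feed the compiler output back and try again.
            candidate = revise_fn(candidate, output)
    # Out of attempts: restore the original and hand over the history.
    return original, False, history
```

Keeping the compiler and the AI call behind two callbacks makes the loop
testable with fakes, and `history` holds the raw material for the report
to the programmer when the attempts run out.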
>
> So far, this all looks very good and very doable. What I am
> hoping for is to get Eifmate to a state where it can be folded
> into EiffelStudio and any AI-backend can be used (not just
> Claude AI).
>
> It's a little bit of a slog right now, but my next step is to
> literally "eat-my-own-dogfood" and start baking Eifmate into the
> creation of Eifmate code and tests. The foundations are there.
> The cycle is established. Now, onward to automating, discovering
> nuances, needs, and building what I need to make the product itself.
>
> Wish me luck!!
>