AI FAIL


Liberty Lover

Feb 6, 2026, 12:33:28 PM
to Eiffel Users
And this happened. It's hilarious and the real story is: They do NOT know why!

Liberty Lover

Feb 6, 2026, 12:44:19 PM
to eiffel...@googlegroups.com

This transcript presents a sobering "post-mortem" of the AI hype cycle from the perspective of 2026. It argues that the 2023–2025 push to replace software engineers with AI has largely failed, resulting in massive technical debt, security vulnerabilities, and a broken talent pipeline.


Executive Summary

The video outlines a transition from "AI Hype" to "AI Reality." While companies aggressively laid off staff to "realign" for an AI-centric future, the empirical results have been disastrous for enterprise stability.

  • The Productivity Paradox: While AI helps with speed in simple tasks, it produces "unmaintainable" code and requires senior engineers to spend up to 11 hours a week "babysitting" or correcting AI hallucinations.

  • The "Slop Layer": AI-generated code often relies on "vibe coding" (natural language prompts) which leads to a lack of structural diversity and "code cloning," creating a massive backlog of technical debt.

  • Security Risks: Nearly half of AI-generated code contains critical vulnerabilities, with Java failure rates exceeding 70%.

  • The Junior Death Spiral: By automating entry-level tasks, the industry has gutted the hiring of junior developers (down 50%), threatening the future supply of senior architects.

  • The "AI Lie": High-profile failures, like the collapse of Builder.ai and the "Anti-Gravity" incident, have exposed that many "autonomous" tools were either human-powered sweatshops or dangerously unaccountable.


Cited Reports and Entities

The transcript references several real-world and (narratively) projected 2025/2026 reports. Here are the citations mentioned:

Academic & Research Institutions

  • MIT Project NANDA: Report titled "The GenAI Divide" (claims 95% of enterprise AI pilots failed to deliver ROI).

  • Stanford Digital Economy Lab: Research on code structure (noted AI code is simpler/repetitive) and the decline of younger workers in AI-exposed roles.

Corporate & Industry Reports

  • Veracode (2025 GenAI Code Security Report): Claims 45% of AI-generated code contains OWASP Top 10 vulnerabilities.

  • CAST Software: Analysis of 10 billion lines of code regarding global technical debt.

  • CodeRabbit: Data showing AI-generated pull requests contain roughly 70% more issues (10.8) than human-written code (6.4).

  • Hays / IT Jobs Watch: Data regarding the 9% dip in median software salaries and the "low-hire, low-fire" market trend.

News & Media Outlets

  • Reuters: Reporting on tech leaders' failure to save human headcount despite AI integration.

  • Bloomberg: Investigation into the Builder.ai scandal (alleged AI-washing/human-powered backend).

  • Forbes: Commentary on the lack of accountability in AI-driven engineering.

  • The Guardian: Coverage of the global crisis in AI-assisted development.

Key Figures & Incidents

  • Sundar Pichai (Google CEO): Cited for his 2024 statement that 25% of Google’s code was AI-generated.

  • The "Anti-Gravity" Incident (Late 2025): A specific anecdote involving a Google AI tool accidentally deleting a 2TB production drive.


This leaves a set of open questions ...


Ulrich Windl

Feb 6, 2026, 1:42:39 PM
to eiffel...@googlegroups.com
Well, in a nutshell: I tried AI a few times when I was stuck, but soon found out that the AI couldn't solve the problem either, while claiming otherwise.
When I asked back for details, it turned out that the AI had no idea.
Most of the dialogs went like this: (me) "I think ... is wrong because...", and the AI replied "Of course, you are absolutely right..."
After such cycles had repeated a few times, I eventually gave up, as the AI simply couldn't solve the problem.
The real danger is when you don't realize that the AI is wrong.

Or, as Angela Merkel once said (quoting from memory): "If you know nothing, you'll have to believe everything."

Ulrich

> On Fri, Feb 6, 2026 at 12:33 PM Liberty Lover <rix....@gmail.com> wrote:
>> And this happened. It's hilarious and the real story is: They do NOT know why!
>>
>> https://youtu.be/WfjGZCuxl-U?si=SIjBzP6SDLMU9d66

Liberty Lover

Feb 6, 2026, 1:55:22 PM
to eiffel...@googlegroups.com
Here is another take on the matter, this one claiming a controlled "study" to back it up:


In this video, Dave Farley discusses the findings of a controlled study regarding the impact of AI coding tools on software maintainability. While most discussions focus on how much faster AI helps developers type, this study explored the long-term "total cost of ownership," where maintenance typically accounts for 50% to 80% of total expenses.

The Experiment

The study involved 151 participants (95% professional developers) and was conducted in two phases:

  • Phase 1: Developers added a feature to buggy code—some used AI, some did not.

  • Phase 2: A different set of developers (without AI) were tasked with maintaining and evolving that code without knowing if AI had been used to create it.


Key Findings

  • No "Maintenance Nightmare": Contrary to fears that AI produces "unmaintainable slop," the study found no significant difference in the cost or difficulty of maintaining AI-generated code versus human-generated code.

  • Speed Boost: AI users were 30% faster in initial development, while habitual AI users were up to 55% faster.

  • The "Boring" Advantage: For experienced developers, AI usage actually led to a slight improvement in maintainability. This is attributed to AI producing boring, idiomatic code, which is easier for others to read than "clever" or surprising code.

  • Skill is the Multiplier: AI acts as an amplifier. If a developer has good engineering discipline (modular design, small batches, TDD), AI scales that quality. If the developer lacks skill, AI simply helps them "dig a deeper hole faster."


The Risks of AI

Farley highlights two primary "slippery slopes" for teams using AI:

  1. Code Bloat: Because generating code is now nearly free, there is a temptation to create massive volumes of it, which increases complexity.

  2. Cognitive Debt: If developers stop "really thinking" about the code they generate, their skills atrophy, and long-term innovation slows down.

Conclusion

AI tools are excellent for short-term productivity and do not inherently damage code health. However, they do not replace the need for engineering discipline. The core of software development remains the ability to decompose complex problems into small, manageable pieces—a skill that humans must still master to guide AI effectively.


lar...@eiffel.com

Feb 6, 2026, 2:28:40 PM
to eiffel...@googlegroups.com

Many thanks!


Liberty Lover

Feb 6, 2026, 2:43:40 PM
to eiffel...@googlegroups.com
Ulrich,

AI already suffers not only from hallucination, but also from a severe case of sloppy copying. It doesn't know the entire codebase, specs, and so on. Even when it does, it is prone to forget them across compaction cycles. The developer needs a way to recapture what's been lost before continuing across a compaction barrier. Claude, even the new and much-touted Opus 4.6, only addresses the compaction issue, and even then not much better than Opus 4.5 did. The only thing I can say in its favor is that it is a far cry from Haiku (a very old model by now).

The Farley study tries to paint a different picture, and it points to something perhaps important: skill is the multiplier, not the tool. The developers who were already disciplined saw 30-55% speed gains. The ones who weren't disciplined had that amplified too. The experiment was conducted largely on Java web project codebases, which is a tell for me. Coding in Java for web apps is a lot like coding in Visual FoxPro was for me back in the day: as a VFP dev, I had to be the compiler. All VFP had was a very simple interpreter that checked some basic syntax, and that was it. Not only was there no type safety, there was no DbC and no testing at all. I learned how to apply DbC to VFP code, which helped, but again, the entire thing rested on my personal knowledge and the self-discipline learned from years of experience. I am quite sure this is where Farley reaches his conclusion that AI amplifies experience, and amplifies the lack of it just as well.

Most of us here can quickly create a mental laundry list of why Eiffel is far better:
  • The Eiffel compiler is not only a syntax disciplinarian but a strict type-safety enforcer and much more, as we already know. The static type checking in Eiffel is superb; hence it is the first set of guardrails that AI MUST comply with. I've watched Claude for three months now, and this is its first barrier: its first step is to compile and handle the compiler errors (syntax, type checking, and so on).
  • Design-by-Contract is the next set of guardrails, generally NOT available elsewhere (and Farley seems unaware that such a thing even exists). As Eiffel programmers, we know it. I have written /eiffel-* skills specifically targeting, prompting, and goading Claude into deep code review and the creation of contract assertions, from single classes to entire libraries. Bertrand offered the wonderful ideas that ultimately became the simple_MML library, which gets applied through a targeted /eiffel-mml skill as part of a Claude skills tool-chain applied to any target ECF project. There are also skills for the research-and-development stage early in a new library's life, designed to create Eiffel classes with no implementation code in their do..end bodies. Instead, the skill creates classes based on the research, intent, specs, and plans, where only the contracts are filled into the stubbed classes (a minimal sketch of such a contract-only stub follows right after this list).
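To make that concrete, here is a minimal sketch of the kind of contract-only stub such a skill might produce. The class and feature names are my own invented example for this post, not output from the actual /eiffel-* skills or simple_MML; only the shape matters: the contracts carry the specification, and the body is intentionally left empty for a later implementation pass.

class
    ACCOUNT

feature -- Access

    balance: INTEGER
            -- Current balance, in cents.

feature -- Element change

    deposit (an_amount: INTEGER)
            -- Add `an_amount' to `balance'.
        require
            amount_positive: an_amount > 0
        do
            -- Intentionally empty: the implementation comes in a later pass;
            -- until then the postcondition documents (and will enforce) the intent.
        ensure
            balance_increased: balance = old balance + an_amount
        end

invariant
    balance_non_negative: balance >= 0

end

The point is that this already compiles and type-checks, and the moment Claude (or anyone else) fills in the do..end body, the compiler, the precondition, the postcondition, and the invariant all have to be satisfied before the result can pretend to be "PRODUCTION READY".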
Between the Eiffel compiler, DbC, testing to exercise all of that, plus the various eiffel-* skills in the toolchain, there are but a few remaining issues when using Claude Code on the Opus 4.6 model.
  • Hallucinations-first — Claude wants to hallucinate rather than check surrounding code before it rushes off to write code.
  • Ensign-Eager Syndrome — Claude doesn't want to ask, verify, consult, design-partner, and so on. It wants to rush headlong into coding as though it is the expert and you, as the Eiffel engineer, have nothing to offer it in terms of guidance and input into the process. This is especially dangerous in those post-compaction cycles.
  • People-pleaser mode — Claude is eager to please you and constantly views you through an emotional-validation lens. Hence the "You're absolutely right to call me out on that!" and other repeated, meaningless phrases that clutter the interactions. There's no honesty involved at all.
  • Ego-trip-AI — To top all of that off, Claude thinks very highly of its own work (e.g. PRODUCTION READY!!! is a constant self-proclaimed assessment of what it makes, whether you think so or not).
What Anders has proposed (and what I certainly wanted to build as well) is an actual Eiffel-based LoRA LLM, where the model is trained on highly curated Eiffel data. The hope is that such a model would have its weights and biases probabilistically aimed at actually writing decent Eiffel code. My hope is that it can then be coupled with something like Claude Code on Opus 4.6 (or better), with Claude depending on the Eiffel-trained model for code production while Claude itself supplies the specifications, based on a higher project-wide or cluster-wide understanding framed by solid research, background data, specifications, plans, and an existing codebase understood through those lenses.

I am quite convinced that Eiffel is the best candidate for producing excellent systems instead of the AI SLOP being produced elsewhere.


Larry


Liberty Lover

Feb 6, 2026, 2:44:09 PM
to eiffel...@googlegroups.com
You're welcome! Of course! :-) 

Richie Bielak

Feb 6, 2026, 3:00:12 PM
to eiffel...@googlegroups.com

Liberty Lover

Feb 6, 2026, 3:05:25 PM
to eiffel...@googlegroups.com
This is Claude compiling after running the /eiffel-mml skill:

image.png

And what are the results?

image.png

And then I run them in EiffelStudio to make sure they pass for me as well, and so that I can inspect them.

image.png

Liberty Lover

Feb 6, 2026, 3:09:55 PM
to eiffel...@googlegroups.com
@Richie ... yep, exactly. So, the notion of "AI SLOP" is well-earned. It's a hole that the majority of shops that gave in to the AI genie will have to dig out of. I think our goal is to take clear advantage of this GOLDEN opportunity for Eiffel to SHINE into a dark place of literally STUPID people, where "stupid" is defined as people who go along to get along with the HERD and then defend the herd and what it believes. I have come to believe that human beings are stupid not so much because of WHAT they believe as because of WHY they believe it. We've had a tough sell with Eiffel for the last 40 years because the herd is a herd, and individual intelligence gets easily swallowed up by HERD dynamics, which is what creates STUPID people within the herd.


Liberty Lover

Feb 6, 2026, 4:02:31 PM
to Eiffel Users
Last one:


This video's evaluation holds true for nearly every mainstream language, from Python and Java to Rust and Kotlin. However, a critical question remains unaddressed: to what extent does the architecture of these languages contribute to AI's inherent weaknesses? We must ask what the human programmer understands about the "intent" of the tool that remains invisible to the LLM. Because AI functions via probabilistic prediction rather than a deterministic mental model, the code samples it trains on tell only a fragment of the story. A significant portion of the logic is disconnected: some of it exists in documentation the model hasn't synthesized, but the most vital part resides exclusively within the human mind. This creates a fundamental "contextual siloing" where the AI is effectively guessing at a solution without access to the human's underlying rationale, a gap that no amount of parameter scaling can bridge.

The "fragment" visible to an LLM is essentially the surface syntax—the linguistic patterns and structural arrangements of code as it appears on GitHub or Stack Overflow (or other sources). This represents only a small slice of the engineering lifecycle, roughly the "recipe" without the "kitchen." While the model has ingested trillions of tokens, it remains disconnected from the runtime reality and the physical constraints of the software. It can replicate the how (the syntax) because it is a mathematical engine for token distribution, but it cannot access the why (the intent). This creates a "contextual silo" where the AI sees the static result of a human decision but remains blind to the trade-offs, legacy constraints, and specific business logic that necessitated that code in the first place.

The "missing story" is the internal mental model that the transcript highlights as the hallmark of a human developer. This includes the tactical intuition required to anticipate edge cases—like a specific warehouse's spotty Wi-Fi—and the architectural foresight to write code that a junior developer can maintain six months from now. Because these considerations are rarely documented in the code itself, they are "automatically disconnected" from the AI’s training data. Consequently, the AI isn't "coding" so much as it is performing high-stakes guessing. It is looking at a high-resolution photo of a finished bridge, while the human engineer is the only one who understands the soil samples, the wind-speed calculations, and the structural integrity required to keep it standing.

In mainstream languages like Python, Java, or C++, the fundamental weakness lies in the implicit nature of their logic. Most popular languages treat the "how" of a program as the primary artifact, while the "what" and the "under what conditions" are left as invisible assumptions in the programmer's mind. When an AI model is trained on these languages, it consumes billions of lines of code in which the safety boundaries and logical constraints are completely unstated. This creates a massive training bias where the AI learns to prioritize syntactic fluency over structural correctness. Because the language doesn't force a developer to embed the "rules of engagement" directly into the syntax, the AI effectively learns to guess the next line of code without any governing framework to verify that its guess is even logically legal.

The absence of integrated safety boundaries and explicit success conditions in mainstream tools means that AI-generated code often suffers from "silent failures." In languages that allow variables to be empty or null by default, or that rely on separate documentation for specifications, the AI has no internal compass to navigate the "known unknowns" of the codebase. It generates fragments that look correct but fail at runtime, because the implicit assumptions (the guiding rules known to the humans who built the software) were never made visible to the model. This lack of a built-in verification layer is amplified by the AI's probabilistic nature; it treats code as a sequence of likely tokens rather than a set of provable claims. Without a programming language that forces those claims to be written alongside the logic, the AI is essentially building a house of cards where every third card is a guess about the foundation.

Furthermore, the disconnect between code and its underlying specification in mainstream development leads to massive technical debt when using AI assistants. In these popular environments, testing and documentation are usually external artifacts, afterthoughts that may or may not reflect the actual state of the code. When an AI generates code for such systems, there is no "hard link" between the generated implementation and its required behavior or promised results. In contrast, if a language treated the specification as a first-class citizen embedded in every routine, the AI would be forced to "see" the constraints (a minimal sketch of what that looks like is below). Without this, the AI hallucinates logic that sounds plausible but violates the invisible intent of the system, both in its training data and in the new code it is being asked to write. We are left with a generation of "almost-right" code that requires a human to spend more time reverse-engineering the AI's guesses than it would have taken to simply define the requirements in a way that the machine could actually check.
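As a small illustration of "specification as a first-class citizen", here is a hedged sketch in Eiffel. The class, feature names, and numbers are invented for this post (they come from no report or library cited above); the point is only that the rules of engagement live in the routine itself, where both a human and a code generator are forced to see them, and where the runtime will check them:

class
    ORDER_QUEUE

feature -- Access

    count: INTEGER
            -- Number of orders currently queued.

    Max_orders: INTEGER = 100
            -- Arbitrary capacity chosen for the example.

feature -- Element change

    put (a_order_id: STRING)
            -- Queue the order identified by `a_order_id'.
        require
            id_not_empty: not a_order_id.is_empty
            not_full: count < Max_orders
        do
            count := count + 1
                -- Storage of `a_order_id' is omitted in this sketch.
        ensure
            one_more: count = old count + 1
        end

invariant
    count_in_range: 0 <= count and count <= Max_orders

end

Note that under void safety `a_order_id' is an attached STRING, so the "empty or null by default" problem from the previous paragraphs does not even arise here, and any generated body that breaks the precondition, the postcondition, or the invariant is stopped by an assertion violation instead of failing silently.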

There are solutions. You should go looking for them. I would guide you, but the herd mentality would probably poo-poo the idea off the cuff, simply because that's how herds work. And smart humans become STUPID when they give in to HERD DYNAMICS (aka "Python IS THE BEST!!! RA-RA-RA SIS-BOOM-BAH").

Ulrich Windl

Feb 7, 2026, 3:32:31 AM
to eiffel...@googlegroups.com
So AI (the YouTube algorithm) caught you 😉
Probably it's too late to clear your cookies.

Ulrich

06.02.2026 21:59:41 Liberty Lover <rix....@gmail.com>:

> So, YouTube is now feeding me these videos left and right.

javier...@gmail.com

Feb 9, 2026, 9:27:59 AM
to Eiffel Users
Hi Larry, it would be interesting to see how to add Eiffel here: https://huggingface.co/datasets/tencent/AutoCodeBenchmark

Ian Joyner

Feb 9, 2026, 5:19:41 PM
to eiffel...@googlegroups.com
A while ago I asked for a clear explanation of what this was about. Someone kindly responded with a link to a page with a good explanation of AI slop and using Eiffel and DbC to avoid such slop.

There are articles starting to appear that discuss how AI slop is now making programming less productive because it generates bugs that no one knows how to fix.

What Larry is doing with Claude and DbC is relevant to that, and I’d like to pass that link on.

But I lost the link and can't find it with a search (web search).

Could someone repost that link, or similar links that explain to non-Eiffel people what is so powerful about this approach to dealing with AI slop?

Thanks
Ian


Chris Tillman

1:59 AM
to eiffel...@googlegroups.com
Hi Ian,

There's a little bit at https://github.com/simple-eiffel, about halfway down the page. But there were lots of details Larry posted on this group about the ways he's been fighting the slop/hallucination issues. He did a very good Reddit post, but it got removed by the Reddit "filters", whatever that excuse was. Here is probably the most in-depth link: https://github.com/simple-eiffel/claude_eiffel_op_docs/blob/main/analyses/higher-API/SIMPLE_EIFFEL_STRATEGIC_ANALYSIS_2026.md




--
Chris Tillman
Developer

Ian Joyner

2:56 AM
to eiffel...@googlegroups.com
Hi Chris,

Sorry, it was a more elementary explanation and not on github.

Thanks
Ian
